Building a Continuous Integration Pipeline: Decisions and Considerations
Back in the day, software releases were very special events. It was not uncommon for a release to combine the efforts of several people and multiple machines just to make the binaries. When there is one person responsible for the backend, one for the UI, one for the installer, and one for putting the data together, you can imagine the process taking several days.
When a bug is discovered, at least some steps of the process will need to be performed again. Depending on the size of the project, it can take anywhere from days to weeks of concentrated effort just to push the next version of the software to customers.
Continuous integration (CI) cures us of at least some of the headaches that this scenario causes. The main purpose of CI is to merge and integrate all code of all developers several times per day. This is achieved using unit tests (for checking each component individually) and integration tests (to test the whole — or subsets — of the product together).
There are multiple ways of ensuring that components integrate well — some examples are smoke tests and end-to-end tests. The basic premise is first to test the components on a unit level. This helps to isolate problems that would be difficult to debug if they were discovered at a later stage.
Release Early, Release Often
In the scenario above, each release cycle would have combined multiple changes to multiple components. Afterward, they would be tested. When a bug is discovered, it is difficult not only to track down which component is behaving incorrectly but also to find which change caused the malfunction.
By modularizing the code and staying focused on unit and integration testing, CI makes small releases possible. Debugging is far easier this way, and an automated gating solution can reject changes until they pass all required quality checks. This way, the code always compiles and is always ready to be shipped right away.
An interesting byproduct of a good CI workflow is that the build process itself gets documented in the code. All CI tools require a description of steps necessary to build a component and test it. As I will show you in the next section, even planning for CI can lead to better quality of your code and processes.
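To illustrate how a pipeline description doubles as build documentation, here is a minimal sketch in Python. It is not how Jenkins or any other CI tool is implemented; the `Step` class, the `run_pipeline` function, and the placeholder commands are all assumptions made up for this example. The steps run in order, and the first failing step closes the gate.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Step:
    """One named pipeline step and the command that implements it."""
    name: str
    command: list[str]


def run_pipeline(steps: list[Step]) -> bool:
    """Run the steps in order and stop at the first non-zero exit code."""
    for step in steps:
        result = subprocess.run(step.command)
        if result.returncode != 0:
            print(f"Gate closed: step '{step.name}' failed")
            return False
    print("Gate open: all steps passed")
    return True


# Hypothetical placeholder commands; a real project would call its own
# build and test tools here.
PIPELINE = [
    Step("build", [sys.executable, "-c", "print('building...')"]),
    Step("unit-tests", [sys.executable, "-c", "print('running unit tests...')"]),
]
```

Anyone reading `PIPELINE` can see exactly which steps a build consists of and in what order they run, which is the documentation effect described above.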
What Tools Are Out There?
We are accustomed to perpetual software wars. Operating systems, web browsers, and editors are aplenty, and they all have their loyal supporters and fierce opponents. CI is no different. There are some strong names in the market, but it seems that we have one that is everywhere: Jenkins.
A short history lesson. Jenkins started as Hudson, which was developed at Sun. After Oracle acquired Sun, Oracle claimed the Hudson trademark, an action that obviously pissed off the open source community, which voted to rename the project. And Jenkins was born.
Both projects are being actively developed (although Jenkins has been accepted more widely within the community). Which one is the fork is still an arguable question.
Whether you love it or hate it, chances are you have used or will use Jenkins if you require an on-premise CI system. Some other competitors worth mentioning in the on-premise court are TeamCity, Bamboo, GitLab, Buildbot, and GoCD. Of course, there are some really good SaaS alternatives that are usually pretty fast for relatively simple build processes, such as Travis CI, CircleCI, and GitLab (again, using its SaaS offering).
For the sake of simplicity, we will illustrate our examples with the Jenkins way of thinking. The implementation may differ, but the architecture mostly stays the same with CI systems that consist of:
- A master server or other central location where all the information is stored
- Build slaves (also called build agents) that do the actual work and may be configured for different needs (such as a Ruby slave, Java slave, or Nvidia GPU slave)
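The split between a central server and specialized agents can be sketched as a simple label-matching routine. This is only an illustration of the idea, not the scheduler of Jenkins or any other tool; the `Agent` class, the `pick_agent` function, and the agent names and labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A build agent and the capabilities (labels) it advertises."""
    name: str
    labels: set[str] = field(default_factory=set)


def pick_agent(agents, required):
    """Return the first agent whose labels cover everything the job requires."""
    for agent in agents:
        if required <= agent.labels:  # subset check
            return agent
    return None  # no suitable agent is registered


# Hypothetical agents, mirroring the Ruby, Java, and GPU examples above.
AGENTS = [
    Agent("agent-1", {"linux", "ruby"}),
    Agent("agent-2", {"linux", "java"}),
    Agent("agent-3", {"linux", "gpu"}),
]
```

A job that declares `{"linux", "java"}` would be routed to `agent-2`, while a job asking for a label that no agent offers gets `None` and would have to wait or fail.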
Prepare Your CI Process
Previously, we mentioned that just planning for CI can be beneficial to the quality of your project. This is because you first need to answer some questions that can give you better insights into the actual state of the project. Here are some examples:
- Does a build require specialized hardware?
If you are using one dedicated build machine, ask yourself, “Why?” Is it because the configuration is really hard? If so, you should make every effort to document the configuration before your dedicated machine stops functioning properly.
Maybe it is because your build process requires some specialized hardware? At this point, it is worth asking whether such hardware is needed at build time as well, or only at run time. Maybe its use could be split into a separate component that would be tied to a build agent on such a dedicated machine?
- Does it require a specific system configuration?
The necessary configuration to perform a build should be documented — or better yet, automated — so that you can apply it to build agents. What is stopping you from making configuration automation part of a build?
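One small step toward automating that configuration is a check that runs at the start of every build and reports which required tools are missing on the agent. The sketch below uses Python's standard `shutil.which`; the `missing_tools` helper and the tool list are assumptions chosen for illustration.

```python
import shutil


def missing_tools(required):
    """Return the tools from `required` that cannot be found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


# Hypothetical requirements; replace with whatever your build really needs.
REQUIRED_TOOLS = ["git", "make"]

if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Agent is missing: " + ", ".join(missing))
    else:
        print("Agent has all required tools")
```

Failing fast with a clear message is usually easier to act on than a cryptic error halfway through a build, and the tool list itself becomes a piece of documentation.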
- Does a build require access to external services or credentials?
The master server needs to store at least SCM credentials to check out the source code. Other kinds of secrets may, for example, be access and secret keys for any third-party integrations that you need to test. Most CI tools come with a secret store so that you can configure it once and have Jenkins, for example, manage the encryption when you need it.
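On the build side, this usually means the script never contains the secret itself. Many CI tools can expose stored secrets to a build as environment variables, and the sketch below assumes that setup. The `get_secret` helper, the `MissingSecretError` class, and the variable name `API_KEY` are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not provided to the build."""


def get_secret(name, env=None):
    """Read a secret from the environment instead of hardcoding it."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise MissingSecretError(
            f"Secret {name!r} is not set; configure it in the CI secret store"
        )
    return value
```

A build script would call something like `get_secret("API_KEY")` and fail early, with a readable message, when the secret has not been configured for that job.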
- Can a build be achieved without resorting to a GUI?
As long as your project build uses command line tools, all automation should be easy. The trouble starts when GUI tools are needed, as build slaves have to be provisioned accordingly. In some cases, such builds may need more resources and automation will not be as easy. Consider again whether GUI tools are really needed. Perhaps there is an alternative?
- Do you have automated tests?
What we have mentioned so far covers the build part, but continuous integration brings the most benefit when automated testing is used. Unit tests for components along with functional tests for integration are probably the best combination. One without the other may not give a full picture. When in doubt, consult the Test Pyramid.
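The difference between the two kinds of tests can be shown with a deliberately tiny example using Python's built-in `unittest` module. The functions `parse_price` and `total_cents` are made up for this sketch, and the "integration" test here only checks two in-process functions working together; real functional tests would typically exercise whole components or services.

```python
import unittest


def parse_price(text):
    """Convert a price string such as '4.50' into integer cents."""
    return round(float(text) * 100)


def total_cents(prices):
    """Sum a list of price strings, relying on parse_price for each one."""
    return sum(parse_price(p) for p in prices)


class UnitTests(unittest.TestCase):
    """Checks one component in isolation."""

    def test_parse_price(self):
        self.assertEqual(parse_price("4.50"), 450)


class IntegrationTests(unittest.TestCase):
    """Checks that the components work together."""

    def test_total(self):
        self.assertEqual(total_cents(["4.50", "0.25"]), 475)
```

If `test_total` fails while `test_parse_price` passes, the problem is most likely in how the pieces are combined rather than in the parsing itself, which is the kind of isolation the combination of both test levels provides.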
Keep it Up
The final part of a CI journey is to make sure that the tests stay relevant and keep passing. Make sure that metrics such as the pass rate or test coverage don’t fall below a set threshold, and add more test cases when you feel the need for them. Remember: Continuous integration is a process and a state of mind, not a feature.
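A threshold like this can be enforced by a small check at the end of the build. The sketch below is one possible shape for it; the `check_thresholds` function, the metric names, and the numbers are assumptions, and in practice the metrics would come from your test runner or coverage tool.

```python
def check_thresholds(metrics, thresholds):
    """Return a message for every metric that is missing or below its minimum."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value} < {minimum}")
    return failures


# Hypothetical minimums, expressed as percentages.
THRESHOLDS = {"coverage": 80.0, "pass_rate": 100.0}
```

An empty list means the build may proceed; anything else can be printed and turned into a failing exit code so that the drop is noticed right away.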
Once you feel confident with how your CI pipeline performs, you may consider taking the next step into continuous deployment (CD). Although this is more of a cultural transition than a technical one, CD has its own set of challenges and is more complex “under the hood” than it might appear.
But more on that to come!