How to Gain Observability into Your CI/CD Pipeline

By: Dotan Horovits

We all know that observability is a must-have for operating systems in production. But we often neglect our own backyard — our software release process.

We noticed we had made that mistake here at Logz.io. We were wasting time and energy handling failures in the CI/CD pipeline, which made our Developer-on-Duty (DoD) shifts tedious. That’s why it’s critical to incorporate your observability practices into your CI/CD pipeline.

Some CI/CD tools provide observability capabilities out of the box. At Logz.io, we use Jenkins and have explored its capabilities and plugins in that area. Jenkins lets you drill into an individual run and see how it went.

But that is often not enough when you want to monitor aggregated information from all of the pipeline’s runs, across all branches and machines, with your own filters and time ranges, to really understand the patterns.

We found basic aggregative questions tricky or cumbersome to answer, such as:

Did all runs fail on the same step?
Did all runs fail for the same reason?
Did the failure occur only in a specific branch?
Did the failure occur on a specific machine?
Which steps fail the most?
What’s the normal run time, so that outliers can be identified?
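
To make these questions concrete, here is a minimal Python sketch of the kind of aggregation they call for. The run records, their field names (branch, machine, failed_step, duration_s) and the helper functions are all hypothetical, invented for illustration; they are not a Jenkins schema or API.

```python
from collections import Counter
from statistics import median

# Hypothetical run records; field names are assumptions, not a Jenkins schema.
RUNS = [
    {"branch": "main", "machine": "agent-1", "failed_step": None, "duration_s": 610},
    {"branch": "main", "machine": "agent-2", "failed_step": "integration-tests", "duration_s": 1450},
    {"branch": "feature-x", "machine": "agent-2", "failed_step": "integration-tests", "duration_s": 590},
    {"branch": "feature-x", "machine": "agent-1", "failed_step": None, "duration_s": 630},
    {"branch": "main", "machine": "agent-2", "failed_step": "build", "duration_s": 600},
]


def failures_by(runs, field):
    """Count failed runs grouped by the given field (step, branch or machine)."""
    return Counter(r[field] for r in runs if r["failed_step"] is not None)


def duration_outliers(runs, factor=2.0):
    """Return runs whose duration exceeds `factor` times the median duration."""
    typical = median(r["duration_s"] for r in runs)
    return [r for r in runs if r["duration_s"] > factor * typical]


if __name__ == "__main__":
    print("Failures by step:   ", failures_by(RUNS, "failed_step").most_common())
    print("Failures by branch: ", failures_by(RUNS, "branch").most_common())
    print("Failures by machine:", failures_by(RUNS, "machine").most_common())
    print("Slow runs:          ", duration_outliers(RUNS))
```

Answering the same questions across hundreds of runs is only practical once the run data lives somewhere you can query it in aggregate, rather than inside individual run pages.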

If you have also exhausted the built-in observability capabilities of your CI/CD tool, it’s time to set up proper observability, just like you have for your production environment, with a dedicated monitoring and observability setup.

You can achieve observability into your CI/CD pipeline in four steps. In this longform guide, Fighting Slow and Flaky CI/CD Pipelines Starts with Observability, I use Jenkins as the reference tool, since many people know this popular open source project and we’ve used it extensively at my company.

But even if you’re using other tools, you’ll find much of it applicable. To achieve observability into your CI/CD pipeline, you’ll need to:

Collect data on CI/CD pipeline runs
Index and store the data for fast query and retrieval
Visualize the data with custom dashboards
Build reports and set alert rules on the data
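
As a rough illustration of the first and last steps, the Python sketch below builds one JSON document per pipeline run and evaluates a simple failure-rate alert rule over a batch of such documents. It assumes the run exposes the standard Jenkins environment variables JOB_NAME, BUILD_NUMBER, BRANCH_NAME and NODE_NAME; the function names, the document fields and the 50% threshold are hypothetical choices for this example, and shipping the documents to a data store is deliberately left out.

```python
import json
import os
from datetime import datetime, timezone


def build_run_document(env, status, failed_step, duration_s):
    """Collect one pipeline run as a flat, JSON-serializable document.

    `env` is a mapping such as os.environ. The output field names are
    illustrative assumptions, not a required schema.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "job": env.get("JOB_NAME"),
        "build_number": env.get("BUILD_NUMBER"),
        "branch": env.get("BRANCH_NAME"),
        "machine": env.get("NODE_NAME"),
        "status": status,
        "failed_step": failed_step,
        "duration_s": duration_s,
    }


def failure_rate_alert(docs, threshold=0.5):
    """Return True when the share of non-successful runs exceeds `threshold`."""
    if not docs:
        return False
    failed = sum(1 for d in docs if d["status"] != "SUCCESS")
    return failed / len(docs) > threshold


if __name__ == "__main__":
    # Emit one JSON line per run; a log shipper or indexing step would pick it up.
    doc = build_run_document(os.environ, "FAILURE", "integration-tests", 1450.0)
    print(json.dumps(doc))
    print("Alert:", failure_rate_alert([doc]))
```

Keeping each run as a single flat document makes it straightforward to index, filter by branch or machine, and chart on a dashboard later.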

Investing in good CI/CD observability will pay off with a significant improvement in your Lead Time for Changes, effectively shortening the cycle time it takes a commit to reach production.
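
For reference, Lead Time for Changes can be tracked as the elapsed time between a commit and its deployment to production. The Python sketch below computes it for a few made-up changes; the timestamps and the choice of the median as the summary statistic are assumptions for illustration only.

```python
from datetime import datetime
from statistics import median

# Hypothetical (commit time, production deploy time) pairs in ISO 8601 format.
CHANGES = [
    ("2022-10-03T09:00:00", "2022-10-03T15:00:00"),
    ("2022-10-04T10:00:00", "2022-10-05T10:00:00"),
    ("2022-10-05T08:30:00", "2022-10-05T11:30:00"),
]


def lead_time_hours(committed_at, deployed_at):
    """Hours elapsed between a commit and its arrival in production."""
    delta = datetime.fromisoformat(deployed_at) - datetime.fromisoformat(committed_at)
    return delta.total_seconds() / 3600


def median_lead_time_hours(changes):
    """Median lead time in hours across all (commit, deploy) pairs."""
    return median(lead_time_hours(c, d) for c, d in changes)


if __name__ == "__main__":
    print("Median lead time (hours):", median_lead_time_hours(CHANGES))
```

Charting this value over time is one way to check whether pipeline fixes are actually shortening the path from commit to production.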

Learn More at My Continuous Delivery Summit Talk

Want to learn more about this topic? If you’re attending KubeCon North America, I’ll be discussing this and more at the Continuous Delivery Summit, co-located with KubeCon in Detroit, on Tuesday, Oct. 25, from 9:25 to 9:55 a.m.

In this talk, I’d like to share how we built effective observability into our Jenkins pipeline using intelligent data collection, dashboarding and alerting, to boost our response to failures and improve our quality of life along the way.

This talk will give practical guidance on how to improve observability into your CI/CD pipeline, as well as open source tools to consider. Whether you use Jenkins like we do, or other CI/CD tools, you’ll learn how to augment them and reach higher productivity.

Add my talk to your Continuous Delivery Summit schedule now. Hope to see you there!