Monitoring is often not the first thing on the mind of the modern developer. Yet it’s necessary at many points of the software development lifecycle, including before deprecating an API, before launching a new feature, and after launching it. In fact, developers’ monitoring needs can vary much more than those of classic Ops monitoring.
My podcast guest Liran Haimovitch is the co-founder and CTO of Rookout, a live data collection and debugging platform. He’s an observability and instrumentation expert with a deep understanding of Java, Python, Node, and C++, and broad experience in cybersecurity and compliance from his past roles. He’s also a fellow podcaster, and you can check out his podcast here.
On this episode of OpenObservability Talks, we discussed what observability means for developers, how to determine what they should be monitoring, how observability fits into current dev tools and processes, and how observability can actually be fun for a community that doesn’t typically put a premium on it.
What Do Developers Need for Observability?
In Liran’s own words, great observability is about continuous feedback, continuous learning, and continuous improvement, more than it is about any specific state an organization is trying to achieve. But how does observability differ for developers, compared to operations?
According to Liran, it’s about facilitating the day-to-day tasks of developers, and not just about identifying if a system is up or down, or where bottlenecks or risks may reside. “Those are architectural questions they would ask,” he said. “But more often than not, they’re very daily questions such as ‘I just got a ticket about a bug. How do I fix it?’ Even more so today, as more and more engineering is actually carried out on cloud environments and not local environments. Then even the development environments can become very opaque, and very far and remote. And all of a sudden, you find engineers using observability tools in their pre-production environments regularly.”
Developer Skills and Familiarity Challenges for Observability
Observability is likely an area where developers are less specialized than in others. Metrics can be a bit harder for them to use, and distributed tracing even more so. I asked Liran what can be done to make observability processes more accessible for developers, and whether he had seen challenges around developers’ skill sets and familiarity with observability.
“I think we’re seeing that metrics and traces are used to great impact by specialized teams,” he said. “We’re seeing more and more people specialize in observability, in metrics and traces. Logs are still by far and large the most popular observability tool. It’s coming up in most surveys to be about four or six times more popular than the next runner-up. And the reason is that logs are so simple. They’re truly as simple as it gets. Once you write your first ‘Hello World’ application, you know what logs are, how to use them, how to analyze them. There’s no black magic involved, no complex statistical analysis or data. What you see is what you get.”
This leads into the importance of centralized teams, who are experts beyond logging and understand how metrics and tracing factor into a complete observability picture. The rest of the team may stick to logging, Liran said, where the state of an application can be more easily surmised, because “they don’t have the time and knowledge to dive into those super hyper specialized tools that sometimes the observability industry is so excited about.”
How Observability Tools Fit in the Developer Tech Stack
Each engineering organization already has its set of development tools, stack, and processes. How does observability fit into all that? For Liran, the place to start is ownership. Many enterprises have specialized teams who own the observability stack. Typically, they share three main responsibilities: maintaining observability tools, managing company-wide observability challenges, and last but not least, spreading the word.
“The thing is where many of those teams fail is that they assume that whatever tools work best for them are going to work best for the average engineer, and the average engineers are definitely getting assistance out of those centralized teams,” Liran said. “Let’s say if an engineer has a performance problem, and he’s looking at tracing data and gets stuck, he’s probably going to reach out to the centralized team to get help. Essentially, quite often they’re just going to sit together, whether it’s on Zoom or shoulder-to-shoulder in the same office, look at the data and try to figure it out. But the thing is quite often that engineer is not going to be any better off next time. Because tracing can be so complex and metrics can be so complex, that the next time he runs into a significant problem, he’s going to need that team’s assistance again.”
Amazing observability technologies have been built, but they’re tailored for power users, not for everyday users, Liran added. A change in approach is needed, so that users who may not need observability every day still have the ability to use it.
“We want them to be more connected to the production environments, to the code they’re owning in production,” he said. “We definitely see a lot of potential in them using observability tools in pre-production environments, to speed up their work and deliver high quality software. We need to think of them as very important consumers of the observability stack. And then we need to make sure we tailor tools for them as well, and not just for our power users who get all the glory and all the budget for choosing their observability tool of choice.”
In future posts, we will dig deeper into our discussion, including how Liran uses application snapshots at Rookout and how they may be a new observability signal for organizations.
Want to learn more? Check out the OpenObservability Talks episode: Observability for Developers Demystified on: