Some background. Having implemented at least 20 APM systems in production as an end-user at various companies, and having deployed and managed countless monitoring tools outside APM, I understand the role of the practitioner.
Later on, I shifted to Gartner and led the APM Magic Quadrant for four years, finally spending another four years at AppDynamics (operating under Cisco after two years). I feel this qualifies me as having been on all sides of the equation when it comes to seeing what APM is today, and moreover how tracing has changed over the last few years.
Given this experience, it was nice to step away from APM proper as we have worked to build out the observability platform at Logz.io. In the last two years, we have moved away from being a log-centric company (log analytics and SIEM) to having a full product portfolio including metrics and tracing. I have also been working extensively in the open-source community within OpenTelemetry and Jaeger, along with many of our engineers who contribute back to these projects and help them evolve. We have been watching intently as these technologies have matured and been adopted by more users.
The APM Market is Shifting
Recently, I received a copy of the Gartner APM Magic Quadrant survey for 2022. After careful reading, one can clearly discern how Gartner is shifting away from traditional concepts of APM (as Will Cappelli and I defined the market) toward a more modern rendition that includes the concepts of Observability and Distributed Tracing. Talking to many customers, it’s clear that there is significant confusion around this ongoing evolution.
Do they need APM? Or can they get away with a simpler and more holistic approach to solve both monitoring and observability use cases? The answer is: it depends.
APM tools were initially designed (around 2000) to instrument vendor code. As they grew in popularity alongside languages like Java, platform maturity increased and prices started coming down. At that point, most of the next generation of APM tools (APM 2.0) were created (around 2010), but they were largely not built for microservices.
Critically, as we continue to expand usage of microservices in “modern” cloud environments, the pricing of these APM solutions has not kept up, particularly in terms of cost per application. The reality is that the cost of maintaining these agents is high for vendors.
Among the four largest APM vendors, who represent the “leaders” of the latest APM Magic Quadrant (2021), there are at least 2,000 engineers working on agents, probably more. Assuming an average salary of $180k per year, this costs the industry roughly $360m annually. These engineers could be working on solving other problems for the business.
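For anyone who wants to check the math or plug in their own assumptions, the estimate above can be sketched in a few lines of Python. The function name is made up for illustration, and the inputs are simply the headcount and salary figures quoted above; the result covers salary only and ignores benefits, tooling, and other overhead.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
ENGINEERS = 2_000          # lower-bound headcount of agent engineers (from the text)
AVG_SALARY_USD = 180_000   # assumed average salary per year (from the text)


def annual_agent_cost(engineers: int, avg_salary_usd: int) -> int:
    """Return the salary-only yearly cost in USD (hypothetical helper)."""
    return engineers * avg_salary_usd


total = annual_agent_cost(ENGINEERS, AVG_SALARY_USD)
print(f"${total / 1_000_000:,.0f}m per year")  # prints "$360m per year"
```

Since 2,000 is stated as a minimum, the real figure is likely higher.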
OpenTelemetry and the Future of Tracing
This is why projects like OpenTelemetry are so important. When libraries ship with instrumentation built in, we get telemetry out of the box, and code that isn’t instrumented can be auto-instrumented to collect lightweight traces and metrics.
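To make the idea of auto-instrumentation concrete, here is a minimal conceptual sketch in plain Python. It is not the OpenTelemetry API: Span, FINISHED_SPANS, traced, auto_instrument, and FakeHttpClient are all hypothetical names invented for this illustration. The point is only to show the general technique of wrapping a library's methods at runtime so that timing data is collected without the application code changing.

```python
import functools
import time
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Span:
    """A heavily simplified stand-in for a trace span."""
    name: str
    duration_ms: float
    error: bool


# Hypothetical in-memory sink; a real setup would export spans to a backend.
FINISHED_SPANS: List[Span] = []


def traced(name: str) -> Callable:
    """Wrap a function so every call records a Span, even when it raises."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            error = False
            try:
                return func(*args, **kwargs)
            except Exception:
                error = True
                raise
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                FINISHED_SPANS.append(Span(name, elapsed_ms, error))
        return wrapper
    return decorator


def auto_instrument(target: type, method_name: str) -> None:
    """Swap a method on a class for a traced version; callers are untouched."""
    original = getattr(target, method_name)
    span_name = f"{target.__name__}.{method_name}"
    setattr(target, method_name, traced(span_name)(original))


class FakeHttpClient:
    """Stand-in for a third-party library the application does not own."""

    def get(self, url: str) -> str:
        return f"200 OK from {url}"


# Instrument the "library" once at startup...
auto_instrument(FakeHttpClient, "get")

# ...and the application code stays exactly as it was.
client = FakeHttpClient()
response = client.get("https://example.com/health")
print(response, FINISHED_SPANS[0].name)
```

Real auto-instrumentation agents do considerably more than this (context propagation, sampling, exporting), but the wrapping step is the part that removes the need for hand-written instrumentation.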
This solves most of the problem of getting visibility, but does not provide the level of diagnostics and profiling that you get in commercial APM tools. OpenTelemetry will likely expand to profiling in the next three years, but the diagnostics of APM will also likely fade in impact and utilization over this time because of the limited benefit they deliver. Users generally don’t need in-depth diagnostics, because microservices break applications up into smaller components that are easier to troubleshoot.
The “leaders” of the Gartner Magic Quadrant do not want to invest in OpenTelemetry, especially in bidirectional agent interoperability, because their proprietary agents and data collection create lock-in and revenue. Protecting a collective $2.5b in revenue, or around 25% of the APM market, is paramount for them.
Sure, they all talk a big game about OpenTelemetry and interoperability, but the data flows only one way, from OpenTelemetry into the tool, and they continue to create and push proprietary agents as the #1 choice. Money is the driver, not what is best for customers. Countless features have sat on the roadmaps of the open-source agents from vendors like New Relic and Datadog without ever being finalized. Companies like AppDynamics and Dynatrace still have completely closed-source agents and data collection, as that is their secret sauce. They also still have the best agents on the market due to the amount of time they have been building them.
Over time, Gartner will evolve the definition of APM towards the needs of modern teams who are interested not just in tracing and metrics, but also in using logging in a unified manner. The use cases remain the same: managing and troubleshooting modern applications to get digital business visibility.
Presently, there continues to be confusion as the analysts catch up with forward-thinking teams.
Just remember, APM is for legacy workloads, and you should focus on distributed tracing as one of the signals in your observability strategy.