DevOps Pulse 2023:
Observability Trends and Challenges

Cloud Native Complexity Is Contributing to the
Increasing MTTR of Production Issues.

Report Brief

What is DevOps Pulse?

The DevOps Pulse survey and report is an annual analysis of DevOps industry trends conducted by Logz.io. This report highlights key trends, challenges, and insights reported by organizations and individuals who employ observability and other relevant practices. 

The 2023 survey compiles data from around 500 respondents from various organizations, with titles ranging from developers and DevOps engineers to IT professionals and executives. The responses highlight various aspects of observability that have had a substantial impact on organizations. 

This year’s research shines a spotlight on a host of pressing issues, including a continued increase in the all-important metric of mean time to recovery (MTTR), the growing complexity of modern cloud architectures, and burgeoning data volumes and costs, and on how these obstacles are currently impacting observability in modern environments.

Who is Logz.io?

Logz.io provides a SaaS-based observability and security platform powered by the world’s most popular open source monitoring technologies.

The Logz.io Open 360™ Platform unifies log, metric, trace, and security data into a single platform offering full visibility into the health and performance of cloud applications and infrastructure.

Overview

The DevOps Pulse survey has been conducted annually for the last five years. Aggregated data from these surveys show that respondents are slowly closing observability knowledge gaps, and seeing observability maturity come to fruition.

Alongside these indicators of maturity, there have been a variety of growing pains such as ecosystem complexity, massive telemetry data volumes, and increased tool sprawl that are preventing faster progress.

These challenges can increase observability system complexity, drive up costs, hinder troubleshooting efficiency during production incidents, and introduce security risks.

In this year’s report, the following insights from respondents command further investigation and provide a snapshot of many pervasive challenges:

  • Knowledge and adoption of DevOps practices and cloud-native technologies are progressing at a consistent rate. Over 45% of respondents reported that they’ve fully embraced DevOps practices, a 7-percentage-point increase compared to last year’s results.
  • Mean time to recovery (MTTR), a key metric for observability efficiency, has been steadily increasing within organizations. After a 9-percentage-point increase from last year, 73% of respondents now say it takes them multiple hours to resolve production issues. Only 14% of respondents are satisfied with their current MTTR, indicating a strong need for improvement.
  • While observability practices are maturing at a consistent pace, the complexity of cloud-native technologies is inhibiting further progress. Almost 50% of respondents cited technologies such as Kubernetes as their main obstacle to gaining full observability into their environment.
  • The integration of security alongside observability practices is a growing concern. The data shows that observability teams are major stakeholders in the security of their applications and tools, with over 72% of respondents already employing or planning to enlist a unified model for observability and security.
  • Open source tools are growing in popularity, as over 93% of respondents now utilize them for observability. However, related factors including management, scalability, and upgradability of these tools continue to prove difficult.

The Current State of Observability and DevOps Maturity

DevOps and cloud native technologies are critical for the success of digital transformation.

Maturing observability methodologies, such as adopting a shared services model, developing automation, and implementing open source tools, can accelerate this transition to cloud native architectures.

Respondents to this year’s survey indicate that both cloud native architectures and observability are maturing, but certain factors are slowing the speed of adoption for organizations:

  • Adoption of DevOps continues to grow year after year: In the 2023 survey, 45% of respondents have fully adopted and embraced DevOps practices, a 7-percentage-point increase from 38% the year before. There is also measured growth among respondents who have partially adopted DevOps practices.
[Chart: Where are you today in your overall DevOps journey?]
  • Responsibility for observability is mostly falling to DevOps: 55% of respondents indicated that in their organizations, DevOps teams are the main stakeholder when it comes to administering observability.  
  • DevOps principles in practice: The most popular methods for implementing DevOps practices among respondents this year are CI/CD implementation (75% of respondents), building out automation (71%), and using infrastructure as code (62%). The percentage of respondents in each of these categories has increased from last year’s survey.
  • The leading observability tools from 2021/2022 remain dominant: The most popular tools from 2022, namely Grafana (43% of total respondents), Prometheus (38%), and AWS CloudWatch (37%), remain at the top in 2023.
[Chart: Grafana 43%, AWS CloudWatch 38%, Prometheus 38%]
  • The shared services model is widespread and popular: Observability responsibilities (which might include user access control, other security policies, data infrastructure performance, user support, and other tasks, depending on the organization) are increasingly rolled up to a dedicated team. This type of structure is known as the shared services model. Over 85% of respondents have already adopted, or are planning to adopt, a shared services/platform engineering model in some capacity, with an additional 6% planning to adopt one in the near future.
  • Tool sprawl continues to accelerate: Observability tool sprawl remains an unresolved issue among respondents. In this year’s survey, over 85% of respondents are still using multiple tools in their observability strategy, and they do not view this as a good thing: the share of respondents who cited tool sprawl, and the siloed telemetry data that results from it, as a key challenge in gaining full observability into their ecosystems rose 5% year over year.
[Chart: Who owns or administers observability in your organization? (Select all that apply)]
[Chart: What are some of the ways that you are currently practicing DevOps? (Select all that apply)]

Increasing MTTR and the Impact on Observability Value

Managing the cost of telemetry data has emerged as a core pillar of observability strategies. In addition to high costs, exploding data volumes can lead to daunting complexity and prolonged resolution time, resulting in mounting expenses.

As a result, this year’s report illustrates a troublesome trend: increasing MTTR and costs seem to be going hand in hand.

While respondents reported that the cost of ownership remains consistently high, increased MTTR and complexity issues have reduced the overall value that respondents are seeing from their observability practices.

73% report that it takes multiple hours to resolve production issues in 2023
  • The MTTR for production issues is getting longer: Growing MTTR for a business can have costly implications — slow troubleshooting prolongs interrupted customer experiences that can impact the bottom line. In a disconcerting trend, 73% of respondents reported that it still takes multiple hours to resolve production issues. This is a substantial jump from last year when only 64% of respondents reported this, and a massive leap over the 47% reported two years prior.
  • Teams want faster response times to production issues: Only 14% of organizations are satisfied with their MTTR for production issues. Similarly, almost 30% of respondents noted that the response time to production issues needs to be faster than it currently is, which has grown from 25% of respondents stating this in the previous year.

  • Observability strategies are evolving due to cost: When respondents were asked how they plan to mitigate cost concerns, the favored solutions were gaining a better understanding and visibility of costs (36% of respondents), adapting data management practices (27%), and adopting open source (26%). The proportion of respondents implementing each of these strategies has risen.
  • Data volume control and automation are key components for successful data management: To control data usage, over 61% of survey respondents are employing automation and data volume control practices. 
[Chart: What is your team’s current Mean Time to Recovery (MTTR) during production incidents?]
[Chart: How are you evolving your observability strategy based on costs? (Select all that apply)]
[Chart: Are you increasingly seeking out means to control observability costs through improved data cleansing and optimization? (Select all that apply)]
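
The year-over-year MTTR figures in this section are easier to compare when a change in percentage points is kept apart from a relative change. The Python sketch below does this with the shares of respondents reporting multi-hour resolution times quoted in this section (47%, 64%, and 73%). The function names and dictionary labels are illustrative assumptions, not part of the survey methodology.

```python
def point_change(old_pct: float, new_pct: float) -> float:
    """Absolute change between two shares, in percentage points."""
    return new_pct - old_pct


def relative_change(old_pct: float, new_pct: float) -> float:
    """Change expressed as a percentage of the earlier share."""
    return (new_pct - old_pct) / old_pct * 100


# Shares of respondents reporting multi-hour resolution, as quoted in this section.
# The keys are illustrative labels (assumptions), not survey terminology.
multi_hour_share = {"two years prior": 47, "last year": 64, "2023": 73}

if __name__ == "__main__":
    last, now = multi_hour_share["last year"], multi_hour_share["2023"]
    print(f"{point_change(last, now)} points, {relative_change(last, now):.1f}% relative")
```

Read this way, the move from 64% to 73% is a rise of 9 percentage points, which is roughly a 14% increase in relative terms.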

Challenges with Cloud Native Architecture

When it comes to the management of cloud native technologies, complexity is rife. Complications such as scalability, observability infrastructure management, and security considerations can all adversely impact the ROI of using cloud technologies.

With over 65% of respondents partially or fully migrated to the cloud, technologies such as Kubernetes continue to be a core emphasis for DevOps organizations. However, the inherent complexity of these systems is proving difficult for observability and security teams to manage.

Observability Complexity Issues Persist 

While DevOps and cloud native architecture are mission-critical for digital transformation, they also present complex observability challenges – a potential cause of the MTTR issues discussed in the previous section.

This year’s survey highlighted a recurring challenge for those implementing both observability and security: Kubernetes. Respondents reported that monitoring Kubernetes itself is difficult and that Kubernetes is a main obstacle to gaining full observability into their environment.

Below are the key insights related to observability challenges:

  • Cloud-native architectures pose the largest observability challenge: When asked about their main concerns in gaining full observability, the largest group of respondents (almost 50%) stated that Kubernetes posed one of their main challenges. Others cited scaling and overall management (31% of respondents) and lack of knowledge among the team (30%) as related difficulties.
  • Similarly, a leading challenge for Kubernetes is monitoring and observability: The hardships and complexities surrounding monitoring Kubernetes are growing, with 41% of survey respondents citing this as a challenging component of running Kubernetes in production. This issue has seen a sharp increase since the DevOps Pulse 2022 survey, in which only 31% of respondents chose monitoring Kubernetes as a key concern. 
Kubernetes has quickly become one of the biggest observability challenges in 2023
  • The challenges of running Kubernetes in production: The primary challenges for those running Kubernetes in production are security, aggregating the relevant data during troubleshooting, and cluster networking. 
[Chart: What are your main challenges in gaining observability into your cloud-native environment? (Select all that apply)]
[Chart: Where do you find the most difficulties when running Kubernetes in production? (Select all that apply)]

Security Challenges and Integration

It’s becoming apparent that observability teams are major stakeholders in the security of their applications and infrastructure. Only 10% of respondents noted that their observability and security teams maintained a siloed structure, illustrating a strong overlap between the two practices. 

Among the various challenges faced in security, respondents noted integrating their security tools with the rest of their ecosystem as an increasingly complex and significant problem.

  • Observability teams are accountable for security: Consistent with last year’s results, observability-centric teams are primarily responsible for application and infrastructure security; 46% of teams have primary responsibility for infrastructure and application security, while 25% of teams share it with a specialized security group. Some 72% stated that they plan to or have already enacted a unified model for their security and observability monitoring.
  • A shift in focus for security challenges: Compared to the 2022 survey data, there has been a 9% increase in respondents reporting security tool integration challenges after migrating to cloud-native technologies. Meanwhile, the share of respondents reporting difficulties in prioritizing relevant security data in their ecosystem is unchanged from last year.
  • Like observability, the undertaking of Kubernetes security proves difficult: When it comes to cloud-native technologies such as Kubernetes, having detailed visibility into related vulnerabilities and overall insight into the system remains a challenge. That’s why it’s no surprise almost 50% agree that Kubernetes security is the most difficult component of running Kubernetes in production.
Security is the top challenge for running Kubernetes in production
[Chart: To what extent does your organization employ a unified model for observability and security monitoring?]
[Chart: Does your team have primary responsibility for infrastructure and application security?]

Open Source Remains Vital

Open source continues to play a major role in DevOps and observability practices with adoption growing steadily, year after year. Developers choose open source technologies for a multitude of reasons, including their low cost of entry, integration capabilities, avoidance of vendor lock-in, and strong community innovation.

In this year’s survey, the findings indicate continued growth in open source adoption. That said, many respondents continue to struggle with managing and scaling their open source toolset. 

[Chart: What percentage of your observability tools are open source?]
  • The many benefits of open source: Open source observability users cite ease of integration (47% of respondents), benefits of the community (36%), and lower cost of ownership as the main drivers for open source adoption. These factors remain constant as the top benefits of utilizing open source for observability compared to the 2022 survey. 
  • Open source tools for observability are seeing wider adoption: When it comes to observability tools, roughly 53% of this year’s respondents stated that half or more of their observability tools are open source. The uptick from last year is notable, as only 37% of 2022 survey respondents indicated that half or more of their tools were open source. Additionally, over 93% utilize open source in some capacity for observability. 
  • Cost reduction through open source remains popular: More than 25% of the respondents for this year’s survey are utilizing open source tools to manage the growing costs associated with their observability strategy. This has stayed consistent year over year and continues to be a frequently employed strategy by observability practitioners. 
  • Challenges in the face of adoption: When asked about the key challenges of utilizing open source tools for observability, 32% of respondents cited issues with infrastructure management and scaling. Some 28% noted difficulties when it came to upgrading relevant components, and another 28% faced problems with data pipeline performance and troubleshooting.

The many benefits of open source

  • Easy integration with DevOps / cloud native tech (47%)
  • The open source community (37%)
  • Familiarity with tools (28%)
  • Avoid vendor lock-in (33%)
  • Lower cost of ownership (35%)

Challenges facing adoption

  • Siloed telemetry data (21%)
  • Infrastructure management / scaling (33%)
  • Monitoring / troubleshooting data pipeline performance (27%)
  • Upgrading related components (28%)
  • Telemetry data security (18%)
  • Not enough skilled experts (26%)

Observability in 2023

In conclusion, the 2023 DevOps Pulse report recounts a consistent narrative among DevOps and observability users who are slowly improving their observability practices but remain hindered by numerous factors. Pervasive issues include the growing complexity of cloud environments, higher costs, increasing MTTR, and matters of tool integration and consolidation.

This annual research continues to validate the increasing adoption of cloud technologies, along with observability cementing itself as a required practice to understand and optimize these systems. The continued popularity of Kubernetes and a growing emphasis on security stand out as emerging priorities that cannot be overlooked.

For these reasons and many others examined in the 2023 DevOps Pulse Report, it will be important to continue to research and elevate related trends in the years to come.
