The Definitive Guide to OpenSearch for Observability

Executive Summary

IT organizations recognize that comprehensive observability of mission-critical infrastructure and application performance is an essential function. And they look for every available tool that can help them achieve that observability.

OpenSearch™ – an open-source, distributed search, log analytics and data visualization technology with growing adoption among DevOps organizations – is a proven solution for incorporating robust search, visualization, and analytics functions into a unified observability platform.

As of October 2022, more than 40 companies had signed on as partners, dozens of projects based on OpenSearch had been launched, and the open-source software had been downloaded more than 100 million times.

Introduction

The ability to conduct analysis quickly and thoroughly is essential for DevOps organizations. Numerous use cases require this capability, including the collection and analysis of log data, performance metrics, and business information, as well as security monitoring.

Powerful querying and analytics are integral to observability, which itself is an essential function within the context of today’s cloud environments. Without comprehensive observability, developers and IT managers leave themselves vulnerable to inefficiencies at best, and catastrophic application failures and security breaches at worst.

With comprehensive and effective observability, IT managers gain a deeper understanding of application performance, leading to an optimized experience for customers and end users; faster recognition of potential or emerging problems that could impact application performance; and reduced mean time to resolution (MTTR) when a problem occurs. For all of these reasons, gaining effective and cost-efficient observability has become a primary focus for DevOps and engineering teams at organizations of every size and orientation.

The Rise of OpenSearch

For many years, Elasticsearch has prevailed as one of the most popular enterprise search and analytics tools. Much of its popularity stemmed from its being a fully open-source offering maintained with community involvement. Elasticsearch was released under the Apache 2.0 open-source license and was commonly paired with Logstash, a popular log-collection engine, and Kibana, a visualization interface. Together, this collection of capabilities is known as the ELK stack.

However, in January 2021, Elastic – the organization that launched and oversees the ELK stack – announced that, starting with version 7.11, it would relicense Elasticsearch and Kibana under the Server Side Public License (SSPL) and the Elastic License (ELv2), neither of which is approved by the Open Source Initiative as an open-source license.

In response, Amazon Web Services (AWS), which previously offered Elasticsearch capabilities through its cloud computing services, organized an effort to create a new enterprise search solution that remained 100% committed to open-source principles. That solution is OpenSearch, a distributed, community-driven, open-source search and analytics suite.

Figure: OpenSearch project milestones

The founding principles of OpenSearch fully embrace the needs of the DevOps community, namely:

  • Great software – If it doesn’t solve your problems, everything else is moot. It’s going to be software you love to use.
  • Open source like we mean it – We are invested in this being a successful open-source project for the long term. It’s all Apache 2.0. There’s no Contributor License Agreement.
  • A level playing field – We will not tweak the software so that it runs better for any vendor (including AWS) at the expense of others. If this happens, call it out and we will fix it as a community.
  • Used everywhere – Our goal is for as many people as possible to use it in their business, their software, and their projects. Use it however you want.
  • Made with your input – We will ask for public input on direction, requirements, and implementation for any feature we build.
  • Open to contributions – Great open-source software is built together, with a diverse community of contributors. If you want to get involved at any level – big, small, or huge – we will find a way to make that happen.
  • Respectful, approachable, and friendly – This will be a community where you will be heard, accepted, and valued, whether you are a new or experienced user or contributor.
  • A place to invent – You will be able to innovate rapidly. This project will have a stable and predictable foundation that is modular, making it easy to extend.

The OpenSearch suite, based on Apache 2.0, represents a fork from Elasticsearch and Kibana, and maintains compatibility with all versions of Elasticsearch up to version 7.10. It also continues support for Kibana.

Support for OpenSearch is growing rapidly. As of October 2022, more than 40 companies had signed on as partners, dozens of projects based on OpenSearch had been launched, and the open-source software had been downloaded more than 100 million times.

Leading companies such as AWS, Aiven.io, Capital One, Coralogix, Instaclustr, Logz.io, Opster, Oracle, Red Hat, and Virtuozzo are active participants in the OpenSearch community.

As additional indicators of growing momentum, version 2.3 of OpenSearch was released in September 2022, and the first OpenSearchCon conference was held that month in Seattle.

Key elements of OpenSearch

OpenSearch is a community-driven, Apache 2.0-licensed open-source search and analytics suite that makes it easy to ingest, search, visualize, and analyze data. Developers build with OpenSearch for use cases such as application search, log analytics, data observability, data ingestion, and more.

OpenSearch consists of a data store and search engine (OpenSearch), and a visualization and user interface (OpenSearch Dashboards). Users can extend the functionality of OpenSearch with a selection of plugins that enhance search, security, performance analysis, and machine learning.

OpenSearch delivers a robust range of features including (but not limited to) the following:

  • Search – Provides several features that allow users to customize their search experience, such as full-text querying, autocomplete, scroll search, and customizable scoring and ranking.
  • Application analytics – Features a tool to create custom observability applications to view the status of systems, such as combining log events with trace and metric data into a single view of overall system health. It also allows users to quickly pivot between logs, traces, and metrics to uncover details about the source of any issue.
  • SQL query – Delivers the familiar SQL query syntax. Users can apply aggregations, group by, and where clauses to investigate data, as well as read data such as JSON documents or CSV tables. In addition, SQL query syntax can be used to gain access to the rich set of OpenSearch search capabilities such as fuzzy matching, boosting, and phrase matching.
  • Asynchronous search – Complex queries are run in the background, eliminating worries about them timing out. Query progress can be tracked in real time and partial results can be retrieved as they become available.
  • Piped Processing Language – Provides a familiar query syntax with a comprehensive set of commands delimited by pipes (|) to query data. PPL can be used to build visualizations that enhance observability of data.
  • Data Prepper – Offers a server-side data collector capable of filtering, enriching, transforming, normalizing, and aggregating data for downstream analytics and visualization. Data Prepper lets developers build custom pipelines to improve the operational view of applications.
  • ML Commons Library – A range of machine learning algorithms, like k-means and anomaly detection, to train models and predict trends in data. ML Commons integrates directly with PPL and the REST API.
  • Dashboard notebooks – Combines dashboards, visualizations, and text to provide context and detailed explanations when analyzing data.
  • Index management – Defines custom policies to automate routine index management tasks, such as rollover and delete, and apply them to indices and index patterns.
  • Index transforms – Creates a summarized view of data centered around certain fields, so users can visualize or analyze the data in different ways.
  • Performance analyzer and RCA framework – Enables users to query numerous cluster performance metrics and aggregations using PerfTop, the command line interface, to quickly display and analyze those metrics. Developers can also use the root cause analysis (RCA) framework to investigate performance and reliability issues in clusters.
  • Anomaly detection – Based on the Random Cut Forest (RCF) algorithm, this automatically detects anomalies as data is ingested. It monitors data in near real time and automatically sends alerts.
  • Trace analytics – Users can ingest and visualize OpenTelemetry data for distributed applications. This provides the ability to view the flow of events between these applications to identify performance problems.
  • k-NN search – Applying machine learning, this powerful feature runs the nearest-neighbor search algorithm on billions of documents across thousands of dimensions with the same ease as running any regular OpenSearch query. k-NN search powers use cases such as product recommendations, fraud detection, image and video search, and related document search.
  • Alerting – Leveraging OpenSearch’s intuitive interface and a powerful API, users can easily set up, manage, and monitor alerts that can automatically notify affected stakeholders of any problems. Highly specific alert conditions can be established using OpenSearch’s full query language and scripting capabilities.
  • Advanced security – Offers encryption, authentication, authorization, and auditing features. They include integration with Active Directory, LDAP, SAML, Kerberos, and JSON web tokens. OpenSearch also provides fine-grained, role-based access control to indices, documents, and fields.
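
To make the three query interfaces in the list above more concrete, here is a minimal Python sketch that builds the same kind of question (which services are logging errors?) as a full-text search body, a SQL query, and a PPL query. The index name `app_logs` and the fields `message`, `status`, and `service` are hypothetical, chosen only for illustration. The sketch builds request payloads without sending them, so it runs without a cluster.

```python
import json

# Hypothetical index name, used only for illustration.
INDEX = "app_logs"


def full_text_search(text: str) -> dict:
    """Query DSL body: fuzzy full-text match on a hypothetical `message` field."""
    return {
        "endpoint": f"POST /{INDEX}/_search",
        "body": {
            "size": 10,
            "query": {
                "match": {"message": {"query": text, "fuzziness": "AUTO"}}
            },
        },
    }


def sql_query(min_status: int) -> dict:
    """SQL plugin request: count error documents per (hypothetical) service."""
    statement = (
        f"SELECT service, COUNT(*) FROM {INDEX} "
        f"WHERE status >= {min_status} GROUP BY service"
    )
    return {"endpoint": "POST /_plugins/_sql", "body": {"query": statement}}


def ppl_query(min_status: int) -> dict:
    """PPL request: the same aggregation written as pipe-delimited commands."""
    statement = (
        f"source={INDEX} | where status >= {min_status} "
        f"| stats count() by service"
    )
    return {"endpoint": "POST /_plugins/_ppl", "body": {"query": statement}}


if __name__ == "__main__":
    for request in (full_text_search("timeout"), sql_query(500), ppl_query(500)):
        print(request["endpoint"])
        print(json.dumps(request["body"], indent=2))
```

Each payload could then be sent with any HTTP client or an OpenSearch client library. Authentication and TLS settings depend on the deployment and are left out here.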

All software in the OpenSearch project is released under Apache License, version 2.0 (ALv2). ALv2 grants permissive usage rights that deliver the freedom users expect from open-source software. This includes the ability to use, modify, extend, monetize, and resell software. Examples of software offered by members of the OpenSearch community can be seen on the Community Projects page. Complete documentation also is provided.

Primary use cases

OpenSearch capabilities can be applied in a wide variety of use cases. These include:

  • Any type of search – within applications, across an enterprise, or across the internet 
  • Ingestion, visualization, and analysis of applications running on any cloud
  • End-to-end monitoring of Kubernetes across logs, metrics, and traces
  • Cloud-native SIEM solutions for security monitoring
  • Business analytics and monitoring
  • Observability for cloud infrastructure and applications

Because observability is so important to maintaining a healthy, reliable IT infrastructure, let’s look in more detail at how OpenSearch supports this essential function. OpenSearch approaches observability, trace analytics, log analytics, and application performance monitoring (APM) across four dimensions:

  1. Collection: Gathering, enriching, filtering, transforming, storing, and normalizing data from multiple sources.
  2. Detection: Anomaly detection links related alarms to reduce “alarm fatigue.” This is made possible through visualization and monitoring, which OpenSearch accomplishes with OpenSearch Dashboards. Users can interactively analyze the data with tools like PPL.
  3. Investigation: This is the largest contributor to Mean Time to Incident (MTTI) and mean time to resolution (MTTR). OpenSearch Dashboards help users cut through oceans of data by leveraging logs, metrics, and tracing to quickly conduct root-cause analysis.
  4. Remediation: Armed with the necessary data and analysis made possible through OpenSearch, IT teams can remedy a problem, then conduct forensics to determine how to prevent a recurrence.
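
As a small worked example of the MTTR metric mentioned above, the Python sketch below averages the time between detection and resolution across a set of incidents. The incident timestamps are hypothetical and are not real data. The definition used here (mean of resolution time minus detection time) is one common convention and is an assumption, since teams measure the interval in different ways.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records as (detected_at, resolved_at) pairs.
INCIDENTS = [
    ("2022-09-01T10:00", "2022-09-01T10:45"),  # 45 minutes
    ("2022-09-03T14:10", "2022-09-03T16:10"),  # 120 minutes
    ("2022-09-07T08:30", "2022-09-07T08:45"),  # 15 minutes
]


def mttr_minutes(incidents) -> float:
    """Mean time to resolution in minutes, assuming MTTR = mean(resolved - detected)."""
    durations = [
        (datetime.fromisoformat(resolved) - datetime.fromisoformat(detected)).total_seconds() / 60
        for detected, resolved in incidents
    ]
    return mean(durations)


if __name__ == "__main__":
    print(f"MTTR: {mttr_minutes(INCIDENTS):.0f} minutes")
```

Shortening the investigation step reduces each of these durations directly, which is why it has the largest effect on the average.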

Details about applying OpenSearch to observability are available through the documentation site.

Differences between OpenSearch and Elasticsearch

There are numerous differences between OpenSearch and Elasticsearch. The most significant is that OpenSearch is open source, whereas Elasticsearch’s licenses (ELv2 and SSPL) are not. The language in these licenses is legally ambiguous. For example, Section 13 of the SSPL refers to “…enabling third parties to interact with the functionality…” of the software. This wording can be applied to a significant number of Elasticsearch use cases, which raises questions around legal use of the software.

There are many other reasons to choose true open source software, including:

  • Freedom from vendor lock-in
  • Lower costs
  • Enhanced security
  • Full transparency
  • Higher quality and reliability
  • Faster time to market
  • A continually expanding community of users
  • The freedom and flexibility to innovate

Other critical differences include:

  • OpenSearch has an active forum that encourages contributions from the community. In contrast, only Elastic NV employees can commit changes to Elasticsearch.
  • OpenSearch includes access controls for centralized management. This is a premium feature in Elasticsearch.
  • OpenSearch has a full suite of security features, including encryption, authentication, access control, and audit logging and compliance. These are premium features in Elasticsearch.
  • Phone support and helpful tools are available free through the OpenSearch community but are premium features in Elasticsearch.
  • Apache 2.0 (ALv2) allows users to modify, distribute and sublicense code without restrictions. ELv2 and SSPL are much more restrictive.
  • ML Commons makes it easy to add machine learning features. ML tools are premium features in Elasticsearch.
  • OpenSearch is available as a managed service from multiple providers (such as those offered by AWS, Oracle and Aiven); only Elastic can offer a managed Elasticsearch service.

Additional OpenSearch features that are not available for free from Elasticsearch include:

  • Centralized user accounts/access control
  • Cross-cluster replication
  • IP filtering
  • Configurable retention period
  • Anomaly detection
  • Tableau connector
  • JDBC and ODBC drivers

Should You Move to OpenSearch?

Moving from Elasticsearch to OpenSearch can bring many benefits. Among them are all the emancipating aspects of open-source software. You can use, modify, and extend the software whenever and however you wish. You can benefit from the collective innovation generated by the user community that becomes available as soon as projects are completed. This is in direct contrast to being “held hostage” by one vendor’s timetable. In addition, you’ll avoid the added costs now associated with Elasticsearch.

The timing of a move to OpenSearch will depend on your own circumstances. A migration will take some time and effort, but it generally is minimal. Moreover, the OpenSearch community provides numerous tools and advice for making the change, such as guidance from Bonsai and Opster.

Logz.io and OpenSearch

Logz.io has long been an advocate of open-source observability. As such, when the OpenSearch project began, we were an early and outspoken advocate for the project. We also fully embraced OpenSearch as part of our product roadmap, believing it to be in the best interests of our customers.

When OpenSearch reached GA status in 2021, we began introducing a steady stream of capabilities, beginning with making OpenSearch our backend database. Later, we added OpenSearch Dashboards as the primary interface for our platform, replacing Kibana. Our long-term commitment is to enable our users to capitalize on the industry’s first and only full-stack observability platform that integrates the power of open source to deliver a highly reliable, scalable SaaS solution.

This commitment to OpenSearch extends to a broad alliance with AWS. Logz.io’s relationship with AWS spans numerous practices and channels, including:

  • Customer: Logz.io’s SaaS observability platform is hosted on AWS.
  • Technical partner: We are directly involved in advancing each other’s technology and products through the development and delivery of common capabilities.
  • Business partner: The vast majority of Logz.io’s customers run on AWS. That’s why the ingestion/visualization/analysis of AWS application services is a primary use case and capability for Logz.io and most of our customers. We also partner for joint go-to-market efforts.
  • Channel partner: AWS and Logz.io cooperate to drive business via the AWS Marketplace, where Logz.io is listed and available for direct sales interaction.

The close cooperation between Logz.io and AWS delivers multiple benefits to our joint customers. These include the ability to:

  • monitor AWS services most effectively
  • benefit from best-in-class observability features, driven by close technical/business partnerships
  • tap into the power of shared product innovation
  • engage optimized AWS monitoring from a single source
  • benefit from joint pricing and sourcing options
  • receive co-orchestrated customer/product support.

We currently offer OpenSearch and OpenSearch Dashboards as a service, which provides full log collection, storage, and analytics on our SaaS platform. This effectively offloads all the deployment, scaling, infrastructure management, upgrading, parsing, and other management requirements of running OpenSearch yourself. It is also unified with Prometheus metrics and Jaeger traces for full observability.

By using Logz.io’s open-source observability platform built on OpenSearch, customers benefit from greater ease of use, higher reliability at scale, reduced total cost of ownership, and advanced troubleshooting backed by machine learning. Further, for organizations seeking to evolve from use of the traditional ELK stack to engage the benefits of OpenSearch, Logz.io serves as a method of rapid migration.

Logz.io’s presence as the only full-stack observability platform based entirely on open source means DevOps organizations seeking end-to-end observability and security can limit the number of platforms that they need to maintain – saving invaluable time and technical resources.

Conclusion

As a community-driven, open-source project, OpenSearch liberates developers by offering the best approach for incorporating open-source search into core monitoring and observability tools, today and in the future. For DevOps teams seeking the best observability platform and service, Logz.io provides the best solution for full-stack, unified observability across infrastructure and application monitoring, security and event management, and operational health tracking.

To learn more about how Logz.io with OpenSearch can deliver the most comprehensive observability solution, visit Logz.io.