Database Decision-Making for Observability, from Simple to Complex

By: Jonah Kowall

A goal of open-source observability is to unify several different signals and provide the observability everyone wants. It’s always interesting to speak with people on this journey about how they try to deliver it through open-source projects and the challenges they face.

I was thrilled to host Pranay Prateek on the most recent episode of the OpenObservability Talks podcast. Pranay is the CEO and co-founder of SigNoz, the company behind the open-source observability project of the same name. The company was founded after Pranay, an electrical engineer, got interested in the idea of signal and noise in data (hence the company name).

Pranay and his team began working on the project in Oct. 2020, initially as an idea for building on top of Prometheus and Grafana. But they realized how difficult it is to get a one-click, click-through experience with Prometheus.

“At that time, open-source products were there, and the current SaaS products were there, right,” Pranay said. “That’s where we realized that if we want to do this right, we have to build our own frontend and make a single app which talks to different signals. Down the path, we also took the call of having a single data store…The idea was that we have a single app which is easy to maintain, easy to install, like one-click install, boom, you get started very quickly, and that you get all the signals in one place. We didn’t want to have a different UI for one, and then second.”

The Challenge of the Unified Data Store in Observability

In my opinion, the unified data store Pranay is describing is something even most of the commercial tools haven’t figured out yet. When you look under the covers, they end up using different backends for different data types. Many databases haven’t evolved well to handle both time series and event-based data, like logs and traces.

When the SigNoz project started, it ran on Druid, which is complicated to manage and scale. Pranay said they chose it in part because it was more established and his team didn’t have a lot of database experience. I asked Pranay how their choice of database and data store evolved, since I know they wanted to avoid having multiple data stores.

“We launched the project in February 2021, and we got lots of inputs that it may be tough to run,” he said. “Those were right…It’s a great database. There’s no doubt about that, but because it just has so many components, which it needs to run upfront, I think many people were finding it difficult to run it in their local machine or laptop, to start with. This is the thing or insight which we didn’t have earlier, that, if you want to open-source something, first, people would want to try it in a laptop, preferably, or at max, a single, small machine.”

On top of that, their default architecture at the time ran Kafka (which in Pranay’s words was another “beast to manage”) alongside Druid all the time.

“I think it’s good architecture for companies which are using at scale, but for an open-source project which is at a very early stage, people would not invest so much in running at scale,” he said. “So, it needs to be very easy to run it in a single laptop environment or a single VM environment. Then as people get confidence in that, they can mature to dedicating more resources, more engineers. That’s what we have seen. And I think after June or July of [2021], say four, five months down the line, we decided that hey, Druid may be very complex for our project. Hence, we shifted to ClickHouse, and ClickHouse has provided much better performance.”

Pranay finds that ClickHouse starts quickly and can run from a single binary with very few resources. SigNoz doesn’t have Kafka in its infrastructure for now, but the team may need to introduce it as users take advantage of the platform at a much higher scale. They’ve also introduced logs, which will necessitate adding Kafka as at least an optional component.

Adoption of SigNoz went up after the move to ClickHouse. They currently have around 7,200 GitHub stars, with more than 70 contributors from across the globe and over 1,200 members in their Slack community.

If you’d like to learn more, see the full OpenObservability Talks episode with Pranay.