The Top Query Languages You Should Know for Monitoring (and a couple more)

The Top Querying Languages in Monitoring (and some others from across DevOps)

Sifting data can be fun for some people. Connecting the dots and finding correlations where they weren’t obvious before. It’s the crux of what drives people’s motivation in data science. It’s no different in any other field, especially in one involving systems observability, telemetry, or monitoring. And the best way to do that is to develop a fluency with query languages for different database structures and open source tools.

There are a number of tools out there, so many that they sometimes overwhelm professionals. But there are some clear leaders when it comes to logging, monitoring (metrics), and tracing. The ELK Stack – with Elasticsearch, Logstash, and Kibana – is the big one for logging. Prometheus and Grafana are prominent for storage and visualization of metrics, respectively. Finally, Jaeger, OpenTelemetry, and Zipkin are the hottest names in tracing.

All of these platforms and various subcomponents use unique DSLs for queries: domain-specific languages. DSLs are oft built on top of existing, more general languages. The unique aspects of a given tool, or a desire for devs using the tool to make certain ops easier, motivates development of newer augments – dialects if you will – of those mother languages.

This list could go on forever, and as an enthusiast of natural languages, I wouldn’t mind updating it. In the meantime, this list covers some of the important ones for the monitoring sector. We have covered some of these in more detail, so be sure to click through the links to those blog posts below.

SQL

SQL is a fundamental, declarative language for developers. For anyone looking up this info for the first time, you should know it is hardly confined to managing logs of infrastructure monitoring. 

Pronounced “sequel” and short for Structured Query Language, many databases are built around (or even against) the principles of SQL. It’s mainly for structured data, and is foundational for relating different points of data to each other within single commands, and all without an index.

Its main features go without saying for people with experience in the field and are expected of most query languages today: definition, manipulation, access control, etc. Many DSLs are built based on SQL.

Databases developed around the table-based standard, while others went in a non-tabular direction (NoSQL). Depending on the type of database, the querying language will look more or look less like SQL.

Open source SQLite uses a simpler, diluted form of SQL (with fewer command options). Popular open source SQL databases include MySQL, PostgreSQL, and MariaDB. Many databases that do have their own DSLs will often also allow users to query using SQL.

Apache Lucene 

Elasticsearch and the rest of the Elastic/ELK Stack is built on Lucene, the search library for Apache. and exposes Lucene’s query syntax. It’s such an integral part of Elasticsearch that when you query the root of an Elasticsearch cluster, it will tell you the Lucene version.

And, while Elasticsearch has its own query language (more on that in a minute), Lucene is essential knowledge for Elasticsearch queries.

Lucene queries are relatively simple. Lucene operators include AND, OR, and NOT:

NOT type: mysql
type: postgresql

Ranges are defined by [ ] brackets with the TO operator in between the values:

age:[11 TO 17]

This will return events with age between 12 and 16

price:{50 TO 500}

This will return events with prices between 51 and 499

name: [Allison TO Zachary]

This will return names between and including Allison and Zachary.

Wildcards can be done using ? or * markers (just not at the beginning of term).

m?ld 

The above will match “meld,” “mild,” and “mold.”

m*ld 

This will match mould, midfield, millionfold, minefield, etc.

Elasticsearch Query DSL

Elasticsearch Query DSL is built on JSON, though it gets inspiration from Lucene. Query DSL uses two different kinds of clauses: leaf query clauses and compound query clauses. Leaves are standard queries. Compounds are, as the name implies, for combining multiple queries.There are multiple classes of queries also, which have particularly high up-front costs and are slower. Those include fuzzy queries, regexp queries, prefix, wildcard, range, joining, script score, and percolate queries (which are obviously exciting).

With all these details, it can be a bit much for more casual users. Later updates to Elasticsearch allowed for more Elasticsearch SQL-friendly requests. On the flip side, Query DSL fills in SQL’s gaps. 

Query clauses can continuously stack within others.

{
 QUERY: {
   QUERY: {
     QUERY: {
       FIELD: {
         ARGUMENT: VALUE,
         ARGUMENT: VALUE,...
         ARGUMENT: VALUE
       }
     }
   }
 }
}

Its Boolean operators are very different: must, must_not, filter, and should.

You can add parameters to an SQL inquiry to include a Query DSL request inside of it:

GET _xpack/sql/translate
{
  "query": "SELECT avg(system.process.memory.rss.bytes) FROM metricbeat* GROUP BY system.process.name"
}

In order to convert an SQL query to Elasticsearch Query DSL, use the following command with the SQL query:

GET _xpack/sql/translate
{
  "query": "SELECT avg(system.process.memory.rss.bytes) FROM metricbeat* GROUP BY system.process.name"
}

PromQL

PromQL is the DSL for Prometheus, the most prolific time series database in use for metrics. The language actually traces its own roots back to Go, so if you notice any dialectal similarities, you can understand why. Its import has increased as Prometheus started eclipsing other services in popularity. Its main advantage though, and perhaps one of the reasons Prometheus actually got that popular, is in its concise syntax.

PromQL is a nested function language, where one line can contain multiple queries, whose values can then be used in a larger function. A common combo of functions is sum(rate()), but there are plenty of others.

An example:

sum(rate(http_requests_total{job="prometheus"}[5d]

A similar feature, subqueries, lets you interpret data in relative ways.

rate(http_requests_total[4m])[2h:30s]

For anyone wondering, Grafana – the popular open source metrics visualization tool – does not have its own querying language. In fact, it will work with whatever the relevant data calls for. That could be PromQL or another like InfluxQL.

Kibana Query Language 

KQL — not to be confused with Kusto which we will mention below — is a separate language from Elasticsearch Query DSL. Kibana Query Language does not support regex or fuzzy terms (like ES Query DSL). However, it does support nesting fields within queries and scripted fields as in ES Query DSL. Kibana provides the built-in option to turn KQL off and use Lucene instead.

KQL supports terms queries, boolean queries, wildcards, and range queries (including date ranges).

Its boolean operators are again and, or, and not (in that order). 

response:(200 or 404)

To ensure the right order of operations, use parentheses:

(response:200 or response:404) and (extension:php or extension:css)

Kusto Query Language

Kusto is the main QL used by Azure Data Explorer. The explorer uses SQL queries, but relies on Kusto queries for more complex analytical queries like column calculation, join functions, groupby aggregation, and row filters. It comes up when dealing with Azure Log Analytics and Azure Application Insights.

Its focus is more on speed than detail, prioritizing data ingestion and performance over in-place row or cross-tabular updates. That being said, it deals with structured and semi-structured nested queries very well, all while handling free text too.

The following Kusto query example creates a table with two columns (location and address) whose values are strings:

.create table Logs (Location:string, Address:string)

Microsoft has built-in transitions for Kusto learners. If you preface an SQL query with EXPLAIN inside the Azure Data Explorer, a translation to Kusto will follow. Microsoft also maintains excellent documentation on Kusto and a fuller list of SQL-Kusto translations.

A Linguistic Oasis

Besides the fact that coming up with non-generic sub-headers is evidently difficult, the title of this section describes that’s out there pretty well. There are of course many, many more query-specific DSLs to at least be aware of.

Most on this list have to do with querying databases or are directly involved in logging and monitoring. Some prominent open source tools don’t have their own QLs, and some will be more complicated than others. But if you’re getting more into monitoring and logging, getting some familiarity with the ones mentioned above would be extremely helpful.

Observability at scale, powered by open source

Internal

× Announcing Logz.io’s native integration with Azure for frictionless observability Learn More