Logs have been around since the advent of computers and have probably not changed all that much since. What has changed, however, are the applications and systems generating them.
Modern architectures, meaning both the software and the infrastructure it is deployed on, have undergone vast changes over the past decade or so with the move to cloud computing and distributed environments. For engineers logging in this new world, these developments have resulted in a new set of challenges: huge and ever-growing logging pipelines that exact a cost in both the time and the money invested in developing and managing logs.
No one really likes logs. But almost everyone recognizes their importance. Several methodologies have emerged over the past few years to help engineers overcome this dissonance and make logging workflows more efficient and user-friendly. Let’s take a closer look.
Structured logging

Logs are extremely easy to create. Most applications provide a built-in logging mechanism that generates log output, usually to a file. Most modern applications can also output in JSON format, which makes handling the logs a much easier task.
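As a concrete illustration, here is a minimal sketch, not tied to any particular framework, of emitting one JSON object per log line with Python's standard logging module. The logger name and the set of fields are illustrative choices, not a standard:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order created")
```

Because every line is valid JSON, downstream tools can parse fields directly instead of regex-matching free text.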
Logs, however, are not always easy to analyze, especially alongside other log types. In most environments, you will be looking at multiple log types, each varying in structure and format. When you try to connect the bits and pieces across data sources into a coherent story, this inconsistency poses a huge obstacle.
Structured logging, that is, standardizing log formats across applications, is not a new concept. Yet the understanding that it is a prerequisite for better observability into modern IT environments has turned it into a best practice. ECS (Elastic Common Schema) is a great example of how structured logging has become a central piece of logging workflows today.
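To make the idea concrete, here is a small sketch of shaping a log event around a few core ECS field names such as "@timestamp", "log.level", "message", and "service.name". The service name and the ECS version string are illustrative values, not prescriptions:

```python
import datetime
import json

def to_ecs(level, message, service):
    """Map a log event onto a few core ECS field names."""
    return {
        "@timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "log.level": level.lower(),
        "message": message,
        "service.name": service,
        "ecs.version": "8.11.0",  # illustrative; pin to the schema version you target
    }

print(json.dumps(to_ecs("INFO", "payment accepted", "checkout")))
```

With every service emitting the same field names, querying across data sources becomes a matter of filtering on shared keys rather than translating between formats.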
Shifting left

The topic of structured logging provides a nice segue into the next trend: shifting left. In the context of logging, shifting left means making logging a core element of the development workflow rather than an afterthought. All too often, engineers learn the importance of logging the hard way: after a critical error takes place without a log providing the context needed to resolve it.

In the spirit of DevOps, shifting left with logging means giving developers full responsibility for implementing logging in their code across all the initial development stages: design, code review, testing, and onwards into production. The sooner logs are hardcoded into the application's logic, the more visibility will be gained throughout the application delivery lifecycle.

Moreover, automation is being added into the mix to make logging a much easier process. For example, companies like Grab have tied logging into their configuration management system to automatically change log levels during runtime. This way they can be sure the correct logs are being generated when necessary.

The cost of logging

The costs accompanying logging are real. Organizations pay a hefty amount of money for logging infrastructure or log management solutions. But logging also entails another type of cost: the time spent implementing logging within the architecture.

As described above, shifting left and implementing structured logging can help formalize the logging workflows within an organization. The truth of the matter, however, is that many teams still struggle with some of the most mundane and basic elements of logging. This has given rise to new methodologies and accompanying technologies that minimize this friction.

Rookout, for example, allows developers to add log lines on the fly, without restarts or redeployments, in development, staging, and production. These logs can later be delivered to any third-party log management tool for aggregation and analysis. The idea behind this approach is to help developers improve their signal-to-noise ratio and get the visibility they need even if they didn't add logs to the code in advance.
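The runtime log-level switching described above, of the kind Grab automates, can be sketched minimally as follows. The configuration store here is just a dict standing in for a real configuration management system, and the logger name is hypothetical:

```python
import logging

logger = logging.getLogger("payments")
logger.setLevel(logging.WARNING)  # quiet by default

def apply_log_config(config):
    """Apply a log level pulled from a config store.

    `config` is a plain dict here, standing in for whatever the
    configuration management system actually delivers at runtime.
    """
    name = config.get("log_level", "WARNING").upper()
    logger.setLevel(getattr(logging, name, logging.WARNING))

# Pushed by operators during an incident; takes effect without a restart:
apply_log_config({"log_level": "debug"})
```

The key property is that verbosity becomes an operational knob rather than a code change, so debug detail can be turned on exactly when it is needed and off again afterwards.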
“Log less” logging
As mentioned above, logs come at a cost. Modern applications and the infrastructure they are deployed on can be extremely verbose. There is a high price tag attached to collecting and storing all of the data they generate. Plus, the sheer amount of logs generated can obscure the visibility we were trying to gain with logging in the first place.
These two pitfalls, cost and obscured visibility, have led engineers to question the wisdom of implementing logging across the entire system. Instead, a more "log less" strategy is adopted, in which only critical components are logged, or logs are merely sampled over time. Despite the risk of limited visibility in times of crisis, the "logging less" school is slowly gaining momentum.
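One common way to "log less" is sampling at the source. Here is a minimal sketch using Python's standard logging filters: errors always pass through, while lower-severity records are kept at a fixed rate. The logger name and the 10% rate are illustrative:

```python
import logging
import random

class SamplingFilter(logging.Filter):
    """Always pass errors; sample lower-severity records at a fixed rate."""
    def __init__(self, rate):
        super().__init__()
        self.rate = rate  # fraction of INFO/DEBUG records to keep

    def filter(self, record):
        if record.levelno >= logging.ERROR:
            return True          # never drop errors
        return random.random() < self.rate

logger = logging.getLogger("api")
logger.addFilter(SamplingFilter(rate=0.1))  # keep roughly 10% of routine logs
```

The trade-off is exactly the one described above: storage and ingestion costs drop sharply, at the price of losing some routine records that might have been useful in a post-mortem.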
The compromise over what to log and what not to log can be a painful one to make. Realizing this, and understanding the growing costs of logging, we’ve introduced a series of cost and logging optimization capabilities. You can read more about this here.
It’s hard to overstate the importance of logs in gaining visibility into a system. If structured correctly, they contain a wealth of information about what happened and when, and, most importantly, they provide the context for understanding why. That’s why logs are still the backbone of monitoring systems.
But yes, they’re not easy to handle. Teams spend time figuring out what to log, how to log, and when to log. Logging a distributed system today can result in millions of log messages that, taken together, are more noise than anything else.
That is, unless logging is given a more prominent place within the engineering process. For teams rethinking their logging pipelines, or just starting out with implementing logging workflows, the methodologies described above will help you make logs great again.