Guide to Kubernetes Logging

Originally created by Google and donated to the Cloud Native Computing Foundation, Kubernetes is widely used in production environments to run Docker containers in a fault-tolerant manner. As an open-source product, it is available on various platforms and systems. Google Cloud and Azure, for example, offer managed Kubernetes services, so you don’t need to go through a difficult setup to manage the cluster itself. With Kubernetes, containers can be managed in large clusters spanning thousands of machines. 

Every new technology should solve past problems, but unfortunately, it often creates new ones. How do we keep containers highly available? If we need more computing resources, how do we scale up the number of containers running the same application? If a container crashes, how do we start it again? How do we manage logging when the application may be running on several different machines? Container orchestration tools such as Kubernetes (K8S) are designed specifically to handle these issues (and others besides). 

Logging in Kubernetes 

K8S does not ship with a full-fledged native logging mechanism, but it does offer building blocks that help you add a proper, solid logging solution to a cluster. The difficulty with logging stems from how the cluster manages its applications (or pods): a dead or crashed container might have useful debugging information in its logs, but that information will be difficult, if not impossible, to retrieve, since the container’s storage is destroyed along with it.  

K8S always stores the standard output and standard error of each container. You simply need to run the command kubectl logs <container name> to retrieve all the logs of the desired container. The --previous flag is available to retrieve the logs of a crashed instance, and this might be sufficient for small clusters, with a small number of running containers and instances. However, it becomes very hard to manage cluster logs when the number of applications increases and the cluster runs on several machines. 
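
To illustrate, the basic commands for a single pod might look like this (the pod name is a placeholder):

    # Print the logs of a running pod (pod name is a placeholder)
    kubectl logs my-app-pod

    # Retrieve the logs of the previous, crashed instance of the pod's container
    kubectl logs --previous my-app-pod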

Another solution, which does not depend on cluster configuration, is application-level logging. In this approach, each container is responsible for configuring its own logging. You can use a logging service such as Logz.io to centralize and analyze all the logs produced by your cluster, but this can be difficult and is very error-prone, since each container needs its own proper configuration. Moreover, configuration changes, such as changing the output destination for logs, require redeploying every container that is being monitored.  

The solution proposed in this article, and described in detail below, uses logging agents.  

Logging agents are tools that expose K8S logs and push them to a configured location. Typically, these agents are containers that have access to a directory with logs from all applications on the node. This solution provides a transparent approach, as the deployed applications remain unchanged, and the number of nodes can increase or decrease without any effort to maintain the logging operation. 

To understand how to use logging agents, we first need to understand some underlying K8S concepts.  

Every pod in a K8S cluster has its standard output and standard error captured and stored in the node’s /var/log/containers/ directory. In addition, a special kind of workload called a DaemonSet ensures that a copy of a given pod runs on every node in the cluster. A logging agent can therefore be implemented as a DaemonSet pod that captures the logs written to each node’s /var/log/containers/ directory and processes them in some way.  
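
As a rough illustration only, such a logging-agent DaemonSet might look like the sketch below (the names, namespace, and image tag are assumptions, and a real agent would also mount its fluentd configuration):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: logging-agent                    # illustrative name
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          app: logging-agent
      template:
        metadata:
          labels:
            app: logging-agent
        spec:
          containers:
          - name: fluentd
            image: fluent/fluentd:v1.16-1    # image and tag are illustrative
            volumeMounts:
            - name: varlog
              mountPath: /var/log            # includes /var/log/containers
            - name: dockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
          - name: dockercontainers
            hostPath:
              path: /var/lib/docker/containers   # symlink targets on Docker-based nodes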

The logging approach explored below includes the creation of a fluentd DaemonSet pod to capture the logs. Fluentd is an open-source data collection framework that unifies the collection and consumption of data in a pluggable manner. There are input and output plugins for many kinds of applications. In this article, for example, K8S acts as the producer and the ELK Stack or Logz.io as the consumer.  

To allow these components to work together, a fluentd configuration file defines where the input data is found, how it is processed, and where it is sent. An agent container is then deployed on each running node to collect and forward the logs produced by the cluster and its applications.  
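
A minimal sketch of such a configuration might look like the following (the Elasticsearch host name, the availability of the fluent-plugin-elasticsearch output plugin, and a Docker runtime that writes JSON log lines are all assumptions):

    <source>
      @type tail                                   # tail the per-container log files
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      <parse>
        @type json                                 # assumes JSON-formatted Docker logs
      </parse>
    </source>

    <match kubernetes.**>
      @type elasticsearch                          # requires fluent-plugin-elasticsearch
      host elasticsearch-logging                   # assumed in-cluster service name
      port 9200
      logstash_format true                         # write logstash-* indices for Kibana
    </match>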

For the sake of simplicity, the examples below use Google Cloud to instantiate and handle new clusters.  

Logging K8S with Google Stackdriver 

Stackdriver is Google Cloud’s monitoring and logging service for applications deployed on GCP. If you’re using Google Cloud as your cloud provider, using Stackdriver with K8S is pretty straightforward. If you are not, but still want to use the Stackdriver service, you will need to configure a fluentd service to transform and ship the data.  

To log to Stackdriver, first create a cluster using Google Container Engine (now Google Kubernetes Engine). Don’t forget to select Stackdriver logging on the configuration screen and enable Stackdriver on your account: 

[Image: stackdriver]
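
If you prefer the command line to the console, the cluster can be created roughly as follows (the cluster name and zone are illustrative, and the exact flag names vary between gcloud releases):

    # Create a GKE cluster with Stackdriver (Cloud) logging enabled
    # Note: newer gcloud releases replace this flag with --logging
    gcloud container clusters create stackdriver-logging-demo \
        --zone us-central1-a \
        --num-nodes 3 \
        --enable-cloud-logging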

By simply creating the cluster, you should already see some logging in Stackdriver:

[Image: create cluster]

Before creating an application, we will deploy a ConfigMap and a DaemonSet to properly configure fluentd. K8S provides nice scripts for this. 

To create the ConfigMap: 
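
A likely form of the command (the filename is an assumption; use the ConfigMap manifest provided with those scripts):

    # Create the fluentd ConfigMap (filename assumed)
    kubectl create -f fluentd-configmap.yaml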

To create the DaemonSet:
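
A likely form of the command (again, the filename is an assumption):

    # Create the fluentd DaemonSet so that an agent pod runs on every node (filename assumed)
    kubectl create -f fluentd-daemonset.yaml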

Next, configure a pod to log to Stackdriver. For this article, we will use a simple log generator available on Docker Hub. The random_logger.yaml file is available on GitHub here. 

Run the command below to create the pod: 
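
A likely form of the command, using the manifest file named above:

    # Create the random-logger pod
    kubectl create -f random_logger.yaml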

That’s all there is to it. Logs will appear in Stackdriver as follows: 

[Image: Logs in Stackdriver]

Logging K8S with ELK 

The ELK Stack, AKA the Elastic Stack, is today the de facto industry standard for centralized logging, especially in the world of open source. This section explores logging K8S logs to both the open source stack and Logz.io’s hosted ELK solution.  

As mentioned above, the files presented here are available on GitHub. The example used below is based on a post on Stack Overflow. 

As before, we will start with creating our cluster, this time without logging enabled: 
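
The gcloud equivalent might look roughly like this (the name, zone, and flags are illustrative and vary between gcloud releases):

    # Create a GKE cluster with Stackdriver/Cloud logging disabled
    gcloud container clusters create elk-logging-demo \
        --zone us-central1-a \
        --num-nodes 3 \
        --no-enable-cloud-logging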

Next, we will create a DaemonSet to handle fluentd and the sending of the data to Elasticsearch: 
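
A likely form of the command (the filename is an assumption; use the DaemonSet manifest from the repository referenced above):

    # Deploy the fluentd DaemonSet that ships node logs to Elasticsearch (filename assumed)
    kubectl create -f fluentd-es-daemonset.yaml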

We will then run a series of commands using the provided files: 
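
A likely sequence, applying each of the files described below in order:

    kubectl create -f daemonset-startup.yaml
    kubectl create -f es-controller.yaml
    kubectl create -f es-service.yaml
    kubectl create -f kibana-controller.yaml
    kubectl create -f kibana-service.yaml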

These files were created for this article and perform the following: 

  • ‘daemonset-startup.yaml’ – runs the latest version of Elasticsearch on K8S without having to make manual corrections. 
  • ‘es-controller.yaml’ – creates an Elasticsearch instance on the cluster. The file describes which Docker image should be started and handled by K8S. 
  • ‘es-service.yaml’ – creates the communication layer between the instance and other containers. In this case, it provides a DNS name on the virtual network provided by K8S so that the fluentd DaemonSet can locate it with zero configuration. 
  • ‘kibana-controller.yaml’ – creates a Kibana instance on the cluster. 
  • ‘kibana-service.yaml’ – creates the communication layer that exposes the Kibana instance, allowing us to access Kibana from outside the cluster. 

Next, we will use the following command to access the K8S dashboard. The command safely connects your local machine to the cluster on port 8001. The dashboard can then be accessed at: http://localhost:8001/ui: 
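
The command matching this description is kubectl’s built-in API proxy, which listens on localhost:8001 by default:

    # Proxy the Kubernetes API server (and dashboard UI) to http://localhost:8001
    kubectl proxy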

After the services start, Kibana should be available for analyzing and visualizing the logs. To access it, we will retrieve the service’s public IP via the K8S dashboard by clicking the External Endpoints URL. 

Next, we will create our random logger: 
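
As before, a likely form of the command:

    kubectl create -f random_logger.yaml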

Accessing Kibana, you should see a screen similar to the one below: 

[Image: X Pack]

Kibana is now asking us to define an index pattern. Our solution uses the fluent-plugin-elasticsearch plugin, which creates Logstash-formatted indices, so that is the index pattern suggested by Kibana. 

Once defined, the Discover page in Kibana will display K8S logs as follows:

[Image: K8S logs]

 

You can now use Kibana’s analysis and visualization capabilities to query and visualize the logs.  

An art on its own, querying Kibana can be done in a number of ways. For example, if you want to see only the logs of a specific pod (e.g. random-logger), use a field-level query: 
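
Assuming the fluentd Kubernetes metadata enrichment is in place, the pod name typically lands in a field such as kubernetes.pod_name (the exact field name depends on your fluentd configuration), so a query along these lines should work:

    kubernetes.pod_name: "random-logger"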

Read more about querying in Kibana here. 

Logging K8S with Logz.io 

There are various ways of shipping logs from Kubernetes into Logz.io. This section explains how to use fluentd to push K8S logs into Logz.io. 

Note: Logz.io does not yet include built-in support for Kubernetes. This article will be updated once that integration is introduced.

Assuming you have a Logz.io account already, make sure you have your user token available. The token can be found on the Settings page within the Logz.io UI.  

As before, we will begin by creating a new cluster: 
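
The gcloud equivalent might look roughly like this (the name and zone are illustrative):

    # Create a plain GKE cluster for the Logz.io example
    gcloud container clusters create logz-logging-demo \
        --zone us-central1-a \
        --num-nodes 3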

Then, we will open the daemonset-logz.yaml file (located in the logging_logz folder in the GitHub repo referred to above) and change the key values as follows: 

  • __YOUR_LOGZ_URL_ – the URL for your Logz.io service 
  • __YOUR_LOGZ_TOKEN_ – your personal token, so Logz.io can identify your account 
  • __YOUR_LOGZ_TYPE_ – a tag used to identify the origin of the logs.   

Then, we will create a DaemonSet that allows us to configure fluentd and the Logz.io–fluentd plugin effortlessly (based on this GitHub repo): 
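
A likely form of the command, assuming the file is applied from the logging_logz folder:

    # Deploy the fluentd DaemonSet configured for Logz.io
    kubectl create -f logging_logz/daemonset-logz.yaml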

And last but not least, we will create our random logger again: 
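
As before, a likely form of the command:

    kubectl create -f random_logger.yaml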

After a short while, the random logger’s logs should show up in Logz.io: 

[Image: random logger logs]

Summing it up 

Using the data generated by your K8S cluster and collected into the ELK Stack, you can get a good picture of how your application is performing and more easily drill down into pod and container activity.  

As explained above, Kibana’s analysis and visualization features will allow you to slice and dice the logs in any way you want. As a simple example, you can analyze the log distribution over containers or the volume of logs per hour, coming from each pod.   

[Image: volume of logs]

The sky’s the limit.  

Centralized logging, whether implemented in a simple manner with Stackdriver or more comprehensively with an ELK-based solution, is crucial for being able to analyze and visualize your logs efficiently, and ultimately, for monitoring and troubleshooting your K8S-based environment.
