Kubernetes log analysis

Logging is one of the major challenges of any large deployment on platforms such as Kubernetes, but configuring and maintaining a central repository for log collection can ease day-to-day operations. For that purpose, the combination of Fluentd, Elasticsearch, and Kibana can create a powerful logging layer on top of Kubernetes clusters.

In this article, we will describe how to log Kubernetes using dedicated Fluentd, Elasticsearch, and Kibana nodes. To do this, we will define our own pods so that, once our cluster is created, the standard output and standard error of each container are shipped to Elasticsearch by a Fluentd agent and can then be explored in Kibana.


If you are using Google Cloud Platform, a section at the end describes how to use the default logging option for Google Cloud.

Collecting Logs with Fluentd

Fluentd is an open source data collector for building a unified logging layer. In this article, we will be using Fluentd pods to gather all of the logs that are stored within the individual nodes in our Kubernetes cluster (these logs can be found under the /var/log/containers directory on each node).

Installing and Configuring Fluentd

First, we will need to install Fluentd (for instructions, use these installation guides).

Next, we will install fluent-plugin-kubernetes_metadata_filter, a Fluentd plugin that enriches each log record with Kubernetes metadata such as the pod name, namespace, and labels. Kubernetes itself symlinks every container's Docker log file into /var/log/containers/*.log, and the plugin derives the metadata from those file names.
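How the plugin is installed depends on how Fluentd itself was installed; both variants below are standard gem installs:

```shell
# For a td-agent package installation of Fluentd:
$ sudo td-agent-gem install fluent-plugin-kubernetes_metadata_filter

# For a plain Ruby gem installation of Fluentd:
$ sudo gem install fluent-plugin-kubernetes_metadata_filter
```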

Now, to view the logs from the entire cluster, we have to launch a single instance of the Fluentd agent on each of the nodes. Below is the Fluentd configuration file, which provides the path for the logging directory:
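A minimal configuration along these lines tails the container logs, enriches them with Kubernetes metadata, and forwards them to Elasticsearch. The host name elasticsearch-logging is an assumption that matches the Elasticsearch service defined later in this article, and the filter directive requires Fluentd v0.12 or later:

```conf
# Tail all container logs that Kubernetes symlinks into /var/log/containers
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  format json
  read_from_head true
</source>

# Enrich each record with pod, namespace, and label metadata
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

# Ship everything to Elasticsearch in Logstash format
<match **>
  @type elasticsearch
  host elasticsearch-logging
  port 9200
  logstash_format true
</match>
```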

Running Fluentd

Our next step is to run Fluentd on each of our nodes.

The kubelet is the primary "node agent" that runs on each node and can launch pods from PodSpecs written in YAML or JSON. We need to place the pod's definition in the Kubernetes manifests directory at /etc/kubernetes/manifests (if your installation does not have this directory, you will need to configure the kubelet's manifest path first).

To create the Fluentd pod, use the following yaml configuration (fluentd-pod.yaml):
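A sketch of what fluentd-pod.yaml might contain; the image shown here is the community fluentd-elasticsearch image and, like the mount paths, is an assumption to replace with your own values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  containers:
  - name: fluentd-elasticsearch
    # Replace with the image from your own Docker repository
    image: gcr.io/google_containers/fluentd-elasticsearch:1.20
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: varlibdockercontainers
      mountPath: /var/lib/docker/containers
      readOnly: true
  volumes:
  # Host paths where the kubelet and Docker write the log files
  - name: varlog
    hostPath:
      path: /var/log
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers
```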

In this configuration file, you have to provide the Docker image repository from which the container will be pulled.

Once you have the manifest in place, restart the kubelet so that it picks up the definition. The nodes will then start to download and run the container defined in the manifest:
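On systemd-based nodes, for example (an assumption; use your init system's equivalent otherwise):

```shell
$ sudo systemctl restart kubelet
```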

Then, bring the cluster up again using kube-up.sh as shown:
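This assumes the cluster was created with the bundled cluster scripts; the path is relative to the Kubernetes source tree:

```shell
$ cluster/kube-up.sh
```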

You will then need to adjust the resource requests and limits to fit your system. If you wish to change the application label name, make sure that you also change it in the service.

Launching our pods

Now, we have to launch the pod manually. We can do this by configuring the fluentd-pod.yaml file and using the “create” command to launch the pod as follows:
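With the manifest saved as fluentd-pod.yaml:

```shell
$ kubectl create -f fluentd-pod.yaml
```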

You can now check that your pod is up and running:
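Assuming the pod runs in the kube-system namespace:

```shell
$ kubectl get pods --namespace=kube-system
```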

We can check the logs in the Fluentd container by executing the following command:

Use your container ID to see the logs inside the container:
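Both commands below assume Docker as the container runtime, and &lt;container-id&gt; is a placeholder for the ID reported by docker ps:

```shell
# Find the Fluentd container's ID on the node
$ docker ps | grep fluentd

# Follow its logs using that ID
$ docker logs -f <container-id>
```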

(Screenshot: log output from inside the Fluentd container)

Streaming logs from Fluentd into Elasticsearch

Now that we have our Fluentd pods up and running, it’s time to set up the pipeline into Elasticsearch (see our complete guide to the ELK Stack to learn how to install and use Elasticsearch).

Configuring and launching Elasticsearch as a replication controller

Since there is no need to have Elasticsearch running on each and every node, we will first launch a single instance and run it as a replication controller.

Here is the .yaml file in which our Elasticsearch instance is defined as a replication controller:
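A sketch of such a definition; the image name, version, and single-replica setting are assumptions to adapt to your environment:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
spec:
  replicas: 1
  selector:
    k8s-app: elasticsearch-logging
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
    spec:
      containers:
      - name: elasticsearch-logging
        # Replace with the Elasticsearch image you use
        image: gcr.io/google_containers/elasticsearch:1.8
        ports:
        - containerPort: 9200
          name: db
        - containerPort: 9300
          name: transport
```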

To launch the Elasticsearch replication controller, execute the following command:
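Assuming the replication controller definition is saved as elasticsearch-rc.yaml (an illustrative file name):

```shell
$ kubectl create -f elasticsearch-rc.yaml
```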

We can check that the replication controller has been created and is running as expected using this:
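For example, in the kube-system namespace:

```shell
$ kubectl get rc --namespace=kube-system
```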

(Screenshot: kubectl output listing the Elasticsearch replication controller)

Make sure that the pod is also up and running with this command:
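For example:

```shell
$ kubectl get pods --namespace=kube-system
```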

(Screenshot: kubectl output showing the Elasticsearch pod running)

Creating an Elasticsearch service for communicating with Fluentd

To gather the logs from the nodes in our Kubernetes cluster, we need to launch a service that will establish the communication between our Elasticsearch pod and the Fluentd pods running on the nodes.

To ensure that our Fluentd pods will be able to locate the Elasticsearch instance, we are first going to use a Kubernetes service to expose an externally visible name for an endpoint. A Kubernetes service has a single, stable IP address and a DNS name that is resolved by the SkyDNS add-on (which launches automatically in the kube-system namespace when the cluster comes up).

We can then use the pod's label in the Elasticsearch service definition to point to the Elasticsearch container (see the selector section in the service definition below). The service will then be registered in DNS, allowing Fluentd to communicate with the Elasticsearch instance.

The following .yaml file contains the definition for the Elasticsearch service (elasticsearch-svc.yaml):
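A sketch of elasticsearch-svc.yaml along these lines; the names mirror the labels used in the replication controller and are assumptions to adapt:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: 9200
  selector:
    # Groups all pods carrying this label
    k8s-app: elasticsearch-logging
```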

The kubernetes/fluentd-elasticsearch container constantly tails the containers' log files (/var/lib/docker/containers/*) and sends the data, in Logstash format, to Elasticsearch on port 9200.

In our case, the service specification will group together all of the containers that carry the k8s-app=elasticsearch-logging label. It will also map their internal port to 9200.

Launching the Elasticsearch service

To launch the Elasticsearch service, use this command:
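Using the service definition file from the previous section:

```shell
$ kubectl create -f elasticsearch-svc.yaml
```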

We can check that the service is up and running this way:
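For example:

```shell
$ kubectl get services --namespace=kube-system
```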

(Screenshot: kubectl output listing the Elasticsearch service)

Querying the logs with Elasticsearch

With the help of the Elasticsearch service that we have launched, we can now view and query container logs.

We will use the service's cluster IP address to access the Elasticsearch logs and use the 'q=*&pretty' search query to view the logs in a web browser:
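The same query can be made with curl; &lt;cluster-ip&gt; is a placeholder for the service address reported by kubectl, and the quotes keep the shell from interpreting the &:

```shell
$ curl 'http://<cluster-ip>:9200/_search?q=*&pretty'
```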

And the output:

(Screenshot: Elasticsearch query output)

To search for warning messages, use the following query:
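Along the same lines, with &lt;cluster-ip&gt; again being a placeholder:

```shell
$ curl 'http://<cluster-ip>:9200/_search?q=warning&pretty'
```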

The output:

(Screenshot: Elasticsearch warning-message query output)

Analyzing Kubernetes logs in Kibana

Now that we have our logs stored in Elasticsearch, the next step is to display them in Kibana. To do this, we will need to run Kibana in our cluster. Just as with Elasticsearch, we need only one Kibana instance.

Let’s set that up.

Configuring and launching Kibana as a replication controller

Launching Kibana as a replication controller ensures that at least one instance is always running in the cluster.

In our .yaml file (kibana-rc.yaml), we need to specify the Kibana image to use for the pod:
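A sketch of kibana-rc.yaml; the image and the ELASTICSEARCH_URL value (which assumes the Elasticsearch service name used earlier) are values to adapt:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
spec:
  replicas: 1
  selector:
    k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        # Replace with the Kibana image you use
        image: gcr.io/google_containers/kibana:1.3
        env:
        # Points Kibana at the Elasticsearch service via its DNS name
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch-logging:9200
        ports:
        - containerPort: 5601
          name: ui
```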

To launch the Kibana replication controller, execute the following command:
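With the definition saved as kibana-rc.yaml:

```shell
$ kubectl create -f kibana-rc.yaml
```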

To make sure that the Kibana replication controller is up and running, use this:
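For example:

```shell
$ kubectl get rc --namespace=kube-system
```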

(Screenshot: kubectl output listing the Kibana replication controller)

Check that the pod is running with the following command:
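For example:

```shell
$ kubectl get pods --namespace=kube-system
```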

Creating a Kibana service to communicate with Elasticsearch

To allow our pod to retrieve logs from Elasticsearch, we will need to configure a Kibana service.

In the configuration file (kibana-svc.yaml), use the default port 5601 for both the service and Kibana pod:
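A sketch of kibana-svc.yaml; the names mirror the labels from the replication controller and are assumptions to adapt:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
spec:
  ports:
  # Port 5601 on both the service and the Kibana pod
  - port: 5601
    protocol: TCP
    targetPort: 5601
  selector:
    k8s-app: kibana-logging
```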

This service will select our Kibana pod with the help of the k8s-app: kibana-logging label that we provided in the pod definition.

Launching and accessing Kibana

To launch the Kibana service, use this command:
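Using the service definition file from the previous section:

```shell
$ kubectl create -f kibana-svc.yaml
```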

To verify that the service is up and running, use this:

$ kubectl get --namespace=kube-system services

(Screenshot: kubectl output listing the Kibana service)

To view and access the Kibana interface, install an NGINX web server (skip to the next step if one is already installed):
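On Debian or Ubuntu, for example (an assumption; use your distribution's package manager):

```shell
$ sudo apt-get update && sudo apt-get install -y nginx
```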

Change the NGINX configuration files in /etc/nginx/sites-available/default and /etc/nginx/sites-enabled/default as shown here:
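A minimal proxy configuration along these lines would work; &lt;kibana-service-ip&gt; is a placeholder for your Kibana service's cluster IP:

```nginx
server {
    listen 80 default_server;

    location / {
        # Forward everything to the Kibana service
        proxy_pass http://<kibana-service-ip>:5601;
    }
}
```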

Specify your Kibana service IP address and the port on which NGINX should listen; in this case, the listen port is 80 (the default).

Now, you can start the NGINX service:
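For example:

```shell
$ sudo service nginx restart
```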

Go to your browser and enter: http://localhost:80. You should be able to see Kibana:

(Screenshots: the Kibana Discover and Visualize pages)

Logging Kubernetes Using Google Cloud

Since Kubernetes works natively with Google Cloud, users can enable cluster-level logging easily. If we want logging to go through Fluentd and Elasticsearch, we can set the environment variables in the configuration file for our cluster as follows (note that this works only if you use GCP as your Kubernetes provider):
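These were the cluster-script variables for that purpose (names taken from the kube-up configuration of that era; verify them against your Kubernetes version):

```shell
KUBE_ENABLE_CLUSTER_LOGGING=true
KUBE_LOGGING_DESTINATION=elasticsearch
```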

Then, you can bring up and start the cluster with the following command:
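From the Kubernetes source tree:

```shell
$ cluster/kube-up.sh
```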

Once the cluster is up and running, Kubernetes will launch the Fluentd and Elasticsearch pods automatically.

A Final Note

In this tutorial, we used single Elasticsearch and Kibana instances. Needless to say, this is not a scalable solution for larger environments. In a real-life scenario, a more robust and flexible logging system is required for Kubernetes logging.

Easily Configure and Ship Logs with Logz.io ELK as a Service.