A Practical Guide to Kubernetes Logging

Kubernetes has become the de facto industry standard for container orchestration. It provides the required abstraction for efficiently managing large-scale containerized applications with declarative configurations, an easy deployment mechanism, and both scaling and self-healing capabilities.

As with any system, logs give engineers observability into containers and the Kubernetes clusters they’re running on, and their key role is evident in many incident reports involving Kubernetes failures. Yet Kubernetes poses a set of unique logging challenges.

Kubernetes is a highly distributed and dynamic environment. In production, you’ll most likely be running dozens of machines with hundreds of containers that can be terminated, restarted, or rescheduled at any point in time. This transient and dynamic nature of the system is a challenge in itself. Kubernetes clusters also consist of multiple layers that need to be monitored, each producing different types of logs.

Worried? Don’t be. Thankfully, there is a lot of literature available on how to gain visibility into Kubernetes. There are also various logging tools that integrate natively with Kubernetes to make the task easier. In this article, we’ll review some of these tools as well as the Kubernetes logging architecture.

Kubernetes logging architecture

As mentioned, one main challenge with logging Kubernetes is understanding what logs are generated and how to use them. Let’s start by examining the Kubernetes logging architecture from a bird’s-eye view.

Container logging

The first layer of logs that can be collected from a Kubernetes cluster are those being generated by your containerized applications. The easiest method for logging containers is to write to the standard output (stdout) and standard error (stderr) streams.

Let’s take a look at an example pod manifest that will result in running one container logging to stdout: 

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
  - name: example
    image: busybox
    args: [/bin/sh, -c, 'while true; do echo $(date); sleep 1; done']

To apply the manifest, run:

kubectl apply -f example.yaml 

To take a look at the logs for this container, we’ll use the kubectl logs <pod-name> command.
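
For the pod defined above (named example), reading or streaming those logs looks like this:

kubectl logs example        # print everything the container has written to stdout/stderr
kubectl logs -f example     # stream new log lines as they are written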

For persisting container logs, the common approach is to write logs to a log file and then use a sidecar container:

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
  - name: example
    image: busybox
    args:
    - /bin/sh
    - -c
    - >
      while true;
      do
        echo "$(date)\n" >> /var/log/example.log;
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: sidecar
    image: busybox
    args: [/bin/sh, -c, 'tail -f /var/log/example.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}

As seen in the pod configuration above, a sidecar container runs in the same pod as the application container, mounting the same volume and processing the logs separately.
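
Because the sidecar tails the file to its own stdout, the log lines become visible through the standard kubectl tooling again (assuming the pod and container names from the manifest above):

kubectl logs example -c sidecar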

Node logging

When a container running on Kubernetes writes its logs to stdout or stderr streams, the container engine streams them to the logging driver configured in Kubernetes. 

In most cases, these logs will end up in the /var/log/containers directory on your host. Docker supports multiple logging drivers, but unfortunately, driver configuration is not supported via the Kubernetes API.
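
You can inspect these files directly on the node; the file names generally encode the pod name, namespace, container name, and container ID (the exact pattern can vary across container runtimes and Kubernetes versions):

ls /var/log/containers/
# example: <pod-name>_<namespace>_<container-name>-<container-id>.log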

Once a container is terminated or restarted, kubelet keeps its logs on the node. To prevent these files from consuming all of the host’s storage, the Kubernetes node implements a log rotation mechanism. In addition, when a pod is evicted from the node, all of its containers are evicted along with their corresponding log files.

Depending on what operating system and additional services you’re running on your host machine, you might need to take a look at additional logs. For example, systemd logs can be retrieved using the following command:

journalctl -u <service-name>

For example, to view the Docker daemon’s logs:

$ journalctl -u docker
-- Logs begin at Wed 2019-05-29 10:59:24 CEST, end at Mon 2019-07-15 10:55:17 CEST. --
jul 29 10:59:35 thinkpad systemd[1]: Starting Docker Application Container Engine...
jul 29 10:59:35 thinkpad dockerd[2172]: time="2019-05-29T10:59:35.285765854+02:00" level=info msg="libcontainerd: started new docker-containerd process" p
jul 29 10:59:35 thinkpad dockerd[2172]: time="2019-05-29T10:59:35.286021587+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc

Logging kernel events might also be required in some scenarios. You might, for example, need kernel logs to debug issues with mounting external volumes:

$ dmesg
[ 0.000000] microcode: microcode updated early to revision 0xb4, date = 2019-04-01
[ 0.000000] Linux version 4.15.0-54-generic (buildd@lgw01-amd64-014) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 (Ubuntu 4.15.0-54.58-generic 4.15.18)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.15.0-54-generic root=UUID=6e228d30-6415-4b41-b992-172d6899693e ro quiet splash vt.handoff=1
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'

Cluster logging

On the level of the Kubernetes cluster itself, there is a long list of cluster components that can be logged, as well as additional data types that can be used (events, audit logs). Together, these different types of data can give you visibility into how Kubernetes is performing as a system.

Core Kubernetes components

The following core components comprise a Kubernetes cluster:

  • Kube-apiserver – the entry point to the cluster
  • Kubelet with container runtime – the primary node agent focused on running containers
  • Kube-proxy – the part that does TCP or UDP forwarding based on iptables or ipvs
  • Kube-scheduler – the element that determines where to run containers
  • Etcd – the key-value store used as Kubernetes’ cluster configuration storage

Some of these components run in a container, and some of them run on the operating system level (in most cases, a systemd service). The systemd services write to journald, and components running in containers write logs to the /var/log directory, unless the container engine has been configured to stream logs differently.
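
In practice, that means two different commands depending on the component; for example, on a systemd-based node (the control-plane pod name below follows the kubeadm convention and will differ per cluster):

journalctl -u kubelet
kubectl logs -n kube-system kube-apiserver-<node-name>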

Events

Kubernetes events can indicate any Kubernetes resource state changes and errors, such as exceeded resource quota or pending pods, as well as any informational messages.

The kubectl get events -n <namespace> command returns all events within a specific namespace:

NAMESPACE     LAST SEEN   TYPE     REASON             OBJECT                                  MESSAGE
kube-system   8m22s       Normal   Scheduled          pod/metrics-server-66dbbb67db-lh865     Successfully assigned kube-system/metrics-server-66dbbb67db-lh865 to aks-agentpool-42213468-1
kube-system   8m14s       Normal   Pulling            pod/metrics-server-66dbbb67db-lh865     Pulling image "aksrepos.azurecr.io/mirror/metrics-server-amd64:v0.2.1"
kube-system   7m58s       Normal   Pulled             pod/metrics-server-66dbbb67db-lh865     Successfully pulled image "aksrepos.azurecr.io/mirror/metrics-server-amd64:v0.2.1"
kube-system   7m57s       Normal   Created            pod/metrics-server-66dbbb67db-lh865     Created container metrics-server
kube-system   7m57s       Normal   Started            pod/metrics-server-66dbbb67db-lh865     Started container metrics-server
kube-system   8m23s       Normal   SuccessfulCreate   replicaset/metrics-server-66dbbb67db    Created pod: metrics-server-66dbbb67db-lh865
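
Events are only retained for a limited time (one hour by default), so it is often useful to view them across all namespaces sorted chronologically; one way to do that:

kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp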

Using kubectl describe pod <pod-name> will show the latest events for this specific Kubernetes resource:

Events:
  Type    Reason     Age   From                               Message
  ----    ------     ----  ----                               -------
  Normal  Scheduled  14m   default-scheduler                  Successfully assigned kube-system/coredns-7b54b5b97c-dpll7 to aks-agentpool-42213468-1
  Normal  Pulled     13m   kubelet, aks-agentpool-42213468-1  Container image "aksrepos.azurecr.io/mirror/coredns:1.3.1" already present on machine
  Normal  Created    13m   kubelet, aks-agentpool-42213468-1  Created container coredns
  Normal  Started    13m   kubelet, aks-agentpool-42213468-1  Started container coredns

Audit logs

Audit logs can be useful for compliance, as they help you answer the questions of what happened, who did it, and when.

Kubernetes provides flexible auditing of kube-apiserver requests based on policies. These help you track all activities in chronological order.

Here is an example of an audit log:

{
  "kind":"Event",
  "apiVersion":"audit.k8s.io/v1beta1",
  "metadata":{ "creationTimestamp":"2019-08-22T12:00:00Z" },
  "level":"Metadata",
  "timestamp":"2019-08-22T12:00:00Z",
  "auditID":"23bc44ds-2452-242g-fsf2-4242fe3ggfes",
  "stage":"RequestReceived",
  "requestURI":"/api/v1/namespaces/default/persistentvolumeclaims",
  "verb":"list",
  "user": {
    "username":"user@example.org",
    "groups":[ "system:authenticated" ]
  },
  "sourceIPs":[ "172.12.56.1" ],
  "objectRef": {
    "resource":"persistentvolumeclaims",
    "namespace":"default",
    "apiVersion":"v1"
  },
  "requestReceivedTimestamp":"2019-08-22T12:00:00Z",
  "stageTimestamp":"2019-08-22T12:00:00Z"
}
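
Audit logging is not enabled by default: the kube-apiserver has to be started with a policy file (the --audit-policy-file flag) and a log destination such as --audit-log-path. A minimal policy that records request metadata for every request, like the event above, might look like this sketch:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata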

Kubernetes logging tools

Hopefully, you’ve now got a better understanding of the different logging layers and log types available in Kubernetes. The logging tools reviewed in this section play an important role in putting all of this together to build a Kubernetes logging pipeline. 

Fluentd

Fluentd is a popular open-source log aggregator that allows you to collect various logs from your Kubernetes cluster, process them, and then ship them to a data storage backend of your choice. 

Fluentd is Kubernetes-native and integrates seamlessly with Kubernetes deployments. The most common method for deploying Fluentd is as a DaemonSet, which ensures a Fluentd pod runs on each node. Similar to other log forwarders and aggregators, Fluentd appends useful metadata fields to logs, such as the pod name and Kubernetes namespace, which helps provide more context.
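
A production manifest usually also includes RBAC objects and backend-specific environment variables, but the core of a Fluentd DaemonSet is simply a pod on every node with the host’s log directories mounted. A trimmed-down sketch (the image tag is a placeholder; pick the variant that matches your backend):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        # Placeholder image – the fluent/fluentd-kubernetes-daemonset images ship per-backend variants
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: dockercontainers            # needed when Docker is the container runtime
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers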

ELK Stack

The ELK Stack (Elasticsearch, Logstash and Kibana) is another very popular open-source tool used for logging Kubernetes, and is actually made up of four components:

  • Elasticsearch – provides a scalable, RESTful search and analytics engine for storing Kubernetes logs
  • Kibana – the visualization layer, providing a user interface for querying and visualizing logs
  • Logstash – the log aggregator used to collect and process the logs before sending them into Elasticsearch
  • Beats – Filebeat and Metricbeat are ELK-native lightweight data shippers used for shipping log files and metrics into Elasticsearch

ELK can be deployed on Kubernetes as well, on-prem or in the cloud. 
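
If you want to run the stack inside the cluster itself, Elastic publishes Helm charts for the core components; a minimal install with default values looks roughly like this (chart names current at the time of writing):

helm repo add elastic https://helm.elastic.co
helm repo update
helm install elasticsearch elastic/elasticsearch
helm install kibana elastic/kibana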

Together, these four components provide Kubernetes users with an end-to-end logging solution. As effective as it is, deploying and managing ELK deployments at scale is a challenge unto itself. 

Logz.io offers users a fully managed option for using the stack to log Kubernetes, with built-in integrations and monitoring dashboards. More information on logging Kubernetes with Logz.io’s ELK solution can be found here.

Google Stackdriver

And last but not least…Google Stackdriver.

Stackdriver is another Kubernetes-native logging tool that provides users with a centralized logging solution. Recently, Stackdriver also added support for Prometheus. If you’re using GKE, Stackdriver can be easily enabled using the following command:

gcloud container clusters create [CLUSTER_NAME] \
  --zone [ZONE] \
  --project [PROJECT_ID] \
  --enable-stackdriver-kubernetes \
  --cluster-version=latest

For more information on using Stackdriver to log Kubernetes, check out Logging Using Stackdriver.

Endnotes

Once a cluster is up and running with logging in place, you can make sure your workloads and underlying infrastructure stay healthy. Logging also helps you to be prepared for issues that may arise during the deployment of a new production release and stop them before they affect the customer’s experience. 

It takes time to implement production-ready logging for your services, as well as to set up alerts and tune them appropriately. However, an effective logging solution allows you to focus on monitoring your key business metrics, which, in turn, increases the reliability of your products and your company’s revenue. 

To learn more, contact us or visit our blog.

Useful Commands Cheat Sheet

Some useful kubectl commands are listed below:

kubectl logs <pod-name> -f                        # stream logs
kubectl logs <pod-name> --since=1h                # return logs newer than a relative duration
kubectl logs <pod-name> --since-time=<RFC3339>    # return logs after a specific date (RFC3339)
kubectl logs <pod-name> --previous                # print the logs for the previous instance of the container
kubectl logs <pod-name> -c <container-name>       # print the logs of a specific container
kubectl logs -l <label>=<value>                   # print the logs of all pods matching a label selector