An important element of operating Kubernetes is monitoring. Hosted Kubernetes services simplify the deployment and management of clusters, but setting up logging and monitoring is still mostly up to us. Yes, Kubernetes offers built-in monitoring plumbing that makes it easier to ship logs to either Stackdriver or the ELK Stack, but these two endpoints, as well as the data pipeline itself, still need to be set up and configured.
GKE simplifies matters considerably compared to other hosted Kubernetes solutions by enabling Stackdriver logging for newly created clusters, but many users still find Stackdriver lacking compared to more robust log management solutions. Parsing is still problematic and often requires extra customization of the default fluentd pods deployed when a cluster is created. Querying and visualizing data in Stackdriver is possible but not easy, not to mention that there is no way to create alerts to get notified of specific events.
In this article, I’ll show how to collect and ship logs from a Kubernetes cluster deployed on GKE to Logz.io’s ELK Stack using a fluentd daemonset.
I use Cloud Shell to connect to and interact with the cluster, but if you are using your local CLI, kubectl and gcloud need to be installed and configured. To deploy a sample application that generates some log data, you can follow this article.
Step 1: Creating a new project
If you already have a GCP project, just skip to the next step.
I recommend you start by creating a new project for your Kubernetes cluster — this will enable you to sandbox your resources more easily and safely.
In the console, simply click the project name in the menu bar at the top of the page, click New Project, and enter the details of the new project:
Step 2: Creating your cluster
If you already have a Kubernetes cluster running on GKE, skip to the next step.
Open the Kubernetes Engine page in the console and click the Create cluster button (the first time you access this page, the Kubernetes API will be enabled, which might take a minute or two):
GKE offers a number of cluster templates, but for this tutorial we will make do with the template selected by default, a Standard cluster, and the default settings provided for it (for more information on deploying a Kubernetes cluster on GKE, check out this article).
Just hit the Create button at the bottom of the page. After a minute or two, your Kubernetes cluster will be deployed and available for use.
To connect to the newly created cluster, you will need to configure kubectl to communicate with it. You can do this via your CLI or using GCP’s Cloud Shell.
For the latter, simply click the Connect button on the right, and then the Run in Cloud Shell button.
The command to connect to the cluster is already entered in Cloud Shell:
Hit Enter to connect:
Fetching cluster endpoint and auth data.
kubeconfig entry generated for daniel-cluster.
To test the connection, use:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-standard-cluster-1-default-pool-227dd1e4-4vrk Ready <none> 15m v1.11.7-gke.4
gke-standard-cluster-1-default-pool-227dd1e4-k2k2 Ready <none> 15m v1.11.7-gke.4
gke-standard-cluster-1-default-pool-227dd1e4-k79k Ready <none> 15m v1.11.7-gke.4
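As a quick sanity check before moving on, you can verify that every node reports a Ready status. The sketch below inlines a sample of the output above so it is self-contained; against a live cluster you would pipe kubectl get nodes instead:

```shell
# Count nodes that are NOT in the Ready state.
# Sample output inlined for illustration; live, replace with: kubectl get nodes
nodes_output='NAME STATUS ROLES AGE VERSION
gke-standard-cluster-1-default-pool-227dd1e4-4vrk Ready <none> 15m v1.11.7-gke.4
gke-standard-cluster-1-default-pool-227dd1e4-k2k2 Ready <none> 15m v1.11.7-gke.4
gke-standard-cluster-1-default-pool-227dd1e4-k79k Ready <none> 15m v1.11.7-gke.4'

# Skip the header line, then keep rows whose STATUS column is not "Ready".
not_ready=$(printf '%s\n' "$nodes_output" | tail -n +2 | awk '$2 != "Ready"' | wc -l)
echo "nodes not ready: $not_ready"
```

A non-zero count here is worth investigating before deploying anything to the cluster.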
Step 3: Enabling RBAC
To ship Kubernetes cluster logs to Logz.io, we will be using a fluentd daemonset with defined RBAC settings. RBAC helps you apply finer-grained control over who is accessing different resources in your Kubernetes cluster.
Before you deploy the daemonset, first grant your user the ability to create roles in Kubernetes by running the following command (replace [USER_ACCOUNT] with the user’s email address):
kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin --user [USER_ACCOUNT]
And the output:
clusterrolebinding.rbac.authorization.k8s.io "cluster-admin-binding" created
Step 4: Deploying the fluentd daemonset
Fluentd pods are deployed by default in a new GKE cluster for shipping logs to Stackdriver, but we will be deploying a dedicated daemonset for shipping logs to Logz.io. A fluentd pod will be deployed on every node in your cluster, configured to ship the stderr and stdout logs of the containers running on that node to Logz.io.
First, clone the Logz.io Kubernetes repo:
git clone https://github.com/logzio/logzio-k8s/
Change into the cloned directory and open the daemonset configuration file:
cd logzio-k8s
vim logzio-daemonset-rbc.yaml
The file defines, among other things, a fluentd ServiceAccount, a toleration for the node-role.kubernetes.io/master taint, the LOGZIO_TOKEN and LOGZIO_URL environment variables, and the varlog and varlibdockercontainers volume mounts that give fluentd access to the container log files on each node.
Enter the values for the following two environment variables in the file:
- LOGZIO_TOKEN – your Logz.io account token. Can be retrieved from within the Logz.io UI, on the Settings page.
- LOGZIO_URL – the Logz.io listener URL. If your account is in the EU region, insert https://listener-eu.logz.io:8071. Otherwise, use https://listener.logz.io:8071. You can tell your account’s region by checking your login URL – app.logz.io means you are in the US, app-eu.logz.io means you are in the EU.
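The region rule above can be expressed as a tiny shell helper. This is a hypothetical convenience for illustration, not part of the Logz.io repo:

```shell
# Hypothetical helper: pick the Logz.io listener URL from the account
# region ("eu" vs. anything else), following the rule described above.
listener_url() {
  case "$1" in
    eu) echo "https://listener-eu.logz.io:8071" ;;
    *)  echo "https://listener.logz.io:8071" ;;
  esac
}

listener_url eu   # EU accounts
listener_url us   # everyone else
```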
Save the file.
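If you prefer to make the change from the shell rather than an editor, the same edit can be scripted with sed. The placeholder strings below are an assumption about the file's contents, so check the actual logzio-daemonset-rbc.yaml before running anything like this against it; the sketch writes its own demo file so it is safe to try anywhere:

```shell
# Demo file standing in for the env section of the daemonset YAML.
# <ACCOUNT-TOKEN> and <LISTENER-URL> are assumed placeholder values.
cat > demo-daemonset.yaml <<'EOF'
        env:
        - name: LOGZIO_TOKEN
          value: "<ACCOUNT-TOKEN>"
        - name: LOGZIO_URL
          value: "<LISTENER-URL>"
EOF

# Substitute both values in place (GNU sed syntax).
sed -i 's|<ACCOUNT-TOKEN>|my-logzio-token|; s|<LISTENER-URL>|https://listener.logz.io:8071|' demo-daemonset.yaml

grep 'value:' demo-daemonset.yaml
```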
All that’s left is to deploy the daemonset with:
kubectl create -f logzio-daemonset-rbc.yaml
The output confirms that all the required resources were created:
serviceaccount "fluentd" created
clusterrole.rbac.authorization.k8s.io "fluentd" created
clusterrolebinding.rbac.authorization.k8s.io "fluentd" created
daemonset.extensions "fluentd-logzio" created
You can verify that the pods were created with:
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
event-exporter-v0.2.3-85644fcdf-bngk9 2/2 Running 0 2h
fluentd-gcp-scaler-8b674f786-d6lmb 1/1 Running 0 2h
fluentd-gcp-v3.2.0-h4zg5 2/2 Running 0 2h
fluentd-gcp-v3.2.0-qrfbs 2/2 Running 0 2h
fluentd-gcp-v3.2.0-wrn46 2/2 Running 0 2h
fluentd-logzio-9knkr 1/1 Running 0 1m
fluentd-logzio-nfxwz 1/1 Running 0 1m
fluentd-logzio-xtcq4 1/1 Running 0 1m
heapster-v1.6.0-beta.1-69878744d4-n9lsf 3/3 Running 0 2h
kube-dns-6b98c9c9bf-5286g 4/4 Running 0 2h
kube-dns-6b98c9c9bf-xr9nh 4/4 Running 0 2h
kube-dns-autoscaler-67c97c87fb-9spmf 1/1 Running 0 2h
kube-proxy-gke-daniel-cluster-default-pool-de9fa6e4-0cl9 1/1 Running 0 2h
kube-proxy-gke-daniel-cluster-default-pool-de9fa6e4-5lvx 1/1 Running 0 2h
kube-proxy-gke-daniel-cluster-default-pool-de9fa6e4-d7bb 1/1 Running 0 2h
kubernetes-dashboard-69db8c7745-wnb4n 1/1 Running 0 2h
l7-default-backend-7ff48cffd7-slxrj 1/1 Running 0 2h
metrics-server-v0.2.1-fd596d746-f4ljm 2/2 Running 0 2h
As seen here, a fluentd pod has been deployed for each node in the cluster (the other fluentd pods are the default pods deployed when creating a new GKE cluster).
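To keep an eye on just the Logz.io shippers among all the kube-system pods, you can filter the listing with grep. The sketch inlines a sample of the output above so it runs anywhere; live, you would pipe kubectl get pods --namespace=kube-system into the same grep:

```shell
# Sample of the pod listing above, inlined for illustration.
pods='fluentd-gcp-v3.2.0-h4zg5 2/2 Running 0 2h
fluentd-logzio-9knkr 1/1 Running 0 1m
fluentd-logzio-nfxwz 1/1 Running 0 1m
fluentd-logzio-xtcq4 1/1 Running 0 1m'

# Count the Logz.io fluentd pods; this should match the number of nodes.
printf '%s\n' "$pods" | grep -c '^fluentd-logzio'
```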
In Logz.io, you will see container logs displayed on the Discover page in Kibana after a minute or two:
Step 5: Analyzing your Kubernetes cluster logs
Congrats. You’ve built a logging pipeline from a Kubernetes cluster on GKE to Logz.io. What now? How do you make sense of all the log data being generated by your cluster?
By default, container logs are shipped in JSON format using Docker’s json-file logging driver. This means that they will be parsed automatically by Logz.io, which makes it much easier to slice and dice the data with the analysis tools the platform provides.
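The json-file driver wraps each line a container writes in a small JSON record. The sample record below illustrates the format; the sketch pulls the raw message back out in plain shell (jq would be cleaner if it is installed):

```shell
# A sample json-file record, as Docker writes one per line of container output.
record='{"log":"GET /healthz 200\n","stream":"stdout","time":"2019-03-01T12:00:00.000000000Z"}'

# Extract the raw log line from the "log" field.
msg=$(printf '%s' "$record" | sed 's/.*"log":"\([^"]*\)".*/\1/')
printf '%s\n' "$msg"
```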
Still, you may find that some messages require some extra parsing, in which case you can, of course, tweak parsing in fluentd or simply ping Logz.io’s support team for some help.
You can query the data in a variety of ways. You could perform a simple free-text search for errors, but Kibana offers much more advanced filtering and querying options that will help you find the information you’re looking for.
For example, the filter box allows you to easily look at logs for a specific pod or container:
Logz.io also provides advanced machine learning capabilities that help reveal events that otherwise would have gone unnoticed within the piles of log messages generated in our environment.
In the example below, Cognitive Insights has flagged an issue with kubelet — the Kubernetes “agent” that runs on each node in the cluster. Opening the event reveals contextual information that helps us understand whether there is a real issue here or not:
If you want to see a live feed of the cluster logs, either in their raw or parsed format, you can use Logz.io’s Live Tail page. Sure, you could use the kubectl logs command to tail logs, but in an environment consisting of multiple nodes and an even larger number of pods, this approach is far from efficient.
Step 6: Building a monitoring dashboard
Kibana is a great tool for visualizing log data. You can create a variety of different visualizations that help you monitor your Kubernetes cluster — from simple metric visualizations to line charts and geographical maps. Below are a few basic examples.
No. of pods
Monitoring the number of pods running will show you if the number of nodes available is sufficient and if they will be able to handle the entire workload in case a node fails. A simple metric visualization can be created to help you keep tabs on this number:
Logs per pod
A noisy pod, or a sudden spike in logs from a specific pod, can indicate that an error is taking place. You could create a bar chart visualization to monitor the log traffic per pod:
Once you have all your visualizations lined up, you can add them to a dashboard that provides a comprehensive picture of how your pods are performing.
Of course, monitoring Kubernetes involves tracking a whole lot more than just container logs — container metrics, Kubernetes metrics and even application metrics — all these need to be collected in addition to the stderr and stdout output of your pods.
In a future article, we will explore how to ship Kubernetes metrics to Logz.io using a daemonset based on Metricbeat. In the meantime, I recommend reading up on Kubernetes monitoring best practices in this article.
GKE combined with the analysis tools provided by the ELK Stack and Logz.io is a powerful pairing that can simplify not only the deployment and management of your Kubernetes cluster but also troubleshooting and monitoring it.