A List of Resources to Learn Kubernetes' Toughest Subjects

By: Gedalyah Reback

What Are the Hardest Parts of Kubernetes to Learn?

Many enterprises have already adopted Kubernetes or have a Kubernetes migration plan in place, making it clear that the platform is here to stay. While it provides a lot of benefits to its users, to take advantage of them, you need to thoroughly learn Kubernetes and how it works in production. Typically, the most difficult aspects of Kubernetes are learned through experience solving real-world problems. This post will focus on providing resources to help you do exactly that, while also explaining some of the core concepts behind Kubernetes.

1. Cluster Administration

If you have to start somewhere, learn Kubernetes cluster concepts. The major cloud providers (Google, AWS, Azure, etc.) all provide Kubernetes clusters as a service. This is a great place to get started. With just a couple of clicks in their cloud consoles, you can create a cluster.

Furthermore, the cloud providers manage the cluster, performing routine tasks like certificate rotation and version patching. If, however, you have to set up your own Kubernetes clusters, you’ll need to manage everything yourself, making the situation a bit more complex.

Described below are a few of the scenarios you might come across when managing clusters on your own, as well as some resources that can help you become more proficient at this process.

Bootstrap Kubernetes from Scratch

While you can easily use a Kubernetes bootstrapping tool like kubeadm to set up clusters, it’s important to know how a cluster is bootstrapped from scratch. That knowledge will help you tackle production issues that may arise later on. The following two sources explain this well:

Kubernetes the Hard Way on Bare Metal, Oahcran
Kubernetes the Hard Way, Kelsey Hightower

Bootstrap Kubernetes Using kubeadm

Kubeadm automates the configuration you had to do on your own in the previous section. As a result, it’s a popular tool for bootstrapping Kubernetes.

There are multiple architectures to choose from, and, during your setup process, you’ll have to answer questions like, “Should the architecture be highly available?” and, “Is a single master cluster sufficient?” Below are a few guides that will help you make these decisions:

Options for Highly Available topology, Kubernetes Documentation
Creating a single control-plane cluster with kubeadm, Kubernetes Documentation
Kubernetes on CentOS 7 with Firewalld, Platformer Cloud, Nilesh Jayanandana
Demystifying High Availability in Kubernetes Using Kubeadm, Velotio Technologies
Adding Windows nodes, Kubernetes Documentation

Server Patching

You still need to maintain clusters after they are configured. It’s recommended that you update Kubernetes versions as soon as new stable versions are released, and these updates need to be done incrementally, one minor version at a time. In other words, if your current version is 1.15, and the latest version is 1.18, you can’t update your servers directly to 1.18; you have to step through 1.16 and 1.17 first. The following guides will help you to update Kubernetes with minimum disruption to your running workloads:

Upgrading kubeadm clusters, Kubernetes Documentation
Upgrading Windows nodes, Kubernetes Documentation
Kubernetes Upgrade: The Definitive Guide to Do-It-Yourself, Platform9
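
The incremental rule can be sketched in a few lines of Python. This is only an illustration of the planning step, assuming versions are written as "major.minor" strings; the function name upgrade_path is made up for this example and is not part of any Kubernetes tool.

```python
from typing import List


def upgrade_path(current: str, target: str) -> List[str]:
    """List the minor versions to step through, one at a time (hypothetical helper)."""
    cur_major, cur_minor = (int(part) for part in current.split(".")[:2])
    tgt_major, tgt_minor = (int(part) for part in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target version is older than the current version")
    # Every intermediate minor version becomes its own upgrade step.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


print(upgrade_path("1.15", "1.18"))  # ['1.16', '1.17', '1.18']
```

Each entry in the returned list would be a separate upgrade, with the cluster verified as healthy before moving on to the next one.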

Prevent Resource Overuse

When you deploy your workloads in Kubernetes clusters, there’s a chance you’ll end up overusing the cluster node resources, a situation which can eventually lead to system crashes.

You can minimize these risks by learning how Kubernetes pod resource limits and namespace resource quotas work, topics that the following resources tackle in depth:

Managing Compute Resources for Containers, Kubernetes Documentation
Assigning Pods to Nodes, Kubernetes Documentation
Configure Memory and CPU Quotas for a Namespace, Kubernetes Documentation
Kubernetes Resource Quotas, Alibaba Cloud
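
As a rough sketch of what limits and quotas look like in practice, the Python below builds a Pod manifest with CPU and memory requests and limits, plus a ResourceQuota for its namespace, and prints them as JSON (kubectl accepts JSON manifests as well as YAML). The names team-a, web, team-a-quota, the image tag, and all of the numbers are assumptions chosen for illustration.

```python
import json

NAMESPACE = "team-a"  # hypothetical namespace

# A pod whose single container declares what it needs (requests)
# and the most it may consume (limits).
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "namespace": NAMESPACE},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",  # example image
                "resources": {
                    "requests": {"cpu": "250m", "memory": "128Mi"},
                    "limits": {"cpu": "500m", "memory": "256Mi"},
                },
            }
        ]
    },
}

# A quota capping the total requests and limits of all pods in the namespace.
quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "team-a-quota", "namespace": NAMESPACE},
    "spec": {
        "hard": {
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            "limits.cpu": "8",
            "limits.memory": "16Gi",
        }
    },
}

for manifest in (pod, quota):
    print(json.dumps(manifest, indent=2))
```

The limits protect a node from a single runaway container, while the quota protects the cluster from a single namespace.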

Backup and Restore

Every system should have a backup and restore plan in place, and Kubernetes is no exception. The resources listed below describe how to envision and set up an effective plan. Note that the persistence layer in Kubernetes is a key-value store called etcd, so backing up cluster state means backing up etcd.

Backup Kubernetes – how and why (with examples for etcd), Elastisys
Operating etcd clusters for Kubernetes, Kubernetes Documentation
Creating etcd backup, CoreOS
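
For orientation, here is a small Python sketch that assembles an etcdctl snapshot command without running it. The endpoint and certificate paths are assumptions based on kubeadm's usual defaults and may differ in your cluster, and the helper name and backup path are hypothetical; check the guides above before relying on any of it.

```python
import shlex
from typing import List


def etcd_snapshot_command(
    snapshot_path: str,
    endpoint: str = "https://127.0.0.1:2379",            # assumed local etcd endpoint
    cacert: str = "/etc/kubernetes/pki/etcd/ca.crt",      # assumed kubeadm default
    cert: str = "/etc/kubernetes/pki/etcd/server.crt",    # assumed kubeadm default
    key: str = "/etc/kubernetes/pki/etcd/server.key",     # assumed kubeadm default
) -> List[str]:
    """Build (but do not run) an `etcdctl snapshot save` command line."""
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",
        f"--cert={cert}",
        f"--key={key}",
        "snapshot",
        "save",
        snapshot_path,
    ]


command = etcd_snapshot_command("/var/backups/etcd-snapshot.db")  # hypothetical path
print(" ".join(shlex.quote(part) for part in command))
```

On older etcd releases the environment variable ETCDCTL_API=3 also has to be set for the snapshot subcommand to be available.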

2. Networking

Networking-related issues are common in misconfigured Kubernetes systems. Networking is a core layer in Kubernetes, and, early in the process of bootstrapping a cluster, you need to decide which Container Network Interface (CNI) plugin you want to use in your cluster.

Depending on the CNI you choose, you may gain access to additional features like network policy support and encryption of traffic over the network. Network policy support may be a critical requirement for you if you want to control ingress and egress traffic through your namespaces or pods inside the Kubernetes cluster.

CNI Providers

Choosing the correct CNI depends on your security policies, performance targets, and scalability, as well as the hardware running in your data center. These resources can help you to select the best CNI for your use case:

Container Network Interface (CNI) Providers, Rancher
Comparing Kubernetes Networking Providers, Rancher
Benchmark results of Kubernetes network plugins (CNI) over 10Gbit/s network (Updated: April 2019), ITNext
Kubernetes Multi-Cluster Networking-Cilium Cluster Mesh, ITNext
Flannel vs Calico : A battle of L2 vs L3 based networking, Shashank Jain

Network Policies

Kubernetes allows you to configure network policies in the cluster network. These act as a virtual firewall inside the cluster and allow you to fine tune and control traffic between pods and namespaces in Kubernetes.

In order to do this, you’ll need to configure a CNI that supports network policies. The most common CNI used in tutorials, Flannel, doesn’t support network policies. The resources below examine the details of working with network policies:

Network Policies, Kubernetes Documentation
Secure Your Kubernetes Application With Network Policies, Bitnami
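
To make the "virtual firewall" idea concrete, the Python sketch below builds two NetworkPolicy manifests and prints them as JSON: one that denies all ingress traffic to pods in a namespace, and one that then allows traffic from frontend pods to backend pods on a single port. The namespace shop, the app labels, and port 8080 are assumptions for illustration, and the policies only take effect on a CNI that enforces them.

```python
import json

NAMESPACE = "shop"  # hypothetical namespace

# An empty podSelector matches every pod in the namespace; listing "Ingress"
# with no ingress rules means no inbound traffic is allowed.
default_deny_ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "default-deny-ingress", "namespace": NAMESPACE},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
}

# Allow pods labeled app=frontend to reach pods labeled app=backend on TCP 8080.
allow_frontend_to_backend = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-frontend-to-backend", "namespace": NAMESPACE},
    "spec": {
        "podSelector": {"matchLabels": {"app": "backend"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
                "ports": [{"protocol": "TCP", "port": 8080}],
            }
        ],
    },
}

for policy in (default_deny_ingress, allow_frontend_to_backend):
    print(json.dumps(policy, indent=2))
```

Starting from a default-deny policy and adding narrow allow rules is a common pattern, since network policies are additive.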

Service Discovery

Service discovery in Kubernetes relies on the CoreDNS component, which requires your CNI to be configured properly. CoreDNS provides other functionality as well, allowing you to configure a wide array of DNS plugins to suit your specific requirements. These resources cover debugging and customizing it:

Debugging DNS Resolution, Kubernetes Documentation
Customizing DNS Service, Kubernetes Documentation
CoreDNS Plugins, CoreDNS
How to Add Plugins to CoreDNS, CoreDNS
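
As a small illustration of DNS-based service discovery, the Python helper below builds the fully qualified name under which a Service is normally resolvable inside the cluster. It assumes the default cluster domain cluster.local; the service name orders and namespace shop are made up for the example.

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name of a Service (assumes the default domain)."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("orders", "shop"))  # orders.shop.svc.cluster.local
```

From inside a pod, a standard resolver call such as socket.gethostbyname(service_fqdn("orders", "shop")) would be one way to check that cluster DNS is answering.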

3. Kubernetes Storage

Kubernetes is a container orchestration platform that allows you to run workloads as containers. More often than not, these workloads will need to persist their state.

K8s supports a wide range of storage drivers natively, and the option to use external drivers exists as well. When choosing a storage driver, you need to consider performance, volume access modes, availability and scalability.

Kubernetes Volume Access Modes

Kubernetes supports three main volume access modes: ReadWriteOnce, ReadOnlyMany, and ReadWriteMany (newer releases also add ReadWriteOncePod). Be careful when choosing volume drivers, since some may not support all of these modes.

Many drivers don’t support ReadWriteMany. However, if ReadWriteMany mode is important to you, the most commonly used driver is NFS. More information about volume access modes can be found at these links:

Persistent Volumes, Kubernetes Documentation
Storage Classes, Kubernetes Documentation
Dynamic Volume Provisioning, Kubernetes Documentation
NFS Persistent Volumes with Kubernetes on GKE — A Case Study, Nilesh Jayanandana
Storage on Kubernetes: OpenEBS vs Rook (Ceph) vs Rancher Longhorn vs StorageOS vs Robin vs Portworx vs Linstor, Vito Botta
Configuring NFS Storage for Kubernetes, Docker
Using overlay mounts with Kubernetes, Amartey Pearson
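
The sketch below shows where an access mode appears in a PersistentVolumeClaim, using the mode names as the Kubernetes API spells them. The helper pvc_manifest, the claim name shared-data, and the storage class nfs-client are assumptions for illustration; your cluster's storage class names will differ.

```python
import json

# Access mode names as accepted by the PersistentVolumeClaim API.
VALID_ACCESS_MODES = {
    "ReadWriteOnce",
    "ReadOnlyMany",
    "ReadWriteMany",
    "ReadWriteOncePod",
}


def pvc_manifest(name: str, size: str, access_mode: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest (hypothetical helper)."""
    if access_mode not in VALID_ACCESS_MODES:
        raise ValueError(f"unknown access mode: {access_mode}")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


claim = pvc_manifest("shared-data", "10Gi", "ReadWriteMany", "nfs-client")
print(json.dumps(claim, indent=2))
```

Whether the claim can actually be bound still depends on the driver behind the storage class supporting the requested mode.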

Persistent Volume Backup and Restore

After you’ve configured storage for Kubernetes, you need to learn how to back up and restore your persistent volumes. There are multiple tools out there to support this, and some storage drivers already have backup systems implemented. Here are a few resources addressing possible solutions:

Volume Snapshots, Kubernetes Documentation
Stash by AppsCode, AppsCode

4. Learn Kubernetes Security

Whether your Kubernetes cluster is running on-prem or in the cloud, its security is of utmost importance. An ill-configured Kubernetes cluster will be vulnerable to attacks.

As a rule of thumb, it’s best to avoid exposing the Kubernetes API to the public in order to reduce the surface area of potential attacks.

Of course, there are other ways to attack a cluster, and corresponding ways to prevent those attacks, many of which we describe here.

Service Accounts

Service accounts are the identities associated with Kubernetes’ authentication mechanism. They let workloads and tools authenticate to and use the Kubernetes API. From there, you can create multiple roles with specific permissions in Kubernetes and bind these roles to service accounts.

The majority of developers use the default kubectl config file from kubeadm instead of creating a separate user and a separate config file for every kubectl user. However, this isn’t advisable: that file carries cluster administrator credentials, so sharing it gives every user full control of the cluster.

The resources below can help you navigate service accounts:

Kubernetes RBAC and TLS certificates – Kubernetes security guide (part 1), Sysdig
Securing Kubernetes components: kubelet, Kubernetes etcd and Docker registry – Kubernetes security guide (part 3), Sysdig
How to Secure Kubernetes Clusters from Pod to Network, Logz
Using RBAC Authorization, Kubernetes Documentation
OPA Gatekeeper: Policy and Governance for Kubernetes, Kubernetes Blog
Auditing, Kubernetes Documentation
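
As a minimal sketch of binding a role to a service account, the Python below builds a read-only Role for pods and a RoleBinding that grants it to one service account, then prints both as JSON. The namespace shop, the service account ci-reader, and the role name pod-reader are assumptions made up for this example.

```python
import json

NAMESPACE = "shop"             # hypothetical namespace
SERVICE_ACCOUNT = "ci-reader"  # hypothetical service account

# A Role that only allows reading pods in one namespace.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": NAMESPACE},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
    ],
}

# A RoleBinding that grants the Role to the service account.
role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "ci-reader-pod-reader", "namespace": NAMESPACE},
    "subjects": [
        {"kind": "ServiceAccount", "name": SERVICE_ACCOUNT, "namespace": NAMESPACE}
    ],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": role["metadata"]["name"],
    },
}

for manifest in (role, role_binding):
    print(json.dumps(manifest, indent=2))
```

Granting narrow, namespaced roles like this one is the alternative to handing out a shared administrator config file.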

Avoiding Root Containers

It’s best to avoid running containers as the root user. You can enforce this with the security contexts available in Kubernetes, and, if you’re running a multi-tenant system in a single Kubernetes cluster, you can add container sandboxing on top. The resources below describe these processes:

Configure a Security Context for a Pod or Container, Kubernetes Documentation
Pod Security Policies, Kubernetes Documentation
Kubernetes security context, security policy, and network policy – Kubernetes security guide (part 2), Sysdig
How to Implement Secure Containers Using Google’s gVisor, Karthikeyan Shanmugam
Kata Containers on Kubernetes and Kata Firecracker VMM support, Gokul Chandra
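
The Python sketch below shows the security context fields involved and a deliberately simplified check for pod specs that could still run as root. The function may_run_as_root is a hypothetical helper written for this example, not a Kubernetes API, and it ignores details such as the user baked into the container image.

```python
def may_run_as_root(pod_spec: dict) -> bool:
    """Simplified heuristic: flag containers that are not pinned to a non-root user."""
    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        # Container-level settings override pod-level ones.
        ctx = {**pod_ctx, **container.get("securityContext", {})}
        if ctx.get("runAsUser") == 0:
            return True  # explicitly root
        if "runAsUser" not in ctx and not ctx.get("runAsNonRoot", False):
            return True  # nothing prevents the image's default user being root
    return False


hardened = {
    "securityContext": {"runAsNonRoot": True, "runAsUser": 1000},
    "containers": [
        {
            "name": "app",
            "image": "example/app:1.0",  # hypothetical image
            "securityContext": {"allowPrivilegeEscalation": False},
        }
    ],
}
unhardened = {"containers": [{"name": "app", "image": "example/app:1.0"}]}

print(may_run_as_root(hardened))
print(may_run_as_root(unhardened))
```

A check like this could run in a review script, but admission-time enforcement in the cluster is what actually stops a root container from starting.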

User Federation

Many enterprises use Active Directory or some other tool for user management. When you set up Kubernetes clusters, you must federate users to these clusters and configure access using LDAP or SAML protocols. Here are a few guides that can help you get started with this process:

Ingress Traffic

Your Kubernetes cluster will almost always have a few services exposed to the internet. To have more control over your cluster’s ingress traffic, send it through an API gateway.

Getting Started With Ambassador, DZone
Kong Gateway in Kubernetes, ITNext
Three Strategies for Managing APIs, Ingress, and the Edge with Kubernetes, ITNext
API Gateway as an Ingress Controller for Amazon EKS, AWS
Patterns for deploying Kubernetes APIs at scale with Apigee, Google Cloud

Secure Secrets

The secrets you store in Kubernetes are just base64-encoded strings, not encrypted ones. Consequently, just about anyone can easily decode them, and you cannot commit configuration files containing them to a repository without exposing your secrets to third parties.
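
A few lines of Python show why base64 offers no protection: it is a reversible encoding that needs no key. The secret value here is, of course, made up.

```python
import base64

secret_value = "s3cr3t-password"  # hypothetical secret

# This is the transformation applied to values in a Secret manifest's data field.
encoded = base64.b64encode(secret_value.encode("utf-8")).decode("ascii")

# Anyone who can read the manifest can reverse it without any key.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```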

The resources in this section provide best practices and tools for storing your sensitive data in Kubernetes securely.

Protecting Kubernetes Secrets: A Practical Guide, AquaSec
Injecting Vault Secrets Into Kubernetes Pods via a Sidecar, Vault
Managing secrets deployment in Kubernetes using Sealed Secrets, AWS
Encrypting Secret Data at Rest, Kubernetes Blog
Securely Keeping Kubernetes Secrets in Git, VictorOps
Kubernetes Secrets Management, DZone
Securing Secrets With HashiCorp Vault and Logz.io Security Analytics, Logz

Kubernetes Security Best Practices

Securing a Kubernetes cluster takes many separate measures. This section’s resources explain best practices for Kubernetes clusters.

Remember to always verify and benchmark your Kubernetes cluster to make sure that it meets the security standards specified by your organization.

Kubernetes Best Security Practices, Logz
Find Security Vulnerabilities in Kubernetes Clusters, Rancher
The top Kubernetes security best practices, Sqreen Blog
Hardening your cluster’s security, Google Cloud
9 Kubernetes Security Best Practices Everyone Must Follow, CNCF
K8s security guide, Sysdig
Kube-hunter – an open source tool for Kubernetes penetration testing, AquaSec
Kubernetes Audit: Making Log Auditing a Viable Practice Again, CNCF
Kubernetes Audit Logging, Sysdig

5. Kubernetes Observability

Logging and monitoring (metrics) complete the setup of your production-grade Kubernetes cluster.

Metrics, often simply called monitoring (though that term sometimes extends to other aspects of observability), allow you to observe and act upon cluster behavior over time. Logging, meanwhile, lets you debug running workloads and observe their status.

Monitoring

There are many monitoring stacks, but the most popular are Prometheus with Grafana and the ELK Stack, used separately or together. Their features overlap, though Grafana is designed with metrics in mind and Elastic with a focus on logs.

Fortunately, if you’re using a cloud-managed Kubernetes solution, your cloud vendor will have a dedicated monitoring stack available to you. For example, GKE uses Stackdriver (since renamed Google Cloud’s operations suite).

Below are a few guides describing how to set up monitoring in Kubernetes:

Kubernetes Monitoring: Best Practices, Methods, and Solutions, Logz
Kubernetes Monitoring with Prometheus -The ultimate guide (part 1), Sysdig
Kubernetes Monitoring with Prometheus: AlertManager, Grafana, PushGateway (part 2), Sysdig
Kubernetes monitoring with Prometheus – Prometheus operator tutorial (part 3), Sysdig
Configure Liveness, Readiness and Startup Probes, Kubernetes Documentation
Kubernetes Monitoring, ELK Stack

Logging

Containers are ephemeral, so they are not a safe place to keep logs. Instead, logs are streamed to standard output, and the container engine saves them as log files in its log folder on the host operating system.

However, looking for these logs manually is incredibly tedious, so logs are typically aggregated and pushed to a centralized logging provider where users can easily look into them. The following list of links should help you navigate the details of logging:

An Introduction to Kubernetes Logging, Logz
Logging Architecture, Kubernetes Documentation
Logging Using Elasticsearch and Kibana, Kubernetes Documentation
Monitor and Troubleshoot Your IT Environment, Logz
Fluentd vs. Logstash: A Comparison of Log Collectors, Logz
How To Set Up an Elasticsearch, Fluentd and Kibana (EFK) Logging Stack on Kubernetes, Hanif Jetha
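
On the application side, streaming logs to standard output can be as simple as the Python sketch below, which emits one JSON object per line so that a log collector can parse it. The JSON field names and the logger name checkout are assumptions for this example; collectors such as Fluentd or Logstash can be configured for other formats.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )


def build_logger(name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to standard output."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # replace any handlers from earlier setup
    logger.setLevel(logging.INFO)
    logger.propagate = False     # avoid duplicate lines via the root logger
    return logger


log = build_logger("checkout")  # hypothetical service name
log.info("order %s accepted", 42)
```

Writing to standard output rather than to a file inside the container leaves collection and shipping to the platform's logging stack.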

Tracing

In a microservice environment, where you have hundreds if not thousands of microservices running, debugging gets complicated. A single business request in your application may hop across many of these microservices before it completes.

Tracing manages this complexity by monitoring and analyzing request payloads and request hops between your services, allowing you to get a clearer picture of how your services are running in your environment.

Top 11 Open Source Monitoring Tools for Kubernetes, Logz
Jaeger Operator for Kubernetes – JaegerTracing, Jaeger
Using Kubernetes Pod Metadata to Improve Zipkin Traces, SoundCloud
When Istio Meets Jaeger – An Example of End-to-end Distributed Tracing.md, Stevenc81
A guide to distributed tracing with Linkerd, Linkerd
Distributed Tracing with Java “MicroDonuts”, Kubernetes and the Ambassador API Gateway, Ambassador
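
The core idea of following a request across hops can be sketched with the W3C Trace Context traceparent header, which many tracing systems can read. The Python below is a toy illustration only: it generates a header at the edge and derives a child header at each hop, keeping the trace ID constant. The function names and the three service names are made up, and real tracing libraries handle this for you.

```python
import secrets


def new_traceparent() -> str:
    """Start a trace: version 00, random trace ID and span ID, sampled flag 01."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(parent: str) -> str:
    """Continue a trace on the next hop: same trace ID, fresh span ID."""
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


# Simulate one request hopping across three hypothetical services.
header = new_traceparent()
for service in ("gateway", "orders", "payments"):
    print(f"{service}: traceparent={header}")
    header = child_traceparent(header)
```

Because every hop shares the trace ID, a tracing backend can stitch the individual spans back into one picture of the request.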

Conclusion

Getting started with Kubernetes is easy; doing things the right way requires practice. To master it fully, you need to have hands-on experience using it to solve real-world problems.

Sometimes, you need a little guidance from an expert on where to start looking and how to get going. There are many different opinions about how best to achieve the outcomes discussed above, so we’ve compiled some of the best resources to help you sort through them and give you that initial direction.

It’s up to you to dive into these articles and build a Kubernetes strategy that suits your requirements and needs. With a little time and effort, you, too, can be a pro Kubernetes administrator!