It’s no big secret that container orchestration is all the rage today. Being the topic of many articles and conferences, it can sometimes seem as though it is the ONLY topic worthy of discussion.
At Logz.io, we are nearing the end of migrating all of our containers to Kubernetes. I would like to tell you how we decided which orchestration platform to use, in the hope of helping those of you who are still unsure which tool to pick, or whether you need orchestration at all.
What is orchestration and do I need it?
In my opinion, the first ground rule is that if you don’t know why you need orchestration you probably don’t. Container orchestration (and I’m purposely avoiding using the word Docker) is not for everyone and does not answer every need.
So what is orchestration?
Imagine you have 10 containers that serve different purposes. Using a bunch of instances and running these containers is pretty easy. When your application begins to grow and the number of containers you’ve deployed goes up to 100, pressure mounts but it’s still bearable. But when you find yourself managing thousands of containers, each with different versions, relationships and network configurations, things begin to get a bit crazy.
For companies using modern development techniques that heavily rely on containers, the challenge of scaling this type of architecture can be too much to handle.
And this is where orchestration comes into the picture.
The entire point of an orchestration infrastructure is to provide a simple way to “schedule” containers and let the underlying infrastructure do the rest.
It sounds like pure magic, but there is a lot of complicated software running this, and as extremely complicated software tends to be, everything works great until it doesn’t.
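To make the idea of “scheduling” a little more concrete, here is a deliberately tiny sketch in Python. It is not how Kubernetes or Marathon actually work; it only illustrates the basic contract of declaring what a container needs and letting something else pick the machine. The `Node`, `Container` and `schedule` names, and the first-fit placement rule, are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_free: float  # CPU cores still available (hypothetical unit)
    mem_free: int    # memory in MiB still available
    containers: list = field(default_factory=list)


@dataclass
class Container:
    name: str
    cpu: float  # CPU cores requested
    mem: int    # memory in MiB requested


def schedule(container, nodes):
    """Place the container on the first node with enough free CPU and memory."""
    for node in nodes:
        if node.cpu_free >= container.cpu and node.mem_free >= container.mem:
            node.cpu_free -= container.cpu
            node.mem_free -= container.mem
            node.containers.append(container.name)
            return node.name
    # No node has room; a real orchestrator would typically keep it pending.
    return None


nodes = [
    Node("node-a", cpu_free=2.0, mem_free=4096),
    Node("node-b", cpu_free=4.0, mem_free=8192),
]

placements = {
    c.name: schedule(c, nodes)
    for c in [
        Container("web", cpu=1.5, mem=2048),
        Container("worker", cpu=1.0, mem=1024),
        Container("db", cpu=3.5, mem=4096),
    ]
}
print(placements)
```

Real schedulers weigh far more than free CPU and memory (affinity rules, health, networking, rescheduling after failures), which is a large part of why the software behind them is so complicated.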
Narrowing down Container Orchestration Tools
There are five big names you will hear over and over again in the context of container orchestration: Kubernetes, Mesos (DC/OS), ECS, Swarm and Nomad. The process of deciding which of these tools to use will differ according to the company and individuals involved. Sometimes it will just boil down to personal preferences.
At Logz.io, we ended up with the two platforms named in the title of this article after a process of elimination.
- Swarm (and yes, this is a matter of opinion) struck us as too basic and simple for our needs. It is good for testing, but not a tool we felt comfortable using in production.
- Amazon’s ECS has improved greatly since its initial release, but it still seems to be falling behind the other main players. Since we wanted a tool that was cloud agnostic, ECS was not really an option for us to start with.
- We felt Nomad was too young a project and not mature enough to be seriously considered, but in all fairness, it might deserve another evaluation in the future.
That left us with two strong players: Kubernetes, which keeps growing in popularity and usage, and the evolving DC/OS.
About DC/OS and Kubernetes
Just to make sure we are all referring to the same concepts, here is a short historical background and explanation to help clarify matters.
Mesos is an Apache project that lets you run both containerized and non-containerized workloads in a distributed manner.
It was initially written as a research project at Berkeley and was later adopted by Twitter as an answer to Google’s Borg (Kubernetes’ predecessor). To combat its high degree of complexity (Mesos is super complicated and hard to manage!), Mesosphere came into the picture to try and make Mesos into something regular human beings can use.
Mesosphere supplied the superb Marathon “plugin” to Mesos, which provides users with an easy way to manage container orchestration over Mesos.
In mid-2016, DC/OS (Data Center Operating System) — an open source project backed by Mesosphere — was introduced, which simplifies Mesos even further and allows you to deploy your own Mesos cluster, with Marathon, in a matter of minutes.
When referring to Mesos in this article, I am referring to DC/OS.
Kubernetes? Well, just in case you’ve lived on the moon for the past few years, Kubernetes is a container orchestration platform that was released by Google in mid-2014 and has since been contributed to the Cloud Native Computing Foundation.
Mesos vs. Kubernetes
The first thing to point out is that you can actually run Kubernetes on top of DC/OS and schedule containers with it instead of using Marathon. This points to the biggest difference of all: DC/OS, as its name suggests, is closer to an operating system than to an orchestration framework. You can run non-containerized, stateful workloads on it, while container scheduling is handled by Marathon.
Second, in terms of simplicity, Marathon’s general approach to APIs is straightforward in comparison to Kubernetes. Marathon aggregates APIs and provides a relatively small number of API resources, whereas Kubernetes provides a larger variety of resources and is based on label selectors.
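For readers who have not met label selectors before, the idea can be sketched in a few lines of Python. This is an illustration of equality-based matching only, not the Kubernetes API itself; the `matches` helper and the sample pod names and labels are invented for the example.

```python
def matches(selector, labels):
    """Return True when every key/value pair in the selector appears in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical objects, each carrying free-form key/value labels.
pods = [
    {"name": "web-1", "labels": {"app": "web", "env": "prod"}},
    {"name": "web-2", "labels": {"app": "web", "env": "staging"}},
    {"name": "db-1", "labels": {"app": "db", "env": "prod"}},
]

# A selector picks out every object whose labels contain all of its pairs.
selector = {"app": "web", "env": "prod"}
selected = [pod["name"] for pod in pods if matches(selector, pod["labels"])]
print(selected)
```

Because resources find each other through labels rather than fixed references, the model is flexible, but it is also one more concept to learn compared with Marathon’s smaller API surface.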
Third, there is an obvious difference in the level of popularity the two platforms enjoy. Why does this matter? For the obvious reasons: the size of the community driving development and offering support. Kubernetes has almost 10x as many commits and GitHub stars as Marathon.
DC/OS has a “Premium” subscription that opens up extra features, while Kubernetes is completely open source.
Which brings me to the next point.
Why Kubernetes won
At Logz.io, I was a champion of DC/OS. I loved the simplicity of it, and the ability to run stateful workloads. I was fully ready to give up on some of Kubernetes’ strengths in favor of choosing DC/OS.
Then I discovered that a simple feature I needed to automate the deployment process was only included in the enterprise version. From that point on, and after talking to Mesosphere, we came to realize that this might not be a one-time thing, and that even if we overcame this specific hurdle, DC/OS is controlled by a commercial company, for better and for worse.
So began our Kubernetes project.
After a lengthy process of adapting our containers to the Kubernetes state of mind, and perhaps more significantly – after overcoming organizational and cultural challenges (a topic for an entirely different post), we are now managing hundreds of containers with Kubernetes.
This change has also facilitated more efficient Continuous Deployment by helping us shift all deployment responsibilities to our developers who now deploy new code multiple times a day. Bottom line – the move to container orchestration with Kubernetes has shortened the “Jira ticket -> Production” development cycle to 30 minutes.
Summing it up
So while orchestration platforms are one of the hottest technologies in town, that still doesn’t mean you actually need one. But in case you do, I hope I have shed some light on why we chose Kubernetes over the other existing solutions.
In a future blog post, I’ll dive deeper into the technical difficulties of moving safely to Kubernetes, and the cultural changes we had to go through to make Logz.io continuously deployed.