In a previous article, we examined service meshes in detail. Briefly, a service mesh takes care of network functionality for the applications running on your platform. As Kubernetes has matured as a technology, service meshes have become a hot topic, with various products being developed to solve the challenges associated with areas like traffic management, security, and observability. To understand the topic well, you should see a thorough service mesh comparison.
This article compares three service meshes. The first is the biggest player in the service mesh space: Istio. It was open-sourced in May 2017 by Google, IBM, and Lyft, and it has since gained a lot of mindshare. The second, Linkerd, has been around a bit longer, starting as a network proxy in version 1.0. It merged with a preexisting service mesh (Conduit) in September 2018 to form Linkerd 2.0, which adds service mesh features to the network proxy. In this article, we will focus on Linkerd 2.0. Finally, we will examine Consul Connect, the product that HashiCorp (creators of Vault, Terraform, and Vagrant) has thrown into the ring.
Service Mesh Architecture
All three of these products use a similar architecture. Each separates a “control plane,” which manages at the cluster level the paths that data takes, from a “data plane,” which comprises the functions and processes that forward data from one interface to another within the mesh.
Consul Connect uses an agent running on each node (deployed as a DaemonSet) as its control plane, while Istio and Linkerd (through Conduit) use centralized services.
For the data plane, all three mesh products use a “sidecar” pattern that places a proxy running in a separate container within each pod. This sidecar container receives the data from and sends the data to the application. It also does the heavy lifting involved with moving or transforming the data to other pods or to spaces outside the cluster. Istio uses the Envoy proxy to perform this function, which appears to be the best-documented and best-supported choice. Linkerd 2.0 has adopted the Conduit product as its proxy. Consul Connect, by contrast, has a pluggable architecture for its data plane that allows different proxies to be used.
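To make the split between the two planes concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how Istio, Linkerd, or Consul Connect are actually implemented: the class names `ControlPlane` and `SidecarProxy`, the pod names, and the route strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SidecarProxy:
    """Data plane: one proxy per pod, forwarding traffic using its local routes."""
    pod_name: str
    routes: Dict[str, str] = field(default_factory=dict)

    def forward(self, service: str, payload: str) -> str:
        # The application only talks to its sidecar; the sidecar picks the destination.
        destination = self.routes.get(service)
        if destination is None:
            raise LookupError(f"{self.pod_name}: no route for service {service!r}")
        return f"{self.pod_name} -> {destination}: {payload}"


@dataclass
class ControlPlane:
    """Control plane: holds cluster-wide routing state and pushes it to every proxy."""
    proxies: List[SidecarProxy] = field(default_factory=list)
    routes: Dict[str, str] = field(default_factory=dict)

    def register(self, proxy: SidecarProxy) -> None:
        # A newly registered proxy receives a copy of the current routes.
        self.proxies.append(proxy)
        proxy.routes = dict(self.routes)

    def set_route(self, service: str, destination: str) -> None:
        # A routing change is made once, centrally, then distributed to all proxies.
        self.routes[service] = destination
        for proxy in self.proxies:
            proxy.routes = dict(self.routes)


control_plane = ControlPlane()
web = SidecarProxy("web-pod")
control_plane.register(web)
control_plane.set_route("orders", "orders-pod:8080")
print(web.forward("orders", "GET /orders/42"))
# prints: web-pod -> orders-pod:8080: GET /orders/42
```

The point of the sketch is the direction of the dependencies: the application never sees the routing table, and the proxies never decide policy on their own.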
The traffic management picture is somewhat complicated.
At present, Istio has more traffic management features than Linkerd, including circuit breakers, fault injection, retries, timeouts, routing rules, virtual services, load balancing, and others. Linkerd has a roadmap to catch up to Istio’s offerings. Consul Connect has been trying to do the same, recently adding features for path-based routing, traffic shifting, load balancing, and telemetry.
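Of the features listed above, the circuit breaker is perhaps the least self-explanatory, so a minimal Python sketch of the general pattern may help. This illustrates the technique only and is not the implementation used by any of the three meshes; the names `CircuitBreaker`, `max_failures`, and `reset_after` are assumptions made for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a request is rejected without contacting the upstream service."""


class CircuitBreaker:
    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a trial request
        self.clock = clock                # injectable so the breaker is testable
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, request: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; upstream not called")
            # Cool-down elapsed: let one trial request through ("half-open").
            # A single further failure will re-open the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = request()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

In a mesh, this logic lives in the sidecar proxy rather than in application code, which is why it can be enabled through configuration instead of a code change.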
All three products offer good basic support for certificate rotation and external root certificates, but Istio leads the pack when it comes to security features.
With respect to mutual TLS (mTLS), Istio and Consul Connect offer support for both HTTP and TCP. Linkerd, however, does not support TCP mTLS. Istio is particularly strong on the policy management front, since it allows different providers to integrate their products into the “template” policy management framework, and it allows administrators to set rules that determine which applications can communicate with each other.
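The idea of rules that determine which applications may talk to each other can be sketched as a default-deny allow-list. The Python below is a simplified illustration, not Istio's policy model or API; `ServicePolicy` and the service names are hypothetical. In a real mesh, the caller's identity would typically come from its mTLS certificate rather than from a plain string.

```python
from typing import Set, Tuple


class ServicePolicy:
    """Default-deny allow-list of (source, destination) service pairs."""

    def __init__(self) -> None:
        self._allowed: Set[Tuple[str, str]] = set()

    def allow(self, source: str, destination: str) -> None:
        self._allowed.add((source, destination))

    def is_allowed(self, source: str, destination: str) -> bool:
        # Anything that has not been explicitly allowed is rejected.
        return (source, destination) in self._allowed


policy = ServicePolicy()
policy.allow("frontend", "orders")
policy.allow("orders", "payments")

print(policy.is_allowed("frontend", "orders"))    # True
print(policy.is_allowed("frontend", "payments"))  # False
```

Note that the rules are directional: allowing `frontend` to call `orders` says nothing about calls in the opposite direction.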
In the increasingly crowded world of observability, the picture is, once again, complicated.
Consul Connect takes a less opinionated approach than Linkerd and Istio, allowing observability tools such as the metrics tool Prometheus to plug into the product for monitoring purposes.
Linkerd offers Grafana dashboards out of the box that provide service insights, while Istio has close integration with Kiali. Kiali is an observability tool designed for Istio that can produce metrics, infer network topology, and integrate with Grafana for more advanced querying capabilities.
To take full advantage of tracing, your applications may need to be adjusted to add the appropriate headers. In this sense, tracing differs from other service mesh features, which generally require no application changes. There is a lot of developer focus on tracing, and the meshes are quickly adding features to support more backends.
According to the servicemesh.es website, Istio is compatible only with the Jaeger, Zipkin, and SolarWinds tracing backends. Beginning with version 2.6 (released in October 2019), Linkerd supports any provider adhering to the OpenCensus standard. This includes Jaeger and Zipkin (but not SolarWinds), as well as Honeycomb. Consul Connect supports Jaeger, Zipkin, OpenTracing, Datadog, and Honeycomb.
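The application change usually amounts to copying trace context from each incoming request onto the outgoing requests it triggers, so the proxies can stitch the spans into one trace. Below is a minimal Python sketch, assuming the Zipkin-style B3 header names plus `x-request-id`; the exact set your mesh and tracing backend expect may differ, and the function name `propagate_trace_headers` is hypothetical.

```python
from typing import Dict, Mapping

# Assumed header set: Zipkin-style B3 headers plus a request ID.
# Check your mesh's documentation for the headers it actually relies on.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def propagate_trace_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Return the trace headers to attach to an outgoing request."""
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


incoming_headers = {
    "Content-Type": "application/json",
    "X-Request-Id": "req-123",
    "X-B3-TraceId": "463ac35c9f6413ad",
}
print(propagate_trace_headers(incoming_headers))
# prints: {'x-request-id': 'req-123', 'x-b3-traceid': '463ac35c9f6413ad'}
```

The returned dictionary would be merged into the headers of any outbound HTTP call made while handling the request.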
If you run a service mesh, then it is quite likely that you will want to log events like network activity and policy violations in addition to maintaining your standard application logs. All three of these products have the capability to link up to the standard Kubernetes logging stacks. An example of Istio integrated with the ELK stack is available here.
Service meshes have historically been known for being difficult to set up and maintain, so, if you’re evaluating meshes, this area may be one you pay particular attention to.
Istio has a reputation for being especially difficult to install and operate. The project has tried to address this by abandoning its microservices architecture in favor of a monolithic approach. While it maintains a microservices philosophy internally, with strict boundaries between the code and interactions between what were formerly separate services, from the perspective of the cluster administrator, it is a single process: istiod. While this willingness to change course is good for engineering, it can be a challenge to maintain your operation’s stability in the face of changes like these.
All three products can be installed using Helm, so there is little difference among them on that front. Linkerd has a reputation as being the easiest to configure and operate due to its relative architectural simplicity, reduced feature surface area, and opinionated tooling choices.
While Istio has historically had several services making up its control plane (all of which can fail and require configuration in various ways) alongside an Envoy sidecar in each and every pod, Linkerd’s footprint is smaller: a compact control plane and a lightweight proxy.
If you’re looking for paid support, then Consul Connect is arguably the best choice, since it is owned and maintained by HashiCorp and is tightly integrated with the company’s other products.
Similarly, Buoyant, the original creators of Linkerd, offers support, training, and enterprise products around the open-source Linkerd tool.
Then there’s Istio. Although Google, IBM, and Lyft sponsored the original development of Istio, they do not offer any kind of support for it. However, Red Hat (now part of IBM) offers paid support through its OpenShift product for “OpenShift Service Mesh,” a productized version of Istio designed for performance and operational stability.
In the realm of performance, Istio does less well than the other two service meshes. This is not surprising, since Istio’s complex policy management components and integrations can impact network performance.
Indeed, one benchmark comparison showed that, at a baseline queries-per-second level, Linkerd performed an order of magnitude better than Istio, with the advantage narrowing to roughly 3x under load. Similar figures for Consul are not available, but its distributed architecture suggests that its performance should be similar to Linkerd’s, since Consul’s traffic can be managed by agents local to each host rather than having to hop to the control plane.
Comparing service meshes can feel bewildering, since there are so many categories of features to choose from and so many aspects to consider within each category. Even after you’ve made a choice, the technology continues to change under your feet, bringing your selection further into doubt.
Consul Connect is a simple, flexible solution that has the benefit of being well-supported by HashiCorp. If you are already comfortable with using HashiCorp products, then this might be the factor that tips the balance for you. Linkerd is similarly simple, and it also has support from Buoyant, its creators. Istio is, in many ways, the market leader, with many already-implemented features and an impressive set of names backing it. However, this has come at the price of a reputation for being complex to support.
Taking a step back, the best approach to choosing a service mesh may be to determine the two or three most important features for your organization. They might be well-supported integrations with your existing software stack, a bet on which product will win out in the market, or some key function that you can’t do without. From there, take a close look at which of the service mesh products on the market prioritizes these features, and make your selection. Whichever solution you choose, you will need to be prepared to keep up with all the changes and upgrades to come.