“Service mesh” is an umbrella term for products that seek to solve the problems that microservice architectures create. These challenges include security, network traffic control, and application telemetry. One way to address them is to decouple your application at layer five of the network stack, which is one definition of what service meshes do.
This article will look at Istio, an open-source service mesh product developed by Google, IBM, and Lyft. It is one of the more feature-rich and complex options available today. The alternatives tend to have fewer features or require piecing together the functionality you want from additional products. Two of these alternatives, Envoy and Linkerd, are often described as Istio’s main rivals in the service mesh market, although Envoy is strictly a proxy rather than a complete mesh, and Istio itself uses it as its sidecar. Despite their popularity, both may still require you to use other products, or even to pair them with Istio itself, to match Istio’s feature set.
This post examines Istio as a service mesh running on Kubernetes, but note that Istio and other service mesh products can also run independently of this container orchestration platform.
Functional Areas That Make Up a Service Mesh
Although there is no formal definition, a service mesh’s functionality can be divided into three areas: traffic management, security, and observability.
Istio allows you to route traffic based on criteria that you define. It achieves this by using Envoy proxies as sidecars within each pod and by keeping a service registry in its control plane. In the security domain, the Envoy proxies and the control plane allow you to manage traffic between services by setting policies and encrypting traffic within the cluster. Istio offers you observability by tracking service metrics, tracing traffic, and performing application logging tasks across the cluster.
Before Service Meshes
Before service meshes, application teams had to implement at least some of the aforementioned functionalities within their applications. They might have rolled their own routing policies with some complex bespoke code, implemented mutual TLS authentication, or developed their own metrics application and its associated storage. These tasks are not trivial, and they often require additional work, such as provisioning extra storage or opening up firewall rules, to mitigate their knock-on effects. These processes are themselves prone to error, in addition to being potentially costly and time-consuming. Service meshes were developed to eliminate them.
After Service Meshes
When a service mesh is implemented, application teams can consume common implementations that fulfill these standard requirements. In a sense, the use of service meshes can be viewed as an extension of the principle of Kubernetes clusters and other application platforms; they all provide a standard interface for running applications and meeting their associated needs.
If application teams still need to fulfill the requirements addressed by a service mesh product in their own ways, they can do so. However, they will likely incur costs and experience delays in the process, and they are more likely to make mistakes in the implementation of either functionality or compliance.
In an ideal world, application teams can focus on their business logic and leave platform concerns to a centrally managed and standardized service mesh.
Service Mesh Architecture
Before delving into the details of service meshes’ three functional areas, it’s important to understand their architecture. Istio divides its operations into two high-level areas: the control plane and the data plane.
The control plane is a set of centrally managed services that operate independently of the applications running within the service mesh. It provides the backbone of the functional areas discussed in this post.
The data plane, by contrast, works with the applications directly to provide features locally such as load balancing, mutual TLS, and routing policies. In Kubernetes, this is achieved by adding sidecar containers to each pod that is deployed. Sidecar containers take care of supplying all of the network functionality provided by the service mesh without interfering with the application containers themselves. The sidecars also communicate with the central control plane to deliver their features.
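The split between the two planes can be illustrated with a toy model. The sketch below is not Istio code: the ControlPlane and Sidecar classes and their methods are hypothetical names chosen for illustration. It only shows the general idea that a central component holds the service registry and pushes it out, while each sidecar answers lookups locally from its last received copy.

```python
class ControlPlane:
    """Toy control plane: owns the service registry and pushes it to sidecars."""

    def __init__(self):
        self._registry = {}   # service name -> list of endpoint addresses
        self._sidecars = []   # sidecars that receive configuration updates

    def attach(self, sidecar):
        """Register a sidecar and send it the current registry snapshot."""
        self._sidecars.append(sidecar)
        sidecar.update(self._snapshot())

    def register(self, service, endpoint):
        """Record a new endpoint and push a fresh snapshot to every sidecar."""
        self._registry.setdefault(service, []).append(endpoint)
        for sidecar in self._sidecars:
            sidecar.update(self._snapshot())

    def _snapshot(self):
        # Copy the lists so a sidecar cannot mutate the central registry.
        return {name: list(eps) for name, eps in self._registry.items()}


class Sidecar:
    """Toy data-plane proxy: resolves services from its local snapshot."""

    def __init__(self, pod_name):
        self.pod_name = pod_name
        self._endpoints = {}

    def update(self, snapshot):
        self._endpoints = snapshot

    def resolve(self, service):
        # Lookups never call the control plane; they use the pushed copy.
        return list(self._endpoints.get(service, []))
```

In this model the application container never sees the registry at all. It would simply send traffic through its sidecar, which is the property that lets the mesh add features without touching application code.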
Traffic Management: A Deeper Dive
Service meshes can provide a dizzying array of features that allow you to direct network traffic around your applications. The most commonly used features allow you to use “destination rules” to load balance traffic between instances of your application within your cluster using algorithms such as “round robin,” “random,” or “least requests.”
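To make the three algorithms concrete, here is a minimal Python sketch of each one. It is a simplified illustration rather than the way Istio or Envoy implements them. The class names are hypothetical, and the least-requests variant scans every instance, which a production proxy may avoid by sampling.

```python
import itertools
import random


class RoundRobinBalancer:
    """Hand out instances in a fixed, repeating order."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(list(instances))

    def pick(self):
        return next(self._cycle)


class RandomBalancer:
    """Pick an instance uniformly at random."""

    def __init__(self, instances, rng=None):
        self._instances = list(instances)
        self._rng = rng or random.Random()

    def pick(self):
        return self._rng.choice(self._instances)


class LeastRequestsBalancer:
    """Pick the instance with the fewest in-flight requests."""

    def __init__(self, instances):
        self._in_flight = {name: 0 for name in instances}

    def pick(self):
        # min() returns the first minimum, so ties go to the earliest instance.
        name = min(self._in_flight, key=self._in_flight.get)
        self._in_flight[name] += 1
        return name

    def release(self, name):
        """Call when a request finishes so the instance's count drops again."""
        self._in_flight[name] -= 1
```

Round robin and random need no feedback from the instances, whereas least requests depends on the balancer knowing when each request completes. That is one reason the choice of algorithm is usually left to the proxy rather than to application code.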
In Istio, you can build on destination rules to split traffic by weight between application versions, which is useful for gradual rollouts and A/B testing. Requests can also be routed according to their attributes, such as the subnet the originating request came from. You can even route to entirely separate or external services if you want.
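Weight-based routing amounts to a weighted random choice per request. The following Python sketch shows that idea under simple assumptions: the function names and the "v1"/"v2" version labels are hypothetical, and a real mesh expresses the weights in configuration rather than in code.

```python
import random


def pick_version(weights, rng=random):
    """Return a version name chosen in proportion to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def split_counts(weights, requests, seed=0):
    """Simulate a number of requests and count how many each version receives."""
    rng = random.Random(seed)
    counts = {version: 0 for version in weights}
    for _ in range(requests):
        counts[pick_version(weights, rng)] += 1
    return counts
```

Calling split_counts({"v1": 90, "v2": 10}, 10000) should send roughly nine in ten requests to v1. The split is statistical rather than exact, so small request volumes can deviate noticeably from the configured weights.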
You can also manage traffic leaving the cluster. Istio allows you to set up “egress gateways,” which configure a dedicated exit node for outbound traffic. These gateway abstractions can be configured to define policies for retries and timeouts, to inject faults into the system at will to test its resilience, to direct traffic to legacy services, or even to add services in another service mesh through a multicluster configuration.
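Retries and fault injection are easiest to understand together, because injected faults are what retry policies are meant to absorb. The Python sketch below models both as plain function wrappers. It is an illustration only: UpstreamError, with_fault_injection, and call_with_retries are hypothetical names, and timeouts are left out to keep the example short.

```python
import random


class UpstreamError(Exception):
    """Raised when a simulated upstream call fails."""


def with_fault_injection(call, abort_ratio, rng):
    """Wrap call so that a fraction of requests fail before reaching it."""
    def wrapped(*args, **kwargs):
        if rng.random() < abort_ratio:
            raise UpstreamError("injected fault")
        return call(*args, **kwargs)
    return wrapped


def call_with_retries(call, attempts):
    """Try call up to `attempts` times, re-raising the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return call()
        except UpstreamError as exc:
            last_error = exc
    raise last_error
```

With an abort ratio of 0.5 and three attempts, a request fails only if all three attempts hit an injected fault. When a mesh applies the same logic in the proxy, the application sees fewer failures without containing any retry code of its own.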
Security: A Deeper Dive
Istio provides several security features as part of its service mesh. Its Citadel component can act as a certificate issuer within the control plane, allowing certificates to be signed and delivered to applications securely within the Kubernetes cluster. This enables applications to have mutual TLS security, which is often a requirement of applications running in enterprise organizations.
Policies around these authentication requirements can be set at the namespace, cluster, or service level as required. These authentication policies can also be used to ensure that only specific services can talk to each other, which, in turn, allows more sophisticated security policies to be defined and enforced. If the service level is not sufficient for your requirements, you can use the Open Policy Agent (OPA) framework to enforce more fine-grained attributes. Specific service paths or subnets are among the attributes that can be specified to allow or disallow access to and from running services.
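The kind of check described above can be sketched as a default-deny rule match on the calling service, the source subnet, and the request path. This Python example is a conceptual model only. It is neither Istio's policy format nor OPA's policy language, and AllowRule and is_allowed are hypothetical names.

```python
import ipaddress
from collections import namedtuple

# A rule allows one calling service, optionally narrowed by path and subnet.
AllowRule = namedtuple(
    "AllowRule",
    ["source_service", "path_prefix", "source_subnet"],
    defaults=["/", "0.0.0.0/0"],
)


def is_allowed(rules, source_service, source_ip, path):
    """Default deny: allow only if one rule matches every attribute."""
    addr = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule.source_service == source_service
                and path.startswith(rule.path_prefix)
                and addr in ipaddress.ip_network(rule.source_subnet)):
            return True
    return False
```

For example, a single rule AllowRule("frontend", "/api/", "10.0.0.0/16") would let the frontend service call paths under /api/ from that subnet and reject everything else, including the same service calling /admin.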
Observability: A Deeper Dive
Service meshes offer centralized, platform-level solutions to the general problems surrounding the observability of applications.
For distributed application request tracing, Istio has a plug-in architecture that allows different tracing backends (such as Zipkin or Jaeger) to be used, depending on application needs or preferences. Similarly, application logs can be captured and analyzed by custom log access backends such as Fluentd.
Finally, platform engineers can configure the collection of metrics by the service mesh based on their needs. A balance may need to be struck between the richness and level of detail of the information being gathered and the ability to store and analyze that information.
Service meshes allow software platforms to do a lot of your applications’ heavy lifting. The infrastructure standardization they offer allows security, traffic management, and observability challenges to be taken out of developers’ hands and managed centrally.
These benefits do not come for free, however, and service meshes such as Istio are known for being complex to manage. They are also relatively new and changing quickly. Even so, the promise of being able to “decouple at layer five” is a powerful one. The potential of service meshes makes them likely to persist as more and more organizations move to microservice architectures.