
A Service Mesh for Kubernetes, Part I: Top-line service metrics

Alex Leong 4 October 2016

What is a service mesh, and how is it used by cloud native apps—apps designed for the cloud? In this article, we’ll show you how to use linkerd as a service mesh on Kubernetes, and how it can capture and report top-level service metrics such as success rates, request volumes, and latencies without requiring changes to application code.

Note: this is Part I of a series of articles about linkerd and cloud native applications. In upcoming weeks, we’ll cover:

  1. Top-line service metrics (this article)
  2. Pods are great, until they’re not
  3. Encrypting all the things
  4. Continuous deployment via traffic shifting
  5. Dogfood environments, ingress, and edge routing
  6. Staging microservices without the tears
  7. Distributed tracing made easy
  8. Retry budgets, deadline propagation, and failing gracefully
  9. Autoscaling by top-line metrics
  10. gRPC for fun and profit

The services must mesh

One of the most common questions we see about linkerd is, what exactly is a service mesh? And why is a service mesh a critical component of cloud native apps, when environments like Kubernetes provide primitives like service objects and load balancers?

In short, a service mesh is a layer that manages the communication between apps (or between parts of the same app, e.g. microservices). In traditional apps, this logic is built directly into the application itself: retries and timeouts, monitoring/visibility, tracing, service discovery, etc. are all hard-coded into each application.

However, as application architectures become increasingly segmented into services, moving communications logic out of the application and into the underlying infrastructure becomes increasingly important. Just as applications shouldn’t be writing their own TCP stack, they also shouldn’t be managing their own load balancing logic, or their own service discovery management, or their own retry and timeout logic. (For example, see Oliver Gould’s MesosCon talk for more about the difficulty of coordinating retries and timeouts across multiple services.)

A service mesh like linkerd provides critical features to multi-service applications running at scale: latency-aware load balancing, automatic retries and circuit breaking, distributed tracing, and top-line service metrics, among others.

In this article, we’re going to focus just on visibility: how a service mesh can automatically capture and report top-line metrics, such as success rate, for services. We’ll walk you through a quick example in Kubernetes.

Using linkerd for service monitoring in Kubernetes

One of the advantages of operating at the request layer is that the service mesh has access to protocol-level semantics of success and failure. For example, if you’re running an HTTP service, linkerd can understand the semantics of 200 versus 400 versus 500 responses and can calculate metrics like success rate automatically. (Operating at this layer becomes doubly important when we talk about retries—more on that in later articles.)
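
As a rough sketch of that idea (not linkerd's actual implementation, whose response classifiers are more flexible and configurable), a success rate over some window is simply the share of responses that weren't server errors. The status codes below are hypothetical:

# Toy example only: treat 5xx responses as failures, everything else as success
codes="200 200 404 200 503 200"          # hypothetical status codes observed in a window
total=0; ok=0
for c in $codes; do
  total=$((total + 1))
  [ "$c" -lt 500 ] && ok=$((ok + 1))
done
echo "success rate: $ok/$total"          # prints 5/6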

Let’s walk through a quick example of how to install linkerd on Kubernetes to automatically capture aggregated, top-line service success rates without requiring application changes.

Step 1: Install linkerd

Install linkerd using this Kubernetes config. This will install linkerd as a DaemonSet (i.e., one instance per host) running in the default Kubernetes namespace:

kubectl apply -f https://raw.githubusercontent.com/BuoyantIO/linkerd-examples/master/k8s-daemonset/k8s/linkerd.yml
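
Before moving on, you can check that the DaemonSet has rolled out. This assumes the DaemonSet is named l5d and its pods carry an app=l5d label, as in the config above; adjust if your config differs:

kubectl get ds l5d
kubectl get pods -l app=l5d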

You can confirm that installation was successful by viewing linkerd’s admin page:

INGRESS_LB=$(kubectl get svc l5d -o jsonpath="{.status.loadBalancer.ingress[0].*}")
open http://$INGRESS_LB:9990 # on OS X
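
If your cluster can't provision an external load balancer (e.g. minikube), the jsonpath above will come back empty. As an alternative (again assuming the app=l5d pod label from the config), you can port-forward the admin port of one of the linkerd pods and browse to localhost instead:

kubectl port-forward $(kubectl get pods -l app=l5d -o jsonpath="{.items[0].metadata.name}") 9990:9990
open http://localhost:9990 # in a second terminal, on OS X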

[Screenshot: the linkerd admin page]

Step 2: Install the sample apps

Install two services, “hello” and “world”, using this hello-world config. This will install the services into the default namespace:

kubectl apply -f https://raw.githubusercontent.com/BuoyantIO/linkerd-examples/master/k8s-daemonset/k8s/hello-world.yml
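
As before, it's worth confirming that the pods are up before sending traffic. The grep is just a loose filter, since the exact labels depend on the config:

kubectl get pods | grep -E 'hello|world'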

These two services function together to make a highly scalable, “hello world” microservice (where the hello service, naturally, calls the world service to complete its request).

You can see this in action by sending traffic through linkerd’s external IP:

http_proxy=$INGRESS_LB:4140 curl -s http://hello

You should see the string “Hello world”.
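
A single request won't give the dashboards much to chart. If you want steady data for the next step, a simple loop like this (illustrative only; stop it with Ctrl-C) keeps traffic flowing through linkerd:

while true; do http_proxy=$INGRESS_LB:4140 curl -s http://hello; echo; sleep 0.5; done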

Step 3: Install linkerd-viz

Finally, let’s take a look at what our services are doing by installing linkerd-viz. linkerd-viz is a supplemental package that includes a simple Prometheus and Grafana setup and is configured to automatically find linkerd instances.

Install linkerd-viz using this linkerd-viz config. This will install linkerd-viz into the default namespace:

kubectl apply -f https://raw.githubusercontent.com/BuoyantIO/linkerd-viz/master/k8s/linkerd-viz.yml
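
You can confirm the install the same way as before (the grep avoids assuming exact labels):

kubectl get pods | grep linkerd-viz
kubectl get svc linkerd-viz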

Open linkerd-viz’s external IP to view the dashboard:

VIZ_INGRESS_LB=$(kubectl get svc linkerd-viz -o jsonpath="{.status.loadBalancer.ingress[0].*}")
open http://$VIZ_INGRESS_LB # on OS X

You should see a dashboard, including selectors by service and instance. All charts respond to these service and instance selectors:

[Screenshot: the linkerd-viz dashboard]

The linkerd-viz dashboard includes three sections: top-line cluster-wide metrics, a per-service breakdown, and a per-instance breakdown.
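
These charts are built from metrics that linkerd itself exports. If you want to eyeball the raw counters directly, linkerd's admin endpoint serves them as JSON; the filter below (which assumes jq is installed, and doesn't rely on exact metric names) picks out the request, success, and failure counters:

curl -s http://$INGRESS_LB:9990/admin/metrics.json | jq 'with_entries(select(.key | test("requests|success|failures")))'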

That’s all!

With just three simple commands we were able to install linkerd on our Kubernetes cluster, install an app, and use linkerd to gain visibility into the health of the app’s services. Of course, linkerd is providing much more than visibility: under the hood, we’ve enabled latency-aware load balancing, automatic retries and circuit breaking, distributed tracing, and more. In upcoming posts in this series, we’ll walk through how to take advantage of all these features.

In the meantime, for more details about running linkerd in Kubernetes, visit the Kubernetes Getting Started Guide or hop in the linkerd slack and say hi!

Stay tuned for Part II in this series: Pods Are Great Until They’re Not.

Discuss this article on Hacker News.
