Buoyant

Industrial-strength operability for cloud-native applications

The development of distributed systems is full of strange paradoxes. The reasoning we develop as engineers working on a single computer can break down in unexpected ways when applied to systems made of many computers. In this article, we’ll examine one such case—how the introduction of an additional network hop can actually decrease the end-to-end response time of a distributed system.

Cross-posted on the Cloud Native Computing Foundation blog.

Today, the Cloud Native Computing Foundation’s (CNCF) Technical Oversight Committee (TOC) voted to accept linkerd as its fifth hosted project, alongside Kubernetes, Prometheus, OpenTracing and Fluentd.

One of the inevitabilities of moving to a microservices architecture is that you’ll start to encounter partial failures—failures of one or more instances of a service. These partial failures can quickly escalate to full-blown production outages. In this post, we’ll show how circuit breaking can be used to mitigate this type of failure, and we’ll give some example circuit breaking strategies and show how they affect success rate.

In March 2016 at KubeCon EU, I gave my first public talk on linkerd. At the end of this talk, as in most of the other 20+ talks I gave in 2016, I presented a high-level linkerd roadmap that aspirationally included HTTP/2 & gRPC integration. As we enter 2017, I’m pleased to say that we’ve reached this initial goal. Let me take this opportunity to summarize what I think is novel about these technologies and how they relate to the future of linkerd service meshes.

Staging new code before exposing it to production traffic is a critical part of building reliable, low-downtime software. Unfortunately, with microservices, the addition of each new service increases the complexity of the staging process, as the dependency graph between services grows quadratically with the number of services. In this article, we’ll show you how one of linkerd’s most powerful features, per-request routing, allows you to neatly sidestep this problem.

Risha Mars 6 January 2017

Linkerd, our service mesh for cloud-native applications, needs to handle very high volumes of production traffic over extended periods of time. In this post, we’ll describe the load testing strategies and tools we use to ensure linkerd can meet this goal. We’ll review some of the problems we faced when trying to use popular load testers. Finally, we’ll introduce slow_cooker, an open source load tester written in Go, which is designed for long-running load tests and lifecycle issue identification.

We’re happy to announce that we’ve released linkerd 0.8.4! Two things are worth noting about this release. First, Kubernetes and Consul support are now officially production-grade features, which is overdue, since they’re already widely used in production. Second, this release features some significant improvements to linkerd’s HTTP/2 and gRPC support, especially around backpressure and request cancelation.

In this post we’ll show you how to use a service mesh of linkerd instances to handle ingress traffic on Kubernetes, distributing traffic across every instance in the mesh. We’ll also walk through an example that showcases linkerd’s advanced routing capabilities by creating a dogfood environment that routes certain requests to a newer version of the underlying application, e.g. for internal, pre-release testing.

Risha Mars 18 November 2016

Beyond service discovery, top-line metrics, and TLS, linkerd also has a powerful routing language, called dtabs, that can be used to alter the ways that requests—even individual requests—flow through the application topology. In this article, we’ll show you how to use linkerd as a service mesh to do blue-green deployments of new code as the final step of a CI/CD pipeline.

Sarah Brown 4 November 2016

In this article, we’ll show you how to use linkerd as a service mesh to add TLS to all service-to-service HTTP calls, without modifying any application code.

Alex Leong 24 October 2016