The development of distributed systems is full of strange paradoxes. The reasoning we develop as engineers working on a single computer can break down in unexpected ways when applied to systems made of many computers. In this article, we’ll examine one such case—how the introduction of an additional network hop can actually decrease the end-to-end response time of a distributed system.
Cross-posted on the Cloud Native Computing Foundation blog.
Today, the Cloud Native Computing Foundation’s (CNCF) Technical Oversight Committee (TOC) voted to accept linkerd as its fifth hosted project, alongside Kubernetes, Prometheus, OpenTracing and Fluentd.
One of the inevitabilities of moving to a microservices architecture is that you’ll start to encounter partial failures—failures of one or more instances of a service. These partial failures can quickly escalate to full-blown production outages. In this post, we’ll show how circuit breaking can be used to mitigate this type of failure, and we’ll give some example circuit breaking strategies and show how they affect success rate.
Staging new code before exposing it to production traffic is a critical part of building reliable, low-downtime software. Unfortunately, with microservices, each new service makes the staging process more complex, since the number of potential dependencies between services grows quadratically with the number of services. In this article, we’ll show you how one of linkerd’s most powerful features, per-request routing, allows you to neatly sidestep this problem.
Linkerd, our service mesh for cloud-native applications, needs to handle very high volumes of production traffic over extended periods of time. In this post, we’ll describe the load testing strategies and tools we use to ensure linkerd can meet this goal. We’ll review some of the problems we faced when trying to use popular load testers. Finally, we’ll introduce slow_cooker, an open source load tester written in Go, which is designed for long-running load tests and lifecycle issue identification.
We’re happy to announce that we’ve released linkerd 0.8.4! This release comes with two important notes. First, Kubernetes and Consul support are now officially production-grade features, which is overdue, since both are already widely used in production. Second, this release features some significant improvements to linkerd’s HTTP/2 and gRPC support, especially around backpressure and request cancellation.
In this post we’ll show you how to use a service mesh of linkerd instances to handle ingress traffic on Kubernetes, distributing traffic across every instance in the mesh. We’ll also walk through an example that showcases linkerd’s advanced routing capabilities by creating a dogfood environment that routes certain requests to a newer version of the underlying application, e.g. for internal, pre-release testing.
Beyond service discovery, top-line metrics, and TLS, linkerd also has a powerful routing language, called dtabs, that can be used to alter the ways that requests—even individual requests—flow through the application topology. In this article, we’ll show you how to use linkerd as a service mesh to do blue-green deployments of new code as the final step of a CI/CD pipeline.
In this article, we’ll show you how to use linkerd as a service mesh to add TLS to all service-to-service HTTP calls, without modifying any application code.