Kubernetes is a container orchestration system that originated at Google, and is now being maintained by the Cloud Native Computing Foundation. In this post, I am going to dissect some Kubernetes internals—especially, Deployments and how gradual rollouts of new containers are handled.

What Is a Deployment?

This is how the Kubernetes documentation describes Deployments:

A Deployment controller provides declarative updates for Pods and ReplicaSets.

A Pod is a group of one or more containers which can be started inside a cluster. A pod started manually is not going to be very useful though, as it won't automatically be restarted if it crashes. A ReplicaSet ensures that a Pod...

I spend most of my time at Heroku working on our support tools and services; help.heroku.com is one such example. Heroku's help application depends on the Platform API to, amongst other things, authenticate users, authorize or deny access, and fetch user data.

So, what happens to tools and services like help.heroku.com during a platform incident? They must remain available to both agents and customers—regardless of the status of the Platform API. There is simply no substitute for communication during an outage.

To ensure this is the case, we use api-maintenance-sim, an app we recently open-sourced, to regularly simulate Platform API incidents.


Simulating downtime

During a Platform...

Working with our support team, I often see customers having timeout problems. Typically, their applications will start throwing H12 errors.

The decision to timeout requests quickly wasn't made to avoid having long-running requests on our router, nor to only have fast apps on our platform, but because standard web servers do not handle these types of requests particularly well.

Subscribe to the full-text RSS feed for Damien Mathieu.