Fun fact: the Heroku API consumes more endpoints than it serves. Our availability depends heavily on the availability of the services we interact with, which is the textbook case for applying the circuit breaker pattern.

And so we did:

[Figure: API web queue, p95 latencies]

Circuit breakers really helped us keep the service stable despite third-party interruptions, as this graph of p95 HTTP queue latency shows.
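
For readers who haven't met the pattern, here is a minimal Python sketch of the general idea. It is not the Heroku API's actual implementation: the class name, the threshold and timeout defaults, and the injectable `clock` parameter are all assumptions made for illustration. After a run of consecutive failures the breaker "opens" and rejects calls immediately instead of waiting on a struggling dependency; once a timeout has passed it lets a single trial call through to see whether the dependency has recovered.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Illustrative breaker; names and defaults are hypothetical."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Still open: fail fast without touching the dependency.
                raise CircuitOpenError("circuit open, failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In use, each outbound dependency would get its own breaker, with the call wrapped as `breaker.call(fetch_something, arg)`. The caller catches `CircuitOpenError` and decides how to degrade, for example by returning a cached value or an error response right away rather than letting requests pile up in the queue.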

Here I'll cover the benefits, challenges and lessons learned from introducing this pattern to a large-scale production app.

A brief reminder that everything fails

Our API composes over 20 services – some public (S3, Twilio), some internal (run a process, map a DNS record to an app) and some provided...


Subscribe to the full-text RSS feed for Pedro Belo.