This article was originally authored by Srinath Ananthakrishnan, an engineer on the Heroku Runtime Networking Team

Summary

The following story outlines a recent issue we saw while migrating one of our internal systems to a new EC2 substrate, which in the process broke one of our customer’s use cases. We also outline how we discovered the root of the issue, how we fixed it, and how we enjoyed solving a complex problem that helped keep the Heroku customer experience as simple and straightforward as possible!

History

Heroku has been leveraging AWS and EC2 since the very early days. All these years, the Common Runtime has been running on EC2 Classic and while there have...


In true JavaScript fashion, there was no shortage of releases in the JavaScript ecosystem this year. This includes the Yarn project’s release of Yarn 2 with a compressed cache of JavaScript dependencies, including a Yarn binary to reference, that can be used for a zero-install deployment.


Yarn is a package manager that also provides developers a project management toolset. Yarn 2 is now officially supported by Heroku, and Heroku developers can take advantage of zero-installs during their Node.js builds. We’ll go over a popular use case for Yarn that is enhanced by Yarn 2: using workspaces to manage dependencies for your monorepo.

We will cover taking advantage of...


This post previously appeared on the Salesforce Architects blog.

Event-driven application architectures have proven to be effective for implementing enterprise solutions using loosely coupled services that interact by exchanging asynchronous events. Salesforce enables event-driven architectures (EDAs) with Platform Events and Change Data Capture (CDC) events as well as triggers and Apex callouts, which makes the Salesforce Platform a great way to build all of your digital customer experiences. This post is the first in a series that covers various EDA patterns, considerations for using them, and examples deployed on the Salesforce Platform.

Expanding the event-driven architecture of the...


This post is an update on a previous post about how Heroku handles incident response.

As a service provider, when things go wrong, you try to get them fixed as quickly as possible. In addition to technical troubleshooting, there’s a lot of coordination and communication that needs to happen in resolving issues with systems like Heroku’s.

At Heroku we’ve codified our practices around these aspects into an incident response framework. Whether you’re just interested in how incident response works at Heroku, or looking to adopt and apply some of these practices for yourself, we hope you find this inside look helpful.

Incident Response and the Incident Commander Role

We describe Heroku’s...


Incidents are inevitable. Any platform, large or small, will have them. While resiliency work is an important factor in reducing the number of incidents, hoping to remove all of them (and therefore reach 100% uptime) is not an achievable goal.

We should, however, learn as much as we can from incidents, so we can avoid repeating them.

In this post, we will look at one of those incidents, #2105, see how it happened (spoiler: I messed up), and what we’re doing to prevent it from happening again (spoiler: I’m not fired).

