Simulate Third-Party Downtime

engineering Last Updated: March 03, 2017 Damien Mathieu, Software Craftsman

I spend most of my time at Heroku working on our support tools and services; help.heroku.com is one such example. Heroku's help application depends on the Platform API to, amongst other things, authenticate users, authorize or deny access, and fetch user data.

So, what happens to tools and services like help.heroku.com during a platform incident? They must remain available to both agents and customers—regardless of the status of the Platform API. There is simply no substitute for communication during an outage.

To ensure this is the case, we use api-maintenance-sim, an app we recently open-sourced, to regularly simulate Platform API incidents.


Simulating downtime

During a Platform...

Preparing for Major Response

engineering Last Updated: April 27, 2017 Kevin Thompson

Earlier this month, the OpenSSL project team announced that it would release a new version of OpenSSL three days later to address a high-severity security defect. In the end, the vulnerability turned out to be another non-event for our customers, but we thought it would be useful and informative to share the process we went through to prepare for it.

Triage

The announcement from the OpenSSL project team said only that a vulnerability would be patched; the team kept the specifics embargoed to limit the likelihood of an attack before the patch was released. Obviously, it’s difficult to gauge the potential impact of a vulnerability when you don’t know the...

New Heroku Status Site

news Last Updated: September 08, 2016 Mark Pundsack

Developers like you deploy code to hundreds of thousands of apps every month on the Heroku platform. Some of these are production apps which serve hundreds of millions or even billions of requests per month. Uptime of the platform is critical for such apps.

We want to achieve the sustained reliability that these apps require. But when there are incidents that impact uptime, we want to maximize our transparency and accountability to you and all developers on the platform.

Today, we’re launching a completely redesigned status.heroku.com, which provides real-time status of the platform, the ability to sign up for email or SMS notifications of incidents, and recent uptime history in both...
