At Heroku, we're always working towards increased operational stability with the services we offer. As we recently launched the beta of Apache Kafka on Heroku, we've been running a number of clusters on behalf of our beta customers.
Over the course of the beta, we have thoroughly exercised Kafka through a wide range of cases, which is an important part of bringing a fast-moving open-source project to market as a managed service. This breadth of exposure led us to the discovery of a memory leak in Kafka, having a bit of an adventure debugging it, and then contributing a patch to the Apache Kafka community to fix it.
Issue Discovery
For the most part, we’ve seen very few issues...