Dawn of the Dead Ends: Fixing a Memory Leak in Apache Kafka

news Last Updated: June 03, 2024 Tom Crayford, Engineer

At Heroku, we're always working towards increased operational stability with the services we offer. As we recently launched the beta of Apache Kafka on Heroku, we've been running a number of clusters on behalf of our beta customers.

Over the course of the beta, we have thoroughly exercised Kafka through a wide range of cases, which is an important part of bringing a fast-moving open-source project to market as a managed service. This breadth of exposure led us to discover a memory leak in Kafka, have a bit of an adventure debugging it, and then contribute a patch to the Apache Kafka community to fix it.

Issue Discovery

For the most part, we’ve seen very few issues...

Powering the Heroku Platform API: A Distributed Systems Approach Using Streams and Apache Kafka

news Last Updated: September 08, 2016 Scott Persinger

We recently launched Apache Kafka on Heroku into beta. Just as with Heroku Postgres, our internal engineering teams have been using our Kafka service to power a number of our internal systems.

The Big Idea

The Heroku platform comprises a large number of independent services. Traditionally we’ve used HTTP calls to communicate between these services. While this approach is simple to implement and easy to reason about, it has a number of drawbacks. Synchronous calls mean that the top-level request time will be gated by the slowest backend component. Also, internal API calls create tight point-to-point couplings between services that can become very brittle over time.
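To make the latency point concrete, here is a minimal sketch. The service names and latency figures are made up for illustration and do not describe Heroku's actual backends. It compares calling each backend one after another with calling them all at once; even in the concurrent best case, the top-level request cannot finish before the slowest backend does.

```python
# Hypothetical backend latencies in milliseconds (illustrative values only).
BACKEND_LATENCIES_MS = {"auth": 20, "billing": 45, "routing": 300}


def sequential_request_ms(latencies):
    """Top-level time when each backend is called one after another."""
    return sum(latencies.values())


def concurrent_request_ms(latencies):
    """Best-case top-level time when all backends are called at once.

    The request still has to wait for the slowest backend to answer.
    """
    return max(latencies.values())


sequential = sequential_request_ms(BACKEND_LATENCIES_MS)
concurrent = concurrent_request_ms(BACKEND_LATENCIES_MS)
print(f"sequential: {sequential} ms, concurrent best case: {concurrent} ms")
```

With these assumed numbers, the single slow `routing` call dominates the request time either way.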

Asynchronous...

Apache Kafka 0.10

engineering Last Updated: June 03, 2024 Tom Crayford, Engineer

At Heroku, we're always striving to provide the best operational experience with the services we offer. As we’ve recently launched Heroku Kafka, we were excited to help out with testing of the latest release of Apache Kafka, version 0.10, which landed earlier this week. While testing Kafka 0.10, we uncovered what seemed like a 33% throughput drop relative to the prior release. As others have noted, “it’s slow” is the hardest problem you’ll ever debug, and debugging this turned out to be very tricky indeed. We had to dig deep into Kafka’s configuration and operation to uncover what was going on.

Background

We've been benchmarking Heroku Kafka for some time, as we prepared for the...

Announcing Heroku Kafka Early Access

news Last Updated: April 24, 2024 Rand Arete, Senior Director of Product

Today we are happy to announce early access to Heroku Kafka. We think Kafka is interesting and exciting because it provides a powerful and scalable set of primitives for reasoning about, building, and scaling systems that can handle high volumes and velocities of data. Heroku Kafka makes Kafka more accessible, reliable, and easy to integrate into your applications.

What is Kafka?

Apache Kafka is a distributed commit log for fast, fault-tolerant communication between producers and consumers using message-based topics. Kafka provides the messaging backbone for building a new generation of distributed applications capable of handling billions of events and millions of transactions.
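The commit log idea can be sketched with a toy, single-process model. This is not Kafka's API or implementation: `ToyCommitLog` and its methods are hypothetical names, and the sketch leaves out distribution, partitioning, replication, and persistence. It only shows the core shape, in which producers append messages to a named topic and consumers read forward from an offset that they track themselves.

```python
from collections import defaultdict


class ToyCommitLog:
    """In-memory stand-in for a topic-based commit log (hypothetical, not Kafka's API)."""

    def __init__(self):
        # Each topic is an append-only list; a message's index is its offset.
        self._topics = defaultdict(list)

    def append(self, topic, message):
        """Producer side: append a message and return the offset it was stored at."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset, max_messages=10):
        """Consumer side: return up to max_messages starting at the given offset."""
        return self._topics.get(topic, [])[offset:offset + max_messages]


log = ToyCommitLog()
first_offset = log.append("events", "signup")
second_offset = log.append("events", "login")

# A consumer keeps its own position and advances it after each batch.
consumer_offset = 0
batch = log.read("events", consumer_offset)
consumer_offset += len(batch)
print(batch, consumer_offset)
```

Because reading does not remove anything from the log, a second consumer starting at offset 0 would see the same messages independently of the first.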

At...

All posts tagged with kafka