Introducing the Streaming Data Connectors Beta: Capture Heroku Postgres Changes in Apache Kafka on Heroku

Today we are announcing the beta release of our new streaming data connector between Heroku Postgres and Apache Kafka on Heroku. Heroku runs millions of Postgres services and tens of thousands of Apache Kafka services, and we increasingly see developers choosing to start with Apache Kafka as the foundation of their data architecture. But for those who are Postgres-first, adopting Kafka is challenging without a full app rewrite. Developers want a seamless integration between the two services, and we are delivering it today, at no additional charge, for Heroku Private Spaces and Shield Spaces customers.

Heroku streaming data connectors

Moving beyond Postgres and Kafka, the Heroku Data team sees the use cases for data growing more complex and diverse, and we know they can no longer be solved by one database technology alone. As new data services emerge and existing offerings become more sophisticated, the days of a single monolithic datastore are over. Apache Kafka is a key enabling technology for these emerging data architectures.

We spent the last year focused on embracing this new reality outside our four walls. We shipped new features that allow Heroku Managed Data Services to integrate with external resources in Amazon VPCs over PrivateLink and with resources in other public clouds or private data centers over mutual TLS. But we had a problem inside that we wanted to solve too.

Effortless Change Data Capture (CDC) by Heroku

CDC isn’t a new idea. It involves monitoring one or more Postgres tables for inserts, updates, and deletes, and then writing each change to an Apache Kafka topic. Sounds simple enough, but the underlying complexity is significant. We took the time to experiment with the open-source technologies that make it possible and were thrilled to find a path forward that provides a stable service at scale.

We use Kafka Connect and Debezium to take data at rest and put it in motion. Like Heroku Postgres and Apache Kafka on Heroku, the connector is fully managed, has a simple and powerful user experience, and comes with our operational excellence built into every aspect of the service.
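
To make the idea of a change event concrete, here is a minimal Python sketch of reading one Debezium-style event. It assumes the common Debezium envelope, in which each event carries an `op` code plus `before` and `after` row images. The `describe_change` helper and the sample event are hypothetical illustrations, not output captured from the Heroku connector, and the exact fields you receive depend on how the connector is configured.

```python
import json

# Debezium operation codes: c = create (insert), u = update, d = delete,
# r = read (a row captured during the initial snapshot).
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def describe_change(raw_value):
    """Summarize one Debezium-style change event from its JSON value.

    Hypothetical helper for illustration only.
    """
    event = json.loads(raw_value)
    # When schemas are embedded, the envelope sits under "payload";
    # otherwise the envelope is the top-level object.
    payload = event.get("payload", event)
    operation = OPERATIONS.get(payload["op"], "unknown")
    source = payload.get("source", {})
    table = "{}.{}".format(source.get("schema"), source.get("table"))
    # Deletes carry the old row in "before"; other operations carry the
    # new row in "after".
    row = payload["before"] if payload["op"] == "d" else payload["after"]
    return {"table": table, "operation": operation, "row": row}


# Hypothetical event for an update to public.posts (all values are made up).
sample = json.dumps({
    "payload": {
        "op": "u",
        "before": {"id": 1, "title": "Hello"},
        "after": {"id": 1, "title": "Hello, world"},
        "source": {"schema": "public", "table": "posts"},
        "ts_ms": 1594252800000,
    }
})

print(describe_change(sample))
```

In a real consumer, `raw_value` would be the value of a message read from the connector's Kafka topic rather than a hand-built string.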

It’s as Easy as heroku data:connectors:create

To get started, make sure you have Heroku Postgres and Apache Kafka on Heroku add-ons in a Private or Shield Space, as well as the CLI plugin. Then create a connector by identifying the Postgres source and Apache Kafka store by name, specifying which table(s) to include, and optionally specifying which columns to exclude:

heroku data:connectors:create \
    --source postgresql-neato-98765 \
    --store kafka-lovely-12345 \
    --table public.posts --table public.users \
    --exclude public.users.password

See the full instructions and best practices for more detail.

Once provisioned, which takes about 15 minutes, the connector automatically streams changes from Heroku Postgres to Apache Kafka on Heroku. From there, you can refactor your monolith into microservices, implement an event-based architecture, integrate with other downstream data services, build a data lake, archive data in lower-cost storage services, and so much more.
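
As one illustration of a downstream use, the sketch below replays a stream of change events into an in-memory copy of a table, which is the core of keeping a separate service or datastore in sync. It assumes Debezium-style payloads with `op`, `before`, and `after` fields and a primary key column named `id`. The `apply_change` function and the sample events are hypothetical, and a plain Python list stands in for messages that a Kafka client library would normally deliver.

```python
def apply_change(replica, payload, key="id"):
    """Apply one change-event payload to an in-memory copy of a table.

    Hypothetical helper: `replica` maps primary key values to rows.
    """
    if payload is None:
        # Debezium follows a delete with a tombstone (null value); skip it.
        return replica
    op = payload["op"]
    if op in ("c", "r", "u"):
        # Inserts, snapshot reads, and updates all carry the new row in "after".
        row = payload["after"]
        replica[row[key]] = row
    elif op == "d":
        # Deletes carry the removed row (at least its key) in "before".
        replica.pop(payload["before"][key], None)
    return replica


# Hypothetical events for public.posts, in the order they were captured.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "title": "Hello"}},
    {"op": "c", "before": None, "after": {"id": 2, "title": "Draft"}},
    {"op": "u", "before": {"id": 1, "title": "Hello"},
     "after": {"id": 1, "title": "Hello, world"}},
    {"op": "d", "before": {"id": 2, "title": "Draft"}, "after": None},
    None,  # tombstone that follows the delete
]

posts = {}
for payload in events:
    apply_change(posts, payload)

print(posts)
```

Because each event is applied by primary key, replaying the same events again leaves the copy in the same state, which helps when a consumer restarts and rereads part of a topic.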

Feedback Welcome

We are thrilled to share our latest work with you and eager to get your feedback. Please send any questions, comments, or feature requests our way.

Originally published: July 09, 2020
