Turn Your Code into Docker Images with Cloud Native Buildpacks

When we open-sourced buildpacks nearly seven years ago, we knew they would simplify the application deployment process. After a developer runs git push heroku master, a buildpack ensures the application's dependencies and compilation steps are taken care of as part of the deploy.

As previously announced, we've taken the same philosophies that made buildpacks so successful and applied them to Cloud Native Buildpacks (CNB), a standard for turning source code into Docker images without the need for Dockerfiles. In this post, we'll look at how CNBs work, how they aim to solve many of the problems with Dockerfile, and how you can use them with the recent beta release of the buildpacks.io project. As part of this release, we've created a Heroku buildpacks builder image for Ruby, Node.js, Java, Python, PHP, and Go that works with the CNB tooling.

Let’s start by creating a strawman. We’ll walk through some of the tedious but essential steps necessary to create a Dockerfile for a Ruby on Rails app.

A Leaky Abstraction: Incrementally Writing a Dockerfile

Most developers use Docker by creating a Dockerfile, which defines a build process that generates a Docker image. For example, let's say you have an existing Rails project you want to deploy as a Docker container. You'll need to start from a base Ruby image, and include additional packages that are necessary for the application to run. If you've never used Docker before, you probably have to learn several things just to get this far:

FROM ruby
RUN apt-get update -qq \
  && apt-get install -y nodejs libpq-dev build-essential
COPY . /app
WORKDIR /app
RUN bundle install
RUN bundle exec rake assets:precompile
CMD bin/rails s

Beyond Ruby itself, a Docker image for a Rails app needs several Apt packages. It must include the nodejs runtime to run the tooling that precompiles assets, libpq-dev is required for communicating with a Postgres database, and build-essential provides gcc to build native extensions for several Ruby gems.

This Dockerfile is enough to run a simple Rails application in production, but the image will be bloated with extraneous cache directories. You might want to reduce the image size by removing those files, which are only useful for sequential local builds:

RUN apt-get update -qq \
  && apt-get install -y nodejs libpq-dev build-essential \
  && apt-get clean autoclean && apt-get autoremove -y \
  && rm -rf /var/lib/apt /var/lib/dpkg /var/lib/cache /var/lib/log
# ..
RUN bundle exec rake assets:precompile \
  && rm -rf /app/tmp/cache/assets/

This highlights one of the shortcomings of Dockerfile when it comes to speed. It can't make proper use of those cache directories because Docker's layer cache is all or nothing: a layer is either reused in full or rebuilt from scratch. However, there are some clever tricks you can use to speed up builds by caching information about dependencies. Instead of copying the entire app in at once, you can selectively add files like this:

COPY Gemfile Gemfile.lock /app/
RUN bundle install

Copying the Gemfile and Gemfile.lock separately takes advantage of Docker's layer caching: changing a single line of application code no longer invalidates the cached layer that installs your gems.
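Putting these pieces together, a cache-friendly Dockerfile for the same app might look something like this sketch (same packages and paths as above):

```dockerfile
FROM ruby

# System packages in one layer, with Apt caches removed
RUN apt-get update -qq \
  && apt-get install -y nodejs libpq-dev build-essential \
  && apt-get clean autoclean && apt-get autoremove -y \
  && rm -rf /var/lib/apt /var/lib/dpkg /var/lib/cache /var/lib/log

WORKDIR /app

# Copy only the dependency manifests first, so the bundle install
# layer is reused as long as the Gemfile is unchanged
COPY Gemfile Gemfile.lock ./
RUN bundle install

# Now copy the rest of the source; code edits no longer
# invalidate the layers above
COPY . .
RUN bundle exec rake assets:precompile \
  && rm -rf /app/tmp/cache/assets/

CMD bin/rails s
```

Note how much Docker-specific knowledge is packed into a file this small; that is exactly the burden the rest of this post is about removing.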

These examples highlight just a few of the challenges you’ll face when constructing a Dockerfile for your app. You’ll need to repeat those challenges for every app that requires a Dockerfile. Oftentimes, you’ll end up copying and pasting sections of Dockerfiles from one app to another, which is a recipe for a maintenance nightmare.

Maintenance is the biggest shortcoming of Dockerfile. Aside from copy-pasted code, it introduces lower-level concerns that you wouldn't otherwise need to worry about. For instance, Ruby, like many languages, has several base images you can inherit from, each with its own size and security considerations. If, one day, the Rails ecosystem requires a new dependency that isn't included in your existing Dockerfile, you are responsible for updating the configuration as needed. And if you've broken your project out into microservices, that could mean updating several files across multiple locations.

Ultimately, Dockerfile is a leaky abstraction. It forces developers to be aware of operational and platform concerns that were previously abstracted away. To write a good Dockerfile, you must understand the underlying mechanisms and how each step of the image generation process works in order to properly handle future updates.

All of these problems stem from Dockerfile's lack of app awareness. Without context about your application or the frameworks you use, there's a giant mismatch between how a developer builds an application and the tools they use to deploy that app.

Mixing operational concerns with application concerns like this results in a poor tool for developers who just want to write code and ship it as painlessly as possible. Given these deficiencies, let’s take a look at an alternative to reduce this complexity.

Learning From Buildpacks

If you've ever deployed an application using Heroku, you know that it's as easy as running git push heroku master in your local directory. Behind the scenes, a buildpack retrieves dependencies, processes assets, handles caching, and compiles code for whatever language your app is built in. For example, consider a Rails application. The Ruby buildpack will install Ruby and bundler. Your gem dependencies are fetched, your assets are compiled, and the cache is cleaned up:

$ git push heroku master
remote: Compressing source files... done.
remote: Building source:
remote: -----> Ruby app detected
remote: -----> Compiling Ruby/Rails
remote: -----> Using Ruby version: ruby-2.6.0
remote: -----> Installing dependencies using bundler 1.15.2
remote:        Running: bundle install --without development:test --path vendor/bundle --binstubs vendor/bundle/bin -j4 --deployment
remote:        Bundle complete! 18 Gemfile dependencies, 61 gems now installed.
remote:        Gems in the groups development and test were not installed.
remote:        Bundled gems are installed into `./vendor/bundle`
remote:        Removing bundler (1.15.2)
remote:        Bundle completed (42.62s)
remote:        Cleaning up the bundler cache.
remote:        Asset precompilation completed (3.72s)
remote:        Cleaning assets
remote:        Running: rake assets:clean
remote: -----> Detecting rails configuration
remote: -----> Compressing...
remote:        Done: 41.3M
remote: -----> Launching...
remote:        Released v6
remote:        https://myapp.herokuapp.com/ deployed to Heroku
remote: Verifying deploy... done.

A buildpack automatically handles all these steps for you by recognizing the conventions of your application's language. Buildpacks were designed to configure whatever is necessary to run your application. With Cloud Native Buildpacks, we wanted a similar system that allowed developers to focus on their app and not piece together a build pipeline, while taking advantage of Docker and modern container standards.

Running Cloud Native Buildpacks

The desire to combine the simplicity and usability of buildpacks with the benefits of containers led us to develop Cloud Native Buildpacks (CNB), which produce an OCI-compliant image that works with existing Docker tooling and the broader container ecosystem.

The buildpacks.io project is the home of the open source tooling that makes our vision possible. The first of these tools is pack build, which behaves in much the same way as git push heroku master. You can run it against any arbitrary repository and it will produce a Docker image. Here's an example of running the Heroku Cloud Native Ruby buildpack against a Rails app:
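For instance, the invocation might look like this (the image name is arbitrary, and the builder tag is illustrative):

```shell
# Build an OCI image from the app source in the current directory
$ pack build my-rails-app --builder heroku/builder:22

# The result is a normal Docker image you can run anywhere
$ docker run --rm -p 3000:3000 my-rails-app
```

No Dockerfile is present in the repository; the buildpack supplies everything the image needs.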


Much like the classic buildpacks that produce execution-ready slugs, a CNB identifies what to install based on the files already in your project. No configuration is necessary to identify your application's requirements. And because the buildpack is app-aware, knowing the precise language and dependencies your app uses, the build phases also come with sensible defaults for memory settings and concurrency.

The steps that a CNB process undertakes to produce the final image are very similar to the stages for existing Heroku buildpacks:

  • The CLI detects the primary language of your project. For example, if your source code directory has a Gemfile, the CNB will identify it as a Ruby project; a pom.xml file identifies it as a Java project, and so on.
  • The execution environment then analyzes a previous build to determine if there are any steps which can be reused in a subsequent build.
  • The CNB runs the build, downloading any dependencies and preparing the application to run in production.
  • Finally, it exports the result of that build as a Docker image.
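Conceptually, the steps above map onto the binaries of the CNB lifecycle, which pack runs for you inside the builder image (binary names come from the lifecycle specification and may change between releases):

```shell
/cnb/lifecycle/detector   # which buildpacks apply? (Gemfile -> Ruby, pom.xml -> Java)
/cnb/lifecycle/analyzer   # compare against a previous build to find reusable layers
/cnb/lifecycle/builder    # run each buildpack's build step (fetch dependencies, compile)
/cnb/lifecycle/exporter   # assemble the resulting layers into an OCI image
```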

The underlying processes that generate the image are handled behind the scenes by the buildpack. If the image needs to be updated, such as when a vulnerability is detected in its base layers, you can fetch a patched toolchain and run pack rebase, which swaps those layers into your image in less than a second without a rebuild. This saves an enormous amount of time compared to rebuilding every one of your apps from a Dockerfile, a process that can take hours.
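For example (the image name is illustrative):

```shell
# Swap in patched base layers without rebuilding the app layers
$ pack rebase my-rails-app
```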

Try Cloud Native Buildpacks Today

There’s no better time to give Cloud Native Buildpacks a try than right now. The project has reached its first Beta release, and it’s ready for you to use and provide feedback.

To get started, download the pack CLI and use one of our buildpacks (Ruby, Node.js, Java, Python, PHP, or Go) in your app source directory:

$ pack build <docker image name> --builder heroku/builder:22

Come join us on Slack. We also have API documentation available that defines the buildpack spec if you'd like to generate your own OCI images.

Heroku has always found it important to meet developers where they are: at their application's source code. We believe that Cloud Native Buildpacks reduce the operational complexity of building container-based applications and free developers up to focus on building great features for their users.

Originally published: April 03, 2019
