Git Push Heroku Master: Now 40% Faster

Flow is an important part of software development. The ability to achieve flow during daily work makes software development a uniquely enjoyable profession. Interruptions in your code/test loop make this state harder to achieve. Whether you are running unit tests locally, launching a local webserver, or deploying to Heroku there's always some waiting and some interruption. Every second saved helps you stay in your flow.

We’ve been working on reducing the time it takes to build your code on Heroku. Read through this post for details on the process we used to make builds fast, or check out the end result from the graph below:

heroku_build_speed@2x

Let's take a look at our process in delivering these improvements further.

Read more →

Hacking Buildpacks

Buildpacks are an extremely powerful tool for specifying the ecosystem of tools and dependencies packaged with your Heroku application and controlling the way the application is built from code to a deployed app.

In the post announcing the release of buildpacks we illustrated this point, explaining how buildpacks provide the mechanism by which Heroku supports a variety of languages and frameworks, not just Ruby and Rails. We also briefly covered some of the end-user customizations that can be achieved with custom buildpacks, such as adding binary support and modifying the build process.

Today we'll examine the basic structure of buildpacks and study some example customizations to better understand how they can be used to extend the capabilities of the Heroku defaults.

The Anatomy of a Buildpack

At its core, a buildpack is a collection of 3 Bash scripts stored in the bin directory. These scripts are called detect, compile, and release. We'll take a quick look at how each of these scripts contributes to supporting a specific language or framework.

An excellent skeleton buildpack with all of these minimal components is Ryan Smith's null-buildpack. If you're creating a buildpack from scratch, forking null-buildpack is a good place to start.

detect

The bin/detect script is important to Heroku's default buildpacks. When an app is deployed, the detect script is used to figure out which buildpack is appropriate for the project.

Most buildpacks detect frameworks by searching for certain config files. For example, the Ruby buildpack looks for a Gemfile. The Node.js buildpack looks for packages.json, and the Python buildpack looks for requirements.txt.

If a buildpack matches, it returns an exit code of 0 and prints the language/framework name to STDOUT. The aforementioned null-buildpack shows what an absolutely minimal detect script would look like.

In the case of custom buildpacks you'll be specifying the buildpack directly, so detection isn't as important, and a minimal detect script is usually sufficient.

compile

The bin/compile script is where most of the magic happens. This script takes 2 arguments, BUILD_DIR and CACHE_DIR.

BUILD_DIR gives you a handle for the root directory of the app, so you can read and write files into the slug. This is where binaries are installed, Heroku-specific config files are written, dependencies are resolved and installed, and static files are built.

CACHE_DIR gives you a location to persist build artifacts between deployments.

We'll take a closer look at modifying the slug and caching build artifacts in the examples below.

release

Whereas the bin/compile script modifies the slug, the bin/release script modifies the release. Instead of modifying files, this script returns a YAML-formatted hash to define any default config variables, Add-ons, or default process types needed by the buildpack.

Now that we understand the basic structure of buildpack scripts and their roles, lets take a look at some example hacks.

Example 1: Adding Binaries for Language Support

The ability to install binaries is critical for most language support. Richard Schneeman's mruby buildpack article over on RubySource provides an excellent example of installing custom binaries to support a new language.

Building the binary files with Vulcan

The first step for getting new binaries onto Heroku is building them in such a way that they can be run on Heroku dynos (64-bit Linux virtual machines). It turns out that building these binaries directly on Heroku is the easiest method, since the operating system of a Heroku dyno contains common Linux development tools.

Heroku user Jonathan Hoyt discovered these tools early on and blogged about the process of building xpdf for Heroku using:

  • heroku run bash to boot a new dyno and get a bash session on it
  • curl to download the source
  • make to build the project
  • scp to copy the build artifacts to a local machine

Although you could certainly copy Jon's procedure, Heroku now provides a tool called Vulcan to make this process much easier. The vulcan create command deploys a custom app to Heroku under your account. Once the app is created, you use the vulcan build command to upload source, build the project, and download the results all in one step.

For more info on Vulcan, see its README and the Heroku Dev Center article on packaging binaries.

Hosting the built files

Once Vulcan has completed and the build artifacts have been downloaded, you'll need to host them somewhere on the web so that Heroku's build servers will be able to download them. Heroku's default language binaries are stored on AmazonS3, for example. Make sure the location you use is publicly readable.

Modifying the compile script

Next you need to make the buildpack copy the binary files down into your project. This is done in the buildpack's bin/compile script. We can again refer to the mruby buildpack for a straightforward example of how the files are copied down. The steps used are as follows:

  1. Change directories into the build directory. (This directory will be the root of any apps deployed with this buildpack.)

  2. Fetch the archive of the binary files.

  3. Make a directory under /vendor to store the binaries.

  4. Extract the archive.

Modifying the PATH

Finally, you'll need to add the location of your binaries to the PATH environment variable so that they can be called from anywhere. As discussed earlier, default environment variables are defined by the YAML string returned by the bin/release script. Here you can see PATH being set for the mruby binaries.

Example 2: Using the Build Cache to Speed Up Deployments

The default Ruby buildpack provides support for the Rails asset pipeline by running rake assets:precompile, building shorthand source like coffeescript and sass files into static, browser-consumable javascript and css files. As long as the correct conditions are met, this task will be run on every deploy. Unfortunately, assets:precompile can be very slow, especially on large projects with lots of assets.

To address this slowness, Nathan Broadbent released the turbo-sprockets-rails3 gem. TurboSprockets only compiles assets whose source files have changed, making the asset compilation step much faster after the initial run. This is a nifty enhancement, but it depends on the ability to cache assets between builds. This is where the CACHE_DIR argument to the compile script comes in handy.

(NOTE: The Ruby buildpack is a bit different in that it's not totally Bash. The bin/compile script invokes a Ruby script. The Ruby code provides some convenience methods for manipulating files in and out of the build cache.)

Nathan forked and extended the default Ruby buildpack to take advantage of turbo-sprockets-rails3's abilities. This buildpack modifies the default behavior by loading cached files from CACHE_DIR/public/assets into BUILD_DIR/public/assets. The assets:precompile task then runs as usual, but since turbo-sprockets-rails3 is installed, any unmodified assets won't be rebuilt. The script then runs a custom asset expiration task, storing assets back to CACHE_DIR/public/assets if it's successful, and clearing CACHE_DIR/public/assets if it fails.

Example 3: Installing Framework Tools

Many popular languages have several competing frameworks for web apps; Rails and Sinatra for Ruby and Django and Pylons for Python are some well-known examples. Usually, these frameworks are contained within a few libraries, so if you want to support a new framework it makes sense to modify existing language buildpacks instead of starting from scratch. James Ward took this approach when he wanted to support Revel, a web framework for the Go programming language.

First he forked the existing Go buildpack.

Next, he modified the bin/detect script to look for a Revel-specific file and announce that it's a Revel buildpack.

Then he added some functionality to the end of the bin/compile script to fetch and build Revel.

A new line in the bin/release script defines a default web process in the Procfile for running Revel.

Conclusion

I hope this tour into hacking buildpacks has been informative, and you've gained some insight into how buildpacks work and how they can be extended to meet your needs. Whether you want to host apps in a new language, or tweak the tools for an existing one, buildpacks are a step towards always answering "yes" to the question, "Does it run on Heroku?"

For more reference information, please check out the buildpack articles available in our Dev Center.

Buildpacks: Heroku for Everything

Last summer, Heroku became a polyglot platform, with official support for Ruby, Node.js, Clojure, Java, Python, and Scala. Building a platform that works equally well for such a wide variety of programming languages was a unique technical design challenge.

siloed products would be a non-scalable design

We knew from the outset that maintaining siloed, language-specific products – a Heroku for Ruby, a Heroku for Node.js, a Heroku for Clojure, and so on – wouldn't be scalable over the long-term.

Instead, we created Cedar: a single, general-purpose stack with no native support for any language. Adding support for any language is a matter of layering on a build-time adapter that can compile an app written in a particular language or framework into an executable that can run on the universal runtime provided by Cedar. We call this adapter a buildpack.

build-time language adapters support a single runtime stack

An Example: Ruby on Rails

If you've deployed any app to the Cedar stack, then you've already used at least one buildpack, since the buildpack is what executes during git push heroku master. Let's explore the Ruby buildpack by looking at the terminal output that results when deploying a Rails 3.2 app:

$ git push heroku master
Counting objects: 67, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (53/53), done.
Writing objects: 100% (67/67), 26.33 KiB, done.
Total 67 (delta 5), reused 0 (delta 0)

-----> Heroku receiving push
-----> Ruby/Rails app detected
-----> Installing dependencies using Bundler version 1.2.0.pre
       Running: bundle install --without development:test --path vendor/bundle --binstubs bin/ --deployment
       Fetching gem metadata from https://rubygems.org/.......
       Installing rake (0.9.2.2)
       ...
       Your bundle is complete! It was installed into ./vendor/bundle
-----> Writing config/database.yml to read from DATABASE_URL
-----> Preparing app for Rails asset pipeline
       Running: rake assets:precompile
       Asset precompilation completed (16.16s)
-----> Rails plugin injection
       Injecting rails_log_stdout
       Injecting rails3_serve_static_assets
-----> Discovering process types
       Procfile declares types      -> (none)
       Default types for Ruby/Rails -> console, rake, web, worker
-----> Compiled slug size is 9.6MB
-----> Launching... done, v4
       http://chutoriaru.herokuapp.com deployed to Heroku

To git@heroku.com:chutoriaru.git
 * [new branch]      master -> master

Everything that happens between Heroku receiving push and Compiled slug size is 9.6MB is part of the buildpack. In order:

The slug that results from this Rails-specific build process can now be booted on our language-agnostic dyno manifold alongside Python, Java, and many other types of applications.

Using a Custom Buildpack

In the example above, the appropriate buildpack was automatically detected from our list of Heroku-maintained defaults.

However, you can also specify your desired buildpack using arguments to the heroku create command or by setting the BUILDPACK_URL config variable. This enables the use of custom buildpacks. If you want to run your Rails app on JRuby, for example, specify the buildpack created by the JRuby team at app creation time:

$ heroku create --buildpack https://github.com/jruby/heroku-buildpack-jruby

Arbitrary Language Support

Since language support can be completely contained inside a buildpack, it is possible to deploy an app written in nearly any language to Heroku. Indeed, there are a variety of third-party buildpacks already available:

See the full list of third party buildpacks in the Dev Center.

Customizing the Build Process

In addition to enabling new language support, the ability to select a buildpack allows you to modify the previously closed Heroku build process for popular languages.

For example, consider a Ruby app that needs to generate static files using Jekyll. Before buildpacks, the only solutions would have been to 1) generate the files before deployment and commit them to the repository or 2) generate the files on-the-fly at runtime. Neither of these solutions are ideal as they violate the strict separation that should be maintained between the codebase, the build stage, and the run stage.

By forking the official Ruby buildpack, you could add a site generation step to your build process, putting file generation in the build stage where it belongs.

All of the default buildpacks are open source, available for you to inspect, and fork to modify for your own purposes. And if you make a change that you think would be useful to others, please submit an upstream pull request!

Adding Binary Support

Your app might depend on binaries such as language VMs or extensions that are not present in the default runtime. If this is the case, these dependencies can be packaged into the buildpack. A good example is this fork of the default Ruby buildpack which adds library support for the couchbase gem. Vulcan is a tool to help you build binaries compatible with the 64-bit Linux architecture which dynos run on.

Buildpacks Beyond Heroku

Buildpacks are potentially useful in any environment, and we'd love to see their usage spread beyond the Heroku platform. Minimizing lock-in and maximizing transparency is an ongoing goal for Heroku.

Using buildpacks can be a convenient way to leverage existing, open-source code to add new language and framework support to your own platform. Stackato, a platform-as-a-service by ActiveState, recently announced support for Heroku buildpacks.

You can also run buildpacks on your local workstation or in a traditional server-based environment with Mason.

Conclusion

Get started hacking buildpacks today by forking the Hello Buildpack! Read up on the implementation specifics laid out in the Buildpack API documentation, and join the public Buildpacks Google Group. If you make a buildpack you think would be useful and that you intend to maintain, send us an email at buildpacks@heroku.com and for potential inclusion on the third-party buildpacks page.

Browse the blog archives or subscribe to the full-text feed.