Adding Concurrency to Rails Apps with Unicorn

With support for Node.js, Java, Scala and other multi-threaded languages, Heroku allows you to take full advantage of concurrent request processing and get more performance out of each dyno. Ruby should be no exception.

If you are running Ruby on Rails with Thin, or another single-threaded server, you may be seeing bottlenecks in your application. These servers process only one request at a time and can cause unnecessary queuing. Instead, you can improve performance by choosing a concurrent server such as Unicorn, which can make your app faster and make better use of your system resources. In this article we will explore how Unicorn works, how it gives you more processing power, and how to run it on Heroku.

Concurrency and Forking

At the core of Heroku is the Unix Philosophy, and we see this philosophy at work in Unicorn. Unicorn uses the Unix concept of forking to give you more concurrency.

Process forking is a critical component of Unix's design. When a process forks it creates a copy of itself. Unicorn forks multiple OS processes within each dyno to allow a Rails app to support multiple concurrent requests without requiring them to be thread-safe. This means that even if your app is only designed to handle one request at a time, with Unicorn you can handle concurrent connections.
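The forking model described above can be sketched in plain Ruby. This is not how Unicorn itself is implemented; it is a minimal illustration that assumes a Unix-like system where `Process.fork` is available. The `handle_request` method, the `run_workers` helper, and the worker count are hypothetical names chosen for the example.

```ruby
# Minimal prefork sketch (assumes a Unix-like OS; Process.fork is not
# available on Windows). This is an illustration, not Unicorn's code.

WORKER_COUNT = 3 # hypothetical, mirrors a setting like worker_processes 3

# Hypothetical stand-in for a request handler that is not thread-safe.
def handle_request(worker_id)
  "worker #{worker_id} handled a request in pid #{Process.pid}"
end

def run_workers(count)
  reader, writer = IO.pipe

  pids = count.times.map do |worker_id|
    # fork copies the current process; the block runs only in the child.
    Process.fork do
      reader.close
      writer.puts handle_request(worker_id)
      writer.close
    end
  end

  writer.close                          # the parent keeps only the read end
  pids.each { |pid| Process.wait(pid) } # reap every child
  lines = reader.read.lines.map(&:chomp)
  reader.close
  lines
end

results = run_workers(WORKER_COUNT)
puts results
```

Each worker is a separate OS process with its own copy of the application, so the handler never runs concurrently with itself inside one process. The order of the printed lines may vary from run to run.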

Unicorn leverages the operating system to do most of the heavy lifting when creating and maintaining these forks. Unix-based systems are extremely efficient at forking, and even take advantage of Copy on Write optimizations that are similar to those in the recently released Ruby 2.0.

Unicorn on Rails

By running Unicorn in production, you can significantly increase throughput per dyno and avoid or reduce queuing when your app is under load. Unicorn can be difficult to set up and configure, so we’ve provided configuration documentation to make it easier to get started.

Let's set up a Rails app to use Unicorn.

Setting up Unicorn

First, add Unicorn to your application Gemfile:

gem 'unicorn'

Run $ bundle install. Now you are ready to configure your app to use Unicorn.

Create a configuration file for Unicorn at config/unicorn.rb:

$ touch config/unicorn.rb

Now we're going to add Unicorn-specific configuration options, which we explain in detail in Heroku's Unicorn documentation:

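# config/unicorn.rb
worker_processes 3
timeout 30
preload_app true

before_fork do |server, worker|
  # Heroku sends TERM on shutdown; Unicorn shuts down gracefully on QUIT.
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end

  # Close the master's database connection before forking workers.
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  Signal.trap 'TERM' do
    puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
  end

  # Each worker opens its own database connection.
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end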
This default configuration assumes a standard Rails app with Active Record. See Heroku's Unicorn documentation for more information. You should also get acquainted with the different options in the official Unicorn documentation.

Now that we've got your app set up to use Unicorn, you’ll need to tell Heroku how to run it in production.

Unicorn in your Procfile

Change the web command in your Procfile to:

web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb

Now try running your server locally with $ foreman start. Once you're happy with your changes, commit to git, deploy to staging, and, when you're ready, deploy to production.

A World of Concurrency

With the recent release of the Rails 4 beta, which is threadsafe by default, it's becoming increasingly clear that Rubyists care about concurrency.

Unicorn gives us the ability to take multiple requests at a time, but it is by no means the only option when it comes to concurrent Rack servers. Another popular alternative is Puma which uses threads instead of forking processes. Puma does however require that your code is threadsafe.

If you've never run a concurrent server in production, we encourage you to spend some time exploring the ecosystem. After all, no one knows your app's requirements better than you.

Whatever you do, don't settle for one request at a time. Demand performance, demand concurrency, and try Unicorn today.

Concurrency is not Parallelism. Rob Pike at Waza 2012 [video]

In planning Waza 2013 we went back to reflect on last year’s speakers. We want to make the talks readily available to anybody who could not make it last year, or who wants a refresher. Check back soon for more talks from Waza 2012. And we hope to see you in person at Waza 2013 coming up FAST on Feb. 28 in San Francisco.

In a world of evolving languages, frameworks and development patterns, we developers must continually improve our craft. Innovative developers have already jumped on board many of these shifts. We’ve seen this with the adoption of more agile frameworks (such as Rails, Django, and Play). We’ve seen it too with a shift towards asynchronous programming patterns such as in Node.js and with evented programming in Rails.

One clear example of this evolution is the re-emergence of a focus on concurrency.

Rob Pike, with the help of a few gophers, gave this fantastic educational talk on concurrency at last year’s Heroku Waza conference. Rob covered big themes that are important to developers: speed, efficiency and productivity. He also covered parallelism and concurrency in programming processes, making it very clear that they are not the same thing. If you want to click through Rob’s slides while watching, they are hosted at GoogleCode.

Rob (@rob_pike) is a software pioneer. His influence is everywhere: Unix, Plan 9 OS, The Unix Programming Environment book, UTF-8, and most recently the Go programming language.

Waza is the Japanese word for art and technique and it's where we celebrate craft and the creative process of software development with technical sessions and interactive artistic happenings.

Better Queuing Metrics With Updated New Relic Add-On

Today our partner, New Relic, released an update to the Ruby New Relic agent that addresses issues brought up by our customers. The new version corrects how New Relic reports performance metrics for applications running on Heroku. Queueing time is now reported as the total time from when a request enters the router until the application starts processing it. Previous versions of New Relic only reported queueing time in the router. The new approach will result in more accurate queueing metrics that allow you to better understand and tune the performance of your application.

Update, Feb 22:

New Relic has released a similar update for Python. Python developers should update to this latest version to benefit from the improved metrics. JVM language developers do not need to take any action. The current New Relic Java agent already includes the improved queue time metrics.

Install or update the New Relic Ruby Add-on

If you are already using New Relic with your Ruby apps, then simply update your Gemfile to reference the new agent version:

gem "newrelic_rpm", "~>"

then run

$ bundle update newrelic_rpm
$ git add Gemfile Gemfile.lock
$ git commit -m 'update new relic agent'
$ git push heroku master

If you are not yet using New Relic, you can learn how to install and configure the add-on on Dev Center.

How It Works

The updated New Relic agent uses an improved strategy for reporting request queue times on Heroku. Prior to this update, New Relic reported request queue time using a value set by the Heroku routing mesh. This only reflected the time a request spent in the router queue and did not properly include time spent waiting in the dyno’s request queue.

Our routing performance update documents our finding that some applications have requests that may spend significant time queued on dynos. To help our customers understand precisely where their applications are being delayed, the updated New Relic agent includes dyno wait time in the total queue time metric. The new queue time is calculated as the difference between the time the Heroku router first gets a request and the time the request begins processing on the dyno. The result is a more accurate picture of how long requests wait in queues.
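The calculation described above can be written out as a small worked example. This is only a sketch of the arithmetic, not the New Relic agent's code; the method name `queue_time_ms` and the millisecond timestamps are hypothetical values chosen for illustration.

```ruby
# Queue time as described above: the time between the router first
# receiving the request and the dyno starting to process it.
# Both arguments are hypothetical Unix timestamps in milliseconds.
def queue_time_ms(router_received_ms, dyno_started_ms)
  dyno_started_ms - router_received_ms
end

router_received_ms = 1_361_000_000_000
router_wait_ms     = 20   # time spent in the router (what older agents reported)
dyno_wait_ms       = 130  # time spent waiting in the dyno's request queue

dyno_started_ms = router_received_ms + router_wait_ms + dyno_wait_ms

puts queue_time_ms(router_received_ms, dyno_started_ms) # prints 150
```

In this made-up case a router-only metric would show 20 ms, while the combined metric shows the full 150 ms the request waited before processing began.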

Clock Skew

The new version of New Relic calculates queue times using two different clocks — the dyno and router clocks. While Heroku servers regularly sync their clocks, it’s common for clocks to drift apart between syncs. This phenomenon is known as clock skew and it can affect the queue time metric collected by New Relic. In our experience, even though clock skew can cause small inaccuracies, the overall trend data displayed by New Relic will still accurately reflect your application’s queue times.
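A small sketch shows why skew shifts individual measurements but leaves the trend visible. The `measured_queue_ms` method, the skew value, and the clamping of negative results to zero are assumptions made for illustration; they are not taken from the New Relic agent.

```ruby
# Hypothetical illustration of clock skew.
# true_queue_ms: how long the request really waited.
# skew_ms: how far the dyno clock is ahead (+) or behind (-) the router clock.
def measured_queue_ms(true_queue_ms, skew_ms)
  raw = true_queue_ms + skew_ms
  [raw, 0].max # assumption: a negative wait is reported as zero
end

true_waits = [10, 50, 200] # made-up queue times under increasing load
skew_ms    = -15           # dyno clock runs 15 ms behind the router clock

p true_waits.map { |wait| measured_queue_ms(wait, skew_ms) } # => [0, 35, 185]
```

Every sample is off by the same small amount, but the rise from light to heavy queuing is still clear.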

How to Learn More

If you’d like more information on how to install and configure the New Relic add-on, please see the New Relic Dev Center article and the Unicorn specific instructions. For general suggestions on how to improve the performance of your app, check out our performance overview page.

What’s Happening at Waza

Waza (技) 2013 is only a week away and the schedule is packed with amazing speakers and hands-on craft experiences. We can’t wait to share this day with all of you. If you haven’t yet, register now before it’s too late!

This year, Waza will have three stages with a total of 20 talks. The rest of the venue is packed with lounges, co-working spaces, snack and beverage stations, and, thanks to our sponsors, all kinds of interactive, craft-based activities to fuel your creative mind.

## Hands-on Crafts

In addition to our great sponsored happenings, we have quilting, dye-making and printmaking artists on hand. Come experience their unique crafts, hands-on and up-close.

Quilting and Dye-making: Maura Grace Ambrose is bringing her Folk Fibers all the way from Austin, TX. Maura collects natural materials to dye fabrics then uses them to stitch together special quilts. Join Maura in the hands-on creation of a custom Waza quilt.

Printmaking: Marissa Marquez joins us at Waza for the second time. Marissa uses woodworking tools to hand-carve original designs into blocks and stamp them onto paper. She has created some beautiful prints for Waza which you can use to print your own postcard, or she can teach you how to make your own.

## All Things Delicious

Blue Bottle Coffee: We’re a bit obsessed with coffee at Heroku. And it’s an obsession we like to share. Doors open at 10am for badge pickup, so show up early and enjoy a cup of pour-over coffee while you get to know some Herokai. But don’t worry, this cup-at-a-time coffee service will be available all day.

Tea Lounge: We know some people prefer tea, including many of our own staff, so we’ve set aside space for the Waza tea lounge where you’ll find a variety of loose-leaf teas.

Food Trucks: We will have an assortment of local food trucks offering a selection of lunch specials. Use the ticket on your badge and select your favorite.

## Don’t Want to Deal with Parking?

Secure bike parking will be available thanks to the fine folks at the San Francisco Bicycle Coalition.

## Meet our Sponsors

#### Atlassian

Well-known for their collaboration tools that help teams build better products, Atlassian is providing Waza with co-working spaces to get the job done.

#### DODOcase

Creating artisan products for technology is the core of DODOcase’s business. Stop by and create your own Waza-branded, hand-bound notepad to take home.

#### Github

We all know that Github knows how to throw a great party, so we’re pumped that they are sponsoring the Waza afterparty. The fun starts at 9 p.m., and everyone with a Waza badge is invited.

#### MongoHQ

We are pleased to have an origami artist, Linda Mihara, as part of the Waza experience. And thanks to MongoHQ for adding rockets to her repertoire. Learn the ancient art of paper folding and make your own shiny silver rocket to take home.

#### Neo4j

Adding an art experience to one of the lounges, Neo4j is bringing an amazing Zen Table to Waza. They’ll also use their superstar graphing skills to monitor and display the event’s Twitter activity.

#### Neon Roots

A full service interactive agency specializing in custom web & mobile development that contributed to our web site.

#### New Relic

Cheers! Thanks to New Relic, we’ll be serving craft beers at Happy Hour. To kick things off at 5 p.m., an expert from 21st Amendment Brewery will share how those delicious flavors you’re enjoying came to be.

#### SendGrid

SendGrid is bringing us a Waza first: Arduino hacking in the Garden! If you are new to Arduino, SendGrid will be leading two intro talks to get you started. If you already know Arduino, just sit down and hack! You might even win an Arduino kit to take home.

#### Treasure Data

Saving us all from huddling in a corner near a power source, or missing anything to charge our laptops, Treasure Data is providing Waza attendees with two power valet stations. Drop off your electronics to be safely stored and charged while you enjoy the talks and activities.

## Register Now

February 28th is less than two short weeks away. If you haven’t registered yet, now is the time. Looking forward to seeing you all at the Concourse for a sure-to-be-epic Waza!

Routing Performance Update

Over the past couple of years Heroku customers have occasionally reported unexplained latency on Heroku. There are many causes of latency—some of them have nothing to do with Heroku—but until this week, we failed to see a common thread among these reports. We now know that our routing and load balancing mechanism on the Bamboo and Cedar stacks created latency issues for our Rails customers, which manifested themselves in several ways, including:

  • Unexplainable, high latencies for some requests
  • Mismatch between reported queuing and service time metrics and the observed reality
  • Discrepancies between documented and observed behaviors

For applications running on the Bamboo stack, the root cause of these issues is the nature of routing on the Bamboo stack coupled with gradual, horizontal expansion of the routing cluster. On the Cedar stack, the root cause is the fact that Cedar is optimized for concurrent request routing, while some frameworks, like Rails, are not concurrent in their default configurations.

We want Heroku to be the best place to build, deploy and scale web and mobile applications. In this case, we’ve fallen short of that promise. We failed to:

  • Properly document how routing works on the Bamboo stack
  • Understand the service degradation being experienced by our customers and take corrective action
  • Identify and correct confusing metrics reported from the routing layer and displayed by third party tools
  • Clearly communicate the product strategy for our routing service
  • Provide customers with an upgrade path from non-concurrent apps on Bamboo to concurrent Rails apps on Cedar
  • Deliver on the Heroku promise of letting you focus on developing apps while we worry about the infrastructure

We are immediately taking the following actions:

  • Improving our documentation so that it accurately reflects how our service works across both Bamboo and Cedar stacks
  • Removing incorrect and confusing metrics reported by Heroku or partner services like New Relic
  • Adding metrics that let customers determine queuing impact on application response times
  • Providing additional tools that developers can use to augment our latency and queuing metrics
  • Working to better support concurrent-request Rails apps on Cedar

The remainder of this blog post explains the technical details and history of our routing infrastructure, the intent behind the decisions we made along the way, the mistakes we made and what we think is the path forward.

How routing works on the Bamboo stack

In 2009, Heroku introduced the Bamboo stack. It supported only one language, one web framework and one embedded webserver. These were: Ruby (MRI 1.8), Rails (2.x) and Thin, respectively.

The Bamboo stack does not support concurrency. On Bamboo, a single process can serve only one request at a time. To support this architecture, Heroku’s HTTP router was designed to queue requests at the router level. This enabled it to efficiently distribute requests to all available dynos.

The Bamboo router never used a global per-application request queue. The router is a clustered service where each node in the cluster maintains its own per-application request queue. This is less efficient than routing with a global request queue, but it is a reasonable compromise as long as the cluster is small.

To see why, let’s look at a simplistic example. In the two diagrams below, requests are coming in through three router nodes and being passed to two dynos. The majority of requests take 50ms, while a rare slow request takes 5000ms. In the first diagram, you can see how a slow request, coming in to Router 1, is passed to Dyno 1. Until Dyno 1 is finished with that request, Router 1 will not send any more requests to that dyno. However, Routers 2 and 3 may still send requests to that dyno.

Meanwhile, as illustrated in the next diagram, because Routers 2 and 3 are not aware that Dyno 1 is busy, they may still queue up one request each for Dyno 1. These requests are delayed until Dyno 1 finishes processing the slow request.
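The scenario above can be worked through numerically. The following is a simplified model, not Heroku's router code: it assumes a dyno that serves one request at a time in arrival order, and the arrival times for the requests from Routers 2 and 3 are made up for illustration.

```ruby
# Each request is [arrival_ms, service_ms]. The dyno handles one request
# at a time, in arrival order. Returns the total latency of each request
# (time waiting plus time being served).
def dyno_latencies(requests)
  free_at = 0
  requests.sort_by(&:first).map do |arrival_ms, service_ms|
    start   = [arrival_ms, free_at].max
    free_at = start + service_ms
    free_at - arrival_ms
  end
end

dyno_1 = [
  [0,   5000], # slow request sent by Router 1
  [100, 50],   # normal request sent by Router 2 (hypothetical arrival time)
  [200, 50]    # normal request sent by Router 3 (hypothetical arrival time)
]

p dyno_latencies(dyno_1) # => [5000, 4950, 4900]
```

The two 50ms requests each take nearly five seconds, because the routers that sent them had no way of knowing Dyno 1 was busy.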

The inefficiency in request routing gets worse as the number of routers increases. This is essentially what’s been happening with Rails apps running on the Bamboo stack. Our routing cluster remained small for most of Bamboo’s history, which masked this inefficiency. However, as the platform grew, it was only a matter of time before we had to scale out and address the associated challenges.

Routing on Cedar

As part of the new Cedar stack, we chose to evolve our router design to achieve the following:

  • Support additional HTTP features like long polling and chunked responses
  • Support multi-threaded and multi-process runtimes like JVM, Node.js, Unicorn and Puma
  • Stateless architecture to optimize for reliability and scalability

Additionally, to meet the scalability requirements of Cedar we chose to remove the queuing logic and switch to random assignment. This new routing design was released exclusively on Cedar and was significantly different from the old design. What’s important to note is we intended customers to get the new routing behavior only when they deployed applications to Cedar.
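To get a feel for what random assignment means for single-request dynos, here is a toy simulation. It is a sketch under simplifying assumptions, not a model of the real router: each dyno serves one request at a time, and random assignment is compared against an idealized router that always knows which dyno frees up first. The `total_wait_ms` method, the traffic pattern, and the seed are all hypothetical.

```ruby
# requests: array of [arrival_ms, service_ms], sorted by arrival time.
# chooser:  lambda that receives each dyno's "free at" time and returns
#           the index of the dyno that should get the next request.
# Returns the total time requests spent waiting before being served.
def total_wait_ms(requests, dyno_count, chooser)
  free_at = Array.new(dyno_count, 0)
  requests.sum do |arrival_ms, service_ms|
    i          = chooser.call(free_at)
    start      = [arrival_ms, free_at[i]].max
    free_at[i] = start + service_ms
    start - arrival_ms
  end
end

# One slow request followed by ten fast ones, 100 ms apart (made-up traffic).
requests = [[0, 5000]] + (1..10).map { |n| [n * 100, 50] }

rng           = Random.new(42) # fixed seed so runs are repeatable
random_choice = ->(free_at) { rng.rand(free_at.length) }
earliest_free = ->(free_at) { free_at.index(free_at.min) }

puts "earliest-free dyno: #{total_wait_ms(requests, 2, earliest_free)} ms waited"
puts "random assignment:  #{total_wait_ms(requests, 2, random_choice)} ms waited"
```

With the idealized router no request waits at all, because every fast request goes to the idle dyno. With random assignment, any fast request that lands on the dyno serving the slow request waits behind it. This gap is what concurrent servers on each dyno are meant to close.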

Degradation of Bamboo routing

In theory, customers who had relied on the behavior of Bamboo routing could continue to use the Bamboo stack until they were ready to migrate to Cedar. Unfortunately that is not what happened. As traffic on Heroku grew, we added new nodes to the routing cluster rendering the per-node request queues less and less efficient, until Bamboo was effectively performing random load balancing.

We did not document this evolution for our customers nor update our reporting to match the changing behavior. As a result, customers were presented with confusing metrics. Specifically, our router logs captured the service time and the depth of the per-app request queue and presented that to customers, who in turn were relying on these metrics to determine scaling needs. However, as the cluster grew, the time-and-depth metric for an individual router was no longer a relevant way to determine latency in your app.

As a result, customers experienced what was effectively random load balancing applied to their Bamboo applications. This was not caused by an explicit change to the Bamboo routing code. Nor was it related to the new routing logic on Cedar. It was a pure side-effect of the expansion of the routing cluster.

No path for concurrent Rails apps on Cedar

We launched Cedar in beta in May 2011 with support for Node.js and Ruby on Rails. Our documentation recommends the use of Thin, which is a single-threaded, evented web server. In theory, an evented server like Thin can process multiple concurrent requests, but doing this successfully depends on the code you write and the libraries you use. Rails, in fact, does not yet reliably support concurrent request handling. This leaves Rails developers unable to leverage the additional concurrency capabilities offered by the Cedar stack, unless they move to a concurrent web server like Puma or Unicorn.

Rails apps deployed to Cedar with Thin can rather quickly end up with request queuing problems. Because the Cedar router no longer does any queuing on behalf of the app, requests queued at the dyno must wait until the single Rails process works its way through the queue. Many customers have run into this issue and we failed to take action and provide them with a better approach to deploying Rails apps on Cedar.
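A short calculation shows how quickly a dyno-level queue builds up behind a single process, and how additional worker processes (as with Unicorn) shorten it. This is an idealized sketch: it assumes all requests arrive at the same moment and take the same time to serve, and the `dyno_queue_waits` method is a hypothetical helper.

```ruby
# Wait time for each of request_count requests that arrive at a dyno at the
# same moment, when each takes service_ms and the dyno runs `workers`
# processes that each serve one request at a time.
def dyno_queue_waits(request_count, service_ms, workers)
  (0...request_count).map { |k| (k / workers) * service_ms }
end

p dyno_queue_waits(6, 100, 1) # => [0, 100, 200, 300, 400, 500]
p dyno_queue_waits(6, 100, 3) # => [0, 0, 0, 100, 100, 100]
```

With one process, the last request waits half a second before it is even started. With three worker processes, no request waits more than 100 ms.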

Next Steps

To reiterate, here is what we are doing now:

  • Improving our documentation so that it accurately reflects how our service works across both Bamboo and Cedar stacks
  • Removing incorrect and confusing metrics reported by Heroku or partner services like New Relic
  • Adding metrics that let customers determine queuing impact on application response times
  • Providing additional tools that developers can use to augment our latency and queuing metrics
  • Working to better support concurrent-request Rails apps on Cedar

If you have thoughts or questions, please comment below or reach out to me directly at
