Posted by Tom
Today, we're announcing the release of a key part of our authentication infrastructure - id.heroku.com - under the MIT license. This is the service that accepts passwords on login and manages all things OAuth for our API. The repo is now world-readable at https://github.com/heroku/identity . Pull requests welcome.
While OAuth was originally designed to allow service providers to delegate some access on behalf of a customer to a third party, and we do use it that way too, Heroku also uses OAuth for SSO. We'd like to take this opportunity to provide a technical overview.
A quick bit of terminology
We use the term "properties" to refer to public-facing sites owned and operated by Heroku. See our post on Fancy Pants SSL Certificates on how to identify a Heroku property. As a concrete example of a property in this document, I use addons.heroku.com.
When a server makes some form of API call to another server, and is doing so in response to a browser request, this server-to-server request is a "backpost". An example is popular cat-picture-injecting service Meowbify. Requests to http://cats.www.heroku.com.meowbify.com/ result in Meowbify making a backpost to www.heroku.com.
The colloquial term "OAuth Dance" refers to the sequence of browser redirects which communicate an OAuth authorization code from service provider to consumer. It should also be read to include the backpost from consumer to provider that exchanges the code for an access & refresh token.
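To make the dance concrete, here is a minimal sketch of its three steps from the consumer's point of view. The endpoint URL, client ID, and function names are invented for illustration and are not taken from identity's code; the parameter names follow the authorization code grant in RFC 6749.

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values, for illustration only.
AUTHORIZE_URL = "https://id.example.com/oauth/authorize"
CLIENT_ID = "addons-client-id"


def build_authorize_redirect(state: str) -> str:
    """Step 1: the consumer redirects the browser to the provider."""
    query = urlencode(
        {"response_type": "code", "client_id": CLIENT_ID, "state": state}
    )
    return f"{AUTHORIZE_URL}?{query}"


def parse_callback(callback_url: str, expected_state: str) -> str:
    """Step 2: the provider redirects back with ?code=...&state=..."""
    params = parse_qs(urlparse(callback_url).query)
    state = params.get("state", [""])[0]
    if not hmac.compare_digest(state.encode(), expected_state.encode()):
        raise ValueError("state mismatch")
    return params["code"][0]


def build_token_request(code: str, client_secret: str) -> dict:
    """Step 3: the backpost body that exchanges the code for tokens."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    }
```

The first two steps happen in the browser via redirects; only the third is a server-to-server backpost.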
Single Sign On
When you click "login" on addons.heroku.com, you've probably noticed you get redirected to id.heroku.com. This creates a longer-lived cookie on id.heroku.com, and kicks off an OAuth dance with addons.heroku.com. If you log in to dashboard.heroku.com first, you'll also notice you can go to addons.heroku.com, click "login", and it'll also trigger an OAuth dance with addons rather than prompt you again for your password.
Our internal use of OAuth for SSO is necessary in part due to legacy concerns. For most of our history, Heroku's Aspen and Bamboo stacks served customer applications out of the *.heroku.com domain, the same domain we use ourselves for all public-facing Heroku properties like www.heroku.com and dashboard.heroku.com. This significantly complicates our use of cookie-based authentication, as discussed in "Origin Cookies: Session Integrity for Web Applications" (listed at the end of this post). While we have retired Aspen and disabled the creation of new Bamboo applications, existing Bamboo apps continue to pose cookie-stuffing concerns and curtail our use of domain-wide cookies for any sensitive information.
This brings us to our first debatable deviation from the spec: pre-approval. A normal OAuth flow requires the customer be prompted to approve or deny the grant to an OAuth consumer. All Heroku properties are whitelisted to suppress this prompt, and implicitly grant. Additions to the list are tightly controlled, and strictly limited to our own properties. The rationale is that asking a customer to approve each individual property is a bit silly, since there is no third party.
When addons completes the OAuth dance, the refresh token is immediately discarded. The access token is then saved in the _heroku_addons_session cookie. The payload is encrypted with AES-256-CBC, random IV. The ciphertext is then HMAC'd with SHA256 for tamper protection. Secrets are held only by the addons app itself. Also present in the payload is a timestamp, telling addons to expire the cookie in six hours. All interactions between addons and our main API server are authenticated using that access token. For example, if you use addons.heroku.com to add Redis To Go or New Relic to your app, that results in addons.heroku.com decrypting your session cookie (after verification of the HMAC tag), extracting the OAuth access token, and making a backpost to api.heroku.com to execute the change. We use this same "save token in encrypted cookie" approach across all Heroku properties that require API access.
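The verify-then-decrypt order described above can be sketched as follows. This is an illustration, not the code addons runs: Python's standard library has no AES, so the AES-256-CBC step is left out and the ciphertext is treated as opaque bytes. Covering the IV with the HMAC, the base64 wire format, and every name below are assumptions.

```python
import base64
import hashlib
import hmac
from typing import Tuple

SIX_HOURS = 6 * 60 * 60                   # cookie lifetime, in seconds
IV_LEN = 16                               # AES block size
TAG_LEN = hashlib.sha256().digest_size    # 32 bytes


def seal(iv: bytes, ciphertext: bytes, mac_key: bytes) -> str:
    """Append an HMAC-SHA256 tag over IV + ciphertext, then base64-encode."""
    body = iv + ciphertext
    tag = hmac.new(mac_key, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(body + tag).decode("ascii")


def unseal(cookie: str, mac_key: bytes) -> Tuple[bytes, bytes]:
    """Check the tag before anything else; return (iv, ciphertext)."""
    raw = base64.urlsafe_b64decode(cookie.encode("ascii"))
    if len(raw) < IV_LEN + TAG_LEN:
        raise ValueError("cookie too short")
    body, tag = raw[:-TAG_LEN], raw[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad HMAC tag")
    return body[:IV_LEN], body[IV_LEN:]


def is_expired(issued_at: float, now: float) -> bool:
    """Apply the six-hour limit to the timestamp stored in the payload."""
    return now - issued_at > SIX_HOURS
```

Only after `unseal` succeeds would the ciphertext be handed to the AES decryption step, and only then would the timestamp inside the payload be checked with `is_expired`.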
Yo dawg, I heard you like oauth...
It's worth noting that identity itself also follows this access-token-in-session-cookie approach. Identity itself is "just" an OAuth consumer, albeit one that takes passwords. Due to the complexities of OAuth, we've split it off from the "real" service provider - api.heroku.com. Identity does not retain refresh or access tokens in a server-side database - those are held only by API. It does have two super-powers, though. Identity can manage OAuth grants and revocations on behalf of a user, but all such management requires the user's OAuth access token (again: that lives in the user's session cookie). It can also request and receive access tokens with an exceptionally long lifespan - up to 30 days. This extended lifespan allows the user to remain logged in for an extended period without the need to store or retain a never-expiring refresh token.
The other SSO
As you can imagine, all this introduces a single-sign-OUT problem. If I log in to dashboard, then visit addons, I now have two encrypted cookies with OAuth tokens in them, plus a third for Identity itself. If I click "logout" on dashboard, my expectation as a user is that this logs me out of all Heroku properties. For the most part, that's true. Regardless of what site you're logged in to, clicking logout results in the revocation of both your cookie-based session and its OAuth token on id.heroku.com.
This is our second spec deviation: on logout, Identity has API revoke the OAuth access tokens of all Heroku properties that were issued for that browser. This revocation happens server-side, which prevents any long redirect chains that the browser must follow (e.g., google.com's redirect to youtube.com on login/logout). The encrypted access tokens of course remain present in the session cookies of addons and other properties. The next time addons receives a request with this now-stale cookie that requires communicating with API, API will 403 the revoked token, which addons interprets as a logout. The customer is then thrown back into an OAuth dance with identity, and must retry the request. Currently, this "revoke the access tokens on logout" behavior is limited to Heroku properties, and is not publicly available.
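A rough sketch of how a property might treat that 403 as a logout signal is below. The login URL, the shape of the session dict, and the `call_api` callable are all hypothetical stand-ins rather than anything from the addons codebase.

```python
LOGIN_URL = "https://id.example.com/login"  # hypothetical


def backpost(session: dict, call_api) -> dict:
    """Call API with the session's token; treat a 403 as 'logged out'.

    `call_api` is assumed to take an access token and return a
    (status_code, body) pair.
    """
    status, body = call_api(session.get("access_token", ""))
    if status == 403:
        # The token was revoked server-side: drop the stale session and
        # send the browser back through the OAuth dance.
        session.clear()
        return {"action": "redirect", "location": LOGIN_URL}
    return {"action": "render", "body": body}
```

Note that the stale cookie is only detected when a request actually reaches `call_api`.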
However, if you replay an old or stolen cookie against an addons URL that does not perform a backpost to API, you still appear to be logged in. The most interesting case for addons is that this can be done to get a list of your applications. What's going on? For performance reasons, addons.heroku.com uses memcache to cache the list of applications owned by a user. The network round-trip to memcache is much faster than the round-trip to API, plus a call to API's database. However, that means changes on API aren't immediately reflected on addons. Because the goal of the cache is to avoid an API call, not every URL on addons makes a call to API, and addons doesn't realize the cookie has a dead access token until it actually makes an API call. The result is that it's possible to appear still logged in to addons, even after you log out.
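The gap is easy to see in a toy version of a cached lookup. Here a plain dict stands in for memcache, and `user_id`, `fetch_from_api`, and the session layout are made up for the example.

```python
def list_apps(session: dict, cache: dict, fetch_from_api) -> list:
    """Return the user's apps, preferring the cache over an API backpost.

    `fetch_from_api` is assumed to take an access token and to raise
    PermissionError if that token has been revoked.
    """
    user = session["user_id"]
    if user in cache:
        # Cache hit: no backpost happens, so a revoked token goes unnoticed.
        return cache[user]
    apps = fetch_from_api(session["access_token"])
    cache[user] = apps
    return apps
```

With a warm cache, this returns the application list even when the token in the session has already been revoked on API.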
Here's a hint
Having to retry a request is a horrible user experience. To minimize that, we give properties a "hint" as to whether a user has logged out. On login, identity issues a second cookie, scoped to *.heroku.com, called heroku_session_nonce. As the name indicates it contains a random nonce. The session nonce is reset on logout. All Heroku properties, when completing their OAuth dance, observe the current heroku_session_nonce value and save it in their private session cookie. On all subsequent authenticated requests, the private nonce is compared with the domain-global heroku_session_nonce cookie. A mismatch is treated as an authentication failure, and the browser is redirected to id.heroku.com to do a fresh oauth dance. The use of a domain-global cookie to indicate logout allows us to avoid any additional database roundtrips, and lets us avoid forcing a backpost to API on every request to every property.
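In code, the hint boils down to a string comparison per request. The sketch below assumes cookies and the private session are plain dicts; the function names are invented for illustration, and only the cookie name heroku_session_nonce comes from this post.

```python
import hmac
import secrets

NONCE_COOKIE = "heroku_session_nonce"


def new_nonce() -> str:
    """What identity might issue on login and reset on logout."""
    return secrets.token_urlsafe(32)


def on_oauth_complete(private_session: dict, request_cookies: dict) -> None:
    """A property copies the current global nonce into its own session."""
    private_session["nonce"] = request_cookies.get(NONCE_COOKIE, "")


def is_still_logged_in(private_session: dict, request_cookies: dict) -> bool:
    """Compare the saved nonce with the domain-global cookie."""
    saved = private_session.get("nonce", "")
    current = request_cookies.get(NONCE_COOKIE, "")
    # A missing or mismatched nonce counts as an authentication failure.
    return bool(saved) and hmac.compare_digest(saved.encode(), current.encode())
```

No database or API call is involved: both values arrive with the request itself.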
But wait, I said at the start we're not using a global cookie because of our legacy of Bamboo, and untrusted people having access to all such cookies. Doesn't heroku_session_nonce suffer from the same problem? Of course it does! Controlling a user's nonce cookie has two noteworthy security implications:
- Denial-of-Service: An attacker able to coerce a victim into visiting a site under their control can use a malicious Bamboo app to continuously delete or overwrite the nonce with gibberish. This can forcibly log out the victim, making it difficult to interact with Heroku. While this is undesirable, we've decided it's an acceptable risk given the complexity of any other solution.
- Session Fixation (kinda-sorta): An attacker who observes a victim's nonce can set a tracking cookie on the browser to uniquely identify the victim. When the nonce changes (i.e., on logout), the attacker can continuously re-set the nonce to the old value. This would result in addons and other properties believing that, until the property's session cookie expires, the victim is still logged in. Without the presence of an XSS or similar vuln, however, the attacker is unable to leverage this further. In the shared-browser threat model (e.g., internet cafes in developing regions), this becomes slightly more interesting. However, a plethora of more serious attacks come into play in that case, such as keystroke logging. Given that, and the lack of a useful attack, we are again OK with this risk.
We realize this is still sub-optimal, and certainly aesthetically displeasing to security folks. An elegant, performant, provably secure solution to handling distributed cache invalidation is a special case of one of the two hard problems in computer science. If you're able to solve it, you'll probably get a Turing Award, our field's closest thing to a Nobel Prize. Until this happens, we're stuck with a series of workarounds and complex interactions as I describe above.
Isn't this confidential??? What if there's a security vulnerability?
Heroku would not exist without open source. Other security-sensitive open source software we use includes "Rails" and "The Linux Kernel". While we use GitHub's issue tracker extensively, as always we ask that security researchers submit vulnerability reports to firstname.lastname@example.org. The Security Team's PGP key is available in our vulnerability reporting guidelines.
Vulns like this?
https://github.com/heroku/identity/pull/49 - we had a CSRF issue in the approve/deny page. This was originally reported to us by an independent security researcher. It's a good example of some ambiguity in the OAuth spec. From RFC 6749, sec 3.1:
The authorization server MUST support the use of the HTTP "GET"
method [RFC2616] for the authorization endpoint and MAY support the
use of the "POST" method as well.
There are two ways to read this:
- The OAuth consumer redirects the end user to identity to start the dance, and should be able to use an HTTP 302 temporary redirect to do so. Browsers do a GET for all 301s and 302s, so identity needs to accept a GET to display the approve/deny page.
- By "endpoint", they mean the state-changing URL that the browser posts to when the end user pushes the "Approve" button. Normally, since this is a state-changing action, you'd only use POST (or arguably, PUT). But the spec says supporting GET is an RFC MUST, so GET it is.
As you can imagine, an extra line in the spec to differentiate what "endpoint" it means would go a long way.
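Under the first reading, GET only displays the approve/deny page and the state change happens on a POST that carries a CSRF token. The sketch below shows that general pattern. It is one common mitigation and is not meant to describe what the pull request above actually changed; the session dict and function names are hypothetical.

```python
import hmac
import secrets


def render_approval_form(session: dict) -> str:
    """GET: display only. Issue a per-session token; change no state."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token  # would be embedded in a hidden form field


def handle_approval(method: str, session: dict, form: dict) -> bool:
    """Grant only for a POST that carries the session's CSRF token."""
    if method != "POST":
        return False
    expected = session.get("csrf_token", "")
    supplied = form.get("csrf_token", "")
    return bool(expected) and hmac.compare_digest(
        expected.encode(), supplied.encode()
    )
```

A forged cross-site request cannot read the token out of the victim's page, so it fails the comparison even if it manages to send a POST.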
Given our limited resources, the low impact of the information available to an attacker, the compensating measure of the six-hour cookie lifetime, and the high difficulty of being able to execute an attack like this without also being able to gain greater access to a victim's computer that would result in more rewarding attacks (e.g., a keystroke logger to capture the user's password), we are comfortable releasing identity. As we continue to build and improve Heroku as a platform, we are constantly looking for how to incorporate security needs into our underlying architecture.
If you have any feedback or suggestions on this, we would be delighted to hear them. We have considered the "obvious" solution of having API do a callback to all properties to notify them to do any logout-related cleanup (i.e., flush their caches). We haven't gone down that path yet because this would be a moderately large endeavor, and something that should really be handled in the OAuth protocol specification itself. Because it's not complicated enough, you see.
While our adoption of OAuth for SSO was a major team-spanning effort, identity was principally written by my colleague Brandur Leach. All thanks go to him, but any factual errors here are mine alone.
Heroku Security Hall-of-Famer Tejash Patel's report to us in July 2013 was the impetus behind this blog post and the open-sourcing of identity.
The key idea of decentralizing credentials out to the browser, and thus making id.heroku.com a less tempting target, originally came from Scott Renfro, my former coworker and mentor in paranoia.
- "Origin Cookies: Session Integrity for Web Applications", Web 2.0 Security and Privacy Workshop, 2011. A. Bortz, A. Barth, A. Czeskis. http://w2spconf.com/2011/papers/session-integrity.pdf
- "There are only two hard problems in computer science: cache invalidation, naming things, and off-by-one errors." -Paraphrase of saying generally attributed to the late Phil Karlton.
Heroku Security Team