OAuth: what you need to know

OAuth explained
Enable alien keys to control authorisation and authentication for your online life

OAuth is an authentication and authorisation protocol, originally developed for web applications, that grew out of work at Twitter in 2006.

It enables third-party software to do something on your behalf, for a limited time and without giving that software full, permanent access to reserved information. The most common analogy is valet keys.

Let's delve a little deeper and find out more about OAuth.

Q: So, valet keys. You mean those keys normally handed to parking attendants at hotels?

A: Yes. Those keys make it possible to open, start and drive your car, but only for a very short trip and without opening the trunk. OAuth works like a valet key for your data. It gives temporary and restricted access to something that's yours, without giving away full control.
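To make the analogy concrete, here is a minimal sketch of a key that is both restricted and temporary. The class name ValetKey and the action names are made up for illustration; they are not part of OAuth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValetKey:
    """A key that allows only some actions, and only for a while."""

    allowed: frozenset   # actions the holder may perform
    expires_at: float    # Unix time after which the key is useless

    def permits(self, action: str, now: float) -> bool:
        # Both conditions must hold: the key is still valid AND the
        # action is one of those it was cut for.
        return now < self.expires_at and action in self.allowed


key = ValetKey(allowed=frozenset({"start", "drive"}), expires_at=1000.0)
print(key.permits("drive", now=999.0))        # allowed action, key still valid
print(key.permits("open_trunk", now=999.0))   # never granted
print(key.permits("drive", now=1001.0))       # key has expired
```

An OAuth token plays the same role: the service checks what the token was issued for, and whether it is still valid, every time a client presents it.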

Q: Now, I understand what you mean but... is this a real world problem?

A: It became one when online services and social networks in all their forms, from Twitter and Flickr to remote banking, became not only ubiquitous, but inter-connected - they're much more useful when you can make them work together.

Q: You refer to cases such as publishing a Flickr gallery on Facebook.

A: Yes, exactly. Being able to do that without re-entering everything manually is great. However, doing it without something like OAuth may mean giving those sites full access to all of your stuff (such as files, contact lists or full access to services).

Q: So that's why you talked about both authentication and authorisation?

A: Correct. Authentication means having a way to prove that you are really you. Note that, in general, it makes no difference whether 'you' is a human being or a software program. Authorisation is a separate, equally necessary service: if a person or software program has already proved to Facebook who they are, this doesn't mean that they have permission to update your status as if they were you.

Q: Couldn't OpenID have been used for this?

A: OpenID only deals with authentication. OAuth, instead, helps whenever some software (the client, in OAuth terminology) wants to access data on behalf of whoever has the right to authorise such access (the resource owner), and that client is completely separate from, and unknown to, the software or service that actually stores those resources.

Q: Wait a second! Something like this was possible years before OAuth!

A: Yes, but in most cases it meant either using a single account across a network of already co-operating websites, or giving at least one of them your usernames and passwords for all the others. OAuth attempts to close this security hole.

Q: You mean authorising access to what's inside a web account without giving out my password and username?

A: Let's assume that you made a comment on some blog, and want that blog to post it on Twitter on your behalf, to save typing. When you tell the blog software to do this (for example by clicking a button), it will send Twitter a request that includes an identification key and the list of data or services it'd like to access on your behalf. Twitter (not the blog!) will present you with a custom authorisation web form hosted on its server. If you log in successfully on Twitter and answer "yes" to that request, you'll have authorised Twitter to satisfy the request of that blog, without ever handing your password and username to the blog.
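As a rough sketch of that first exchange, the snippet below builds the parameters an OAuth 1.0 client sends with its initial request, then the address it sends your browser to. The oauth_* parameter names come from the OAuth 1.0 specification; the service.example endpoint and the function names are assumptions made up for this example, and the signature that must accompany the request is left out.

```python
import secrets
import time
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; each service publishes its own.
AUTHORIZE_URL = "https://service.example/oauth/authorize"


def initial_request_params(consumer_key: str, callback_url: str) -> dict:
    """Parameters the client sends when it first asks the service for access.

    A real request must also carry an oauth_signature, omitted here.
    """
    return {
        "oauth_consumer_key": consumer_key,        # identifies the blog software
        "oauth_callback": callback_url,            # where the browser returns afterwards
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time())),
        "oauth_nonce": secrets.token_hex(16),      # random value, used only once
        "oauth_version": "1.0",
    }


def authorization_url(temporary_token: str) -> str:
    """Address on the service (not the blog) where the user says yes or no."""
    return AUTHORIZE_URL + "?" + urlencode({"oauth_token": temporary_token})


params = initial_request_params("blog-key", "https://blog.example/callback")
print(sorted(params))
print(authorization_url("tmp123"))
```

Note that nothing in these parameters is your password: the only place you type it is the form served by the service itself.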

Q: Cool! Then what?

A: Twitter will tell your browser to go back to the blog, but with a special URL that includes a short-lived authorisation key. At that point the blog software will present that key to Twitter, as proof that it is the one that just got your permission, and receive in exchange an 'access token' that lets it do something to, or with, your account.
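In OAuth 1.0a the URL that brings your browser back carries two query parameters, oauth_token and oauth_verifier. Here is a minimal sketch of how the blog software might pull them out. The blog.example address and the function name are hypothetical, and a real client would then send both values, signed, back to the service.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlparse


def parse_callback(url: str) -> Tuple[str, str]:
    """Extract the token and verifier from the URL the browser came back with."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list per parameter; each is expected exactly once.
    return query["oauth_token"][0], query["oauth_verifier"][0]


token, verifier = parse_callback(
    "https://blog.example/callback?oauth_token=tmp123&oauth_verifier=v456"
)
print(token, verifier)
```

If either parameter is missing, the lookup raises KeyError, which is a reasonable signal that the user refused or that the redirect was tampered with.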

Q: And this will work with every OAuth compatible website, not just Twitter?

A: That's correct. As long as those websites don't reject the initial request, of course. Besides convenience for the end user, another powerful driver for OAuth was the wish to make life harder for spambots and other malicious applications.

Q: How would OAuth do that?

A: Regardless of user authorisation, a software program can work as described only if it has permission to do so from the website it wants to access. OAuth accomplishes this by using several identification keys, or credentials, in parallel.

Q: What are these credentials and who issues them?

A: The ones we've already mentioned, which declare that access from some program is allowed without giving it your password, are called token credentials. Before getting to that point, however, the client must have sent the server its valid client credentials.

In general, client credentials are issued by the web server itself. When the developers of some software want to add OAuth capabilities to it, they register with the server to obtain such credentials, or keys. This makes it a bit easier to stop some malware, but it has also broken lots of existing programs.
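One way to see the two sets of credentials working in parallel is the HMAC-SHA1 signature that OAuth 1.0 defines: the signing key is built from both the client secret and the token secret. What follows is a simplified sketch, not a complete implementation. It assumes the URL is already in normalised form with no query string, and the function names and example values are made up.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value: str) -> str:
    """Percent-encode everything except letters, digits and - . _ ~"""
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    """Text that gets signed: the method, the URL and the sorted parameters."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign(method: str, url: str, params: dict,
         client_secret: str, token_secret: str = "") -> str:
    """HMAC-SHA1 signature keyed with BOTH secrets, base64-encoded."""
    key = f"{pct(client_secret)}&{pct(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("ascii"), base.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


params = {"status": "Hello world", "oauth_consumer_key": "blog-key"}
print(signature_base_string("post", "https://service.example/statuses", params))
print(sign("post", "https://service.example/statuses", params,
           client_secret="client-secret", token_secret="token-secret"))
```

Because the key mixes the two secrets, a request is accepted only if it comes from a registered client and carries a token that the user approved for that client.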

Q: You keep speaking of websites. Does this mean that OAuth is unusable by desktop software?

A: Now that's a trick question. Technically, there is nothing in OAuth that prevents clients from being traditional desktop applications running inside your computer. In practice, doing it (at least with OAuth 1.0) either makes life harder for good faith developers, or makes the whole client credentials concept almost useless, especially with open source software.

Q: Argh! Now that's bad, but why?

A: Because the scheme I described works perfectly when the client credentials are embedded in source code and/or compiled programs that only run inside some web server, where nobody can read said credentials in the source code or, using hex editors and similar tools, in executable files.
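To show how little effort reading those credentials takes, here is a sketch of what tools like the Unix strings command do: scan a binary file for runs of printable characters. The sample bytes and the secret inside them are invented for the example.

```python
import re


def printable_strings(blob: bytes, min_length: int = 8) -> list:
    """Return every run of at least min_length printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(blob)]


# Made-up stand-in for a compiled program with a secret embedded in it.
fake_executable = b"\x00\x01\x7fELF\x00CONSUMER_SECRET=s3cr3t-value\x00\x02ab\x00"
print(printable_strings(fake_executable))
```

Anything stored as plain text in a program that ships to users can be found this way. Obfuscation only raises the effort required.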

Q: Is this why the problem is even bigger with open source desktop software?

A: Precisely. If you put something that's supposed to stay private in some source code that everybody has the right to download and study... it's not private by definition, is it?

Q: Sure, but this only makes the scheme less useful. Why did you also say that OAuth breaks existing software?

A: Because before OAuth 1.0, anybody with a basic knowledge of shell scripting and cURL (including yours truly!) could, in just a few minutes, put together a script that would automatically sign in to Twitter, to read a timeline or post a tweet. OAuth made this impossible without valid, registered client credentials, even when getting those credentials takes much longer than writing the script in the first place!
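For comparison, this is roughly all the "authentication code" such a script needed, assuming it relied on HTTP Basic authentication: the username and password, joined by a colon and base64-encoded, go into one request header. The function name is made up, and the credentials are the well-known example from the HTTP Basic specification.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(basic_auth_header("Aladdin", "open sesame"))
# -> Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

It is simple precisely because the script holds your real password, and base64 is an encoding, not encryption. That is the hole OAuth was designed to close.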

Q: Isn't there any way to patch those scripts?

A: Of course there is: just use one of the many software libraries that have already been registered. However, this still makes those scripts much more complicated to write and maintain than they used to be. Until OAuth 2.0 is released, at least.

Q: You mean there's a version 2.0 coming? When?

A: The forecast, at the time of writing, is that OAuth 2.0 should be completed by the end of 2011.

Q: What's new in OAuth 2.0? Will it solve these problems?

A: It could. One of the most important changes is the addition or redefinition of several so-called 'flows' to obtain credentials in the most straightforward way, even in scenarios where clients are not web servers but, for example, software running on mobile devices. There's also a cookie-based flow that should make it possible to resurrect the old cURL-based web automation scripts. There should also be several performance optimisations, because OAuth 1.0 doesn't scale very well.

Q: Where can I find out more?

A: The official OAuth Introduction.