This is my rundown of the basics of OAuth2. I have found that guidance on implementing login security tends to be vague when it comes to token-based API authentication, and many people jump to pre-made solutions instead of learning the flow and building their own implementations. Although it is hard to argue with “Don’t roll your own crypto”, I think it is valuable to understand the flow and risks associated with the technology you are using.
As with anything, this only applies at the time of writing. Always read up on the security techniques you are using to ensure that they are up to date, as I cannot guarantee that this information will apply 5, 10, or 20+ years from now. Before implementing or delivering anything for production you should read the OAuth2 RFC (RFC 6749) and have a firm understanding of the security risks (or hire a professional to check what you're doing).
What is OAuth2?
OAuth2 is the second version of the OAuth standard, which was created to allow third-party applications to gain access to a service on behalf of a resource owner. When we create an API (Application Programming Interface) we often want to allow for scaling beyond our own usage. That can mean letting users write add-ons that access the data on a user’s account without having to worry about the security of the user’s password (from now on referred to as the secret), or allowing simple account creation across platforms or services (this is more along the lines of OpenID).
This is also useful because the usage of OAuth tends to force developers to use better practices when dealing with user security, removing the likelihood that a secret will be shared with the wrong person or stored in an un-hashed or insecure format. This does not mean that OAuth solves all security problems when dealing with logins, but it does force good habits.
OAuth 2 vs 1
You may already be saying: “Nathan, you’re talking about OAuth2 and we haven’t even heard about OAuth1. What is the difference?”. OAuth Core was the original proposal for the OAuth standard created in 2007. OAuth2 was a proposed standard to replace the original, and is not backwards compatible. It was an attempt to simplify and clarify many of the aspects of OAuth1.
Why not just use passwords?
Good question. Think about the way that you authenticate for your API. Let’s say we have an add-on for Facebook called “FavBook” which allows users to tell other users about their favourite books automatically (Great idea right? Why am I not already a millionaire?). This service needs access to your Facebook account, so how can this be done?
A really simple (but very bad) way would be to have the add-on accept the user’s password as input and send it to Facebook whenever it needs access to the user’s account.
Now do you see the red flags? How about when I ask these questions:
- Who gets to see the credentials?
- Where are the credentials being stored?
- How are the credentials being stored?
If you didn’t just cringe a bit let me remind you that Troy Hunt has a directory of large tech companies that have been breached, many of which have stored passwords in plain text. And even if they do hash the passwords, what is to say they didn’t accidentally log the password before hashing them? Or what if they get breached and someone steals the passwords? And so on and so on.
Before we get into the specifics of OAuth2 I will outline how OAuth2 works. In our model we have four participants, as defined in the RFC:
- The Resource Owner: The person who is logging in.
- The Client: The service requesting permission to access the Resource Owner’s data.
- The Authorization Server: The server that is handling the authentication requests.
- The Resource Server: The server that is running the main API we want to access.
The following are the most generic steps involved with gaining access using OAuth2:
- The Client makes a request to the Resource Owner to gain access to the data.
- The Resource Owner grants access to the Client, either through the Client or directly to the Authorization Server.
- The Authorization Server then responds with an access token for the API.
- The Client then stores and uses that access token with the Resource Server to access the Resource Owner’s data.
This model can change drastically based on the authorization grant you decide to use.
An authorization grant is a credential that represents the Resource Owner’s authorization to access their resources. There are 4 core authorization grants in OAuth2: Resource Owner Password Credentials, Client Credentials, Implicit and Authorization Code. I cover them below, starting from the simplest but least secure and moving to the more complicated but more secure.
Resource Owner Password Credentials
I should start this off by saying I do not suggest this flow. The Resource Owner passes their Password Credentials directly to the Client, which then exchanges them for an Access Token. The Client should then delete the Credentials and only use the Access Token. This seems very simple but the risk is quite large: you must have a large amount of trust in the Client, as it could intentionally or unintentionally store your Password Credentials, which defeats the whole purpose of OAuth.
If we are only working with data that is owned by the Client and we do not need Resource Owner material we could use Client Credentials. Again this limits your use case and can complicate things later on.
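As a rough illustration of the Client Credentials grant, here is a minimal Python sketch that only builds the headers and body of the token request. It assumes the Client is a Confidential one that authenticates with HTTP Basic, and that the ID and secret are plain ASCII; the function name is my own and nothing is actually sent over the network.

```python
import base64
from urllib.parse import urlencode


def build_client_credentials_request(client_id: str, client_secret: str):
    """Return (headers, body) for a POST to the token endpoint."""
    # HTTP Basic authentication: base64("client_id:client_secret").
    # Assumption: both values are plain ASCII with no characters that need escaping.
    pair = f"{client_id}:{client_secret}".encode("ascii")
    basic = base64.b64encode(pair).decode("ascii")
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # No Resource Owner is involved, so the grant type is the only field.
    body = urlencode({"grant_type": "client_credentials"})
    return headers, body
```

Because the request carries the client secret, this only makes sense for a service that can keep that secret on its own server.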
Next are Implicit grants, which are the simplest redirect-based flow for OAuth. I would not suggest this technique, as there are some security implications that are better handled in the Authorization Code flow. The Implicit grant is a simplified version of the Authorization Code grant, where the Resource Owner is redirected to the Authorization Server with a set of parameters that identify the Client. The Resource Owner then logs in, and the created Access Token is sent back to the Client. As the token is returned in the redirect URL, it can end up stored in the history of the browser, which is a risk. It is common for developers to use this technique to evade CORS issues when using a pre-made authorization server. This is because the Authorization Code flow requires a POST request to exchange the code, which a browser-only Client can only make if the server allows it. If you are writing your own Authorization Server you have the flexibility to set the CORS headers appropriately.
An Authorization Code is obtained by redirecting the Resource Owner to the Authorization Server to log in, and then redirecting back to the Client with the Authorization Code. This allows the user to share their credentials only with the Authorization Server while still allowing the Client to gain access. The Authorization Code is sent back in a GET request, so it should have a short lifespan to reduce the damage if it ends up being stored somewhere (I won’t advise on how long this should last; some say less than 10 minutes, but it can be as short as 30 seconds). The Client then exchanges the code for an Access Token using a POST request.
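To make the three steps of the Authorization Code flow a bit more concrete, here is a minimal Python sketch that builds the redirect URL, reads the code from the callback, and builds the POST body for the exchange. The endpoint, client ID and redirect URI are made-up placeholders, and the `state` parameter is something I have added from the RFC (it is not discussed above) to tie the callback to the original request.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values, for illustration only.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "favbook-client-id"
REDIRECT_URI = "https://favbook.example.com/callback"


def build_authorization_url(state: str) -> str:
    """URL the Resource Owner's browser is sent to in order to log in."""
    params = {
        "response_type": "code",  # ask for an Authorization Code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": state,  # opaque value echoed back by the server
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Pull the Authorization Code out of the redirect back to the Client."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch, ignoring this callback")
    return query["code"][0]


def build_token_request(code: str) -> dict:
    """Form fields the Client POSTs to exchange the code for an Access Token."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }
```

Note that the user's password never appears anywhere in the Client's code; only the short-lived code does.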
PKCE Authorization Code
PKCE, or Proof Key for Code Exchange, is an attempt to mitigate the risks associated with transmitting Authorization Codes using a GET request for native applications. As mentioned in both Authorization Code and Implicit, there is a risk associated with sending any token over a GET request, as it could be intercepted and used to obtain a legitimate long-term Access Token. There is also a risk specific to native applications, as two applications could register the same URI scheme, resulting in the code being sent to both applications (oh no, I’m in danger!).
To mitigate this, the Client creates a secret (the code_verifier) that is at least 43 characters and up to 128 characters long, using any combination of upper case (A-Z), lower case (a-z), numbers (0-9), and hyphens, periods, underscores, and tildes (- . _ ~). This is created by the Client before the transaction has started.
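Generating that secret (the code_verifier, as it is called in the PKCE RFC) could look something like this minimal Python sketch. The function name and the default length of 64 are my own choices; only the 43 to 128 character range and the allowed characters come from the description above.

```python
import secrets
import string

# Characters allowed in the secret: A-Z, a-z, 0-9, "-", ".", "_", "~".
ALLOWED = string.ascii_letters + string.digits + "-._~"


def make_code_verifier(length: int = 64) -> str:
    """Create a random secret using a cryptographically secure generator."""
    if not 43 <= length <= 128:
        raise ValueError("length must be between 43 and 128 characters")
    return "".join(secrets.choice(ALLOWED) for _ in range(length))
```

The important part is using `secrets` rather than `random`, since the whole point is that nobody else can guess the value.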
We then hash this value using SHA-256 and Base64 URL encode the result to create a code_challenge, which is sent to the Authorization Server with the authorization request. The Authorization Server passes this value to the Token Endpoint Server and sends the Authorization Code back to the Client. The Client can then send the code_verifier (the secret) along with the Authorization Code to the Token Endpoint Server, allowing the server to ensure that the Client is who they say they are before creating and sending an Access Token.
The server can verify that the Client is who they say they are by simply taking the code_verifier it was sent, hashing and encoding it in the same way, and comparing the result to the code_challenge it received earlier.
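Both halves of that check fit in a few lines. Below is a minimal Python sketch of how the Client could derive the code_challenge and how the server could compare it later; the function names are mine, and I strip the Base64 padding, which is how the PKCE RFC defines the encoding.

```python
import base64
import hashlib
import hmac


def make_code_challenge(code_verifier: str) -> str:
    """Client side: SHA-256 the verifier, then Base64 URL encode the digest."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    # The trailing "=" padding is removed from the encoded value.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_code_verifier(code_verifier: str, stored_challenge: str) -> bool:
    """Server side: recompute the challenge and compare it to the stored one."""
    recomputed = make_code_challenge(code_verifier)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(recomputed, stored_challenge)
```

An attacker who intercepts only the Authorization Code and the code_challenge still cannot produce a code_verifier that hashes to the right value.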
Since we do not want Access Tokens to last forever, but we also don’t want the user to have to log in again at random times, we need a way of getting a fresh token from the Authorization Server. We can accomplish this using a Refresh Token.
The Refresh Token is issued alongside the original Access Token and is only ever sent to the Authorization Server, which makes it easier to ensure that it is always transmitted securely over HTTPS. This allows us to reduce the lifespan of the Access Token, which is higher risk as it has a larger use area (which could result in it being sent over an insecure channel, e.g. HTTP). Refresh Tokens can also be used to create additional Access Tokens with the same or more restricted access.
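On the Client side, using a Refresh Token mostly comes down to noticing that the Access Token is about to expire and building a new token request. Here is a minimal Python sketch of that idea; the `TokenSet` class, the 30 second leeway and the function names are all my own assumptions, and the request is only built, not sent.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenSet:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp at which the Access Token expires


def needs_refresh(tokens: TokenSet, now: Optional[float] = None,
                  leeway: float = 30.0) -> bool:
    """True if the Access Token has expired or is about to."""
    current = time.time() if now is None else now
    return current >= tokens.expires_at - leeway


def build_refresh_request(tokens: TokenSet, client_id: str,
                          scope: Optional[str] = None) -> dict:
    """Form fields POSTed to the Authorization Server, and only to it."""
    body = {
        "grant_type": "refresh_token",
        "refresh_token": tokens.refresh_token,
        "client_id": client_id,
    }
    if scope is not None:
        # Optionally ask for the same or more restricted access.
        body["scope"] = scope
    return body
```

The Resource Server never sees the Refresh Token in this setup; it only ever receives the short-lived Access Token.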
Again this only adds security if you have created a secure environment to store and transmit the token.
Another important distinction is the different client types: Confidential and Public. The RFC defines Confidential as:
“Clients capable of maintaining the confidentiality of their credentials (e.g., client implemented on a secure server with restricted access to the client credentials), or capable of secure client authentication using other means.”
And public as:
“Clients incapable of maintaining the confidentiality of their credentials (e.g., clients executing on the device used by the resource owner, such as an installed native application or a web browser-based application), and incapable of secure client authentication via any other means.”
Now to break this down, the requirements for a confidential vs public client are:
- Confidential Client: Can secure their credentials.
- Public Client: Cannot secure their credentials.
So, if we are working with a Client whose credentials are stored on someone else’s machine (e.g. Native or Web Based Applications that do not use a server), it is public. But if the Client can secure the credentials (e.g. a service that connects to a server owned by the Client, which makes the request for them using credentials stored only on the Client’s server), it can be classified as a confidential client. For most of this article I am talking about Public Clients, unless explicitly stated.
Why do we care? Well, if we have a public client we should not assign a client secret. Why? Because if a User can see the client secret they could steal it, which would be horrible, as they could use that secret in a different application and pretend to be you.
Depending on whether you are creating a Confidential or Public application, you may need to create a client_secret. To ensure that Clients do not accidentally leak information, you should design the client registration to not give out a client_secret if the application is Public.
You initially need to create a client ID which should be random and unique. Try to make this long and random enough to not be guessable.
If you decide that you do need a client secret, use a cryptographically secure random number generator with a minimum length of 42 bytes. Do not store the secret in a raw format. It must be treated the same as a password and therefore should be stored as a hash.
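Here is a minimal Python sketch of what issuing client credentials could look like on the Authorization Server. The function names, the record layout, the 24-byte client ID, and the choice of salted PBKDF2 with 200,000 iterations are all my own assumptions for illustration; only the random ID, the 42-byte secret, the "no secret for Public clients" rule and the "store a hash" rule come from the text above.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 200_000  # illustrative value, tune for your own system


def hash_secret(secret: str, salt: bytes) -> bytes:
    """Hash the secret the same way you would hash a password."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt,
                               PBKDF2_ITERATIONS)


def register_client(confidential: bool) -> Tuple[dict, Optional[str]]:
    """Return (record to store, secret to show to the developer once)."""
    record = {
        "client_id": secrets.token_urlsafe(24),  # random and unique
        "salt": None,
        "secret_hash": None,
    }
    if not confidential:
        return record, None  # Public clients never get a client_secret
    client_secret = secrets.token_urlsafe(42)  # 42 random bytes
    record["salt"] = secrets.token_bytes(16)
    record["secret_hash"] = hash_secret(client_secret, record["salt"])
    return record, client_secret


def check_secret(record: dict, presented: str) -> bool:
    """Compare a presented secret against the stored hash."""
    if record["secret_hash"] is None:
        return False
    candidate = hash_secret(presented, record["salt"])
    return hmac.compare_digest(candidate, record["secret_hash"])
```

The raw secret is returned exactly once so it can be shown to the developer; after that only the salt and hash exist on the server.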
When registering an application you will need to get enough information from the Client to ensure that the application has a unique endpoint. The suggested information can be:
- Application Name
- URL for the application
- Description of the application
- Redirection URLs
The most important are likely the Application Name and the Redirection URLs. The redirection URLs are where the user will be redirected after logging into the Authorization Server. Registering them keeps someone from naming your application as the requester while supplying an alternative redirection URL to get the Authorization Code or Access Token sent to them instead.
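The redirection check itself can be very small. Here is a minimal Python sketch, assuming the Authorization Server keeps registrations in a dictionary keyed by client ID and only accepts exact matches; the registry contents and names are hypothetical.

```python
# Hypothetical registry of applications, keyed by client ID.
REGISTERED_CLIENTS = {
    "favbook-client-id": {
        "name": "FavBook",
        "redirect_uris": {"https://favbook.example.com/callback"},
    },
}


def is_allowed_redirect(client_id: str, redirect_uri: str) -> bool:
    """Only redirect to a URL that was registered for this exact client."""
    client = REGISTERED_CLIENTS.get(client_id)
    if client is None:
        return False
    # Exact string match: no prefix or wildcard matching in this sketch.
    return redirect_uri in client["redirect_uris"]
```

If this returns False, the server should show an error to the user rather than redirecting anywhere.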