The Old Way
This is the typical way to handle tokens for an API, and if you’ve worked with any kind of API you’re probably familiar with the design:
- Generate random session or token key
- Store payload data in a datastore
- Use session or token key to lookup the payload
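As a rough illustration of those three steps, here is a minimal sketch. The in-memory dictionary stands in for a real datastore, and the names `SESSION_STORE`, `create_session`, and `load_session` are made up for this example.

```python
import secrets
from typing import Dict, Optional

# Hypothetical stand-in for a real datastore (SQL table, Redis, ...).
SESSION_STORE: Dict[str, dict] = {}


def create_session(payload: dict) -> str:
    """Generate a random key and store the payload under it."""
    key = secrets.token_urlsafe(32)
    SESSION_STORE[key] = payload
    return key


def load_session(key: str) -> Optional[dict]:
    """Look the payload up by key; every request pays for this lookup."""
    return SESSION_STORE.get(key)


key = create_session({"user_id": 42})
print(load_session(key))
```

The key itself carries no information, so every request that needs the payload has to go back to the datastore.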
With this in mind we can look at how token design and cryptography can lead us to a better architecture.
The Better Way (essentially the JSON Web Token)
We can use cryptography to design a token that avoids database lookups entirely. Let’s start with the complete process and then review it in detail:
- Minimize payload data. The payload should only contain minimal information, like user id, timestamps, and possibly other identification codes.
- Serialize the payload.
- Sign the payload. There are different ways to go about this, but it’s typically done with an HMAC (think JSON Web Signature).
- Encrypt the payload + signature combination (think JSON Web Encryption).
- Base64 encode the encryption result.
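The serialize, sign, and encode steps can be sketched with nothing but the Python standard library. This is only an illustration under a few assumptions: the key `MAC_KEY` is a placeholder, the function names are chosen for this example, and the `expires` check assumes the payload carries a Unix timestamp under that name. The encryption step is deliberately left out because the standard library ships no authenticated cipher, so a token produced here is signed but still readable by anyone who decodes it.

```python
import base64
import hashlib
import hmac
import json
import time

# Placeholder key for illustration; a real service would load it from secret storage.
MAC_KEY = b"demo-mac-key-change-me"
TAG_LEN = hashlib.sha256().digest_size  # length of an HMAC-SHA256 tag in bytes


def generate_token(payload: dict) -> str:
    # Serialize the (already minimal) payload compactly.
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")
    # Sign the serialized payload with HMAC-SHA256.
    tag = hmac.new(MAC_KEY, body, hashlib.sha256).digest()
    # The encryption step would wrap body + tag here, using a second key (omitted).
    blob = body + tag
    # Base64 encode the result so it is safe to put in a header or URL.
    return base64.urlsafe_b64encode(blob).decode("ascii")


def parse_token(token: str) -> dict:
    # binascii.Error and UnicodeEncodeError are both ValueError subclasses.
    blob = base64.urlsafe_b64decode(token.encode("ascii"))
    # The decryption step would unwrap blob here (omitted).
    if len(blob) <= TAG_LEN:
        raise ValueError("malformed token")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(MAC_KEY, body, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad signature")
    payload = json.loads(body)
    # Assumed convention: "expires" is a Unix timestamp; missing means expired.
    if payload.get("expires", 0) < time.time():
        raise ValueError("token expired")
    return payload


token = generate_token({"user_id": 42, "expires": int(time.time()) + 3600})
print(parse_token(token)["user_id"])
```

Note that `parse_token` never touches a datastore: everything it needs is either in the token or in the shared key.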
For all this we basically just need a library with two methods: one to generate a token and one to parse it. This design has three obvious benefits:
- No database lookups
- No storage requirements for most tokens
- Constant-time token parsing
And there are only two requirements for parsing the tokens correctly:
- Shared encryption and MAC keys between internal services
- Properly implemented crypto libraries
Revocation & Logout
One small issue is that revocation and logout become problematic with this approach. Since we haven’t stored the token anywhere, how do we revoke it?
To answer that, let’s look at the data in a typical payload:
- User identification: ID, name, avatarURL
- Token metadata: issuedAt, expires, sessionID
A token like this has roughly the following characteristics:
- Around 220 bytes in size
- The payload data is encrypted
- Two secret keys are required to parse or generate the tokens. This protects against both spoofing and data leakage.
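To get a feel for the size, here is a small sketch that builds a payload with the fields listed above and measures its serialized length. The `TokenPayload` class and the sample values are invented for illustration, and the number it prints covers only the raw JSON, before the signature, encryption overhead, and Base64 expansion that make up the final token.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass


@dataclass
class TokenPayload:
    # User identification
    ID: int
    name: str
    avatarURL: str
    # Token metadata
    issuedAt: int
    expires: int
    sessionID: str


now = int(time.time())
payload = TokenPayload(
    ID=42,
    name="Ada Lovelace",                             # sample value
    avatarURL="https://example.com/avatars/42.png",  # sample value
    issuedAt=now,
    expires=now + 3600,                              # assumed one-hour lifetime
    sessionID=uuid.uuid4().hex,
)

serialized = json.dumps(asdict(payload), separators=(",", ":")).encode("utf-8")
print(len(serialized), "bytes before signing, encryption, and Base64")
```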
Now back to logging out. For most APIs, revocation is more of an edge case than a common task, so it won’t matter much that revocation is a bit slower than usual with this design. In brief, we use a back-channel (e.g. a message queue) to inform all our services that a specific token has been revoked. Each service then records the revocation in memory (or in a fast cache such as Memcached or Redis). Storing only the revoked tokens is much easier and faster than storing every token: instead of checking whether a token is valid, we just check whether its payload has been de-authorized. Revoking a token therefore takes two steps:
- Propagate the de-authorization to nodes through a back-channel
- Wait for propagation to finish before responding
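A minimal sketch of that flow might look like the following. It makes several assumptions that are not part of the design above: revocations are keyed by the payload's `sessionID`, each entry is dropped once the token would have expired anyway, and a plain loop over in-process objects stands in for the real back-channel. `RevocationCache` and `propagate_revocation` are hypothetical names.

```python
import time
from typing import Dict, List


class RevocationCache:
    """Per-service, in-memory record of revoked session IDs."""

    def __init__(self) -> None:
        # Maps sessionID -> the token's own expiry time (Unix timestamp).
        self._revoked: Dict[str, float] = {}

    def revoke(self, session_id: str, expires: float) -> None:
        self._revoked[session_id] = expires

    def is_revoked(self, session_id: str) -> bool:
        self._purge()
        return session_id in self._revoked

    def _purge(self) -> None:
        # Assumption: an expired token is rejected by parsing anyway,
        # so its revocation entry no longer needs to be kept.
        now = time.time()
        self._revoked = {s: e for s, e in self._revoked.items() if e > now}


def propagate_revocation(nodes: List[RevocationCache], session_id: str, expires: float) -> None:
    """Stand-in for the back-channel: returns only after every node is updated."""
    for node in nodes:
        node.revoke(session_id, expires)


nodes = [RevocationCache(), RevocationCache()]
propagate_revocation(nodes, "abc123", time.time() + 3600)
print(all(node.is_revoked("abc123") for node in nodes))
```

Each service would call `is_revoked` with the `sessionID` from a successfully parsed token, so the check stays a local in-memory lookup rather than a trip to a shared database.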