A walk-through of a JWT verification

A JWT token is a Base64 encoded, digitally signed JSON structure. As it turns out, these tokens are pretty easy to make sense of once you peel apart the different pieces. I'll do that in this post, starting with the JWT token shown in example 1 and ending with a complete decode and verification.

eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODU2MzA2MDAsInN1YiI6InVzZXIxIiwiaXNzIjoiamRhdm
llcyIsImF1ZCI6Imdsb2JhbGNvIiwibmJmIjoxNDg1NjI5NzAwLCJpYXQiOjE0ODU2MzAwMDB9.JXcg
9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U

Example 1: A sample JWT token

Base 64 decode

A JWT token looks indistinguishable from random gibberish (or a cat walking across a keyboard), but if you have some familiarity with web standards, you'll probably recognize the alphanumeric structure as Base64. If you look closer, though, you'll see that it isn't quite standard Base64; the period character, which appears twice, is not part of the Base64 alphabet. In fact, a JWT token is a period-delimited collection of three separate Base64 encoded structures. And, as you can see from the output of listing 1 in example 2, below, the contents of the first two substructures are just standard uncompressed JSON.

// The token from example 1; the literal is split only to keep the lines short.
var s = 'eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODU2MzA2MDAsInN1YiI6InVzZXIxIiwiaXNzI' +
  'joiamRhdmllcyIsImF1ZCI6Imdsb2JhbGNvIiwibmJmIjoxNDg1NjI5NzAwLCJpYXQiOjE0ODU2MzAw' +
  'MDB9.JXcg9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U';
var parts = s.split('.');
// Buffer.from replaces the deprecated new Buffer(...) constructor.
console.log(Buffer.from(parts[0], 'base64').toString());
console.log(Buffer.from(parts[1], 'base64').toString());

Listing 1: JWT decoder

$ node jwt.js 
{"alg":"HS256"}
{"exp":1485630600,"sub":"user1","iss":"jdavies","aud":"globalco","nbf":1485629700,"iat":1485630000}

Example 2: JWT contents
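
If you want the decoded segments as objects rather than printed strings, a small helper along these lines should do. This is only a sketch: decodeSegment is a name I made up for illustration, and it assumes each segment holds UTF-8 encoded JSON.

```javascript
// Hypothetical helper: decode one Base64-encoded JWT segment into an object.
function decodeSegment(segment) {
  // Map the URL-safe characters back to the standard alphabet before decoding.
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
}

// The token from example 1.
const token = 'eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODU2MzA2MDAsInN1YiI6InVzZXIxIiwiaXNzI' +
  'joiamRhdmllcyIsImF1ZCI6Imdsb2JhbGNvIiwibmJmIjoxNDg1NjI5NzAwLCJpYXQiOjE0ODU2MzAw' +
  'MDB9.JXcg9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U';

const [headerPart, bodyPart] = token.split('.');
const header = decodeSegment(headerPart);
const claims = decodeSegment(bodyPart);

console.log(header.alg);
console.log(claims.sub, claims.iss, claims.aud);
```

Keep in mind that nothing has been verified at this point; the decoded values are just what the token says about itself.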

So what do all of these cryptic three-letter codes mean? They're short so that JWT tokens take up as little space as possible, but they're standardized in RFC 7519. The first component is the header, which describes how to process the remainder of the token. Here there's only one element, the alg (algorithm), which describes what to do with the third substructure; I'll describe it below. Some JWT examples include a redundant second element called typ, which is always set to JWT (all uppercase).
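
As a rough illustration of how a receiver might act on the header, the sketch below maps the alg value to the hash name that Node's crypto module expects. The HS256, HS384 and HS512 names are the standard HMAC identifiers; the table and the hashNameFor function are hypothetical names of my own, and this post only exercises HS256.

```javascript
// Assumed lookup table: JWT "alg" values for the HMAC family -> Node hash names.
const HMAC_ALGORITHMS = { HS256: 'sha256', HS384: 'sha384', HS512: 'sha512' };

// Hypothetical helper: pick the hash for a decoded header, or refuse the token.
function hashNameFor(header) {
  const known = Object.prototype.hasOwnProperty.call(HMAC_ALGORITHMS, header.alg);
  if (!known) {
    throw new Error('Unsupported alg: ' + header.alg);
  }
  return HMAC_ALGORITHMS[header.alg];
}

console.log(hashNameFor({ alg: 'HS256' })); // sha256
```

Refusing anything outside the table means a token can't talk the receiver into an algorithm it never intended to accept.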

The second substructure is the body, which is where most of the interesting claims of the JWT token appear. This JWT token is claiming that it was issued (iss) by jdavies for audience (aud) globalco and describes subject (sub) user1. The remaining claims are timestamps: seconds since Jan. 1 1970 UTC (the epoch), each of which describes a precise point in time. In particular, it was issued at (iat) 1:00 PM US Central time (19:00 UTC) on January 28, 2017, should be used not before (nbf) 12:55 PM, and expires (exp) at 1:10 PM. A couple of minor points to notice about these dates: first, the dates are given in seconds since the epoch, whereas Java and Javascript both expect milliseconds since the epoch, so if you want to actually see what times these correspond to, you'll need to multiply these values by 1000. Second, if you want to compare them for validity, you need to allow for possible clock drift between sender and receiver. Time zones only matter when you display the values; epoch timestamps are the same everywhere, so the comparison itself should be done against the current time in epoch seconds. Checking these claims limits the window in which a captured token can be replayed.
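
Here is a minimal sketch of both points, using the timestamps from example 2. The function names and the leeway parameter are my own inventions for illustration; how much clock drift to tolerate is a judgment call that RFC 7519 leaves to the implementer.

```javascript
// Timestamp claims copied from example 2.
const claims = { exp: 1485630600, nbf: 1485629700, iat: 1485630000 };

// JWT timestamps are in seconds; JavaScript's Date expects milliseconds.
function toDate(numericDate) {
  return new Date(numericDate * 1000);
}

// Hypothetical check: is nowSeconds inside [nbf, exp), give or take some leeway?
function isWithinValidity(claims, nowSeconds, leewaySeconds) {
  const started = nowSeconds + leewaySeconds >= claims.nbf;
  const notExpired = nowSeconds - leewaySeconds < claims.exp;
  return started && notExpired;
}

console.log(toDate(claims.nbf).toISOString()); // 2017-01-28T18:55:00.000Z
console.log(toDate(claims.iat).toISOString()); // 2017-01-28T19:00:00.000Z
console.log(toDate(claims.exp).toISOString()); // 2017-01-28T19:10:00.000Z

// In real use the second argument would be Math.floor(Date.now() / 1000).
console.log(isWithinValidity(claims, claims.iat, 0)); // true
```

Everything stays in epoch seconds until the moment of display, so no time zone arithmetic is involved in the comparison.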

Signature Verification

So what about that last substructure? If you try to Base64 decode it and convert it to an ASCII string, you won't get anything recognizable. This is where things start to get interesting. The last part validates the first two parts, providing assurance that the JWT token was actually generated by the issuing (iss) party. Specifically, the signature is generated by combining the encoded contents of the first two structures with a shared secret; both sender and receiver must be in possession of this secret [1]. The sender creates the signature and attaches it, and the receiver goes through the exact same signature creation process and compares the result, byte-for-byte, with the received one. If any of the bytes fail to match, the token is rejected.

So, the first step in both signing and verification is to generate a secure hash of the header and the body. This is where the algorithm (alg) of the header comes into play: in this case, HS256 means that the secure hash algorithm is SHA-256. This algorithm reduces any input, of any size, into a 256-bit (32 byte) "digest". The algorithm is designed so that collisions are extremely unlikely and effectively impossible to engineer. Listing 2 and example 3 illustrate the SHA-256 hash of the header and body of the JWT token in example 1.

var crypto = require('crypto');

var s = 'eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODU2MzA2MDAsInN1YiI6InVzZXIxIiwiaXNzI' +
  'joiamRhdmllcyIsImF1ZCI6Imdsb2JhbGNvIiwibmJmIjoxNDg1NjI5NzAwLCJpYXQiOjE0ODU2MzAw' +
  'MDB9.JXcg9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U';
var parts = s.split('.');
console.log(Buffer.from(parts[0], 'base64').toString());
console.log(Buffer.from(parts[1], 'base64').toString());

// Hash the still-encoded header, a period, and the still-encoded body.
var sha = crypto.createHash('sha256');
sha.update(parts[0]);
sha.update('.');
sha.update(parts[1]);

console.log(sha.digest('base64'));

Listing 2: SHA-256 computation

$ node jwt.js 
{"alg":"HS256"}
{"exp":1485630600,"sub":"user1","iss":"jdavies","aud":"globalco","nbf":1485629700,"iat":1485630000}
fxir+6mRstGSbsHb095BWmizCF8iDVTz/SNLg8irv5A=

Example 3: SHA-256 hash value

Now, notice what happens if I try to alter any element of the JWT body — for instance, to change the subject value:

{"alg":"HS256"}
{"exp":1485630600,"sub":"user2","iss":"jdavies","aud":"globalco","nbf":1485629700,"iat":1485630000}
g2ZGn0ceO6f9vIXM1XZo105jBc0T7rJbNJI6RZc/yDo=

Although I only changed one character of the input, every byte of the digest changed (except for the last, the '=' sign, but that's just an artifact of the Base64 encoding; every byte of the actual digest did change).
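
If you'd like to measure this effect yourself, a sketch like the following counts how many bits differ between two buffers. The differingBits function is a made-up helper, and the two JSON strings are small stand-ins rather than the actual signing input from listing 2.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: count the bit positions in which two buffers differ.
function differingBits(a, b) {
  if (a.length !== b.length) {
    throw new Error('buffers must have equal length');
  }
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i]; // set bits mark the positions that differ
    while (x) {
      count += x & 1;
      x >>= 1;
    }
  }
  return count;
}

// The characters '1' (0x31) and '2' (0x32) differ in two bits.
console.log(differingBits(Buffer.from('1'), Buffer.from('2'))); // 2

// Illustrative inputs that differ only in that one character.
const d1 = createHash('sha256').update('{"sub":"user1"}').digest();
const d2 = createHash('sha256').update('{"sub":"user2"}').digest();
console.log(differingBits(d1, d2), 'of', d1.length * 8, 'digest bits differ');
```

For a well-behaved hash you'd expect the second number to land somewhere near half of the 256 digest bits, though the exact count varies from input to input.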

However, if you notice, the hash code in example 3 doesn't match the last part of the JWT token in example 1. Remember, I haven't used the shared secret yet! Without a bit of additional authentication, anybody could generate any JWT body and associated hash that they wanted; the SHA-256 hash must subsequently be combined with a secret value. You might be tempted to solve this problem by just prepending or appending the secret value to the input like this:

sha.update(secretValue);
sha.update(parts[0]);
sha.update('.');
sha.update(parts[1]);

But as it turns out, this approach is flawed and can lead to forgeries. Instead, the SHA-256 hash is combined with a shared secret using an algorithm called HMAC. Node's crypto module includes an HMAC implementation that handles both the digest computation and the mixing in of the secret, as shown in Listing 3.

var crypto = require('crypto');

var s = 'eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODU2MzA2MDAsInN1YiI6InVzZXIxIiwiaXNzI' +
  'joiamRhdmllcyIsImF1ZCI6Imdsb2JhbGNvIiwibmJmIjoxNDg1NjI5NzAwLCJpYXQiOjE0ODU2MzAw' +
  'MDB9.JXcg9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U';
var parts = s.split('.');
console.log(Buffer.from(parts[0], 'base64').toString());
console.log(Buffer.from(parts[1], 'base64').toString());

// Same input as listing 2, but keyed with the shared secret 'password'.
var hmac = crypto.createHmac('sha256', 'password');
hmac.update(parts[0]);
hmac.update('.');
hmac.update(parts[1]);

console.log(hmac.digest('base64'));

Listing 3: HMAC computation

Note that the HMAC computation is not simply the SHA-256 hash from example 3 with the secret mixed in afterward. HMAC runs SHA-256 twice, folding the secret ('password' in this case) into both passes, and the call to digest in the last line of the listing returns the result. Example 4 shows the final output of the HMAC.

$ node jwt.js 
{"alg":"HS256"}
{"exp":1485630600,"sub":"user1","iss":"jdavies","aud":"globalco","nbf":1485629700,"iat":1485630000}
JXcg9jSHbVklARfhAIwGbDFeVS/6XrkC8cogy5lAM8U=

Example 4: HMAC output
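
To take some of the mystery out of HMAC, here is a sketch that builds HMAC-SHA-256 by hand from two plain SHA-256 passes, following the standard definition in RFC 2104, and checks it against Node's built-in implementation. The manualHmacSha256 name is mine, and this is for illustration only; in real code, stick with crypto.createHmac.

```javascript
const { createHash, createHmac } = require('crypto');

const BLOCK_SIZE = 64; // SHA-256 consumes its input in 64-byte blocks

// Illustrative only: HMAC(K, m) = H((K ^ opad) || H((K ^ ipad) || m))
function manualHmacSha256(secret, message) {
  let key = Buffer.from(secret);
  // Keys longer than one block are hashed down first.
  if (key.length > BLOCK_SIZE) {
    key = createHash('sha256').update(key).digest();
  }
  // Zero-pad the key to a full block and XOR it with the two pad constants.
  const ipad = Buffer.alloc(BLOCK_SIZE);
  const opad = Buffer.alloc(BLOCK_SIZE);
  for (let i = 0; i < BLOCK_SIZE; i++) {
    const k = i < key.length ? key[i] : 0;
    ipad[i] = k ^ 0x36;
    opad[i] = k ^ 0x5c;
  }
  // Inner pass hashes the message; outer pass hashes the inner digest.
  const inner = createHash('sha256').update(ipad).update(message).digest();
  return createHash('sha256').update(opad).update(inner).digest();
}

const input = 'any message at all';
const byHand = manualHmacSha256('password', input);
const builtIn = createHmac('sha256', 'password').update(input).digest();
console.log(byHand.equals(builtIn)); // true
```

Because the secret goes into both the inner and the outer pass, an attacker who knows the digest of one message can't simply keep hashing to extend it, which is the weakness of the prepend-the-secret approach.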

But wait - if you compare this, byte for byte, against the final component in example 1, it still doesn't quite match! It's close, but the signature in example 1 is one character shorter, and the characters in column 27 are different - example 1 shows an "_", but example 4 shows a "/". Did something go wrong?

Well, as it turns out, Base64 is more of a family of specifications than one unified specification. They all have a lot in common, of course - all Base64 implementations break the input down into six-bit chunks and output an 8-bit character for each six-bit chunk. The differences have to do with exactly which 8-bit characters correspond to each six-bit chunk and what they do when the input isn't exactly 6-bit aligned.

The 64 in the name "base64" comes from the fact that there are exactly 64 6-bit codes that must be encoded. Since Base64 was originally conceived as a way to transmit arbitrary binary data across devices that could only safely represent US-ASCII, all Base64 implementations agree that the first 62 codes should be mapped, respectively, to the upper-case alphabetic characters A-Z, the lower-case alphabetic characters a-z, and the 10 numeric codes 0-9. However, that left two more combinations that needed to be mapped. The Base64 specification in RFC 1421 (Privacy Enhanced Mail), one of the earliest, suggested + and /; those seemed like pretty safe characters.

Finally, the issue of what to do with non-aligned output required an additional "placeholder" character. The binary input was always 8-bit aligned, of course, so when you split it into six-bit chunks, every three bytes (24 bits) corresponds to exactly 4 output bytes (since 6 divides evenly into 24). If you don't have an even multiple of 3 bytes, though, you end up with either one or two "left over" bytes. The '=' character is used by Base64 to pad the final group out to four characters, which also marks how many left over bytes were in the original input: two '=' characters for one left over byte, and one '=' for two.
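
You can watch the padding rule in action with a few short inputs. This is just a quick sketch; the encode wrapper is a hypothetical convenience around Node's Buffer.

```javascript
// Hypothetical wrapper: standard Base64 encoding of a text string.
function encode(text) {
  return Buffer.from(text).toString('base64');
}

for (const text of ['a', 'ab', 'abc']) {
  console.log(text, '->', encode(text));
}
// a -> YQ==    (1 left over byte, two '=' characters)
// ab -> YWI=   (2 left over bytes, one '=' character)
// abc -> YWJj  (exactly 3 bytes, no padding)
```

The 32-byte HMAC digest in example 4 is 30 bytes plus two left over, which is why it ends in a single '='.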

This worked pretty well for a long time, but Base64-encoded values started appearing more and more often in HTTP URLs. This was problematic, because +, / and = all have special meaning in URLs, and each instance had to be escaped. This made the encoded value even longer than it already had to be, and created another source of (potentially hard to track down) errors. RFC 4648 defines an alternative Base64 encoding called "Base64URL" which translates the characters + and / to the more URL-friendly - and _. And the =? If the length is known (as is the case for an HMAC in a JWT token), the = padding character can be left off entirely.
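
The translation between the two alphabets is mechanical enough to sketch in a few lines. The function names here are my own; recent versions of Node also accept 'base64url' as a Buffer encoding, but the string replacements make the difference easier to see.

```javascript
// Standard Base64 -> Base64URL: swap the two characters and drop the padding.
function toBase64Url(base64) {
  return base64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Base64URL -> standard Base64: swap back and restore the padding.
function fromBase64Url(base64url) {
  const base64 = base64url.replace(/-/g, '+').replace(/_/g, '/');
  return base64 + '='.repeat((4 - (base64.length % 4)) % 4);
}

// The HMAC output from example 4 becomes the signature from example 1.
console.log(toBase64Url('JXcg9jSHbVklARfhAIwGbDFeVS/6XrkC8cogy5lAM8U='));
// JXcg9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U
```

So nothing went wrong with the HMAC; the two strings are the same 32 bytes written in two different Base64 dialects.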

JWT structure verification

The technically complex part of JWT verification is the signature. Once the structures have been Base64 decoded and verified by signature comparison, the receiver must validate that the audience is correct (i.e. that the token is intended for the receiver itself) and that the current time lies between the nbf and exp times, after allowing for potential clock drift.
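
Putting the pieces together, a receiver's checks might be sketched as follows. Treat this as an outline under stated assumptions rather than a production verifier: verifyJwt, decodeSegment and the leeway parameter are names I made up, only HS256 is handled, and a real application would normally use a maintained JWT library.

```javascript
const { createHmac, timingSafeEqual } = require('crypto');

// Hypothetical helper: Base64URL segment -> parsed JSON object.
function decodeSegment(segment) {
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
}

// Hypothetical verifier: returns the claims, or throws if any check fails.
function verifyJwt(token, secret, expectedAudience, nowSeconds, leewaySeconds = 0) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('malformed token');
  const [headerPart, bodyPart, signaturePart] = parts;

  if (decodeSegment(headerPart).alg !== 'HS256') throw new Error('unsupported alg');

  // Recompute the HMAC over the encoded header and body, then compare.
  const expected = createHmac('sha256', secret)
    .update(headerPart + '.' + bodyPart)
    .digest();
  const received = Buffer.from(
    signaturePart.replace(/-/g, '+').replace(/_/g, '/'), 'base64');
  // timingSafeEqual requires equal lengths, so check that first.
  if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
    throw new Error('signature mismatch');
  }

  // Only now are the claims worth looking at.
  const claims = decodeSegment(bodyPart);
  if (claims.aud !== expectedAudience) throw new Error('wrong audience');
  if (claims.nbf !== undefined && nowSeconds + leewaySeconds < claims.nbf) {
    throw new Error('token not yet valid');
  }
  if (claims.exp !== undefined && nowSeconds - leewaySeconds >= claims.exp) {
    throw new Error('token expired');
  }
  return claims;
}

// Try it on the token from example 1, pretending "now" is its iat time.
const token = 'eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjE0ODU2MzA2MDAsInN1YiI6InVzZXIxIiwiaXNzI' +
  'joiamRhdmllcyIsImF1ZCI6Imdsb2JhbGNvIiwibmJmIjoxNDg1NjI5NzAwLCJpYXQiOjE0ODU2MzAw' +
  'MDB9.JXcg9jSHbVklARfhAIwGbDFeVS_6XrkC8cogy5lAM8U';
try {
  console.log(verifyJwt(token, 'password', 'globalco', 1485630000));
} catch (err) {
  console.log('rejected:', err.message);
}
```

The comparison uses crypto.timingSafeEqual instead of a plain equality test so that the time taken doesn't reveal how many leading bytes of a forged signature happened to be right.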

And that's it. As long as the signer is verified (in this example, through a secret key exchange), the contents of the body are as trustworthy as the signer.

[1] If you're using HS256 signatures as I do here. JWT supports, and has always supported, RSA and ECDSA signatures, which are based on public/private key pairs rather than a shared secret; I didn't explore those here.

Miller, 2021-01-22
Excellent Article!
When you mention "You might be tempted to solve this problem by just prepending or appending the secret value to the input [...] But as it turns out, this approach is flawed and can lead to forgeries." - Are you referring to the Length extension attack?
Joshua Davies, 2021-01-26
That's right - an attacker can append data to the original payload and generate a hash that would be accepted as genuine. This might be harder to get away with in JWT, since the JWT payload is a JSON structure with a closing bracket, but since HMAC is already available, it's best to use as many layers of security as are available.
My Book

I'm the author of the book "Implementing SSL/TLS Using Cryptography and PKI". Like the title says, this is a from-the-ground-up examination of the SSL protocol that provides security, integrity and privacy to most application-level internet protocols, most notably HTTP. I include the source code to a complete working SSL implementation, including the most popular cryptographic algorithms (DES, 3DES, RC4, AES, RSA, DSA, Diffie-Hellman, HMAC, MD5, SHA-1, SHA-256, and ECC), and show how they all fit together to provide transport-layer security.


Joshua Davies
