XYZ: Cryptographic Binding

This article is part of a series about XYZ and how it works, also including articles on Why?, Handles, Interaction, and Compatibility.

OAuth 2 loves its bearer tokens. They’re a really useful construct because they are simple: if you have the token, you can do whatever the token is good for. It’s all the proof that you need: you can present it in exactly the format that you received it in, no cryptography involved.

This simplicity makes it easy for client developers to get it right, most of the time. You give them a magic value, they put the magic value into the right header, and suddenly the API works and they can stop thinking about this security thing.

Sender Constraints

The downside to bearer tokens, and it is a big downside, is that anyone who has a copy of the token can use it. TLS protects the token in transit over a direct hop, but it doesn’t fix the fact that the people who are allowed to see the token aren’t the same as the people who are allowed to use the token. This means that if a client sends its token to a rogue resource server, or even the wrong resource server, then that resource server can replay the token somewhere else.

To address this in the OAuth 2 world, there is active work to define sender-constrained access tokens. These tokens would move beyond a bearer token by requiring the client to present some type of verifiable keying material, like a mutual TLS certificate or an ephemeral key generated by the client. There was previously work on using TLS token binding as well, but that has sadly gone by the wayside as TLS token binding failed to take off. These approaches work pretty well, but there is an overwhelming prevalence of OAuth 2 code that looks for the Bearer keyword for the token type and breaks if you show it anything else. Because of this, the MTLS spec requires that you continue to use the Bearer token type, and there has even been pushback for DPoP to use Bearer as an option as well.

And on top of all of this, the token presentation is tangled up with the OAuth 2 client authentication, which could be based on a secret, a key, or nothing at all (in the case of public clients).

With XYZ, we don’t have this legacy of bearer tokens and client authentication, and we were able to build a system that was more consistent from the start.

Presenting Keys

In XYZ, key presentation is at the core of all client interactions. When a client calls the AS, it identifies itself by its key. The key formats in XYZ are flexible. A client can present its key as a JWK, an X509 certificate, or potentially any number of other formats through extensions. The key can be RSA, or elliptic curve, or potentially some other exotic form. The protocol doesn’t really care, so long as there’s a method to validate it.

This key can be passed by value or by reference, and the AS can even decide to assign a reference dynamically, but the important thing is that it’s the key that identifies the software that’s making the call. This alone is an important shift from OAuth 2, because specifications like MTLS for OAuth and DPoP have shown us the value in allowing a client to bind an ephemeral key to its requests. In XYZ, the default is that all client keys are ephemeral, unless the AS has some additional set of metadata and a policy attached to the key.
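To make the by-value and by-reference distinction concrete, here is a minimal sketch of how an AS might resolve a presented key. Everything in it is an assumption for illustration: the `KeyRegistry` class, its `resolve` method, and the placeholder JWK values are hypothetical and are not taken from the XYZ protocol itself.

```python
import secrets


class KeyRegistry:
    """Toy AS-side store mapping key references to key material (hypothetical)."""

    def __init__(self):
        self._by_ref = {}

    def resolve(self, presented):
        """Return (key, reference) for a key presented by value (dict) or by reference (str)."""
        if isinstance(presented, str):
            # By reference: the client sent only an identifier it was given earlier.
            if presented not in self._by_ref:
                raise KeyError("unknown key reference")
            return self._by_ref[presented], presented
        # By value: the client sent the key itself, so assign a reference dynamically.
        ref = secrets.token_urlsafe(16)
        self._by_ref[ref] = presented
        return presented, ref


registry = KeyRegistry()
# Placeholder values only; this is not a usable key.
jwk = {"kty": "EC", "crv": "P-256", "x": "example-x", "y": "example-y"}

key, ref = registry.resolve(jwk)            # first call: key passed by value
same_key, same_ref = registry.resolve(ref)  # later call: key passed by reference
```

Either way, the AS ends up with the same key material, and that key is what identifies the calling software. A real AS would also decide whether any metadata or policy is attached to the key.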

Whatever key the client uses for its requests to the AS, it’s reasonable that the client would be able to use that same key for requests to the RS. But in this case, the client is also going to be presenting the access token it was issued.

But unlike a bearer token, it’s not enough to present the token alone, or the key, or its identifier. The client also has to present proof that it currently holds that key to the server. It does this by performing a cryptographic operation on the request, in some fashion that can be verified by the server and associated with the key. But how?

Agility of Presentation

Herein lies the real trick: a new delegation protocol is going to have to be flexible and agile in how the client is allowed to prove its keys. Just about every deployment is going to have its own considerations and constraints affecting everything from how keys are generated to how proofs can be validated across the layers.

In XYZ, we’ve made the key presentation modular. Instead of defining a required cryptographic container, XYZ declares that the client has to present the key in a manner bound to its request, in some fashion, and declare that proofing method. The two categories implemented to date are MTLS and HTTP-based message signatures.

For MTLS, it’s pretty straightforward. The client needs to use its client certificate during the TLS connection to the server, and the server needs to verify that the certificate in use is the same one referenced in the request — either the request body for the AS or the access token for the RS. The server does not need to do a full chain validation of the client’s certificate, though it’s free to do so in order to limit which certificates are presented.
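The "same certificate" check can be sketched as a thumbprint comparison. This is only one way to do it, and an assumption on my part: it hashes the DER-encoded certificate with SHA-256, similar in spirit to the `x5t#S256` confirmation used by MTLS for OAuth 2, and compares the result to the value bound to the request or token. The function names and placeholder bytes are hypothetical.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(der_bytes: bytes) -> str:
    """Base64url-encoded SHA-256 hash of a DER-encoded certificate, without padding."""
    digest = hashlib.sha256(der_bytes).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def cert_matches_binding(presented_der: bytes, bound_thumbprint: str) -> bool:
    """True if the certificate seen on the TLS connection is the one that was bound."""
    presented = cert_thumbprint(presented_der).encode("utf-8")
    # Constant-time comparison of the two thumbprints.
    return hmac.compare_digest(presented, bound_thumbprint.encode("utf-8"))


# Stand-in bytes; a real server would take the DER from its TLS layer,
# for example via ssl.SSLSocket.getpeercert(binary_form=True).
client_cert = b"placeholder DER bytes for the client's certificate"
other_cert = b"placeholder DER bytes for some other certificate"

# The value the AS or RS would have associated with the request or access token.
bound = cert_thumbprint(client_cert)

print(cert_matches_binding(client_cert, bound))
print(cert_matches_binding(other_cert, bound))
```

Note that this check says nothing about who issued the certificate. It only confirms that the connection is using the certificate that was bound, which is why chain validation is optional here.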

For HTTP message signing it’s a similar outlay of effort, but the signature is presented at the HTTP layer and has to directly cover some or all of the HTTP request. There are a number of different specifications out there for doing this, and our test XYZ implementations have built out support for several of them, including Cavage signatures, OAuth DPoP, and even a simple JWS-based detached body signature invented for the protocol. The most promising, to my mind, is the new signatures specification being worked on in the HTTP working group of the IETF. It’s based on long experience with several community specifications, and I think it’s really got a chance to be a unifying security layer across a lot of different applications. Eventually, we’re likely to see library and tooling support like we have with TLS.
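As a rough illustration of what "covering the request" means, here is a minimal sketch of a detached signature over the method, target URL, and a digest of the body. It is not any of the specifications named above. To keep it runnable with only the standard library, it uses HMAC-SHA256 with a shared key as a stand-in; a real deployment would use an asymmetric signature such as RSA or elliptic curve, so that the verifier cannot forge proofs. The function names, the line format of the signature base, and the `as.example` URLs are all hypothetical.

```python
import base64
import hashlib
import hmac


def signature_base(method: str, url: str, body: bytes) -> bytes:
    """Canonical bytes covering the method, target URL, and a digest of the body."""
    body_digest = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    lines = [
        f"method: {method.upper()}",
        f"url: {url}",
        f"body-sha256: {body_digest}",
    ]
    return "\n".join(lines).encode("utf-8")


def sign_request(key: bytes, method: str, url: str, body: bytes) -> str:
    """Client side: produce a detached signature to send alongside the request."""
    mac = hmac.new(key, signature_base(method, url, body), hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")


def verify_request(key: bytes, method: str, url: str, body: bytes, signature: str) -> bool:
    """Server side: recompute over the request as received, compare in constant time."""
    expected = sign_request(key, method, url, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# HMAC stand-in key for illustration only; not how a real client key would look.
key = b"demo-key-for-illustration-only"
body = b'{"resources": ["read"]}'
sig = sign_request(key, "POST", "https://as.example/tx", body)

print(verify_request(key, "POST", "https://as.example/tx", body, sig))
print(verify_request(key, "POST", "https://other.example/tx", body, sig))
```

Because the target URL is part of what gets signed, the same signature presented against a different server fails verification. That is the property a bare bearer token lacks.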

The use cases for these are different. MTLS tends to work great for webserver-based applications, especially in a closed enterprise setup, but it falls apart for SPAs and ephemeral apps. HTTP message signing is more complex to implement, but it can survive across TLS termination and multiple hops. There is no one answer, and there are likely other approaches that will be invented down the road that work even better.

Signatures as a Core Function

OAuth 1 had its own bespoke signing mechanism, which confused a lot of developers. OAuth 2 set out to avoid the problems that this caused by removing signatures entirely, but in so doing it pushed the needle too far away from good security practices and made it hard to add such functionality back in. With XYZ we tried to strike a balance by allowing different mechanisms but assuming that signing, in some fashion, was going to be available to every client. With today’s library and software support, this seems to be true across many platforms, and time will tell which methods work the best in the real world.

Justin Richer is a security architect and freelance consultant living in the Boston area. To get in touch, contact his company: https://bspk.io/
