Discovery, Negotiation, and Configuration

Justin Richer
Dec 14, 2023

Interoperability is a grand goal, and a tough problem to crack. After all, what is interoperability other than independent things just working out of the box? In the standards and specifications world, we have to be precise about a lot of things, but none more precise than what we expect to be interoperable with, and how we get to the interop point.

In my experience, there are a few common ways to get there.

Illustration of many plugs (generated by DALL-E)

Conformance

The easiest way to achieve interoperability is for there to be no choice left to the implementations. If there’s only one option, and implementations are motivated to follow that option, then interoperability is much easier to count on. If a specification has a MUST, and means it, you can realistically rely on that being followed by well-meaning developers.

This zero-level interoperability is not without its confusion, though. In any specification, there are a lot of assumptions that lead up to a MUST being placed. Changes in that context can make the same MUST behave differently between implementations. For example, taking the byte value of a data structure, even a simple one like a string or a number, assumes an encoding and ordering for that data structure. What’s most dangerous about this kind of problem is that it’s easy for multiple developers to make the same assumptions and therefore assure themselves that the MUST as written is sufficient, until someone else comes along with different assumptions and everything breaks in spite of seeming to be conformant.
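
To make the encoding pitfall concrete, here’s a small sketch (in Python, purely for illustration): a spec that says to take “the bytes of the string” without naming an encoding lets two conformant implementations produce different bytes, and therefore different hashes.

```python
import hashlib

# Suppose a spec says: "the input to the hash MUST be the bytes of the string."
# If it never says which encoding, two "conformant" implementations can disagree.
subject = "café"

utf8_digest = hashlib.sha256(subject.encode("utf-8")).hexdigest()
utf16_digest = hashlib.sha256(subject.encode("utf-16-le")).hexdigest()

print(utf8_digest)                   # one implementation's reading of the MUST
print(utf16_digest)                  # another's reading of the same MUST
print(utf8_digest == utf16_digest)   # False: same spec text, broken interop
```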

To give a concrete example, try asking anyone for a USB cable and chances are you’ll get a few different varieties back, from type A, to type C, to micro-B, to power-only varieties. All of them are USB and conform to all aspects of the cabling standard, but “USB” is not sufficient to ensure compatibility on a physical level.

Even with its limitations, it’s still a good idea for specifications to be as specific as possible. But the world can’t always make a single choice and stick to it. Things start to get more interesting when there’s a choice between options, though, so how do we handle that?

Discovery

If my system can ask your system which options it supports ahead of time, then ostensibly I should be able to pick one of those options and expect it to work. Many standard internet APIs are based around this concept, with an initial discovery phase that sets the parameters of interoperability for future connections.

This pattern works fairly well, at least for common options and happy paths. If your system supports some or all of what my system wants, then I can probably figure out how to connect to you successfully. If your system doesn’t support what I need, then at least I know I can’t start the process. OpenID Connect usually works from a discovery-based process, where the RP fetches the IdP’s discovery document prior to connecting.
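
As a rough sketch of that discovery step (the issuer URL here is made up, and error handling is elided), an RP can fetch the IdP’s well-known configuration and check that the options it needs are advertised before it ever starts the flow:

```python
import requests

# Hypothetical issuer, for illustration only.
issuer = "https://idp.example.com"

# OpenID Connect publishes its discovery document at a well-known path.
config = requests.get(f"{issuer}/.well-known/openid-configuration").json()

# Check the advertised options before attempting to connect.
if "code" not in config.get("response_types_supported", []):
    raise RuntimeError("IdP does not advertise the authorization code flow")

authorization_endpoint = config["authorization_endpoint"]
token_endpoint = config["token_endpoint"]
print("Connecting via", authorization_endpoint, "and", token_endpoint)
```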

The discovery pattern is predicated on an agreement of how to do discovery in the first place. I need to at least know how to make an initial call in order to figure out what the options are for all future calls. This is expected to be out of band for the rest of the protocol, but is often built on the same underlying assumptions. Does the protocol assume HTTP as a transport? Then discovery can use that, also.

Discovery is generally done without context, though. The existence of something in a discovery step does not guarantee that it will be usable in the context of a real request. A server might support seven different cryptographic algorithms, but might only allow some of them for specific clients or request types. That kind of detail is hard to capture through discovery.

For a physical example, let’s say that before you ask for a USB cable, you can check a list of all the available types that the person you’re asking has available. That way when you ask for a specific cable, you’ll at least know that they had it as an option. But maybe they only had one and already lent it out to someone else, or they only hand out power-only cables to people they haven’t met before, in case the cable goes walkabout.

Negotiation

If we can instead bake the discovery process into the protocol itself, we can end up with a negotiation pattern. One party makes a request that includes the options that they’re capable of, and the other party responds with their own set of options, or chooses from the first set. From there, both parties now know the parameters they need to connect.

This kind of approach works well with connection-focused protocols, and it has the distinct advantage of avoiding an additional round trip to do discovery. There’s also no longer a need to specify a separate discovery process, since it’s baked into the protocol itself. Content negotiation in HTTP, algorithm selection in TLS, and grant negotiation in GNAP all follow this pattern.
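
To pick one of those, here’s what HTTP content negotiation looks like in a minimal client-side sketch (the URL is a placeholder): the client states what it can accept and in what preference order, and the server’s choice comes back in the same exchange, with no separate discovery call.

```python
import requests

# Hypothetical endpoint, for illustration only.
url = "https://api.example.com/report"

# The client lists the formats it can handle; q-values express preference order.
headers = {"Accept": "application/json;q=1.0, application/xml;q=0.5"}
response = requests.get(url, headers=headers)

# The server's selection arrives in the Content-Type header of the response;
# the client only needs to handle the options it offered.
content_type = response.headers.get("Content-Type", "")
if content_type.startswith("application/json"):
    data = response.json()
else:
    data = response.text  # fall back to whatever the server chose
```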

Negotiation falls short when decisions have to be made about the initial connection, much like when there’s a separate discovery call. The protocol can be built to robustly account for those failures, such as a content type being unavailable in HTTP, but the ability to negotiate does not guarantee satisfactory results. Negotiation can also end up with less than ideal results when there’s not a clear preference order, but in such cases it’s possible for a negotiation to continue over several round trips.

If you need a USB cable, you can walk up to someone and say “Hi, can I borrow a USB cable? I need it to be USB-A or USB-C and I need it for a data connection.” The person you’re asking can then see if they have anything that fits your criteria, and choose appropriately from their supply. If they hand you something that’s less than ideal, you can clarify “I’d really prefer USB-C if you have it, but this will work if not”.

Configuration

On a simpler level, many developers simply want to choose an option and run with it, and if the other side makes a compatible choice, this can short-circuit any kind of discovery or negotiation process in a positive way. This might seem magical, but it happens way more often than many software architects and standards authors like to admit. It’s not uncommon for two developers to make similar assumptions, or for libraries to influence each other’s implementations such that they end up doing the same thing even without any external agreement to do so.

If a developer codes up something based on an example, and it works, that developer is not likely to look beyond the example. Why would they? The goal is to get something to connect, and if it does, then that job is done and they can move on to more interesting problems. And if it doesn’t work? Chances are they’ll tweak the piece that doesn’t work until it does work.

JSON works this way in practice, with well-formed JSON being the interoperability expectation and anything else being, effectively, schema-by-fiat. While there are schema languages on top of JSON, the practical truth is that applications apply their own internal schema-like expectations to the JSON by looking for a field with a certain name in a certain place with a data value that parses how they expect it to. Anything that runs afoul of that is an error not of JSON but for the application to deal with. This is a far cry from the days of XML, which expected processing of namespaces and schemas to make sense of tags at the parsing level. Was it more robust? Arguably, yes. But it was also far too heavy for an average developer to care about. JSON’s approach gets us to data exchange simply: we get it right by accident most of the time, and ignore the things that don’t make sense.
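
A minimal sketch of that schema-by-fiat style (the field names are invented for illustration): the application never validates against a formal schema, it just reaches for the fields it expects, coerces them into the shapes it expects, and treats anything that doesn’t fit as its own problem rather than a JSON problem.

```python
import json

raw = '{"user": {"name": "Alice", "age": "42"}, "extra": {"ignored": true}}'
doc = json.loads(raw)  # well-formed JSON is the only hard requirement

# Schema-by-fiat: look for the fields we expect, where we expect them,
# in the shapes we expect. Fields we don't know about are simply ignored.
try:
    name = doc["user"]["name"]
    age = int(doc["user"]["age"])  # the app decides "42" should be a number
except (KeyError, TypeError, ValueError) as err:
    # Not a parsing error: an application-level disagreement about the data.
    raise RuntimeError(f"document did not match expectations: {err}")

print(name, age)
```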

If you want a USB-C cable but just ask someone for a USB cable, and they hand you a USB-C cable, everyone’s happy. You may have been accidentally interoperable with your request and response, but there’s power in that kind of accident and the frequency with which it happens.

Practicality

All of these interoperability methods have merit, and most systems are built out of a combination of them in one way or another. When we’re defining what interoperability means, we always need to take into account the context of what is interoperable, for whom, and when.

At the end of the day, practical interoperability means that things connect well enough to get stuff done. We should endeavor to build our standards and systems to allow for robust discovery and negotiation, but always keep in mind that developers will find the best path for them to connect.

Interoperability is a grand goal indeed, and while a lot of the time we stumble backwards into it, there are well-trodden paths for getting there.


Justin Richer

Justin Richer is a security architect and freelance consultant living in the Boston area. To get in touch, contact his company: https://bspk.io/