Federation Bubbles

Justin Richer
Aug 4, 2023

We’ve spent decades building up systems that identify people and devices, and interconnect all of them. We’ve built systems that let us define and control accounts, assign access rights, and associate trust within and across boundaries. But today’s mobile, cloud, and distributed systems are complex beasts that defy the assumptions many of our architectures bring to the table.

Maybe it’s time to rethink what accounts and federation even mean. And for that, I’ve been thinking lately about bubbles.

Bubbles

A bubble is an interesting physical construct. A thin boundary that surrounds its contents. What’s part of the bubble is separated from what’s around the bubble by that boundary, creating a sub-entity that only exists and makes sense in a larger space.

Just some bubbles making a foam

In classical computer network systems, we liked putting hard boundaries up around things. Firewalls and even physical network cable separation let us define what was “in” and “out”. If you were on the network, you had to be good, because we made sure of that before you were allowed on the network.

While these approaches still have their uses, today we know that it’s not enough. Devices cross network boundaries all the time, and even our trusted API services dart around cloud data centers, pulling pieces and components and data from places we can’t fully predict and plan ahead of time. So the world has moved to a zero-trust approach, where we build security into every part of the architecture instead of assuming it’ll get stopped at the door. That is a very good thing, but it also turns out to not quite be enough.

Organizations want to be able to control policies in a central fashion, to define accounts and access in one dashboard. But on the other hand, cloud services need to make extremely local decisions in potentially ephemeral environments — do I start X image on Y network to process Z request? This dichotomy creates a tension in the system that only increases when you start to cross network and trust domain boundaries.

We need to address this, and that’s why I’ve been thinking about bubbles. Bubbles provide context, boundaries, and differentiation, and that’s what I think we need to consider in our identity and security systems.

Within a bubble, there’s a certain degree of trust, like the set of services within a pod on a shared virtual network. I can make hyper-local decisions about access and security, and I can do it in a way that doesn’t require me to talk outside the bubble. I can authenticate users directly, I can check policies that only apply to my systems, I can identify software and hardware, and I can do that all within the comfort of the bubble. It’s a small world that I can see and control. The bubble offers safety and familiarity, and we like that.

Foam

Bubbles don’t exist on their own, though, and that’s where the classical network thinking breaks down. It would be easy to say that a bubble is just a regular local network, disconnected from the world, but that doesn’t solve the most interesting problems. A single bubble might be a helpful concept, but it’s when we get a lot of bubbles together that things really get exciting, if you ask me.

Let’s take a user account as our example. We tend to think of an account as having a specific home, an authoritative source that binds the attributes and authenticators to a person (or other entity, if we’re feeling spicy). But those attributes and authenticators often come from somewhere, and that’s where the bubble concept really starts to shine.

I’m not just talking about just-in-time provisioning, where a central account database points to a new bubble and says “here is everyone that’s supposed to be in that bubble”. I do think that this kind of push is an important tool, but it can’t be the only one. Any developer will tell you that even the best distributed and delegated systems tend to accrete things like local accounts, group accounts, admin passwords, service keys, and other techniques that solve specific problems.

Instead of trying to engineer those kinds of things away, the bubble concept embraces the local-ness of them. Within the bubble, we want to authenticate and make our decisions locally. This lets us be fast and robust, and build in the kinds of things that we only care about here. But how do we balance that need against usability and account control?

When I’ve got a user in my system, they’ve got an account that exists only within the bubble. They can authenticate locally to an IdP or AS that is in charge of all the services within that bubble. The account has attributes and access controls that might only make sense to systems inside the bubble. But it would be obnoxious for a user to create a new local account by hand every time they needed to do something, even though that’s how we have solved this kind of thing in the past. This is the strength of federation technologies like OpenID Connect and credentialing systems like Verifiable Credentials: a user can show up and point back to an authoritative source for their attributes, and get an authenticated session out of the deal. We can use these to get the user inside the bubble, but instead of using these to log in every time, these technologies can be used to instantiate and bind an account within the bubble. From that point forward, the user can authenticate locally. If at any time in the future we need to verify the account with its source, we can re-run the same checks we used at ingest.
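To make the ingest-then-authenticate-locally pattern a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ExternalAssertion`, `LocalAccount`, and `Bubble` names are hypothetical, and the assertion is treated as the already-validated result of an OpenID Connect login or a Verifiable Credential presentation rather than showing that validation itself.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field


@dataclass
class ExternalAssertion:
    """Hypothetical stand-in for an already-validated OIDC login or VC presentation."""
    source: str       # the upstream bubble or issuer that vouches for the user
    subject: str      # the user's identifier at that source
    attributes: dict  # claims the source asserted


@dataclass
class LocalAccount:
    local_id: str
    attributes: dict
    bindings: list = field(default_factory=list)  # (source, subject) pairs seen at ingest
    salt: bytes = b""
    secret_hash: bytes = b""


def _hash(secret: str, salt: bytes) -> bytes:
    # Derive a stored verifier for the local authenticator (assumed to be a secret).
    return hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, 100_000)


class Bubble:
    def __init__(self):
        self.accounts = {}

    def ingest(self, assertion: ExternalAssertion, local_secret: str) -> LocalAccount:
        """Instantiate a local account and bind it to the source it came from."""
        salt = secrets.token_bytes(16)
        account = LocalAccount(
            local_id=secrets.token_hex(8),
            attributes=dict(assertion.attributes),
            bindings=[(assertion.source, assertion.subject)],
            salt=salt,
            secret_hash=_hash(local_secret, salt),
        )
        self.accounts[account.local_id] = account
        return account

    def authenticate(self, local_id: str, local_secret: str) -> bool:
        """Purely local check: nothing here talks outside the bubble."""
        account = self.accounts.get(local_id)
        if account is None:
            return False
        return hmac.compare_digest(account.secret_hash, _hash(local_secret, account.salt))

    def reverify(self, local_id: str, fresh: ExternalAssertion) -> bool:
        """Re-run the ingest check: does a fresh assertion match a stored binding?"""
        account = self.accounts[local_id]
        return (fresh.source, fresh.subject) in account.bindings
```

The design choice worth noticing is that the federation step happens once, at `ingest`, and the binding is kept so that `reverify` can go back to the source later. A real bubble would use whatever local authenticator fits its environment; the shared secret here is only the simplest thing to show.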

And importantly, a bubble should be allowed to trust any other bubble as a source for information. There can be no strict hierarchy in reality. When my bubble is out in the world, I might have a very clear and immediate business need to trust people, information, and services from another bubble that I didn’t know about a few minutes ago. I should be able to make a local decision to trust a user that came from that bubble and bind them to an account in my own bubble.

This trust is also not linear. A user could have accounts, credentials, and artifacts from multiple places. It’s a common requirement in identity proofing to present evidence from multiple sources that corroborate your claims. In the same fashion, a user might show up with a VC from one issuer and an OIDC login from a different place. The combination of those things can be meaningful to the bubble in a unique way.

As the foam of distinct bubbles grows, it’s important to be able to trace provenance from these different bubbles. In our local policies, we need a way to say that “This is user A, they came to us from cloud B, but cloud B said they originally came from both cloud C and cloud D”, and be able to verify that chain. And since our bubble could be the authoritative source for some other bubble, we need a way to talk about that kind of chain downstream. These kinds of durable provenance artifacts aren’t simple, and they bring with them a whole host of privacy concerns: can an adversary use this to track an individual through a system that doesn’t want to be tracked? Can I troll around the network of bubbles and correlate all these points? It’s clear that we need to be able to selectively disclose the provenance chain, as well as the data about the user themselves.
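One way to picture the “user A came from cloud B, which said they came from clouds C and D” chain is as a small tree of records. The sketch below is only an illustration under loud assumptions: the `Provenance` type and the helper functions are hypothetical, and a real provenance artifact would need signatures or some other proof at every hop, which this leaves out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record: 'subject' is known at 'bubble', which vouches for 'upstream'."""
    bubble: str
    subject: str
    upstream: tuple = ()  # Provenance records this bubble says the user came from


def bubbles_in_chain(record: Provenance) -> set:
    """Every bubble that appears anywhere in the chain, including the nearest one."""
    found = {record.bubble}
    for parent in record.upstream:
        found |= bubbles_in_chain(parent)
    return found


def chain_is_acceptable(record: Provenance, trusted: set) -> bool:
    """Local policy: accept only if every bubble in the chain is one we trust."""
    return bubbles_in_chain(record) <= trusted


def redact(record: Provenance, hidden: set) -> Provenance:
    """Selective disclosure: drop upstream branches rooted at bubbles in 'hidden'."""
    kept = tuple(redact(p, hidden) for p in record.upstream if p.bubble not in hidden)
    return Provenance(record.bubble, record.subject, kept)


# User A as seen from our bubble: arrived via cloud B, which cites clouds C and D.
user_a = Provenance("cloud-b", "user-a", (
    Provenance("cloud-c", "a@c"),
    Provenance("cloud-d", "a@d"),
))
```

Passing `redact(user_a, {"cloud-d"})` downstream would tell the next bubble about cloud C but not cloud D. Whether a redacted chain is still convincing to the receiver is exactly the kind of local policy question the receiving bubble has to answer for itself.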

Turtles

A bubble can provide context for another bubble to exist. The bubbles can share part of their barriers, allowing specific trusted information to pass freely between them in a way that the external walls don’t, but stopping other information.

A bubble could also exist entirely within another bubble, acting as a sub-division. If we really do care about zero-trust security architectures, we’ll make our bubbles as small as possible and composable so we can stack concerns.

We’re starting to have this conversation on the IETF’s new Workload Identity in Multi System Environments (WIMSE) mailing list. So if that kind of thing interests you, please join us there.

Flow

Bubbles offer us another helpful metaphor, in that they aren’t static. When you’ve got a bunch of bubbles floating around, they really don’t like to stay still. They move, connect, disconnect, and eventually even pop.

In a classical multi-lateral federation, the focus is always on getting all the online systems to talk to each other in a way that makes sense for the moment. A new university just signed the agreement? Get them into the directory! We’ve got a new deployable unit that’s heading into the field? Write down the serial number on that router and out the door you go.

But once we’re up and running, things change. New parties could get brought on locally, new services get pulled in as needed, and the entities that we once relied on cease to exist.

Disaster response gives us a great example here. In this case, we want to be able to stand up a new bubble of services in response to a specific event, time, and place. Specialists come in with qualifications. Some of these we can verify — you’re a doctor, I can see your medical license is good right now, triage tent is that way. You’re an electrician, your union card looks good, go jump on that truck. But sometimes people show up to help, and their presence in the situation is enough to warrant giving them access. You used to be a firefighter, ok grab an axe and make a firebreak up on that hill. You can answer phones and direct calls? Great, have a seat, here’s the switchboard and directory. We have a mix of accounts that get provisioned from the outside — the doctors and union electricians — and accounts that get provisioned locally — the firefighter and switchboard operator. All of these accounts are valuable for different reasons, and our systems need to have that level of flexibility.

And eventually, the disaster is over, and we need to clean up our data as much as the physical mess of the disaster. That firefighter went and accessed a bunch of stuff, were they supposed to see that? Those union electrical workers, were they actively in the union with valid licenses while they were down here, or had some of them been kicked out? And depending on what they did, does that matter to us? We need auditability and accountability for dynamic systems like this. We need to be able to call back to the union and say “hey someone used their credential from you and we let them do the following kinds of things, are you OK with that and are we OK with that?” It’s not an easy set of questions to answer, and it gets even more complex when we start chaining our systems together in unexpected ways.

These bubbles can also disconnect and re-connect to the greater foam. This is the value of the hyper-local decisions — once you’re on board, I don’t need to see your IdP all the time in order for you to log in. So if we go offline for a while, or your IdP goes offline for a while, that’s OK. But once we’re back online, I might want to check in with your IdP, especially if you’ve done something fishy inside my bubble. Cross-domain trust should come with cross-domain accountability.
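Here is a rough sketch of that disconnect-and-reconnect behavior, again in Python and again built on assumptions. The `Account` and `Bubble` classes are hypothetical, and `check_with_source` is a made-up callback standing in for whatever protocol the bubble would actually use to ask an upstream IdP whether it still stands behind an account.

```python
from dataclasses import dataclass


@dataclass
class Account:
    local_id: str
    source: str          # upstream IdP the account was bound to at onboarding
    enabled: bool = True


class Bubble:
    def __init__(self, check_with_source):
        # check_with_source(account) -> bool is a hypothetical callback that asks
        # the upstream IdP whether it still vouches for this account.
        self.check_with_source = check_with_source
        self.audit_log = []      # (local_id, action) tuples kept for later review
        self.needs_review = []   # accounts to check once their IdP is reachable

    def act(self, account, action, suspicious=False):
        """Allow the action on local state alone, and record it for accountability."""
        if not account.enabled:
            return False
        self.audit_log.append((account.local_id, action))
        if suspicious and account not in self.needs_review:
            self.needs_review.append(account)
        return True

    def reconnect(self):
        """Back online: ask each flagged account's source whether it still vouches."""
        for account in self.needs_review:
            if not self.check_with_source(account):
                account.enabled = False
        self.needs_review = []
```

While the bubble is offline, `act` never touches the upstream IdP, so local work keeps going. The audit log and the review queue are what make the later conversation with the source possible once `reconnect` runs.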

It’s Not A Technology

I truly believe that no one technology will solve this, for the simple reason that we will never get the online world to agree to one standard to address things, no matter how good it is or how much nerds love it. Any solution that requires us all to speak the same single protocol is doomed to failure.

Reality is heterogeneous, and we need to build for that heterogeneous world. The real value, then, comes in defining the interconnects. Interoperability occurs for a purpose and in a context. I believe that we can use a family of standards and technologies in a common pattern to build out the future of internet connectivity.

As a consequence, in this space I see room for OpenID Connect, Verifiable Credentials, SCIM, Shared Signaling, TLS, SPIFFE, FIDO, and many other moving parts. The bubbles should provide a common set of connection points into the larger foam in which they exist. Not every bubble is going to use the same connection points, but each point provides a specific set of functionality and addresses a specific problem. Even inside the bubbles there’s room for a lot of flexibility and innovation — how do I connect and verify my internal services, how do I spin up subsystems, how do I know who’s there in the first place?

Some of you reading might be expecting the bottom of this article to be a pitch of my new stealthy start-up that solves all these problems with some magic product I’m trying to sell you, but I’m sorry to disappoint your CTO that it’s not going to just come off the shelf. In all truth, I don’t know exactly what this solution looks like, but I’m eager to start building it to see what’s out there.

Justin Richer

Justin Richer is a security architect and freelance consultant living in the Boston area. To get in touch, contact his company: https://bspk.io/