Session Date/Time: 16 Mar 2026 01:00

The following is the verbatim transcript of the DINRG (Decentralization of the Internet Research Group) session held during IETF 120 in Vancouver.


Dirk Kutscher: Okay, good morning. We are starting shortly, please find your seats.

[Background chatter continues as attendees settle in]

Dirk Kutscher: Okay, good morning. Welcome to Vancouver, welcome to the IRTF DINRG session. So DINRG is Decentralization of the Internet. This group is concerned with analyzing root causes and phenomena of Internet centralization, and with studying different approaches for decentralization. My co-chair here is Lixia Zhang, and I’m Dirk Kutscher.

Just quick housekeeping notes as usual: so we are following the IETF IPR disclosure rules here. Essentially, this means that if you contribute or see anything that is related or potentially related to IPR, you are expected to notify us promptly.

We also have privacy and code of conduct rules in the IRTF. We have a privacy policy, and there is an RFC that describes our code of conduct, and also an RFC that describes our anti-harassment procedures. The IRTF has a special code of conduct that has more details on research ethics—that would also be good to observe.

And then just to remind everybody, so we are here in the Internet Research Task Force, so we’re doing research, not standards. Sometimes this is confusing because we are using similar procedures, type of documents, and so on, but please see the IRTF Primer, RFC 7418, if you are not sure about this.

Alright, here’s our mailing list info. So we have a note-taker, Shen-Jiao Li is kindly taking notes. So we have a shared online document that you can access from the agenda. Feel free to jump onto it and help with the note-taking. For the Q&A and also for the attendance tracking, please make sure you register with the QR code and also use the Meetecho queuing tools when you want to ask questions.

Okay, so we have a pretty interesting agenda today. So we have four talks and a bit of time for discussion. Just looking around, is there anything else that you’d like to add to the agenda? Any spontaneous ideas?

Okay. Good. So then we start with our first presenter, that’s Mallory Knodel, and she’s going to talk about From Standards to Users: Leveling the Field for Interoperable E2EE.

Mallory Knodel: Hi everyone. I’m Mallory Knodel. I am a PhD student at New York University. I also run the Social Web Foundation, which is a small nonprofit working on open social media protocols. And so my research and a lot of my work for the last many years has been on end-to-end encryption. I worked as an advocate at the Center for Democracy and Technology and have done related work for many years. I guess to disclose a couple of things: I sit on a group roughly called Privacy Roundtable run by Meta on behalf of WhatsApp encryption. I also inform Signal as somebody who participates in their annual and ongoing privacy work. And then as the Social Web Foundation, we have a grant from the Sovereign Tech Alliance to put Messaging Layer Security in the ActivityPub protocol.

So those are some of the things that I think have given me a positionality around this work. And so I wanted to talk a little bit about what interoperable end-to-end encryption really looks like.

Next slide, please. I’ll have you advance—it’s not working but it’s fine. Thank you.

Yes. So in fact, right, when I first wrote a piece that I linked to at the end, this was sort of just emerging. But, you know, I don’t need to tell you all that there have been a lot of external things that have happened. For example, the US Department of Justice was taking Apple to court over iMessage not properly interoperating with Google Messages. That has largely been resolved. One way it was also resolved from a technical perspective was that the GSMA has a standard specification for MLS in RCS, which is effectively what Apple told the Department of Justice it would now use, and Google also really pushed super hard for that, which is excellent.

Another thing that happened was the Digital Markets Act in the EU came into effect. It named specifically WhatsApp as a gatekeeper for messaging, and then WhatsApp had to write a reference offer in response to that about how it would allow or not inter-operating messaging apps.

So these things are moving. But they’re very separate, actually, right? I kind of refer to these as user contexts. So great that we have end-to-end encrypted messaging interoperability between iMessage and Messages—no more blue bubble, green bubble nonsense. Great that we might have WhatsApp interoperable with more services, although that hasn’t actually happened yet. But, you know, these things are not necessarily the same, right?

We also have a huge proliferation happening, right? We know from folks who are here that participate in the More Instant Messaging Interoperability working group, MIMI at the IETF, that Matrix was an early adopter of federated encrypted messaging. They have their own sort of protocol. As I mentioned, there are a lot of groups that have had this idea to take an existing user space and then use end-to-end encryption for the direct messaging feature, like in social media and so on. So we know that they’re proliferating. We also know that there are more and more apps out there. Great that if they have a messaging feature, they feel like encryption must be the default, right? It’s kind of like: if you’re going to make a new messaging app and it’s not using encryption, what are you doing?

So yeah, I wanted to mention that this is sort of the context, this is where we are now. Next slide, please.

Right, why am I giving this talk here? Because there’s Messaging Layer Security, there’s MIMI, this work is ongoing. I would say, from where I’m taking this talk and what I’ve learned and am understanding, there are going to continue to be gaps in the implementation of these different user contexts for messaging interoperability. So even if we have MIMI, even if we have WhatsApp’s reference offers and their white papers on how to do this, even if we have MLS and RCS, there’s still going to be a bunch of stuff that continues to fall between the cracks here, typically at the software implementation phase, and I think it’s of interest to DINRG because a lot of these things are fundamental constraints or realities of federated systems.

So that is sort of why I think this talk fits in a particularly different place from some of the other groups and why I’m grateful to the chairs for hearing about this and letting me have some time to talk about it.

So the work is ongoing. This is very nascent. We want to look at what are some of the things that we’re seeing when we look at these different user contexts and how they might come together. There should be some action or some work over the summer, over the coming months, with a lot of the individual platforms who are involved in this to understand what their issues are. This, I think, really feels a lot to me like the work we’re doing with interoperable federated social media, because it’s the same thing: like we can have the protocols, but then it’s only once people start building apps and then building clients for those apps and then building services that work with those different apps and clients that we really see that we need to all sometimes get together in a room and talk through like why things aren’t working for users the way they should and where some of the pain points are.

So this is what I understand from the challenges with end-to-end encryption interoperability from the perspective of security. This slide gives you an idea of our sense of what is—what needs some attention.

Private key management is a huge one. It’s probably the most salient issue. I spent many years training activists and journalists to stay safe online. This was way back when. And we taught them how to use PGP email. And it would take us five days sitting in a room. And it was hard because mostly it’s about key management, right? It’s about signing keys, uploading keys, sharing keys, oh my goodness. I stopped doing that for a variety of reasons, but one of the wonderful things that allowed me to stop worrying about it so much was Signal, and then subsequently WhatsApp, where a billion people now had access to end-to-end encryption, and one of the best things about it was that it hid key nonsense from the users. They didn’t have to think about it; it was trust-on-first-use, it did the key management for you. However, in order to hide key management from the user, you have to actually have control over the client. And in an interoperable federated system, you don’t always have control over the client. So this presents more issues, right? Private key management is going to be quite a big deal. And luckily we’ve had things like the key transparency activity in the IETF, and all that’s really helpful.
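
As a toy sketch of that trust-on-first-use idea (the class and method names are invented for illustration, not any real messenger’s API):

```python
# Trust-on-first-use (TOFU), minimal illustrative sketch: the first key
# seen for a peer is pinned; any later, different key is flagged.
# Names and the dict-backed store are hypothetical, not a real API.

class TofuKeyStore:
    def __init__(self):
        self._pinned = {}  # peer id -> pinned public key bytes

    def check(self, peer: str, key: bytes) -> bool:
        """Pin the key on first contact; afterwards accept only a match."""
        if peer not in self._pinned:
            self._pinned[peer] = key      # first use: trust and pin silently
            return True
        return self._pinned[peer] == key  # mismatch -> safety-number warning

store = TofuKeyStore()
assert store.check("alice", b"pk-1")      # first contact: pinned, user sees nothing
assert store.check("alice", b"pk-1")      # same key later: fine
assert not store.check("alice", b"pk-2")  # changed key: warn the user
```

The point in the talk is that a single vendor can bury exactly this logic in its own client; in a federated system, every independent client has to get it right on its own.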

Federation introduces the need for security assessments and trust incentives in additional abstraction layers. Because again, you can no longer assume that you have control over all the endpoints. Your users might be using different clients, and the people that are on your network are not always your users, right? So you cannot just assume, or enforce through code, certain security benchmarks, and you are no longer the only trusted party.

The third one is the bridging architecture. So this is the idea that you would delegate your credentials to an intermediary that would then be able to log in on your behalf, receive messages on your behalf, etc. That is a problem for people who are very intense about the end-to-end part of end-to-end encryption, right? We do not want our key material getting spread around. We do not want the endpoint to move off of our actual endpoint, out of our hands. That really disrupts the entire architecture of end-to-end encryption, right?

And then the last thing that I could think of, and there are probably more—I’d love to get folks in the mic line if you can think of more—already, most of the apps that I understand how they work are sending media, large files, pictures and so on, via transport encryption. They’re not end-to-end encrypting things like stories and video. I could be wrong about real-time, right? If you’re having a video conference with somebody on Signal or WhatsApp, I’m pretty sure that real-time media transfer is encrypted, just like Webex uses MLS for real-time media. But attachments and things like that go out of band. So not great. What does that look like in interoperability? Is there a way to level up that security model?

Next slide, please.

Um, yeah, I think this slide just goes into a little bit more about what we already know. I feel like some of this stuff has been explained in MIMI a bit. Identity is obviously a huge focus of what MIMI is working on. So we know that’s already there.

The trust model, I’ll just talk about that again: you need to have trust not just in a peer-to-peer situation—so I’m an encrypted messenger, you’re an encrypted messenger, we need to have trust between us, right? We also imagine that sometimes there are going to be clients and services and other intermediaries that need to also be aligned. So how does that work, and how do you surface that trust model to the end users, so that when they’re opting in to talking to an individual who’s on a different service from them, they know that there is that trust relationship and what the boundaries or constraints of those confidential and private communications are?

You know, downgrade pathways is kind of another way of saying: what happens when not all services are created equal from a security perspective? How do you do a proper negotiation when things are not exactly one-to-one? Do you tell users, you can’t talk to this person because they’re running an out-of-date client, or don’t talk to this person because they’re on a network that we don’t trust? How does that work?
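
One way to picture that negotiation problem, as a hedged sketch (the property names and the chosen baseline are invented for illustration, not taken from any specification):

```python
# Hypothetical downgrade-negotiation sketch: two endpoints advertise the
# security properties they support; a session proceeds only on properties
# both sides share, and is refused (rather than silently downgraded) if a
# required baseline would be lost. Property names are illustrative.

REQUIRED_BASELINE = {"e2ee", "forward_secrecy"}  # assumed minimum, not a standard

def negotiate(ours: set, theirs: set):
    common = ours & theirs
    if not REQUIRED_BASELINE <= common:
        return None  # refuse and tell the user, instead of downgrading
    return common

full   = {"e2ee", "forward_secrecy", "post_compromise_security"}
stale  = {"e2ee", "forward_secrecy"}   # out-of-date client, still acceptable
legacy = {"e2ee"}                      # no forward secrecy

assert negotiate(full, stale) == {"e2ee", "forward_secrecy"}
assert negotiate(full, legacy) is None
```

The design choice being illustrated is exactly the open question from the talk: whether mismatches degrade the session quietly or surface to the user as a refusal.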

We already talked a little bit about metadata—I talked about metadata in the context of the media, sorry. But metadata is probably going to expand. Metadata expansion I think is inevitable, because right now you can do a pretty good job of minimizing metadata if you have full control of your system, your users, the client they’re using, and your service. But once you start interoperating, one obvious way metadata expands is that if I’m interoperating with another service, they’re probably going to have to have some kind of metadata about me as a user even if I have not signed up for their service, right? They need to be able to track my identifier and other things like that. And again, because we need to resolve some of these mismatched security issues, that’s probably all going to be metadata intensive.

Yeah, verification and user experience gaps. We need to be able to tell users what’s happening without overwhelming them. We don’t want to go back to the days of PGP email; that would be a huge failure, right? So how do we strike that balance?

And then of course, of course, of course, with any kind of federated system—our beloved email is the best example of this—there are going to be content moderation issues and spam. And I know this is controversial: we’re talking about end-to-end encryption, and when we talk about end-to-end encryption, we don’t talk about content. But I’m using it in the broad term, right? You might think about spam as, you know, the number of attempts to send messages, behavioral signals, that sort of thing, without looking at the content.

So, end-to-end encryption properties must remain. That is sort of my point and my thesis, right? Authentication, confidentiality, forward secrecy, post-compromise security, conversation integrity, and verifiable endpoints. Like all of those things from a technical perspective are absolutely essential to call any system using end-to-end encryption actually end-to-end encrypted.

So next slide, please.

I sort of already went over this. But I think there were a few things that I understood happening in MIMI that I’m not entirely sure are part of the work there right now. One thing obviously they’re working on, or have worked on already, is defining common formats for cross-service messaging. That just makes sense, right? That’s effectively the same as the ActivityPub protocol for social media: Activity Streams gives you a container for certain kinds of content that is common across all those things. So that seems done and obvious. The next one, working on identity and identifiers, is also very much part of MIMI’s charter if I remember. You effectively need something like the email format, right? You need something that’s common across all of them, and a way to deal with that.
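
The common-container idea can be sketched as a toy mapping (the field names and service names are invented; this is not the actual Activity Streams or MIMI format):

```python
# Toy "common container" sketch: each service maps its native message
# shape into one shared shape that any other service can render.
# Service names and field names are hypothetical, for illustration only.

def to_common(service: str, native: dict) -> dict:
    if service == "svc_a":
        return {"from": native["sender"], "type": "text", "body": native["txt"]}
    if service == "svc_b":
        return {"from": native["author"], "type": "text", "body": native["content"]}
    raise ValueError(f"unknown service: {service}")

a = to_common("svc_a", {"sender": "alice@a.example", "txt": "hi"})
b = to_common("svc_b", {"author": "bob@b.example", "content": "hi"})
assert a["body"] == b["body"] == "hi"  # both render identically downstream
```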

However, then I think that the bridging of MLS-based systems—that is admittedly kind of a nebulous category—but I think the idea is that just because something is running MLS doesn’t mean that it’s interoperable, right? There’s a lot of work that needs to be done on top. That’s effectively what MIMI is trying to do. But are there things beyond identity and formats that need to be dealt with in MIMI? I can think of one, for example, that I don’t think is in scope: going back to clients, making sure that your users, and the users they are talking to, are running up-to-date security software is something that needs to happen, right? That’s something that can potentially be standardized between service and client in a way that can facilitate interoperability. But when it was proposed as work in MIMI, I don’t think it was adopted as an area of work. So that is just something that’s falling between the gaps. And I think overall the goal is that we want to reduce the fragmentation of these user contexts and move everybody towards using the same protocols. So MIMI is doing that, but there may be other places where interoperability is happening. For example, iMessage and Messages aren’t using MIMI, right? They’re using RCS. And that fragmentation I think ultimately can be negative for the user space.

Next slide, please. And I think we’re coming up to the end. This is a lot of what I already said at the beginning, which is why we’re doing this: beyond the crypto algorithms and the transport or IETF-based protocols, platforms are doing all of these things. It’s really important to sometimes get the implementers in a room to talk about some of these other issues so that we can confront them, so that we can make good product design choices, so that we can effectively govern some of these things together and build relationships. So that’s the work I think that’s ahead. And I think the next slide should just be a wrap-up.

No, let’s—let’s skip this. I feel like I will just repeat myself if I go over that again.

So yeah, maybe some questions for this group, actually, to end on. I think these are the main open research questions. Maybe there are other ones we can add to this list. But since this is a research group, I’m curious to hear what people think about what a group like this might do, or if there are other venues for dealing with this. But these are the things that I think are open. Which end-to-end properties survive cross-system interop? That was on the previous slide about the definition of end-to-end encryption. How do we deal with downgrade paths, clients versus services, etc.? Keys: figuring out how we deal with private keys would be a huge breakthrough. And then I think measuring deployment risk is going to be really major, because at least in the Digital Markets Act, the EU basically has to just trust what WhatsApp says when it says "We’ve chosen not to interoperate with these services because of security," and they just have to accept it, right? But I think there’s a way that we can maybe standardize or have some transparency around what deployment risk actually means, so there can be some watchdogging of that process.

Anyway, let me just end there because I think I’m over time. I appreciate it.

Dirk Kutscher: Okay, great. Thanks, Mallory. That was a really great talk. And we have questions.

Vittorio Bertola: Hi, Vittorio Bertola. I wanted to flag another couple of issues that have emerged from the European attempts to make this happen. The first one is that since we’re doing managed end-to-end encryption—so the app is managing cryptography for you—there’s still the gap with the app. And it’s important at least to make the users aware, because I see a lot of users saying, you know, "My app does end-to-end encryption, so I’m secure and they cannot read my messages." In fact, your app can still read your messages and scan them, because it gets your messages unencrypted and it needs to decrypt the other person’s messages to show them to you. And at that point in time, it can scan them, flag them, forward them to a server, whatever. And the same goes for the app on the other end. So from one viewpoint, Meta has used this as a sort of FUD, saying, "You know, we don’t want to open to interoperation with other messaging apps because we cannot trust the other side’s app, maybe." But on the other hand, we need at least to make users aware.

And the second thing, which I also don’t know how to address, is that the dominant apps are trying to create problems for interoperation at the policy level. So maybe this is out of scope, but we’ve seen Matrix not being able to interoperate with WhatsApp yet because of a requirement that Meta is posing that they have to check the location of their users: as soon as a Matrix user exits the European Union, Matrix has to disable interoperation, because the law doesn’t apply outside Europe. Which is clearly a way to make this fail. I don’t know if there’s any solution we can work out for that, but I mean...

Mallory Knodel: Yeah, I agree. These—these things are exactly why I think we need to confront these things. So, you know, I think some of the—I don’t want to characterize resistance, but just like, you know, I don’t feel like I’ve effectively been able to table some of these issues in the IETF so far because they are either too endpointy and not networky, or they’re—they’re all like implementation choices, right? And so what I’m trying to say is that we—I think we need to confront some of these issues, otherwise what’s going to happen is the policy hooks aren’t going to be effective because the products—the major products are just going to continue to work however they want to work.

So this is why I—I mean, just maybe there are many—I hope I was convincing that there are other reasons to do this, but one of the major ones is so that there can be some transparency and accountability for the regulations that are pushing this through. Because we already know this would be good for end users, right? I think we can all agree that this would be a good thing to not have these siloed user contexts. But it’s challenging, no doubt there are security issues, it’s challenging on other levels, but doesn’t mean we shouldn’t do it. Okay, great. Thank you so much.

Dirk Kutscher: More questions while I... oh, there is... oh yeah, hi Lixia.

Lixia Zhang: I’m just sitting here taking advantage of that. I think it’s a really great talk. Thank you very much. And you put on the research question: what’s the right venue? I’m here because I really think this group should be a good place for it. When we talk about decentralization, the fundamental part of it is how you handle identity and keys. And so far, I don't think I’ve seen anything like a widely accepted solution, right? I was mentioning it earlier about the name and identity.

So right now, I think the way people are using it, basically email, right? Your email address with your—your affiliation, your school, you know, those are actually really important. So maybe the end-use case is the hard one, but the intermediary case where, yeah, it’s not maybe just email but maybe federated social web? Like maybe how—how do you see that fitting in with your suggested recommendations?

Mallory Knodel: That’s a great question. The fact is: who are the email providers? That is the question.

Lixia Zhang: But if it’s namespace, it might all be Gmail, but at least, you know, I get an @nyu.edu at the end of my name.

Mallory Knodel: I mean, email address essentially is DNS plus your user ID, right? But then fundamentally, who controls that email address? NYU? Google? Google so far is so great at providing email services to, I don’t know, for billions of users. But should we stay in that situation?

Dirk Kutscher: Okay, great. Thanks very much everybody. Yeah, we apologize for the glitches in the morning. So we wanted to have more time to discuss these things, but it didn't work out. But so if people are interested in discussing this more—so we had I think many really inspiring talks today. Maybe please contact us or use the mailing list. We are around for the whole week, maybe we can, you know, get together later in the week and continue the discussion. Thanks very much, and this concludes DINRG this morning.


Dirk Kutscher: One moment, let me get the next presentation. Okay, as I noted, we’re having some network Meetecho problems. Maybe you have noticed already. So we’re going to try a remote presentation now. If it doesn’t work, we have to change the order and hope it will be better later. But let’s—let’s try. So next would be Saidu Sokoto, talking about Open or Blocked Skies? Community Moderation Practices in Bluesky.

Saidu Sokoto: Okay. I just requested screen share, I’m not sure whether that works.

Dirk Kutscher: Ah, do you want to use your own screen or do you want me to use your...

Saidu Sokoto: Yes, I prefer that, yes. I hope it works.

Dirk Kutscher: Could you try it again? Because I didn’t see the request.

Saidu Sokoto: Okay. Yes. Entire screen. This one. So, I hope you can see my screen now.

Dirk Kutscher: We can see your PDF.

Saidu Sokoto: Okay, this one. Can you see the presentation?

Dirk Kutscher: We can see the first slide.

Saidu Sokoto: Okay. Hello everyone, and thank you for the opportunity to speak today. My name is Saidu Sokoto and I am a PhD student at City University of London. Unfortunately, despite receiving an IRTF diversity travel grant and making every effort to attend, I was unable to obtain a visa due to stringent requirements for certain nationalities. So I have to present remotely. Before I start, I would like to express my sincere appreciation to the IRTF for their support.

Our work is titled "Open or Blocked Skies? Community Moderation Practices in Bluesky." This is the result of joint work with many others, without whom it wouldn’t have been possible. Since Bluesky became public, we have been studying its evolution through a series of papers: the first was an IMC paper in which we studied the network growth and architecture, and we also had an ICWSM paper in which we studied early user onboarding and feature adoption. Today my talk is on our most recent work, which focuses on the specifics of delegated or community moderation. We look at individual blocking and blocklists and how these collective tools shape visibility and interaction at scale.

To understand why community moderation in Bluesky actually matters, it helps of course to look at how different social media architectures operate, particularly in terms of moderation. In previous DINRG meetings, we saw a lot of discussion about the Fediverse and about Bluesky. There was a birds-of-a-feather session on Bluesky, and there’s a charter to make it a working group. But traditionally, platforms like X or Twitter follow a centralized model. We have a single operator that has global control over access, visibility, and moderation. This gives them strong consistency and very fast enforcement, but unfortunately, it concentrates power.

A second model is the Fediverse as seen in platforms like Mastodon. Here control is usually distributed across servers or instances. This reduces platform-level centralization, but moderation authority is usually tied to a specific instance or the instance operator. So power is distributed, but at the end of the day, there’s this implicit centralization per instance.

Bluesky, which is relatively recent, represents a further step which decomposes the architecture. So we have functionality that’s separated into modular services that users can mix and match. In particular, public moderation with primitives like labels, blocklists, individual blocks, and feed generators can be provided by independent actors. Users choose whichever they want.

Focusing on moderation, labels are basic signals added by moderating actors to content or accounts. I think at IETF 112, Gareth gave a talk about moderation labels: they do not automatically remove content, but instead allow for content filtering, with users deciding what to expose.

To actually understand what things look like, it’s important to know the differences between individually blocked accounts and accounts from blocklists. At a simple level, a user can directly block another account. This creates a moderation signal that affects visibility and interaction between the two users. Then there are the blocklists introduced by Bluesky—of course blocklists are not unique to Bluesky, but Bluesky is the first social network to actually make them public, and so we’re able to study them at scale.

So this additional layer introduced by blocklists makes them delegable mechanisms. Users can subscribe to lists curated by other community members. So basically clicking subscribe means that you inherently or effectively inherit all blocking decisions made by the list curators. So decisions can scale from individual actions to collective filtering mechanisms. I had some animation here but unfortunately, it doesn’t work.

So relying on a full snapshot of the network from April 2024, we look at the scale of community moderation, covering 34 million users, around 1.2 billion posts, and 40,000 blocklists. This gives us a complete moderation event history up to that month. We see that while individual blocking is widespread, with around 119 million individual block relations covering around 12.6% of the users, blocklists amplify the effect by several orders of magnitude.

Blocklist subscriptions create over 16 billion unique block edges, which is almost 100 times more than individual blocking. So a single list maintained by one user, as we can see here, can shape the experience of thousands or even millions of subscribers. Delegation massively amplifies moderation impact. Additionally, from here we see that community actions also exceed enforcement volume, which is the takedown you see here in green. These are the takedowns by the Bluesky team. We can call these centralized actions. So although they’re much fewer in absolute numbers, they mirror similar temporal patterns, which means that centralized and decentralized moderation mechanisms probably respond to the same stimuli. So although takedown actions probably target extreme cases such as those mandated by law, they seem to follow the same trend as community moderation.
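
The delegation effect described above can be sketched with a toy computation (tiny invented data, not the paper’s dataset): the effective block edges are the union of individual blocks and every (subscriber, listed account) pair induced by blocklist subscriptions.

```python
# Toy illustration of blocklist amplification (made-up data): subscribing
# to a list means inheriting every block decision its curator made, so a
# handful of individual blocks turns into many effective block edges.

individual_blocks = {("u1", "x"), ("u2", "y")}

blocklists = {"listA": {"x", "y", "z"}}        # list id -> accounts it blocks
subscriptions = {                              # user -> lists subscribed to
    "u1": {"listA"},
    "u3": {"listA"},
    "u4": {"listA"},
}

edges = set(individual_blocks)
for user, lists in subscriptions.items():
    for lst in lists:
        for target in blocklists[lst]:
            edges.add((user, target))          # inherited block edge

# 2 individual blocks become 10 unique effective block edges.
assert len(edges) == 10
```

On the real data the same counting takes ~119 million individual relations to over 16 billion unique edges, which is the roughly 100x amplification the slide shows.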

Beyond measuring the scale of community moderation, we also examine the characteristics of accounts that appear on blocklists. We will see why this is important in a moment. But blocklisted users are substantially more visible. They have much higher activity and engagement than other users. Apart from being much more active, toxicity signals are also higher on average. So we measure the toxicity of the users and we see that those on blocklists are actually much, much higher. Looking also at the distribution of topics discussed by users—all the users, not only those on blocklists—we find that political content seems to be more represented among accounts that appear on blocklists. We measured this against a baseline defined as the topic distribution across all the posts in the network.

These are some figures to show you the level of toxicity on the network for accounts that have never been blocked, those that have been blocked at an individual level, and those that are included in blocklists. For all the dimensions of toxicity we measured, those on blocklists are much higher. Similarly, this shows that the topics discussed by those on blocklists are probably more sensitive than those discussed by users that are not on blocklists. Adult content, politics, crime, and law tend to be much more prevalent than among the baseline users, those that are not on blocklists.

But why do you think this is important? From the previous results, we know that there are definitely clear differences between accounts that are blocklisted and those that are not. Well, this tells us that blocking decisions can have a meaningful impact on visibility and interactions across the network. Therefore, knowing which type of users might be affected sheds light on the power held by these blocklist curators. The natural question you might ask is: who then is actually making these moderation decisions? Who’s controlling these blocklists? It turns out that only 0.043% of users create blocklists. Even worse, the top 100 blocklists by subscribers cover over 66% of all subscriptions, and these top 100 lists are created by only 79 individual users. So top blocklists cover a large share of subscribers, and this means that any errors the curators make can propagate and affect a lot of users. So we start to see trends of centralization again, even with blocklists.

But it’s not so bad actually, because looking further, if moderation power becomes concentrated in the hands of a small number of blocklist curators, we might expect that this will maybe significantly reduce participation or it might fragment the network. However, when we examine behavioral response, so what’s the impact of being blocked? What’s the causality? When we do the causality analysis using PSM, for example, what do we see? We see that the picture is actually more nuanced. Surprisingly, community blocking can increase the popularity, activity, and toxicity of the users. So what this means is that community blocking can be an effective way of not actually silencing users but shaping a user’s social environment without silencing them.

So beyond the causality impact, we also investigate the impact on the social graph. So what happens when you block, when you add a user to a blocklist? In Bluesky right now, it doesn’t affect the follower graph, but assuming we remove those edges or we remove those users, it turns out it has a very small impact on the social graph. Users tend to block, maybe at the individual level, other users whom they probably never interact with. So this is actually important, although we see trends of centralization, at the end it doesn’t silence the users.

So I have just really touched a small aspect of our paper. We have a lot of findings and I do encourage you to read the full paper, which is available if you scan this spark code. But in general, we see that blocking is widespread and it’s being used a lot. It shapes and covers more than 90% of content visibility. It does target the most active and polarizing users, but just being active is not the signal that you'll be blocked. No, blocked users actually have their own features and we do show this in more detail in the paper. We do nearest neighbor analysis and many other analyses to show that blocked users actually have their own unique features. But the nice thing I would say is that it doesn’t silence users. It rather segregates, it gives... we do propose some possible reasons why this is the case, but this is probably open to discussion as I mentioned here. Are they different interpretations for why blocked users become more active? So it might be this Streisand effect, maybe their communities rally behind them after they’ve blocked. Users are usually not informed when they’re blocked on social on Bluesky, but then there are other ways to find out. But it’s actually a very nice thing to see that users are not actually silenced.

This brings me to the end of my presentation. Please let me know if you have any questions. And do read the full paper, please.

Dirk Kutscher: Okay, great. Thanks very much, Saidu. And we apologize for the audio problems, and yeah, it’s a pity that you couldn’t make it here in the end. But thanks very much for your presentation. Do we have questions? Okay, that’s not the case. And Lixia could follow up with Saidu offline. There were some questions, some comments in the chat, Saidu, maybe you want to take a look at those.

So we continue with our final talk, and that will be given by Lixia. Thanks, Saidu.

Saidu Sokoto: Okay, thank you very much and thank you for this.


Arno Taddei: Hi again. I’m still the same person: Arno Taddei from Broadcom. This time I’m here together with Frédéric, who you can see on the screen; he will speak later. For Frédéric it must be, I don’t know, 2:00 or 4:00 AM in the night, so we thought it would be best if I proxy for him here. But he is the real person behind the presentation I am going to give now. It’s about Designing Human-Centric, Decentralized Digital Infrastructure for the AGI Era, and you see the term “Omega Phase Framework” that I will introduce today.

So just to explain about ourselves: Frédéric is an entrepreneur, not a researcher; he has not worked in academia. He has done all sorts of things in his life, good, bad, and ugly, lucky and unlucky, like any business person. He is a bit more senior than I am, and I’m not that young anymore, but he has sincerely tried to address this problem, and we see common ground with DINRG. So the question we are asking here is: can it help DINRG? Since neither he nor I are academics, what is proposed here is offered to help the community. If people want to help us improve it, or if they are unhappy with it, please help us improve it. But I think we have something that could be worth consideration for DINRG in the future.

Next slide. So for the big picture, the starting question was: why do some systems (biological, social, or digital) survive while others collapse and disappear? So it’s about evolution. If we project from the past through today into the future: in the past, human lives did not depend on digital life at all, 0%. Today, let’s say we are at 10%. By 2035, let’s say 50%. And in the future, we could be at 100%. So as time goes on, we find ourselves in a critical transition for human civilization, and we are moving toward a synthetic evolution. It’s quite interesting to consider that we are blocking our own evolution: if Mother Nature had the freedom to help us improve, we are basically preventing that from happening, because we are moving ourselves toward a synthetic life.

And the problem this is causing is that the trajectory is not at all linear. It was linear for millions of years, and now it is not. So let’s continue on that line. Next slide, sorry.

So for 40 years, we built the internet as a network that connects machines, okay? The next phase is that we need to connect intelligence, both human and artificial. By intelligence, I mean the Alan Turing view of it, the Turing test: at some point we will not be able to distinguish a human from a machine, and at that moment we could call it intelligence. We can debate that; it doesn’t matter. The point is that we are going to have autonomous agents, agentic AI, advanced agentic AI, and humans on a shared common infrastructure. The question is: what infrastructure do we need when intelligence itself becomes a network phenomenon, and when some of that intelligence is no longer human but artificial? Do we need another layer? That’s the question we are asking ourselves here; let’s call it the “sovereignty plane.”

So let’s continue. Today, we don’t have a sovereignty plane. We have individuals connected through devices, and in the middle there is some magic that is most of the time centralized and does not have the right properties, connecting us to other things and to corporations. So if we go to the next slide: what if we had another plane, a sovereignty plane, where we take back our control, where organizations take back their control, where human intelligence, agentic synthetic intelligence, and corporations sit on the same sovereignty plane? That’s the proposal.

So if you go to the next slide: the problem, and we could not find a better word for it, is that we need accountability between our intelligence and the synthetic intelligence on the other side. For the sake of it, at this stage we call it “entanglement.” Right, so you can go to the next slide. The problem of entanglement is that we have no protocol-level mechanism to answer three fundamental questions when we want to move to a sovereignty plane. Who is responsible for the agent, that is, which human organization bears accountability for its actions? What are its boundaries, what is it authorized to do and what is it not? And how is it identified, not the machine it runs on, but the agent itself as a persistent and accountable entity? This is not just a feature gap, it’s an existential architecture deficit.

So if we go to the next slide: if every autonomous agent is not entangled with a responsible human organization, bound to it through cryptographic identity, accountability protocols, and enforceable boundaries, then we are building a network of unaccountable intelligences. So every agent must be entangled with a human principal, and critical data needs to remain under human control. That’s the central point of this proposal in terms of acknowledging the architectural deficit.
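The entanglement idea, binding an agent to a responsible human principal through cryptographic identity and enforceable boundaries, could be sketched minimally as follows. This is purely illustrative: the record fields, function names, and the use of an HMAC as a stand-in for a real signature scheme are my assumptions, not anything specified in the talk or the book.

```python
import hashlib
import hmac
import json

def entangle(agent_id: str, principal_id: str, boundaries: list, principal_key: bytes) -> dict:
    """Create a record binding an agent to its accountable human principal."""
    record = {
        "agent": agent_id,          # persistent identity of the agent itself
        "principal": principal_id,  # human/organization bearing accountability
        "boundaries": boundaries,   # what the agent is authorized to do
    }
    payload = json.dumps(record, sort_keys=True).encode()
    # The principal signs the record; HMAC stands in for a real signature scheme.
    record["sig"] = hmac.new(principal_key, payload, hashlib.sha256).hexdigest()
    return record

def verify(record: dict, principal_key: bytes) -> bool:
    """Check that the agent-principal binding is intact."""
    body = {k: v for k, v in record.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(principal_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record["sig"], expected)

key = b"principal-secret"
rec = entangle("agent-42", "org.example", ["read:calendar"], key)
assert verify(rec, key)  # binding intact; tampering with any field breaks it
```

The point of the sketch is only that accountability becomes checkable: any change to the agent's identity or boundaries invalidates the principal's signature.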

On top of that, we have converging pressures. The first is data gravity: our data corpus, my health data, my financial data and so on, comprises the most valuable assets we possess, and yet it resides entirely inside infrastructure that we do not control. If I consider this an asset: even when I die, my heirs have absolutely no way to access it or do anything with it. It’s quite crazy when you think about it.

Agent proliferation is another big pressure, because trillions of agents are probably going to arrive on this new platform, and they are persistent agents. They have memory, they have behavior, they have planning capabilities, they can take multi-step actions, and so on. Each of them needs context, identity, and a trust boundary. The current architecture provides nothing at all for that. This is probably why we have so many side meetings on agentic AI at this meeting again.

And there is a sovereignty demand: individuals, organizations, and governments are increasingly unwilling to cede control over identity, their data environment, computational context, and so on. This is not a political sentiment; that’s the mistake we should really acknowledge. It’s a structural requirement for any system where autonomous agents operate on behalf of principals.

So we have three forces here. Maybe you could find more, but these are the three big ones we identified. The center of gravity of computation, and crucially of accountability, should shift closer to the individual and to the edge of the infrastructure. Decentralization should become mandatory.

So let’s go to the next slide. What about having a new layer? Let’s call it the “internet layer split.” I joked with Fred that we should call it Skynet; he disagreed, so we should call it Bronet or something like this. But you see where I’m going. Carriers, we know what that is; this is what we built as the internet. Cloud and virtualization was the big new thing that happened with all the clouds we got. But now we are missing a layer that consists of our personal and organizational computing environments, maintaining local data, local compute, local AI, local things that belong to us.

We do not have a lot of time. This is part of a book that is pretty comprehensive; I will say more about it a bit later. There is a lot we could discuss about how to do this, but it basically comes back to agency, accountability, and human control. Something belongs to me, but I need federation as well. That’s exactly the purpose of this.

So here, the nice point is that if we took this approach, the internet would not disappear; its role would just clarify. The carrier plane transports, the cloud plane computes, the sovereignty plane governs mission-critical agency.

So let’s go to the next slide. To do that, there is a framework called Phase, which stands for Federated Artificial Substrate Evolution. It consists of four axioms: persistence, adaptive complexity, tensi-stability, and continuity. We don’t have time here, but there is measurability, there is a discussion of falsifiability, and in fact it unites three disciplines that are rarely joined: complex systems, information physics, and the ethics of technology. What I want to do is just illustrate these four axioms so that you get a feel for them. There is quite a lot behind them.

So if you go to the next slide: the first axiom is the Persistence Law. In clear text, it simply means every system that endures must continuously restore order faster than disorder consumes it. The math behind it involves the time derivative of the system’s usable information (I mean its working knowledge about itself and its environment). Normally you would have S as entropy, but here we use U for uncertainty, over time. So basically, the system must gain useful information faster than its uncertainty grows.
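As the slide itself was not shown clearly in the session, one plausible rendering of the persistence condition as verbally described, with I as usable information and U as uncertainty (my reconstruction, not necessarily the book's exact notation), is:

```latex
\frac{dI}{dt} > \frac{dU}{dt}
```

That is, the rate at which the system gains working knowledge must exceed the rate at which its uncertainty grows.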

Let's go to the next slide. The second one is the Adaptive Complexity Law. In clear text, it means evolution favors systems that maximize information gain per unit of energy. I would never have thought I would see that in my life, but when you consider the energy consumption that AI needs, for example, it’s an interesting one. Eta is the prediction efficiency, I is the mutual information about survival-relevant states, and E is the energy expended. So basically, you need a certain prediction efficiency per unit of energy.
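Given the definitions just stated (eta as prediction efficiency, I as mutual information about survival-relevant states, E as energy expended), the relation as verbally described is plausibly (again my reconstruction, since the slide notation was not legible):

```latex
\eta = \frac{I}{E}
```

Evolution would then favor systems that maximize eta, i.e., information gained per unit of energy.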

Let’s go to the next slide. The third one is the Tensi-stability Law. There is a glitch in the presentation, so let me explain. T, the tensi-stability, is the product of three things, I, P, and A: information grasp, power, and autonomy. Epsilon is the ethical damping. In essence, what we are saying is that there is a threshold beyond which any system will grow if it is not managed by something, and the proposal here is to have ethics as a damping, limiting factor: internalized constraints that slow, soften, or block harmful actions. What we want is to make sure we never reach the point where the system becomes unstable. That’s the stability condition: the threshold where the system’s error-amplifying capacity would exceed its error-correction and monitoring capacity.
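Since the slide was glitched, here is one plausible reading of the relation as described, with I, P, A multiplied together and epsilon acting as the damping divisor, and stability requiring T to stay below a critical threshold (a reconstruction; the book's exact form may differ):

```latex
T = \frac{I \cdot P \cdot A}{\varepsilon}, \qquad T < T_{\text{crit}}
```

Stronger ethical damping (larger epsilon) keeps T below the instability threshold even as information grasp, power, and autonomy grow.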

And the fourth axiom, next slide, is where the DINRG scope comes into context: stability at scale requires distributed control. That’s the continuity law, called Omega, and now you see the Omega part of the Phase framework. In fact, this is the same math as before, but rather than using epsilon alone as the effective damping, we modulate epsilon by Omega, which is the distributed, decentralized aspect of the equation. In this way, you control the amplification the system would otherwise produce not only through properly organized ethics (and what we mean by ethics here is quite specific) but also through Omega, the decentralized aspect.
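If the tensi-stability relation is read as T = I·P·A / epsilon (my reconstruction, since the slides were glitched), then modulating the ethical damping by the decentralization factor Omega, as described, would give:

```latex
T = \frac{I \cdot P \cdot A}{\varepsilon \cdot \Omega}
```

On this reading, decentralization (Omega) and ethics (epsilon) jointly damp the system's amplification, which is exactly the claimed DINRG connection.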

And that’s why, when I saw that, it ticked something in my brain. I thought this would be so cool for DINRG, because I think it is really in scope, and it is where we could start to have a tangible platform of mathematics. And it’s not just mathematics: in the book itself you will find really striking examples, because Frédéric is a practitioner, so it’s not just theoretical. He shows this with real data, with metrics, with very good examples from industry, companies, hospitals, all sorts of things, where he demonstrates each of these four points, provides data, and provides minimal ways to measure. So there is a lot to consider here, but of course we don’t have time now.

So that is what really attracted my attention for you. Let’s continue. Of course you will tell me, “Yeah, but is it just theoretical?” Actually, no. There are things here we could do in terms of a real device, and maybe we are already seeing this happen with what Jony Ive, the former designer of the iPhone, is trying to do, having been spotted by Sam Altman. I think Jony’s first startup burned 200 million, and then he made another one and Sam put in 6.5 million. So there is a real battle on that, already now. It’s not at all science fiction.

To design a solution here, of course you need to design a very specific chip. You will see in the book, in the etherum part, which I cannot go into now, that there is a real technical proposition for doing that. So it’s not at all science fiction; there are in fact patents and a lot of things behind it. But I think this is something we should understand in terms of: who do we want as our AI companion in the future? And this approach gives some kind of framework to get us to the right place. Let’s continue.

So, the research agenda. There are tons of things here that this group could do. There is agent-to-principal binding, which would be one area of research for this group; sovereign node authentication... ah yes, I forgot to explain, if you go to the previous slide, sorry: all of this is not just about having your device with you. At some point you must connect it. So imagine a device with very specific technical characteristics on the chip in terms of its connectivity, its power consumption, its compute capabilities, its storage. Storage is critical here. By the way, there is a clear mention of Solid, which was presented a few DINRGs ago. Fred notes that Solid, which Tim Berners-Lee, Bruce Schneier, and Adrian are working on, is in fact the software side of what etherum could do on the hardware side. There are quite a number of things here that we don’t have time to cover, and I don’t even know if I’m late; I probably am. Let me continue.

Behind this node there is also a whole federation mechanism: what happens when there is a major power failure on the grid, and how can you still connect, maybe by satellite, and so on. So there is real deep thinking about what kind of system it would take to do that. Agent interoperability could be another area, distributed computation, and identity portability. Let me check that I’m not forgetting anything, but there is quite a lot here that could make an interesting set of topics for this group. And frankly speaking, even just the theory of it, the math of it, we would love to see properly researched, because as I said, Fred is not an academic and I’m certainly not an academic either. But there are certainly things to consider.

So let’s go to the last slide, perhaps, because I’m probably late. What is at stake here is that we need to find a way to manage agency with the proper sovereignty model in the middle, because, as we said at the beginning, we are now connecting intelligence at massive scale, and this is happening at unprecedented speed. In 50 years, when history looks back at us, the question will not be about the technology, whether this happened or not. The question will be whether we humans decided to keep control of what happens. That’s the real question. Perhaps, back to Joss’s point just before about market incentives: do we have the right conditions to make it a real choice, and not be manipulated by forces behind us that want to crush us as humans? Will humans remain sovereign nodes in the network of intelligence, or will we become passengers in a system we can no longer govern?

Let’s go to the last slide, I believe. Next slide? Yes. So that’s my concluding slide: every agent must be entangled with a human principal, and critical data needs to remain under human control. Thank you very much.

Dirk Kutscher: Thank you, Arno. Very inspiring talk. Do we have questions?

[No immediate questions from the audience]

Arno Taddei: And perhaps I can give Frédéric the chance to intervene?

Frédéric: Yes. Okay. Thank you, Arno, wonderful presentation. I know we were not very technical; we stayed away from the technology. But I think every 20, 30, 40 years, we have to go back to first principles. I have been working on decentralization for 20 years now, and what we are observing is a kind of uncontrollable chaos. Every week, every week we experience it. Last week, for example, Strike got hit and 200,000 machines were taken out. Today there were two other big incidents. We are not really in control of the infrastructure anymore because, as Arno said before, the infrastructure was built for machines, not to run a society. And I think a federated system, going back to the roots of the internet, when the internet was a decentralized system, is now an opportunity, but not in the current environment, where we have hyperscalers, where basically 10 to 20 companies decide how the entire technology is rolled out and how it works on society. So it is no longer us who define how society evolves.

And I have a little criticism, especially of Meta and social networks. Social networks have basically damaged an entire generation, because that generation was not ready to deal with them. We never built a culture of social networks, for example; we just unleashed them. Now we’re unleashing AI agents without end, and it’s becoming exponential. I have five agents working here on my desktop where I’m sitting, doing a lot of work for me in the background, and I wonder what would happen if I connected them to my bank account. So I think this research group has a unique opportunity to revive decentralization, which is in its name.

So for this, I have tried to build a quantitative framework that gives us the ability not to have opinions but to calculate what the outcome of our decisions would be. That’s basically all. I say thank you to Arno, and if there are questions, I hope I can also address them after the meeting. Arno will certainly share all my information, and we’ll see how it goes. Thank you so much.

Dirk Kutscher: Right, thanks very much, Frédéric. So we do have a question in the room. Jump on.

Speaker 1: Hi, thank you, very interesting presentation. I think this is more for the future, setting the ground for what the management of AGI will be. As of today, AGI is really compute-intensive, so I think it has to be centralized by nature, because I don’t see it as feasible on personal or portable devices. The amount of compute needed to implement AGI is in the hands of very few corporations, and breaking that centralization today is really complex. But setting the ground for the future, and designing what will be the infrastructure for AGI once it becomes more manageable, is an important topic.

Arno Taddei: Thank you. One thing I should say is that in the book, Fred makes exactly the same remark. We had the same issue with mainframes a long, long time ago: when we started, we had mainframes, then we got smaller machines, PCs, and distributed apps. So let’s say I believe in gravitation; I think the same will happen for AI as well. One more thing I want to take the opportunity to mention is that the book is quite substantial, something like 30-odd modules. So this is the book, and Fred, would you mind making an offer to people who want to access the book for free, maybe?

Frédéric: Yeah, yeah. I have made it free for another four days during the conference on Kindle; you can download it on Amazon free of charge.

Arno Taddei: Right, so you can have access to the book. It’s well done because there is an easy way to read it for many audiences: regulators, technologists, mathematicians, philosophers, engineers, and so on, and it helps you navigate. You can skip the maths if you want, or keep it if you don’t, and so on. So there is quite a lot we could try, and again, I think it could be worth having something like this as a platform for a quantitative approach to decentralization.

Dirk Kutscher: Fantastic. Thank you both.


Dirk Kutscher: Okay. It looks like the network has stabilized a little bit, so I’d like to try Saidu again. Saidu, you have control of the slides. Please present.

Saidu Sokoto: Okay. Sorry for the interruption. I’ll just continue from where I stopped; I hope everyone can hear me. Yes, this is where I was when the remote connection was lost. As I was saying, we’re all familiar with centralized networks: they have unlimited power, and of course this is not something we really like. There has been a lot of discussion about the Fediverse at recent IETF and DINRG meetings specifically. However, Bluesky introduces something different: blocklists. Apart from blocklists, they also have individual blocks and labels. Bluesky is decomposable; functionality can be separated into modular services that users can mix and match. So public moderation primitives like blocks, blocklists, labels, and feed generators can be provided by independent actors, and users choose whichever they want. In terms of moderation, I think at IETF 112 Gareth gave a talk about how moderation labels work: they do not automatically remove content, but instead allow for content filtering, with users deciding what to expose.

Yes. So to understand what things look like, it’s important to know the differences between individually blocked accounts and accounts on blocklists. At a simple level, a user can directly block another account. This creates a moderation signal that affects visibility and interaction between the two users. Then there are the blocklists introduced by Bluesky. Of course, blocklists are not unique to Bluesky, but Bluesky is the first social network to make them public, and so we’re able to study them at scale. This additional layer makes blocklists delegable mechanisms: users can subscribe to lists curated by other community members. Clicking subscribe means that you effectively inherit all blocking decisions made by the list curator. So decisions can scale from individual actions to collective filtering mechanisms. I had some animation here, but unfortunately it doesn’t work.
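The delegation mechanism just described, where subscribing to a list means inheriting every blocking decision of its curator, can be sketched as a simple set computation. This is an illustrative model with made-up names, not Bluesky's actual implementation:

```python
def effective_blocks(individual_blocks, blocklists, subscriptions):
    """Union of a user's own blocks and those inherited from subscribed lists.

    individual_blocks: set of account IDs the user blocked directly
    blocklists: dict mapping list ID -> set of account IDs the curator blocked
    subscriptions: iterable of list IDs the user subscribes to
    """
    blocked = set(individual_blocks)
    for list_id in subscriptions:
        # Subscribing inherits all of the curator's blocking decisions.
        blocked |= blocklists.get(list_id, set())
    return blocked

lists = {"spam-list": {"u3", "u4"}, "politics-list": {"u4", "u5"}}
print(sorted(effective_blocks({"u1"}, lists, ["spam-list", "politics-list"])))
# -> ['u1', 'u3', 'u4', 'u5']
```

The amplification the talk measures falls out of this structure: one curator's list contributes a block edge for every (subscriber, listed account) pair.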

Relying on a full snapshot of the network from April 2024, we look at the scale of community moderation, covering 34 million users, around 1.2 billion posts, and 40,000 blocklists. This gives us a complete moderation event history up to that month, April 2024. We see that while individual blocking is widespread, with around 119 million individual block relations covering around 12.6% of users, blocklists amplify the effect by several orders of magnitude.

Blocklist subscriptions create over 16 billion unique block edges, more than 100 times the number of individual blocks. A single list maintained by one user, as we can see here, can shape the experience of thousands or even millions of subscribers; delegation massively amplifies moderation impact. Additionally, we see that community actions also exceed the enforcement volume, the takedowns shown here in green. These are takedowns by the Bluesky team, which we can call centralized actions. Although they are much fewer in absolute numbers, they mirror similar temporal patterns, which suggests that centralized and decentralized moderation mechanisms respond to the same stimuli. So although takedown actions probably target extreme cases, such as those mandated by law, they follow the same trend as community moderation.

Beyond measuring the scale of community moderation, we also examine the characteristics of accounts that appear on blocklists; we will see why this is important in a moment. Blocklisted users are substantially more visible: they have much higher activity and engagement than other users. Apart from being much more active, their toxicity signals are also higher on average. We measure the toxicity of users and see that those on blocklists score much higher. Looking also at the distribution of topics discussed by all users, not only those on blocklists, we find that political content is more represented among accounts that appear on blocklists. We measured this against a baseline defined as the topic distribution across all posts in the network.

These figures show the level of toxicity on the network for accounts that have never been blocked, those that have been blocked at an individual level, and those that are included in blocklists. For all the dimensions of toxicity we measured, those on blocklists score much higher. Similarly, the topics discussed by those on blocklists appear to be more sensitive than those discussed by users who are not on blocklists: adult content, politics, crime, and law are much more prevalent than among baseline users or those not on blocklists.

Why is this important? From the previous results, we know there are clear differences between accounts that are blocklisted and those that are not. This tells us that blocking decisions can have a meaningful impact on visibility and interactions across the network; knowing which types of users might be affected sheds light on the power held by blocklist curators. The natural question is: who is actually making these moderation decisions? Who controls these blocklists? It turns out that only 0.043% of users create blocklists. Even more striking, the top 100 blocklists by subscribers cover over 66% of all subscriptions, and these top 100 lists are created by only 79 individual users. So the top blocklists cover a large share of subscribers, which means that any errors their curators make can propagate and affect a lot of users. We start to see trends of centralization again, even within blocklists.

But it’s not so bad, actually. If moderation power becomes concentrated in the hands of a small number of blocklist curators, we might expect this to significantly reduce participation or fragment the network. However, when we examine the behavioral response, the impact of being blocked, and do a causality analysis using propensity score matching (PSM), for example, the picture turns out to be more nuanced. Surprisingly, community blocking can increase the popularity, activity, and toxicity of the blocked users. What this means is that community blocking can be an effective way of shaping a user’s social environment without silencing them.
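Propensity score matching, mentioned above, estimates the effect of a treatment (here, being blocked) by pairing each treated user with an untreated user of similar pre-treatment characteristics and comparing outcomes. This toy sketch matches on a single covariate; it is purely illustrative, with invented numbers, and is not the paper's actual pipeline:

```python
def matched_effect(treated, control):
    """Average treatment effect via nearest-neighbor matching on one covariate.

    treated, control: lists of (covariate, outcome) pairs, e.g.
    (pre-block activity, post-period activity).
    """
    diffs = []
    for cov, outcome in treated:
        # Match each treated unit to the control unit with the closest covariate.
        _, matched_outcome = min(control, key=lambda c: abs(c[0] - cov))
        diffs.append(outcome - matched_outcome)
    return sum(diffs) / len(diffs)

blocked = [(10, 15), (20, 26), (30, 37)]                  # hypothetical blocked users
unblocked = [(9, 10), (21, 21), (29, 30), (50, 52)]       # hypothetical controls
print(matched_effect(blocked, unblocked))  # positive: activity rose after blocking
```

Matching on similar pre-treatment activity is what lets the analysis separate "blocked users were already more active" from "being blocked made them more active."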

Beyond the causal impact, we also investigate the impact on the social graph: what happens when you add a user to a blocklist? In Bluesky right now, it doesn’t affect the follower graph, but assuming we removed those edges or those users, it turns out the impact on the social graph is very small. Users tend to block, at least at the individual level, other users with whom they probably never interact. This is important: although we see trends of centralization, in the end it doesn’t silence users.

I have touched on only a small aspect of our paper. We have many findings, and I encourage you to read the full paper, which is available if you scan this QR code. In general, we see that blocking is widespread and used a lot; it shapes more than 90% of content visibility. It does target the most active and polarizing users, but being active alone is not a signal that you’ll be blocked: blocked users have their own distinctive features, and we show this in more detail in the paper through nearest-neighbor analysis and other analyses. The nice thing, I would say, is that it doesn’t silence users; it rather segregates them. We propose some possible reasons why this is the case, but it is open to discussion, as I mention here: are there different interpretations for why blocked users become more active? It might be the Streisand effect, or maybe their communities rally behind them after they’ve been blocked. Users are usually not informed when they’re blocked on Bluesky, but there are other ways to find out. It’s actually a very nice thing to see that users are not silenced.

This brings me to the end of my presentation. Please let me know if you have any questions. And do read the full paper, please.

Dirk Kutscher: Okay, great. Thanks very much, Saidu. And we apologize for the audio problems, and yeah, it’s a pity that you couldn’t make it here in the end. But thanks very much for your presentation. Do we have questions?

Okay, that’s not the case. And please follow up with Saidu offline. There were some questions, some comments in the chat, Saidu, maybe you want to take a look at those.

So we continue with our final talk, and that will be given by Lixia.

Lixia Zhang: Okay, I will start here. So thanks, everyone. I really appreciated Arno's talk—I appreciated those beautiful mathematics. Unfortunately, I couldn't follow much of it. I blame myself: I didn't have much of an education, cheated all the way to get a PhD without knowing much math. But I think what you talked about is very much in line with what we're talking about here. So this talk focuses on identifier design issues. It's joint work with Tianyuan, who's a postdoc at UCLA, and with Dirk and Dirk's student, Xinciao Li, if I pronounce that correctly.

So what are we going to talk about? This is about decentralization, and we need to clarify what decentralization really means. We think decentralization is really about decentralizing control power. It’s not the same thing as a distributed system: you can have lots of machines in lots of places, but if they’re all controlled by the same organization, that’s not decentralized. So decentralization means decentralized control.

Now, what does decentralized control require? We believe it really requires decentralized trust. How do I reason to that? Think about it: if you want to control something, that is about decision making. Who’s making that decision? Whom do you trust? Who authorized each action? These are trust decisions. So unless you have autonomous entities making their individual decisions—I think Arno's talk mentioned that autonomous entities can make decisions free from external constraints—you don’t have decentralized control, and that requires decentralized trust.

Now I’ll make the next claim: decentralized trust actually requires establishing trust relations between semantically named entities, which in turn leads to the requirement of globally unique, semantically meaningful identifiers—names—so that you know whom you are talking to. This claim might be a little controversial, especially in contrast to the other notions of decentralization that lead to decentralized identifiers.

But think about it this way: if a decentralized identifier does not tell you whom you are talking with, it’s very difficult to make a trust decision—you have to look elsewhere to figure out which party a public key or a large random number maps to before you can decide anything. And who will be hosting those systems that map random unique IDs to meaningful entities?

At the same time, if you have globally unique, semantically meaningful names, then you can directly derive your decentralized decisions. So the question is: can we have globally unique, semantically meaningful, and decentralized names? Up to this point there seems to be a controversy. There's this fellow named Zooko; back in 2001, some twenty-odd years ago, he made a statement about what's now called Zooko's triangle—I think people have probably seen this before. He listed three desired properties for naming in networked systems: human-meaningful (this is what I said earlier about semantically meaningful names), decentralized name management, and secure. Secure means you have to have cryptographic verifiability of name ownership—notice that it requires the name itself to be cryptographically verifiable. And decentralized means you can just declare your ID without permission from any other party.

Looking at the three corners of the triangle, it seems obvious that you cannot achieve all three simultaneously. So Zooko's triangle essentially claims that human-meaningful, decentralized, and secure identities are mutually exclusive—pick two and only two. This is what's called a trilemma: among the three choices, you can only pick two.

But is this really the truth? We’re going to look into this in more detail. We believe Zooko's triangle actually contains two definitional errors. The first is about decentralization: he takes decentralization to mean the absence of any coordination—you can just declare an ID on your own. The second is about the definition of security: as I mentioned on the previous slide, a secure identity has to prove its own ownership.

Now, does that make sense? Let’s look at the first one first. If you define decentralization as the absence of coordination, it’s not about decentralizing control per se—which, as we said earlier, is what decentralization really is. Under this definition, where you can just grab any ID as your own identity, the only way to get such an ID is a self-sovereign choice: you issue yourself an ID. But if you issue yourself an ID, it has to rely on randomness—public keys, whatever has randomness in it. And therefore you forgo any notion of semantic meaning.

So we think it’s really a question of whether decentralization means the absence of all coordination, or whether decentralization should mean decentralizing control power. Do you think that just grabbing a random number as your ID helps you decentralize control power? That’s actually an open question. I’m going to show something later.

Now let's look at the second one: security of a name meaning that the name itself has to show its ownership cryptographically. Wow, that’s a tall order. If you think a name needs to show its ownership by itself, the only way to achieve that is to pick your key as your name, right? There’s no other way to say "this is my ID and I can prove I actually own it."

So the crypto string will carry no semantics at all, but it reaches the goal as Zooko defined it: it is self-certifying. Now, if we put these two together, both of them push in the same direction: the identifiers can only be self-sovereign, self-chosen, and therefore this is the only solution for decentralized identification. So Zooko's triangle is essentially a rationalization of how you should pick IDs—decentralized means you pick it yourself; self-certifying means you have to pick keys—a self-serving argument rather than a fundamental impossibility proof.

But nevertheless, I think many people who take up Zooko's triangle believe it is a fundamental impossibility result, and therefore they build their solutions on that assumption. And I would say that the decentralized identifier, DID, is maybe one of those cases.

So DID's promise is: we want to decentralize the internet, and therefore we should start with decentralized identifiers. What the DID specification actually offers you is a format. It contains the DID prefix, and then—the most important part—the method, because the exact identifier is a function of that method. And the resolution from that identifier—how you actually get the value behind it—is purely method-specific.
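As a rough sketch of that format: per the W3C DID syntax, an identifier looks like `did:<method>:<method-specific-id>`, and everything after the method label is opaque until you apply that method's own resolution rules. The example DIDs below are shortened/made up for illustration.

```python
def parse_did(did: str):
    """Split a DID into its method and method-specific id.
    W3C DID syntax: did:<method>:<method-specific-id>."""
    scheme, method, msid = did.split(":", 2)
    if scheme != "did" or not method or not msid:
        raise ValueError(f"not a valid DID: {did!r}")
    return method, msid

# The identifier's meaning -- and how to resolve it -- depends
# entirely on the method:
for did in ["did:web:example.com",             # resolved via HTTPS/DNS
            "did:ethr:0xab16a96d359ec26a11e",  # resolved via a ledger lookup
            "did:key:z6MkhaXgBZDvotDkL5257"]:  # the key *is* the identifier
    method, msid = parse_did(did)
    print(method, "->", msid)
```

Note that the parser only handles syntax; the three methods shown have entirely different, mutually incompatible resolution procedures, which is exactly the isolation the talk returns to below.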

As I've listed here, there are essentially three categories. There’s the web-based one, DID Web, which is essentially based on DNS name delegation. Then there's the ledger-based one—all the crypto-blockchain variants. Name issuance there is, of course, permissionless: you can do it yourself and put it on the chain. The name will then be a key or a hash of it, and for resolution you have to do a blockchain lookup.

Now, the other one is kind of interesting: it's ephemeral, or self-certifying. There are DID Peer and DID Key—essentially you can just use your key as your identifier, and therefore you don't really have resolution: you present yourself, "I am the thing." Except there's one problem, as I mentioned before: you show your ID, but how can I know who you are, so that I can make intelligent, trustworthy decisions?

So I want to say that these self-certifying keys don't help you establish trust relations between parties, which is what you need to actually decentralize control decisions. Now let's look at more specifics. Although DID has all the intentions of a decentralized identifier, if you look deeper into the facts you can see that, for DID Web—if you believe the DNS namespace is centralized—DID Web, very unfortunately, went down the path of DNS-name-based identifiers. It’s essentially a URL, and resolution has to go through DNS resolution and so on. So what I put here is: if you think DNS is centralized, then DID Web is centralized by inheritance. It doesn't achieve the so-called decentralized identifier unless you believe DNS itself is decentralized.

Look at the second one, the blockchain-based stuff. With a blockchain, the data is supposed to be on the chain, and a client just looks it up—seemingly a very decentralized system. But in practice, if you really want to look things up, you have two choices. Either you run a full node yourself—we tried that. We had a student try to run a full Ethereum node; we tried for several weeks and barely succeeded, due to the frequency of changes and the gigantic resource requirements. So you end up saying, "Hey, I'll just use those centralized access points that give you a quick lookup service on the blockchain." But then, again, you hinge on a centralized point for the lookup.

Now look at the third one, the universal resolver provided by DIF. This is really decentralized by design: everyone can run their own decentralized universal resolver. But in reality, as far as I can tell, most people actually use the lookup service provided by DIF. So it's decentralized by design but somehow ends up with centralized deployment.

And I want to add another thing: it's not just that DID has these three types; the types are also isolated from each other. You really cannot prove some security property with DID Web and somehow establish trust relations with something proved on a DID blockchain—there's no relation between them. So if you go down this direction of having DID define a universal name wrapper, what actual security property can one build at global scale? I think that really becomes an open question.

Now, what we actually think is the following. DID's promise is that decentralization should start with a decentralized identifier, because of this notion that DNS is centralized. But if you look into the details, what is DNS really? For the name allocation part, there's ICANN, obviously. Whether you like ICANN or not—I'm very sure people have all sorts of different opinions, myself included—the fundamental thing is that ICANN does not control the entire DNS namespace. All it does is assure that top-level domains do not collide. People have questions about its practices and how it achieves that, but that's a separate question. Its job is to assure TLD uniqueness.

But under that, everyone gets their piece of the namespace and does their own job. For example, I'm at UCLA, and I know for sure ucla.edu is our own territory: the campus decides how many departments or special organizations it is willing to give names under ucla.edu. In my department it's the same thing with cs.ucla.edu: I go knock on the IT guy's door and say, "give me a name, lixia.cs.ucla.edu," and there are no questions asked at all. I want to use this as an example: the DNS namespace is really delegation-based. It is, by and large, decentralized namespace management.
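A minimal sketch of that delegation model (conceptual only, not how real DNS software is structured; the zone names mirror the example from the talk): each zone autonomously allocates names under its own suffix and can delegate a subtree to another administrator, so no central authority is consulted below the TLD.

```python
class Zone:
    """A zone allocates names under its own suffix and can
    delegate a whole subtree to another administrator."""
    def __init__(self, suffix):
        self.suffix = suffix   # e.g. "ucla.edu"
        self.children = {}     # label -> delegated child Zone
        self.names = set()

    def delegate(self, label):
        child = Zone(f"{label}.{self.suffix}")
        self.children[label] = child
        return child

    def allocate(self, label):
        # Only this zone's admin decides; no central authority involved.
        name = f"{label}.{self.suffix}"
        self.names.add(name)
        return name

# The registry only guarantees TLD-level uniqueness; everything below
# is decided locally, level by level.
edu = Zone("edu")
ucla = edu.delegate("ucla")   # campus controls ucla.edu
cs = ucla.delegate("cs")      # department controls cs.ucla.edu
print(cs.allocate("lixia"))   # lixia.cs.ucla.edu
```

Global uniqueness falls out structurally: two names can only collide if the same administrator issues them, which is precisely the coordination point the delegation chain localizes.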

So this delegation model distributes authority over the ownership of individual names—apart, as I said, from the TLDs. Maybe nobody likes ICANN controlling the TLDs, but to me that's a necessary evil: without uniqueness of TLD names, you will not have a globally unique DNS namespace. The second point is that the DID people believed a decentralized internet needs to decentralize identity first. But look at the example I just gave.

Decentralization is about decentralizing control power, and that really requires decentralizing trust. Now you've decentralized the identifier space—how does that actually help you decentralize trust? DID Web goes back to DNS. DID on a blockchain, in practice, ends up with some centralized point so that you can cope with the blockchain's own challenges.

And then DID Key, unfortunately, doesn't really help you establish trust relations between parties. There's a notion that these self-sovereign identifiers enable trust—I already addressed that, so for the sake of time I will just move forward.

So instead of Zooko's triangle, we came up with a Namespace Design Requirement triangle. We think that for the internet, the namespace design must meet three requirements: of course names need to be globally unique, the design needs to be scalable, and names need to be semantically meaningful. As I argued before, this is because we fundamentally need to secure internet communications—to get decentralized control, we need decentralized trust, and if I don't know who you are, I cannot establish trust relations.

So between global uniqueness and semantic meaningfulness, what is required? Coordination. Say I want a name—weather.com—and somebody else wants it as well. Without coordination, you will have collisions. So if you want names to be semantically meaningful and globally unique, you cannot get around namespace coordination.

Now, between global uniqueness and scalability, what can you do to also keep names semantically meaningful? I think delegation—what DNS has done from day one—is really a good solution that satisfies all the constraints. And I want to point out that we are talking about namespace design requirements here. This is not about name resolution systems, okay? That, I think, is a separate question.

Now let's look at DNS. DNS actually satisfies all three requirements. It's coordinated, because names have to be globally unique—DNS recognized that as a requirement from the start, and therefore it's structured. When we do a DNS lookup, we don't have the headaches of a blockchain lookup, because the namespace is structured: you look at the name, and you can find the resolution path just from the name. And semantic meaningfulness was actually the day-one requirement for having DNS at all. I lived through the days when DNS came into existence. Before DNS, we dealt with addresses directly, and few of us could remember them. So there was this informal HOSTS.TXT table that simply mapped semantic names to addresses, and that eventually got formalized as the DNS system. And I want to say the added benefit of DNS is that it's an already-deployed namespace.

So the next point: a name alone, I must agree, is not enough, because a name by itself cannot prove ownership. It has to go together with crypto. But I must repeat myself: crypto alone is also not enough, because crypto is semantically empty. Give me a key, and god knows who that person is, right? So you really need to bundle the two together to make a useful identifier.

And I put an example there. As I already mentioned for UCLA: they freely give out names in their own territory, and those names are immediately interpretable—alice.ucla.edu is clearly some entity related to UCLA. Arno's talk said agents need accountability and need to be associated with human organizations. I think this is one way—actually a very effective way—to achieve those goals.

Another thing I highlighted there: we said names need to go with crypto—but whose trust anchor? Is it the root zone of the DNS? I think many people would disagree with that. I can again use my local example, ucla.edu. It's really an autonomous administrative organization: we pay UCLA, we follow UCLA's rules, and so UCLA creates its own administrative control entity—a trust domain—and can have its own trust anchor, which decides all the trust rules on campus.

And locally in our department, we can have another trust anchor, controlled by the department, which is involved in local control decisions—local trust between faculty, computers, students, whatever rules one wants to enforce in the department. So I want to emphasize this: name plus key gives you useful trust, but this trust should not be rooted in some external third party. We want to decouple the structure of the namespace from the structure of the trust relations.
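A minimal sketch of the "name plus key, rooted locally" idea. HMAC stands in for a real public-key signature (e.g. Ed25519) purely to keep this self-contained, and the anchor secret, name, and key bytes are all hypothetical: the point is that the department's own trust anchor certifies the binding between a name and a key, with no external third party involved.

```python
import hashlib
import hmac

def certify(anchor_secret: bytes, name: str, pubkey: bytes) -> bytes:
    """Toy 'certificate': the LOCAL trust anchor binds a name to a key.
    HMAC is a stand-in here; a real system would use a signature scheme."""
    return hmac.new(anchor_secret, name.encode() + pubkey, hashlib.sha256).digest()

def verify(anchor_secret: bytes, name: str, pubkey: bytes, cert: bytes) -> bool:
    return hmac.compare_digest(certify(anchor_secret, name, pubkey), cert)

# The department runs its OWN trust anchor -- trust is rooted locally,
# not at the DNS root or any external party.
dept_anchor = b"cs.ucla.edu-anchor-secret"   # hypothetical anchor material
alice_key = b"\x01" * 32                     # hypothetical public key

cert = certify(dept_anchor, "alice.cs.ucla.edu", alice_key)
print(verify(dept_anchor, "alice.cs.ucla.edu", alice_key, cert))     # True
print(verify(dept_anchor, "alice.cs.ucla.edu", b"\x02" * 32, cert))  # False
```

The namespace (the DNS-style name) and the trust structure (which anchor certifies it) are independent inputs here, which is exactly the decoupling the talk argues for.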

Now, let's talk about DNS, because people all have different views on it. What does it provide? One thing people sometimes don't give enough credit for: the number-one role of DNS is to provide this globally unique namespace. Very often when you talk to people about DNS, they say, "Oh, it's servers, resolvers, all of that." That's only the second part of DNS, the resolution system. The namespace itself is a fundamental functionality provided by DNS. We should never forget that.

And later we got the DNS security extensions, DNSSEC. It's not so much a global trust anchor—I want to correct that. I had a debate with Steve Crocker some time ago about what kind of role the root plays. He insisted that the root verification chain does nothing more and nothing less than prove name ownership. It's not about your trust relations when you try to do business, when you try to decide whether to give somebody your credit card. He said, "That's your local trust, you can handle that; but I still need my DNSSEC root key for name ownership verification." Leaving that aside, for this talk we focus solely on the unique functionality of DNS: providing a globally unique, semantically meaningful namespace. We think everything should be identified by a DNS name.

Regarding misconceptions about DNS: in the academic world, I've seen so many papers that start by saying DNS is a centralized naming system and then go on to develop their own solution. I really want to fix that misconception: DNS namespace management is not centralized. The other reason people sometimes think DNS is centralized is that they're really referring to the resolution system. There are measurements showing that DNS services have become somewhat centralized because many organizations outsource their DNS service. But that's not related to what we're talking about here; that's a separate thing. Here we want to focus on the namespace design.

So today, you may say, we have the DNS system—how come we don't really have good solutions connecting everything, and good security solutions? If DNS names could help decentralize trust, where is that? Look at the OAuth servers: probably not so decentralized, right? I think that really has its roots in the fact that, although we have a DNS namespace, we have not used it as a unified namespace for everything—even before the agent age. Look at who gets DNS names: probably not many of you—I would say most of you don't have one. Jeff told me he got a DNS name; I haven't. My kids got one because they're a bit egotistical—they want to have their own names.

But fundamentally, I want to say that users, generally speaking, do not have DNS names, and therefore they cannot use a semantically meaningful name in online communication to secure their applications. Organizations get names because we are currently dominated by the client-server paradigm: servers have to have a name so that you can set up a TLS connection. But users are really falling behind. Talking about lessons learned: looking back over all these years, starting from 1981, when RFC 791, the IP specification, was published, we've come a long way, and there are many things we could have done differently.

The one thing I personally feel is DNS. At the time, DNS was there, and we were not asking why everyone shouldn't get a DNS name—because back then, security wasn't the first problem, or even a problem at all. Whoops, I'm running late. So given that we don't have a unified namespace, security solutions are built on all sorts of different identifier spaces. People say "OAuth is a great solution for authorization." What identifiers are used there? Your email address. That's your answer.

So as a result, since we don't have our own DNS names, what do we end up with? Our email address becomes the default identifier. And who holds your email address? Well, Google is the dominant email provider—I think they control by far the most users' identities. And identities lead directly to authorization, the OAuth servers, and so on.

So I should wrap up quickly. Now let's talk about agentic AI. What do they bring us? They perceive, they decide—let me advance that slide. Oh, oops. Sorry, forgot to turn that.

And fundamentally, there will be a lot of them, and they'll be very dynamic. That's different from how we currently do security, right? People say agents can come and go: some will be long-lived, others short-lived. How many of them are we going to have? Billions? Someone said trillions? Not immediately, but maybe in the future.

So the current state of affairs is that the intelligence of agents is progressing very rapidly, but the systems side of agents, I think, is falling behind. That's what Arno's talk was about: the current lack of naming, identity, security, and accountability. I think that's a fundamental deficit in the agent system architecture that we need to fill. So do we use the current solutions for agents, or do we really have to think hard about restarting? There are pros and cons to each, and I have to say the existing solutions—the currently running agent-to-agent communication—seem to fall on the left side of that slide.

I've already said a lot, so I'm going to skip this slide and go on to the necessary steps: how we're going to push this new direction of using DNS as a unified namespace. When I say unified, it's not just what we do today—identifying organizations and website services—but identifying users and, importantly, agents.

What that needs is really new work, right? We have to decouple the namespace from trust management, and that requires developing new solutions: how we can enable everyone to manage their own namespace, and especially how to deal with the dynamics of agents coming and going, with names allocated and then deallocated. So there's lots of work to get done.

Oops, I'm going in the wrong direction. And this is the call for action—this is my last slide. We end by saying that for this namespace design for agentic AI, I think we need a global namespace with local trust. The rest should be clear; I've talked so much. Thank you. Sorry for running over—but I started late.

Dirk Kutscher: Sure, we had many problems today. We do have time for one question, Mallory.

Mallory Knodel: Thanks for the talk, and thanks for taking my one question. I wanted to ask—I think it's in some of these slides, so maybe you could elaborate—about the role of intermediaries. Email is your primary example, but I think the federated social web is also a place where people feel they can be part of a community and then get an identifier from it. So now we don't just have email. And I would say, you know, having an email address with your affiliation, your school—those are actually really important. So maybe the end-user case is the hard one, but in the intermediary case it's maybe not just email but also the federated social web? How do you see that fitting in with your suggested recommendations?

Lixia Zhang: That's a great question. The fact is: who are the email providers? That is the question.

Mallory Knodel: But if it’s about the namespace, it might all be Gmail underneath, but at least, you know, I get an @nyu.edu at the end of my name.

Lixia Zhang: I mean, an email address is essentially a DNS name plus your user ID, right? But then fundamentally, who controls that email address—NYU? Google? Google has so far been great at providing email services, for I don't know how many billions of users. But should we stay in that situation?

Dirk Kutscher: Okay, great. Thanks very much, everybody. We apologize for the glitches this morning. We wanted to have more time to discuss these things, but it didn't work out. If people are interested in discussing this more—we had, I think, many really inspiring talks today—please contact us or use the mailing list. We are around for the whole week; maybe we can get together later in the week and continue the discussion. Thanks very much, and this concludes DINRG this morning.

[Applause]