Session Date/Time: 15 Mar 2026 02:00
This is a verbatim transcript of the IEPG meeting at IETF 119.
Warren Kumari: I assume people in the back can hear me. Can someone in the back give me a thumbs up? Excellent, thank you. Let's get started. Otherwise, he will make a grumpy face at me. Hello everybody. It is now 10:00 AM. Welcome to IEPG at IETF 119. Does the clicker work? So, this is what the IEPG is. It was originally created by RFC something-or-other, like 16 something-something, written by Geoff [Huston]. So it's been running for a long time. It's supposed to be topics of operational relevance in some form or fashion. For some reason, this time it is all DNS. We seem to go through cycles where the entire IEPG agenda is BGP, sometimes it's all DNS. Probably it's all DNS because DNS is the best protocol there is. BGP's a close second.
And this is our agenda. It's going to be, as I say, all DNS. So, I'm Warren. Jen [Linkova], unfortunately, is remote this time, so she can't sit up here and make sure that we stay on time, but she says that she will poke people if they run late. And I will end my talk now so Job [Snijders] can officially agree that we are actually under time. I will wrap up and we will have Willem [Toorop] present on global local root.
[Slides: Administrivia]
Warren Kumari: Would you like to use the clicker? There we go. Is there a red box? Yep, there's a pink box. Oh, does any—actually, just before we start, does anybody have any technical issues? Is everything working correctly? Everybody can hear us? If people can't hear us, they don't know that I said that, so I guess we'll just assume it's good. Take it away.
Willem Toorop: Okay. So, the motivation for this research is the local root best current practice draft from Geoff and Warren and Jim Reid, etc., which is a proposal to provision resolvers with the root zone. It has all sorts of advantages; I'm a big fan myself, actually. And at NLnet Labs, we're close to the University of Amsterdam. They have a Security and Network Engineering Master's, and they do short research projects: one week to make a proposal, two weeks to do the research, and then one week to report. And so I thought it would be a good idea to have a look at whether this proposal, if deployed at scale as a best current practice, would impact the traffic on the internet, whether it would make it larger or smaller, and if so, by how much.
And so this research was executed by Elias Rahimi, and this is his report; you can find it on our website as well. The method he employed was to first define what traffic is: in this research, we defined traffic to be the bytes transferred per resolver per day. And then the experiment was a very hands-on, practical thing: set up existing open-source resolvers with the example configuration from version 1 of the draft. I noticed that in versions 2 and 3, the example configurations are no longer there, but... And then measure the traffic between the resolver and the root, and, in the cases where the root is provisioned by downloading the root zone from internic.net, measure the traffic towards that site as well, and compare that to a baseline.
So how to get a good baseline? Well, there is the fantastic, amazing RSSAC 002 report, an advisory on measurements of the root server system. It's basically root server operators measuring their own operations and publishing statistics, the metrics that they measure. One of those metrics is the number of unique sources seen at a root server identifier: the number of unique IPv4 addresses and the number of unique IPv6 /64 networks. That could be considered the number of resolvers seen at a root server instance, so the measurement is a rough estimation of what the traffic would be, of course. We then only need the traffic per day at such a root server instance to be able to determine the traffic per resolver per day. Unfortunately, that metric is not available in RSSAC 002 data, but there is the query and response size distribution, which is published as a list of ranges with, per range, the number of requests and responses for TCP and UDP. If you multiply the number of requests and responses per range by the highest number of that bucket, then you have some sort of rough estimation of the traffic of that root server instance per day. Dividing that by the number of unique sources seen at the root server instance then results in the daily traffic per resolver for that root server instance.
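Put as a quick sketch, the estimation works like this. All figures below are hypothetical placeholders, not real RSSAC 002 data:

```python
# Rough per-resolver daily traffic estimate, following the method above.
# Every bucket in the query/response size distribution is counted at its
# highest size, giving an upper-bound estimate of total daily traffic,
# which is then divided by the number of unique sources.
# All numbers here are hypothetical placeholders, not RSSAC 002 data.

# (upper bound of size bucket in bytes, number of messages per day)
size_buckets = [
    (63, 1_000_000_000),
    (127, 6_000_000_000),
    (255, 5_000_000_000),
    (511, 1_000_000_000),
]

# Unique sources seen at this root server identifier
# (unique IPv4 addresses plus unique IPv6 /64 networks)
unique_sources = 2_000_000

total_bytes_per_day = sum(upper * count for upper, count in size_buckets)
bytes_per_resolver_per_day = total_bytes_per_day / unique_sources

print(f"{bytes_per_resolver_per_day / 1e6:.2f} MB per resolver per day")
# prints: 1.31 MB per resolver per day
```

Because every message is counted at its bucket's upper bound, the result slightly overestimates the true traffic, which is fine for a baseline of this kind.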
So here is the result of the baseline: well, not really a measurement, but applying this idea to the root server identifiers. They all lie in roughly the same range, around one megabyte per resolver per day, except for F. I think this is probably a mistake: the reported number of unique sources probably isn't the actual number of unique sources, so Elias left F out of the further evaluation of the measurements.
And here are the results for the different resolver configurations: how much traffic they send as queries and responses towards the root servers, and also towards Internic. The first one is BIND with plain DNS zone transfer. You see that the root zone updates at least twice a day, sometimes three times a day if there's a change, and sometimes four times a day, but on average two and a half times. And a transfer with BIND takes around 1.45 megabytes, the wire-format zone transfer. So the total megabytes per day is a simple multiplication of the number of updates times what the transfer costs.
The same holds for Unbound, though it uses a few fewer surrounding queries towards the root servers, for example to look up the addresses of where to transfer the root from. But one of the surprises from doing this measurement was Unbound fetching the root zone over HTTPS, because it had 48 updates per day instead of two or three or four. And over HTTPS, the root zone is in presentation format, so it's 2.2 megabytes instead of 1.3 in wire format. So that resulted in 105 megabytes per day for Unbound.
Knot Resolver was also somewhat different. It also gets the root zone over HTTPS, but it prefills its cache with it, and it has a scheduled period to fetch the root zone, so it transfers only once a day. Its daily traffic is a bit more than the 2.2 megabytes of Unbound because of the way Knot Resolver does priming, etc.
Yeah, so this one is a nice outlier that needed further investigation. This is because, configured like that, Unbound thinks it should serve the root as a secondary authoritative service. And that means it should adhere to all the values in the SOA record. The SOA record for the root zone says: check if there's a new version every 30 minutes. Over DNS, this would be a DNS query for the SOA record, and then the resolver would know, "Ah, there's no new version, so I don't need to transfer the zone." That logic was there in Unbound, and the mechanism to transfer the root zone over HTTP was just the same as with a normal DNS transfer, but it cannot do a DNS SOA query over HTTPS, so it would just blindly transfer the whole zone every 30 minutes. Unbound did not yet have support for the ETag and If-None-Match mechanism, which is an HTTP mechanism to check whether the version of the file you're requesting has changed. But yesterday, I managed to implement this, luckily. It's available in that pull request, so I assume the next version will no longer do this.
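For reference, the conditional-fetch behavior described here looks roughly like the sketch below. This is generic client logic, not Unbound's actual code; the URL is the InterNIC location mentioned earlier:

```python
# Sketch of an ETag / If-None-Match conditional fetch of the root zone.
# Illustrative client logic only, not Unbound's implementation.
import urllib.error
import urllib.request

ROOT_ZONE_URL = "https://www.internic.net/domain/root.zone"

def build_request(url, last_etag=None):
    """Build a conditional GET: If-None-Match makes the server answer
    304 Not Modified when the ETag still matches the current file."""
    req = urllib.request.Request(url)
    if last_etag:
        req.add_header("If-None-Match", last_etag)
    return req

def fetch_if_changed(url, last_etag=None):
    """Return (zone_bytes, etag), or (None, old_etag) when unchanged."""
    try:
        with urllib.request.urlopen(build_request(url, last_etag)) as resp:
            return resp.read(), resp.headers.get("ETag")
    except urllib.error.HTTPError as e:
        if e.code == 304:  # unchanged: keep the cached copy
            return None, last_etag
        raise
```

With this, the 30-minute refresh check costs a few hundred bytes instead of a full 2.2 MB transfer whenever the zone has not changed.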
So the further measurements from Elias also consider Unbound with the bug fixed, so to say. Knot Resolver updates once a day, which is also different from the other cases; it doesn't consider the SOA for its timings. And maybe that's a reasonable thing, because the SOA is not for resolvers, it's for authoritative name servers. And the lowest TTL in the root zone is actually one day, so apparently the root is fine with resolvers having at most one-day-old data, right? Or at least that would be a point of view.
Yeah, Elias also made this picture, which is nice. It's on a logarithmic scale. The upper line is 100 terabytes per day of additional traffic, I think. So with how resolvers currently operate, Knot Resolver does best by just fetching the zone once a day. But for all of them, it means more traffic per resolver per day.
But currently the root is fully signed roughly two and a half times a day: at 4:00 AM and 4:00 PM, and when changes happen. So that explains what the daily traffic would be. But fully signed means that there is no incremental transfer, because in an incremental transfer of the root zone, all the signatures would first need to be removed and then replaced with the new versions, and two times the size of all the signatures is larger than the size of the whole root zone. Therefore the transfer is always a full transfer.
And so we asked ourselves the question: what if it had been incrementally signed? Which means that a portion of the oldest signatures is renewed twice a day, and only the changes are re-signed when changes happen. You need to do this properly: it would be bad if signatures expired while the zone itself had not yet expired, so it's important that the signature lifetime is longer than the SOA expire time. That means that every signature needs to be renewed at least every six days. And since the root is re-signed twice a day, we incrementally re-signed 1/12th of the signatures every time the root updates. We maintained a shadow version of the signed root, and you can have a look at the files themselves in this GitHub repository, which has all the versions of the root zone since the 22nd of December. It also contains the presentation format of the incremental transfers from one version to the next.
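The schedule can be sketched as follows. The round-robin selection below is a hypothetical illustration; the real signer's bookkeeping will differ:

```python
# Sketch of the incremental re-signing schedule described above: the root
# is re-signed twice a day, each pass renews 1/12 of the signatures, so
# every signature is renewed within six days, which keeps signature
# lifetimes ahead of the SOA expire timer. Hypothetical selection logic.

RESIGNS_PER_DAY = 2
PARTS = 12  # fraction of signatures refreshed per pass: 1/12

def slice_for_pass(n_sigs, pass_index, parts=PARTS):
    """Indices of the signatures renewed on a given re-sign pass
    (round-robin over `parts` equal portions of the zone)."""
    per_pass = -(-n_sigs // parts)  # ceiling division
    start = (pass_index % parts) * per_pass
    return range(start, min(start + per_pass, n_sigs))

# Every signature is touched once per full rotation of 12 passes,
# i.e. within 12 / 2 = 6 days.
days_for_full_pass = PARTS / RESIGNS_PER_DAY
```

A full rotation therefore completes in six days, matching the requirement that every signature be renewed at least that often.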
So here's a picture of that repository. The .zone files are the zone files for that version; the .ixfr files are the presentation-format incremental transfer files. And here you see that the incremental transfer file also mentions how large it is, in bytes. So if you check out that repository, you can do a simple grep for "data size" and get an indication of how much the incremental transfers would cost the resolver.
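The grep the speaker suggests could be automated like this. The "data size" label and the regex are assumptions based on the talk; check the actual file format in the repository:

```python
# Sum the byte counts recorded on "data size" lines of a
# presentation-format .ixfr file, per the grep suggested in the talk.
# The exact label format is an assumption; adjust to the real files.
import re

def ixfr_data_size(ixfr_text):
    """Total the transfer sizes recorded in one .ixfr file's text."""
    matches = re.findall(r"data size[:=]?\s*(\d+)", ixfr_text, re.IGNORECASE)
    return sum(int(m) for m in matches)
```

Summing this over all .ixfr files in a checkout gives the same per-resolver cost indication as the manual grep.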
So if we put that into that same table, two and a half times the average size of the change, then you end up with daily traffic which is actually lower than the baseline, lower than what we estimated to be the current average daily traffic per resolver.
So that's it. We are also developing a new DNSSEC signer at NLnet Labs in Rust, and there's a command-line version for testing things out, which is called DNS-T. There's a branch where you can check out that command-line tool to incrementally sign zones, and there's also a script to create incrementally signed versions of the root. Yeah, I haven't looked at the time myself, but some discussion points could be: does traffic size matter? And with respect to timing and refreshing, what would make more sense: the SOA, which is intended for authoritative name servers, or TTLs? What do you think?
[Slides: Global Local Root]
Job Snijders: Job Snijders, Fastly software development. Um, what I think is, you should go back to slide 7, please.
Willem Toorop: Slide 7? Which one is that?
Job Snijders: 7. It's between 6 and 8.
Willem Toorop: Yeah, but no numbers on the slides.
Job Snijders: Oh, there is an—if you go back a little bit to the tables of Knot and Unbound.
Willem Toorop: Oh, okay. The published tables. This one?
Job Snijders: Yeah, yeah. Okay. Um, the lower left, 2.19 megabytes per update. To me, it smells like you desperately need gzip compression in the HTTP layer, because then it's 950 kilobytes. So you can take this idea to shave off 56% for free, and it's better for the environment that way.
Willem Toorop: Yeah, good point.
Job Snijders: But more importantly, using compression will increase the reliability of the transfer because your transfer is shorter and therefore a little bit less susceptible to packet loss.
Willem Toorop: Absolute—absolutely.
Job Snijders: So that's what I thought. Thank you for the presentation.
Willem Toorop: Good point.
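Job's compression suggestion is easy to demonstrate: presentation-format zone text is highly repetitive, so gzip at the HTTP layer shrinks it substantially. The data below is synthetic and illustrative; the real root zone reportedly goes from about 2.2 MB to about 950 KB:

```python
# Illustration of gzip savings on repetitive presentation-format zone
# text. Synthetic records only; real root zone figures will differ.
import gzip

# Fake, repetitive presentation-format records (illustrative only)
zone_text = "".join(
    f"example{i}. 172800 IN NS ns{i % 4}.example{i}.\n"
    for i in range(20_000)
).encode()

compressed = gzip.compress(zone_text, compresslevel=6)
saving = 1 - len(compressed) / len(zone_text)
print(f"{len(zone_text)} -> {len(compressed)} bytes ({saving:.0%} saved)")
```

As Job notes, a shorter transfer is also more robust: fewer packets in flight means less exposure to loss.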
Duane Wessels: Hi, this is Duane Wessels from Verisign. Thanks, Willem. One concern I kind of have is that you're comparing real-world data with a kind of idealized behavior of local root, right? We know from lots of previous research that 90-plus percent of the queries to the root servers are junk. They shouldn't be there, right? So these are misbehaving clients; they're doing something we don't expect, outside of the protocol. And you're comparing that to something that doesn't exist yet. And certainly I think we should expect that there would be misbehaving local root implementations that would do the same kinds of things, you know, send excessive amounts of traffic. So I think this comparison is a little bit unfair in that regard.
Willem Toorop: Absolutely. Yeah, we assume a lot of things, like that all the resolvers will do aggressive NSEC, for example, and that there will be no queries to the root servers, which is absolutely not the case. Yeah.
Duane Wessels: Thank you.
Willem Toorop: Clear. So it's just a rough estimate.
Ralf Weber: Ralf Weber, Akamai. So one of the things I have a question about is the amount of traffic you consider that a resolver sends to a root server. Because a resolver normally sends traffic to multiple root servers. So can you go back, I think, two or three slides?
Willem Toorop: Go back? Go forward, forward.
Ralf Weber: Uh-oh. Two or three. This, one more, one more. One more. Okay. So you say the daily traffic is roughly 0.96 megabytes. But that's to one root server, right?
Willem Toorop: To—yes, that's the average of the root server instances. You mean the baseline?
Ralf Weber: The baseline. I'm questioning the baseline because resolvers will ask more than one root server.
Willem Toorop: Yeah, yes. But those resolvers will also be seen and reported in that same RSSAC data as a unique source. So the actual amount of traffic at the root server instance and the number of unique sources seen would sort of match on average, if you take the whole root server system.
Ralf Weber: But I mean, each source would then still be present at, like, two different root servers.
Willem Toorop: Yeah. But they will also do queries to those two.
Ralf Weber: Yes, that's why I mean the resolver will query more than the 0.96 megabytes.
Willem Toorop: Um, I don't think so. But yeah...
Ralf Weber: And overall, I mean, what is the overall total traffic we are talking about here? Because that's also something to consider.
Willem Toorop: Yeah. So maybe you walked in a little bit later, didn't you? I actually have a slide of how the baseline was determined. So the overall: there is a daily total traffic for a root server letter. I said it wrong earlier; it's not the instance but the letter identifier. Yeah, that's it. Maybe we should take this offline? Yeah, maybe we should take this offline. Um... Wes?
Wes Hardaker: Hey, Willem. So first off, thanks for doing this. This was actually the sort of analysis I've been waiting for someone to do. I've done my own in the past that showed that for my house, in fact, the amount of traffic went down. And I think it will be use-case dependent a lot of the time. Also note that I agree with Ralf that the traffic per source address needs to be accumulated across all the 13 roots, and that's harder to do. V6 resolvers can actually move around, so you might be double counting some addresses, and that's sort of impossible to figure out at the moment. V4 can too, but that's less likely. And then improvements will likely bring the data down as well: the gzip compression that has been mentioned, as well as doing HEAD and SOA queries to test for freshness and things like that, should help. With respect to total traffic, I'd be happy to help you and give you some actual figures straight from B-root. We can certainly do that; especially looking at DITL data or something, we can actually get total traffic per address. My guess is that it will greatly benefit some resolvers, whereas, you know, if it's a resolver running on my phone, probably less so. You also have to make sure that cache filling doesn't miss double queries. One of the changes I put into the document yesterday was that you really should do aggressive NSEC caching if you're doing cache filling, because without it, you're still leaking all the negative answers, and that doesn't really help. And a final point: Duane talked about misbehaving local root clients. I have one at my ISI local root servers that queries me, doing a full AXFR transfer every couple of minutes. I haven't gone back to contact that person, but I should probably do that.
Willem Toorop: Okay. Thanks, Wes.
Warren Kumari: I'm next, so I'll keep it short. Yeah, we should definitely add something about caching, right? If you fetch it through HTTP, instead of grabbing the file each time, you do an ETag check, and if the file's the same, you don't grab it.
Willem Toorop: Yeah, yeah, yeah. Definitely.
Warren Kumari: Just as an interesting thing, and I only discovered this a few days ago, Kim pointed it out: the URL that you're currently fetching from, if you end it in .gz, it turns out there is already a gzipped version that IANA has. I don't think that's the canonical version; it just happens to be there. And I'm out.
Willem Toorop: Yeah.
Warren Kumari: Next is Richard.
Richard Barnes: Yeah, hey. Richard Barnes, well-known DNSSEC hater. I think it's ridiculous to include the DNSSEC numbers in this. If the resolvers are fetching these things, there's no need for the DNSSEC, because in, like, three or four nines of cases, it's the resolver verifying the DNSSEC, and at that point you might as well just verify the TLS. It is exactly isomorphic in terms of the security properties you get. So I'm hugely supportive of this effort of moving the root zone out to the edge. I think it's ridiculous that we have servers sitting there with a live query service for a file we can just sync out. So yeah, absolutely, let's do this, but let's sync the stuff that matters and not the stuff that's not giving you any benefit.
Willem Toorop: Okay.
Warren Kumari: And last, we have Mr. Geoff.
Geoff Huston: Geoff Huston. Look, far be it from me to accuse DNS folk of being obsessive-compulsive to the worst possible level in the entire IETF zoo, but we are. Yeah. And in some ways, this was a really simple thing. We're not actually trying to relieve the load on the root servers. If you ask any root server operator, they would say we have buckets of capacity. It's not that we're dying anytime in the next, you know, geological eon. That's not the problem. No. And the other thing is, it wasn't to try and do it inline. It really wasn't. It was a simple observation that says if you have the zone signed with ZONEMD, if you get it, even from the local garbage truck as it wanders past your recursive resolver, if the signatures validate, you know it's current and authentic. Yes. So why bother doing it in the DNS? Our observation was, you know, there's this massive web object distribution industry, and they serve a few petabytes per second as far as I can tell. They're bloody good at their job and they do a lot of it. You could throw the DNS root zone into that ecosystem and no one would blink for a second. Yeah. And so you don't need to try and put it in the DNS. You really don't. It's just a web object. Yes. And all you're trying to do is to get the zone back to the recursive resolver. And you probably shouldn't put it in the cache; you should treat it as an authoritative zone that you are also able to serve, in which case you get over the whole thing about cache dynamics, etc. Now, people have tried to say this for about the last 10 years, and every time it gets said in an RFC, the obsessive-compulsives, some of whom are in this room, leap up and say, "Ah, but you've forgotten this and that and so on." Not really. The answer's really simple. Get a signed object, pull it across, validate the signature, and pull it in as a piece of authoritative data. Yeah. And Knot's answer is the right thing. We observe it changes once a day, so do it once a day.
Willem Toorop: Yeah, it's the primary reason why I also like the proposal to have the DNSSEC-signed version of all the glue and all the non-authoritative...
Geoff Huston: Far be it from me to say that Unbound is worse than most in terms of obsessive-compulsiveness, but you know, doing a fetch every 30 minutes, you know...
Willem Toorop: Yeah, yeah, yeah. No, it's fixed now.
Warren Kumari: Thank you very much, and thank you for having actual data. We like data. Alrighty. Next we have Wes, who I believe is going to ask to present, and he will share his slides from Tokyo.
[Slides: Testing Resolver Behaviours With Broken Authoritative Servers]
Wes Hardaker: Okay, so, I presented this a week ago at the ICANN meeting, but what we were looking at was: Joe Abley had a wonderful draft about what would happen if we make a name server record point to dot (.) for something like .internal. So I wanted to see how crazy that idea is. Given that, I did a few things. I did some background on what this means, right? Names that do not exist. So I'm going to walk through that, then I'll explain my experimental test setup, show my results, and then the conclusions that I've drawn from it since then.
So first off, the DNS fails in many ways. I think we just talked about some of that. Operators fail to deploy zones properly, there's broken infrastructure, or, you know, names that don't exist. And names that don't exist globally happen because corporations especially are really good at creating internal name spaces. The three most common were .corp, .home, and .mail, which still don't have global allocations because of the conflicts. So SSAC, in ICANN space, recommended using .internal. And then ICANN actually made a pledge not to delegate .internal, so that it would be the safe thing that would never be assigned to anybody. We'll come back to that in a minute, but my real question is: well, what happens when you actually query these things? What does the DNS do?
So this is actually the board resolution about .internal. The important thing is that it says the board is not going to delegate it in the future. I actually meant to remove those slides, so we'll just go on. So, what happens when you want to resolve something like child.parent? There are three cases, right? The parent can not exist at all, the parent exists but the child doesn't, or the parent's NS record is broken in some way. And there are actually three examples of that: there can be internal name server records with no glue, so a broken internal bailiwick kind of problem; there can be a name server pointing to an external name server record that doesn't exist; or Joe's new idea of a name server pointing to dot.
This is not .internal-specific, but I'm going to talk about .internal a lot, and then I'll show my example of testing it outside of .internal space. Essentially, a parent that doesn't exist at all is exactly what .internal looks like today: the TLD does not exist. So, let's walk through some examples of what's actually going on here. What would happen if you query for something .internal and you're inside your corporation, on the left-hand side? The corporation sort of has its own root; it has an internal resolver or whatever that says .internal does exist, and you'll get back a record. You'll get an A record, it may or may not be signed, it doesn't really matter, and you will remember that as a client for the lifetime of that TTL.
So what happens if you query externally? You've taken your machine outside, into the real world, and today .internal doesn't exist, so you're going to get back an NSEC record indicating that the name doesn't exist. It is signed, and you will possibly believe those results for the lifetime of the NSEC record, the TTL of the NSEC record. So these are sort of the three possibilities of what might happen in the real world. I added this max line yesterday, and really I should have redone the slide, but I'll walk through it.
So if .internal were actually an empty zone, which it's not today, you'd get back an NXDOMAIN, right? Because you'd actually reach an authoritative server that said the record you're trying to ask about doesn't exist. It may be signed, it may not be. The problem is that you will believe that result for the maximum of: the empty zone SOA's negative caching TTL, which few people remember which one that is; the parent's name server TTL, since you'll keep trying to contact that name server because you'll believe it for a while; or the child's name server TTL. Reminder: this is all what a resolver would do. So everything I'm talking about today is not a stub resolver, it is a recursive resolver that may be querying from outside.
The final one is that if you have a name server record of dot, you'll actually get back a SERVFAIL. That SERVFAIL is not signed, and the length of time that you remember a SERVFAIL is on the order of a second or two in current implementations. All right, so all that's kind of boring. So why would you think about this in the first place? Because it turns out that resolvers move, right? If you have a resolver on your laptop (somebody pointed out that systemd-resolved, or whatever it's called, actually does this to some extent), it can move. So they can move from inside to outside and outside to inside.
So let's talk about the first case, right? You're inside, you have queried for an A record, you've queried for mail.corp or something like that. You get back an A record and you're going to remember it for the length of the TTL. Well, if you have that in your cache and then you move outside, you will keep it for the lifetime of the cache entry, right? So you'll actually keep the A record around; you might try to contact your mail server, and that actually won't work because it may not be reachable from the outside. But other than that, it probably won't break you. That's not necessarily a big deal.
The reverse case is potentially worse, right? You're outside, you have queried for something .internal, mail.internal, and you get back a signed NSEC record that says this doesn't exist, and you may believe it for a day. So if you remember this, like the records for .internal today, you're going to possibly remember that it doesn't exist for a day. So you move your laptop inside your corporation, you query for mail.internal, and your local resolver says, "Nope, that doesn't exist," and it's going to refuse to send you there for a day. At which point you'll probably go home at the end of the day, having continued to fail the entire workday.
The other possibility is that if it were an empty zone, you could actually get an NXDOMAIN back, and you'd sort of control the length of the TTL, if we decided that something like this should happen in the real world. That is not the case today. The problem is that it depends on the negative answer TTL in the SOA record, like I mentioned, and on the name servers' TTLs, and those are currently two days in the root. So it could be better if we could actually control the TTLs of the name server records down to something short like five minutes, but that would be hard to do at the moment.
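The bound being described can be written down as a small calculation. The TTL values below are illustrative; the negative-caching rule comes from RFC 2308:

```python
# How long a resolver effectively keeps believing a broken name is
# broken, per the talk: the maximum of the negative-cache TTL and the
# cached NS record TTLs. TTL values here are illustrative.

def negative_cache_ttl(soa_ttl, soa_minimum):
    """RFC 2308: negative answers are cached for min(SOA TTL, SOA MINIMUM)."""
    return min(soa_ttl, soa_minimum)

def worst_case_memory(soa_ttl, soa_minimum, parent_ns_ttl, child_ns_ttl):
    """Longest time any of the cached records keeps steering the
    resolver away from the (now reachable) internal name."""
    return max(negative_cache_ttl(soa_ttl, soa_minimum),
               parent_ns_ttl, child_ns_ttl)

# Root zone NS TTLs are two days (172800 s), as mentioned in the talk
print(worst_case_memory(86400, 86400, 172800, 172800) // 3600, "hours")
# prints: 48 hours
```

Which is why shortening the NS TTLs, not just the SOA's negative-cache value, would be needed to bound the failure window.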
And then finally, if we actually made .internal an NS record pointing to dot, you'd get back a SERVFAIL. It would be remembered very, very briefly, and every time you queried it again, you'd end up sending more traffic up to the root to ask for .internal, or if you're running local root, it would be immediate. We'll come back to this in a bit.
So the real question is: a lot of people looked at this and went, "Well, if you have a name server pointing to dot, then you're going to go back to dot and ask for .internal, and then you're going to go back to dot and ask for .internal again, and you end up in a loop, right?" So I thought, well, we can test to see if we can break the real world. Let's try to break the real world and see if this actually happens.
So I created an infrastructure setup using 5,000 RIPE Atlas probes, randomly selected. I dedicated two domains to doing this: one under .com and one under .games, just wondering if new gTLDs were acting differently. I'll refer to those domains as sub-BBB; that's just a unique prefix for doing greps on the PCAP traffic. And they were all served by one authoritative name server, recording all the traffic that reached that authoritative name server in PCAP so that I could analyze it. Note that with one server, it did mean that you'll get timeouts and things from probes really, really far away, because it took them a while to build up their cache, but I tried to work around that at least a little bit.
So the six tests that I did: one querying for a domain that does exist, as a baseline, a control. One querying for a non-existent name like does-not-exist.internal. One querying for something that doesn't exist like name.sub-BBB.frostedx.com. One querying for a non-existent subdomain, which is sort of what .internal looks like today. Then one querying for a broken internal bailiwick and a broken external bailiwick, and finally one for a name server pointing to dot. And I wanted to compare them all to see what these were like.
So, the analysis that I came up with has three vantage points: one is the data that RIPE Atlas itself returns, one is the PCAPs from the authoritative server, and one is traffic received at B-root that I could safely grep out as definitely mine. So let's start with the RIPE Atlas data. This is sort of the different types of answers that we got per test, and this is fairly complex. I basically repeated the test six times: three for .com, three for .games. And this is really a bit confusing and blurry, so I merged it into a single slide that makes it a little easier to look at.
So, in all of the tests for things that exist, we got back mostly a NOERROR response in the RIPE Atlas data. That's what you would expect, right? There was no error, I could actually give you the data. For names that don't exist, either because the record itself doesn't exist or because the subdomain it was in doesn't exist, you get back an NXDOMAIN. Again, mostly what you would expect. And finally, for name servers that don't exist (that includes a broken internal bailiwick, a broken external bailiwick, and a name server pointing to dot), you get back SERVFAIL. Again, not super surprising; this is what we would expect. I also put it into a heat map for comparison, and we'll see this again in a minute, but you can break these down into the same sorts of things. The big blue bar at the top left is for names that exist, where you get back NOERROR, and then NXDOMAIN and SERVFAIL. You do also get back a bunch of timeouts at times, but we'll come back to those in a minute as well. The timeouts could be from bad probes; there are a lot of bad probes out there.
So, from the RIPE Atlas data, my conclusion is that for results that are expected (exists, doesn't exist, and bad name server records) we really get back the answers we expected, although there is some weirdness. And most importantly, for an NS record pointing to dot, the results are no worse than the other two main failure cases, a broken external or internal bailiwick, which is rampant in the internet today. That's not to say we should do it; I'm just saying that if I compare all the results, the answers came back looking like those situations that are already prevalent today.
So let's look at the PCAP data. Again, these are PCAPs of all of the traffic received at my authoritative server, and we can break this down into bits. This is hard to look at, but what I want you to see is a few things. There are six sections that you can see between the red lines, and there are basically six peaks in each of the tests, the way I did it. And then three of those individual peaks inside: one is for .com and one is for .games. If we break it down a little bit, the second little tiny peak (and it's not always little) is what should be cached. So the first peak is the first test, where resolvers definitely haven't seen the name before. The second peak is when the same set of probes was queried again, like three minutes later; once I picked 5,000 probes, I used the same probes for all of them. The second peak is within the cache length, so in theory they should have been able to cache it; the fact that there's a peak at all means that they didn't necessarily cache it. The third peak in each series is for things that were beyond the cache length. Essentially, I queried once, I queried again within the cache length, and then I waited until the cache definitely expired, 10 minutes, and queried again. So you can see that there are no massive peaks in any of this, even when you break it down.
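The three-peak methodology can be sketched with a toy resolver cache. The TTL and timings below are illustrative assumptions chosen to match the described spacing, not the experiment's actual parameters:

```python
class TTLCache:
    """Toy resolver cache: stores answers with an expiry time."""
    def __init__(self):
        self.store = {}

    def get(self, qname, now):
        entry = self.store.get(qname)
        if entry and entry[1] > now:   # still within TTL: cache hit
            return entry[0]
        return None                    # miss: resolver must query upstream

    def put(self, qname, answer, ttl, now):
        self.store[qname] = (answer, now + ttl)

cache = TTLCache()
ttl = 300  # assumed 5-minute TTL, consistent with the 3-minute re-query

# Peak 1: first query, cold cache, so an upstream query is needed
assert cache.get("name.sub-BBB.frostedx.com", now=0) is None
cache.put("name.sub-BBB.frostedx.com", "192.0.2.1", ttl, now=0)

# Peak 2: re-query 3 minutes later, which should be a cache hit
assert cache.get("name.sub-BBB.frostedx.com", now=180) == "192.0.2.1"

# Peak 3: re-query 10 minutes later, TTL expired, upstream query again
assert cache.get("name.sub-BBB.frostedx.com", now=600) is None
```

A correctly caching resolver produces no authoritative-side traffic at peak two, which is why any traffic seen there indicates resolvers that failed to cache.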
So I decided, well, let's forget that and let's look at just the top 10 source IP addresses and see: did any resolver just go haywire and start sending a ton of traffic at some point? And as you can see, the top 10 IP addresses really are behaving: every peak dies off immediately, and nothing is suddenly sending a lot more traffic. So that's actually good news; it means that even with a name server of dot, I was not looping. Except that that last slide was from my second run; I actually ran the whole set of experiments multiple times. The second run showed nothing. The third run, on the other hand, when I looked at the top 10 graphs, you can see that there is one address that is not exactly behaving properly. It's actually generating traffic longer than expected: everything else dropped off immediately, but there was one that kept sending requests over and over and over. One thing to note is that that address is sending requests for all the broken instances, not just the name server of dot; it's also sending the same requests repeatedly for a broken internal or external bailiwick. So again (this is the same thing, but with three addresses), it's pretty much the same as if it was querying any broken infrastructure.
So that address: I looked at the resolver at that address, and it just happens to be an address in Russia. No aspersions on the country; that really doesn't matter, a broken thing could appear anywhere. But that's the ASN it was coming from. If you look at the top 10 ASNs, it's not terribly surprising that infrastructure that has a lot of public resolvers, or a lot of infrastructure that a RIPE Atlas node could be sitting in, happens to also be sending the most traffic: Cloudflare being an example, or Google, or things like that. They're going to send the most because they're going to get queries from a lot of different places around the planet.
So, then finally, I looked at B-root data. This is the requests seen arriving at B-root for something under sub-BBB.frostedx.com or .games. Nothing huge: if you look at the queries per minute on the left, that's 150 or something like that, and the previous graphs were on the order of a thousand per minute. So, not a huge amount of traffic seen at the root. The point being, I wanted to see: is anything hammering a root server because it was caught in a loop? And I could not find evidence of that, at least from one root server; obviously I can't track them all. Even the top talkers showed nothing unusual in that particular case.
And so, what conclusions can we draw from all this? Conclusion number one: all error conditions cause more traffic. You'll note that for the good ones on the left, for a name that actually does exist, all the peaks are significantly smaller than all the peaks on the right, where it doesn't matter whether the name didn't exist or whether it failed because of a broken name server; they all cause more traffic. Two: broken DNS, broken authoritative DNS, is the worst. SERVFAILs aren't cached, as Mark Andrews wisely pointed out on the mailing list, so they will actually repeat every couple of seconds, because current resolvers don't cache them. I will note that in Joe Abley's document, an NS record pointing to dot could actually be a signal that this is a name that doesn't exist, and in theory you could cache that longer, but nothing implements that today. Conclusion number three: a name server of dot is actually no worse than the others, because those bad NS records for broken internal or external bailiwicks are also going to cause SERVFAILs and act in exactly the same way. So an NS of dot is no worse than the other really badly broken stuff out there. That's not saying we should do it; I'm just reporting results. Conclusion number four: there's really not much of a difference between .com and .games. I sort of expected that, but I set it up to test two top-level domains of different historical lengths. It's a good thing that the world seems fine with .com and .games being fairly similar. Conclusion number five: an empty zone is the best when you can control the data. Can you control the name server record TTL? Can you control the NXDOMAIN TTL? That's probably the best, because you can actually control the length of time that a resolver might cache that negative answer.
Conclusion number six: NS=dot may actually be better in cases where the name server TTL is a day or two days and you really can't control it, which is still going to put you in a broken state when you wander from an external entity into your corporation, for example. Conclusion number seven: strange things always exist; they always have on the internet. This surprises zero people at this point, and they're kind of hard to track down. Conclusion number eight: stranger things always exist. Can somebody explain to me why I was getting upwards of 20 NOERRORs for things that didn't exist, or name servers pointing to dot, or broken name servers? I would not think I should get NOERROR as the RCODE, but I did, a lot of them. Conclusion number nine: even stranger things are afoot at the Circle K. I got 22 queries for A6 records, the early QuadA-like record for IPv6 that I think was deprecated 15 years ago, I forget. I got four queries for CNAME. So there are four resolvers out there that weren't querying for the record with an A record or whatever; they were literally asking, "Is there a CNAME there?" I don't know why. That's for a record that exists. There were a couple of things asking for MX and TXT records; maybe they wanted to send me mail saying that my name server is broken, I don't know. So with that, any questions? Except for the last three: you can't ask questions about the weird stuff, because I can't answer those. This is actually a picture of a sign that I saw here yesterday, just outside the building that the remote off-site is in, and I thought "Exciting Amusement" properly summed up the conclusions from all this. Anybody with questions or comments? Yelling at me for the way I did it, incorrect, wrong? No hands? Warren, you need to create hands. Otherwise, you gave me too much time.
Warren Kumari: Come on, Wes said it's a perfect opportunity to shout at him. I'm sure somebody would like to shout at Wes.
Wes Hardaker: That's funny, because when I gave this talk a week ago, a bunch of people had to come up to me afterwards because there was no time for questions. Now that there's time for questions... So I think, Warren, I will turn it back to you at that point. And Jen didn't yell at me for running over time either.
Warren Kumari: We actually are ahead of time. Okay, so thank you very much. Next, we have Mr. Peter Thomassen, who will talk to us, again, about DNS stuff. A reminder for everyone: if y'all want fewer DNS topics, you're going to have to offer to present something else, because we can only present what people have offered. Here we go. I mean, I personally think DNS is awesome, but some other people, incorrectly, might disagree. Hello, hello.
[Slides: Bitflipping root-servers.net]
Peter Thomassen: All right. Peter Thomassen. I'll present some measurements about what happens if you register domain names that are bitflip variations of the root-servers.net domain and then see whether that causes any resolvers to latch on and use them. The measurement was done in two stages, so there are different degrees of interestingness in the results from the first and the second stage. We'll get into all of that.
So, what are bitflips? Very briefly: memory can suffer random errors in which a bit is corrupted. That can have various causes, and it's very unlikely to happen, but it does sometimes happen, and if you have enough memory, then eventually it will, and maybe it happens in a resolver. So let's see. Here's a bunch of ways it could happen: for example, manufacturing defects, or you operate the hardware outside its intended specification, with temperature and all of that. It could be radioactive contamination; there is actually a case of radioactively contaminated packaging that caused memory to fail. And some people think it's mostly due to cosmic rays. I don't know if that is true, because the results we'll see later don't look like random bits flipping; it sometimes seems to be structural defects in memory. Anyway, the common assumption that hardware operates correctly is usually, but not always, fulfilled, and then things can happen. What's interesting is that when that happens, you don't need to do any actual hacking; you just need to exploit the fact that the machine is doing something wrong. And what exactly then goes wrong in the application, or maybe in a middlebox or something else, determines what you can do with that from an attack perspective. In general, when you consider the three security properties (confidentiality, integrity, and availability), here the integrity is affected, and that can sometimes be exploited to impact confidentiality as well, when you manage to extract a key, for example, because it's sent to the wrong location due to the integrity failure. The mitigation is to detect it, using checksums, for example, or to use error-correcting codes in memory, but that is not very common, mainly, I guess, because it is rather expensive.
I mean, it's not expensive per bit, but if you buy a lot of hardware and it has that feature, then it makes a difference.
The whole thing is inspired by Warren's experiment that he presented last year in an ICANN lightning talk. He registered a bunch of names that were bitflip variations of commonly used names like microsoft.com; one variant of that is microsont.com. He stood up some fake systems under those domains and received traffic that looked like it was meant for some cloud endpoints or something like that. And depending on where in the connection stack the problem happens, for example whether it is before or after the browser composes the TLS handshake data, you might actually succeed in, say, doing a TLS handshake with the fake name but still getting an HTTP request for the original name, which allows you to extract cookies and stuff. So that's actually quite interesting. And I thought, maybe if we do the same thing for root-servers.net, it would be interesting to see what happens.
So, okay. One quick reminder: there is something called priming queries. Resolvers get their bootstrapping configuration, what the root server hostnames and their IP addresses are, from what's called the hints file. That is usually hardcoded. Every once in a while the content of that might change, or maybe it doesn't change but the root servers' IP addresses actually get renumbered or something. In that case, a resolver can do what's called a priming query and update its knowledge to be able to use the new name servers. And that is a situation where the root-servers.net domain names actually get queried, and that's where these bitflips could potentially be exploitable. But I don't know how many priming queries happen in practice. And again, a condition for that would be that the resolver has undergone a bitflip before that happens.
Yeah, so, skipping over this. Here is what the priming query looks like; this is essentially a reminder. It's just an NS query for the root name, and what you get is the name server records for the root, A to M.root-servers.net, and also the glue for that. Essentially that looks like the hints file content, which is pasted here as an example, just with different TTLs, interestingly. All right, and as the previous slide just said, when you don't use DNSSEC, you won't be able to detect such spoofing.
All right, so we registered wood-servers.net, which is the name server infrastructure that hosts the zones for the fake root-servers.net zones. So for example, if you take a.root-servers.net but replace the R by an S, which is a bitflip, then the soot-servers.net zone has to be hosted somewhere so that you can later point to the fake root servers. The name servers on which those bitflip variations of root-servers.net are hosted are under wood-servers.net, okay? That's why it's called the Woodserver experiment. So what we did is register all the one-bitflip variants of root-servers.net that we could actually get hold of. In total there are 56, and eight of those were already taken, for example boots-servers.net; I don't think that is necessarily an attack, it's a plausible name to have. The other 48 that we could get, we registered. It originally was 47, and then one of the others became available. Oh, thank you for the water. I'd prefer coffee. Much appreciated, thank you.
And what's interesting is that there are more variations where the bit is cleared than where the bit is set. That's because a lowercase letter, when you clear a bit, might turn into a digit, and for this name the other direction doesn't occur; that opportunity is always in the same direction, where you have to clear a bit. So that is why there are more clearing opportunities than setting opportunities, which I originally found confusing.
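As an illustration (not the exact enumeration used in the experiment), the single-bit-flip variants of the second-level label can be enumerated like this. Only flips yielding a lowercase letter, digit, or hyphen produce a distinct registrable name, since DNS is case-insensitive; the toy count may differ slightly from the 56 mentioned in the talk, depending on registry rules:

```python
import string

# Characters allowed in a hostname label (DNS is case-insensitive,
# so an uppercase result is the same name, not a distinct variant).
VALID = set(string.ascii_lowercase + string.digits + "-")

def bitflip_variants(label: str) -> set[str]:
    """All distinct labels reachable from `label` by flipping one bit."""
    variants = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped in VALID:
                variants.add(label[:i] + flipped + label[i + 1:])
    return variants

names = {v + ".net" for v in bitflip_variants("root-servers")}

# A cleared bit (0x04) in the second 'o' gives 'k':
assert "rkot-servers.net" in names
# Clearing 0x40 turns a lowercase letter into a digit ('r' -> '2'):
assert "2oot-servers.net" in names
# Setting a bit works too ('r' -> 's' sets 0x01):
assert "soot-servers.net" in names
```

The digit cases (such as "2oot-servers") all come from clearing 0x40 in a lowercase letter, which is why clear-bit variants outnumber set-bit ones.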
So the name servers for that are, as I said, A and B.wood-servers.net, and we recorded traffic on those. We also recorded traffic on the actual fake root name servers that we set up under A to M dot the bitflip variants of root-servers.net. Rutgers University, which is my alma mater, was nice enough to allow me to store data and use their CPUs to do the analysis. This was done from July last year until mid-January this year. There were 16 gigabytes of data at the Woodserver name servers and 220 gigabytes of data that arrived on the fake root server infrastructure. And of course, as usual, there is a lot of random noise, so it needed some sanitization.
Let's first look at the traffic that arrived on the Woodserver infrastructure: A and QuadA queries for A to M dot a bitflip variant of root-servers.net. Those were on two VMs; they have different IPv4 and IPv6 addresses, so that's A and B. And here are the daily queries at the A name server for any single letter from A to M dot any bitflip variant of the root-servers.net domain. (The .net TLD doesn't have a delegated bitflip variant, so that's why we didn't try other TLDs.) You can see that there is a constant influx of queries: a few hundred IPv6 queries and a few thousand IPv4 queries. So it looks like these names are actually being looked up. The same happened on the B name server; those look pretty much the same.
Looking at the number of query sources, that is also not unexpected: there was a variety of query sources, v4 and v6. Same for the B name server.
What's interesting is the frequency of bitflip variants. This is both name servers together, and I plotted, from the 48 names that we registered, the most queried ones. Typically, in the test period, these variations got around 34,000 queries. For some reason rkot-servers.net stood out; later there will be some hints on why that is. If you consider, on the right-hand side, the O-to-K flip is in the third bit from the right. And the second winner was root-survers.net, where an E in the second label is replaced by a U, and if you look at the figure on the right, you can see that is a different bit flipping here. For the other names, things didn't really stand out that much. For some reason the very first name in the list, root-serverr.net (with the double R), is an outlier towards low numbers; it only had around 2,200 queries. I have no idea why that is.
What's also interesting is that you can get double bitflips. There were three instances of rkot-survers.net, which seems quite unlikely to happen, and indeed the number was pretty small. I don't know by heart which bit in the ASCII table takes you from V to R, so I don't know if that's also the third one, whether it's the same bitflip, so to speak. But anyway, the two flips are, I think, 7 bytes apart, so it doesn't seem to be a hardware layout thing. I don't know, whatever.
So this was for A queries. For QuadA queries you have the same winner, but there is a different second winner; I'm flicking back and forth, and I don't know why that is. The second winner now has a bit flipping from R to S, and you can see that on the right-hand side: that's the first bit. And overall there are fewer QuadA queries. So far so good.
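For reference, the bit position of each flip can be checked by XORing the ASCII codes of the two characters. This confirms the positions mentioned above (O to K is the third bit, E to U the fifth, R to S the first), and, for what it's worth, the V-to-R flip the speaker wondered about is the same third bit as O to K:

```python
def flip_bit(a: str, b: str) -> int:
    """Return the 1-based bit position (from the least significant bit)
    that differs between two ASCII characters; asserts exactly one bit
    differs, i.e. the pair really is a single bitflip."""
    diff = ord(a) ^ ord(b)
    assert diff != 0 and diff & (diff - 1) == 0, "not a single-bit flip"
    return diff.bit_length()

assert flip_bit("o", "k") == 3  # rkot-servers.net
assert flip_bit("e", "u") == 5  # root-survers.net
assert flip_bit("r", "s") == 1  # soot-servers.net and friends
assert flip_bit("v", "r") == 3  # same bit position as the o->k flip
```

The power-of-two check (`diff & (diff - 1) == 0`) is what distinguishes a genuine single bitflip from a multi-bit corruption.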
All right. These are the frequencies of other, or actually all, prefixes that I saw queries for. We only answered for A to M, and the rest was not answered. Still, we saw queries for things that looked random. There's a bunch of wildcard queries, A and QuadA; if you look at the bottom right, there are about 2 million of them, which is much more than for all the other names. And for the other letters and numbers there's some distribution; I don't really have an interpretation for that. Maybe some of it is bitflips of the actual single letters or something.
Okay, so the second step is to look at the fake root server infrastructure. Those are the servers to which the A and QuadA records we just talked about would be pointing. We observed 114.6 million queries in the six months of the measurement. 75% of those were ANY queries for names like europa.eu or esc5.net and for .sl. I don't know why .sl is such a common name for ANY queries, but I think I've seen that reported by other people in the DNS-OARC chat too. And around 24.6% were TXT queries for names like cisco.com and a few others. If you actually look at the TXT responses for those, you will find that they are really large, several kilobytes, and the same would be expected for a reasonably engineered ANY query. So maybe all of these queries are just amplification attacks; I'm not an experienced resolver data analysis person, but I would think it looks like that. If you remove all that, which sums up to 99.9% actually, then around 60,000 queries remain. And those are the interesting ones.
I also removed other queries that didn't make sense, like ones from the Chaos class. And it's hard to say what the remainder means, because if a real resolver latched on to this stuff you might expect more queries, or maybe not, because the referrals are all cached. So some refinement might be needed to tell exactly what it means. I'd like to emphasize that we did not spoof any responses: those fake servers were actually dnsdist proxies to the K-root server, so if anyone ended up using this stuff, it would still work in practice for them.
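A minimal sketch of the kind of pruning described: drop likely-amplification ANY/TXT queries for the handful of hot names, plus non-IN classes, and keep the rest. The query tuples are invented sample data and the hot-name list is illustrative, not the actual filter used in the analysis:

```python
# Each query is (source_ip, qname, qtype, qclass); sample data is made up.
AMP_QTYPES = {"ANY", "TXT"}
AMP_NAMES = {"europa.eu", "esc5.net", "sl", "cisco.com"}  # illustrative list

def interesting(query):
    """Keep only queries that are not obvious amplification or class noise."""
    src, qname, qtype, qclass = query
    if qclass != "IN":                                   # e.g. Chaos class
        return False
    if qtype in AMP_QTYPES and qname.rstrip(".") in AMP_NAMES:
        return False                                     # likely amplification
    return True

sample = [
    ("198.51.100.7", "europa.eu.", "ANY", "IN"),
    ("198.51.100.7", "cisco.com.", "TXT", "IN"),
    ("203.0.113.9", "version.bind.", "TXT", "CH"),
    ("192.0.2.44", "glocation.garmin.com.", "A", "IN"),
]
kept = [q for q in sample if interesting(q)]
assert kept == [("192.0.2.44", "glocation.garmin.com.", "A", "IN")]
```

Real pruning would work per (name, type) popularity and response size rather than a fixed list, but the shape of the filter is the same.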
All right. From this first measurement phase, the half year, here are some interesting queries from a European mobile ISP. They're all from the same /16 subnet, and they stretch from July to October. The first name at the top, glocation.garmin.com something at Cloudflare; Garmin is a manufacturer of car navigation systems. This is on July 11th. Then on July 26th there is a query for what looks like a Volvo cars cloud endpoint. And in September, some vehicle thing at Amazon. Interestingly, this Qname has two bitflips in it; the whole thing, including the newline, is one long Qname. And I think it should both times be eu-west-1a; I think that's what the name of the cloud endpoint should be. In one case the dash is replaced by a closing parenthesis, which is a bitflip, and the S is replaced by a W. So multiple bitflips in a query name are actually seen rather frequently in this subnet. Also, the amazonaws.com domain is hosted on Amazon name servers, so there were also queries to resolve the name servers' hostnames, for example these ones in October. And not only did the question end up on our infrastructure, which requires a bitflip, it also had a bitflip in the co.uk name, which it actually asked twice: once with and once without the bitflip. I'm showing you this to exemplify that the problem in this subnet is apparently a permanent one, and apparently it's related to cars. I don't know why that is, but maybe there is, for example, some car that has a SIM card, and every time the car starts up it does these queries or something, and the memory within that system has a manufacturing fault or something. I don't know. I find it interesting, though, that this car apparently is doing the resolution itself; I would expect it to use some resolver that is not within the car. So, I don't know.
And then the question wouldn't be coming from the mobile ISP, right, if it was an external resolver doing the lookup. So, I don't know.
All right. So, next: improvements. I looked at the stuff that I just showed you first in mid-January, and based on that, implemented some improvements. Those measurements were run for three weeks, February 5th till 27th, so that's not very much time, and I'm considering keeping this running for longer. It comes with some cost: the domains cost a few hundred dollars per year, and then you need four VMs (we've talked about two, but you will see there are two more now) and all the storage. So if anyone is interested in collaborating to provide these resources, that would be pretty cool. But I'll tell you about the improvements first.
So, I said earlier that we only answered A and QuadA queries for the bitflip zones. For the improvements, we made those real zones, so that we would send NXDOMAINs for the single-letter subdomains that don't exist, and so that we would also not drop but instead respond to NS and DNSKEY queries for those fake root server names. The bullet here is wrong: it should say do not drop NS and DNSKEY queries for a-m dot bitflip of root-servers.net. The names that are listed here are just regular zones, so they never dropped anything anyway; it's just a typo here.
All right, so we made these real zones. We also changed the responses that we would send for the fake root server names, such that we would not only return the A or QuadA record for the fake single-letter root server name, we also included an answer for the non-fake name. Depending on where the bitflip happened when the client made the query, it might be expecting the wrong or the correct name, right? Because it might be preparing a question for the real name and expecting the answer for that, and the bitflip happens later; in which case it's useful to have that second answer in the response. And to make sure that these cases are distinguishable, we returned different IP addresses for that case. Those are the new VMs that I mentioned, P and O.root-servers.net, but those names don't appear anywhere. And the same thing for IPv6.
All right. We also found in the data that we repeatedly saw priming queries, and we discussed before that that could be interesting, so we started tailoring the priming response. All other responses from the fake root infrastructure were actually from K-root, proxied through dnsdist; only for the priming responses did we return our own name servers. So this is where our own P servers appear, and we returned their IP addresses in the additional section as glue. But those are different IP addresses from what you would get when you query the A records for the single letters, so we can distinguish whether the traffic we get results from a single-letter address query or from a priming query, okay? There are two name servers here because diversity requires two. And the IP addresses shown here and on the previous slide actually share those VMs: each VM has two IPv4 and two IPv6 addresses, and then you can tell by the destination address in the PCAPs what's happening.
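The tell-them-apart trick can be sketched as a simple lookup: each VM answers on two addresses, and the destination address of an incoming packet reveals which earlier response the client is acting on. All addresses below are made-up documentation-range placeholders, not the experiment's real ones:

```python
# Hypothetical addresses (TEST-NET ranges) standing in for the real setup:
# each VM carries one address handed out in single-letter A-record answers
# and one handed out only as glue in the tailored priming responses.
PATH_BY_DEST = {
    "192.0.2.10":    "single-letter A query",   # VM 1, A-record answer
    "192.0.2.11":    "priming-response glue",   # VM 1, priming glue
    "198.51.100.10": "single-letter A query",   # VM 2, A-record answer
    "198.51.100.11": "priming-response glue",   # VM 2, priming glue
}

def classify(dst_ip: str) -> str:
    """Which earlier response led the client to this destination address?"""
    return PATH_BY_DEST.get(dst_ip, "unknown")

assert classify("192.0.2.11") == "priming-response glue"
assert classify("198.51.100.10") == "single-letter A query"
```

This is why the later slides can say whether a given source followed the priming glue or the single-letter address records: the classification is done purely on the PCAP destination address.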
All right. Some interesting queries after the improvements. There's the Ford Transportation Mobility Cloud, which just went to autonomic.ai, which is apparently a thing that Ford is using. I'm spelling out the name here because Ford is just the user of it; it's not Ford's infrastructure that is wrong. The source IP addresses shown here are not from Ford, they are from an American mobile provider. So the problem seems to be similar to the one I mentioned earlier with Volvo cars, except that now it is an American and not a European provider. What's interesting here is that priming queries show up, and then the red IP addresses appear, which are the ones that you get when you look at the glue from the priming response. So it looks like whatever was doing this resolution actually did a priming query and later sent queries to those addresses, which is interesting. I also don't know why the queries are being repeated so many times within a few seconds: this is all on February 17th, within a few minutes, from the same source, and those are not SERVFAILs, right? Because we did give the proper referral for the .ai TLD. So maybe something else is broken on their part.
All right. Here you can see Volvo queries again, also American this time. The stuff is related to a product called G-Book, which is also a navigation system, I think from Japan, and it has been discontinued, I think, like 10 years ago. But maybe that's the obsolescence time for memory stability or something, I don't know. Anyway, you can see that the bitflips here sometimes happen in the Qname; for example, .com gets turned into .aom. It's actually sort of reasonable that maybe a requery is happening here. Also, the name being queried here is a CNAME for something cloudfront.net, and I'm not pasting all the queries here, but there is a whole lot of cloudfront.net queries, including clouddront, which is a bitflip of cloudfront, and various other combinations of that stuff.
Here are the G-Book queries that I mentioned before. Those happened to the IP addresses that you get from the priming response, while the previous queries went to the ones that you get when you do the single-letter lookup. So I don't know what's what; it's the same source address asking the questions.
And that subnet also asked a question for voltocars.com, and the endpoint here in the subdomains is actually the same one as earlier in July in Europe. So that seems to be the same system. Note that it's a few hours, actually a few days, apart from the G-Book queries.
Now, if you look at volvocars.com, not voltocars, you will see that the name server for that is cscdns.net, and the name server for that is cscudns.org. And we see queries for the addresses of both these kinds of name servers, which follow when you do the resolution for volvocars.com, ending up on our non-priming-response IP addresses. So this seems to be a resolution cascade that uses all kinds of things for some reason, and apparently it's multiple resolvers in that network, because the source addresses are different, or at least multiple IP addresses on the same interface. So this looks like a real resolution cascade to me. And again, it seems to be in the car context; we couldn't observe many non-car things.
So to sum up: there were about 10 million queries seen for the single-letter names and 100,000 queries on the fake root name servers after pruning, and we found around 10 interesting resolution cascades within the last three weeks, where we did the measurement with the improvements. We didn't send any fake referrals. If we did, maybe the resolvers wouldn't come back to us because they've cached, say, the .com delegation; then they ask the .com name server, and we don't know whether they actually ended up there through our response or whether our fake infrastructure is of no relevance. We were quite hesitant to actually fake referrals, which is why we don't know that. But it's possible that if we did, some resolver population would latch on, I don't know. I also don't know for how long they would do that; I guess until they reboot or something. Anyway, this could be refined by looking at country origins and ASNs, and at whether the sources are resolvers, in the sense that when you send a query to the source IP address you can use it for resolution. I've tried that, I'm not sure whether with all of them, but at least with a significant number, and I did not get a DNS response from any of the sources. On the other hand, resolvers do not usually also serve DNS on their source IP addresses, so I don't know. And I haven't looked into any DNSSEC aspects of it. So that's the state of things, and I don't know how interesting that is. The problem is certainly mitigated in the sense that, regardless of what we measure or whether it continues, the names are now taken; so if there is any risk, it cannot currently be exploited, because we don't do any bad stuff with them. Should we keep these names registered? Should we keep measuring? I don't know. All right. So, thank you.
And for some reason the slide share suddenly stopped, but Duane is up.
Duane Wessels: That's fine, it was the last slide anyway. Hi, Duane Wessels from Verisign. Thanks Peter, this was really interesting, of course, and very clever to use different IP addresses for the priming and the single-letter queries. I kind of expected to see maybe some data on which of those two was more commonly referenced later on. I don't know if I missed that, or maybe you just didn't have enough data to do it.
Peter Thomassen: So I haven't extracted that number; you know, it's hackathon time, so...
Duane Wessels: Okay. Cool. Thanks.
Warren Kumari: Any other questions? We have Lorenzo in the queue, who I believe is remote.
Lorenzo Colitti: Second one. Yeah. Okay, so let's see if this works. Whoa. So, I'm sort of wondering how you can get all of these bitflips. It seems to me that if memory on the client device was at fault, you'd basically see these things crash shortly after with an illegal instruction, right? And I'm wondering, do you have any idea how many of these bitflips come from client memory, and could they be being induced by the network itself? Because in that case I guess you'd see checksum errors, at least in the UDP checksum, and then your server would have dropped the packet. So you wouldn't have found those, because your server would have dropped them due to a bad UDP checksum. But it might be worth building a server that doesn't care about UDP checksums and answers anyway. I wonder what you'd see then.
Peter Thomassen: Yeah, good idea. I don't have a prediction of what would happen. One last thought: many of the names I showed you earlier, from the different bitflips, even from after the improvements, even from within the US, usually showed the same bitflips. I think it was the third bit from the back at the European provider and the fifth at the American one, or something. So it definitely looks like a structural thing in the device to me, not a random thing, and also not a transmission thing. But yeah.
Lorenzo Colitti: Thank you.
Warren Kumari: Somebody had just walked up to the mic here and then disappeared. Oh, there. Okay.
Marco Davids: Yeah, it was me. Marco Davids. I think Lorenzo already touched on my question, but I'm willing to repeat it just for my understanding, because it's a very interesting presentation and I'm not sure I fully understand it. My question was: the bitflipped query comes in to your authoritative name server, you respond, and you copy the original query into the response, including the bitflipped version and not the one the resolver was actually asking for. That's correct, right? And have you looked into how resolvers respond to that copied question in the answer, which is not what they were actually looking for?
Peter Thomassen: So we respond as if the qname that we received is actually the one that the querier wants to know.
Marco Davids: Aha. Okay.
Peter Thomassen: Um, yeah. So...
Marco Davids: All right, so that answers the question. Yes.
Peter Thomassen: And it's true that perhaps such responses, in cases where the resolver is waiting for the actual name and not the flipped name, will be discarded. In that case it would help if we fixed the bitflip when copying the question section into the response. We did not try that.
Marco Davids: All right. Okay. Thank you.
Davidson: Davidson from Alibaba Cloud. Thank you very much for sharing; I think it's very interesting. But I have a question. I think it's true that some of the priming queries were hijacked to the bitflipped domains, but how can you tell the difference between the hijacked bitflipped-domain queries and real query names? Because, you know, there's always somebody scanning random domains on the internet, so how do you tell the difference?
Peter Thomassen: Yeah, that is true. So there was a lot of scanning, like Shodan and "internet measurements dot com" and various universities. So I actually categorized queries by source network, then I looked at who owns the source networks, and I discarded things that were obviously from measurements. And the things that I showed here as the interesting queries are queries that had no indication of being from somebody else's measurement, and that also looked like actual resolution cascades, because they have multiple things after each other that relate to one another: something the root would see first, and then, you know, a later query for a subsequent resolution of the name server hostname for the SLD, or something. So these looked like plausible things that would usually not appear in an internet-wide random measurement.
Job Snijders: And one last comment. I think earlier Lorenzo said that he's surprised there are so many of these bitflips. I actually don't think it is very many. I mean, it's a significant number, but the fact that we saw around 10 interesting cascades in three weeks, given that this is a global thing, I don't think is a very high number. Okay.
Warren Kumari: Thank you very much. And now, shockingly, we have a talk about DNS. Um, might actually be more about V6, but a reminder again, if y'all want talks about something other than DNS, please propose talks about something other than DNS. Or not, I'm perfectly happy with all DNS all the time. Do you want to use the clicker? I guess I can use the clicker.
[Slides: Measuring DNS over IPv6]
Geoff Huston: I'm Geoff Huston, I work with APNIC Labs. This is more a measurement talk than it is a V6 talk; V6 is kind of marginal in this subject matter. And it's also about the DNS. Willem's talk was actually a really good illustration of the opposite approach, and so was Wes Hardaker's talk. Because when you're trying to measure DNS behaviors, one way is to create a microcosm of the DNS environment: you know, set up a recursive resolver, send queries through Atlas probes and so on, look at what you get back, and generalize it out. That was a classic example there. You say, "Well, I saw these," and then you extrapolate further out to the total load on the root name servers. There's nothing wrong with that approach, but there's a big leap of faith that what I see in the microcosm is what you see in the entire environment. We at APNIC have been doing almost the opposite, where by using ad-based measurement we seed the entire internet with ads. I think it's about 45 million ads a day these days; it's a lot of samples. And everyone who sits on their mobile or gets an annoying Google ad will, statistically, one time or another see our ad. The ads penetrate across huge amounts of the network, and over time, and we've been doing this for about 15-odd years, you get a big picture of the network. But in some ways that picture can be as misleading as the microcosm. And the story that I want to tell here is actually a story about IPv6 as a transport for the DNS.
So to wind this back, and it's winding back 22 years, Christ we're getting old. This was an RFC from Alain Durand when he was with Comcast, and I've forgotten who the co-author was; they could even be in this room, so I'm sorry if you are. And it was a sort of weird RFC. It's ostensibly about V6 transport, and the text said: "don't stop using V4." It really was just a reminder that too much of the internet relies on V4; you'd be stupid if you tried to do this V6-only. And, you know, the guideline was that every DNS zone should be served by at least one V4-reachable name server. Now this got revived, I think about four or five years ago, asking: why aren't we using V6 in the DNS? And there was a long argument at the time about the extent to which V6 makes a really bad hash of big payloads. Part of the issue with V6 is that you can't do fragmentation on the fly; you've actually got to send the ICMP message back to the host, and the host has then got to resend that packet appropriately fragmented. In TCP that sentence makes a huge amount of sense, because I keep hold of what I sent, and when I get back that ICMP I can act on it. In UDP, what I just said was bullshit, because in UDP, when I've sent the packet, it's all over. I get back an ICMP going, "That was a dud," and I go, "Well, yeah, probably was. You should resend it." "No I can't. It's gone." And there was this sort of argument at the time, and I certainly had measurements showing that large-packet behavior in DNS over V6 was not good. And if you're trying to recommend, as this document is, that we should have V6 for the DNS everywhere, the caveat is that you're going to get jammed into some nasty corners if you apply large payloads, for example, jeez, DNSSEC with really big keys. So it was an interesting question: can we show some data about this? And at the time I was doing data—I'm clicking forward and, Warren...
Warren Kumari: Nothing working?
Geoff Huston: You've got a problem too, yeah.
Warren Kumari: How do I... I don't know.
Geoff Huston: I might need to stop sharing and re-share.
Warren Kumari: Stop sharing and re-share.
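The arithmetic behind the large-payload caveat Geoff describes is worth spelling out: IPv6 guarantees a minimum link MTU of 1280 octets, and a UDP DNS response has to fit under that after the fixed headers if it is to cross any path unfragmented. This is where the widely recommended EDNS0 buffer size of 1232 comes from; the constants below are from the IPv6 and UDP specifications.

```python
# Constants from the IPv6 (RFC 8200) and UDP specifications.
IPV6_MIN_MTU = 1280  # every IPv6 link must be able to carry this
IPV6_HEADER = 40     # fixed IPv6 header, assuming no extension headers
UDP_HEADER = 8

# Largest UDP payload guaranteed to cross any IPv6 path unfragmented --
# the EDNS0 buffer size widely recommended since DNS Flag Day 2020:
safe_edns_bufsize = IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER
print(safe_edns_bufsize)  # 1232
```

Responses larger than this over UDP risk exactly the ICMP "packet too big" dance Geoff describes, which UDP senders cannot act on.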
Geoff Huston: Right. So the assumption behind that more recent document was "why aren't you using V6, oh DNS people? It's efficient, it's well understood, it's mature. Go and use it." And the real question is: is that true? So I'm not even looking at the corner cases. What I'm trying to do is really quite vanilla. Here's a response, it's well under 1,500 octets, it's well under 1,280; it's not going to run into any of the V6 complications with large payloads. And the question is really simple: if my authoritative name server only answered over V6, what proportion of the entire internet can't see me? Reasonable question. It's kind of testing the maturity of V6 in the broader spectrum. Seems like a decent question to ask, but the measurement is the topic of this talk, because measuring that is not as straightforward as it might sound.
Now, what's a resolver? Oh shit. Because it could be a single platform: I run a resolver. Google run a resolver behind all eights. No they don't: they run a large number of machines behind that single service address, and if you go to any individual site you'll find a large cluster of individual DNS engines behind it. Is it a bunch of DNS engines, or a bunch of query engines with a common brain? In a server farm, how do they talk to each other? Do they multicast the cache? What is a resolver is actually a really fascinating question in the DNS, and quite frankly there is no single answer. When you start poking into the DNS, what you actually find is some remarkably aberrant behaviors, and resolvers are not a commonly understood concept. So when I talk about how many resolvers can reach an authoritative name server over V6, well, I don't know what a resolver is. I really don't. So it's kind of a dud question. The other issue is with just doing a head count of resolvers: I run a resolver at home, and it has me as a customer. Its weighting is one out of how many people are on the planet today, 8 billion? It's teeny. Whereas if I'm running Google's resolver, I probably have around a 10% market share of the internet user population, maybe 15%. So if I'm talking about the Google resolver as a single entity, I'm talking a truly massive footprint. So just counting resolvers and saying "X percent of resolvers can and Y percent can't" is kind of bullshit, isn't it? It has absolutely no proportion to the experience you and I have as users.
So what is a resolver is a really hard question. And I'm not even sure that measurements of resolver behavior should refer to resolvers at all, because, like I said, some resolvers are important and some, like mine, are astronomically unimportant. And then you go, "Well okay, what I'm going to test is a query over V6." So what's a query? You go, hang on, a query is simple. I have a DNS system somewhere running the DNS protocol, it sends out a packet over UDP port 53, it has a certain format, that makes it a DNS query, yes? And if I'm looking at the resolver inside the machine that you're staring at right now, you might well emit one query. But that will go to the local thing that grabs the DNS. And that thing, which is also a resolver of some sort, might well try to fan it out. And what we actually find is that as queries enter the cloud of the DNS resolution environment, they often get duplicated, triplicated, copied all over the place. Sometimes it's immediate, sometimes they wait; sometimes that wait is very, very small, 10 milliseconds. Other times it's more generous, a third of a second. Other times it's Microsoft: they reckon you're a very patient user and they'll wait a whole second before duplicating the query. It's UDP. You never know when the answer will come back, and so the common behavior is to twiddle your fingers for a while, and whatever you think is a good time, send the query again, until a) you get bored or b) the user goes away or something.
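The retransmission behavior described here, wait some implementation-specific interval, resend the UDP query, back off, eventually give up, can be sketched as a simple schedule generator. The timer values below are illustrative, not any particular vendor's defaults.

```python
def retry_schedule(initial: float, factor: float, give_up: float):
    """Yield the times (seconds after the first query) at which a
    hypothetical stub resolver retransmits an unanswered UDP query."""
    t, interval = 0.0, initial
    while t + interval <= give_up:
        t += interval
        yield t
        interval *= factor  # exponential backoff between retries

# A stub that starts at 1s, doubles each time, and gives up after 7s:
print(list(retry_schedule(1.0, 2.0, 7.0)))  # [1.0, 3.0, 7.0]
```

Every retransmission in such a schedule is another query an authoritative server sees for the same name, which is one of the amplification effects that makes raw query counts such a noisy signal.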
Hang on a second. The DNS does not have an interrupt. So when the application says to the DNS, "resolve this name," the application can hang in the next microsecond and disappear; the DNS doesn't know that. It's been given a task down at the stub resolver level, which forwards it to the recursive, which forwards it on. It never gives up. Now, some resolver implementations do give up after a timeout. Look, I've been working at this for too long: in BIND's case it's 7 seconds. In Unbound's case... I think it's a geological eternity, or it could be two. It just never stops. It exponentially backs off, but as far as I can see, even a year later: "Oh, I've still got this unresolved query." Reboot the machine if you want to kill that crap.
Um, the other thing about DNS is that if I refer a query to you and you go, "Oh look, I'd like to help you but I can't but I think I know someone who can," you forward it on, and you forward it on, and you forward it on. There's no hop count. Unlike IP packets, there's none. There's no trace. So I can set up a forwarding loop and the DNS goes, "Not a problem, dude," and there probably are DNS queries circling around forever because there's no way of stopping 'em. So a query is not simple. They have a life of their own.
This is just a snapshot of one day in the ad system, and what you find is that to resolve a name these days, most folk, in red, ask for an A record, because there's more V4 than V6 even today. The other thing, of course, is that these days the Apple infrastructure, not Android but Apple, has taken to the HTTPS query like a duck to water, and almost the first thing it asks for is HTTPS: what protocol should I use for the resultant fetch? So HTTPS figures—that's the light blue—and the green is AAAA records for V6. So that's the number of single queries that I see for a name. But hang on: sometimes a query happens twice, three times, four times, five times. I arbitrarily cut that off at 10; I could have gone to around 2,000. Sometimes the queries just duplicate. And the other thing you find, in a cumulative distribution, is that around 70% of queries happen once, but 30% happen at least twice or three times or four times and so on. It just goes on and on and on.
So in trying to understand the DNS, you've got to try and understand root cause, and the DNS is a massive feedback amplification system that's as noisy as shit. Trying to get the signal out of that feedback amplification system is kind of interesting. So, I mentioned that the way we measure is to seed the world with ads. No, we didn't invent ads; Google did. Google make, ah jeez, about a billion bucks a year, isn't it, if you look at their annual reports. It's what drives the world, or at least their particular part of the world. We just piggyback on it. The ads are simple. If you get the ad, it runs the code. You can't stop it: on impression, do this. What does it do? It just fetches a bunch of URLs, typically around 10, and then it stops. It's quite silent; it runs in the background. But the thing is, all those URLs have unique DNS names for that instance. So every single ad has a different set of DNS names, and the only authoritative servers out there are ones we run. So we get this remote view of the user. And we get forty-odd million of them every day.
So, back to the measurement question I was asking. How much V6 is out there? And the way I'd like to phrase an answer is not "well, 20 resolvers, a thousand resolvers"—no no no no no. I'd like to phrase it as users. And I'd like to say, well, if we're sampling users, maybe I can talk about the percentage of users. So here's an answer. That's, um, I can't see the slide from here, 66%, 67%... whatever. Two-thirds of the world's users can actually resolve a name if the authoritative server will only answer over V6. Cool. How do I do that? Well, I can't see whether you successfully resolved the name. I can see your query, but I don't know if you got the answer. A lot of the problems we had with V6, as you might recall back in the early days, were that you could certainly send the packets out; it was the incoming filters that blocked you. So back then, if I measured your outgoing V6 packets I would see a huge amount, but none of the answers were getting back to you. So the web session says you got the answer: you fetched the web object. So what I'm actually measuring is two things: you could resolve it using V6 only, big tick, and you fetched the web object. Over what protocol? I don't care. The object is dual-stacked; I don't care whether you used V4 or V6 to get the web object. If you went and fetched the web object, you successfully received the DNS answer.
Cool. So that 60-odd percent looks good, right? Obviously I wouldn't be asking if there weren't some doubts about this. Because when you're measuring, a lot of the time you're measuring what doesn't happen. For 30% of the world's users, they didn't do the fetch. And that wasn't a positive signal; it was the lack of a signal. So why didn't they fetch it? Well, it could be because they didn't get a DNS answer. I can't tell: beyond the ad they've seen, I haven't got any other code on the user's device. It could be that they just went somewhere else, because users are remarkably fickle. They move on; we all do. So they may not have hung around and done the web fetch. What if it's being filtered by middleware? What if there are other factors?
So why don't I get rid of the web? Can I make a test that tells me, in the DNS itself, whether you actually received an answer? Yes you can, because of a strange quirk of delegation. You see, when you delegate a zone in the DNS, you do not specify the name servers' IP addresses; you specify their names. If you want to know why, Christ, ask Paul Vixie or someone who bloody knows; I don't have a clue. It does seem like a weird thing, but the DNS is sort of obsessive-compulsive about this: you can only give names for delegated name servers.
Normally, when I operate a zone and I have a delegation, I'm also a helpful chap. So are you. Normally you would put the IP addresses inside the zone as well, as glue. Whether it's part of the zone or not doesn't matter; the referral answer you get says "this is the name of the name server, and by the way, here's its IP address." But it's perfectly legal not to give that help: "No, sorry, here are the names of the name servers. Go fish. Have fun." So that's what I do. I set up a second zone which holds the name of the name server, and the poor old recursive resolver has to stop what it's doing and start to resolve the name server name. And this is where I put in the V6-only bit. Because what the sibling zone then says is, "Well, here's the address of the name server you were after, but it itself only answers over V6." What that means is that the third query only happens if you got back an answer. So if I look for those third queries, I know that you could make and receive a DNS query over V6 only. Which seems a lot tighter. A lot tighter.
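The glueless setup Geoff describes can be pictured as two zones. The names and address below are hypothetical illustrations, not the ones APNIC actually uses: the parent delegates the test zone by name only, with no glue, and the sibling zone that holds the name server's address records carries only an AAAA record, served by a V6-only server.

```
; parent zone "example." -- delegation with NO glue A/AAAA records
test.example.          NS    ns.sibling.example.

; sibling zone "sibling.example." -- holds only an AAAA record for the
; name server, so the test zone's server is reachable over V6 only
ns.sibling.example.    AAAA  2001:db8::53
```

Only a resolver that received the referral, then resolved ns.sibling.example, and then queried 2001:db8::53 over V6 ever emits that third query, which is the signal counted in the measurement.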
So, you know, does it make a difference? Oddly enough, yes. And this is kind of intuitively interesting. Around 15% of users will actually show that they can do V6 in the DNS even though they didn't get the web object. You go, "well, the web signal's kind of lossy"—this is sort of intuitively obvious, right? Now, this is what we'd expect, but the problem is that averages lie like crazy. Algeria: 60-70% do it on the web. Glueless delegation? Nah. Algerian ISPs do not resolve names that don't have glue records. They just don't. V6 or not, I have no idea. Across the world, though, North Africa seems to have this common behavior: Libya and Egypt do it too. They just don't resolve names that have glueless delegations. Who would have thought? And if I go the other way, I also find, not just 1 or 2% or 10%, but a massive drop, where some folk go, in the DNS, "Yes, V6, no problem," but on that web fetch: "Oh, I don't like the idea of fetching a one-by-one pixel that isn't even flesh-toned," or "I don't like that. I'm not going to do that." And so a number of countries—Bolivia... I don't know, you can read as well as I can, Myanmar... Myanmar's interesting. Myanmar is a mess to measure. It's totally dominated by spam factories in the north of the country, and, wow, the DNS gives a decent signal but the web signal is horribly confused. So you'd think there's this pattern, but what you actually find is massive divergence in both directions. And I can plot that and show, yes, there is massive divergence both ways, and point out that there are anomalies in both directions, where DNS glueless is blocked or the web fetch is blocked.
So the averages lie. And what I would expect, about a 10-15% difference, doesn't quite work out for everyone. So then I combine them and say, "Okay, in this ad I'm doing two measurements. If you pass one or the other, let's give you a tick." Wow. As soon as I do that, I get a number that's higher than either. There are a number of folk who can give a positive answer on the web but can't do the glueless DNS, and vice versa. And so the actual end result, the folk who can do resolution over V6, is oddly enough higher, by around 2% in total, which is not what I expected.
So, what can you learn from this? The DNS is completely borked. Simple models don't work anymore. We are so used to people futzing with the DNS in every possible way. Many national regimes have their lists of names that will not resolve, but bizarrely, many regimes then impose a technology solution which is theirs and, you know, nobody else's. And so what you find as you look across the broad landscape of the DNS is that there's no uniform model anymore. There's no simple stub-resolver to recursive-resolver to authoritative-server picture where life is simple. You actually find it's horrendously complicated. 40% of users in Algeria have their queries come at me from Google. From Google. Does Google do glueless delegation? Of course it does. So something is happening deeper in Algeria that says, "I'm not even going to pass this question on to Google. It's just not a question I'm prepared to answer and refer onwards to Google." Maybe Google is being used to mask the fact that there are other DNS engines actively involved in that country, and in other North African countries doing the same thing. So what you find is that trying to infer the DNS from the observation of behaviors is far more complex than we thought, and simple tests that use a simple model often get misled badly. One little thing: should you use V6 for the DNS? I don't care. Go and do it yourself; it's your problem, not mine. What I'm trying to talk about is ways to measure it. Let's do this data-driven; let's drive the whole thing from data. And you sit there and go, "I have no idea how to give you good, reliable data that measures that artifact in a way that lets you make a good measurement out of it." The DNS seems to resist that completely. I don't know how long that took, Warren, but that's it.
Warren Kumari: You've still got time. We've still got time, hopefully there are questions. Maybe you covered everything. Yay, a question in the queue! There we go. Okay, you first.
Wes Hardaker: Thanks Geoff. Amusing as always how the DNS is broken. So I will reveal the fact that in my graphs earlier for my .internal test, I didn't show the right-hand side of the graph, where I measured PCAP queries long beyond the point at which my experiment had ended. And sure enough, there are little peaks for the same domain names long after I actually told RIPE to stop asking. So your notion that things run in a circle... I think Warren's convinced me before that some of that could be prefetching for caching, that some cache is trying to auto-refresh some data, whether or not that's true. Uh, shoot, there was something else I was going to say, but I've forgotten it, so I'll get back in the queue if I remember.
Geoff Huston: Repeats are interesting, because in our measurement we're seeding the world with unique DNS names, and we take care to put the time in the name. So when I see a repeat, I actually know how old it was: this was an ad that was delivered two years ago, you know, because it's got the time in the name. And there's some interesting work in looking at ISP-based queries, coming from the same AS as the user, versus queries that come from, say, Google or Cloudflare, the open DNS resolvers. Is the age profile the same? Or do we use open DNS resolvers for other purposes? There is a huge amount of log replay. Why? Because the DNS is free, and there are a lot of folk who capture bits of the DNS and just replay it again and again and again, which adds to the noise factor without adding to the signal factor. And it's remarkably hard to catch those replays unless you're putting the time in the query. So the problem inside all of this is that when you're looking at queries, you have no idea why the query happened. It's not clear whether this is a timeout, a replay, or just, oh, it's a Wednesday, or a hammering, aggressive cache refresh. It's really hard to tell. And when you start to try and infer behavior from that resultant log of queries, you're in swampy ground, is all I'm saying. It's not clear.
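The time-in-the-name trick Geoff mentions can be sketched as follows. The label format here is invented for illustration; the talk does not describe APNIC's actual encoding, and a real deployment would also add per-impression entropy so each ad gets a globally unique name.

```python
import time

def make_probe_name(domain, now=None):
    """Build a timestamped DNS name for one ad impression (hypothetical
    format: 'u' + hex seconds-since-epoch as the leftmost label)."""
    ts = int(time.time() if now is None else now)
    return f"u{ts:x}.{domain}"

def query_age(qname, now):
    """Seconds between the name's embedded timestamp and `now` --
    a replayed query dates itself."""
    label = qname.split(".", 1)[0]
    return now - int(label[1:], 16)  # strip the 'u' prefix, parse hex

name = make_probe_name("exp.example.net", now=1_700_000_000)
print(name)                                 # u6553f100.exp.example.net
print(query_age(name, now=1_700_000_060.0))  # 60.0 -> a fresh query
```

A query arriving years after its embedded timestamp is, by construction, a replay or a stuck retry rather than a live user waiting for an answer.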
Wes Hardaker: Yeah, so I did remember the other thing I wanted to talk about. But following on your discussion a second ago: I assume you know that somebody implemented a file system in DNS just by transferring bits. Or maybe they did it in ping or something like that, because you have enough round-trip time to the other side of the world that you can send packets out and expect them to come back, and as long as you keep repeating those packets you never actually need to store the data on disk. But the other thing I was going to mention: I believe you said that you were expecting sibling glue. And if that's the case, you really shouldn't be trusting sibling or parent glue. There's a paper at NDSS, which was two weeks ago—I saw the presentation—showing the percentages of resolvers that believe it, such that you could cache-poison via the sibling, or sometimes even via the parent. Which obviously is not good.
Geoff Huston: There is a much deeper issue here, and it refers to the DNS discovery process. In DNSSEC, answers are signed; the path to the answer is not. And very few folk actually make the distinction between NS records in the parent and NS records in the child, because only the child records are signed. The way it works out is that DNSSEC doesn't give a flying how you got the answer; it's just the answer that counts. And you think about that going, "Is that right? Should I really care about how I learned the answer?" In some ways, I'm actually on the side of the existing DNSSEC approach. I don't care where I got the answer. If the signature says it's good, it's current, it's valid, I'm going to use it, and I don't care where I got it from. And if we're obsessing over the path and how the path was discovered, maybe we're wasting our time. And you go, "Well, that's just a Geoff comment." But it strikes at the heart of the DELEG working group. Because if I really don't care how protected a delegation is—it's the answer that counts—why are we wasting our time trying to sign these records in a different way? It's a fascinating question. I don't have an answer. But I have an opinion.
Warren Kumari: Mr. Peter.
Peter Thomassen: Thank you, quite interesting measurements. I have two questions. The first: I understand why you want to do the measurement with the IPv6-only name server and then return addresses for the web server in both address families. But can you explain again why you made the distinction with the gluelessness? I didn't understand what the upshot of that is. And the second question: have you considered doing the same thing with a CNAME, where the pixel hostname is on an IPv6-only name server, but as a CNAME pointing to some name that is on a name server... yeah.
Geoff Huston: So the problem is in the web-fetch case: you don't get the web object. Is it because you couldn't send the DNS query, or because you didn't receive the DNS response? I can't tell. And if you got the response and just got bored and walked away, I'm going to count that, incorrectly, as "you don't do V6 in the DNS." So I'm interested to learn: did you get the answer? Because what we found in doing the whole bunch of V6 transition measurements was that often you could ask all you wanted; the answers didn't come, because the filters were on incoming, not outgoing. And I was trying to isolate that factor of "did you receive the DNS answer?" Glueless is one way of testing that, via a consequent query. If you did the next query, you got the answer. If you didn't do the next query, it's safe to assume you didn't get the answer. Why? Because it's really hard to interrupt the DNS. Once a recursive resolver has started on a problem, it kind of mirrors the obsessive-compulsive behavior of the programmers: it will not let go until an eon of time has passed, or, in the case of Unbound, it will never let that question go. It's a question that deserves an answer, forever. So that's what I was trying to test. The CNAME: I haven't worked hard on CNAMEs. For a long time CNAMEs were variously implemented, and the whole synthesis of the CNAME and trying to understand the resolver behavior made my brain explode. It's probably a good idea, and it's worth trying, yes. But I didn't naturally go there.
Job Snijders: Job Snijders, Fastly. I cannot help but wonder: you lament the lack of a TTL, and that there might be queries looping around forever and ever. But then Peter points out the natural phenomenon of bitflips, and I wonder: is this evidence of an intelligent design, that we in fact do not see queries loop around forever and ever, because eventually they'll hit some memory that flips a bit and it stops? So, I mean, we've got to get some theological experts in here.
Geoff Huston: You know, it is a fascinating question. We commonly say that at the root servers between 70 and, I think Duane, if he's still here, said up to 90% of queries get NXDOMAIN. We're spending all this money building root servers where the answer is a really fast no. You could take a year to answer no; the answer's still no, you know. What's the obsession? But there's this other question of whether there's something waiting for the answer, or whether it's just the DNS talking to itself. And there is certainly an assumption, and our experiment tends to show, that if you take a large-number average, at least 40% of DNS queries are bullshit: no one's hanging around for an answer. It's not related to a user event in the recent past, because those time signatures say that if it's in the first few seconds it's a live user waiting; if it's beyond 30 seconds we've all moved on with our lives; and the number of queries that are as old as shit is disappointingly high. And the amount of engineering time and money we spend on the DNS to keep the bullshit answers coming quickly strikes me as misplaced resource management. It's not worth it. Now, if only DNS queries themselves said, "Here's how long I need; if you can't answer by then, give up and go home," that would be an interesting change in the DNS. Probably better than DELEG, nudge nudge, hint hint. Thanks.
Warren Kumari: All right, thank you. Um, well, always more work to do. Thank you everybody and many of y'all have a long flight home. During this time please think of what you're going to propose for the next IEPG meeting. DNS related, BGP related, other things related, whatever you have. And if you do not have a long flight home, you can do it over dinner. Thank you everybody.
Warren Kumari: Goodbye Wes, thank you.
Wes Hardaker: Bye Warren, thank you.
Warren Kumari: Bye Wes, bye Jen, thank you.
Wes Hardaker: Sure, leave Lorenzo out.
[Recording ends]