**Session Date/Time:** 16 Mar 2026 This is a verbatim transcript of the LSR session from IETF 119 in Brisbane. **Yingzhen Qu:** Okay, it's nine o'clock in the morning. Let's get started. Good morning, everyone. Welcome to Brisbane, and welcome to the LSR session. I'm Yingzhen Qu, and I have my co-chairs Acee and Chris online, joining me remotely. So, we have a relatively light agenda, so let's get started. We don't have volunteers for the note-taking, so for people who speak or make comments at the mic, please check the notes and make sure your comments get recorded. Okay, let's start with the working group status. Maybe I should step to this... Okay, how about this mic? Can you guys hear me okay from this mic? For remote people? **Acee Lindem:** Yes. Yes, it sounds good. **Yingzhen Qu:** Okay, yes, I want to make sure this mic works well. So, it's Monday morning, so we need to look at the Note Well. It basically says that by participating in the IETF you are agreeing to follow the IETF process, behave well, and be polite, and there are pointers in these slides; if you are interested, you should read the details. And if there are any questions, talk to the chairs, talk to the AD. And especially for people in this room, make sure you sign in using your phone. Then you can join the queue and ask questions, and we'll also record the right number of participants for the session. And we don't have any volunteers for the notes, so, you know, I also sent a link in the chat, so please make sure you help with the note-taking. We have four RFCs published since IETF 118. We have two data models, for OSPF and IS-IS segment routing, finally published. And also the IGP unreachable prefix announcement. That one has gone through a long history. And we have the Anycast Property advertisement in the RFC Editor Queue right now, so this one hopefully will be published soon. And these three drafts have been submitted to the IESG for publication.
The link infinity draft—there was a DISCUSS from the IESG, and the authors have cleared it, so the draft is ready to move forward. Yeah, okay, and hopefully it will move to the RFC Editor Queue anytime soon, right? And the flex-algo YANG models are now in Gunter's hands, pending AD evaluation. You want to say something? **Gunter Van de Velde:** Just a quick update. So, the first one, link infinity, is pushed to the RFC Editor right now, so that's a good thing. And the other two are on the agenda for the next telechat. **Yingzhen Qu:** And we adopted the [draft-ietf-lsr-flood-reduction-arch](https://datatracker.ietf.org/doc/draft-ietf-lsr-flood-reduction-arch/) and Tony P is going to give an update on this today. And Tony Li, we actually have a question for you about this draft. I think it jumped between Informational and Experimental, so... **Tony Li:** We would love to move it forward. We don't care whether it's Experimental or Informational. It doesn't matter. **Yingzhen Qu:** Uh, I think there are some new requirements for a draft to be Experimental, is that correct? Yeah, I haven't looked in... **Gunter Van de Velde:** Gunter speaking here, Routing AD. I wouldn't really call it new requirements. The... a more modern philosophy, let me call it that, is that in the IESG there is this belief that an experiment should be an experiment as such. So it's still being debated to some degree what an experiment actually means. So, yeah. So typically an experiment, you know, has a start date, an end date, and a particular target. So, yeah. That is where we are right now. Yeah. **Tony Li:** Okay, well, we have a start date, we've already shipped the code. We have an end date of never, um, because we've shipped the code, we're not going to stop shipping it. Um, so you know, if you don't want to call that an experiment, then we... I guess we'd like it to be Informational. **Yingzhen Qu:** So, I think you... I forgot we...
you started from Experimental, right? You started... and then I believe it switched to Info. You changed the track around when it was adopted by the working group. We... I don't know whether it was a mistake or you intentionally did that. **Tony Li:** Um, doesn't matter. Um, we don't care what it is. What would help it move forward? We don't want to have an argument, there's no point in an argument. **Tony P:** Tony, Tony, there was not an argument, we just wanted to make sure that you were aware that it changed. Um, we don't have a problem with it either way, and we'll move it forward. We just want to make sure it was not a mistake, that it was your intention to change the status, the track. **Tony Li:** I... I probably changed it because somebody asked me to. **Acee Lindem:** I... I'd just say that, uh, if you follow the precedent that has been set for other drafts that have, uh, been implemented by a single source, we've made those Experimental in the past. I don't know about these new, uh, requirements for defining the experiment and the outcome and everything like that, but... **Gunter Van de Velde:** Yeah, so Gunter again here. So from my perspective, you know, I think both of them are good. So I'm very pragmatic, hey? They will both help. To me personally, I think Experimental is probably closer to the intention here, I guess, so I'm happy with that. If that causes problems then I will handle them at that point in time with the IESG. I think the most important thing is... **Yingzhen Qu:** The chairs had a discussion about this draft and we thought it was Experimental, then we noticed the track changed in between, so we just want to check with Tony, to make sure we are on the right track. Yeah. **Tony Li:** I will happily change it back at the drop of a hat. Just let me know what you want. Yeah, okay.
**Yingzhen Qu:** And we have two drafts that are in the adoption queue. I think the authors have requested an adoption call, so we probably will start the process after this IETF. If people have any comments or want to review them, please do. **Gunter Van de Velde:** So, this thing is really low, hey? Voilà, better. So one thing I've been noticing with the current IESG is that they're much more rigid on what is in charter and what is outside the charter. And what we have seen so far is that... so one of the things I think we have in LSR is that if we have a document which is like maintenance, uh, there should be a milestone for it. So if we are thinking about adopting them and they are adopted, we should actually add a milestone to avoid trouble down the line. Okay. **Yingzhen Qu:** Uh, how about we do this? If we are ready to adopt something, we put it in the milestones. Otherwise, if it's just an individual document, we just let it sit there. Yeah, we need to remember to do that. **Acee Lindem:** I... I think we can do it upon adoption. **Yingzhen Qu:** Right, yeah, that's just one extra step for the adoption. Yeah, yeah, once it's adopted. We don't need it at the start of adoption. **Tony Li:** Yeah, I'm wondering, how many years have we been running this working group? We almost never use milestones, and we don't worry too much about the charter. I just love all the process that's suddenly appearing. **Yingzhen Qu:** Oh, well, yes. Well, it's just one small extra step, so not that big a deal. And there are a bunch of YANG models in the working group. I will do a presentation later about this, so... **Tony Li:** Welcome to ISO. **Yingzhen Qu:** And these are the existing working group documents. Um, I think the first one is close to being ready for last call. The authors have, uh, communicated with the chairs that they will, um, probably make one more revision and then it will be ready for last call.
And the last one, the multi-topology for segment routing—that one has dependencies on two other drafts; one of them is ready to be published, and the other one is just a SPRING working group document, so we'll just have them sit there and wait. And that's today's agenda. Any comments about the agenda? Any changes we need to make? If not, Lian, you are the next one. Any comments about the working group? Okay. **Lian:** Good morning everyone. I'm Lian from China Mobile. Um, I'm onsite for the first time, so nice to meet you all. And, um... hold on. Sorry, we're waiting for the slides. **Yingzhen Qu:** The screen froze, we have a little problem. I think my MeetEcho crashed. Wait a minute. What's wrong? **Tony Li:** Lian, are you able to get back to it or do you want me to try to display the slides? **Yingzhen Qu:** Give me a second, let me try. Yeah, okay. I think I... you can still see the screen but my browser already crashed. The MeetEcho session crashed. Yes, we see the agenda slide. **Lian:** Okay, thank you Yingzhen. And, uh, it's a great honor to be here to present our draft on behalf of all the co-authors. Our draft is [draft-ietf-lsr-l2-bundle-member-remote-id](https://datatracker.ietf.org/doc/draft-ietf-lsr-l2-bundle-member-remote-id/). First I want to give an overview of this document. It describes how OSPF and IS-IS would advertise the remote interface identifiers for L2 bundle members. And the key purpose is to enable the controller to know, uh, which interface on one side is connected to which interface on the other side, uh, for topology management or traffic engineering scenarios. The initial document was presented at the IETF 119 meeting around March 2024, and it became a working group document in March 2025. And we have published two updated versions, uh, based on the discussion from the mailing list and also the meeting discussion.
Um, so today I will present all the updates, uh, in these two versions. First, we clarified the use case description and added some text to, uh, describe the traffic engineering and bidirectional path computation scenarios. Um, second, we improved the BGP-LS specification by adding an explicit reference to the peer adjacency link from RFC 9086. Uh, we think that this makes, uh, it clearer and more consistent with the current protocol. Third, we completed the security considerations, adding some comprehensive security text covering OSPF, IS-IS, and also BGP-LS. And, um, overall, this document doesn't introduce new security issues, so all the current solutions for these protocols are applicable. Um, the fourth one is about how the remote, uh, identifier is acquired. As we all know, the IGPs, uh, have no direct way to exchange the L2 bundle member link identifiers, so, uh, we rely on the L2 protocols. How to acquire it is out of, uh, the scope of our draft, but we gave two proposals and also some advice. Um, the first one is LACP and the other one is LLDP. LLDP runs on the physical links to exchange, uh, neighbor information, and the Port ID TLV can carry, uh, this remote ID. So, um... in the previous version we mentioned the Management Address TLV, and that was not correct, it was a mistake, so we corrected, uh, it and now it points to the Port ID TLV, which is mandatory for LLDP. And finally, we updated the IANA considerations, uh, adding the MP field to align with the IS-IS protocol. And those are all the updates to this draft. Uh, we think now it's more mature and we are ready for working group last call, and we always welcome comments and suggestions at any time. Uh, thank you. **Yingzhen Qu:** Jeff, please go ahead. **Jeff Tantsura:** Sure, can you hear me? Try again, can you hear me? **Yingzhen Qu:** Yes. **Acee Lindem:** Yes. **Jeff Tantsura:** Thank you for coming.
Lian, thank you for the presentation. I see though your primary mechanism is LACP. Has there been any discussion about BFD for LAGs? **Lian:** BFD? **Jeff Tantsura:** Yes. So LACP is one way that LAGs can be formed with different port IDs. Uh, another common way of doing this type of, uh, you know, LAG is using BFD. **Lian:** But I don't think it's related to the BFD protocol, because we just get the remote identifier to advertise in the IGP so that BGP-LS can gather it for the controller. So, um, I didn't catch why it's related to the BFD protocol. **Yingzhen Qu:** So Jeff, my understanding is you are suggesting that's a possible solution to get the remote ID, right? **Jeff Tantsura:** I am suggesting that LACP is the most common way that these types of LAGs... and BFD is another possible way. Exactly. And since this is an IEEE vs IETF problem... **Yingzhen Qu:** Yeah, I think Lian can work with you offline on whether that's another possible candidate solution for that part. But that is actually clearly out of scope of this document, it's just some suggestions. But anyway, yeah. **Lian:** Yeah, we can discuss offline, and maybe, if it's suitable for the proposal, we can add it to the draft. **Yingzhen Qu:** Sure. Les, you are next. **Les Ginsberg:** Thank you. Yeah, I just wanted to reinforce... frankly, Jeff's comment confused me. I really don't see what BFD has to do with this draft, but maybe he can enlighten us. Thanks. **Acee Lindem:** Yeah, I was just going to say, and Yingzhen said this, we don't want to go down a rabbit hole with the way that the IDs are discovered, since that's not part of the draft anyway. I mean, it's required as a prerequisite, but which way... so I don't think we need to mention every way in it, nor do we need to put BFD in this. I think the two existing ways that are there are enough, myself. I'm speaking as a working group member.
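As an editorial aside on the discovery mechanics discussed above: the draft points at the mandatory LLDP Port ID TLV as the carrier for the remote interface identifier. The sketch below shows how that value could be pulled out of a raw LLDP PDU, assuming only the standard IEEE 802.1AB TLV layout (a 16-bit header with a 7-bit type and 9-bit length; Port ID is TLV type 2, with a subtype byte before the ID). The PDU bytes and the interface name are invented for illustration; this is not code from the draft.

```python
import struct

LLDP_TLV_PORT_ID = 2  # IEEE 802.1AB TLV type for Port ID

def parse_lldp_tlvs(payload: bytes):
    """Yield (tlv_type, value) pairs from a raw LLDP PDU payload."""
    offset = 0
    while offset + 2 <= len(payload):
        header, = struct.unpack_from("!H", payload, offset)
        tlv_type = header >> 9       # upper 7 bits: TLV type
        tlv_len = header & 0x1FF     # lower 9 bits: value length
        offset += 2
        if tlv_type == 0:            # End of LLDPDU TLV
            break
        yield tlv_type, payload[offset:offset + tlv_len]
        offset += tlv_len

def extract_remote_port_id(payload: bytes):
    """Return (subtype, port_id) from the mandatory Port ID TLV, or None."""
    for tlv_type, value in parse_lldp_tlvs(payload):
        if tlv_type == LLDP_TLV_PORT_ID and len(value) >= 1:
            return value[0], value[1:]
    return None

# Hypothetical PDU fragment: a single Port ID TLV, subtype 5 (interface name)
pdu = struct.pack("!H", (LLDP_TLV_PORT_ID << 9) | 9) + bytes([5]) + b"Eth1/0/1"
print(extract_remote_port_id(pdu))   # prints (5, b'Eth1/0/1')
```

The subtype byte matters in practice: the same physical port can be named differently depending on whether the neighbor advertises a MAC address, an interface name, or a locally assigned string as its Port ID.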
**Yingzhen Qu:** Yeah, I think the draft clearly says that how to get the remote IDs is out of scope of this document, but, you know, there are some recommendations on how you can get it... the possible ways you can get it. So it's up to the authors. **Lian:** Okay, we can discuss further. **Yingzhen Qu:** Any other questions? Thank you. **Lian:** Thank you. **Yingzhen Qu:** The next one is mine. Can remote people hear me okay now? So, um, I'm going to do a quick update on the YANG drafts. Okay, so here is the list of YANG models we have published, um, in the LSR working group. The last two, in blue, are the ones we published since the last IETF, since 118. And we have been trying to define YANG models together with the specification if the specification is not so large and the model, we think, is okay to be included in the draft. So, um, if you look at this one, we did that for the administrative tag. For these two drafts, the anycast property advertisement and the unreachable link in OSPF, we have data models included in the draft as well. And for the link infinity one, there is an IANA model included in the document. And I'll give some details later in the slides about how to define an IANA model and how it can be used. And the OSPF and IS-IS Flex-algo YANG models, those two have been submitted to the IESG, as I mentioned during the status, and they are in IETF last call. So here is an example of the IANA model. When we defined the base OSPF model, we didn't have a way to, you know, handle the registries IANA was maintaining. Now we have a way: we have those code points in the IANA registry, so we can actually define a corresponding IANA model for those registries. So for example, the one I have here is the metric type. If you look at each identity, those are the registry values defined there.
So the advantage of doing so is, you know, later if a document extends the metric type, since we already have the metric type configuration and status available in an existing model, we actually don't need to define a new model anymore; IANA will just add the new registry value for it. Another example: let's say we have the MSD type, and we already have ways to configure it and to query its status. If later a draft defines a new MSD type, we don't need to define any new model for it. This is just a way to show how it can be defined; for the link infinity one we actually define the functional capability bits using an IANA model, so for any later draft that's using the functional capability bits, we don't need to define new models anymore. But if you need any help with this, let me know, I'd be happy to help. So, and we still have the augmentations for additional features, and these are the list of features. We did move the Flex-algo to a separate model in order to get it published sooner. Going forward, we will probably pick the important ones and get them published in line. Same for IS-IS. And we have the IS-IS PICS model. We have three models for IS-IS PICS. Um, we had quite some discussions when we adopted this draft, but since then it has been quiet. I suppose, you know, if there are no further comments, we will get this model ready to be published. So, especially if you're a service provider, please take a look and see whether this model will be useful for you, or if you have some comments or want to add something, let the authors know. And the IS-IS and OSPF SRv6 YANG models, these two are pretty much waiting for the SRv6 base model in SPRING to be published. And this is the sequence in which we think we might be able to get these models published, how we plan to publish these models. Any questions? This is just a quick update. Otherwise we'll go to the next presentation. Tony, are you there? **Tony P:** Yep, I'm here.
Let's hope I don't get squelched off again. **Yingzhen Qu:** Yeah, and you have control of the slides. Please go ahead. **Tony P:** Thanks so much. Alright, so this is an update on the [draft-ietf-lsr-flood-reduction-arch](https://datatracker.ietf.org/doc/draft-ietf-lsr-flood-reduction-arch/). What we're seeing, interesting things. Let's fire it up. Alright, so what happened between last time and now? Um, a pretty massive readability rewrite based on some feedback. Perplexity AI is to be blamed for anything that looks funny. Um, the big change is that the original draft was suggesting that the HSNPs have some kind of ranks. Yeah, "levels" is a bad word because we always get confused with IS-IS levels. That has been abandoned completely. I'll talk about why. Um, and you know me, I like to heat up CPUs a lot, so I did quite a lot of work on actually building an emulation of very large networks, doing refreshes over very long periods of time, generating HSNPs and all those hashes to look at collision probabilities, right? To get some kind of a handle on those things. And one practical thing fell out of this work for the fragment Fletcher, right? Because the HSNP hash for the fragment is a Fletcher, we changed that. Um, we added the IS-IS MTU length or PDU length that is in the packet, which is actually a good source of entropy as well, to get those hashes more evenly distributed. Alright, so why did we abandon this level or rank? Because with the implementation we're seeing that there isn't anything like that, because different parts of the LSDB compress at different densities, and that depends on a lot of things, right? So the strategy an implementation may choose is not something that we can standardize, and it not only varies over time but also per peer, right? Because talking to different peers, these peers may have LSDBs in different states, so agreeing on consistency, on what needs to be exchanged, can vary not only over time but also per peer.
Alright, and the strategy a node can choose to actually decide how much it compresses—so you know, how many nodes it covers with a hash—can be based on a lot of different variables. So you can keep some statistics, right? About how many mismatches you see appear, and if you see a lot of mismatches you can come to the conclusion that it's not worth compressing anything on this part of the database. A link flap is very different from a node reboot, right? Because if the neighbor reboots it basically loses the whole database, so there is nothing to talk reasonably about at a high level. Whereas if a link flaps, the databases are basically synchronized on both sides, so HSNPs, you know, large hashes or hashes covering a large part of the database, can help a lot. People can have different caching strategies in an implementation for all those hashes or whatnot, so that also can influence how dense they want to make those HSNPs or, you know, how large these hashes are in terms of covering parts of the database. And then there are other considerations, like hashing of CSNPs: for example, generating CSNPs may be extremely cheap compared to HSNPs, so you may want to have hashes that actually trigger a lot of CSNPs if they mismatch, because they're very easy for you, or maybe not. Um, and I think last time the topic came up, but to emphasize it again, agreeing among all the peers on what ranges the hashes cover seems like an easy and natural idea, but it's actually not feasible, for reasons that I discussed last time at the mic with Chris. Good. So, a good analogy for why these HSNPs are useful is that you can consider them basically a gradient descent, if you're familiar with, you know, like the most basic stuff in AI.
So with these HSNPs, you know, the optimum is both databases being synchronized, and the HSNPs can be considered the large steps on a gradient descent, right? And if the hashes match, it means you descend very quickly towards a database that is consistent on both sides, and if they mismatch you basically have to take smaller steps. Alright, I hope people can follow that; if you're familiar with either of the drafts it makes sense. But what it shows here is that this gradient descent analogy, if you think it through, leads to basically fairly simple rules, right? Basically, when you get an HSNP hash over a range of nodes that mismatches, then you have to take smaller steps, and the smaller step is either to send HSNP packets with more specific hashes for the mismatch, right? So kind of disaggregate the range into, you know, smaller hashes. Or you can react by basically sending CSNPs for the mismatch directly, which are really small steps, which is what the protocols are doing today if you think about it. Or, at the very smallest step, where you can be pretty much sure that the databases are inconsistent on the ranges, you basically flood all the involved fragments, right? So of course it varies per implementation, depends on a lot of considerations about, you know, optimal strategy and so on, but in architectural terms this is basically going from a big step on a gradient descent to smaller steps which will be more precise and get you in the right direction. And at the point in time where an HSNP hash is covering just a single node ID, which means all the pseudo-nodes and all the fragments of the node and the pseudo-nodes, then the action becomes just the last two steps, right? Because you cannot send a more specific HSNP hash; the smallest resolution of an HSNP hash is basically, you know, a node ID. So then you either send CSNPs, PSNPs, or otherwise you flood directly.
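The big-step/small-step rules described above can be pictured as a toy recursive comparison over node-ID ranges. This is an editorial sketch, not the draft's wire procedure: `blake2b` with a 6-byte digest stands in for the 48-bit Fletcher-based hash, the split-in-half disaggregation is just one possible strategy, and the zero-to-one substitution follows the rule the talk mentions.

```python
from hashlib import blake2b

def node_hash(node_id, fragments):
    """Hash of one node: XOR of per-fragment 48-bit digests
    (blake2b here is a stand-in for the draft's Fletcher hash)."""
    h = 0
    for frag in fragments:
        d = int.from_bytes(blake2b(frag, digest_size=6).digest(), "big")
        h ^= d or 1   # rule from the talk: a zero fragment hash becomes 1
    return h

def range_hash(lsdb, lo, hi):
    """XOR of node hashes over the node-ID range [lo, hi]."""
    h = 0
    for nid in lsdb:
        if lo <= nid <= hi:
            h ^= node_hash(nid, lsdb[nid])
    return h

def synchronize(local, remote, lo, hi, actions):
    """Gradient-descent-style sync: big steps while range hashes match,
    smaller ranges on mismatch, CSNP/flooding at single-node resolution."""
    if range_hash(local, lo, hi) == range_hash(remote, lo, hi):
        return                            # big step: whole range consistent
    in_range = [n for n in sorted(set(local) | set(remote)) if lo <= n <= hi]
    if len(in_range) <= 1:                # smallest resolution: one node ID
        actions.append(("csnp-or-flood", in_range[0]))
        return
    mid = in_range[len(in_range) // 2]
    synchronize(local, remote, lo, mid - 1, actions)   # disaggregate into
    synchronize(local, remote, mid, hi, actions)       # more specific hashes

local = {1: [b"frag-a"], 2: [b"frag-b"], 3: [b"frag-c"]}
remote = {1: [b"frag-a"], 2: [b"frag-B"], 3: [b"frag-c"]}
acts = []
synchronize(local, remote, 0, 10, acts)
print(acts)   # only node 2 needs a CSNP exchange or direct flooding
```

Matching ranges are dismissed in one comparison; only the mismatching branch is refined, which is the "descend quickly while hashes match" behavior the analogy describes.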
Um, so that's, I think, helpful if you think about why the whole thing works and what will ultimately be, you know, the precise language of the standardization for the whole thing to work correctly. Alright, now minor changes. So there is the funky case where a Fletcher of a fragment can become zero, which is actually, funny enough, a valid case, and the draft says replace it with one so that during XORing into the HSNP range the fragment stays visible. Um, we looked at the case where a different set of fragments with different Fletchers will create the same node hash, and when you look at the probability of that stuff, it's so incredibly unlikely it's not worth considering further. You'll find something on that in the draft. Um, and there was a stupid idea, initiated by me as very often, that if we have an HSNP packet, in the range of the HSNP packet we have multiple hashes, right? And there may be holes in those ranges of the HSNP hashes. We could advertise the missing range with a hash of zero indicating absence, but that's actually dumb, because what could happen is that multiple node hashes XOR each other out and the HSNP hash for the node range is actually zero; it's a valid case. So that needed to be removed. I'm again getting into a little bit arcane details here. Anyway, the meat of the stuff is that, um, I burned a lot of CPU power to look at the hash collisions. And basically, um, what I built is a perfect emulation of a network where everybody is refreshing. I was running 32 networks, just the number of cores that I had, each network 50k nodes and in total about a million fragments. Um, and each network was running for two years, um, refreshing, right? Refreshing and a little bit more than that. I used roughly MaxAge to refresh fragments, somewhere a bit lower, like 50k seconds or something like that.
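The zero-substitution and zero-cancellation points above are easy to demonstrate with the classic 16-bit Fletcher (the draft's hashes are longer, but the behavior is the same): an all-zero fragment checksums to 0 and would be invisible under XOR, so it gets replaced by 1; and two identical hashes still cancel to a legitimate zero, which is why zero cannot double as an "absent range" marker. A small editorial sketch:

```python
def fletcher16(data: bytes) -> int:
    """Classic Fletcher-16 checksum: two running sums modulo 255."""
    c0 = c1 = 0
    for byte in data:
        c0 = (c0 + byte) % 255
        c1 = (c1 + c0) % 255
    return (c1 << 8) | c0

def fragment_hash(fragment: bytes) -> int:
    # Rule from the talk: a checksum of zero is replaced by 1, so the
    # fragment stays visible when hashes are XORed into a range hash.
    return fletcher16(fragment) or 1

def xor_hashes(hashes) -> int:
    h = 0
    for fh in hashes:
        h ^= fh
    return h

print(fletcher16(b"\x00" * 16))      # 0: all-zero data checksums to zero
print(fragment_hash(b"\x00" * 16))   # 1: substituted so it is not invisible
# Identical fragment hashes cancel under XOR: a valid zero range hash.
print(xor_hashes([fragment_hash(b"frag"), fragment_hash(b"frag")]))   # 0
```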
I did not make the node IDs too different, so the node IDs which are used in the hashing were only differing by three bytes, which is kind of realistic, or, you know, you can argue already a fairly suboptimal case if you have a 50k-node network and basically the node IDs only differ in three bytes. On only 5% of refreshes I was changing the IS-IS PDU length, so assuming that the network is basically refreshing the same thing all the time. Um, the checksums I assumed would be moving in a fairly uniform fashion, and the checksums will be changing because the sequence numbers are bumping, right? And the sequence number is in the IS-IS checksum. Um, and while the stuff was running, every time something refreshed I was checking the whole database for a collision of an HSNP fragment hash. So, pretty much mirroring, you know, the reality of that stuff as far as it mattered. Now, what did we see? A 32-bit HSNP hash is too small; that will collide too much. So we're running 48 and 64 bits. Um, an astute observer will realize that there is no 48-bit Fletcher, but it's very easy to extend Fletcher to actually do 48. Um, and I will show you a graph, right? And this is the quintessential graph that shows that actually 48 bits works better than 64 bits, which is kind of surprising, and I have some gut feeling why that is, but no real data to back up, you know, what would cause the effect, or, you know, a foundational theory, but I see it on the graph, so you see the graph. So horizontally you will see the distance between the fragments that collided in the LSDB, right? And how many collisions. And on the vertical axis you see the duration. So the duration was, you know, somewhere a little bit less than the lifetime of a fragment. So we're talking about 10 hours, something like that. Now, the red line means that only a single HSNP packet summarized the whole database, right?
And you see now why it is important whether it's left of the line or right of the line. Because right of the line means that the collision distance was so far apart that it didn't matter. It was not in the same HSNP hash. So for practical purposes those collisions did not matter. Left of the line means that it was possibly in the same HSNP hash, and that means that if you XOR both, they both vanish. Alright, I hope people can follow that; if you're familiar with either of the drafts it makes sense. But what it shows here is that with 48 bits, lots of these collisions didn't matter. Albeit it generated, you see at the top at the very right, 230 collisions, whereas the 64 bits only generated 190 collisions. We're talking here something on the order of 36 billion refreshes, right? 32 networks running over two years, each of them a million fragments, so to even get those 200 collisions you need to run about 36 billion refreshes. Um, but let's go back to those lines, right? So on the 48 bits, what we're seeing is, if you generate a single HSNP packet for the whole database, which is the red line, you get 109 collisions. Um, there was a pointer I think somewhere, but I hope you can follow me. So that's the 109 number. So if you translate these 109 collisions over the whole thing, um, it basically means that we will be seeing about one collision, lasting about 10 hours, every year, okay? So that's what we're talking about. If we generate however like 70 or 80 HSNP packets to cover the whole database, we're seeing one collision, lasting 10 hours, in about 10 years. So you know, practically this is all pretty much irrelevant, because if you start to look at what will happen if you have to generate that many CSNPs, the flooding rates become completely unsustainable. So we can't actually use CSNPs, practically speaking, to synchronize these kinds of databases.
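As an editorial sanity check on those orders of magnitude: a crude birthday-style model, assuming uniformly distributed hashes (an assumption the talk itself questions), treats every refresh as drawing a fresh b-bit hash that can collide with any of the roughly one million hashes already live in that network's LSDB. The numbers below are this model's predictions, not measurements from the emulation.

```python
# Expected collisions ~ refreshes * live_hashes / 2**bits under a
# uniform-hash assumption; the talk's emulation parameters are plugged in.
def expected_collisions(refreshes: float, live: float, bits: int) -> float:
    return refreshes * live / 2.0 ** bits

REFRESHES = 36e9   # ~36 billion refreshes across the 32 emulated networks
LIVE = 1e6         # ~1 million fragments per network

for bits in (32, 48, 64):
    print(f"{bits}-bit: ~{expected_collisions(REFRESHES, LIVE, bits):.3g}")
```

Under this model, 32 bits predicts millions of collisions (hence "too small"), and 48 bits predicts on the order of a hundred, close to the 109 to 230 the emulation reported; a uniform 64-bit hash would predict far fewer than one, so an observed count near 190 already hints that the 64-bit hashes are not behaving ideally.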
So you know, unless the flooding is absolutely perfect, um, we don't have a solution until we go to something like these HSNPs. And on the 64 bits you see that albeit we have fewer collisions, they cluster much closer, which means that they will be more relevant. We'll actually see more relevant collisions than with the 48-bit hash. Um, one thing that remains to be seen is probably to go look at something like SipHash. I had some discussion with Job Snijders, you know, we may go and look at the stuff. The problem with that is of course that everybody knows Fletcher; when we start to go SipHash, we have different variants people have to implement and so on and so on. Um, but you see that, um, the numbers with 48 bits look like it's practically absolutely viable to run those HSNPs; the collisions will not show up in any relevant form and fashion to matter. You know, that was my conclusion. And yeah, I think that's all I have for the moment as far as these works go. Thanks. **Yingzhen Qu:** Thanks Tony. Um, questions, comments? I don't see anybody in the queue. **Tony P:** Yeah, and thanks to Les, right? He's keeping me honest and he's throwing a lot of good comments at the draft. Yeah, mhm. **Tony Li:** Uh, I switched my mics, hopefully this isn't horrible. Um, can you... you said you had a feel for the collision behavior. Uh, can you say quickly what you think it is? **Tony P:** Uh, sorry, I couldn't parse your question. What do you mean, "quick"? I don't get it. **Tony Li:** You said you had a feel for why 64 bits was not as good as 48. **Tony P:** Yeah, because it looks like when you go for the 48 bits and you run the hash over this sequence of things, the node IDs somehow have enough influence that they naturally partition the hash, you know what I mean? So the node ID kind of puts it into a different bucket.
Whereas with 64 bits the effect vanishes; the hashes become so uniformly distributed that they start to collide more. But it's just a gut feeling, you know; it's really hard to look at those things and have any conclusions, especially since, you know, running these 36 billion things is heating up a water-cooled machine for a day or two just to get this kind of graph. Mhm. **Tony Li:** Thanks. **Tony P:** But you know, after I run SipHash I may show up with something completely different, right? So this is just Fletcher right now. **Yingzhen Qu:** Okay, um, I don't see people in the queue, so thank you Tony. **Tony P:** Excellent, thanks. **Yingzhen Qu:** Okay, um, Tony Li, he is here in person. **Tony Li:** Ni hao. Um, I'm Tony Li, I'm here to present A Power Conserving Path Placement Strategy. And I'm told I have to eat the mic, okay. Um, last time in Montreal, Colby came up and presented, uh, all of our proposals for how to add, uh, power monitoring... power consumption information to IS-IS, and he got the question, how do you use it? Uh, this presentation is out to answer that question. Um, so the problem we're trying to solve is pretty simple. Uh, we have to provision networks for peak utilization. Uh, however, lots and lots of networks, especially eyeball networks, have daily utilization patterns where traffic falls off at night, because humans are driving it. If they stop streaming, guess what? Uh, so then we have excess capacity after everyone goes to sleep, yet we're running the network at full power. Uh, this can actually be significant. Um, some public numbers: uh, BT, the ISP for the British Isles, uh, consumes 1% of the power in the British Isles. And at the same time, uh, at their low traffic levels, they're 85% idle. So 85% of that 1% is being wasted. We would like to not do that, okay? So how about we turn things off when we're not using them?
Alright, what we'd like to do is when we have low traffic, we'd like to push traffic onto a small set of links, right? Use 15% of our links, maybe a few more for redundancy purposes, but turn off lots and lots of other links. Okay? If we can move the traffic and leave things idle, then powering things down and especially powering off the silicon saves lots of power. Doing this is not too difficult. We've described how to understand the power used by each node in the network and each interface. And now we're going to modify CSPF. Uh, CSPF is Constrained Shortest Path First, this is the algorithm we do for most traffic engineering, and yes, we're going to be presenting this in TEAS as well. So it's relatively easy to take CSPF and modify it to take into account power utilization. Before we add power to it, CSPF looks at each TE metric on a link and then computes a path from source to destination that follows the constraints that the network manager requests and then finds the path with the lowest cumulative TE metric. Once you start considering power, things aren't too different, okay? Again, we're going to respect all of the constraints, but now we're going to look at the power metric as well. What we're doing is to have each ingress node compute the power metric, which is influenced by the power information. Okay? The exact computation of this metric is out of scope. I.e., we're not talking about it, okay? It does not need to be identical on every node, every head-end can do its own thing, and this is a place where we can have varying degrees of innovation. The power information is all there in the IGP. Uh, you have the power on an interface, each interface belongs to one or more power groups. The power groups represent an abstraction of higher-level hardware that you can take a look at to understand the total power consumed by each interface.
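The modified CSPF described above can be sketched in a few lines. This is a minimal illustration, not the draft's normative algorithm: the function names, the link attributes, and the idea of passing the power metric in as a per-head-end callable are all assumptions for the example (the talk explicitly leaves the metric computation out of scope, so each head-end supplies its own).

```python
import heapq

def power_cspf(links, src, dst, constraints, power_metric):
    """Constrained shortest path minimizing a locally computed power metric.

    links: list of directed (u, v, attrs) where attrs is a dict of link
           attributes (e.g. 'watts', 'bandwidth').
    constraints: predicate deciding whether a link may be used at all.
    power_metric: per-head-end cost function over attrs; the draft leaves
                  this computation out of scope, so each node may differ.
    """
    adj = {}
    for u, v, attrs in links:
        if constraints(attrs):                 # respect all TE constraints first
            adj.setdefault(u, []).append((v, power_metric(attrs)))
    dist, prev = {src: 0}, {}
    pq = [(0, src)]
    while pq:                                  # plain Dijkstra over admitted links
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float('inf')):
            continue
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float('inf')):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Toy topology: prefer the path whose links draw less power,
# subject to a minimum-bandwidth constraint.
links = [
    ('A', 'B', {'watts': 10, 'bandwidth': 100}),
    ('B', 'D', {'watts': 10, 'bandwidth': 100}),
    ('A', 'C', {'watts': 1,  'bandwidth': 10}),   # low power but too small
    ('C', 'D', {'watts': 1,  'bandwidth': 10}),
]
path = power_cspf(links, 'A', 'D',
                  constraints=lambda a: a['bandwidth'] >= 50,
                  power_metric=lambda a: a['watts'])
# The low-power A-C-D path violates the bandwidth constraint, so A-B-D wins.
```

Dropping the bandwidth constraint flips the answer to the low-power A-C-D path, which is the "where it does not violate constraints" point made later in the Q&A.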
All of this is flooded in the IGP, can be then copied into the traffic engineering database (TED), and that's all described in this draft. Um, we also imply from that that certain things can be powered down. Obviously for redundancy purposes and for latency purposes, there are things that you may not want to power down, that's okay, um, we give you that level of control. Um, as I mentioned, the power groups, uh, describe the power consumed by components, hardware components in the box. And the draft gets into details about how to do this. Uh, in the case where an adjacency is on a lag, an adjacency may have multiple power group parents. So... And also to handle the case where we're dealing with lags and we've turned off members of a lag, we also track the unidirectional sleeping bandwidth. This way if a lag has been turned down by 50%, you can tell that and react accordingly. Sleeping links are an interesting challenge. As you know, IS-IS only distributes active adjacencies, and we actually want to distribute information about links that are asleep. We want to keep those in the TED for further computation. So what we did was to go ahead and take those adjacency TLVs, we defined them for OSPF and for IS-IS, and then allowed those TLVs to be flooded, but we indicate that they are sleeping. Uh, for IS-IS, that's easy, we have a link attribute that indicates the link is asleep, and as long as IS-IS sees that link attribute, it knows not to use that adjacency in the SPF calculation, yet the adjacency is in the TED and can be used for CSPF. Um, so... CSPF is usually done, especially offline, by an external controller, so this is just feeding the controller the information it needs. Um, most controllers then turn around and establish TE paths using RSVP-TE or SR-TE. Um, we would like the protocol, RSVP-TE, to be aware of the fact that it's using sleeping links, and we'll be presenting those extensions in TEAS later this week. And similarly for PCEP.
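The sleeping-link rule just described (keep the adjacency in the TED, but exclude it from ordinary SPF) can be shown as a toy filter. The record layout and field names here are hypothetical, standing in for the link attribute the draft defines.

```python
# Hypothetical link records as a head-end might hold them after flooding;
# the 'sleeping' flag stands in for the link attribute described in the talk.
ted = [
    {'link': ('A', 'B'), 'metric': 10, 'sleeping': False},
    {'link': ('A', 'C'), 'metric': 5,  'sleeping': True},   # powered down
]

def spf_links(ted):
    """Ordinary SPF must not route over sleeping adjacencies."""
    return [l for l in ted if not l['sleeping']]

def cspf_links(ted, allow_wake=True):
    """CSPF may consider sleeping links if the head-end is willing to
    signal a wake-up (via RSVP-TE, PCEP, or Netconf/YANG)."""
    return ted if allow_wake else spf_links(ted)
```

The asymmetry is the whole point: SPF sees one usable link here, while CSPF still sees both and can choose to wake the cheaper one.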
Uh, SR-TE doesn't need any signaling of course, just put the path in place and away you go. Uh, when the head-end or controller decides that it wants to use a sleeping link, it must somehow signal to the relevant nodes to wake up those links, okay? And that can happen again via RSVP-TE, via PCEP, or via Netconf/YANG, and again we'll be talking about those later. Okay. So that is what we're proposing, are there any questions? **Yingzhen Qu:** Gunter? **Gunter Van de Velde:** Yeah, so the draft says that the path choice is dynamic. Does it also mention make-before-break, or how do you ensure that traffic doesn't get dropped while you move traffic along? **Tony Li:** So, because this is TE, most of them make use of something called auto-bandwidth, which actually does make-before-break and changes paths almost constantly. Um, typically, you know, people are changing paths about every 15 minutes. Uh, so this just becomes part of that optimization. **Gunter Van de Velde:** Okay, thank you. **Jeff Tantsura:** Jeff Tantsura, Nvidia. Tony, are you going to provide BGP-LS extensions? **Tony Li:** Um, not at this point yet, one thing at a time. **Jeff Tantsura:** Thank you. **Tony Li:** Alright. Um, so we're ready to move this forward. We would like to adopt this as a working group document and have [draft-mini-lsr-power-group](https://datatracker.ietf.org/doc/draft-mini-lsr-power-group/) also as a working group document. **Yingzhen Qu:** Acee, you are the next one. **Acee Lindem:** Yeah, this is just speaking as a working group member first. I think you should, um... this presumes the majority of the traffic is engineered. Which is okay, but I think you should say something about best effort and how that behaves if it just follows whatever available paths are there. And in these types of networks where everything's engineered, you can use the proper words there. But... **Tony Li:** Uh, we'd be happy to note that.
Um, most of our customers do use traffic engineering to engineer almost all of their traffic. Um, dark traffic that is not being engineered obviously is not affected by this and of course may cause you to not power things off. **Acee Lindem:** Right. The other thing I was going to say as a working group chair is, I think the [draft-mini-lsr-pcpps](https://datatracker.ietf.org/doc/draft-mini-lsr-pcpps/) document does belong in TEAS, whereas the power group one belongs in LSR. **Tony Li:** We would be happy to do that too. **Yingzhen Qu:** Any other questions? Uh, Les, you are the next one. **Les Ginsberg:** Yeah, so Tony, this question is not just for you, it's also for the working group and perhaps the working group chairs. Um, I think there are many ideas out there about, uh, how to incorporate power efficiencies in routing. Um, I'd like to get a feel for whether this solution is a solution that customers are eager to deploy, to the extent that you can talk about that. **Tony Li:** I can say that our customers want us to implement this. Our customers want you to implement it too. **Les Ginsberg:** Okay, thanks. **Yingzhen Qu:** Zafar. **Zafar Ali:** Yeah, so I think, to second what Les mentioned, I wouldn't say that our customers want us to implement this. Uh, there are quite a few unknowns here, especially the way that you define the power group. You presented the LSR draft last time, and there was a discussion that the discussion regarding the metric needs to happen in some coordination with the GREEN working group. Um, and I believe there was some action taken, I don't know what the outcome of those actions was. And then, uh, your presentation was for the TEAS working group and you are asking for adoption in LSR. So that was a bit confusing.
But I think some work needs to happen in terms of, uh, the requirements for this, what the customer wants, the metric, uh, before the working group could proceed. **Tony Li:** Well, the customer wants to save power because power is money. The GREEN working group seems to be intent on modeling the world and ignoring how to actually save any kind of money. So we're ignoring the GREEN working group. **Zafar Ali:** I mean, yeah, if you have some problem with the GREEN working group and they're not doing the real work, then you need to talk with the GREEN working group, but there are certain things, there's some precedent in the IETF that needs to happen, especially when it comes to metrics. Look at the charter for the GREEN working group. **Tony Li:** I helped write the charter for the GREEN working group. I helped start the GREEN working group and I contributed the first draft to the GREEN working group, and they've subsequently ignored it. Okay, so that's fair. They want to ignore it, they have a better idea on how to do things, they're welcome to do that. They don't seem to be actually solving customer problems, we're going to let them do that. **Zafar Ali:** Okay, uh, I'll let the chairs and AD, uh, chime in, but that's fine. Thank you. Appreciate that. **Yingzhen Qu:** Zafar, are you... okay. Jiedong, you are next. **Jiedong Ji:** Yeah, Jiedong from Huawei. I'd like to ask a question about the metric. Uh, it seems, uh, you're talking about the metric being computed, uh, using a customized algorithm on different nodes. Is that something you have in the draft? **Tony Li:** Yes, that's what it said. You can do what you like with the power information to compute the metric. **Jiedong Ji:** Uh, in that case how can you get a consistent, uh, view of the metric if different nodes use different algorithms to get this metric? **Tony Li:** Okay, take a look at bullet two, sub-bullet one.
The metric does not need to be identical on every node. It does not matter. This is path computation that we're doing here. Every head-end is allowed to compute whatever path it wants, it does not need to be consistent with every other node. **Jiedong Ji:** Yeah, that sounds like you are using different metric types for the computation of a path on different nodes. **Tony Li:** I didn't say that. **Jiedong Ji:** Oh, okay. That's... that's maybe something I missed. **Tony Li:** I think the point is that when you have a centralized, um, path computation, you don't require that everyone agrees on what the metric is, right? In a distributed computation... **Jiedong Ji:** I know that it is a centralized computation, just that the information collected may be based on different criteria to get the metric, so maybe the total metric for a path may not be the, um, the most energy efficient from the perspective of the controller or of an operator. **Tony Li:** Well, I think this is a great comment for the list too. Let's get some discussion going. Yeah, I agree. It's not required that everyone agree on what the most optimal power metric is, okay, and what you consider to be power efficient is not necessarily what I consider to be power efficient. Okay? I have this weird warped perspective that I want to turn off as much power as possible. So I care about turning off ASICs. That's what I optimize for. **Jiedong Ji:** Yeah, another quick question is, will this metric be very dynamic or is it kind of static? **Tony Li:** So, again with IS-IS, if you have a dynamic metric you tend to thrash the LSDB, you thrash flooding, and generally that's a really bad idea. **Jiedong Ji:** So it's a static thing, right? **Tony Li:** I didn't say it had to be static. **Jiedong Ji:** It's relatively stable? **Tony Li:** Yeah, yes of course. You want something that's relatively stable. Yeah.
If you introduce a 1 gigahertz input signal into IS-IS, I guarantee you you'll have a bad day. **Jiedong Ji:** Okay, thank you. **Yingzhen Qu:** Rakesh, you're the... **Rakesh Gandhi:** Rakesh Gandhi from Cisco Systems. Uh, slide four, um, I think you are talking about optimizing using this new metric instead of the TE metric. Uh, there is also, uh, a requirement for low latency traffic, and there is path computation with a latency metric to minimize the end-to-end latency, for example. So, uh, if you're just focusing on PCPPS metric optimization and not looking at the low latency requirement, you may get in trouble with some SLAs, right? **Tony Li:** So you'll see the words, "where it does not violate constraints." Okay, we did not talk about every possible constraint. We've been talking about constraints in traffic engineering for about 30 years now, uh, we think most people understand it. **Rakesh Gandhi:** So it's keeping the existing constraints but adding additional end-to-end cumulative metric minimization, that's what you said. **Tony Li:** Absolutely. **Rakesh Gandhi:** Okay, thanks. **Yingzhen Qu:** Zafar, you're the next one. **Zafar Ali:** Yeah, and that comes back to the same thing, which is that without consistency, um, you are not doing the right traffic engineering, because some node may give a metric which is way off from the other nodes, while the actual power saving is way different from each other. So this is where some kind of normalization is needed, some consistency. We can talk offline, you know, like in the TEAS working group, but... **Tony Li:** Yes, but doesn't the operator just stop buying that box then when it just bugs? **Yingzhen Qu:** Chris, you're very choppy for me, sorry, I apologize. **Tony Li:** If you were lying to your customers you get what you deserve. Feel free. **Zafar Ali:** Is...
anyway Tony Li, we can take it offline, but there has to be some consistency, some normalization, some proper definition for this, okay? **Acee Lindem:** Hey, I think what the draft says is that it... I'm speaking as working group chair and we'll move on and we can take it off list. I think what the draft says is it's out of scope. Should the measurement of milliwatts be consistent? Yes. But that's really out of scope for this draft. **Zafar Ali:** In scope, in what? **Tony Li:** To be fair, the normalized unit for power consumption is the watt, okay? If you don't understand it, I refer you to your local physics professor. **Zafar Ali:** Okay fine. Okay, anyway we can take it offline. **Yingzhen Qu:** Okay, um, Ron, please. **Ron Bonica:** I think there are a few things we can agree on. One is that customers do want to save power. Two is that there's definitely a CSPF part of this. You need to steer the traffic, uh, so you can shut down nodes. And if there's a CSPF part, CSPF has to learn stuff from nodes, which means there's an IS-IS part. That suggests to me that there's enough motivation here for a call for adoption. Um, you know, the answers that we propose may not be the final answers, but this is certainly something that I think the working group should be working on. And I'd ask for a call for adoption. **Yingzhen Qu:** Okay, thanks. Uh, Jeff, you're the next one. **Jeff Tantsura:** Jeff Tantsura, Nvidia. Uh, you mentioned ASIC a couple of times. What's the correlation between adjacency and ASIC? **Tony Li:** So this is exactly why we have power groups, because there are multiple adjacencies that may, um, be dependent on a single ASIC. And so a power group can be used to model an ASIC, or it can be used to model a set of ASICs or even an entire line card. Um, so there are lots of things you can do. We're not picky about how you model it, that's why we left it as vague as possible.
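The power-group modeling described here (adjacencies rolling up to ASICs, ASICs to line cards, possibly with multiple parents for lag members) amounts to a directed acyclic graph, and the one rule the draft imposes is that there be no cycles. A small sketch with an invented hierarchy, showing the acyclicity check such a model needs:

```python
# Hypothetical power-group hierarchy: an adjacency may have several
# parents (e.g. a lag spread over two ASICs), groups may nest to any
# depth, but the parent relation must stay acyclic.
parents = {
    'adj1': ['asic0'],
    'adj2': ['asic0', 'asic1'],   # lag members on two ASICs
    'asic0': ['linecard0'],
    'asic1': ['linecard0'],
    'linecard0': [],
}

def is_acyclic(parents):
    """Detect a cycle in the parent relation with a DFS three-coloring."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in parents}
    def visit(n):
        color[n] = GRAY               # on the current DFS path
        for p in parents[n]:
            if color[p] == GRAY or (color[p] == WHITE and not visit(p)):
                return False          # back edge: a cycle exists
        color[n] = BLACK              # fully explored
        return True
    return all(visit(n) for n in parents if color[n] == WHITE)
```

With a valid hierarchy the check passes; a self-referential pair of groups would fail it, which is exactly the "that would be bad" case.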
Uh, so you can do things hierarchically, that's all we ask. Um, and for lags you can have multiple parents, uh, because you could be using, say, even multiple line cards in the same chassis. Um, so you have a great deal of options about how you want to model particular situations. **Jeff Tantsura:** So it's an abstraction. In other words, would you support a hierarchy of abstractions in the case of lags? **Tony Li:** You can deal with any abstraction that your box cares to name. If there's some mechanism that we haven't foreseen, then please let us know. Uh, there's no bound on the number of levels. You're welcome to do whatever you like. Um, all we ask is that you not introduce cycles in the power groups, that would be bad. **Jeff Tantsura:** No, you expect the PCE to understand the relativity of levels, right? **Tony Li:** So there's no bound on the number of levels. You're welcome to do whatever you like. Thank you. Um, all we ask is that you not introduce cycles in the power groups, that would be bad. **Yingzhen Qu:** Martin, yeah, please. **Martin:** Yeah, uh, I do like the concept of your abstract power groups. I just wonder whether the whole topology and maybe hierarchical levels of power groups make it a trivial optimization algorithm or method, uh, that can be handled distributedly at all? **Tony Li:** Um, we don't claim that this is absolutely optimal, but it is an optimization. Which is to say we don't claim that we got this perfect. **Yingzhen Qu:** Okay, I think we finally finished the line. Thank you very much. Any other questions? If not, okay. Please continue the discussion on the mailing list. Derek, you are the next one. **Derek Yeung:** Hello. Hello. Hello, everyone, I'm Derek Yeung from Arrcus. I'm here talking about Advertising IGP Measurement Groups on behalf of my co-authors from Arrcus and Equinix.
So the problem statement and motivation is that, uh, in the customer networks, they want to run measurements using TWAMP or STAMP, and, um, currently they need to do all the configuration themselves to figure out, uh, which router needs to be connected to which, to do the measurement, and it's very time consuming. Yeah, and very troublesome. And, uh, so the requirement is: could we have IGP auto-discovery of which routers want to do the measurement, and, um, this measurement information needs to be, uh, potentially leakable across areas if needed, and, um, we need to advertise the address, uh, of the interface to identify, uh, the membership of the interface that wants to be measured with a particular, uh, mechanism, like TWAMP or STAMP. Um, and of course the same address could be used for multiple measurement groups if they are so configured. So here is the proposed TLV, and, uh, it's very simple. It basically includes an IP address and then indicates which protocol is going to use that address, and, uh, currently we identify two of them, TWAMP and STAMP. So how does it operate? So, um, the routers that want to participate in these, uh, measurement groups advertise the sub-TLV, specifying an address they want to use, and, um, they can advertise multiple instances of the same sub-TLV if they want to have, like, different addresses for different protocols. And then the receiver will get the, uh, membership information, build a database, and then use the database to communicate with other, uh, components on their routers, and those components will use the information to make connections and do measurements. Um, as usual, the router capability TLV can be propagated across area boundaries if, uh, configured to do so, so that it meets the requirement of, uh, domain-wide distribution.
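As a rough illustration of how small this advertisement is, here is a hypothetical encoding of the sub-TLV described above: an IP address plus flags selecting TWAMP and/or STAMP. The type code and flag layout are pure placeholders, since the real sub-TLV type and bits have not yet been assigned.

```python
import socket
import struct

# Illustrative encoding only: the sub-TLV type is not yet assigned by IANA,
# and the flag layout here is an assumption, not the draft's wire format.
SUBTLV_TYPE = 99          # placeholder value
FLAG_TWAMP = 0x01
FLAG_STAMP = 0x02

def encode_measurement_subtlv(address, protocols):
    """Pack a (type, length, flags, IPv4 address) sub-TLV in network order."""
    flags = 0
    if 'twamp' in protocols:
        flags |= FLAG_TWAMP
    if 'stamp' in protocols:
        flags |= FLAG_STAMP
    addr = socket.inet_aton(address)              # IPv4 for simplicity
    value = struct.pack('!B', flags) + addr
    return struct.pack('!BB', SUBTLV_TYPE, len(value)) + value

tlv = encode_measurement_subtlv('192.0.2.1', {'twamp', 'stamp'})
```

Even with both protocol bits set, the whole sub-TLV is seven octets here, which matches the point later in the discussion that this is a very small amount of information for the IGP to carry.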
So basically after all this is done, uh, if anything changes, the membership changes, a new box comes up or a box goes away, uh, it gets detected and the receivers will update their database accordingly, and the measurement component will react accordingly. So, summary and next steps. So, uh, of course to do this we need to get the sub-TLV type, and we need to assign the bit if we keep the encoding as is. Uh, nothing much changed about security, it's the same thing as in IS-IS before, um, using authentication is your friend. Now what we want to figure out this time is to identify, uh, whether there are other people or operators that have the same need, and also whether this approach of carrying this data in the IGP, uh, is a valid approach, uh, for the working group. If all of that holds, then, uh, you know, we'll have an update and then include other protocols like OSPF in the future. Okay, thank you guys. Any question, comment? Um, Martin, are you in queue for questions or is it from last time? Okay. Any questions, comments? Oh, maybe we can do the discussion in the... Acee? **Acee Lindem:** Uh, just speaking as a working group member and a co-author, this is... we were talking about this and, uh, how this fits with IGPs, and it's somewhat similar to the seamless, uh, BFD advertisement in the capabilities. It's a very small amount of information and the resulting, uh, computations could be used for, uh, traffic engineering. So I think it does fit in IS-IS and OSPF. **Yingzhen Qu:** Okay, um, finished? **Acee Lindem:** Yep. **Yingzhen Qu:** Okay, um, thanks Derek. By the way, it is not moving, I don't... yeah, the script stopped moving a while back. Tony, you are the next one. **Tony P:** Yep, here I am. Let's hope I don't get squelched off again. **Yingzhen Qu:** Yeah, and you have control of the slides. Please go ahead. **Tony P:** Thanks so much.
Alright, so that's an update on the [draft-prz-lsr-hierarchical-snps](https://datatracker.ietf.org/doc/draft-prz-lsr-hierarchical-snps/). What we're seeing, interesting things. Let's fire it. Alright, so what happened between the last time and now? Um, a pretty massive readability rewrite based on some feedback. Perplexity AI is to be blamed for anything that looks funny. Um, the big change is that the original draft was suggesting that the HSNPs have some kind of ranks. Yeah, levels - bad word, because we always get confused with ISIS levels. That has been abandoned completely. I'll talk about why. Um, and you know me, I like to heat up CPUs a lot, so I did quite a lot of work on actually building an emulation of very large networks, doing refreshes over very long periods of time, generating HSNPs and all those hashes to look at collision probabilities, right? To get some kind of a hang on those things. And one practical thing fell out of this work for the fragment Fletcher, right? Because the HSNP hash for the fragment is a Fletcher, we changed that. Um, it is that we added the ISIS MTU length or PDU length that is in the packet, which is actually a good source of entropy as well, to get those hashes more evenly distributed. Alright, so why did we abandon this level or rank? Because with the implementation we're seeing that there isn't anything like that, because different parts of the LSDB compress at different density, and that depends on a lot of things, right? So the strategy an implementation may choose is not something that we can standardize, and it not only varies over time but also per peer, right? Because talking to different peers, these peers may have LSDBs in different states, so agreeing on consistency, on what needs to be exchanged, can not only vary over time but also per peer. Alright, and the strategy a node can choose to actually decide how much it compresses—so you know, how many nodes it covers with a hash—can be based on a lot of different variables.
So you can keep some statistics, right? About how many mismatches you see appear, and if you see a lot of mismatches you can come to the conclusion that it's not worth compressing anything on this part of the database. A link flap is very different from a node reboot, right? Because if the neighbor reboots, it basically loses the whole database, so there is nothing to talk reasonably about at a high level. Whereas after a link flap, the databases are basically synchronized on both sides, so HSNPs, you know, large hashes, or hashes covering a large part of the database, can help a lot. People can have different caching strategies in an implementation for all those hashes or whatnot, so that also can influence how dense they want to make those HSNPs, or, you know, how large those hashes are in terms of covering parts of the database. And then there are other considerations: for example, generating CSNPs may be extremely cheap compared to HSNPs, so you may want to have hashes that actually trigger a lot of CSNPs if they mismatch, because CSNPs are very easy for you, or maybe not. Um, and I think last time the topic came up, but to emphasize it again, agreeing among all the peers on what ranges the hashes cover seems like an easy and natural idea, but it's actually not feasible, for reasons that I actually discussed last time at the mic with Chris. Good. So, a good analogy for why these HSNPs are useful is that you can consider them basically a gradient descent, if you're familiar with, you know, like the most basic stuff in AI. So with these HSNPs you can consider, you know, the optimum is both databases being synchronized, and these HSNPs can be considered the large steps on a gradient descent, right?
And if the hashes match, it means you descend very quickly towards a database that is consistent on both sides, and if they mismatch you basically have to take smaller steps. Alright, I hope people can follow that; if you're familiar with either of the drafts it makes sense. But what it shows is that this gradient descent analogy, if you think it through, leads to basically fairly simple rules, right? Basically when you get an HSNP hash over a range of nodes that mismatches, then you have to take smaller steps, and the smaller step is either to send HSNP packets with more specific hashes for the mismatch, right? So kind of disaggregate the range into better, you know, smaller hashes, or you can react by basically sending CSNPs for the mismatch directly, which are really small steps, which is what the protocols are taking today if you think about it. Or the very smallest step, where you can be pretty much sure that the databases are inconsistent on the ranges, is that you basically flood all the involved fragments, right? So of course it varies per implementation, depends on a lot of considerations about, you know, optimal strategy and so on, but in architectural terms this is basically going from a big step on a gradient descent to smaller steps which are more precise and get you in the right direction. And at the point in time where an HSNP hash is covering just a single node ID, which means all the pseudonodes and all the fragments of the node and the pseudonodes, then the action becomes just the last two steps, right? Because you cannot send a more specific HSNP hash; the smallest resolution of an HSNP hash is basically, you know, a node ID. So then you either send CSNPs, PSNPs, or otherwise you flood directly. Um, so that's I think helpful if you think about why the whole thing works and what will ultimately be, you know, the precise language of the standardization for the whole thing to work correctly. Alright, now minor changes.
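The "smaller steps" loop just described can be sketched as a recursive range comparison. Everything here is illustrative: the per-fragment hash values, the XOR aggregation, and the binary split are stand-ins for whatever ranges and hash strategies an implementation actually picks.

```python
def range_hash(db, lo, hi):
    """HSNP-style hash over all fragments of nodes in [lo, hi):
    XOR of per-fragment hashes, with zero hashes replaced by one."""
    h = 0
    for node, frags in db.items():
        if lo <= node < hi:
            for fh in frags:
                h ^= fh or 1
    return h

def sync(local, remote, lo, hi, out):
    """On a hash mismatch, take 'smaller steps': split the range until a
    single node remains, then fall back to per-node CSNP-style exchange."""
    if range_hash(local, lo, hi) == range_hash(remote, lo, hi):
        return                     # big step: whole range already in sync
    if hi - lo == 1:
        out.append(lo)             # smallest step: CSNPs/flooding for node lo
        return
    mid = (lo + hi) // 2           # disaggregate into more specific hashes
    sync(local, remote, lo, mid, out)
    sync(local, remote, mid, hi, out)

local  = {0: [0xAA], 1: [0xBB], 2: [0xCC], 3: [0xDD]}
remote = {0: [0xAA], 1: [0xBB], 2: [0xC0], 3: [0xDD]}
mismatched = []
sync(local, remote, 0, 4, mismatched)
# Only node 2 differs, so only node 2 needs the fine-grained exchange.
```

Matching ranges are discarded in one comparison (the big gradient step), and work concentrates only where the databases actually differ.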
So there is the funky case where a Fletcher of a fragment can become zero, which is, funny enough, actually a valid case, and the draft says to replace it with one, so during XORing into the HSNP range the stuff stays visible. Um, we looked at the case where a different set of fragments with different Fletchers will create the same node hash, and when you look at the probability of that, it's so incredibly unlikely it's not worth considering further. You'll find something in the draft. Um, and there was a stupid idea, initiated by me as very often, that if we have an HSNP packet, and in the range of the HSNP packet we have multiple hashes, right? And there may be holes in those ranges of the HSNP hashes. We could advertise the missing range with a hash of zero indicating absence, but that's actually dumb, because what could happen is that multiple node hashes XOR each other out, and then zero as the HSNP hash for the node range is actually a valid case. So that needs to be removed. I'm again getting into a little bit arcane details here. Anyway, the meat of the stuff is that, um, I burned a lot of CPU power to look at the hash collisions. And basically, um, what I built is a perfect emulation of a network where everybody is refreshing. I was running 32 networks, just the number of cores that I had, each network 50k nodes and in total about a million fragments. Um, and each network was running for two years, um, refreshing, right? Refreshing and a little bit more than that. I used roughly MaxAge to refresh fragments, somewhere lower, like 50k seconds or something like that. I did not make the node IDs too different, so the node IDs which are used in the hashing were only differing by three bytes, which is kind of realistic, or, you know, you can argue already a fairly suboptimal case: if you have a 50k-node network, you basically squeeze the node IDs into a three-byte difference.
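The two hash details just mentioned (replace a zero fragment Fletcher with one, and don't reuse zero to mean "range absent") can be shown in a few lines; the hash values are made up for the demonstration.

```python
def xor_aggregate(fragment_hashes):
    """XOR fragment hashes into a range hash, mapping 0 -> 1 first so a
    zero Fletcher still contributes to the aggregate."""
    agg = 0
    for h in fragment_hashes:
        agg ^= h if h != 0 else 1
    return agg

# A zero fragment checksum stays visible in the aggregate...
nonzero_visible = xor_aggregate([0])            # contributes as 1, not 0
# ...but three perfectly valid, distinct hashes can still XOR to zero,
# so an aggregate of zero cannot double as an "absent range" marker.
vanishing = xor_aggregate([0b011, 0b101, 0b110])
```

The second case is exactly why the zero-means-absence idea had to be dropped: a legitimately populated range can aggregate to zero.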
On only 5% of refreshes I was changing the ISIS PDU length, so assuming that the network is basically refreshing the same thing all the time. Um, the checksums I assumed would be moving in a fairly uniform fashion, and the checksums will be changing because the sequence numbers are bumping, right? And the sequence number is in the ISIS checksum. Um, and while the stuff was running, every time something refreshed I was checking the whole database for a collision of an HSNP fragment. So, pretty much mirroring, you know, the reality of that stuff as far as it mattered. Now, what did we see? A 32-bit HSNP hash is too small; it will collide too much. So we're running 48 and 64 bits. Um, the astute observer will realize that there is no 48-bit Fletcher, but it's very easy to extend Fletcher to actually do 48. Um, and I will show you a graph, right? And this is the quintessential graph that shows that actually 48 bits works better than 64 bits, which is kind of surprising, and I have some gut feeling why that is, but no real data to back up, you know, what would cause the effect, or, you know, a foundational theory, but I see it on the graph, so you see the graph. So horizontally you see the distance between the fragments that collided in the LSDB, right? And how many collisions. And on the vertical axis you see the duration. So the duration was, you know, somewhere a little bit less than the lifetime of a fragment. So we're talking about 10 hours, something like that. Now, the red line means that only a single HSNP packet summarized the whole database, right? And you see now why it is important whether it's left of the line or right of the line. Because right of the line means that the collision distance was so far apart that it didn't matter. It was not in the same HSNP hash. So for practical purposes those collisions did not matter.
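On the 48-bit point: there is indeed no standard Fletcher-48, and the draft's exact construction isn't shown in the talk, but one straightforward generalization (an assumption here, not the draft's definition) keeps the two running Fletcher sums over 24-bit words, each modulo 2^24 - 1.

```python
def fletcher48(data: bytes) -> int:
    """Fletcher-style checksum generalized to 48 bits: two running sums
    over 24-bit big-endian words, each modulo 2**24 - 1. Illustrative
    only; the draft's actual 48-bit extension may differ."""
    MOD = (1 << 24) - 1
    if len(data) % 3:                      # pad to a multiple of 3 bytes
        data += b'\x00' * (3 - len(data) % 3)
    s1 = s2 = 0
    for i in range(0, len(data), 3):
        word = int.from_bytes(data[i:i + 3], 'big')
        s1 = (s1 + word) % MOD             # simple sum
        s2 = (s2 + s1) % MOD               # position-weighted sum
    return (s2 << 24) | s1

h = fletcher48(b'IS-IS fragment payload')
```

Like the 16- and 32-bit Fletchers, the second running sum makes the result sensitive to word order, not just word content.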
Left of the line it means that it was possibly in the same HSNP hash, and that means that if you XOR both, they both vanish. Alright, I hope people can follow that; if you're familiar with either of the drafts it makes sense. But what it shows here is that with 48 bits, lots of these collisions didn't matter. Albeit it generated, you see at the top at the very right, 230 collisions, whereas the 64 bits only generated 190 collisions. We're talking here something on the order of 36 billion refreshes, right? 32 networks running over two years, each of them a million fragments, so to even get those 200 collisions you need to run about 36 billion refreshes. Um, but let's go back to those lines, right? So on the 48 bits, what we're seeing is, if you generate a single HSNP packet for the whole database, which is the red line, you get 109 collisions. Um, there was a pointer I think somewhere, but I hope you can follow me. So that's the 109 number. So if you translate those 109 collisions over the whole thing, um, it basically means that we will be seeing about one collision, for about 10 hours, every year, okay? So that's what we're talking about. If we generate however like 70 or 80 HSNP packets to cover the whole database, we're seeing one collision for 10 hours in about 10 years. So you know, practically this is all pretty much irrelevant, because if you start to look at what will happen if you have to generate that many CSNPs, the flooding rates become completely unsustainable. So we can't actually use CSNPs, practically speaking, to synchronize these kinds of databases. So you know, unless the flooding is perfect, super perfect, um, we don't have a solution until we go to something like these HSNPs. And on the 64 bits you see that albeit we have fewer collisions, they cluster much closer, which means that they will be more relevant. We'll actually see more relevant collisions than with the 48-bit hash.
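A quick sanity check of the quoted numbers (a rough reconstruction, since the exact experiment parameters are only sketched in the talk): 109 collisions across 32 networks, each running for two years, is on the order of one collision episode per network per year, consistent with "about one collision for about 10 hours every year".

```python
collisions = 109        # 48-bit hash, single HSNP over the whole database
networks = 32
years_each = 2

# Collisions per simulated network-year of refreshing.
per_network_year = collisions / (networks * years_each)
# ~1.7 per network-year, i.e. roughly one ~10-hour collision window a year,
# matching the estimate given in the talk.
```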
Um, one thing that remains to be seen is probably to go look at something like SipHash. I had some discussion with Job Snijders, you know; we may go and look at that. The problem with that is of course that everybody knows Fletcher; when we start to go SipHash, there are different variants people have to implement, and so on and so on. Um, but you see that the numbers with 48 bits look like it's practically absolutely viable to run those HSNPs; the collisions will not show up in any relevant form or fashion to matter. You know, that was my conclusion. And yeah, I think that's all I have for the moment as far as these works go. Thanks. **Yingzhen Qu:** Thanks Tony. Um, questions, comments? I don't see anybody in the queue. **Tony P:** Yeah, and thanks to Les, right? He's keeping me honest and he's thrown a lot of good comments at the draft. Yeah, mhm. **Tony Li:** Uh, I switched my mics; hopefully this isn't horrible. Um, you said you had a feel for the collision. Can you say quickly what you think it is? **Tony P:** Uh, sorry, I couldn't parse your question. What do you mean, "quick"? I don't get it. **Tony Li:** You said you had a feel for why 64 bits was not as good as 48. **Tony P:** Yeah, because it looks like when you go for the 48 bits and you run the hash over this sequence of things, the node IDs somehow have enough influence and they naturally partition the hash, you know what I mean? So the node ID kind of puts it into different buckets. Whereas with 64 bits the effect vanishes; the hashes become so uniformly distributed that they start to collide more. But it's just a gut feeling, you know; it's really hard to look at those things and draw any conclusions, especially since, you know, these 36 billion runs mean hitting a water-cooled machine for a day or two just to get this kind of graph. Mhm. **Tony Li:** Thanks.
**Tony P:** But you know, after I run SipHash I may show up with something completely different, right? So this is just Fletcher right now. **Yingzhen Qu:** Okay, um, I don't see people in the queue, so thank you, Tony. **Tony P:** Excellent, thanks. **Yingzhen Qu:** Okay, um, Li, yeah. **Li Zhang:** Yeah, so good morning everyone, this is Li Zhang from Huawei. I will introduce the IGP extensions for sub-interface relationship information on behalf of my colleagues. So the motivation is that alternate TE paths can be used for traffic load balancing when the shortest path is becoming congested. But this requires the devices to know the bandwidth information of other links. There are already documents that extend IS-IS and OSPF to advertise the link bandwidth information and also the utilized bandwidth information. However, there are also scenarios where IS-IS or OSPF establishes neighbors over sub-interfaces. When two directly connected IS-IS neighbors are established over sub-interfaces, the link bandwidth and the utilized bandwidth information is just inherited from their parent physical interface. Then the remote devices will not know the relationship between the sub-interfaces and their parent physical interface, and therefore don't know that the link bandwidth and utilized bandwidth are shared among the sub-interfaces. This makes it difficult for a remote device to use links based on these sub-interfaces as an alternate path in traffic load balancing. So our document extends the IGPs to allow a network device to advertise the relationship between a physical interface and its sub-interfaces. Taking IS-IS as an example, we define a new sub-TLV named the physical local link information sub-TLV. It can be used in TLVs 22 and 222.
It carries the physical local link identifier, and it also has sub-TLV fields to carry the utilized bandwidth information and the physical bandwidth information of the physical interface. This slide shows a detailed use case where this information can be applied. Consider a network topology with four nodes, A, B, C, and D, and traffic from node A to node C. There are two links between nodes D and C, established over two sub-interfaces of the same physical interface. Both the link from A to B and the physical link from D to C have only 30 Gbps available. If the traffic rate is 60 Gbps, then we can calculate a primary path A-B-C and an alternate path A-D-C. If device A doesn't know the sub-interface information between D and C, it will see three paths and allocate 20 Gbps on each path, and then the physical interface from D to C will be congested because it is overloaded: it would carry about 40 Gbps of traffic. But if device A knows the sub-interface relationship information, it will see that there are only two paths, allocate 30 Gbps on each, and all the paths work well. Okay, so is the motivation for this draft clear? We also welcome collaborators to work together on this, and we welcome suggestions and comments to improve this document. **Yingzhen Qu:** Tony Li. **Tony Li:** Hi, Tony Li, HPE. Um, I asked this question on the mailing list, and I didn't get a really satisfactory answer, so I'm going to ask it again. It seems like the only topology where this happens is where you've got parallel VLANs on the same Ethernet. And I have to ask: why would you architect the network this way? Because it seems like a strange thing to do.
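The load-balancing arithmetic from the slide can be checked with a quick sketch; the numbers come from the presentation, and the even split across equal-cost paths is the policy assumed in the example.

```python
demand = 60    # Gbps of A->C traffic, from the slide
capacity = 30  # Gbps free on A-B-C and on the D-C physical interface

# Without the sub-interface relationship, A sees three parallel paths
# (A-B-C plus the two D-C sub-interface paths) and splits evenly.
per_path = demand / 3
load_on_dc_physical = 2 * per_path     # both sub-interfaces share one link
assert load_on_dc_physical == 40       # 40 Gbps offered to a 30 Gbps interface
assert load_on_dc_physical > capacity  # -> congestion on D-C

# Knowing the two sub-interfaces share one physical interface, A sees
# only two real paths, splits 30/30, and everything fits.
per_path = demand / 2
assert per_path <= capacity
```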
**Li Zhang:** Uh, well, I see this question, but yes, there are actually some practical network deployments that use several VLANs over a physical link. Maybe we can provide this use case in the next version. **Tony Li:** I understand using multiple VLANs. Usually this is for traffic separation because of different administrations, but normally they don't use the same IS-IS instance over multiple VLANs. **Li Zhang:** Yeah, okay, thank you. I will consider your comments. Thank you. **Yingzhen Qu:** Les, you are next. **Les Ginsberg:** Yeah, could you go back one slide, I believe? **Li Zhang:** Yeah. **Les Ginsberg:** Yeah, thank you. So what's happening here is (and by the way, I agree with what Tony has said; I think it's a strange use case) but even if it happens, you are advertising two IS neighbor TLVs between D and C. And as you have pointed out even in the draft, from the point of view of a node like A, which is remote, A doesn't know anything about the relationship between D and C and doesn't need to. And in those two neighbor advertisements, assuming that D has some way of partitioning the bandwidth between the VLANs (which you're going to need anyway, no matter how you advertise it), you can advertise it in the two different neighbor TLVs, one for VLAN 1 and one for VLAN 2. I just don't see why you have to invent a new way of advertising information when you already have the ability to do that. **Li Zhang:** Uh, yes, I think we can discuss this on the mailing list. Yeah. **Les Ginsberg:** Okay. One other quick thing: go back, I think, to your first slide. **Li Zhang:** Here. **Les Ginsberg:** Uh, no, slide 3, I think. **Li Zhang:** This one. **Les Ginsberg:** Yeah, so bullet point 2 is false. You've made that assumption, but it is not true. You don't have to...
yeah, there is nothing that insists you have to advertise the same bandwidth on the two sub-interfaces. **Li Zhang:** Yes, but I think our implementation works this way. **Les Ginsberg:** Okay. Thank you. **Li Zhang:** Thank you. **Yingzhen Qu:** Acee, you are next. **Acee Lindem:** Yeah, Acee Lindem, Ericsson, speaking as a working group member. I noticed that your slides only had the physical group identifier, and I understand what you're doing there. I agree that it's strange to need to do this, because you will have separate advertisements for the separate interfaces. But in the draft there were also these other TLVs; I was going to bring up the draft as well. There's one you didn't talk about in your presentation: the first TLV you describe, the one that goes in the neighbor advertisement. **Li Zhang:** Yeah. So what's the question? Uh, I didn't catch it. **Acee Lindem:** It seems you could do everything with the TLV that you have in your presentation, the one in Section 2.2. **Li Zhang:** Yes. **Acee Lindem:** So what is the TLV that's added to the neighbor in Section 2.1 of the draft? **Li Zhang:** Section 2.1 of the draft? **Acee Lindem:** Yeah, well, if you don't remember, I guess take it to the list. **Li Zhang:** Yeah, yeah, please take it to the list. I can't remember. **Acee Lindem:** Okay, thank you. **Yingzhen Qu:** Lian, you're next. Okay. **Lian:** Yeah, um, this is Lian from China Mobile, and I just want to confirm something: could you please go to the slide with the use case? The purpose of your draft is to know the relationship between D and D1 and D2? **Li Zhang:** Yeah, yeah. **Lian:** Oh, I understand. And I have just quickly read the draft, and I found that maybe for the new TLV you should consider the length, because sometimes there are so many sub-interfaces under one physical interface that the length may be very big.
So maybe you should consider this situation in your draft. And I also found that the OSPF extension is not specified. **Li Zhang:** Uh, yeah, maybe we can add it later. This is just the -00 version. We will do it later. **Lian:** Okay, thank you. **Li Zhang:** Thank you. **Yingzhen Qu:** Any other questions, comments? Okay, you do have some questions you need to answer on the mailing list. Thank you. **Li Zhang:** Okay, thank you. **Yingzhen Qu:** With that, I think we have finished all the items on the agenda. So, any questions for the working group, or anything about OSPF or IS-IS? Yeah, please go ahead, Kireeti. **Kireeti Kompella:** Hey, um, so I have a draft on capabilities to signal to the network whether a node is capable of doing MP-TE or not. It's not a high priority right now, but if the multipath TE stuff starts taking off it'll be important, so that you can signal through nodes that can actually support this. So at some point I will bring it to this working group. **Yingzhen Qu:** Okay. **Acee Lindem:** Uh, I looked at that, Kireeti; that was a very drafty draft. It needs some more encodings and some more specification. **Yingzhen Qu:** So I suppose we'll hear your presentation in Vienna? **Kireeti Kompella:** Um, sure, yeah. **Yingzhen Qu:** Okay, and maybe, Acee, you can send him comments about how to close some of the holes? **Acee Lindem:** Yeah, I think you could probably just look at some of the other drafts that do similar capabilities. It all depends on how you want it scoped. I think you want it scoped at the IS-IS level or the OSPF area level, because that's where the multipath is computed. **Kireeti Kompella:** Yeah, that's where the multipath is computed. Yeah, that's probably right. **Acee Lindem:** Yeah. **Kireeti Kompella:** Thanks. **Acee Lindem:** Thanks. **Yingzhen Qu:** Any other questions, comments? Acee and Chris, do you have anything?
**Acee Lindem:** No. **Chris Hopps:** No. **Acee Lindem:** I was just going to say, speaking as working group chair: I think we want to discuss the power group draft. I mean, there are a couple of things that I think it makes sense to last call, the ones we talked about that had requests for last calls, both the one presented and Tony's. And the power group one, we should do that right away while it's fresh on everybody's mind. **Yingzhen Qu:** You mean adoption, right? **Acee Lindem:** Yes, well, the adoption, and then hopefully the questions will ensue. **Yingzhen Qu:** Yeah, I think we'll continue the discussion on the mailing list. There were... **Acee Lindem:** Mhm. **Yingzhen Qu:** Okay, if there are no more questions, I think we are done for this session, and we'll see you next time in Vienna. Thank you, everybody. **Acee Lindem:** Bye everyone, thanks. **Chris Hopps:** See you later.