Session Date/Time: 17 Mar 2026 03:30
Chair: So, good morning everyone. We can start this session. Welcome to the SRv6 Ops meeting. We have two sessions; this is the first session, and today we will focus on the operators' presentations. In the next session we will focus on our drafts. This session is being recorded.
First, the Note Well. By participating, whether remotely or in person, you agree to follow IETF processes and policies. A few tips: in-person participants, please make sure to sign into the session through the Datatracker; remote participants, please make sure your audio and video are off unless you are presenting.
Here is the working group website. We also have a GitHub site where you can get full information about the working group. If you have drafts in this working group, please use the GitHub tools.
As for working group status, we now have two adopted drafts: the first is draft-ietf-srv6ops-srv6-deployment and the second is draft-ietf-srv6ops-problem-summary. For the liaison, Dhruv, please give an introduction.
Dhruv Dhody: Yeah, thank you. So for this liaison, we have received this from ITU-T Study Group 11. There were a couple of liaisons back and forth with respect to SRv6 conformance, listing out various test cases. We initially gave them feedback from BMWG and a bunch of other working groups back to the Study Group, explaining how we do things in IETF and how we don't identify which extensions are core and which are optional, as they had done earlier. Those things they have already handled. But they have given us a document for review, which is the whole conformance testing: conformance test cases, all the various ways in which SRv6 can be tested. As of now, when the chairs discussed it, we did not have any specific input that we want to provide as a group, but if people are interested in this as individuals, please feel free to reach out directly to ITU or to our liaison to ITU-T, which is Scott. Or reach out to the chairs if you have any thoughts on whether we as a working group need to respond in any way. So currently our plan is not to take any action, but if anybody in the working group feels otherwise, please raise it on the list.
Chair: Okay, thank you Dhruv. So let's go to today's agenda. We have very good presentations today: one is from Alibaba, another from China Mobile, and the last one from Guangdong Power Grid. So we cover both cloud operators and industry, and we're looking forward to their presentations. The first one is from Mr. Chen, Alibaba Cloud. Mr. Chen, is he here? Yeah.
Chen Fei (Alibaba Cloud): Good morning, everyone. I'm Chen Fei from Alibaba Cloud. Today my presentation is 2.1, IPv6 & SRv6 Powering Alibaba Cloud AI Computing. I will share how IPv6 and SRv6 are helping Alibaba Cloud tackle the network challenges it faces in AI computing. Before we dive into the solution, let's first take a look at the challenges from AI computing.
So here AI computing presents us with three significant challenges. The first is IPv4 resource provisioning. As AI models continue to grow in scale, and with prefill/decode separation used in inference, we can see AI traffic shifting from intra-cluster to inter-cluster. This means we have to assign unique addresses to each cluster, making address reuse impossible. For the latest data center in Alibaba Cloud, interconnecting ten clusters requires 380 /16 IP blocks. That's a massive demand on IPv4 resources. The second is virtualization performance loss. As a cloud provider, Alibaba offers AI computing products with tenant isolation. The traditional VXLAN solution introduces 50 bytes of packet overhead plus encap/decap latency. The problem is that AI computing is based on the RDMA protocol, which requires us to minimize network performance loss as much as possible. For instance, a training job involving 1K GPUs sees 1 microsecond of added latency and 1% throughput loss with the VXLAN solution. That's a significant impact in AI computing. The third challenge is cross-region bandwidth contention. ECMP-based routing and forwarding causes AI traffic to compete with other TCP traffic for backbone bandwidth. Even when we use resource allocation based on DSCP mapping to different hardware queues, we still cannot effectively distinguish between RDMA and TCP traffic.
So how can we address such challenges? Let me walk you through how IPv6 innovation provides the solutions. To tackle the IPv4 address shortage, Alibaba Cloud has launched IPv6-native data centers. Our latest data center, HPN7, supports IPv6-only server access. Here we have implemented two key IPv6 innovations. First, servers accessing the data center are assigned a /64 prefix by default. This design choice reserves 64 bits of address space for programmability. Second, there is a problem we had to address: processing neighbor discovery (ND) messages for all IPs in a /64 prefix could consume significant switch resources. So we designed an elegant solution: servers access the data center in Layer 3 mode, leveraging IPv6's capability to configure multiple addresses on a single interface. First we configure link-local addresses on the switch and server interfaces. Then we add a prefix address on the server side, so the server has two IPv6 addresses. Then we configure a static route on the switch pointing to the server's prefix, with the server's link-local IP as the next hop. This approach significantly reduces ND processing overhead: the switch only learns the ND entry for the server's link-local IP.
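To make that access design concrete, here is a minimal Python sketch of the switch state it implies; the interface name, addresses, and prefix are illustrative assumptions, not Alibaba's configuration:

```python
# Sketch of the Layer-3 access design: the switch keeps ND state only for the
# server's link-local address, and reaches the whole /64 via one static route
# whose next hop is that link-local address. All values are illustrative.
from ipaddress import IPv6Address, IPv6Network

server_link_local = IPv6Address("fe80::1")           # learned via ND
server_prefix = IPv6Network("2001:db8:c0fe::/64")    # assigned, never ND-learned

nd_table = {server_link_local: "Ethernet1"}          # one ND entry per server
static_routes = {server_prefix: (server_link_local, "Ethernet1")}

def lookup(dst: IPv6Address):
    """Resolve any address in the server's /64 without a per-address ND entry."""
    for prefix, (next_hop, interface) in static_routes.items():
        if dst in prefix:
            return next_hop, interface
    return None

print(lookup(IPv6Address("2001:db8:c0fe::42")))  # -> (IPv6Address('fe80::1'), 'Ethernet1')
```

However many addresses the server later configures inside its /64, the switch state stays at one ND entry plus one route, which is the overhead reduction described above.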
Based on this foundation, Alibaba Cloud has launched an SRv6 container network called NetPillar. NetPillar utilizes the 64-bit address space we reserved on the server's interface and achieves lightweight network isolation through its programmable capability. First let me explain how NetPillar handles container interface addresses. NetPillar defines specific IP generation rules, similar to the SRv6 End.X SID definition; we just modified some fields of the function part. You can see the structure NetPillar defines in the diagram. The Tenant ID is 24 bits, used to distinguish between different tenants. The Interface ID is 12 bits, used to identify a unique container interface locally on each server. The reserved field is set to all zeros by default.
Let me give you an example. In the diagram, NetPillar defines two tenant networks: Tenant A's ID is 1 and Tenant B's ID is 2. Suppose NetPillar on Server A needs to assign an address to a container interface belonging to Tenant A. First, NetPillar obtains the prefix of the server's interface as the locator field, here 2001:coffee::. Next, NetPillar retrieves the tenant info from the container creation file; it tells us the container belongs to Tenant A, corresponding to tenant ID value 1. Then NetPillar calculates a unique interface ID locally, in this case 1. After assembling these fields, the container interface IP is formed: 2001:coffee::0001:0001. It's a clean and programmable mechanism for IP generation.
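The generation rule can be sketched in a few lines of Python. The 24-bit tenant ID and 12-bit interface ID widths come from the talk; the exact placement of the fields inside the low 64 bits is inferred from the slide example (2001:coffee::0001:0001 for tenant 1, interface 1) and is an assumption here:

```python
# Sketch of NetPillar-style container address generation (field placement assumed).
from ipaddress import IPv6Address, IPv6Network

def container_address(server_prefix: IPv6Network, tenant_id: int, interface_id: int) -> IPv6Address:
    assert server_prefix.prefixlen == 64
    assert tenant_id < 2**24 and interface_id < 2**12
    function_bits = (tenant_id << 16) | interface_id   # reserved bits stay zero
    return IPv6Address(int(server_prefix.network_address) | function_bits)

print(container_address(IPv6Network("2001:coffee::/64"), tenant_id=1, interface_id=1))
# -> 2001:coffee::1:1, i.e. the 2001:coffee::0001:0001 of the example
```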
Let me share how NetPillar implements packet forwarding and lightweight network isolation. For packet forwarding, NetPillar adopts the SR-BE model, because we think that within the data center the multiple paths have no quality difference, so SR-BE fully meets our business requirements. For lightweight isolation, let me give an example. In the diagram, container A sends a packet to container B. On the sending side, container A's interface address goes into the source IP field and the receiving container's interface address goes into the destination IP field. As the packet is forwarded through NetPillar, NetPillar compares the tenant ID in the source IP and the destination IP; in the egress direction it only allows packets with matching tenant IDs to pass through, and then sends the packet into the data center. In the data center, the switch matches the receiving server's prefix route to forward the packet to that server. Upon receiving the packet, NetPillar performs the same tenant ID matching in the ingress direction and delivers the packet to the receiving container based on the assigned interface address. It's a simple and effective mechanism to achieve lightweight network isolation without packet overhead or encap/decap latency.
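The isolation check itself then reduces to extracting and comparing the tenant field, roughly like this (same assumed field placement as the sketch above):

```python
# Sketch of the egress/ingress tenant check: forward only when the tenant ID
# embedded in the source address matches the one in the destination address.
from ipaddress import IPv6Address

def tenant_of(addr: IPv6Address) -> int:
    return (int(addr) >> 16) & (2**24 - 1)   # assumed tenant-ID placement

def allow(src: IPv6Address, dst: IPv6Address) -> bool:
    return tenant_of(src) == tenant_of(dst)

print(allow(IPv6Address("2001:coffee::1:1"), IPv6Address("2001:cafe::1:2")))  # True: both tenant 1
print(allow(IPv6Address("2001:coffee::1:1"), IPv6Address("2001:cafe::2:5")))  # False: tenant 1 vs 2
```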
Moving beyond the data center, Alibaba Cloud has launched an SRv6-based backbone network called Eco. Eco is built on the SRv6 forwarding model. It constructs different forwarding paths based on link attributes; the diagram shows an RDMA path, a high-bandwidth path, and a low-cost path. Path selection is implemented using the flow label in the IPv6 header at the SRv6 head end. The different forwarding paths are mapped to flow label values. The SRv6 head end matches the flow label value of a received packet against the traffic policy to determine the forwarding path type, then uses the destination IP of the received packet to identify the tail end, thereby finalizing the forwarding path to be used. Once the forwarding path is determined, the SRv6 head end adds an outer IPv6 header to the received packet, fills the SID list of the chosen path into the segment routing header, and sends the packet to the tail end along the SID list.
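As a rough sketch of that head-end logic, with illustrative label values and SIDs (none of these values are from the talk):

```python
# Flow-label-based path selection at the SRv6 head end (values illustrative):
# the flow label picks the path type, the destination picks the tail end, and
# together they select the SID list placed into the outer header's SRH.
PATH_BY_LABEL = {0x11: "rdma", 0x22: "high-bandwidth", 0x33: "low-cost"}

SID_LISTS = {  # (path type, tail end) -> SID list
    ("rdma", "2001:db8:f00d::1"):
        ["2001:db8:a::1", "2001:db8:b::1", "2001:db8:f00d::1"],
    ("low-cost", "2001:db8:f00d::1"):
        ["2001:db8:c::1", "2001:db8:f00d::1"],
}

def select_sid_list(flow_label: int, dst: str):
    path_type = PATH_BY_LABEL.get(flow_label, "low-cost")   # assumed default
    return SID_LISTS.get((path_type, dst))

print(select_sid_list(0x11, "2001:db8:f00d::1"))
```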
And here let me share how server and network collaboration to provide the cross-region path selection. First of all, the...
Chair: Mr. Chen, two minutes left. Yeah.
Chen Fei (Alibaba Cloud): Okay, okay. First of all, the Eco network pushes the mapping information to NetPillar on each server. Then NetPillar retrieves the QoS requirement from the container creation file to determine which paths can be used. As packets are forwarded through NetPillar, it sets the flow label value for the RDMA path, achieving cross-region path selection at the server side.
So we have seen how IPv6 technology enables Alibaba Cloud to address these major challenges. To wrap up, let me summarize the deployment status and business impact of our solution. As of last year, our solution had been deployed in more than 14 clusters covering over 90,000 servers. It supports not only AI computing products but also cloud-native and database products in Alibaba Cloud. As for business impact, all AI computing products are now fully based on the SRv6 solution, delivering a 2% increase in throughput, a 10% reduction in latency, and a three-times improvement in job startup speed. Okay, that's all of my presentation today. Thanks.
Chair: Questions? So one quick question. Could you go back to page four? Page four. One quick question here, for your container network: do you use VXLAN, or just SRv6, within your data center?
Chen Fei (Alibaba Cloud): No VXLAN, just SRv6, but it's based on the SR-BE mode.
Chair: Okay, okay. That's very good, impressive. Okay, any other question? Yeah.
Participant 1: Hi, I'm Mingxiang from ZTE, about the next slide. You talked about using the mapping of the flow label to a specified path. So how is the value of the flow label generated? Is there something like a centralized controller?
Chen Fei (Alibaba Cloud): Yes, yes. The Eco network has a central network management system that collects all the network information, each node and link, calculates the paths, and based on the paths calculates the flow label values. Yeah.
Participant 1: So the centralized controller, or something like that, computes the value of the flow label, then tells the NIC to use that flow label for the encap?
Chen Fei (Alibaba Cloud): Yeah, yeah. The centralized system pushes the mapping info to the server side, to NetPillar, and NetPillar selects the flow label based on that mapping info. NetPillar also retrieves the QoS requirement from the service; since the service is created by a container, NetPillar can get the QoS requirement from the container creation file. So first it knows what QoS the service needs, and then it selects the flow label based on the mapping info from the Eco network.
Participant 1: Okay, thank you.
Chair: Okay, thank you. Because time is limited, I think we can go to the next presentation. Thank you very much, Mr. Chen. The next one is 2.2, SRv6-Driven Security Resource Pools (China Mobile). Jiaming, please.
Jiaming (China Mobile): Hi everyone, this is Jiaming from China Mobile. I'm very happy to give this presentation and share the deployment of the SRv6-driven security resource pool. The security resource pool is a centrally constructed pool that consolidates geographically distributed, hardware-dependent security resources, like firewalls, IPS, WAF, etc., into a centralized resource pool. In this pool, value-added services, which we call VAS, are sequentially orchestrated into SFCs, thereby centrally providing customized, on-demand services. Resources in the pool are virtualized to significantly improve utilization, and the pool can support a large number of tenants, offering both tenant isolation and per-tenant service customization. VNFs and VMs can be added or removed based on requirements.
In the past, we used PBR to perform the orchestration inside the security resource pool, basically like this. First, the upstream and downstream user traffic is steered on the MB to the SecGW by SRv6-BE based on the user IP. SRv6-BE here only directs the traffic to the SecGW, and the SecGW performs the service orchestration. Traffic of different users is differentiated by user IP and isolated by VLAN per tenant on each VAS. As for service orchestration, it is controlled by the SecGW, the gateway of the resource pool. The SecGW first redirects traffic at the ingress interface to the first VAS on the SFC, like the IPS, for processing. After processing, the VAS returns it to the SecGW, which again redirects the traffic to the next VAS, like the WAF, following the same procedure. After the traffic is processed by all subscribed security services, it returns to the SecGW again. So we can see some limitations of this traditional approach. First, it struggles to scale to large-tenant scenarios because it is limited by the number of VLANs. Second, it is complex to operate and manage: adding, deleting, or monitoring a service requires configuring each device along the SFC. It also leads to inefficient bandwidth utilization on the SecGW, because the traffic is repeatedly forwarded between the SecGW and the various VASes. Take an SFC with three VASes for example: the traffic traverses the SecGW-to-TOR link three times, consuming the equivalent of three times the total user traffic.
So, to mitigate those problems, we plan to adopt the SRv6-based approach. In this deployment, policies are distributed to the SecGW and the VASes. First, the operations platform distributes a mapping table linking user traffic to a tenant ID, an ARNID carried in the packet header, and the corresponding SRv6 SFC. The SRv6 SFC sequentially orchestrates the VASes as segments of an SRv6 TE policy based on the user's subscription, with the SecGW SID as the last hop of the TE policy. For example, the SRH of the SFC VAS1 to VAS2 to VAS3 is just as shown in the picture: first the VAS1 SID, then the VAS2 SID and VAS3 SID, and the last one is the SecGW SID. The operations platform also distributes a mapping table linking ARNIDs to the specific processing, that is, the virtual instance on each VAS bound to the ARNID. Then comes traffic steering: just as in the PBR-based case, traffic is redirected into SRv6-BE based on user IP.
Then comes the service orchestration. Differently from before, inside the security resource pool we use SRv6 to perform the orchestration. First, the SecGW serves as the SRv6 head end: it classifies packets, encapsulates them with the ARNID and the SRv6 header based on flow characteristics, and forwards them to the first VAS on the SFC. Each VAS then processes the packet and forwards it to the subsequent VAS based on the SID list. Here each VAS needs only one SID, used primarily to guide packet forwarding to that VAS, because the SID only represents the location of the VAS in the network, not the tenant. The tenant-specific services are indicated by the ARNID in the packet header, instructing the VAS on how to process the packet, in other words which instance to use for packet processing. This saves a lot of SID and segment-list resources. Finally, the SecGW serves as the SRv6 tail end again, and the packet returns to the SecGW after all subscribed VASes have processed it. Therefore, there is no bandwidth waste on the SecGW-to-TOR link.
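Put together, the head-end behaviour amounts to building one SID list for the chain plus a separate tenant marker, along the lines of this sketch (the addresses and the ARNID value are illustrative):

```python
# Sketch of the SRv6 SFC encapsulation: one SID per VAS plus the SecGW as the
# final segment; the ARNID rides in the packet header and tells each VAS which
# tenant instance to apply. Values are illustrative.
def build_sfc(vas_sids, secgw_sid, arn_id):
    return {"srh_segments": list(vas_sids) + [secgw_sid], "arn_id": arn_id}

sfc = build_sfc(
    vas_sids=["2001:db8:1::1", "2001:db8:2::1", "2001:db8:3::1"],  # VAS1..VAS3
    secgw_sid="2001:db8:ff::1",
    arn_id=0x100,
)

# On each VAS, the ARNID (not the SID) selects the tenant-specific instance:
ARNID_TO_INSTANCE = {0x100: "tenant-A-firewall-instance"}
print(sfc["srh_segments"], ARNID_TO_INSTANCE[sfc["arn_id"]])
```

Because a SID only locates a VAS, tenants sharing the same chain can share SIDs, which is the segment-list saving mentioned above.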
So let's look at the SRv6-based deployment results. First, forwarding performance. With the same capacity and two VASes, the SRv6-based deployment demonstrates significant performance improvement as packet size increases. When the tester sends 1400-byte packets to the SecGW, the SRv6 deployment delivers 20G of forwarding performance, while PBR gets only 15G. Then, as the number of VASes in the SFC grows, the SRv6 method incurs only a minimal increase in forwarding delay, compared to a sharp rise with PBR. Next is the most significant data point: bandwidth efficiency. For the link between the SecGW and the TOR, SRv6 maintains consistent bandwidth efficiency regardless of the number of VASes, staying at 45%, while PBR suffers rapid utilization degradation due to the multiple traffic detours. Here you can see how we calculate the bandwidth efficiency. Beyond the measurable benefits we just saw in the data, the SFC solution also offers several additional advantages. First, easy maintenance: we simply update the SRv6 policy at the head end to modify the SFC. It also supports large-scale tenants when used in conjunction with the ARNID.
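A rough model consistent with the three-VAS example given earlier (this is an assumption about the arithmetic, not measured data): with PBR the flow crosses the SecGW-to-TOR link once per VAS detour, while with SRv6 the chain is walked VAS to VAS and the load on that link stays constant.

```python
# Back-of-the-envelope SecGW-to-TOR load model (assumed, per the 3-VAS example).
def pbr_link_load(user_gbps: float, num_vas: int) -> float:
    return user_gbps * num_vas       # one detour across the link per VAS

def srv6_link_load(user_gbps: float, num_vas: int) -> float:
    return user_gbps                 # independent of chain length

for n in (1, 2, 3, 4):
    print(f"{n} VAS: PBR {pbr_link_load(10, n)}G vs SRv6 {srv6_link_load(10, n)}G")
```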
Finally, future work. Although SRv6-based SFC offers numerous advantages, there are still many challenges in practice that need to be resolved. First is the reliability scheme. The reliability method based on switchover between candidate paths suffers from low resource utilization, because one VAS failure makes the entire candidate path unavailable, as shown in the E2E path redundancy figure. So we need to further explore how to perform switchover at the granularity of a single VAS; for example, as shown in the ideal-protection figure, if the master VAS1 goes down, how do we form an SFC from the VAS1 slave to the VAS2 master? Also, most VASes in the security resource pool currently do not support clustering, so how do we make them support it? Finally, there is forwarding efficiency: as the SFC lengthens, the packet header overhead grows, reducing forwarding efficiency. To mitigate this, C-SID is recommended to reduce the overhead. That's all of my presentation, thank you. Any questions?
Chair: So, any questions? Please use the Meetecho queue. Seeing none. Okay, thank you, Jiaming, very good presentation. The next one is 2.3, SRv6 for End-to-End Services (Guangdong Power Grid). Lugian Gang, please.
Lugian Gang (Guangdong Power Grid): Thank you. Hello, everyone. My name is Lugian Gang from Guangdong Power Grid. It's great to be here, and thank you all for joining us. Today I'll be sharing our experience in deploying and applying SRv6 in Guangdong Power Grid's data network. Here is a brief overview of Guangdong Power Grid. The grid serves 21 cities and also supplies electricity to Hong Kong and Macau. It's the largest provincial power grid in China. Here are some numbers: the peak load was 164 million kilowatts, and total electricity consumption reached 950 million kilowatt-hours. The grid serves 100 million people, and we have built more than 3,000 substations at 35 kilovolts or higher. At the same time, the new power system is developing very fast. Guangdong leads the country in offshore wind; its capacity exceeds 12 million kilowatts. The total capacity of green energy has exceeded 80 million kilowatts.
To support the new power system in Guangdong, Guangdong Power Grid's communication network covers all substations. The foundation is an optical transport network. Multiple data networks have been deployed, each designed with two or more independent planes to ensure reliability. Different types of services run on separate networks. Among these networks, the dispatch data network mainly carries production services, such as power protection and stability control. The integrated data network handles management services, including administrative telephone and the management information system. Our SRv6 practice focuses on the integrated data network, which consists of a backbone network and 20 city-level access networks. Guangdong Power Grid is upgrading its integrated data network from IP/MPLS to SRv6. It's a multi-area, multi-vendor network, which brings challenges during the migration. Scenario one: within a single network region, devices from multiple vendors coexist. Scenario two: each network region uses devices from a different vendor. In such a network, device configuration must be performed node by node, which is very time-consuming. Even after the network is up and running, end-to-end analysis remains difficult. To improve the efficiency and security of daily power grid maintenance, more types of endpoints are being introduced, such as inspection robots, smart helmets, and mobile operation endpoints. Different services have different quality requirements, and network engineers need to configure many SRv6 and QoS policies. This makes the configuration process complex and increases the risk of errors.
To address these challenges, a more efficient solution is needed for network configuration and monitoring to support the fast development of the power grid. Guangdong Power Grid's solution is to use a unified NMS to manage the whole network centrally. It's a three-layer architecture. First layer: the physical network is the foundation. This layer consists of devices from multiple vendors. Since these devices use different CLIs and data reporting formats, engineers must understand and manage all device types, increasing operational complexity. Second layer: the vendor NMS is in the middle. Each vendor has its own controller to manage its devices. However, these systems only provide visibility into their own part of the network; they must report their data to an upper-level system for unified analysis. Third layer: the unified NMS is on top. The unified NMS connects with the vendor management systems via standard interfaces to collect network data. This enables operators to monitor and analyze the entire network in a consistent manner.
The unified NMS provides three key capabilities: monitoring, configuration, and analysis. First, network monitoring makes the network visible and manageable. The NMS not only shows real-time indicators, such as configuration status, packet loss, and network latency, but also tells you whether a service is healthy. It also supports historical data playback. Second, network configuration makes service deployment simpler and more reliable. Through a unified and simplified interface, the system hides the differences between devices from different vendors, allowing network administrators to focus more on service design rather than configuring individual devices. It also supports configuration orchestration: administrators can preview configurations before deployment to ensure the delivery matches their expectations, and in case of failure a one-click rollback helps reduce configuration risks. Finally, network analysis: administrators can view real-time service paths and replay historical changes to quickly identify the root cause of issues. The system uses a graph-based algorithm to recommend optimized forwarding paths, improving network resource usage and ensuring a more stable service experience.
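The talk does not say which graph algorithm the NMS uses; a latency-weighted shortest-path search is one plausible instance, sketched here for illustration:

```python
# Illustrative path recommendation over a latency-weighted topology graph.
import heapq

def best_path(graph, src, dst):
    """graph: {node: {neighbor: latency_ms}} -> (total_latency, [nodes]) or None."""
    queue, visited = [(0.0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, latency in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + latency, neighbor, path + [neighbor]))
    return None

topology = {"PE1": {"P1": 2.0, "P2": 5.0}, "P1": {"PE2": 3.0}, "P2": {"PE2": 1.0}}
print(best_path(topology, "PE1", "PE2"))   # -> (5.0, ['PE1', 'P1', 'PE2'])
```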
Here I will show two cases of how the NMS uses SRv6 to address the challenges. Case one is about simplified SRv6 VPN deployment and real-time path optimization. Let's use video conferencing as an example. Normally, a video conference requires many sites to join the meeting at the same time. To make sure an urgent video conference goes smoothly, there are two key things: first, the system can quickly set up a dedicated policy to reserve enough resources for a good experience; second, during the conference, it ensures service quality with no lag or disconnects. Here is how our unified NMS achieves this in three steps. First, resource management. The system collects all relevant resources to evaluate how to ensure a high-quality video conference, including available link bandwidth, latency between sites, and other network conditions. Step two, configuration orchestration and delivery. This is an automated process in which the unified NMS performs seven steps to design, deliver, and verify the configuration, ensuring the service is up and running. The unified NMS pushes configurations to the vendor controllers, which then execute them on their devices. Step three, service maintenance. All network devices report real-time data to their controllers, and the unified NMS collects the data for administrators. It can display not only individual device indicators, such as CPU or memory usage, but also end-to-end service metrics. The NMS calculates multiple paths and automatically switches traffic to the optimal path if the current path degrades.
Case two is how CSG uses application-enabled networking to identify traffic and apply the corresponding policy. The CSG team has developed this function in the electric power OpenHarmony OS to enable synergy between device traffic and SRv6 policies. The electric power OpenHarmony OS was first released by CSG in 2023. It supports a wide range of devices and has already been adopted by over 900 types of endpoints. Each device marks its IPv6 traffic with a specific ID. The network devices can identify the ID and map it to the corresponding SRv6 tunnel. As shown in this figure, the robot marks its CCTV traffic with an ID.
In conclusion, first, Guangdong Power Grid will continue to explore and apply SRv6 to empower the development of smart grid communication. Second, exploring SRv6 and AI integration to drive intelligent network operation and maintenance. Third, the goal is to gain more innovative ideas from IETF working groups and relevant conferences, leveraging technology to advance the industry and promote large-scale SRv6 deployment in the power industry. Thank you for your attention.
Chair: Okay, thank you for presentation. And I see Mingxiang, yeah please.
Mingxiang (ZTE): Hi, Mingxiang from ZTE. Great to see SRv6 technology deployed in such a large-scale production network. My question: I noticed the architecture is for multi-vendor, multi-area, maybe multi-province networking, and I think efficiency is very important there. So I'm wondering, when the unified network management system integrates or interacts with the vendor-specific controllers, do you consider standardized interfaces or data models from the IETF, such as the L3 service model or network model?
Lugian Gang (Guangdong Power Grid): Okay, thank you for your question. It's a good question. In our work, we use some IETF standards for the northbound interface, like L2NM and L3NM. But to meet our needs we made some extensions for the real-world deployment. So we hope to contribute toward further standardization of the communication, in both the northbound and southbound directions. Thank you.
Mingxiang (ZTE): Okay, thank you for your answer. And I noticed a key application of your smart grid network is surveillance or inspection, right? I believe the AI-based surveillance service may give some valuable input to the evolving service models being worked on in the IETF. I believe some work items are just kicking off in the Anima working group, and from the operation or production network perspective, this could be valuable input to those Anima work items. That's my point, thank you.
Chair: Okay, thank you. Dhruv.
Dhruv Dhody: Yes, just a quick question, I can't help it, about your slide 3 on the video services. Is the latency there round trip or one way? Slide 3, where you have 100 milliseconds for video services. That's round trip, right?
Lugian Gang (Guangdong Power Grid): Yes.
Dhruv Dhody: Okay. Maybe I'll take it to the list. Yeah.
Lugian Gang (Guangdong Power Grid): In our network, we have different devices from different vendors mixed together to form the network, so it's very difficult to operate and maintain. That's why we need a unified NMS to collect the information and control the network; it makes maintenance much easier. Okay.
Chair: Okay, thank you. Any questions or comments? Seeing none. Okay, thank you very much. Very good presentation.
Dhruv Dhody: Thanks everyone. I really want to thank all three operators for coming and sharing their experience, but we also have an ask: these operator presentations are just starting conversations, and the real deliverable from the working group will be when we put those experiences into the documents, the drafts that are being worked on. This is just a starting input. And since we have come to this region after a long time, it was really good to see different operators using IETF technologies. But please continue to take part in our IETF processes and contribute to our documents as well. And since the operators are here, if folks want to discuss and have conversations offline, let's take this opportunity and learn from each other's operational experiences. With that, thank you. I hope you have an amazing week.
Chair: Bye-bye.
Session Date/Time: 20 Mar 2026 03:30
Speaker 1: So, hello everyone. Good morning. Let's start session two of the SRv6 OPS working group meeting.
And this session is being recorded. And first let's see the Note Well. By participating in the IETF, you agree to follow IETF processes and policies.
And for the meeting tips, please make sure to sign into the session through the Meetecho if you are in person now. And for the remote, please make sure your audio and video are off.
So for the administrative, please use the mailing list and Wiki and GitHub to track the documents.
Today's agenda is very tight. We have seven presentations. Let's go for the first one. Michael? Michael is here. Yeah. Please.
Michael: Alright, so we continue to make slow and steady progress on this deployment draft, taking your comments into consideration. We did update it for Shenzhen.
I can't see the bottom of that slide, unfortunately, but these are the changes that we made, again mostly based upon your comments. We made it a little bit less MPLS-focused. The initial goal of this draft was to help customers know how to migrate from existing MPLS or SR-MPLS to SRv6. It was brought up last meeting by Nick Morris that we may want to consider some other encapsulations, such as VXLAN. So in the abstract and in the introduction, we made it a bit more generic, so it's not solely MPLS-to-SRv6 focused. We changed some of the wording. The draft is still very MPLS and SR-MPLS heavy, but it's getting less so.
We also rewrote a section about gradual versus direct evolution. There's a section in the draft where you can gradually migrate from MPLS to SRv6 by using an interim step through SR-MPLS. That was particularly helpful when SRv6 wasn't yet a standard and wasn't widely available on platforms. That's less so now, so there's less reason to do that. We made it so that people can understand that both a direct migration from MPLS straight to SRv6, without going through SR-MPLS, and the gradual migration are viable, but the cleaner single-step architectural direction of direct migration is typically preferred. And then we included a new VXLAN section based upon Nick's suggestion.
So here's just a brief overview; we recommend looking at this section. It's similar to other sections we've had in this deployment draft. Basically, in the migration process, the VTEPs will be upgraded to SRv6 to be able to encapsulate the Ethernet/VXLAN/UDP/IP packets. The line in the middle just shows that the existing VXLAN tunnel continues, but you can upgrade the core and the VTEPs with SRv6, and then slowly migrate the traffic away from the VXLAN tunnel toward the SRv6 core. We created some steps for this as well. We haven't performed this migration, and we don't know of customers who have; we have plenty of experience with people who have done so from MPLS. So if any of you have experience or know of this, we're taking a stab at what some of the migration steps are, but they're probably pretty obvious, and you can look at the draft to see what we've done. Basically, again: you upgrade the VTEPs to support IPv6 and SRv6 encapsulation while continuing VXLAN forwarding; you assign the SRv6 locators and SIDs; you deploy SRv6 in the core; you steer some of the VXLAN traffic over to SRv6 transport paths; and then you eventually migrate all the traffic over to those paths. Then you validate the flows and eventually decommission the VXLAN encapsulation. So thanks again to Nick Morris for suggesting that we cover encapsulations besides just MPLS and SR-MPLS.
And so here's our next steps. We're going to keep making progress. We're not in any particular hurry to progress this draft beyond the working group; we'll know when it's ready to go. It's already proved helpful to send to customers to say, here are some ideas on how you can migrate to SRv6. Dhruv made a comment last time that it may be helpful to also show how you can migrate certain areas of your network at a time, when you can't migrate the entire network to SRv6. We mention that, but we will expand on it in the draft; we've been needing to do so for a little while now, and it'll just take a bit more effort. So that's really it. We welcome any comments or questions about this draft. Thank you.
Speaker 1: One quick question. So you mentioned the migration from VXLAN to SRv6, right? Have you done some tests in lab?
Michael: No, yeah. We have not done, we have zero experience with this. We're just kind of, if someone wanted to do this, this is how we think you should do it. But that's a good idea. We should...
Speaker 1: Yeah, do you think there are some potential benefits if we upgrade VXLAN to SRv6? Could we get some advantages?
Michael: Yeah, the advantages would be similar to other, you know, when you're upgrading from MPLS to SRv6. You know, with VXLAN, you've got your Ethernet frame, you've got VXLAN UDP IP header, and you're able to get rid of all that and just do that encoding within SRv6. So it makes it much more simple. And then you've got the programmability aspect of SRv6 that just makes it much more programmable because then you've got the, you know, you can just use a Layer 2 SID within SRv6, which just replaces the need for VXLAN. So it just simplifies the network. Yeah.
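The overhead arithmetic behind that answer can be sketched as follows; the 50-byte VXLAN figure matches the decomposition below, and the SRv6 numbers assume an outer IPv6 header plus an SRH of 8 + 16n bytes when n SIDs are carried, with no SRH needed in the single-SID best-effort case:

```python
# Back-of-the-envelope encapsulation overhead comparison (assumed decompositions).
def vxlan_overhead() -> int:
    # inner Ethernet (14) + outer IPv4 (20) + UDP (8) + VXLAN (8) = 50 bytes
    return 14 + 20 + 8 + 8

def srv6_overhead(num_sids: int) -> int:
    srh = 8 + 16 * num_sids if num_sids > 1 else 0   # reduced encap: no SRH for 1 SID
    return 40 + srh                                   # outer IPv6 header is 40 bytes

print(vxlan_overhead())   # 50
print(srv6_overhead(1))   # 40 (best-effort)
print(srv6_overhead(3))   # 96 (TE path carrying 3 SIDs)
```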
Pavan Beeram: Pavan Beeram, HP. I apologize, I haven't read through the draft, but I just wanted to ask if this draft also covers migration from say a RSVP-TE MPLS network to SRv6? Is that in the scope of this?
Michael: The scope, yeah, it could very well be. And we probably should expand on that. What we're currently covering pretty dang well is MPLS to SRv6.
Pavan Beeram: Okay. So there is this RFC, RFC 8426, that talks about RSVP and SR coexistence as an intermediate step to migrate from RSVP to SR. I would encourage you to take a look at that and then see how that can be fed into this.
Michael: Do you happen to know that offhand? 8426?
Pavan Beeram: 8426.
Michael: 8426, okay. Thank you. Yeah. Thank you.
Dhruv Dhody: Thank you so much, and I agree with what Pavan was saying; I think that's a very good addition to the document. For the VXLAN one, I think we can also reach out to our colleagues in BESS and NVO3, especially since you were saying you want a sanity check on the migration steps you have written. It's a working-group-adopted document, so I would prefer the authors send it themselves, but if you need the chairs' help with anything, please reach out. Besides BESS and NVO3, is there any other group people can think of where we could share this information, especially with respect to migration? Those are the two groups that come to my mind.
Michael: Great. Thank you.
Speaker 1: Thank you. Thanks. Okay, next presentation is from Yisong. Yeah. Please.
Yisong Liu: Good morning everyone. I'm Yisong Liu from China Mobile and I will present the considerations for the traffic steering to SRv6 on behalf of my co-authors.
Firstly, this is the summary of the traffic steering methods. We have mainly two types: the first is based on the destination address and the second is based on flow characteristics. I will introduce them one by one, but I will go through them quickly.
The first sub-type of the destination-based steering method is the static route. The packet's destination IPv6 address matches a static IPv6 route whose next hop is the SRv6 policy, as in the example in the figure. So that's simple.
The second sub-type based on the destination address is the IGP shortcut. The head node treats the SRv6 policy as a direct link to the endpoint. When the IGP shortcut feature is enabled and the SRv6 policy status is up, the head node automatically generates a route to the policy's endpoint address pointing to that policy. If the SRv6 policy goes down, the route is withdrawn and the traffic falls back to forwarding via SRv6 BE.
The third sub-type is color-based. We leverage a BGP route carrying a color extended community and a next-hop address. On the head end, the route is matched to the SRv6 policy with the same color and endpoint, and traffic destined for that route is forwarded into the matching SRv6 policy.
The fourth sub-type based on the destination address is the binding SID. When a packet's destination IPv6 address matches a binding SID, which identifies an SRv6 policy, the packet is steered into that policy. This allows seamless concatenation of multiple SRv6 policies without inserting the full SID list, reducing encapsulation overhead.
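A binding SID effectively turns a whole policy into a single destination match, roughly like this sketch (addresses illustrative):

```python
# Sketch of BSID steering: a destination that matches a BSID is steered into
# the policy that BSID identifies, so upstream only one SID is encoded instead
# of the policy's full segment list. Values are illustrative.
BSID_TO_POLICY = {
    "2001:db8:b51d::1": ["2001:db8:10::1", "2001:db8:20::1", "2001:db8:30::1"],
}

def steer(dst: str):
    sid_list = BSID_TO_POLICY.get(dst)
    if sid_list is not None:
        return ("steer into policy", sid_list)
    return ("ordinary destination-based forwarding", None)

print(steer("2001:db8:b51d::1"))
```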
So let's move to the second main type. The first sub-type based on flow characteristics is DSCP-based. For this method, we first build an SRv6 policy group created for a given endpoint. Within the group we configure mapping rules from DSCP value to color; in this example, DSCP 10 maps to color 100 and DSCP 20 maps to color 2000. We then use a tunnel policy to steer traffic matching the destination into the SRv6 group, and the DSCP value in the packet determines the specific SRv6 policy. This allows different QoS classes for the same destination prefix.
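The lookup chain (destination, then DSCP-to-color, then color-to-policy) can be sketched as follows; the DSCP and color values are the ones from the example, while the endpoint and policy names are illustrative:

```python
# Sketch of DSCP-based steering into an SRv6 policy group.
GROUP = {
    "endpoint": "2001:db8:e::1",
    "dscp_to_color": {10: 100, 20: 2000},   # mapping rules from the example
    "default_color": 100,                    # assumed fallback behaviour
}
POLICIES = {  # (color, endpoint) -> SRv6 policy
    (100, "2001:db8:e::1"): "srv6-policy-low-latency",
    (2000, "2001:db8:e::1"): "srv6-policy-high-bandwidth",
}

def select_policy(dscp: int) -> str:
    color = GROUP["dscp_to_color"].get(dscp, GROUP["default_color"])
    return POLICIES[(color, GROUP["endpoint"])]

print(select_policy(20))   # -> srv6-policy-high-bandwidth
```

The 802.1p, service-class, and TE-class variants described next differ only in which field feeds the mapping to a color.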
The second sub-type is based on 802.1p. This steers traffic based on the 802.1p priority value in the VLAN tag, for Ethernet traffic. It is almost identical to DSCP-based steering; the difference is that it uses the 802.1p field as the classifier. It also requires an SRv6 policy group and a tunnel policy to direct the traffic to the group, and we configure mapping rules in the group from 802.1p value to color.
The third sub-type is based on the service class. This is a local device identifier applied to unmarked traffic, that is, packets that don't carry DSCP or 802.1p information. The traffic is first classified locally and marked with a service class value, and we configure mapping rules from service class value to color.
The fourth sub-type is almost the same as the service class: the TE class. The TE class is also a local identifier, but it supports a much larger range than the service class and therefore more service differentiation. Again, we configure mapping rules from TE class value to color.
The fifth and last sub-type is steering via BGP FlowSpec, a controller-based method for flexible global traffic steering. A central controller defines fine-grained traffic matching rules and distributes them as BGP FlowSpec routes to the head end. The FlowSpec routes include actions, such as setting the color and next hop for matching traffic, enabling dynamic, centralized control over traffic steering.
This is the summary of all of these methods, their key points and characteristics. So that's all. This is the first presentation of this draft; we seek more review and feedback from the working group. Thank you.
Speaker 1: Any questions or comments? Support? Agreeing?
Dhruv Dhody: Hi, we were just discussing this in the chat as well. This felt as if everything defined here, all the options in the summary, is applicable to SR policy in general, both for SR-MPLS and SRv6. Our charter focuses only on SRv6 operations; any generic operations fall under the Spring charter. We have the Spring chairs here as well; any thoughts from them? Should we send them your way? The content looks good. I think it's a good collection of various techniques, and if operators want this in one document, in my view it falls under the Spring charter.
Bruno Rijsman: Bruno is my name. No, I'm a co-chair of Spring. I'm going to defer to whatever Bruno says. Are you going to get up and say something?
Bruno Decraene: Yes, so what I was going to say is: yes, if there is a combination of things, it belongs in Spring; I agree with that. Now, having said that, I have noticed in Spring a trend lately of many authors wanting to split their drafts in two. I don't know that it makes sense for this draft to be split in two; I'm just saying this is the inverse of the trend we've been noticing. The important thing in Spring is going to be engagement on the mailing list. So if you want to post something to the Spring working group and see if we can get engagement, we can do something like that. We don't have any more time this afternoon in our meeting to talk about this draft, but send it to the list and see what happens. We can't make any promises.
Michael: Quickly, could you just elaborate on splitting in two? What would be part one and what would be part two?
Bruno Decraene: So we've seen several of the drafts in Spring that cover both SRv6 and SR-MPLS. And so some of the authors, for example, recently split up one of those drafts into what the SRv6 machinery is going to be and what the SR-MPLS coverage is. That's what we've been seeing there.
Michael: Thanks. I thought it was like specification part one and then operational part two, but it's fine. Thanks.
Dhruv Dhody: If we decide to do that, I think we also need to update the document. Right now the document says everything is for SRv6, but in reality, when we go through it, it's applicable to both; you would need to refresh it and make sure the content reflects that it covers generic SR policy rather than SRv6. And if something needs calling out specifically for SRv6 (there is an example in the chat as well), we can still add a section in the same document for that, which is a good thing to point out. Okay. Thank you.
Speaker 1: Thank you very much. Okay, next presentation is from, yeah, Martin. Yeah, please.
Martin: Okay, so this is just a small update on the draft on addressing considerations. Most of the work was actually done by Jakub, who can't be here this time, so I will present the updates. Let's do a quick recap of the draft, which was already presented at the last IETF meeting. The main goal is to provide operational guidelines for service providers, mainly because it became obvious to us that a good addressing scheme is crucial for leveraging the actual advantages of SRv6 in general, and in particular when using compressed SIDs. Our design goals are simplicity, a strong focus on summarization, which seems crucial, and traffic engineering efficiency, so we really get the advantage of compressed SIDs. The document focuses on the major use cases: it is tailored around the F3216 format, which as far as I know is the one mostly used, and concentrates on the NEXT-C-SID flavor, but in both respects it is probably applicable to other formats as well, with some adjustments.
This is the gist of the actual recommendations; there are three of them, tailored for what we call small networks, large networks, and very large networks. Small networks could theoretically have up to 57k devices; with summarization the efficiency is apparently somewhat less, but this probably covers quite a lot of use cases. We have an SRv6 C-SID block, we suggest a certain placement of some bits for flex-algo, and for the actual node SID we suggest differentiating between a set ID and a node ID. In particular, between the set ID and the node ID we can do address summarization, which helps a lot to reduce routing table and forwarding table size. The recommendations for large and very large networks below basically move the flex-algo bits a bit to the left, so there is room for something called a region, while still allowing some summarization.
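The layout idea can be sketched as below. The split of the 16-bit node part into flex-algo, set-ID, and node-ID fields follows the draft's structure, but the concrete widths here are placeholders, not the draft's recommendation:

```python
# Sketch of an F3216 NEXT-C-SID node-SID layout with a set/node split.
FLEX_ALGO_BITS, SET_BITS, NODE_BITS = 2, 6, 8       # placeholder widths
assert FLEX_ALGO_BITS + SET_BITS + NODE_BITS == 16  # the 16-bit node part

def node_function(algo: int, set_id: int, node_id: int) -> int:
    """Compose the 16-bit function value of a node SID."""
    assert algo < 2**FLEX_ALGO_BITS and set_id < 2**SET_BITS and node_id < 2**NODE_BITS
    return (algo << (SET_BITS + NODE_BITS)) | (set_id << NODE_BITS) | node_id

# Summarization: one advertised route per (algo, set) covers 2**NODE_BITS nodes.
print(hex(node_function(algo=1, set_id=3, node_id=7)))   # -> 0x4307
```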
So what changed since the last version? The new version has been out for a few weeks. Basically, it now directly reflects that we focus on the NEXT-C-SID flavor, the status is informational, and we started to add a short section on IP address management. So that's it. We'd be happy for any contribution, maybe in particular on whether the concepts for small, large, and very large networks seem good, and in the end, if all is good, whether it would be time for working group adoption.
Speaker 1: So, Jim online?
Jim Guichard: Yeah, hi. Just any particular reason for focusing only on NEXT-C-SID and not all of the C-SID flavors?
Martin: Yeah, that's basically because it seems to us that this is what's mostly used, and it makes it much easier to be direct and to the point and have the examples tailored to it. In general, if I understand it right, the whole addressing scheme would apply to REPLACE-C-SID as well.
Jim Guichard: Okay.
Speaker 1: Okay. I have one question. You know, currently NEXT-C-SID can also be used in the data center, especially AI data centers; we see some use cases there, right? So do you think there is some difference between addressing for the WAN and for the AI data center?
Martin: I don't see any difference whether it's a WAN network or a data center network. You just have to check whether the size fits: whether you can make use of aggregation and whether the size of the overall network fits. You just have to pick the right addressing scheme, whether small, large, or extra-large networks.
Dhruv Dhody: Adding on to what Jim was saying: as an individual document, it's up to the authors, and if you want to keep it focused on NEXT-C-SID, that's completely fine. But since we are now asking for adoption, I would also ask the working group to give that feedback early if it is a concern for you that the authors only focus on NEXT-C-SID. We have standardized both mechanisms, and it's up to the working group to insist; as chairs we can do that, but I want that feedback early. So especially people who have done REPLACE-C-SID and have deployed it or are active: please contribute now if you have specific suggestions, and then we can make it generic. It's always best if it includes both, but if not, that's what will happen. So my suggestion is to please be active and provide text early, and don't wait for the last moment. And we have Jim back.
Jim Guichard: Yeah, so just to add to your point: as the AD that's eventually going to get the document if it's adopted, I would immediately be asking why not REPLACE-C-SID, and if the document doesn't tell me why not, then I would be asking that question. So basically, either the document needs to include both flavors, or there at least needs to be text, based on working group consensus, as to why it's not included.
Martin: Okay, so the point is: either include it or have a statement on why we don't consider it. Yeah.
Jim Guichard: Yeah, exactly. I mean, I don't care whether you do or don't; it's up to the working group, not my decision. But obviously, to save me asking that question later, it's better to just make that clear. Thanks.
Martin: Okay.
Speaker 1: Those are good comments for the mailing list. Is there any other question or comment? No. Let's go to the next one. If you have any comments, please send email to the mailing list. Thank you. Thanks. Okay. Yeah. Haiyang?
Haiyang: Hello everyone. I'm Haiyang from H3C. Today I will present this topic: Service Interworking Between SRv6 Domains.
Okay, for motivation: as operators deploy SRv6, their networks are not always a single domain; they are often split into multiple ASes or administrative domains. Problems arise because locator routes are not advertised across domains, which breaks the end-to-end service path. Our goal is to describe how to achieve end-to-end SRv6 VPN service interworking across domains by adapting the MPLS VPN models, options A, B, and C, to SRv6 environments, supporting both BE and TE forwarding.
Okay, we use this reference topology. CE1 and CE2 represent the customer endpoints. PE1 and PE6 are the ingress and egress PEs connected to the CEs. The ASBRs, PE2 to PE5, are the routers at the boundaries between the ASes. We focus on the traffic from CE1 to CE2, illustrating how the service is maintained across the three ASes.
Joel Halpern: Okay, I want to clarify one aspect of your first two slides. You're showing multiple ASes. If those are multiple ASes under one administrative domain, I can understand that. If those are multiple ASes under separate administrative domains, you may not run SRv6 across them. So I need you to clarify what your administrative and trust relationship assumptions are before we get any further into this.
Speaker 1: Yes, Joel. I'm also a co-author of this draft. You are correct; we need to add some constraints to the text describing what you are saying. Thank you.
Haiyang: Okay. Okay, for option A, VRF-to-VRF: the key idea is that the ASBRs treat the neighboring AS as if its routers were CE routers. They have direct links configured with sub-interfaces bound to specific VRFs. In the control plane, within an AS, iBGP is used to advertise VPNv4/v6 routes and VPN SIDs; between ASBRs, eBGP is used to advertise IPv4 and IPv6 unicast routes. For forwarding, SRv6 BE or an SRv6 policy is used inside the AS, but standard IP forwarding is used over the VRF links. This option is simple to implement but has limited scalability due to the per-VPN link requirement.
Second is option B, VPN route redistribution. The key idea is that the ASBRs do not bind VPNs to physical interfaces; the VPNv4 and VPNv6 routes are redistributed via eBGP between the ASBRs, and traffic is carried over multi-segment SRv6 tunnels. For SRv6 BE operation, the ASBR allocates a new local SID mapped to the original VPN SID and advertises this new SID both within its AS and to the neighboring AS. Forwarding is then simply a destination-address replacement. For SRv6 TE operation, the ASBR uses a binding SID for the SR segment list; a new B-SID type is needed to allow replacing the B-SID within the SRH with the next SID. This option avoids the per-VPN interface overhead by redistributing routes via eBGP, uses SRv6 SID and B-SID translation, and offers better scalability than option A.
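The option-B data-plane step is essentially a per-SID translation table at the ASBR, roughly like this sketch (addresses illustrative):

```python
# Sketch of option-B forwarding at the ASBR: the locally advertised SID is
# swapped for the original VPN SID learned from the neighbouring AS, then
# ordinary IPv6 forwarding continues. Values are illustrative.
LOCAL_TO_ORIGINAL_SID = {
    "2001:db8:a5b1::100": "2001:db8:6::100",   # local SID -> egress PE's VPN SID
}

def asbr_forward(packet: dict) -> dict:
    original = LOCAL_TO_ORIGINAL_SID.get(packet["dst"])
    if original is not None:
        packet["dst"] = original   # destination-address replacement
    return packet

print(asbr_forward({"dst": "2001:db8:a5b1::100", "payload": "vpn-data"}))
```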
The last is option C, multihop eBGP. The key idea is that the egress PE directly advertises VPN routes and VPN SIDs to the ingress PE in a different AS, using multihop eBGP between the PEs. For BE, this requires locator routes to be advertised across the intermediate ASes; forwarding is then simple SRv6 encapsulation to the VPN SIDs. For TE, if there is a cross-domain controller, the controller can directly program the end-to-end SRv6 policy; without a controller, the ASBRs must resolve the next hop to a segment list. Option C offers the best scalability but requires a more complex control plane setup.
Okay, there is also an interesting intra-domain interworking case: hierarchical VPN. H-VPN splits the PE function into a UPE for the user side and an SPE for the service provider side. This scenario can directly reuse the same mechanism developed for inter-domain option B to solve the intra-domain H-VPN scenario with SRv6.
Okay, this slide shows the key differences between the three options and some deployment considerations. Option A is suitable for small deployments and easy to implement, but lacks scalability. Option B offers a good balance for operators without option C's complexity. Option C is best for a unified network: it is highly scalable but complex, because it requires locator route advertisement. The choice depends on network size, administrative boundaries, the trust model between ASes, and existing operational models. Okay, that's all. Thank you for your attention. Any questions?
Speaker 1: Seeing none; please see the chat after the presentation, okay? So let's go to the next presentation. Okay.
Jim Guichard: Sorry, wait a moment. Yeah. I'm just going to make a comment based on what I said in the chat. I haven't read the document in detail, but alarm bells are going off in my mind that this is quite potentially more of a Spring remit than SRv6 OPS, if nothing else for the security aspects of cross-domain SRv6. But I think at a minimum the SRv6 OPS and Spring chairs should at least have a conversation to determine, based on the charters, where this would actually sit, whether that happens now or when there's some desire to get adoption of the document somewhere. I don't want to stick my nose in too much, but where this fits in charter, whether it be Spring or SRv6 OPS, is an important question to answer.
Dhruv Dhody: Jim, this is Dhruv. Yeah, point taken. We should sit down with the chairs and discuss this. It's not the first time that this point comes up when we talk about boundaries. So let's set up a meeting for that one.
Speaker 1: Okay, thank you. Next one, Jieming. Please.
Jieming: Okay. Hello everyone. This is Jieming from China Mobile. I would like to introduce a new draft called QP-based SRv6 Load Balancing Deployment. This draft focuses on combining SRv6 load balancing with the QP and aims to provide a refined solution for DCI network management and control.
So let me first introduce the motivation behind this draft. With the rapid development of AI, inter-DC network traffic is no longer limited to simple backup data transfer, but exhibits distinct characteristics of large bandwidth, low latency, and high burstiness. There are large numbers of RDMA (RoCEv2) elephant flows between GPUs, characterized by being long-lived, high-throughput, and extremely sensitive to latency and packet loss. Meanwhile, in AI clusters, tasks often last for several days with relatively stable five-tuple fields and a highly stable communication pattern, and the flows are always bound to long-term QP instances.
So this causes two main problems. The first is link congestion caused by ECMP collisions: because the five-tuple fields are fixed and elephant flows are present, multiple similar large flows are hashed onto the same physical link. The second is a lack of dynamic adaptability: the traffic is bound to QP instances, so even when congestion is detected on a specific link, ECMP cannot dynamically migrate existing long-lived flows away from it.
To address these problems, the industry has attempted to introduce finer-grained load-balancing mechanisms such as flowlet or packet-level load balancing. However, these approaches introduce complexity related to packet reordering, especially in DCI networks. So this draft proposes a solution whose key idea is simple and effective: since the QP is the basic unit of RDMA communication, each connection has a unique QP. Even if multiple flows share similar IP addresses and ports, the QP IDs will always differ. So we can not only split large flows into finer-grained flows using the QP, but also improve the performance of ECMP scheduling.
So the solution can be divided into two steps. Step one is QP-to-policy mapping: the ingress device parses the RoCEv2 packet header, extracts the destination QP ID, and maps the traffic to a specific SRv6 policy according to the QP ID. Step two is enhanced hashing based on SL (segment list) scheduling: within the selected SRv6 policy, an enhanced hash algorithm takes the QP as input to perform deterministic fine-grained load balancing and select a specific SL.
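To make the two steps concrete, here is a minimal Python sketch, an editorial illustration rather than text from the draft. The names qp_policy_map, parse_dest_qp, and select_sl are hypothetical, and a real implementation would run in forwarding hardware rather than in software.

```python
# Sketch of step 1 (QP-to-policy mapping) and step 2 (QP-aware hash to an SL),
# under the assumptions stated above.
import hashlib

# Step 1: QP-to-policy mapping (would be installed from BGP FlowSpec; see the
# use case below).
qp_policy_map = {1: "policy-1", 2: "policy-1", 3: "policy-2", 4: "policy-2"}

# Each SRv6 policy carries several candidate segment lists (SLs).
policy_sls = {
    "policy-1": [["P1-sid"], ["P2-sid"]],
    "policy-2": [["P3-sid"], ["P4-sid"]],
}

def parse_dest_qp(roce_payload: bytes) -> int:
    """RoCEv2: the BTH follows the UDP header; the 24-bit destination QP
    sits at BTH bytes 5..7 (after opcode, flags, partition key, reserved)."""
    return int.from_bytes(roce_payload[5:8], "big")

def select_sl(five_tuple: tuple, dest_qp: int) -> list[str]:
    """Step 2: deterministic enhanced hash over five-tuple plus QP."""
    policy = qp_policy_map.get(dest_qp)
    if policy is None:
        raise LookupError("unmapped QP; would fall back to normal ECMP")
    key = repr(five_tuple).encode() + dest_qp.to_bytes(3, "big")
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    sls = policy_sls[policy]
    return sls[digest % len(sls)]

# Usage: dest_qp = parse_dest_qp(udp_payload); sl = select_sl(five_tuple, dest_qp)
```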
So let me explain with a typical use case. Here two DCs, DC1 and DC2, are connected by PE1 and PE2 over four physical links, P1 to P4. The solution has three steps. Step one: the controller generates BGP FlowSpec rules that include the destination QP. The BGP FlowSpec route defines the matching conditions, including the source and destination addresses, the UDP protocol, the source and destination ports, and, most importantly, the destination QP value. The FlowSpec action redirects the matched flow to a specific policy. Step two: map QP ranges to SRv6 policies. The ingress PE receives the BGP FlowSpec route delivered by the controller and generates the mapping list. In this scenario, QP1 and QP2 are mapped to policy 1, and QP3 and QP4 are mapped to policy 2. Step three: within each policy, we use the five-tuple plus the QP as the hash key to select different paths, so QP1 to QP4 can hash to different SLs. The solution has two advantages: first, finer-grained traffic steering; second, moving from random distribution to controlled scheduling, which gives network operators more precise traffic control. As a next step, we want to absorb suggestions from the WG. Okay, I think that is all.
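For illustration, here is a hedged sketch of the controller-side rules from step one and the mapping built in step two. The field names are placeholders, not an actual BGP FlowSpec encoding; note also that matching on a destination QP would require a FlowSpec extension beyond the standard RFC 8955/8956 components.

```python
# Hypothetical representation of the controller-generated rules; real rules
# would be encoded as BGP FlowSpec NLRI plus a redirect action.
flowspec_rules = [
    {
        "dst": "DC2-prefix", "src": "DC1-prefix",
        "proto": "udp", "dst_port": 4791,   # RoCEv2 well-known UDP port
        "dst_qp_range": (1, 2),
        "redirect_to_policy": "policy-1",
    },
    {
        "dst": "DC2-prefix", "src": "DC1-prefix",
        "proto": "udp", "dst_port": 4791,
        "dst_qp_range": (3, 4),
        "redirect_to_policy": "policy-2",
    },
]

def build_qp_policy_map(rules):
    """Ingress PE, step two: expand QP ranges into the per-QP mapping list."""
    mapping = {}
    for rule in rules:
        lo, hi = rule["dst_qp_range"]
        for qp in range(lo, hi + 1):
            mapping[qp] = rule["redirect_to_policy"]
    return mapping

assert build_qp_policy_map(flowspec_rules) == {
    1: "policy-1", 2: "policy-1", 3: "policy-2", 4: "policy-2"}
```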
Speaker 1: Any questions? Comments?
Jeff Tantsura: Jeff Tantsura, Nvidia. I can repeat my comments from IDR: I don't think the complexity involved is worth implementing. Another interesting point: the QP ID is governed by the collective library; that's where it comes from, and it's assigned dynamically. Usually you define a range of QPs, or the number of QPs as a runtime variable, and then they are auto-generated. So now you somehow need to communicate with the host side, or in fact with the NIC communication-library layer, extract the QP IDs, get them to the controller somehow, and somehow figure out the paths for them. The amount of interaction involved is absolutely non-trivial, and it's quite dynamic. And on top of what I already said, it's enough to extract the QP ID during local hashing to load-balance traffic. Again, the benefits are arguable. There are results published by Meta last year, and in my understanding they actually moved away from this. So all in all, I see where you're coming from, but I don't think it's worth doing given the complexity.
Jieming: Okay. We will discuss this. Thank you.
Speaker 1: Okay, next one, Funyuan. Please.
Funyuan: Hello everyone. This is Funyuan from China Mobile, and I will present these slides on behalf of our co-authors. The draft is called SRv6 Policy Selector.
Before we start, a quick recap. The motivation for this proposal is that the IETF has specified SR policy protection inside a single policy, but there is another construct called a composite SR policy. Inside a composite SR policy there are several constituent policies, and protection among those constituent policies has not been defined, so this draft defines that. We first presented this idea two years ago, and at the Shenzhen meeting we found that this proposal doesn't involve any standards extension, so we moved it to the SRv6 Ops group. At the last meeting we received comments that this mechanism should cover both the SRv6 and SR-MPLS cases. So there are two updates: one is the SR policy selector itself, and the other is an example.
So in this picture we have the parent SR policy, which is another name for the composite SR policy. We define the SR policy selector in the middle, which selects among the constituent SR policies based on quality measurement. On the left side is the classified traffic, which is actually the customer service; we are trying to define one SR policy selector per user service. The selection is based on quality measurements checked against defined thresholds, and we use a priority, from high to low, to break the tie when all constituent SR policies meet the thresholds.
So this is an example we have also implemented, with SRv6 for the moment. Step one is the definition of the services: there are two services, each connected to its own SR policy selector. Step two, there are two policies, which have not been changed at all. Step three, we define the thresholds, some on latency and some on bandwidth, and we associate the priority with the service. That's all. As a next step, we would welcome comments on what we need to do to move this forward. Thank you.
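As an illustration of the selection logic just described, here is a minimal Python sketch under stated assumptions: the class and field names (ConstituentPolicy, select_policy) are hypothetical, and the draft's actual behavior when no constituent policy qualifies is not modeled.

```python
# Sketch: pick the highest-priority constituent policy that meets the
# per-service quality thresholds; priority breaks the tie when several
# (or all) policies qualify, as described in the presentation.
from dataclasses import dataclass

@dataclass
class ConstituentPolicy:
    name: str
    priority: int              # lower value = higher priority
    measured_latency_ms: float
    measured_bw_mbps: float

def select_policy(policies, max_latency_ms, min_bw_mbps):
    qualifying = [p for p in policies
                  if p.measured_latency_ms <= max_latency_ms
                  and p.measured_bw_mbps >= min_bw_mbps]
    if not qualifying:
        return None  # fallback behavior would be defined by the draft
    return min(qualifying, key=lambda p: p.priority)

# Example: a service requiring latency <= 10 ms and bandwidth >= 100 Mbps.
candidates = [
    ConstituentPolicy("policy-A", priority=1,
                      measured_latency_ms=12.0, measured_bw_mbps=500.0),
    ConstituentPolicy("policy-B", priority=2,
                      measured_latency_ms=6.0, measured_bw_mbps=200.0),
]
assert select_policy(candidates, 10.0, 100.0).name == "policy-B"
```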
Speaker 1: Let's discuss this document through the mailing list. Okay. Next presentation; we have five minutes. Linda, please.
Linda Dunbar: Okay, I'll make it fast. This is the use case for our draft. Basically, for some services, such as financial services, when they go through SRv6 they need ESP protection. You can see here that they need an SRv6 policy to traverse the network, so the policy, the SID list, has to remain outside the ESP-protected part of the packet.
So what we do is this: in IDR we have a draft called SD-WAN discovery, showing that when you have IPsec- or ESP-protected services, you basically need to add the tunnel encapsulation path attribute to indicate the appropriate IPsec/ESP-related information. In that document we have defined all the sub-TLVs, and here we are just showing how to use them to allow BGP to advertise those services through the network. Here is a snapshot of the payload, which I think everybody is familiar with; the only difference is that the payload is ESP-protected.
However, in IDR we did not have this specific tunnel type, which we call ESP-protected payload, and IANA has assigned it the number shown here, 28. With it, we can use the tunnel encapsulation attribute to signal this ESP-protected payload through the network, so that both ends know that for those destinations they need to perform encryption using the IPsec attributes advertised by the originating node.
So here is the detailed information in the tunnel encapsulation attribute. Basically, you have a new tunnel type, ESP-protected payload, and underneath it you can include all the IPsec-related sub-TLVs; all of those sub-TLVs have been specified in the IDR document. And here is everything put together: the path attribute with the SID list, showing that in the tunnel encapsulation attribute we added this specific tunnel type and reused the IPsec sub-TLVs specified by IDR.
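For illustration, here is a rough structural sketch of the attribute just described. The tunnel type value 28 comes from the presentation; the TLV encoding below is a toy format, not the real RFC 9012 Tunnel Encapsulation wire format, and the sub-TLV codepoints are placeholders, not IANA assignments.

```python
# Toy sketch: a new tunnel type carrying IPsec-related sub-TLVs, as described
# above. Do not treat the encoding or the sub-TLV codes as normative.
ESP_PROTECTED_PAYLOAD = 28  # tunnel type stated as assigned by IANA

def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Toy TLV encoding: 1-byte type, 1-byte length, then the value."""
    return bytes([tlv_type, len(value)]) + value

def build_tunnel_encap(sub_tlvs: dict) -> bytes:
    """Assemble a tunnel-encap entry: 2-byte type, 2-byte length, sub-TLVs."""
    body = b"".join(encode_tlv(t, v) for t, v in sub_tlvs.items())
    return (ESP_PROTECTED_PAYLOAD.to_bytes(2, "big")
            + len(body).to_bytes(2, "big") + body)

# Hypothetical IPsec-related sub-TLVs carried under the new tunnel type.
attr = build_tunnel_encap({
    0x80: b"\x01",            # e.g. ESP transform / SA parameters
    0x81: b"example-key-id",  # e.g. a key or rekey-material reference
})
```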
Everything else is the same. Here is more of an illustration of the process: PE1 advertises specific routes that need to be protected by ESP toward PE2, and PE2 processes the BGP update; whenever it then receives traffic toward such a route, it can apply the ESP encapsulation to the payload. That's it. This is just the initial draft, and we are looking for more reviews and comments from the working group. Thank you.
Speaker 1: One minute. Any comments?
Jeff Tantsura: Trying to find the queue. Good work, but as I texted, I was expecting to see at least one operator on this work with you, because we are chartered for operators and operational experience: what to improve in operations, what to enhance, and so on. I don't see that in this document. It feels like it's more for Spring.
Linda Dunbar: Okay, so to explain that: these are really our financial enterprise customers; they are not service providers per se, so they don't normally come to IETF. But we can definitely reach out to them so they can chime in and give the requirements. Thank you.
Speaker 1: One minute. As general feedback: since there were lots of -00 drafts this time, we were a little bit liberal and did not enforce whether something is in our charter or not, because we wanted to give people an opportunity to introduce their work. But in general, please make sure your topic fits into our SRv6 Ops charter when you ask for agenda time. Thank you, Linda.
Linda Dunbar: Thank you.
Speaker 1: So let's meet next time in Vienna. See you. Bye.