Session Date/Time: 12 Jun 2026 11:30
Will Law: All right, let's go.
Ian Swett: Okay. That's what I thought. Yeah. That was not the case of dynamic state. We used to do the report. Report part? Yeah. And then we said really bad. Yeah, it was really we didn't go for white out. Now, are you? Now, where are you? Now,
Suhas Nandakumar: That's it, then.
Ian Swett: Did you get the slides I just uploaded?
Suhas Nandakumar: Uh, no, I didn't. Don't worry about it. Just upload a new set. Uh, yeah, the updated slides. No. Hey, did you upload it for that? It's approved. Okay. Set. Put it in meetup cover.
Ian Swett: Uh, we'll see. Like, we had trouble with that in the past, when you do it the last possible second. Uh, okay. Well, I'll ask for slides again and I'll just present whatever. Who is that? Okay.
Will Law: All right, we're starting. I got to page through this. Alan, go.
Aman Sharma: I'm doing this as fast as I can.
Ian Swett: Since it's done. Yeah, yeah. Okay, 9:24. Upstream delivery timeouts. So, the delivery timeout is defined as hop-by-hop. There is no way, right now, to express a cumulative end-to-end timeout across a multi-peer path. Latency-sensitive subscribers, for example those wanting less than one second end-to-end, can't see if a relay has a larger timeout value. But so far, this has not really been an issue. Like, one person who doesn't come to a lot of meetings filed this issue like a year or two ago. So, is this important? Should we just close it?
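The hop-by-hop point above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each hop may hold an object for up to its own delivery timeout, so the worst case along a path is the sum of the per-hop values. The `Hop` class, the hop names, and the millisecond figures are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    """One hop on the path; names and values are hypothetical."""
    name: str
    delivery_timeout_ms: int  # timeout this hop applies to objects it forwards


def worst_case_end_to_end_ms(path):
    """Upper bound on object age at the last hop, assuming every hop
    may hold an object for up to its own delivery timeout."""
    return sum(hop.delivery_timeout_ms for hop in path)


# The subscriber only negotiates the timeout on its own hop (the last one).
path = [
    Hop("publisher->relay1", 500),
    Hop("relay1->relay2", 2000),  # a larger value the subscriber cannot see
    Hop("relay2->subscriber", 500),
]
subscriber_budget_ms = 1000

total = worst_case_end_to_end_ms(path)
print(f"worst case {total} ms, within budget: {total <= subscriber_budget_ms}")
```

Under these assumed numbers the subscriber's own hop looks fine at 500 ms, while the path as a whole can exceed the one-second budget.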
Colin Perkins: Oh, I think we should close it. And I'll explain why: this problem has been discussed infinitely many times, and the only solution requires time synchronization. And though I actually think time synchronization is completely trivial on the modern internet, and it works, every time I try to convince anyone at IETF that it is possible to synchronize time between computers, it's like it's an impossible argument.
Suhas Nandakumar: Are you saying if you tried to put something in they would just shut us down or...
Colin Perkins: I'm saying we'll shoot ourselves in the foot if we try and say you must have an NTP-synchronized server.
Suhas Nandakumar: Right. Definitely.
Colin Perkins: Um, but, so, I don't want to, and I'm not logged on with Jan's computer, but I agree we should close with no action. Um, but it's not that I don't want this, and it's not that it's not possible to write solutions for it; it's just impossible to get everyone to agree to it at IETF, in my experience. I've had slides on timestamp signature, so we can... I'll make... Is there a better way to do this? It might be. Why don't we... I'll put them in the... like, we think we don't need to do any...
Suhas Nandakumar: Well, does anybody else have a thing inside?
Will Law: Yeah, I think we should close with no action. Okay, done. There are so many other things you have to propagate along the path; this is one of them. Just a reminder to log in to the Meetecho session, please, either with the QR code or go to the track room. Okay. So, we'll close that one. Yep. Um, stream authorization alias with extension. This was hard. So, first question: have you implemented, or do you plan to implement, auth alias compression? I have implemented it. Raise your hand physically if you... Yeah. How many have not and do not plan to? Okay. Um, I strongly plan to remove auth compression from the draft. Interest is low, and I pre-anticipated the answer to that. So, then, um, this is less of an issue in MoQ than it was in HTTP because subscriptions are long-lived, so you don't have to give the token every second or every two seconds. You give it once on the subscription. So, it's not as dramatically bad as it maybe would have been in that environment. Once we introduce bidirectional streams, it reduces the effectiveness of what you can do without making the software fancier, or without something that has the complexity of QPACK to manage things coming on different streams. Like, the token that was sent on this stream is now being used by this other stream, and how do you make sure it's available for you to process it? I wrote an extension that does this; it's called MoQ pack. You can go see it in the link. Basically, for every control message we have, it defines a parallel control message in which all of the parameters and track namespaces are in a compressed block, and it uses QPACK. If people want to have compression, then we can compress track names, compress auth tokens, compress everything in a way that works across QUIC streams. And I kind of think that having this auth compression scheme in our document is going to draw more scrutiny when we get to end game.
People are going to be like, what are you doing here? So, this is my pitch.
Colin Perkins: Oh, on this issue? Well, so, I know this was your baby, and I've worked on...
Suhas Nandakumar: Just let me motivate why we added this. First, we wanted to protect actions within MoQ, and while a subscription might be long-lived, we just had a discussion this morning about switch from, right? Actively ABR switching. Your subscriptions might just last 5 seconds before you switch again. So, now I've got to send a token, and I've got to send a token again. And the trouble is these tokens are 10 to 100 times larger than the message that's being sent. Hence, the alias. The alias is just a number, and it represents the token. So, I still think they have a lot of utility. The other side is we have to invent protections that are not applied frequently, but are just applied like it's a macro or on setup, which is lowering our ability to actually protect the code. So, I would rather simplify the compression scheme, if there's a complexity to it. I don't want to go to...
Suhas Nandakumar: It's the opposite. It's that the current scheme, now that we're on bidirectional streams, is harder to make use of effectively. Like, you can't use an alias until you get the okay from the message that sent it to the other side.
Suhas Nandakumar: In theory, you can, but 99% of the time, you can't.
Suhas Nandakumar: Well, but if you try to use an alias and it's not there, it's a session error.
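As a rough illustration of the alias behavior described in this exchange, the following Python sketch models a receiver-side alias table in which an alias is a small number standing in for a large token, and referencing an alias the receiver does not hold is treated as a session error. The class names, the method names, and the 400-byte token are assumptions made for the example; the wire encoding and the acknowledgement step are not modeled.

```python
class SessionError(Exception):
    """Stands in for a fatal error that closes the session (hypothetical name)."""


class AuthAliasTable:
    """Receiver-side map from a small integer alias to a full token."""

    def __init__(self):
        self._tokens = {}  # alias (int) -> token (bytes)

    def register(self, alias, token):
        # Called when a message carrying the full token and its alias arrives.
        self._tokens[alias] = token

    def resolve(self, alias):
        # Called when a later message carries only the alias.
        if alias not in self._tokens:
            # Per the discussion: the alias "is not there", so the session fails.
            raise SessionError(f"unknown auth alias {alias}")
        return self._tokens[alias]


table = AuthAliasTable()
token = b"t" * 400  # placeholder bytes, sized like the token mentioned in the session
table.register(7, token)

# Later messages can carry the number 7 instead of repeating 400 bytes.
assert table.resolve(7) == token
```

With messages arriving on different streams, the message that registers alias 7 may not have been processed when a message using it arrives, which is the ordering problem the speakers are debating.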
Suhas Nandakumar: Right. But the relay thing, like, say, well, maybe one's coming, or are there ways to accommodate that, because most of the time it would work. I mean... Uh, the answer is what QPACK does. Like, it already tried to boil this problem of things coming on different streams and how you sync them. But, I mean, that's a failure mode. That's the thing. What's that? I mean, even QPACK has failure modes. I mean, with timing, etc. I mean, if you follow the rules, it's not like you can't... I don't want to play failure. From that perspective, there are certain conditions where you still can't send it. Right. But in that mode, the peer gets to tell you how many times you are allowed to send something that it has not yet acknowledged that it has. Because that can cause queueing on its side, where it has to hold your request while it's waiting for the update to arrive. So, there's a tunable in it. I don't want to go into the whole design, though. That so many things to set my timer to. Okay, Colin.
Colin Perkins: Um, I think I'm a little bit on the... So, I think we're asking the wrong question with who plans to implement this. I mean, we haven't really implemented auth yet anywhere, and I think once we start implementing and requiring auth and seeing how frequent the updates are, it will suddenly become clear whether it's no big deal to send the 10k requests every time, or whether this is, you know, half our traffic. So, I actually think we're too soon to really decide whether we need this or not, and that we need to see this. And I do worry about the ABR switches. And there was another case; I was against this to start with, and there was a case that somebody raised where you were going to have to refresh hundreds of things, you're going to have to do like a hundred transactions all pretty much at the same time, and I was like, oh my god, yeah, that's a huge amount of traffic. And it was an update case. Um, so, I don't know, I feel a little bit like we need to implement auth before we know whether we need this or not.
Will Law: Oh, okay. Martin.
Martin Duke: As an individual, I think, given that we have to fix it anyway, like, the current thing is not satisfactory, let's write down what you've done, how to do it, and evaluate that outside of the main MoQ. What's the part that doesn't work right now? That we have, uh, separate streams, and they're out of sync. But the draft is not broken; it just says you're not allowed to use an alias unless you've got the okay from the message there. Okay, so it's very simple to implement, so never mind. Okay, then, not broken. I mean, no one has implemented it. I've implemented it, but I haven't implemented that piece; in draft 18 I just turned it off because I don't want to deal with it. We just got token implementations in the last couple of weeks, so it's like, okay, a long way to go here. Victor. Yeah, given that we now have bidirectional streams, it might be more of a cargo-cult exercise. Uh, Suhas. Uh, I think I did get, uh, a write-up on some code, but the token itself, the size of that is around 400 bytes. Uh, again, the token is useful in the direct action, but if the actions increase more, and also depending on the crypto you use, it might go to many, many kilobytes. So, having said that, I am also in agreement with... I think that we are not at the point of... Um, I'm just going to echo one comment I see from, uh, Mike in the chat, which is: it's, uh, maybe too early to remove it, but moving it to an extension seems reasonable. So, we could just take everything that we have and just move it to another document, which is like, if you want, this is the core, we send the tokens every time. It's already effectively given by a setup option, which is all you need to make it an extension. So, it's fine. It's not totally different if we write it in another place, and it shrinks our document by 500 lines.
Um, does that... Do people have a problem with that approach, or...
Suhas Nandakumar: I think we should have a fall-back. Uh, I feel like, you know, we even have important time for five tokens, so we'll make a decision of maybe, yes, move. Okay. Again, I also feel like this is a time where we have open source libraries now to transport these auth schemes. We should give at least some time for people to... It's too early to remove. Okay. Um, so I guess I'm hearing mixed feedback: some people are on team extension, some people are on team too early.
Will Law: MoQ-secure-objects is very pro-extension, but obviously, there are some...
Suhas Nandakumar: MoQ-secure-objects is pro-extension. Um, I see two. I mean, I don't know if we want to call for a show of hands. I guess I didn't see your hand shoot up about who's going to implement this soon. So, are you in...
Colin Perkins: Oh, we're not. We can't run a trial until this is implemented.
Suhas Nandakumar: The compression scheme?
Colin Perkins: No, the authorization scheme.
Suhas Nandakumar: Right. Are you going to implement compression and tell me if it's useful or not? Yes. Okay, by when?
Suhas Nandakumar: Next week.
Colin Perkins: Next next week, but we have the things.
Suhas Nandakumar: You're saying you're going to help write, right? For the group last. Okay, fair. Uh, I'll let you know. Uh, it might be, you know, so you've got like 2 months to show that you really need what's in there right now, or then I think it goes to an extension.
Colin Perkins: Why did you choose this instead of sticking over with this of like track alias? Why did you choose auth alias instead of that track?
Suhas Nandakumar: Yeah, this issue was filed a long time ago by somebody else, and I'm just closing issues. If you want to file an issue about track alias, file it. Do you expect MoQ pack to address things to not need track alias? No, not... Go read MoQ pack, or go watch the HTTP session from IETF 115.
Suhas Nandakumar: But track alias is in DBT because it's auth list, is taking compression. It compresses everything: it compresses track namespace, it compresses track name. Right? Yeah, but not in the control. Auth alias is a compression for a big token. Sorry. Auth alias has the same problems that the other one does, because it relies on in-order processing to work, and it's also only used in the control plane, not the data... sorry, it's only used in the data plane, not the control plane. So, anyway, go read MoQ pack, tell me what you think. Uh, I guess this is sort of temporarily parking this issue, but with a caveat. Yes. So, run on a timer, right? Yes. If nobody is saying, like, I have to have this, and here's the data that shows how valuable it is to me, then, you know... Like, I don't know, 2 months. Oh, no. I'm not on the fine thing. Like, let's just let us implement this stuff. You said last call; apparently, the first last call is going out in mid-August. Well, and if by mid-May, like... Is that your decision, or somebody else's? Okay. That is my target. So, some people might be thinking here, like, let's work on this stuff. When I figure this out, there's a lot to be done on auth, and this is at the tail end of auth, yet the draft is missing most of the auth stuff. So, there is another reason to take it out. No, no, no. But I just think we need to figure out the auth. I mean, I'm not on team take it out or not take it out. I'm just saying trying to decide this today makes no sense whatsoever, trying to decide on everything. I'll move on. But you're denying the fact that auth is missing from the draft, and we need to figure that out, right? I mean, there's a place to carry auth, there's a track for auth, which is the track payload.
There's no discussion in the draft, and I've brought this up many times. Okay, there's not an issue open on it, but it's a big topic, right? We need to say what things you auth, and when do you evaluate them, and... What does that have to do with compression? Well, look, if you don't have to authorize updates, this bug isn't relevant. I'm happy to move on. Um, but I don't agree the consensus is, like, we're going to have a timer on this and it comes out in 2 months. That's what you are saying. Uh, if the chairs want to say that's the consensus of the working group, I'm glad for the chairs to say that, but that's what needs to happen here if you want to put that on. Appreciate it, Colin. Thank you for correcting me, chairs. How would you run things? Let's wait and see, uh, and when it's relevant to do the kind of final next round of cleanup, we can... I'm very satisfied with that. Yeah. When we get to a decision point, we'll make a decision. So, we're way over time on this, but I'm sorry. So, there are two designs: there's the current design and there's MoQ pack. And what is the relative position of those two things? We don't have time to talk about MoQ pack; if you would like, I can fill you in. Well, no, no. I don't want to discuss it. What is your intent of that draft versus the... My intent of that is, if we really want to get the maximum possible compression out of MoQ in general, not just auth, but also track namespaces, etc., given our bidi stream design, I think MoQ pack is what we want. That can happen in fact, right? Yeah, and I don't really care if it does or not. I let people decide if they want to compress. So, do you want to replace the current scheme with MoQ pack in the... Potentially. When people decide that compressing is a big problem, we can look at it. Not in the core draft, though.
I mean, even QPACK wasn't in technically... Yeah, but, sure. All right, let's move on. Happy to take feedback on my repo on MoQ pack; if you want to read it, you can. I'm happy to talk about it anytime, but not now. Um, Suhas, these are your slides, enjoy. Okay, I will. So, this was an issue opened by Magnus. The idea is that, like, we have end subscribers and original publishers. Original publishers announce some publish namespace, and some tracks automatically end up at the end subscriber, and there's no way to verify, um, is that publisher allowed or authorized to publish? Thinking clearly about this: today we have, uh, text where the spec basically says the receiver verifies the publisher is authorized. But it's kind of a black box; it does not clearly say, uh, what this means. But if you really think about it, the way our auth works is hop by hop; we do not do an end-to-end authorization, uh, anyway. Uh, so, the proposed resolution here is to add text in the security considerations section, basically talking about the different roles: for publisher to relay, what the relay would verify, uh, for a publisher being authorized. And in the subscribe namespace case, when a namespace comes, the publish namespace matching with that subscribe track space basically helps in, uh, what auth you do with subscribe tracks or subscribe namespace, and everything under that would inherit that authorization, that trust. And in the same way, between relay and relay, we do not define anything, right? But the expectation is that if a relay does not authorize a publisher on the ingress, it will not forward that, uh, on the egress.
There might be an out-of-band mechanism where, uh, an application can control the identity and authorization associations between the publisher and end subscriber, but the MoQ core transport would not define that.
Colin Perkins: Okay. Um, so, I think we've got a whole bunch of assumptions here; we probably just disagree with them, we can't sign here, but, yeah, I don't think it's hop-by-hop. The use case we described before was, you're going to send a subscribe (I'm not on the on-topic list now, okay): a subscriber sends something to a relay, and the relay sends it to the original publisher. And the relay might want one token, and the original publisher might want a different token that you're going to authenticate with. And this was one of the reasons that we put the support for multiple tokens in, and there might even be another relay network in there, and you had two different authorization tokens for relay networks, plus one for the original publisher. So, like, I don't think it's hop-by-hop.
Suhas Nandakumar: No, the point is that parameters are hop-by-hop. All parameters...
Colin Perkins: No, no, but we're going to have to put statements in that these ones are copied from, you know, upstream, like lots of other things are copied.
Suhas Nandakumar: Also, all subscribes can be aggregated; that's why parameters are hop-by-hop.
Colin Perkins: So, the authorization of the subscriber is one of them. Like, that is, I think, going to be one of the most difficult things for us to deal with in this draft before we're done. Please file those, thank you. So, uh, but just at a high level, uh, what I'm saying is very similar to what you're thinking. What I'm trying to say is that once a subscriber sends an authorization token to the relay, and then what the relay does, let's say the next hop is the original publisher, right? Um, that token might be different. But the scope of that token for the validation is that hop. If that publisher is allowed to publish, then this subscriber will be let to get that data. Right? Right now we say receiver verifies publisher, so the proposal here is that it speaks about, if a publisher publishes something with an auth token, what the relay authorizing means; same thing, uh, relay-to-relay we will not say much. Sure. I see your concern, but...
Colin Perkins: I think we need to get at least some design team to think about how authorization works on it. Because, just to sort of clarify what I think is going to be the complicated problem to deal with: we have two subscribers that come with possibly different auth tokens. Right. And so, the first subscriber comes with auth token A. The relay says, yep, that's a great token, and sends the subscribe up to the original publisher. And then the second subscriber, B, comes with another auth token, B, and the relay goes, that's great, I like your token, it looks good. But now I have to make the decision: should I give you the data that was authorized with auth token A that's already flowing to me, or do I need to do something else to check whether auth token B is valid or not? And like, there's a bunch of things we could design in this space. Eckhart and I discussed this for many hours, there are whiteboards; like, it's not a short conversation. Um, but we'll have to make some trade-offs to make a relay network work. Can I hijack and say, as chairs, we can take an action item to form a design team on auth, and put this issue to that design team? I'm kind of tired of talking about it; I mean, can we form the design team? So, does anyone have some on this sub-team to say that... Oh, yeah, actually, I think it is hop-by-hop. We don't have to form a design team, because the token is just for who you are sending it to. That relay is going to have a different token to talk to the next relay. And the relay that's exiting the network and going to the original publisher will have another token. So, I really think auth is hop-by-hop. That's one model, but, I mean, that model has issues too, so, like... It has issues, but... That's Magnus's only concern, so we're following what he... Yeah, right. Okay, next issue. No design team. Okay. Thank you. This is also auth related, yes?
Yes. No, it's on the list, but not on. Yeah, this is also auth-related stuff, I think we can punt this. Um, okay, so, we'll move 1503 to auth as well. It sort of falls in the same space. Um, thanks for making the slides, Suhas.
Suhas Nandakumar: No problem.
Will Law: Okay, lightning round. Do we need to allow multiple ranges in a FETCH request?
Suhas Nandakumar: We have filters also. Well, okay, let me finish. So, now that we have filters, you can already do some of the things that you couldn't previously do with FETCH, right? Like, if I detect that I'm missing all of the sub-group of, say, every odd... like, one of the reasons this issue was filed was: if I had every other object in my cache, do I have to make, you know, one FETCH for all these little mini ranges? Um, but now that we have filters, if all those objects go into one sub-group, you can say, like, trying to get or I already have sub-groups here, so that's one. Well, I think what you're saying is location filter will let you spell multiple... Does your proposal say we could make it... All the other filters are multiple ranges. Location filter, if we do it straight from 1401, does not support multiple ranges, but it would be more in line with all the other filters if we did support multiple ranges, because the syntax of all the other filters is multiple ranges. I'll move on. Maybe the question is, do people want this? Or do people want to keep it simpler? Because you're not going to do the silly case of alternating ones, but you might have a range here, a range there, and then a range there. Is there anybody who really doesn't want this? I have a fairly complex FETCH implementation, and I thought about it, like, I'm not sure if I would ever really want to go scan for all my gaps and then make a FETCH for that. Like, a lot of people do it simply: I hit the first gap, give up, get the rest of the group. That's one simple strategy, and another one is, I'll fill every hole. But finding all the holes is a long process. I mean, if you're going to do 20 FETCHes, you might as well put them in one FETCH.
It's still the same amount of work; you've got to find, you know, you're going to figure those 20 out sooner or later. The mechanics are simpler, so the simpler one is just: you want three ranges, make three FETCHes. And you will get them. And does it... Can it help the number of implementers? If they're like, do you mean I have to FETCH each individually? Yeah. I would just like to point out that in HTTP, we have range filters and they can be disjoint. And, like, this problem is, I don't want to say solved, but there are already, like, many implementations. It's not like pulling a gigabyte file to the edge to deliver the last 5 high-fives. And like, it may be split among multiple sessions; like, this problem is handled like standard switches. Okay. And I don't think anybody is like... I'm not super eager to implement it, but, okay. Uh, how are they going to be delivered? Are they going to be delivered on one stream, or is each range going to be its own stream? I'm asking a question. What does... So, we do not really have any PRs yet. I think Victor and I were tasked to look at the spelling for the FETCH. And my rough stab at it, which is not written down anywhere, is: when you give a parameter, that is a stream. You're asking for one FETCH stream in a parameter. Within that parameter, our location filter can have multiple ranges. So, if people want four ranges on that one stream, then you're going to get a FETCH stream with four ranges in it. And you'll see a discontinuity in the middle of the stream. I mean, I guess it's okay, because, do we think it's complicated, or do we think people are going to screw up the fact that FETCH streams have implicit gaps in things, and so a person reading a stream is going to keep track of, like, wait, so, I asked for this, I got this, like, keep doing an incremental map? I don't know, maybe it's fine.
Like, I'm not enthused about implementing it. The FETCH response thing should clarify what a gap is. FETCH response is already required to say whether the gap is intentional or unknown or does not exist, right? So, you can just say unknown. Yeah, are the ranges allowed to overlap, Group Compression Victor? Yeah. The classic, I would like objects 1 through 1000, 1, 2 through 1, 2, 2, 1, 1 through 1000, 2. Um, I would say no. I would say no. That would duplicate everything, right? So, you would do the other. Yeah, I think if you're going to put them on one stream, if they overlap, I think it's wasted. Okay, so, yes, people want this, and we'll do it with the location filter in FETCH. Is that what I hear? Anybody want to object to that? Well, I mean, should we send each range, uh, on half a code or... There'll be one... I mean, you can always still send many FETCHes; no one's taking that away from you. But if you have multiple ranges in FETCH, how are they treated, uh, one by one, or are they treated at the same time? I think it's coming back on one stream. If you want them on multiple streams, make multiple FETCHes. If I send, uh, FETCH 1, 2, and, uh, 5, 6, are they sent simultaneously? Like, 1, 2... You'll get a stream with 1, 2, 5, 6. 1, 2, 5, 6, and we need to have like a continuous range in order to... They have to be fetched in order, because of the delta encoding in the filter. It's impossible to specify overlaps. So, if you say 5, 6, 1, 2, you can't; you have to do it in order. Everything is delta encoded on the wire. We're out of time on this one. Okay, I think we'd... I'll put... Ian, you have it written down? Uh, I have it written down, but is the answer that, um, we do want to support them, they come on one stream, and they can't overlap? All right. Okay, not the outcome I was expecting, but at least we have a response. Um, renumber everything.
Uh, I don't think we're going to do anything immediately, but yes, we plan to renumber everything. Um, when the time comes, we will, um, make sure all the enums start at zero, are contiguous, and are sorted in an appropriate order. Um, so, we'll report this issue. Ian. Yes. Your favorite issue. Yes, um. Okay. Flow control. This is the PR for flow control. We want flow control. It limits the total number of streams and the total number of bytes sent on a subscription. It uses control messages on the bidi stream. Now that we have the bidi stream, it doesn't, uh, require we send stream, matter of anything. It's actually quite straightforward. Um, Alan wrote another PR that is actually, I think, a slightly better PR, but they're basically the same proposal. This is mine, and more performance in terms of property. Um, they, I mean, it's not like why is my concern with the same things but with something, um, in terms of, if you only want to do bytes, for example, you can only do bytes, and, um, then you end up with something that's very similar to, um, you know, standard stream flow control on a single stream, but it's across multiple streams. So, um, in the past, the working group has not expressed a ton of enthusiasm for it. Um, if we did want to do subscribe-side flow control, I think we probably have to do this. Um, yeah. And so, I mean, my takeaway from what the working group wants is that probably closing this or moving this to, like, an extension or something makes the most sense. So, like, I want to give it one more go-round before we go down that path. I'm not finding this PR, where is... What's the number? It's not 11, I'm sure. PR 11 is in my repo, A-Frind/moq-transport. Low numbers come from... You have to link them, yeah, sorry; if you click that link, you just go to GitHub, A-Frind/moq-transport. You'll find all kinds of secret stuff I work on. When I dream up things that I don't want to share with people.
But 1591 is this issue. You're looking for the content; it should be fine. Um, I mean, I guess I could put feedback on the message, but I guess the thing is, if we're going to do something like this, I think limiting the rate would also be really important. I'll have Magnus define how to measure the rate. That's possible, but that would be Magnus's problem. I mean, you can kind of limit the rate by limiting the bytes. Yes. But, like, I have no idea whether a conference call, when it starts, is going to last 5 minutes or 5 months. Like, literally, they go for 5 months. So, I can't put in a reasonable limit for bytes, but I could put in a reasonable limit for rate. No, no. But you keep feeding it credit, so, as you consume, say, you have like a... Yeah, I understand. Okay. Okay. All right, so you just keep feeding it credit, so functionally it works. I mean, it's a little bit tedious, but, like, you don't have to worry about... It's not with only, I mean, you have to ACK packets, like, to get congestion window back, so it's, you know, a similar amount of work. And I wrote one of these PRs but I don't remember anything about it. So, if you're out of credit on a subscribe, you still queue it, and then, if credit arrives, it just flushes late. Yes. And it's still subject to delivery timeout. Yes, and at some point, you do deliver, like, too far behind. And then, you can still get too far behind; like, that counts against your queue. Yes. Um, so, I mean, it basically gives you the same flow control you get with a FETCH, but on subscribe; that's the intent. Imagine you had FETCH flow control on subscribe. I have not seen a lot of people jumping up and down saying, like, I really want to do this.
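The credit behavior described in this exchange (queue when out of credit, flush when credit arrives, still subject to delivery timeout) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the mechanism in either PR: the class name, method names, byte counts, and the use of a single byte budget with no stream limit are all hypothetical.

```python
from collections import deque


class SubscriptionCredit:
    """Sender-side byte credit for one subscription (hypothetical sketch)."""

    def __init__(self, initial_credit, delivery_timeout_s):
        self.credit = initial_credit        # bytes the receiver has allowed
        self.delivery_timeout_s = delivery_timeout_s
        self.queue = deque()                # (enqueue_time, payload)
        self.sent = []                      # payloads actually delivered
        self.dropped = 0                    # objects that timed out in the queue

    def publish(self, payload, now):
        """Queue an object, then send whatever the current credit allows."""
        self.queue.append((now, payload))
        self._flush(now)

    def add_credit(self, nbytes, now):
        """Receiver grants more credit; anything still fresh flushes late."""
        self.credit += nbytes
        self._flush(now)

    def _flush(self, now):
        while self.queue:
            enqueued_at, payload = self.queue[0]
            if now - enqueued_at > self.delivery_timeout_s:
                self.queue.popleft()        # too old: delivery timeout applies
                self.dropped += 1
                continue
            if len(payload) > self.credit:
                break                       # out of credit: keep it queued
            self.queue.popleft()
            self.credit -= len(payload)
            self.sent.append(payload)


fc = SubscriptionCredit(initial_credit=10, delivery_timeout_s=1.0)
fc.publish(b"aaaaaa", now=0.0)   # fits in the initial credit
fc.publish(b"bbbbbb", now=0.1)   # queued: only 4 bytes of credit remain
fc.publish(b"cccccc", now=0.9)   # queued behind it
fc.add_credit(6, now=1.5)        # first queued object has expired by now
```

Because the receiver keeps topping up credit as it consumes, a byte budget can bound the rate without knowing how long the subscription will last, which is the point made about feeding it credit.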
So, I think maybe the thing to do is have it in an extension, and maybe people will play with it and find that it's useful, and MoQ v2, you know, just like whatever, SPDY v2 didn't have flow control on application. We should have flow control. I sort of feel like this is, you know, the other big topic that I think is the DDoS stuff we haven't really thought into the draft yet, or whatever. It's like, this may be a very plausible solution to resource exhaustion fights that we have identified in the DDoS draft. Seems like we haven't, like, again, I mean, we need a design team for DDoS too, we have a design for DDoS. But they haven't talked about this, this is off in the subscriber to the relay. This is not how we've you can sort of throttle them in. Look, I had always imagined there would be some way to rate-limit things, like rate-limit data, not control. Control messages will not be how the systems kill. Um, so I thought there would be something you could do with rate limiting data, and this seems like sort of it. Though I haven't read the PR, obviously. Okay. Um, I don't know. We can repark it for like and actually like, we can plan to talk about next performance draft in more detail, I'm happy to. Like, I can make a slide that goes into more detail and, like, update my PR to take whatever Alan's things that I think are good and, no. So, you're trying to simulate what you get with FETCH. So, you're just trying to aggregate all the subgroups and saying, for this whole batch of subgroups, here is your limit. Yeah, here's the total number of bytes. Okay. You can send. Yep, and also a stream limit too if you want to use that.
So, say you're in a situation where, like, I have 100 streams and I want to give this subscription 20 of them but not more than 20, then you basically can make sure that that subscription doesn't consume more than 20, which for a relay is kind of nice, because, you know, you're going upstream, you're trying to, like, approximately fair share probably between multiple subscriptions. I mean, this is realistically probably mostly valued with relays where you have, you know, a lot of people who are competing on the same resources, potentially, like someone downstream might just stop consuming, or like, you know, there is bandwidth. Because right now your only hammer is unsubscribe, right? Right now your only hand, well, you have flow, you have flow control on a subscribe, if you don't want the other person to send, you have flow control on stream. Every stream, yes. It stops and then, FETCH is a stream so it has flow control, right? Like, it just has it because it's is because it works. Okay. But subscribe is not. Um, I was going to say, since this has come up a few times and people have, so, it sounds like maybe now people are more interested in this, like what can we do to make sure that we actually advance the ball? Um, that's why I say, do what, how about we we completely on this, I'd rather get our auth stuff straightened out first before we did the DDoS. This stuff because I think that the auth stuff solves lots of that, so like, if I was sequencing these, I would get auth done in the draft as something coming up fairly soon, once we got all the current stuff we have in play landed, right? And then start the auth discussion, and then start the rate limiting discussion. Victor's in queue, also. What's up? I didn't see. Oh, I'm sorry. Okay. Ah, yes, the description sounds like a real problem, the solution does not solve it. Why?
It does not authorize all, it does not prove that we do not have flow control for subscription. We do have flow control for subscription. The subscriptions are implicitly flow controlled here, in a sense that the receiver has to consume data as long as it's generated, or eventually you will get timeouts or too far behind, and sensors because of nature of things which transmit by subscription you kind of have to consume it as of rate, and that's why is flow control, because that's how the rate of the receiver gets to, like, the rate of the sender. Now, there are other issues with, like, someone sending traffic that is too high a rate for the network and things like that, and we might need to have support for that, but I don't think flow control solves that. It does not seem to be the right tool for that problem. Okay. I know mixed opinions. I'm just not sure exactly of the disposition. It sounds like people are interested in hearing more. Maybe we'll schedule it for future virtual or okay, maybe ask for the end time, I don't know. I just want to make sure people come prepared, so that we don't get the same discussion again. Yeah, if you can make a PR 11 and make a new PR out, and that will help us review, actually. I can update our, I mean, that was written long enough ago, it probably needs to be rewritten anyway, so. What's the final, is there what I need to do uh, ghost subscription, up stream? No, no, I don't think so. We could mix everything here, if we have a design team doing this, because it is resource consumption related to subscription, not auth subscription, so that's uh, it could be the same, right? I feel like that is a different problem. This seems like a big discussion for us to me, and I would like to move it to extension.
Uh, I'm not sure we need it, and it seems like something that is going to be hard to get right as a traditional PR, but this is a potentially quite complicated mechanism that is going to take a lot of implementation experience to get right, and like, it's barely even proposed at this point. I don't like a good idea at this stage of the game. Okay. Um, all right, I think we'll try to bring this back in the next couple of months, and then, with a little bit more detail. So, I would say move to an extension, but as an individual, not as chair, but otherwise. All right. Uh, I'm going to roll on, I think we have an hour, so, thank you. Um, so, there's Ian, I'm going to skip over timestamps as we didn't have the time, and I'm going to go to reason phrases, because this is the last one. There's two reason there's so, this is our good old-fashioned the thing comes from reason phrase, uh and uh the pro of having a reason phrase is that it significantly improves the helper experience. The con, everything about it should be structured, and everything with no value should be removed. I made my peace with losing on this one. Oh, in that case, I'll close it. I don't know, let me ask a question, does anybody feel strongly that we need to remove reason phrase entirely, besides Colin? Well, we should. We really need to remove it, because if you look at the set of for outcome of many why? Oh, the one that says go ask Colin about this, yes, so, it's a phrase was helpful, you know, keep it, but most people are going to have to look up the phrase anyway in the spec to see what the hell does that mean. I don't know. I mean, raise your hand if you've gotten a reason phrase during interop that has helped you. I definitely have. They're useful. I will put in one negative thing on this, which is that, um, I do think that specifying it is UTF-8 without doing all of the things that you need to internationalize it will bite us in IETF last call.
I and I actually think if there's all kinds of security, like, allowing UTF-8 in there, it like opens a ton of security, I mean, I was already planning on updating the relay to send a different message that read the same, but practically rendered what was different every time on every message, if you want. There's a lot of... So, I think I'm not going to, not a hill I'm going to die on, I don't really care, but I actually think you would get through the IESG faster if the reason phrase was not UTF-8. Okay, that's a good place to transition to the next issue, which is very related, which is... How about we propose this as the get the phrase the real error problem we have spent many hours debugging is we have no idea what the hell actually closed the session. So, you're just going to have to fix it. If we had a number for every place in the spec where it says "protocol violation", you can't put that, you have to put protocol violation and a number. Yeah. A unique number, so then anybody that hits it, "Oh, I hit this." So, we turn protocol violation into a pool of like 50, however many times that appears in the spec, a pool of 50 numbers, and you put in the spec the number every single time. So, you know, we've spent many hours debugging things being like, "What the hell happened? Why is this closed?" and you don't know until you trace through. All right, well, I'm going to chain off of Christian to say there's a bit of security risk in just being too specific about how you fail. Um, I kind of assumed you were going to rip these out at the end. Oh, really? Yeah. But I'm also not going to die on this hill, but I'm kind of with Colin. So, let me bring up the second issue, which I think may offer a different path forward, which might address Colin's also.
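The "protocol violation plus a unique number" suggestion could look something like the following Python sketch. Everything in it is hypothetical: the `0x3` code value, the `ViolationSite` names and numbers, and the helper functions are invented for illustration and are not taken from the draft.

```python
from enum import IntEnum, unique

# Hypothetical session error code for "protocol violation"; the real value
# is whatever the draft assigns.
PROTOCOL_VIOLATION = 0x3


@unique
class ViolationSite(IntEnum):
    """One number per place in the spec that says "protocol violation".

    The names and values are made up for this example.
    """
    DUPLICATE_SETUP = 1
    UNKNOWN_REQUEST_ID = 2
    MESSAGE_ON_WRONG_STREAM = 3


def close_session(site: ViolationSite) -> tuple[int, int]:
    """Return the (error code, site number) pair the closing peer reports."""
    return (PROTOCOL_VIOLATION, int(site))


def describe(code: int, site: int) -> str:
    """Render what the receiving side can log without tracing through code."""
    if code != PROTOCOL_VIOLATION:
        return f"error {code:#x}"
    return f"protocol violation #{site} ({ViolationSite(site).name})"
```

The security concern raised right after this proposal applies directly: the more specific the reported site, the more a peer learns about how the other side failed.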
Suhas filed this today, which is that wants to have an error payload in request error, which is for structured binary data, and I think, if I understand, the use case is that privacy pass sometimes needs to fail you with unauthorized, and give you a challenge or something that you need to complete the next step. Um, so, there's a this issue got opened and there's a PR that does it. So, we should decide that is this something that we want to add? We want to add it only on certain error codes? And then, then I want to think about what's the intersection with reason phrase, because if I'm remembering one of the HTTP specs correctly, they don't have reason phrase, they have debug data, which is defined as a binary blob and you can stick whatever it Now, some people put ASCII strings in there, but that's like one potential option. So, I don't know, Suhas, do you want to say more about this particular use case? I can, um, the thing is that I I had two options there, because uh in in a privacy pass we do the first part of the of the setup, you will not get challenge, and then the relay would ask you for. We had some sometimes some metadata. There's two ways we wanted a way, in setup, when you fail, state setup, to come back with the challenge. And and we can do this, two ways to do this, right? Do something like this, or you define a new auth error. Kind of like this. Hold on, setup doesn't have a response. Setup is just but I I think this is the and like, again, this is where in the category of stuff that I was like, when we get to auth in the draft, we'll fix this, okay, but I don't think we should do this with an error. I mean, every pretty much every auth auth token scheme that isn't a bearer token has a challenge phase. And so, anytime you send up any request, you're you're going to need something that looks like a auth challenge to it, that needs to pass back a authorization. So, that's separate message for. But that's spec-wise, that's a different message. Yeah. 
Or something like that. But, I mean, we need to design for this one way or another, we have to support this, right? Okay. Yeah. Does that work for you if we say like, okay. All right, so we'll close this. We're not going to do error. I think we need to figure out how to spell this. But this is our but not this way. Okay. That's my goal. Okay, we'll close that PR. I think the issue to say we need a way to have, right? And let the auth design team figure it out, and I think it can be message-sized, and finally, yeah, it works. So, the other point I want to make in this space is that now, the way we structured it, um, if it happens to be an immediate request error, we have this reason phrase, but if it's an error that happens later, it's a reset stream, which has no error phrase, and nobody really cares. So, like, why do we need reason phrase? Uh, I don't know, like, I don't want to spend forever paying the lawsuit here, like, yeah, okay. Um, like, I kind of think we ought to have it, I don't know what other people think. I guess, you know, H3 doesn't have them in certain places, or QUIC doesn't, but they are in some places but not others. Someone just needs to make a call. If, I don't know, if Colin, if we just said it's binary data, do your best, like, does that resolve all of your issues about UTF-8? Sure, because then people don't try and print it. They'll print it anyway. I will. But like the number of like, okay, at least you sort of have the security group, and like, the number of attacks we saw that, like, relied on being able to reverse the bidirectional log message right over top of it and then go forward again, stuff like, UTF-8 allows you to do all of those things, so like, I mean, the... I see, okay. I think the text we have for UTF-8 in this particular thing was lifted straight from HTTP/2. Because they have a place for a reason phrase to appear.
So, we've at least tried to copy somebody who's more past than we are. I'm not saying that we won't have more trouble. Uh, I don't want to, time is over. How should we resolve, chairs? Resolve this? Yeah. Um, I think I I heard a lot of I think people asked for the debug stuff, I've heard that. I don't think any of the reason phrase haters have a problem with keeping this through the interop phase. Right, Colin? I mean, I don't No, no, no. Here is the concrete proposal going forward. We log in the draft, we put a message in the draft before this gets before this is last called, we'll make we'll make a we'll make a call in the working group of whether we're going to remove it or not. Like, that's just a way, like, punt it to the like, a like, punt it. I mean, we're out of time and I want to give I want to give you time to talk about timestamps. So, like, I don't know in this I'm okay punting it a little bit, but like, who cares? I'm good. I mean, let's just make a decision. Yes, what? Do we want to remove the UTF-8 text? I guess is one question. That's a separate question. Okay, well, why don't I don't think anyone cares strongly. We we could flip a coin, we could we could raise hands, like, I mean, like I would like to see them go away at the end. I think Colin would, but, we're not going to die, we're not going to like, yuck anyone's yum if they really want to have them in there from... If, if, if we're going to remove it, then, then aren't some of the, like, what what I said, which is, we just need a, we just need a reason codes to be bigger and more detailed than just error, protocol violation, you know, go on. Good luck to you guys. Yeah, I guess that's my point. I need somewhere to put something more detailed than just like, "error" is too active. But, but, but, but our reason codes space is big, right? It's like, it's in a 55-m, you can make as many as many codes as we want. 
I'd rather not actually make more codes, I'd rather, whatever, we do something like what I said, which is, put a line, whether a line, like, this is the line, or see whatever. You can have 9,000. If, if we're going to have rich feedback, like, just leave it as it is. That's what I say. I have no, nobody said, I want the binary, I want the text or binary stuff, to have to have it. Nobody said that, right? I mean, I want no debug, no text. There's one one there, but, that's for expected behaviour. And, and we've written it as binary. But nobody cares about whether there is anything in the UTF-8 or binary, right? Nobody said, "I need data in there." right? It's "no" for everybody. So, so, you're saying, "Don't send reason phrases?" I absolutely do. Do, do people care about it? I care. At least, for now, I care. I have definitely, we've got at least one key, I mean, for H3, I do, like, I mean, I use them. So, all right, if, if, if the desire is to have this really rich feedback, uh, like, with a million codes, uh, I would say like, let's just leave it as it is, like the point is to to keep it, to find the feedback with something, and not mess with the protocol. We're just going to have to recreate as a new structure. So, just leave it as it is. That works for me, Colin. We're switching this to binary data allow us to make it in the protocol that we anticipate already. Oh, yeah, yeah, yeah, because I mean, if, if, if, if you like, oh, I mean, that, I'm just sort of saying that, the, the, the, the problems are is, when you use UTF-8, you're supposed to scribe, uh, normalization form, and we don't, like, if we want to do the things that you're supposed to do when you do UTF-8, like, have normalization form, and things like that, it'll go through no problem. We just haven't done all of the things that UTF is required to do. Which is why US ASCII would be much simpler because then you don't have those requirements, or binary. 
Or, or like, this is a binary data like, you already have a 9-byte binary field. Okay, I want to give the rest of the time to Ian, like, error code is the same 9-byte binary. You will love arguing about timestamp more than you love arguing about... All right, Ian, you're up, and you have, uh, 12 minutes. Talk about this some in the past, um, but I think we now have more experience to have a way more informed conversation and maybe, um, we get a real direction out of it, so I will try again. Um, conversations around DTS we believe have been very helpful in providing some like real world, like, how people are actually going to do like track switching and such. Okay. So, potential use cases for, um, timestamp across tracks is, you know, audio and video, for example, you trying to keep them approximately in sync. So, like, this is particularly valuable if you have like a smallish jitter buffer, but like it's not like 15 milliseconds, it's more like 2 seconds, um, where it's very easy for the audio to actually get like a good bit ahead of the video if you're, um, not careful and you have a lot of audio. Um, but it also could be very helpful for keeping, um, tracks in sync that are like from different productions. So, like video conferencing, um, any other time you have like two different video feeds and you want to keep uh approximately in sync from like a delivery perspective. So again, this is all about delivery, this is not about playback. This is just there to try to make sure that like you don't have buffer under-runs, because you like, you know, send all of one thing and none of the other. Um, so timestamp cross-attributes is a difficult to impossible problem in the like extreme sense, but as I think Colin kind of alluded, um, you know, perfection's probably not required here, like this is just for delivery. Um, and so, you know, if we can get uh sufficiently accurate timestamps solely, it actually might still be like a net win. 
And also, I've there are lot of use cases where even if the time is not 100% synchronized, it's like maybe they're all from like the same data center and the same like region or same like rate, right? I mean, like these these clocks can be reasonably close, uh, like maybe where they're transcoded, things like that. Um, so, there's a lot of use cases where we might be able to use this. Um, and of, I wanted to say briefly on DTS, currently uses group alignment, which seems to work fairly well. Um, probably could use a timestamp if if it's available, but I think DTS have proven that like at least for that one use case like probably group alignment is sufficient. It's not required. Like, timestamps aren't required, even if they could be used. So, Yeah, that was going to be my comment. DTS doesn't actually, it's just requiring group IDs to be consistent, they can all completely different media times or something. It doesn't matter. Yeah. And also when you're syncing media, you're not going to do it with the time with the timestamp carried on the delivery item. You're going to do it with the time a much more accurate time signal that's embedded encoded, like MPEG system you have for that, or time base or presentation set, and then, and sync it right. So, this timestamp then would be useful if we had say allowed filters or something that is a property of transport. I want, I only want objects to that are signaling this timestamp. But that timestamp is just a number, so couldn't we, and you can already define an existing property which is a number. So is this really about just standardizing a property and saying it is a timestamp and here's how you you write it? Uh, yes, basically, things like that. Or, is there some other issue? 
Or like, someone sending traffic that is too high a rate for the network and things like that, and we might need to have support for that, but I don't think we've, like, that's that sort of out-of-sync problem like, you know, Ian was saying, use some sort of rate limit, or like, we definitely need that, I mean even H3 do we do some stuff that mostly avoid that, but like it's a real problem at the... The queue is long, so please are we going to do 20 FETCHes, you might as well put them in one FETCH. It's still the same amount of work, that you've got to find, you know, you're going to figure those 20 out sooner or later. The mechanics is simpler, so the simpler one is just, you want three ranges, make three FETCHes. And you will get them. And does it... Can it help the number of implementers? If they're like, do you mean I have to FETCH each individually? Yeah. I would just like to point out that in HTTP, we have range filters and they can be disjoint. And, like, this problem is, I don't want to say solved, but there are already, like, many implementations. It's not like pulling a gigabyte file to the edge to deliver the last 5 high-fives. And like, it may be split among multiple sessions, like, this problem is handled like standard switches. Okay. And I don't think anybody is like, I'm not super eager to implement it, but, okay. In, uh, how are they going to be delivered? Are they going to be delivered on one stream, or is it going to be like each range is like its own stream? I'm asking a question. What does... So, we have not really any PRs yet. I think Victor and I were tasked to look at the spelling for the FETCH. And my rough stab of it, which is not written down anywhere, is, when you give a parameter, that is a stream. You're asking for one FETCH stream in a parameter. Within that parameter, our location filter can have multiple ranges.
So, if people want four ranges on that one stream, then you're going to get a FETCH stream with four ranges in it. And you'll send it as continuity in the middle of the stream. I mean, I guess it's okay because you do we think it's complicated, or do we think people are going to screw up the fact that, like, FETCH streams have implicit, like, gaps in things, and so like a person reading a stream is going to track of like, wait, so, I asked for this, I got this, like, keep doing like incremental map, I don't know, maybe it's fine. Like, I'm not enthused about implementing it. The FETCH response thing should clarify what a gap is. FETCH response is already required to say whether the gap is intentional or unknown or does not exist, right? So, you can just say unknown. Yeah, are the ranges allowed to overlap, Group Compression Victor? Yeah. The classic, I would like objects 1 through 1000, 1, 2 through 1, 2, 2, 1, 1 through 1000, 2. Um, I would say no. I would say no. That would duplicate everything, right? So, you would do the, you would do the other. Yeah, I think if you're going to put on one stream, I think if they get overlap, I think it's wasted. Okay, so, yes, people want this, and we'll do it with the location filter in FETCH. Is that what I hear? Anybody want to object to that? Well, I mean, should we send each range uh on half a code or... There'll be one... I mean, you can always still send many FETCHes, no one's taking that away from you. But if you have multiple ranges in FETCH, how are they treated, uh, one by one, or are they treated at the same time? Think it's coming back on one stream. If you want them on multiple streams, make multiple FETCHes. If I send uh FETCH 1, 2, and uh 5, 6, are they sent simultaneously? Like, a 1, 2... You'll get a stream with 1, 2, 5, 6.
1, 2, 5, 6, and we need to have like a continuous range in order to... They have to be fetched in order, because of the delta encoding in the filter. It's impossible to specify overlaps. So, if you say 5, 6, 1, 2, you can't, you have to do it in order. Everything is delta encoded on the wire. We're out of time on this one. Okay, I think we'd I'll put Ian, you have it written down? Uh, I have written down, but is the answer that, um, we do want to support them, they come on one stream, and they can overlap? All right. Okay, not the outcome I was expecting, but at least we have a response. Um, renumber everything.
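The claim that delta encoding forces ranges to be in order and non-overlapping can be illustrated with a small Python sketch. It is an assumption-laden toy: real locations are not flat integers, and `encode_ranges` / `decode_ranges` are hypothetical helpers, not the wire format of the location filter. The sketch follows the "impossible to specify overlaps" statement, even though the summary at the end of the discussion left the overlap question unclear.

```python
def encode_ranges(ranges):
    """Encode inclusive (start, end) integer ranges as non-negative deltas.

    Each start is expressed relative to the first position after the
    previous range, so a range that goes backwards or overlaps the
    previous one has no valid encoding.
    """
    deltas = []
    next_allowed = 0  # smallest start the next range may use
    for start, end in ranges:
        if start < next_allowed or end < start:
            raise ValueError(f"range ({start}, {end}) is out of order or overlaps")
        deltas.append(start - next_allowed)  # gap before this range
        deltas.append(end - start)           # length of this range minus one
        next_allowed = end + 1
    return deltas


def decode_ranges(deltas):
    """Invert encode_ranges, rebuilding the inclusive (start, end) pairs."""
    ranges = []
    next_allowed = 0
    for gap, length in zip(deltas[0::2], deltas[1::2]):
        start = next_allowed + gap
        end = start + length
        ranges.append((start, end))
        next_allowed = end + 1
    return ranges
```

With this toy encoding, a request for ranges 1 to 2 and 5 to 6 encodes cleanly, while the same request written as 5 to 6 followed by 1 to 2 is rejected, matching the "you have to do it in order" point.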
Yeah. I would just like to point out that in HTTP, we have range filters and they can be disjoint. And this problem is, I don't want to say solved, but there are already many implementations. It's not like pulling a gigabyte file to the edge to deliver the last 5 high-fives. And it may be split among multiple sessions; this problem is handled like standard switches.
Okay. And I don't think anybody is... I'm not super eager to implement it, but, okay.
How are they going to be delivered? Are they going to be delivered on one stream, or is each range going to be its own stream? I'm asking a question.
So, we do not really have any PRs yet. I think Victor and I were tasked to look at the spelling for the FETCH. My rough stab at it, which is not written down anywhere, is: when you give a parameter, that is a stream. You're asking for one FETCH stream in a parameter. Within that parameter, our location filter can have multiple ranges. So, if people want four ranges on that one stream, then you're going to get a FETCH stream with four ranges in it, and you'll see a discontinuity in the middle of the stream.
I mean, I guess it's okay. Do we think it's complicated, or do we think people are going to screw up the fact that FETCH streams have implicit gaps in them, so a person reading a stream has to keep track of, wait, I asked for this, I got this, and keep building an incremental map? I don't know, maybe it's fine. I'm not enthused about implementing it.
The FETCH response should clarify what a gap is. The FETCH response is already required to say whether the gap is intentional or unknown or does not exist, right? So, you can just say unknown.
Yeah, are the ranges allowed to overlap? Victor? Yeah.
The classic: I would like objects 1 through 1000, 1, 2 through 1, 2, 2, 1, 1 through 1000, 2.
Um, I would say no. I would say no. That would duplicate everything, right? So, you would do the other.
Yeah, I think if you're going to put them on one stream and they overlap, I think it's wasted.
Okay, so, yes, people want this, and we'll do it with the location filter in FETCH. Is that what I hear? Anybody want to object to that?
Well, I mean, should we send each range, uh, on half a code or...
There'll be one... I mean, you can always still send many FETCHes; no one's taking that away from you.
But if you have multiple ranges in a FETCH, how are they treated: one by one, or at the same time?
I think it's coming back on one stream. If you want them on multiple streams, make multiple FETCHes.
If I send a FETCH for 1, 2 and 5, 6, are they sent simultaneously?
You'll get a stream with 1, 2, 5, 6.
1, 2, 5, 6, and do we need to have a continuous range in order to...
They have to be fetched in order, because of the delta encoding in the filter. It's impossible to specify overlaps. So, if you say 5, 6, 1, 2... you can't; you have to do it in order, because everything is delta encoded on the wire.
We're out of time on this one. Okay, Ian, you have it written down?
Uh, I have it written down, but is the answer that we do want to support them, they come on one stream, and they can't overlap?
All right. Okay, not the outcome I was expecting, but at least we have a response.
Um, renumber everything. I don't think we're going to do anything immediately, but yes, we plan to renumber everything. When the time comes, we will make sure all the enums start at zero, are contiguous, and are sorted in an appropriate order. So, we'll report this issue.
Ian. Yes. Your favorite issue. Yes, um. Okay. Flow control.
This PR is for flow control. We want flow control. It limits the total number of streams and the total number of bytes sent on a subscription. It uses control messages on the bidi stream. Now that we have the bidi stream, it doesn't, uh, require we send stream, matter of anything. It's actually quite straightforward. Alan wrote another PR that is actually, I think, a slightly better PR, but they're basically the same proposal. This is mine, and more performance in terms of property. If you only want to do bytes, for example, you can do only bytes, and then you end up with something that's very similar to standard stream flow control on a single stream, but it's across multiple streams. In the past, the working group has not expressed a ton of enthusiasm for it. If we did want to do subscribe-side flow control, I think we would probably have to do this. And so my takeaway from what the working group wants is that probably closing this, or moving it to an extension or something, makes the most sense. So I want to give it one more go-round before we go down that path.
I'm not finding this PR, where is... What's the number? It's not 11, I'm sure.
PR 11 is in my repo, afrind/moq-transport. Low numbers come from... You have to link them, yeah, sorry. If you click that link, you just go to GitHub, afrind/moq-transport. You'll find all kinds of secret stuff I work on, when I dream up things that I don't want to share with people. But 1591 is this issue.
You're looking for the content, it should be fine.
I guess I could put feedback on the... but I guess the thing is, if we're going to do something like this, I think limiting the rate would also be really important.
I'll have Magnus define how to measure the rate. That's possible, but that would be Magnus's problem. I mean, you can limit the rate by limiting the bytes.
Yes. But I have no idea whether a conference call, when it starts, is going to last 5 minutes or 5 months. Literally, they go for 5 months. So I can't put in a reasonable limit for bytes, but I could put in a reasonable limit for rate.
No, no. You keep feeding it credit. As you consume, say you have like a...
Yeah, I understand. Okay. All right, so you just keep feeding it credit, so functionally it works. I mean, it's a little bit tedious, but you don't have to worry about...
I mean, you have to ACK packets to get congestion window back, so it's a similar amount of work.
And I wrote one of these PRs but I don't remember anything about it. So, if you're out of credit on a subscribe, you still queue it, and then, if credit arrives, it just flushes late?
Yes.
And it's still subject to delivery timeout?
Yes, and at some point, you do deliver, like, too far behind. And you can still get too far behind; that counts against your queue.
Yes. So it basically gives you the same flow control you get with a FETCH, but on subscribe. That's the intent. Imagine you had FETCH flow control on subscribe.
I have not seen a lot of people jumping up and down saying, I really want to do this. So I think maybe the thing to do is have it in an extension, and maybe people will play with it and find that it's useful, and in MoQ v2, you know, just like, whatever, SPDY v2 didn't have flow control on application. We should have flow control.
I sort of feel like the other big topic is the DDoS stuff that we haven't really brought into the draft yet. This may be a very plausible solution to the resource exhaustion attacks that we have identified in the DDoS draft. Again, we need a design team for DDoS too; we have a design for DDoS. But they haven't talked about this; this is off in the subscriber to the relay. This is not how we've... you can sort of throttle them in. I had always imagined there would be some way to rate-limit things, like rate-limit data, not control. Control messages will not be how the system gets killed. So I thought there would be something you could do with rate limiting data, and this seems like sort of it, though I haven't read the PR, obviously.
Okay. I don't know. We can repark it, and we can plan to talk about next performance draft in more detail; I'm happy to. I can make a slide that goes into more detail and update my PR to take whatever of Alan's things I think are good.
So, you're trying to simulate what you get with FETCH. You're just trying to aggregate all the subgroups and say, for this whole batch of subgroups, here is your limit.
Yeah, here's the total number of bytes you can send.
Okay.
Yep, and also a stream limit too, if you want to use that. So say you're in a situation where I have 100 streams and I want to give this subscription 20 of them but not more than 20; then you can make sure that this subscription doesn't consume more than 20, which for a relay is kind of nice, because you're going upstream and you're trying to approximately fair-share between multiple subscriptions. Realistically, this is probably mostly valuable with relays, where you have a lot of people competing for the same resources, and someone downstream might just stop consuming, or, you know, there is bandwidth. Because right now your only hammer is UNSUBSCRIBE, right?
Right now your only... well, you have flow control on a subscribe: if you don't want the other person to send, you have flow control on the stream.
Every stream, yes.
It stops. And FETCH is a stream, so it has flow control, right? It just has it.
Okay. But subscribe does not.
I was going to say, since this has come up a few times, and it sounds like maybe now people are more interested in this, what can we do to make sure that we actually advance the ball?
That's why I say, how about we... on this, I'd rather get our auth stuff straightened out first before we do the DDoS stuff, because I think that the auth stuff solves lots of that. If I was sequencing these, I would get auth done in the draft as something coming up fairly soon, once we get all the current stuff we have in play landed, right? And then start the auth discussion, and then start the rate limiting discussion.
Victor's in queue, also. What's up? I didn't see. Oh, I'm sorry.
Okay. Yes, the description sounds like a real problem; the solution does not solve it.
Why?
It does not prove that we do not have flow control for subscriptions. We do have flow control for subscriptions. Subscriptions are implicitly flow controlled, in the sense that the receiver has to consume data as it's generated, or eventually you will get timeouts or get too far behind. Because of the nature of things transmitted by subscription, you kind of have to consume them at rate, and that is flow control, because that's how the rate of the receiver gets matched to the rate of the sender. Now, there are other issues, like someone sending traffic at too high a rate for the network and things like that, and we might need to have support for that, but I think that's that sort of out-of-sync problem. Like Ian was saying, use some sort of rate limit; we definitely need that. I mean, even in H3 we do some stuff that mostly avoids that, but it's a real problem at the...
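To make the credit model from this discussion concrete, here is a minimal sketch of per-subscription accounting: a cumulative limit on bytes and on streams, objects queued while credit is exhausted, a late flush when more credit arrives, and objects dropped once they have waited longer than the delivery timeout. The class and method names (`SubscriptionCredit`, `grant`, `offer`, `flush`) are hypothetical and are not taken from either PR or from the MoQ Transport draft; the wire messages that would carry the credit are not modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SubscriptionCredit:
    """Hypothetical per-subscription credit (not from the draft or either PR)."""
    max_bytes: int            # cumulative bytes the subscriber has allowed
    max_streams: int          # cumulative streams the subscriber has allowed
    delivery_timeout: float   # seconds an object may wait before it is dropped
    bytes_sent: int = 0
    streams_opened: int = 0
    queue: deque = field(default_factory=deque)  # (enqueue_time, size, new_stream)

    def grant(self, more_bytes: int = 0, more_streams: int = 0) -> None:
        """The subscriber keeps feeding credit as it consumes."""
        self.max_bytes += more_bytes
        self.max_streams += more_streams

    def offer(self, now: float, size: int, new_stream: bool) -> None:
        """Queue an object; it is only sent when flush() finds enough credit."""
        self.queue.append((now, size, new_stream))

    def flush(self, now: float) -> list:
        """Send queued objects in order while credit lasts; return sizes sent."""
        sent = []
        while self.queue:
            enqueued, size, new_stream = self.queue[0]
            if now - enqueued > self.delivery_timeout:
                self.queue.popleft()          # too far behind: drop, not send
                continue
            if self.bytes_sent + size > self.max_bytes:
                break                         # out of byte credit: stay queued
            if new_stream and self.streams_opened + 1 > self.max_streams:
                break                         # out of stream credit: stay queued
            self.queue.popleft()
            self.bytes_sent += size
            if new_stream:
                self.streams_opened += 1
            sent.append(size)
        return sent
```

With only the byte limit in use, this behaves much like stream-level flow control applied across all streams of one subscription, which matches the comparison made in the discussion.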
Suhas Nandakumar: We tried to do JSON Patch. We actually had JSON Patch, and we had objections that it was a big, heavy library that did a lot of things and wasn't available in every target environment. People wanted a cleaner mechanism that is simple, so we dropped JSON Patch and went to this. I think if we add update, it allows you to do a lightweight change, and nothing stops you from republishing the clean catalog. The difference with this, and we had this with DASH as well, where eventually we added a patch mechanism to DASH manifests, is that publishing the whole XML in the case of DASH, or maybe the JSON here, is a heavier operation than just an object model update inside your player.
Adrian Roe: Okay, that's a good place to transition to the next issue, which is very related.
How about we propose this as the... forget the phrase. The real error problem, which we have spent many hours debugging, is that we have no idea what the hell actually closed the session. So, you're just going to have to fix it. If we had a number for every place in the spec where it says "protocol violation": you can't put that, you have to put protocol violation and a number. A unique number, so then anybody that hits it says, "Oh, I hit this." So we turn protocol violation into a pool of, like, 50, however many times that appears in the spec, and you put the number in the spec every single time. We've spent many hours debugging things, being like, "What the hell happened? Why is this closed?" and you don't know until you trace through.
All right, well, I'm going to chain off of Christian to say there's a bit of security risk in just being too specific about how you fail.
I kind of assumed you were going to rip these out at the end.
Oh, really?
Yeah. But I'm also not going to die on this hill, but I'm kind of with Colin.
So, let me bring up the second issue, which I think may offer a different path forward, which might address Colin's also. Suhas filed this today: he wants to have an error payload in request error, for structured binary data, and if I understand, the use case is that Privacy Pass sometimes needs to fail you with unauthorized and give you a challenge or something that you need to complete the next step. So this issue got opened and there's a PR that does it. We should decide: is this something that we want to add? Do we want to add it only on certain error codes? And then I want to think about the intersection with reason phrase, because if I'm remembering one of the HTTP specs correctly, they don't have a reason phrase; they have debug data, which is defined as a binary blob, and you can stick whatever in it. Now, some people put ASCII strings in there, but that's one potential option. So, I don't know, Suhas, do you want to say more about this particular use case?
I can. The thing is that I had two options there, because in Privacy Pass, when we do the first part of the setup, you will not get a challenge, and then the relay would ask you for... We had sometimes some metadata. We wanted a way, in setup, when you fail setup, to come back with the challenge. And there are two ways to do this, right? Do something like this, or define a new auth error. Kind of like this.
Hold on, setup doesn't have a response. Setup is just... But I think this is, again, in the category of stuff where I was like, when we get to auth in the draft, we'll fix this. But I don't think we should do this with an error. Pretty much every auth token scheme that isn't a bearer token has a challenge phase. So anytime you send any request, you're going to need something that looks like an auth challenge to it, which needs to pass back an authorization. So that's a separate message.
But spec-wise, that's a different message.
Yeah. Or something like that. But we need to design for this one way or another; we have to support this, right?
Okay. Yeah. Does that work for you if we say... okay. All right, so we'll close this. We're not going to do error. I think we need to figure out how to spell this, but not this way.
Okay. That's my goal.
Okay, we'll close that PR. I think the issue is to say we need a way to have... right?
And let the auth design team figure it out, and I think it can be message-sized, and finally, yeah, it works.
So, the other point I want to make in this space is that, the way we've structured it now, if it happens to be an immediate request error, we have this reason phrase, but if it's an error that happens later, it's a reset stream, which has no error phrase, and nobody really cares. So, why do we need a reason phrase?
Uh, I don't know. I don't want to spend forever paying the lawsuit here. I kind of think we ought to have it; I don't know what other people think. I guess H3 doesn't have them in certain places, or QUIC doesn't; they are in some places but not others. Someone just needs to make a call. Colin, if we just said it's binary data, do your best, does that resolve all of your issues about UTF-8?
Sure, because then people don't try and print it.
They'll print it anyway. I will.
But, okay, at least you sort of have the security group, and the number of attacks we saw that relied on being able to reverse the bidirectional log message right over top of it and then go forward again... UTF-8 allows you to do all of those things.
I see, okay. I think the text we have for UTF-8 in this particular thing was lifted straight from HTTP/2, because they have a place for a reason phrase to appear. So we've at least tried to copy somebody who's more past than we are. I'm not saying that we won't have more trouble.
Uh, I don't want to... time is over. How should we resolve, chairs?
Resolve this?
Yeah.
I think people asked for the debug stuff, I've heard that. I don't think any of the reason phrase haters have a problem with keeping this through the interop phase. Right, Colin?
I mean, I don't... No, no, no.
Here is the concrete proposal going forward. We put a message in the draft, and before this is last-called, we'll make a call in the working group on whether we're going to remove it or not. That's just a way to punt it. We're out of time and I want to give you time to talk about timestamps. So I'm okay punting it a little bit, but, like, who cares?
I'm good. I mean, let's just make a decision.
Yes, what? Do we want to remove the UTF-8 text, I guess, is one question.
That's a separate question.
Okay, well, I don't think anyone cares strongly. We could flip a coin, we could raise hands. I would like to see them go away at the end. I think Colin would, but we're not going to die on it; we're not going to yuck anyone's yum if they really want to have them in there.
If we're going to remove it, then, what I said, which is, we just need the reason codes to be bigger and more detailed than just error, protocol violation, you know, go on, good luck to you guys.
Yeah, I guess that's my point. I need somewhere to put something more detailed than just, like, "error" is too active.
But our reason code space is big, right? It's like, it's in a 55-m; you can make as many codes as we want.
I'd rather not actually make more codes. I'd rather we do something like what I said, which is, put a line, like, this is the line, or whatever. You can have 9,000.
If we're going to have rich feedback, just leave it as it is. That's what I say.
Nobody said, I want the text or binary stuff, I have to have it. Nobody said that, right? I mean, I want no debug, no text. There's one there, but that's for expected behaviour. And we've written it as binary. But nobody cares about whether there is anything in the UTF-8 or binary, right? Nobody said, "I need data in there," right? It's "no" for everybody.
So, you're saying, "Don't send reason phrases?"
I absolutely do.
Do people care about it?
I care. At least for now, I care. I mean, for H3, I do; I use them.
So, all right, if the desire is to have this really rich feedback, with a million codes, I would say let's just leave it as it is. The point is to keep it, to find the feedback with something, and not mess with the protocol. We're just going to have to recreate it as a new structure. So, just leave it as it is.
That works for me, Colin. We're switching this to binary data allow us to make it in the protocol that we anticipate already.
Oh, yeah, because, I'm just sort of saying that the problem is, when you use UTF-8, you're supposed to specify a normalization form, and we don't. If we want to do the things that you're supposed to do when you use UTF-8, like have a normalization form and things like that, it'll go through no problem. We just haven't done all of the things that UTF-8 requires. Which is why US-ASCII would be much simpler, because then you don't have those requirements, or binary.
Or, this is binary data; you already have a 9-byte binary field.
Okay, I want to give the rest of the time to Ian.
Like, error code is the same 9-byte binary.
You will love arguing about timestamps more than you love arguing about... All right, Ian, you're up, and you have, uh, 12 minutes.
We've talked about this some in the past, but I think we now have more experience to have a way more informed conversation, and maybe we get a real direction out of it, so I will try again.
Conversations around DTS, we believe, have been very helpful in providing some real-world sense of how people are actually going to do track switching and such. Okay. So, potential use cases for timestamps across tracks: audio and video, for example, where you're trying to keep them approximately in sync. This is particularly valuable if you have a smallish jitter buffer, but not like 15 milliseconds, more like 2 seconds, where it's very easy for the audio to get a good bit ahead of the video if you're not careful and you have a lot of audio. But it also could be very helpful for keeping tracks in sync that are from different productions. So, video conferencing, or any other time you have two different video feeds and you want to keep them approximately in sync from a delivery perspective. Again, this is all about delivery; this is not about playback. This is just there to make sure that you don't have buffer under-runs because you send all of one thing and none of the other.
So, timestamps across tracks are a difficult-to-impossible problem in the extreme sense, but as I think Colin kind of alluded, perfection's probably not required here; this is just for delivery. If we can get sufficiently accurate timestamps, it actually might still be a net win. And there are a lot of use cases where, even if the time is not 100% synchronized, maybe they're all from the same data center and the same region or same rate, right? These clocks can be reasonably close, like maybe where they're transcoded, things like that. So there are a lot of use cases where we might be able to use this.
And I wanted to say briefly on DTS: it currently uses group alignment, which seems to work fairly well. It probably could use a timestamp if it's available, but I think DTS has proven that, at least for that one use case, group alignment is probably sufficient. Timestamps aren't required, even if they could be used.
Yeah, that was going to be my comment. DTS is just requiring group IDs to be consistent; they can all have completely different media times or something. It doesn't matter.
Yeah. And also, when you're syncing media, you're not going to do it with the timestamp carried on the delivery item. You're going to do it with a much more accurate time signal that's embedded, encoded, like MPEG system you have for that, or time base or presentation set, and sync it right. So this timestamp would be useful if we had, say, allowed filters or something that is a property of transport: I only want objects that are signaling this timestamp. But that timestamp is just a number, and you can already define an existing property which is a number. So is this really about just standardizing a property and saying it is a timestamp and here's how you write it?
Uh, yes, basically, things like that.
Or, is there some other issue?
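As a rough illustration of the delivery use case described above: if every object carried a comparable timestamp property, a sender could interleave tracks by timestamp so that no track runs far ahead of the others in the send order. This is only a sketch under stated assumptions. The `(timestamp_ms, track, payload)` tuple layout and the function name `interleave_by_timestamp` are hypothetical and not defined in the draft, and the timestamps are assumed to come from clocks that are close enough for delivery purposes, not for playback.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical object layout: (timestamp_ms, track name, payload).
Obj = Tuple[int, str, bytes]


def interleave_by_timestamp(*tracks: Iterable[Obj]) -> Iterator[Obj]:
    """Yield objects from all tracks in non-decreasing timestamp order.

    Assumes each individual track is already ordered by timestamp.
    Only the send order is decided here; playback sync is out of scope.
    """
    return heapq.merge(*tracks, key=lambda obj: obj[0])
```

For example, feeding one audio track with 20 ms objects and one video track with roughly 33 ms objects produces a send order that alternates between the two, so neither track is sent entirely before the other.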
Alan Frindell: We try to simulate what you get with FETCH, so you're just trying to aggregate all the sub-groups and saying, for this whole batch of sub-groups, here is your, here is your limit.
Suhas Nandakumar: Yeah, here's the total number of bytes.
Alan Frindell: Yep, and also stream limit too if you want to use that. So say, say you're in a situation where like, I have 100 streams and I want to give this subscription 20 of them but not more than 20, then you basically can make sure that those subscriptions doesn't consume more than 20, which for a relay is, is kind of nice because, you know, you're going upstream, you're trying to like, approximately fair share probably between multiple subscriptions. I mean this is mostly realistically probably really valued with relays where you have, you know, a lot of people who are competing with on the same resources, you potentially, like someone downstream might like just stop consuming, or like, you know, there's bandwidth. Because right now your only hammer is unsubscribed, right? Right now your only hand, well, you have, you have flow, you have flow control on a subscribe, if you don't want the other person to send, you have flow control on stream.
Suhas Nandakumar: Every stream, yes.
Alan Frindell: It stops and then, FETCH is a stream so it has flow control, right? Like, that's it just has it because it's, is because it works.
Suhas Nandakumar: Okay. But subscribe is not.
Alan Frindell: I was going to say, since this has come up a few times, and it sounds like maybe now people are more interested in this, what can we do to make sure that we actually advance the ball? I'd rather get our auth stuff straightened out first, before we do the DDoS stuff, because I think the auth stuff solves lots of that. So if I was sequencing these, I would get auth done in the draft fairly soon, once all the current stuff we have in play has landed, right? And then start the auth discussion, and then start the rate limiting discussion. Victor's in queue, also. What's up? I didn't see. Oh, I'm sorry. Okay. Yes, the description sounds like a real problem, but the solution does not solve it. Why? It is not established that we do not have flow control for subscription. We do have flow control for subscription. Subscriptions are implicitly flow controlled, in the sense that the receiver has to consume data as it's generated, or eventually you will get timeouts or fall too far behind. Because of the nature of things transmitted by subscription, you have to consume them at rate, and that is flow control, because that's how the rate of the receiver gets matched to the rate of the sender. Now, there are other issues, like someone sending traffic that is too high rate for the network, and we might need to have support for that. That's that sort of out-of-sync problem where, as Ian was saying, you use some sort of rate limit. We definitely need that. Even in H3 we do some stuff that mostly avoids it, but it's a real problem at the...
Suhas Nandakumar: We tried to do JSON Patch. We actually had JSON Patch, and we had objections that it was a big, heavy library that did a lot of things and wasn't available in every target environment. People wanted a cleaner mechanism that is simple and does only what we need, so we dropped JSON Patch and went to this. I think if we add update, it allows you to do a lightweight change, and nothing stops you from republishing a clean catalog. The difference with this, and we had this with DASH as well, where we eventually added a patch mechanism to DASH manifests, is that publishing the whole XML in the case of DASH, or the whole JSON here, is a heavier operation than just an object model update inside your player.
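To make the "lightweight change versus republishing the whole catalog" distinction concrete, here is a minimal sketch of applying a small delta to an in-memory catalog. The field names `tracks`, `addTracks`, `removeTracks` and `name` are assumptions chosen for illustration; the discussion above does not spell out the actual update format.

```python
import copy

def apply_catalog_delta(catalog, delta):
    """Return a new catalog with the delta applied (hypothetical delta shape)."""
    updated = copy.deepcopy(catalog)
    # Drop every track whose name is listed for removal.
    removed = {t["name"] for t in delta.get("removeTracks", [])}
    updated["tracks"] = [t for t in updated["tracks"] if t["name"] not in removed]
    # Append the newly announced tracks.
    updated["tracks"].extend(copy.deepcopy(delta.get("addTracks", [])))
    return updated

catalog = {"tracks": [{"name": "video"}, {"name": "audio"}]}
delta = {"removeTracks": [{"name": "audio"}], "addTracks": [{"name": "audio-hd"}]}
new_catalog = apply_catalog_delta(catalog, delta)
```

The player only touches the tracks named in the delta, where a full republish would mean parsing and diffing the complete document.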
Suhas Nandakumar: Okay. So, then... We have to do this? No, no, no. We are not going to do... No, I think the other way is better, which is, we just need a way to, right? And let the auth design team figure it out. I think it can be message-sized, and finally, yeah, it works. The other point I want to make in this space is that, the way we've structured it now, if it happens to be an immediate request error, we have this reason phrase, but if it's an error that happens later, it's a reset stream, which has no reason phrase, and nobody really cares. So why do we need a reason phrase? I don't know. I don't want to spend forever paying the lawsuit here. I kind of think we ought to have it; I don't know what other people think. H3 doesn't have them in certain places, and QUIC doesn't either; they are in some places but not others. Someone just needs to make a call. Colin, if we just said it's binary data, do your best, does that resolve all of your issues about UTF-8? Sure, because then people don't try to print it. They'll print it anyway. I will. But, okay, at least you sort of have the security group. The number of attacks we saw that relied on being able to reverse the direction of a bidirectional log message, write right over the top of it, and then go forward again: UTF-8 allows you to do all of those things. I see, okay. I think the text we have for UTF-8 in this particular thing was lifted straight from HTTP/2, because they have a place for a reason phrase to appear. So we've at least tried to copy somebody who's further along than we are. I'm not saying that we won't have more trouble. Time is over. How should we resolve this, chairs?
I think I heard a lot of people ask for the debug stuff. I don't think any of the reason phrase haters have a problem with keeping this through the interop phase. Right, Colin? No, no, no. Here is the concrete proposal going forward. We put a note in the draft, and before this is last called, we'll make a call in the working group on whether we're going to remove it or not. That's just a way to punt it. We're out of time, and I want to give you time to talk about timestamps. I'm okay punting it a little bit, but who cares? I'm good. Let's just make a decision. Yes, what? Do we want to remove the UTF-8 text, I guess, is one question. That's a separate question. Okay. I don't think anyone cares strongly. We could flip a coin, we could raise hands. I would like to see them go away at the end, and I think Colin would, but we're not going to die, and we're not going to yuck anyone's yum if they really want to have them in there. If we're going to remove it, then, as I said, we need the reason codes to be bigger and more detailed than just error, protocol violation, and so on. Good luck to you guys. Yeah, I guess that's my point. I need somewhere to put something more detailed than just "error". But our reason code space is big, right? It's like, it's in a 55-m, you can make as many codes as we want. I'd rather not actually make more codes. I'd rather we do something like what I said, which is, put a line, like, this is the line, or whatever. You can have 9,000. If we're going to have rich feedback, just leave it as it is. That's what I say.
I've heard nobody say, "I want the text or binary stuff, I have to have it." Nobody said that, right? I mean, I want no debug, no text. There's one there, but that's for expected behaviour, and we've written it as binary. But nobody cares whether what is in there is UTF-8 or binary, right? Nobody said, "I need data in there," right? It's "no" for everybody. So you're saying, "Don't send reason phrases"? I absolutely do. Do people care about it? I care. At least for now, I care. We've definitely got at least one; I mean, for H3, I use them. All right. If the desire is to have this really rich feedback, with a million codes, I would say let's just leave it as it is. The point is to keep it, to provide the feedback with something, and not mess with the protocol. Otherwise we're just going to have to recreate it as a new structure. So just leave it as it is. That works for me. Colin, does switching this to binary data address the problems that you anticipate? Oh, yeah. I'm just saying that the problem is, when you use UTF-8, you're supposed to prescribe a normalization form, and we don't. If we want to do the things that you're supposed to do when you use UTF-8, like specify a normalization form, it'll go through no problem. We just haven't done all of the things that UTF-8 requires. Which is why US-ASCII would be much simpler, because then you don't have those requirements. Or binary. You already have a 9-byte binary field. Okay, I want to give the rest of the time to Ian. The error code is the same 9-byte binary. You will love arguing about timestamps more than you love arguing about... All right, Ian, you're up, and you have 12 minutes.
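The "it's binary data, do your best" position and the bidirectional-override concern can be illustrated with a small sketch of how an implementation might log a received reason phrase. This is an assumption about receiver-side handling, not anything the draft specifies; the function name and the 256-byte limit are hypothetical.

```python
def loggable_reason(reason: bytes, limit: int = 256) -> str:
    """Render an untrusted reason phrase so it cannot rewrite a log line."""
    out = []
    for b in reason[:limit]:
        # Pass printable ASCII through, except backslash so escapes stay unambiguous.
        if 0x20 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            # Everything else, including UTF-8 bidi controls, becomes a hex escape.
            out.append(f"\\x{b:02x}")
    if len(reason) > limit:
        out.append("...")
    return "".join(out)

# U+202E (RIGHT-TO-LEFT OVERRIDE) is escaped instead of being interpreted.
safe = loggable_reason("ok\u202eevil".encode("utf-8"))
```

Because the phrase is treated as opaque bytes, no normalization form has to be chosen and nothing outside printable ASCII ever reaches the terminal or log viewer.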
We've talked about this some in the past, but I think we now have more experience to have a way more informed conversation and maybe get a real direction out of it, so I will try again. Conversations around DTS, we believe, have been very helpful in providing some real-world sense of how people are actually going to do track switching and such. Okay. So, a potential use case for timestamps across tracks is audio and video, for example, where you're trying to keep them approximately in sync. This is particularly valuable if you have a smallish jitter buffer, not like 15 milliseconds but more like 2 seconds, where it's very easy for the audio to get a good bit ahead of the video if you're not careful and you have a lot of audio. It could also be very helpful for keeping tracks in sync that are from different productions, like video conferencing, or any other time you have two different video feeds and you want to keep them approximately in sync from a delivery perspective. So again, this is all about delivery; this is not about playback. This is just there to try to make sure that you don't have buffer under-runs because you sent all of one thing and none of the other. Timestamps across tracks are a difficult-to-impossible problem in the extreme sense, but as I think Colin alluded, perfection's probably not required here; this is just for delivery. So if we can get sufficiently accurate timestamps, it might still be a net win. There are also a lot of use cases where, even if the time is not 100% synchronized, maybe the sources are all from the same data center and the same region, right? These clocks can be reasonably close, maybe where they're transcoded, things like that.
So there are a lot of use cases where we might be able to use this. I wanted to say briefly, on DTS: it currently uses group alignment, which seems to work fairly well. It probably could use a timestamp if one is available, but I think DTS has shown that, at least for that one use case, group alignment is probably sufficient. Timestamps aren't required, even if they could be used. Yeah, that was going to be my comment. DTS is just requiring group IDs to be consistent; the tracks can all have completely different media times or something. It doesn't matter. Yeah. And also, when you're syncing media, you're not going to do it with the timestamp carried on the delivery item. You're going to do it with a much more accurate time signal that's embedded in the encoded media, like the one MPEG systems have for that, or a time base or presentation timestamp, and sync it right. So this timestamp would then be useful if we had, say, filters, or something that is a property of the transport: I only want objects that are signaling this timestamp. But that timestamp is just a number, and you can already define a property which is a number. So is this really about just standardizing a property and saying it is a timestamp and here's how you write it? Yes, basically, things like that. Or is there some other issue?
Suhas Nandakumar: I think there is value in just standardizing something because it comes in different forms in different applications.
Suhas Nandakumar: Yeah, that's...
Suhas Nandakumar: Yeah, I think this is a layer violation, but...
Suhas Nandakumar: The use cases cited are all applications syncing media. And as Jordi just said, different applications write different timestamps to media. They are application-defined. Are we going to arbitrarily pick some of them and put them in the transport protocol, which is what we're defining here? As much as I use and need these timestamps, I don't think putting them into the transport protocol is the best place for them. They have application utility. They can go in custom-defined properties that the application defines.
Suhas Nandakumar: Fair question. But what if we want the transport protocol to try to deliver them more effectively? Would it be better then to just have the ability to point to an existing field and say that this field somehow allows you to synchronize between multiple tracks?
Suhas Nandakumar: You call it a sync field?
Suhas Nandakumar: Sure, fine, that's fine.
Suhas Nandakumar: Sync field.
Suhas Nandakumar: Okay. Or...
Suhas Nandakumar: And you can just point at anything. It's not time. Yeah, it's a sync, it's a tech-agnostic sync field. We like physics, right? We deliver boxes. We don't have this call closed.
Suhas Nandakumar: That would be fine by me. No, sure, I think that's valid, Colin.
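As a rough illustration of the "sync field" idea, the sketch below assumes each object carries some application-defined numeric property and that the sender is told which property to treat as the sync field. It then picks the next object to send across tracks by lowest sync value, so no track runs far ahead of the others. The function name, the dict-shaped objects, and the `pts` property are all hypothetical.

```python
import heapq

def interleave_by_sync_field(tracks, sync_field):
    """Order objects from several tracks by an opaque, comparable sync value."""
    heap = []
    for name, objects in tracks.items():
        if objects:
            heapq.heappush(heap, (objects[0][sync_field], name, 0))
    order = []
    while heap:
        value, name, idx = heapq.heappop(heap)
        order.append((name, value))
        # Queue the next object of the same track, if any remain.
        if idx + 1 < len(tracks[name]):
            heapq.heappush(heap, (tracks[name][idx + 1][sync_field], name, idx + 1))
    return order

tracks = {
    "audio": [{"pts": 0}, {"pts": 20}, {"pts": 40}, {"pts": 60}],
    "video": [{"pts": 0}, {"pts": 33}, {"pts": 66}],
}
send_order = interleave_by_sync_field(tracks, "pts")
```

The transport never interprets the value as time; it only compares numbers, which is the sense in which the field can point at anything.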
Suhas Nandakumar: On the layer violation thing, I love it. I actually want to use this for something different. We debug all the time using timestamps, trying to figure out what's happening across the relay networks, and often relays will statistically sample a small percentage of the timestamps and report them up to metrics servers and things like that. Okay. And RTP explicitly designed the timestamp to not be encrypted from intermediaries, explicitly so that intermediaries could see it, which is really wild. I don't mean it's inside encrypted packets; it is bare on the internet, even when you're using DTLS-SRTP, because it's such a useful debugging tool. So I'd go in this direction, and maybe it's a different extension, maybe we end up with two: I think it would be useful to be able to have an NTP absolute timestamp of what we think the time of this packet is, that you can statistically drop into some of the objects, or all of the objects if you felt like it. Then it can be used for this as well as for metrics and debugging, and maybe I'm trying to combine two things that are different. I agree it's a layer violation, but I think this one timestamp is so critical to debugging real-time flows that it is worth having the layer violation for timestamps.
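A sketch of what "statistically drop an NTP absolute timestamp into some of the objects" could look like follows. The 64-bit NTP format (32-bit seconds since 1900 plus a 32-bit fraction) is standard; the property name `send_time_ntp`, the dict-shaped properties, and the sampling helper are assumptions for illustration only.

```python
import random
import struct

NTP_UNIX_OFFSET = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01

def to_ntp64(unix_seconds: float) -> bytes:
    """Encode a Unix time as a 64-bit NTP timestamp (era rollover in 2036 ignored)."""
    whole = int(unix_seconds)
    frac = int((unix_seconds - whole) * (1 << 32))
    return struct.pack("!II", (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF, frac & 0xFFFFFFFF)

def from_ntp64(data: bytes) -> float:
    """Decode a 64-bit NTP timestamp back to Unix seconds (same era assumption)."""
    secs, frac = struct.unpack("!II", data)
    return secs - NTP_UNIX_OFFSET + frac / (1 << 32)

def maybe_stamp(properties: dict, now: float, sample_rate: float, rng: random.Random) -> dict:
    """Attach a send timestamp to roughly sample_rate of the objects."""
    if rng.random() < sample_rate:
        return {**properties, "send_time_ntp": to_ntp64(now)}
    return properties

rng = random.Random(7)
stamped = maybe_stamp({"group": 3}, 1_700_000_000.5, 1.0, rng)
```

A relay could read the property on the objects that carry it and report one-way delay samples to a metrics server without touching the payload.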
Suhas Nandakumar: These are send timestamps, minted by the sender?
Ian Swett: Okay. Ian, do you have it written down?
Ian Swett: I've written it down, but is the answer that we do want to support them, they come on one stream, and they can overlap? All right.
Will Law: Okay. Not the outcome I was expecting, but at least we have a response. Renumber everything. I don't think we're going to do anything immediately, but yes, we plan to renumber everything. When the time comes, we will make sure all the enums start at zero, are contiguous, and are sorted in an appropriate order. So, we'll report this issue. Ian.
Ian Swett: Yes.
Will Law: Your favorite issue.
Ian Swett: Yes, um.
Ian Swett: Okay. Flow control. This PR for flow control. We want flow control. It limits the total number of streams and the total number of bytes sent on a subscription. It uses control messages on bidi stream. Now that we have bidi stream, it doesn't uh require we send stream, matter of anything. It's actually quite straightforward. Um, Alan wrote another PR that is actually I think slightly better PR, but they're basically the same proposal. This is mine, and more performance in terms of property. Um, they, I mean, it's not like why is my concern with the same things but with something um in terms of, if you only want to do bytes, for example, you can only do bytes and um then you end up with something that's very similar to um, you know, standard stream flow control and single stream, but it's across multiple streams. So, um, in the past, the workgroup has not expressed a ton of enthusiasm for it. Um, if we did want to do subscribe side flow control in this in the past, I think we probably have to do this. Um, yeah. And so, I mean, my take away from what the working group wants is that probably closing this or moving this to like a extension or something makes the most sense. So like, I want to give it one more go round before we go that path. I'm not finding this PR, where is... What's the number? It's not 11, I'm sure. PR 11 is in my repo, A-Frind/moq-transport. Low low numbers come from... You have to link them, yeah, sorry, that if you click that link, you just go to GitHub, A-Frind/moq-transport. You'll find all kinds of secret stuff I work on. When I dream up things that I don't want to share with people. But but 1591 is this issue. You're looking for the content, it should be fine. Um, I mean, you want I I guess I could put feedback on the message on the, but I guess the thing is is like, if we're going to do something like this, I think limiting the rate would also be really important. I'll I'll have Magnus define how to how to measure the rate. 
That's a possible, but that would be Magnus's problem. I mean, you can kind of I mean, you can limit the rate by limiting the bytes. Yes. But but there's this is not like like like, I have no idea whether conference call when it starts is going to last 5 minutes or 5 months. Like, literally, they go for 5 months. So, trying I can't put in a reasonable limit for bytes, but I could put in a reasonable limit for rate. No, no. But, you keep feeding it credit, so like, you as you consume, say, you have say, you have like a Yeah, I understand. Okay. Okay. All right, so you just keep feeding it credit, so functionally it works. I mean, it's a little bit tedious, but like, you don't have to worry about... It's not with only, I mean, you have to act packets to it, like, to get congestion window back, so it's like, you know, similar amount of work. And I wrote I wrote one of these PRs but I don't remember anything about it. So, if you're out of credit and subscribe, you still queue it, and then, if credit arrives, it just flushes late. Yes. And it's still subject to delivery timeout. Yes, and at some point, you do deliver, like, too far behind. And then, you can still get too far behind, like, that counts against your queue. Yes. Um, so, I mean, it basically gives you the same flow control you get with like a FETCH. But like on subscribe, so, that's that's the intent. I'm imagining you had FETCH flow control on subscribe. I have not seen a lot of people jumping up and down saying like, I really want to do this. So, I think maybe the thing to do is have it in an extension and maybe people will play with it and find that there's useful and MoQ V2, you know, just like whatever, Speedy V2 didn't have flow control on application. We should have flow control. I sort of feel like this is like, you know, the other big topic that I think is the DDoS stuff we haven't really thought into the draft yet, or whatever. 
It's like, this may be a very plausible solution to resource exhaustion fights that we have identified in the DDoS draft. Seems like we haven't like, again, I mean, we need a design team for DDoS too, we have a design for DDoS. But they haven't talked about this, this is off in the this is off in the subscriber to the relay. This is not how we've you can sort of throttle them in. I I look I don't I I had always imagined there would be some way to rate-limit things, like rate-limit data, not not control. Control messages will not be how the systems kill. Um, so I thought there would be something you could do with rate limiting data and this seems like sort of it, though I can't read I haven't read the PR, obviously. Okay. Um, I don't know. We can repark it for like and actually like, we can plan to talk about next performance draft in more detail, I'm happy to. Like, I can make a slide that goes into more detail and like, update my PR to take whatever Alan's things that I think are good and, no. So, so, so, you're trying to simulate what you get with FETCH. So, you're just trying to aggregate all the sub-groups and saying, for this whole batch of sub-groups, here is your here is your limit. Yeah, here's the total number of bytes. Okay. You you can send. Yep, and also stream limit too if you want to use that. So, say say you're in a situation where like, I have 100 streams and I want to give this subscription 20 of them but not more than 20 then you basically can make sure that those subscriptions doesn't consume more than 20, which for a relay is is kind of nice, because you know, you're going upstream, you're trying to like, approximately fair share probably between multiple subscriptions. I mean, this is mostly realistically probably really valued with relays where you have, you know, a lot of people who are competing with on the same resources, you potentially, like someone downstream might like just stop consuming, or like, you know, there is bandwidth. 
Because right now you're only hammer is unsubscribed, right? Right now your only hand, well, you have you have flow, you have flow control on a subscribe, if you don't want the other person to send, you have flow control on stream. Every stream, yes. It stops and then, FETCH is a stream so it has flow control, right? Like, that's it just has it because it's is because it works. Okay. But subscribe is not. Um, do, I was going to say like, what are the like, since this has come up a few times and people have so, it sounds like maybe now people are more interested in this, like what can we do to make sure that we actually advance the ball? Um, that's why I say, do what, how about we we completely on this, I'd rather get our auth stuff straightened out first before we did did the DDoS. This stuff because I think that the auth stuff solves lots of that, so like, if I was sequencing these, I would get auth done in the draft as of something coming up fairly soon, once we got all the currents, I get all the current stuff we have in play landed, right? And then start the auth discussion, and then start the the the rate limiting discussion. Victor's in queue, also. What's up? I didn't see. Oh, I'm sorry. Okay. Ah, yes, this is the description, is sounds like a real problem, the solution does not solve it. Why? It does not authorize all, it does not prove that we do not have flow control for subscription. We do have flow control for subscription. The subscription are implicitly flow control for here, in a sense that the receiver has to consume data and as long as it's generated, or eventually you will get timeouts or so to far behind, and sensors because of nature of things which transmit by subscription you kind of have to consume it as of rate, and that's why is flow control because that's how the the rate of the receiver gets to like, the rate of the sender. 
Now, are there other issues, like someone sending traffic at too high a rate for the network, and things like that? We might need to have support for that, but that's sort of an out-of-sync problem. Like Ian was saying, use some sort of rate limit. We definitely need that. I mean, even in H3 we do some stuff that mostly avoids that, but it's a real problem at the...
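As a rough illustration of the byte and stream limits discussed above, the following Python sketch models a per-subscription budget on the sending side. Every name here (`SubscriptionCredit`, `fair_share`, the field names) is hypothetical and is not taken from any MoQ draft or from the PR under discussion; it only shows the bookkeeping a relay might do under those assumptions.

```python
from dataclasses import dataclass


@dataclass
class SubscriptionCredit:
    """Hypothetical per-subscription budget; not part of any MoQ draft."""
    max_bytes: int         # total bytes the sender may deliver
    max_streams: int       # total streams the sender may open
    bytes_used: int = 0
    streams_used: int = 0

    def open_stream(self) -> bool:
        """Consume one stream slot; return False once the budget is spent."""
        if self.streams_used >= self.max_streams:
            return False
        self.streams_used += 1
        return True

    def send(self, nbytes: int) -> int:
        """Return how many of nbytes fit in the remaining byte budget."""
        allowed = max(0, min(nbytes, self.max_bytes - self.bytes_used))
        self.bytes_used += allowed
        return allowed


def fair_share(total_streams: int, subscriptions: int) -> int:
    """Split a relay's stream budget evenly across subscriptions."""
    return total_streams // subscriptions if subscriptions else 0
```

With 100 streams shared by 5 subscriptions, `fair_share(100, 5)` gives each subscription a limit of 20, matching the example in the discussion. An even split is only one possible policy.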
Ian Swett: I'm not finding this PR, where is... what's the number? It's not 11, I'm sure.
Will Law: PR 11 is in my repo, A-Frind/moq-transport; those low numbers come from... You have to link them, yeah, sorry. If you click that link, you'll just go to GitHub, A-Frind/moq-transport, and you'll find all kinds of secret stuff I work on when I dream up things that I don't want to share with other people. But 1591 is this issue. If you're looking for the content, it should be fine.
Ian Swett: Um, do you want... I guess I can put feedback on the... I guess the thing is, if we're going to do something like this, I think limiting the rate would also be really important.
Ian Swett: I'll have Magnus define how to measure the rate. That's possible, but that would be Magnus's problem.
Ian Swett: No, I mean, you can limit the rate by limiting the bytes. But I have no idea whether a conference call, when it starts, is going to last 5 minutes or 5 months; literally, they go for 5 months. So I can't put in a reasonable limit for bytes, but I could put in a reasonable limit for rate. No, no, but you keep feeding it credit as you consume. Say you have like a... Yeah, I understand. Okay. So you just keep feeding it. So functionally it works. I mean, it's a little bit tedious, but...
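The "keep feeding it credit" idea can be sketched as a sliding window: the receiver advertises a cumulative byte limit and raises it as the application consumes data, so a call of unknown length never needs a fixed total. This is a minimal Python sketch under stated assumptions; `CreditWindow` and `implied_rate_cap` are hypothetical names, and the cumulative-limit style is only similar in spirit to how QUIC advertises MAX_STREAM_DATA, not something the discussion specified.

```python
class CreditWindow:
    """Hypothetical receiver-side credit window for one subscription."""

    def __init__(self, window_bytes: int) -> None:
        self.window_bytes = window_bytes  # max unconsumed bytes allowed
        self.granted = window_bytes       # cumulative limit given to sender
        self.consumed = 0                 # cumulative bytes the app has read

    def on_consumed(self, nbytes: int) -> int:
        """Record consumption; return the new cumulative limit to advertise."""
        self.consumed += nbytes
        self.granted = self.consumed + self.window_bytes
        return self.granted


def implied_rate_cap(window_bytes: int, rtt_seconds: float) -> float:
    """Throughput bound (bytes/s) when one window is in flight per round trip."""
    return window_bytes / rtt_seconds
```

A byte window therefore bounds the rate only indirectly, at roughly one window per round trip, which is one way to read the point that a rate limit and a byte limit are related but not the same thing.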
Suhas Nandakumar: Oh, yeah, yeah. I'm just saying that the problem is, when you use UTF-8, you're supposed to specify a normalization form, and we don't. If we do the things that you're supposed to do when you use UTF-8, like have a normalization form and things like that, it'll go through no problem. We just haven't done all of the things that UTF-8 requires. Which is why US-ASCII would be much simpler, because then you don't have those requirements. Or binary. This is binary data; you already have a 9-byte binary field. Okay, I want to give the rest of the time to Ian. The error code is the same 9-byte binary. You will love arguing about timestamps more than you love arguing about... All right, Ian, you're up, and you have 12 minutes. We've talked about this some in the past, but I think we now have more experience to have a way more informed conversation, and maybe we get a real direction out of it, so I will try again. Conversations around DTS, we believe, have been very helpful in providing some real-world sense of how people are actually going to do track switching and such. Okay. So, a potential use case for timestamps across tracks is audio and video, for example, where you're trying to keep them approximately in sync. This is particularly valuable if you have a smallish jitter buffer, not like 15 milliseconds but more like 2 seconds, where it's very easy for the audio to get a good bit ahead of the video if you're not careful and you have a lot of audio. But it also could be very helpful for keeping tracks in sync that are from different productions. So, video conferencing, or any other time you have two different video feeds and you want to keep them approximately in sync from a delivery perspective.
So again, this is all about delivery; this is not about playback. This is just there to try to make sure that you don't have buffer underruns because you send all of one thing and none of the other. Timestamps across tracks are a difficult-to-impossible problem in the extreme sense, but as I think Colin alluded, perfection is probably not required here; this is just for delivery. So if we can get sufficiently accurate timestamps, it actually might still be a net win. Also, there are a lot of use cases where, even if the time is not 100% synchronized, maybe they're all from the same data center and the same region or the same rate, right? These clocks can be reasonably close, like maybe where they're transcoded, things like that. So there are a lot of use cases where we might be able to use this. And I wanted to say briefly that DTS currently uses group alignment, which seems to work fairly well. It probably could use a timestamp if it's available, but I think DTS has proven that, at least for that one use case, group alignment is probably sufficient. Timestamps aren't required, even if they could be used. Yeah, that was going to be my comment. DTS is just requiring group IDs to be consistent; they can all have completely different media times or something. It doesn't matter. Yeah. And also, when you're syncing media, you're not going to do it with the timestamp carried on the delivery item. You're going to do it with a much more accurate time signal that's embedded in the encoding, like MPEG systems have for that, or a time base or presentation set, and sync it right. So this timestamp would be useful if we had, say, allowed filters or something that is a property of the transport.
I only want objects that are signaling this timestamp. But that timestamp is just a number, and you can already define a property which is a number. So is this really about just standardizing a property and saying it is a timestamp and here's how you write it? Yes, basically, things like that. Or is there some other issue?
Suhas Nandakumar: Okay, I think we... No, I think there is value in just standardizing something because it comes in different forms in different applications.
Suhas Nandakumar: Yeah, that's...
Suhas Nandakumar: Yeah, I think this is a layer violation, um, but...
Suhas Nandakumar: All the use cases cited are applications for syncing media. And as Jordi just said, different applications write different timestamps to media. They are application-defined. Are we going to arbitrarily pick some of them and put them in the transport protocol, which is what we're defining here? As much as I use and need these timestamps, I don't think putting them into the transport protocol is best for them. They're an application utility. They can go in custom-defined properties that the application fills in.
Suhas Nandakumar: Fair question. But what if we want the transport protocol to try to deliver them more effectively? Would it be better then to just have the ability to point to an existing field and say that field somehow allows you to synchronize between multiple tracks?
Suhas Nandakumar: You call it a sync field?
Suhas Nandakumar: Sure, fine, that's fine.
Suhas Nandakumar: Sync field.
Suhas Nandakumar: Okay. Or...
Suhas Nandakumar: And you can just point at anything. It's not time. Yeah, it's a sync, it's a tech-agnostic sync field. We like physics, right? We deliver boxes. We don't have this call closed.
Suhas Nandakumar: That would be fine by me. No, sure, I think that's valid, Colin.
Suhas Nandakumar: On the layer violation thing, I love it. I actually want to use this for something different, which is: we debug all the time using timestamps, trying to figure out what's happening across the relay networks and whatever, and often relays will statistically sample a small percentage of the timestamps and report them up to metrics servers and things like that. Okay. And RTP explicitly designed the timestamp to not be encrypted from intermediaries; it's explicit, so intermediaries can see it, which is really wild. Highly non-encrypted: I don't mean it's inside encrypted packets, it's bare on the internet, even when you're using DTLS-SRTP, because it's such a useful debugging tool. Okay. So I'd sort of go in that direction, and maybe it's a different extension, maybe we end up with two. I think it'd be useful to be able to have an NTP absolute timestamp of what we think the time of this packet is, that you can statistically drop into some of the objects, or all of the objects if you felt like it, and then it can be used for this as well as for metrics and debugging processes. Maybe I'm trying to combine two things that are different. And I agree it's a layer violation, but I think this one timestamp is so critical to debugging real-time flows that it is worth having the layer violation for timestamps.
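To make the "NTP absolute timestamp, statistically dropped into some objects" suggestion concrete, here is a small Python sketch. The 64-bit NTP format (32 bits of seconds since 1900, 32 bits of fraction) and the 2,208,988,800-second offset from the Unix epoch are standard; everything else, including the function names and the idea of a per-object sampling rate, is an assumption for illustration and not an agreed extension.

```python
import random

NTP_UNIX_OFFSET = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01


def unix_to_ntp64(unix_seconds: float) -> int:
    """Pack a non-negative Unix time into 64-bit NTP format (32.32 fixed point)."""
    whole = int(unix_seconds)
    frac = int((unix_seconds - whole) * (1 << 32))
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    return (seconds << 32) | (frac & 0xFFFFFFFF)


def ntp64_to_unix(ntp: int) -> float:
    """Inverse of unix_to_ntp64; ignores the NTP era rollover in 2036."""
    return (ntp >> 32) - NTP_UNIX_OFFSET + (ntp & 0xFFFFFFFF) / (1 << 32)


def should_stamp(sample_rate: float, rng: random.Random) -> bool:
    """Hypothetical sampling decision: stamp roughly sample_rate of objects."""
    return rng.random() < sample_rate
```

A sender could call `should_stamp(0.01, rng)` per object to stamp about 1% of objects, which keeps overhead low while still giving relays and metrics servers a usable signal.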
Suhas Nandakumar: These are send timestamps, minted by the sender?
Suhas Nandakumar: Yes, at some point, you do fall too far behind. You can still get too far behind; that counts against your queue. Yes. So it basically gives you the same flow control you get with a FETCH, but on subscribe. That's the intent. Imagine you had FETCH flow control on subscribe. I have not seen a lot of people jumping up and down saying, I really want to do this. So I think maybe the thing to do is have it in an extension, and maybe people will play with it and find that it's useful, and in MoQ v2, you know, just like, whatever, SPDY v2 didn't have flow control on application. We should have flow control. I sort of feel like the other big topic is the DDoS stuff, which we haven't really gotten into the draft yet.
Suhas Nandakumar: Yeah, I can... The thing is that I had two options there, because in Privacy Pass, when we do the first part of the setup, you will not get a challenge, and then the relay would ask you for... We had sometimes some metadata. We wanted a way, in setup, when setup fails, to come back with the challenge. And there are two ways to do this, right? Do something like this, or define a new auth error, kind of like this. Hold on, setup doesn't have a response. Setup is just... But I think this is in the category of stuff where, when we get to auth in the draft, we'll fix this. Okay, but I don't think we should do this with an error. Pretty much every auth token scheme that isn't a bearer token has a challenge phase. So anytime you send any request, you're going to need something that looks like an auth challenge to it, that needs to pass back an authorization. So that's a separate message. Spec-wise, that's a different message. Yeah, or something like that. But we need to design for this one way or another; we have to support this, right? Okay. Yeah. Does that work for you if we say, okay... All right, so we'll close this. We're not going to do an error. I think we need to figure out how to spell this, but not this way. Okay, that's my goal. Okay, we'll close that PR. I think the issue is to say we need a way to have..., right? And let the auth design team figure it out. I think it can be message-sized, and finally, yeah, it works. So, the other point I want to make in this space is that, the way we've structured it now, if it happens to be an immediate request error, we have this reason phrase, but if it's an error that happens later, it's a reset stream, which has no error phrase, and nobody really cares. So why do we need the reason phrase?
Uh, I don't know. I don't want to spend forever litigating this here. I kind of think we ought to have it; I don't know what other people think. I guess H3 doesn't have them in certain places, or QUIC doesn't; they are in some places but not others. Someone just needs to make a call. Colin, if we just said it's binary data, do your best, does that resolve all of your issues about UTF-8? Sure, because then people don't try and print it. They'll print it anyway. I will. But, okay, at least you sort of have the security group, and the number of attacks we saw that relied on being able to reverse the bidirectional log message, write right over top of it, and then go forward again... UTF-8 allows you to do all of those things. I see, okay. I think the text we have for UTF-8 in this particular thing was lifted straight from HTTP/2, because they have a place for a reason phrase to appear. So we've at least tried to copy somebody who's further along than we are. I'm not saying that we won't have more trouble. Time is over. How should we resolve this, chairs? I think I heard a lot of people ask for the debug stuff. I don't think any of the reason phrase haters have a problem with keeping this through the interop phase. Right, Colin? No, no, no. Here is the concrete proposal going forward. We put a note in the draft, and before this is last-called, we'll make a call in the working group on whether we're going to remove it or not. That's just a way to punt it. We're out of time, and I want to give you time to talk about timestamps.
So I'm okay punting it a little bit, but who cares? I'm good. I mean, let's just make a decision. Yes, what? Do we want to remove the UTF-8 text, I guess, is one question. That's a separate question. Okay, well, I don't think anyone cares strongly. We could flip a coin, we could raise hands. I would like to see them go away at the end. I think Colin would, but we're not going to die; we're not going to yuck anyone's yum if they really want to have them in there. If we're going to remove it, then what I said applies, which is that we just need reason codes to be bigger and more detailed than just error, protocol violation, you know, go on, good luck to you guys. Yeah, I guess that's my point. I need somewhere to put something more detailed than just, like, "error" is too active. But our reason code space is big, right? It's like, it's in a 55-m, you can make as many codes as we want. I'd rather not actually make more codes. I'd rather we do something like what I said, which is put a line, like, this is the line, or whatever. You can have 9,000. If we're going to have rich feedback, just leave it as it is. That's what I say. Nobody said, I want the text or binary stuff, I have to have it. Nobody said that, right? I mean, I want no debug, no text. There's one there, but that's for expected behaviour. And we've written it as binary. But nobody cares about whether there is anything in the UTF-8 or binary, right? Nobody said, "I need data in there," right? It's "no" for everybody. So you're saying, "Don't send reason phrases"? I absolutely do. Do people care about it? I care. At least for now, I care. We've got at least one key. I mean, for H3, I use them.
So, all right, if the desire is to have this really rich feedback, with a million codes, I would say let's just leave it as it is. The point is to keep it, to provide the feedback with something, and not mess with the protocol. Otherwise we're just going to have to recreate it as a new structure. So just leave it as it is. That works for me. Colin, would switching this to binary data allow us to keep it in the protocol in the way that we anticipate already?
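The reason-phrase concerns raised in this discussion (no normalization form, and bidirectional-override tricks in log output) can be illustrated with a defensive logging helper. This is a Python sketch under stated assumptions: `printable_reason` is a hypothetical name, the 256-byte cap is arbitrary, and escaping control characters is one possible policy, not something the working group decided.

```python
import unicodedata

# Bidi classes for explicit embedding, override, and isolate controls.
_BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def printable_reason(raw: bytes, max_len: int = 256) -> str:
    """Turn untrusted reason-phrase bytes into a string that is safe to log."""
    # Treat the field as opaque bytes; decode leniently for display only.
    text = raw[:max_len].decode("utf-8", errors="replace")
    # Apply a normalization form (NFC) so equivalent strings compare equal.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        is_bidi = unicodedata.bidirectional(ch) in _BIDI_CONTROLS
        is_control = unicodedata.category(ch).startswith("C")
        if is_bidi or is_control:
            # Escape rather than emit, so logs cannot be visually reordered.
            out.append("\\u%04x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)
```

Treating the field as binary on the wire and sanitizing only at the point of display is one way to reconcile "it's binary data, do your best" with "they'll print it anyway".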
Suhas Nandakumar: I agree with that. Yes, I'm in favor of some... I think we are not at the point of, I think we have options there because in privacy pass we do the first part of the setup, you will not get challenge, and then the relay would ask you for... ...some metadata. There's two ways we wanted a way, in setup, when you fail, state setup, to come back with the challenge. And we can do this two ways to do this, right? Do something like this, or you define a new auth error. Kind of like this. Setup doesn't have a response. Setup is just but I think this is the and like, again, this is where in the category of stuff that I was like, when we get to auth in the draft, we'll fix this, okay, but I don't think we should do this with an error. I mean, every pretty much every auth auth token scheme that isn't a bearer token has a challenge phase. And so, anytime you send up any request, you're, you're going to need something that looks like a auth challenge to it, that needs to pass back a authorization. So, that's separate message for. But that's spec-wise, that's a different message. Yeah, or something like that. But, we need to design for this one way or another, we have to support this, right? Okay. Yeah. Does that work for you if we say like, okay. All right, so we'll close this. We're not going to do error. I think we need to figure out how to spell this. But this is our, but not this way. Okay, that's my goal. Okay, we'll close that PR. I, I think the issue to say we need a way to have, right? And let the auth design team to figure it out, and I think it can be message-sized, and finally, yeah, it works. So, the other point I want to make in this space is that now the way we structured it, um if it is if it happens to be an immediate request error, we have this reason phrase, but if it's a error that happens later, it's a reset stream, which has no error phrase, and nobody really cares. So, like, Why do we need reason phrase? 
I don't know, like, I don't want to spend forever paying the lawsuit here. Yeah, okay. I kind of think we ought to have it; I don't know what other people think. I guess, you know, H3 doesn't have them in certain places, or QUIC doesn't, but they are in some places and not others. Someone just needs to make a call. I don't know, Colin, if we just said it's binary data, do your best, does that resolve all of your issues about UTF-8? Sure, because then people don't try and print it. They'll print it anyway. I will. But, okay, at least you sort of have the security group, and the number of attacks we saw that relied on being able to reverse the bidirectional log message right over top of it and then go forward again... UTF-8 allows you to do all of those things. I see, okay. I think the text we have for UTF-8 in this particular thing was lifted straight from HTTP/2, because they have a place for a reason phrase to appear. So we've at least tried to copy somebody who's more past than we are. I'm not saying that we won't have more trouble. I don't want to... time is over. How should we resolve this, chairs? Yeah. I think I heard a lot of... I think people asked for the debug stuff; I've heard that. I don't think any of the reason phrase haters have a problem with keeping this through the interop phase. Right, Colin? I mean, I don't... No, no, no. Here is the concrete proposal going forward: we put a message in the draft, and before this is last called, we'll make a call in the working group on whether we're going to remove it or not. That's just a way to punt it. I mean, we're out of time and I want to give you time to talk about timestamps.
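The log-injection concern raised in this exchange (a reason phrase containing bidirectional control characters that rewrite a log line) can be illustrated with a small sketch. It assumes the receiver treats the reason phrase as opaque bytes and only ever prints an escaped form. The function name `render_reason_phrase` and the 256-byte cap are hypothetical and do not come from the draft.

```python
def render_reason_phrase(raw: bytes, max_len: int = 256) -> str:
    """Render an untrusted reason phrase for a log line.

    Printable ASCII passes through unchanged. Every other byte, including
    the bytes of UTF-8 bidirectional controls such as U+202E, is shown as a
    \\xNN escape, so the phrase cannot reorder or overwrite log text.
    The backslash itself is escaped to keep the output unambiguous.
    """
    out = []
    for b in raw[:max_len]:  # hypothetical cap on how much is logged
        if 0x20 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


if __name__ == "__main__":
    # A phrase carrying a right-to-left override (U+202E) in the middle.
    hostile = "denied\u202edetnarg".encode("utf-8")
    print(render_reason_phrase(hostile))
```

This is roughly the "binary data, do your best" position: the field stays on the wire, and the decision about what is safe to display moves to the implementation that logs it.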
So I'm okay punting it a little bit, but, like, who cares? I'm good. I mean, let's just make a decision. Yes, what? Do we want to remove the UTF-8 text, I guess, is one question. That's a separate question. Okay, well, I don't think anyone cares strongly. We could flip a coin, we could raise hands. I mean, I would like to see them go away at the end. I think Colin would, but we're not going to die, we're not going to yuck anyone's yum if they really want to have them in there... If we're going to remove it, then what I said, which is, we just need reason codes to be bigger and more detailed than just error, protocol violation, you know, go on. Good luck to you guys. Yeah, I guess that's my point. I need somewhere to put something more detailed than just, like, "error" is too active. But our reason code space is big, right? It's like, it's in a 55-m; we can make as many codes as we want. I'd rather not actually make more codes. I'd rather we do something like what I said, which is put a line, like, this is the line, or see whatever. You can have 9,000. If we're going to have rich feedback, just leave it as it is. That's what I say. Nobody said, I want the text or binary stuff, I have to have it. Nobody said that, right? I mean, I want no debug, no text. There's one there, but that's for expected behaviour. And we've written it as binary. But nobody cares about whether there is anything in the UTF-8 or binary, right? Nobody said, "I need data in there," right? It's "no" for everybody. So you're saying, "Don't send reason phrases"? I absolutely do. Do people care about it? I care. At least for now, I care. I have definitely... we've got at least one key... I mean, for H3, I do; I use them.
So, all right, if the desire is to have this really rich feedback, like, with a million codes, I would say let's just leave it as it is. The point is to keep it, to find the feedback with something, and not mess with the protocol. We're just going to have to recreate it as a new structure. So just leave it as it is. That works for me, Colin. We're switching this to binary data allow us to make it in the protocol that we anticipate already. Oh, yeah, because, I mean, I'm just sort of saying that the problem is, when you use UTF-8, you're supposed to prescribe a normalization form, and we don't. If we want to do the things that you're supposed to do when you do UTF-8, like have a normalization form and things like that, it'll go through no problem. We just haven't done all of the things that UTF-8 is required to do. Which is why US-ASCII would be much simpler, because then you don't have those requirements, or binary. Or, like, this is binary data; you already have a 9-byte binary field. Okay, I want to give the rest of the time to Ian. Like, the error code is the same 9-byte binary. You will love arguing about timestamps more than you love arguing about... All right, Ian, you're up, and you have 12 minutes. We've talked about this some in the past, but I think we now have more experience to have a way more informed conversation, and maybe we get a real direction out of it, so I will try again. Conversations around DTS, we believe, have been very helpful in providing some real-world sense of how people are actually going to do track switching and such. Okay. So a potential use case for timestamps across tracks is, you know, audio and video, for example, where you're trying to keep them approximately in sync.
So this is particularly valuable if you have a smallish jitter buffer, but not like 15 milliseconds, more like 2 seconds, where it's very easy for the audio to actually get a good bit ahead of the video if you're not careful and you have a lot of audio. But it also could be very helpful for keeping tracks in sync that are from different productions. So, like, video conferencing, or any other time you have two different video feeds and you want to keep them approximately in sync from a delivery perspective. So again, this is all about delivery; this is not about playback. This is just there to try to make sure that you don't have buffer under-runs because you, you know, send all of one thing and none of the other. So timestamp cross-attributes is a difficult-to-impossible problem in the extreme sense, but as I think Colin kind of alluded, perfection's probably not required here; this is just for delivery. And so, if we can get sufficiently accurate timestamps solely, it actually might still be a net win. And also, there are a lot of use cases where, even if the time is not 100% synchronized, maybe they're all from the same data center and the same region or same, like, rate, right? I mean, these clocks can be reasonably close, like maybe where they're transcoded, things like that. So there's a lot of use cases where we might be able to use this. And I wanted to say briefly on DTS: it currently uses group alignment, which seems to work fairly well. It probably could use a timestamp if it's available, but I think DTS has proven that, at least for that one use case, group alignment is probably sufficient. Not required. Like, timestamps aren't required, even if they could be used. So... Yeah, that was going to be my comment.
Suhas Nandakumar: Uh, yeah. I didn't get a write-up on some code, but the token itself, the size of that is around 400 bytes. Again, the token is useful in the direct action, but if the actions increase more, and also depending on the crypto you use, it might go to many, many kilobytes. So, having said that, I am also in agreement with... I think that we are not at the point of...
Victor Vasiliev: I think we have to have a PR 11 and make a new PR out. That will help us review, actually. I can make...
Will Law: Oh, okay.
Ian Swett: Since it's done. Yeah, yeah. Okay, 9:24. Upstream delivery timeouts. So, the delivery timeout is defined as hop-by-hop. There is no way, right now, to express a cumulative end-to-end timeout across a multi-peer path. Latency-sensitive subscribers, for example those targeting less than one second end-to-end, can't see if a relay has a larger timeout value. But so far, this has not really been an issue. Like, one person who doesn't come to a lot of meetings filed this issue a year or two ago. So, is this important? Should we just close it?
Colin Perkins: Oh, I think we should close it. And I'll explain why: the only solution... this problem has been discussed infinitely many times. You have to have time synchronization to do it, and though I actually think time synchronization is completely trivial on the modern internet, and it works, every time I try and convince anyone at IETF that it is possible to synchronize time between computers, it's like it's an impossible argument.
Suhas Nandakumar: Are you saying if you tried to put something in they would just shut us down or...
Colin Perkins: I'm saying we'll shoot ourselves in the foot if we try and say you must have an NTP-synchronized server.
Suhas Nandakumar: Right. Definitely.
Colin Perkins: Um, but I don't want to... and I'm not logged on with Jan's computer, but I agree we should close with no action. But it's not that I don't want this, and it's not that it's not possible to write solutions for it; it's just impossible to get everyone to agree to it at IETF, in my experience. I've had slides on time stamp signature so we can... I'll make... Is there a better way to do this? It might be. Why don't we... I'll put them in the... like, we think we don't need to do any...
Suhas Nandakumar: Well, does anybody else have a thing inside?
Will Law: Yeah, I think we should close with no action. Okay, done. There are so many other things you have to propagate along the path; this is one of them. Just a reminder to log in to the Meetecho session, please, either with the QR code or by going to the track room. Okay. So we'll close that one. Yep. Stream authorization alias with extension. This was hard. So, first question: have you implemented, or do you plan to implement, auth alias compression? I have implemented it. Raise your hand physically if you... Yeah. How many have not and do not plan to? Okay. I strongly plan to remove auth compression from the draft. Interest is low, and I pre-anticipated the answer to that. So this is less of an issue in MoQ than it was in HTTP, because subscriptions are long-lived, so you don't have to give the token every second or every two seconds. You give it once on the subscription. So it's not as dramatically bad as it maybe would have been in that environment. Once we introduce bidirectional streams, it reduces the effectiveness of what you can do without making the software fancier, or without something that has the complexity QPACK has to manage things coming on different streams: the token that was sent on this stream is now being used by this other stream, and how do you make sure it's available for you to process it? I wrote an extension that does this; it's called MoQ pack. You can go see it in the link. Basically, for every control message we have, it defines a parallel control message in which all of the parameters and track namespaces are in a compressed block, and it uses QPACK. If people want to have compression, then we can compress track names, compress auth tokens, compress everything in a way that works across QUIC streams. And I kind of think that having this auth compression scheme in our document is going to draw more scrutiny when we get to end game. People are going to be like, what are you doing here? So this is my pitch.
Colin Perkins: Oh, on this issue? Well, so, I know this was your baby, and I've worked on...
Suhas Nandakumar: Just let me motivate why we added this. First, we wanted to protect actions within MoQ, and while a subscription might be long-lived, we just had a discussion this morning about switch from, right? Active ABR switching. Your subscriptions might last just 5 seconds before you switch again. So now I've got to send a token, and I've got to send a token again. And the trouble is these tokens are 10 to 100 times larger than the message that's being sent. Hence the alias. The alias is just a number, and it represents the token. So I still think they have a lot of utility. The other side is we'd have to invent protections that are not applied frequently, but are just applied like a macro or on setup, which lowers our ability to actually protect the code. So I would rather simplify the compression scheme, if there's a complexity to it. I don't want to go to...
Suhas Nandakumar: It's the opposite. It's that the current scheme, now that we're on bidirectional streams, is harder to make use of effectively. Like, you can't use an alias until you get the okay from the message that sent it to the other side.
Suhas Nandakumar: In theory, you can, but 99% of the time, you can't.
Suhas Nandakumar: Well, but if you try to use an alias and it's not there, it's a session error.
Suhas Nandakumar: Right. But the relay thing, like, say, well, maybe one's coming; are there ways to accommodate that? Because most of the time it would work. I mean... The answer is what QPACK does. It already tried to boil this problem of things coming on different streams and how you sync them. But, I mean, that's a failure mode. That's the thing. What's that? I mean, even QPACK has failure modes, with timing, etc. I mean, if you follow the rules, it's not like you can't... I don't want to play failure. From that perspective, there are certain conditions where you still can't send it. Right. But in that mode, the peer gets to tell you how many times you are allowed to send something that it has not yet acknowledged that it has, because that can cause queueing on its side, where it has to hold your request while it's waiting for the update to arrive. So there's a tunable in it. I don't want to go into the whole design, though. So many things to set my timer to. Okay, Colin.
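The alias rule debated in this exchange can be made concrete with a minimal sketch. It assumes only the two behaviours stated by the speakers: a sender may not reference an alias until the request that registered it has been acknowledged, and a receiver that sees an unknown alias treats it as a session error. The classes `Sender`, `Receiver` and `SessionError`, and the tuple shapes they return, are hypothetical and are not the draft's wire format. The QPACK-style limit on unacknowledged references mentioned at the end of the exchange is not modeled.

```python
from dataclasses import dataclass, field


class SessionError(Exception):
    """Stands in for closing the session with an error."""


@dataclass
class Receiver:
    # alias -> token, filled in as registrations arrive
    tokens: dict = field(default_factory=dict)

    def register(self, alias: int, token: bytes) -> None:
        self.tokens[alias] = token

    def resolve(self, alias: int) -> bytes:
        # Referencing an alias the receiver has never seen is fatal.
        if alias not in self.tokens:
            raise SessionError(f"unknown auth alias {alias}")
        return self.tokens[alias]


@dataclass
class Sender:
    acked: dict = field(default_factory=dict)    # token -> alias, safe to use
    pending: dict = field(default_factory=dict)  # request id -> (token, alias)
    next_alias: int = 0

    def auth_for(self, request_id: int, token: bytes):
        """Pick what to attach to a new request."""
        if token in self.acked:
            return ("alias", self.acked[token])
        # Not acknowledged yet: send the full token again with a fresh alias.
        alias = self.next_alias
        self.next_alias += 1
        self.pending[request_id] = (token, alias)
        return ("register", alias, token)

    def on_ok(self, request_id: int) -> None:
        """The peer accepted the request, so its alias is now usable."""
        token, alias = self.pending.pop(request_id)
        self.acked[token] = alias
```

In this sketch two requests sent back to back on different streams both carry the full token, which is the loss of effectiveness described for bidirectional streams: the saving only starts after the first acknowledgement comes back.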
Colin Perkins: Um, I think I'm a little bit on the... So, I think we're asking the wrong question with who plans to implement this. I mean, we haven't really implemented auth yet anywhere, and I think once we start implementing and requiring auth and seeing how frequent the updates are, it will suddenly become clear whether it's no big deal to send the 10k requests every time, or whether this is, you know, half our traffic. So I actually think we're too soon to really decide whether we need this or not, and we need to see this. And I do worry about the ABR switches. And there was another case: I was against this to start with, and there was a case that somebody raised where you were going to have to refresh hundreds of things; you're going to have to do like a hundred transactions all pretty much at the same time, and I was like, oh my god, yeah, that's a huge amount of traffic. And it was an update case. So, I don't know, I feel a little bit like we need to implement auth before we know whether we need this or not.
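The traffic question raised here is simple arithmetic, sketched below under stated assumptions. The figures of about a hundred near-simultaneous transactions and requests on the order of 10 kB come from the discussion; the 8-byte alias size and the function name `auth_bytes` are assumptions made only for illustration.

```python
def auth_bytes(requests: int, token_size: int,
               alias_size: int = 8, use_alias: bool = False) -> int:
    """Bytes of authorization data sent across a burst of requests.

    Without aliases every request repeats the full token. With aliases the
    first request carries the token plus the alias that registers it, and
    each later request carries only the alias. This ignores the wait for
    acknowledgement, so it is a best case for the alias scheme.
    """
    if requests <= 0:
        return 0
    if not use_alias:
        return requests * token_size
    return token_size + alias_size + (requests - 1) * alias_size


if __name__ == "__main__":
    burst = 100       # "a hundred transactions all pretty much at the same time"
    token = 10_000    # assumed token size in bytes for the "10k" case
    print(auth_bytes(burst, token))                  # full token every time
    print(auth_bytes(burst, token, use_alias=True))  # token once, then aliases
```

Whether that difference matters depends on how often such bursts occur in practice, which is the measurement the speaker is asking for before deciding.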
Will Law: Oh, okay. Martin.
Martin Duke: As an individual, I think, given that we have to fix it anyway, like, the current thing is not satisfactory. Write down what you've done, how to do it, and evaluate that outside of the main MoQ. What's the part that doesn't work right now? That we have separate streams; they're out of sync. But the draft is not broken; it just says you're not allowed to use an alias unless you've got the okay from the message there. Okay, so it's very simple to implement, so never mind, okay, then not broken. I mean, no one has implemented... I've implemented it, but I haven't implemented that piece. In draft 18 I just turned it off because I don't want to deal with it. We just got token implementations in the last couple of weeks, so it's like, okay, a long way to go here. Victor. Yeah, given that we now have bidirectional streams, it might be more of a cargo-cult exercise. Suhas. I think I did get a write-up on some code, but the token itself, the size of that is around 400 bytes. Again, the token is useful in the direct action, but if the actions increase more, and also depending on the crypto you use, it might go to many, many kilobytes. So, having said that, I am also in agreement with... I think that we are not at the point of... I'm just going to echo one comment I see from Mike in the chat, which is that it's maybe too early to remove it, but moving it to an extension seems reasonable. So we could just take everything that we have and move it to another document, which is, like, if you want this... the core is, we send the tokens every time. It's already effectively given by a setup option, which is all you need to make it an extension. So it's fine. It's not totally different if we write it in another place, and it shrinks our document by 500 lines. Do people have a problem with that approach or...
Suhas Nandakumar: I think we should have a fall-back. I feel like, you know, we even have important time for five tokens, so we'll make a decision of maybe, yes, move. Okay. Again, I also feel like this is a time where we have open source libraries now to transport these auth schemes. We should give at least some time for people to... It's too early to remove. Okay. So I guess I'm hearing mixed feedback: some people are on team extension, some people are on team too early.
Will Law: MoQ-secure-objects is very pro-extension, but obviously, there are some...
Suhas Nandakumar: MoQ-secure-objects is pro-extension. Um, I see two... I mean, I don't know if we want to call for a show of hands. I guess I didn't see your hand shoot up about who's going to implement this soon. So, are you in...
Colin Perkins: Oh, we're not. We can't run a trial until this is implemented.
Suhas Nandakumar: The compression scheme?
Colin Perkins: No, the authorization scheme.
Suhas Nandakumar: Right. Are you going to implement compression and tell me if it's useful or not? Yes. Okay, by when?
Suhas Nandakumar: Next week.
Colin Perkins: Next next week, but we have the things.
Suhas Nandakumar: You're saying you're going to help write, right? For the group last. Okay, fair. I'll let you know. It might be, you know... so you've got like 2 months to show that you really need what's in there right now, or then I think it goes to an extension.
Colin Perkins: Why did you choose this instead of sticking with something like track alias? Why did you choose auth alias instead of that track?
Suhas Nandakumar: Yeah, this issue was filed a long time ago by somebody else, and I'm just closing issues. If you want to file an issue about track alias, file it. Do you expect MoQ pack to address things to not need track alias? No, not... Go read MoQ pack, or go watch the HTTP session from IETF 115.
Suhas Nandakumar: But track alias is in DBT because it's auth list, is taking compression. It's just it compresses everything, compresses track namespace, it compresses for track name. Right? Yeah, but not in the control. Auth alias is a compression for a big token. Sorry. Auth alias doesn't it has the same problems that the other one does, because it's relies on in-order uh processing in order to work, and it's also only used in the data plane, not the control plane. So, anyway, and go read go read MoQ pack, tell me what you think. Uh, I will I guess this is sort of temporarily parking this issue, but with a with a caveat. Yes. So, run on timer, right? Yes. If nobody produces is saying like, I have to have this, then, you know, and I have here's the data that shows how valuable it is to me. Like, I don't know, 2 months. Oh, no. I'm not on the fine thing. Like, let's just let us implement this stuff. You said last call, last call is apparently, first last call is going out in mid-August. Well, and if by mid-May, like... Is that your decision, or somebody else's? Okay. That that is my target. So, some people might be thinking here, like, let's work on this stuff. When I figure this out, there's a lot to be done on auth, and this is the this is the this is at the tail end of auth, yet the draft is missing most of the auth stuff, so. So, there is another reason to take it out. No, no, no. But I just think we need to figure out the auth. I mean, I'm not on team take it out or not take it out. I'm just saying trying to decide this today is makes no sense whatsoever, trying to decide on everything. I'll move on. But you're you're denying the fact that auth is missing from the draft, and we need to figure that out, right? I mean, there there's auth there's place to carry auth, there's auth there's track for auth, which is the track payload. 
There's no discussion in the draft, and I've brought this up with many times, okay, there's not an issue open on it, but it's a big topic, right? We need to say what things you auth, and when, when do you evaluate them, and... What does that have to do with compression? Well, look, if you don't have to authorize updates, this bug isn't relevant. I'm happy to move on. Um, but I just I'm not I don't agree with consensus is like we're going to have a timer on this and it comes out in 2 months. That's what you are saying. Uh, if the chairs want to say that's the consensus of the working group, I'm glad for the chairs to say that, but that's what needs to happen here if you want to put that on. Appreciate it, Colin. Thank you for correcting me, chairs. How would you run things? Let's wait, see, uh and and when it's relevant to do the kind of final next round of cleanup, we can I'm very satisfied with that. Yeah. When we get to a decision point, we'll make a decision. So, we're we're way over time on this, but I'm sorry. So, there's there's two designs, there's the current design and there's MoQ pack. And what is the relative position of those two things? We don't have time to go to talk about MoQ pack, if you would like, I can fill you in. Well, no, no. So, but what I don't want to I don't want to discuss it. What is your intent of that draft versus the My intent of that is like, if we really want to get the maximum possible compression out of MoQ in general, not just auth, but also track namespaces, etc., given our bidi stream design, I think MoQ pack is what we want. That that can happen in fact, right? I yeah, and I don't really care if it does or not. I let people decide if they want to compress. So, do you want to replace the do you want to replace the current scheme with MoQ pack in the in the in the Potentially, I when when we when people decide that compressing is a big problem, we can look at it. Not in the core draft though. 
I mean, even QPACK wasn't in technically... Yeah, but, sure. All right, let's move on. Happy to take feedback on my repo on MoQ pack, if you want to read it, you can. I'm happy to talk about it anytime, but not now. Um, Suhas, these are your slides, enjoy. Okay, I will. So, this is this was a issue opened by Magnus about. The idea is that, like, we have end subscribers and your original publishers, original publishers announce of some publish namespace, and some tracks are automatically end up end subscriber, there's no way to verify um is that publisher allowed to or authorized to publish local? Not looking to uh think clearly about this, this this is this, with today's we have uh a text that basically says that um like the spec says, receiver verifies publisher is authorized. But it's kind of a black box, it does not clearly say uh what this basically means. But if you really think about it, the way our auth works is hop by hop, we not do an end-to-end authorization uh anyway. Uh so, the idea is that we need to add uh resolution here is that add in the security considerations section, basically talking about different roles in the sense that what would be a publisher to relay, what would relay verify, uh being authorized of a publisher. And in subscribe namespace case when a namespace comes, the publish namespace matching with that subscribe track space basically helps in uh what are the auth that you do with subscribe tracks or subscribe namespace, in everything under that as factory would inherit that authorization, that trust. And same way between relay to relay there might not be, we do not define anything, right? But the expectation is that relay would once a relay authorizes a publisher on the ingress, uh if if it does not authorize a publisher on the ingress, it will not forward that uh on the on the egress. 
There might be an out-of-band mechanism where uh an application can control the the identity and authorization associations between the publisher and end subscriber, but our the MoQ core transport would not define that.
Colin Perkins: Okay. Um, so, I think we got a whole bunch of assumptions here we probably just disagree with them, we can't sign here, but, yeah, I don't think it's hop-by-hop, and the use case we described before was you're going to send a subscribe, I'm not on the on-topic list now, okay, because of course we'll call it a subscribe sends something to a relay, the relay sends it to the original publisher. And the relay might want one token, and the original publisher might want a different token that you're going to authenticate with. And this was one of the reasons that we put multiple, the support for multiple tokens in, and there might even be another relay network in there, and you had two different authorization tokens for relay networks, plus one for the original publisher. So, I don't, like, let's, like, I don't think it's hop-by-hop.
Suhas Nandakumar: No, the point is that parameters are hop-by-hop. All parameters...
Colin Perkins: No, no, but we're going to have to put statements in that these ones are copied from, you know, upstream, like lots of other things are copied.
Suhas Nandakumar: Also, all subscribes are aggregated, can be aggregated, that's why that's why parameters are hop-by-hop.
Colin Perkins: So, the the authorization of the subscriber is one of them. Like, that is one of the, I think, issues that is going to be one of the most difficult things for us to deal with in this draft before we're done. Please file those, thank you. So, uh but just to high-level, uh what I'm saying is very similar to what you're thinking. I'm not saying what I'm trying to say is that what uh once a subscriber sends an authorization open to relay, and then what relay does, let's say, under the hop is original publisher, right? Um, that token might be different. But the scope of that token for the validation is that hop. If that publisher is allowed to publish, then this subscribe will be let subscriber let to get that data. Right? Right now we say receiver verifies publisher, so what proposal here is that it speaks about what if a publisher publishes something in an auth token, what what is relay authorizing means, same thing, uh relay-to-relay will not say much. Sure. I I see your concern, but...
Colin Perkins: I think we need to get at least some design team to think about how authorization works on it. Because, like, you just to sort of clarify what I think is going to be the complicated problem to deal with, is we have two subscribers that came up with possibly different auth tokens. Right. And so, the first subscriber comes with auth token A. Subscribe, the relay is, yep, that's a great token, and sends it up to the original publisher. And then, the second relay comes, or sorry, the second subscriber B comes with another auth token B, and the relay goes, that's great. But I I like your token, it looks good. But now, I have to make the decision, should I send should I give you the data that was authorized with auth token A that's already flowing to me, or do I need to do something else to check that the auth token B is valid or not? And like, there's a bunch of things we could design in this space. Eckhart and I discussed this for many hours, there's whiteboards, like, it's not a short conversation. Um, but we'll have to make some trade-offs to make a relay work network. Can I hijack and say as chairs we can take an action item to form a design team on auth, and put this issue to that design team? I'm kind of tired of talking about it, I mean can we form the design team? So, does anyone have some on this sub-team to say that... Oh, yeah, actually I think, I think it is hop-by-hop. We don't have to form a design team, because the tokening the token is just who you are sending it to. That relay is going to have a different token to talk to the next relay. And the relay that's exiting the network and going to the original publisher, will have another token. So, I I really think auth is hop-by-hop. That's one model, but that's that's I mean, that model has issues too, so like like... It has issues, but... That's Magnus's only concern, so we're following what he... Yeah, right. Okay, next issue. No design team. Okay. Thank you. This is also auth related, yes? 
Yes. No, it's on the list, but not on. Yeah, this is also auth-related stuff, I think we can punt this. Um, okay, so, we'll move 1503 to auth as well. It sort of falls in the same space. Um, thanks for making the slides, Suhas.
Suhas Nandakumar: No problem.
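[Editor's illustration] The two-subscriber case raised in the auth discussion above can be made concrete with a small sketch. It models only the hop-by-hop position argued by one speaker: the relay validates each subscriber's token for its own hop and uses its own credential upstream. The group did not settle on this model, and the class, method, and token names are hypothetical, not taken from the draft.

```python
class Relay:
    """Toy relay for the hop-by-hop model: tokens are scoped to a single hop."""

    def __init__(self, accepted_tokens, upstream_token):
        # Tokens this relay accepts from its direct subscribers (assumption).
        self.accepted_tokens = set(accepted_tokens)
        # The relay's own credential for the next hop (assumption).
        self.upstream_token = upstream_token
        self.upstream_subscriptions = {}  # track -> token sent upstream
        self.subscribers = {}             # track -> list of subscriber ids

    def subscribe(self, subscriber_id, track, token):
        # Validate the token for this hop only.
        if token not in self.accepted_tokens:
            return False
        # Aggregate: only the first subscriber triggers an upstream subscribe,
        # and it carries the relay's token, not the subscriber's.
        if track not in self.upstream_subscriptions:
            self.upstream_subscriptions[track] = self.upstream_token
        self.subscribers.setdefault(track, []).append(subscriber_id)
        return True


relay = Relay({"token-A", "token-B"}, upstream_token="relay-token")
first = relay.subscribe("A", "video", "token-A")
second = relay.subscribe("B", "video", "token-B")
rejected = relay.subscribe("C", "video", "bad-token")
```

In this model subscriber B receives data from the subscription that A triggered without the original publisher ever seeing token B. That is exactly the trade-off the competing end-to-end view objects to, and the sketch does not attempt to show that alternative.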
Will Law: Okay, lightning round. Do we need to allow multiple ranges in a FETCH request?
Suhas Nandakumar: We have filters also. Well, okay, let me finish with look. So, now that we have filters, you can already, you can already do some of the things that you couldn't previously do with FETCH, right? Like, if I'm missing, if I detect that I'm missing all of the sub-group of, say, every odd-up, like, one of the reasons this issue was filed was so, if I had every other object in my cache, do I have to make, you know, I want to make one FETCH for like all these little mini mini ranges. Um, but now that we have filters, if all those objects go into one sub-group, you can say, like, trying to get or I already have sub-groups here, so that's one. Well, I think what you're saying is location filter will let you spell multiple Are you Does your proposal say we could make it, all the other filters all are multiple ranges. Location filter, if we do it, straight from 1401 does not support multiple ranges, but it would be more in line with all the other filters if we did that that support multiple ranges. Because the syntax of all the other filters is multiple ranges. I'll move on. Maybe that question is, do people want this? Or do people want to keep it simpler? Because you're not going to do the silly case of alternating ones, but you might have a range here, range there, and then a range there. Is there anybody who really doesn't want this? I have a fairly complex FETCH implementation and I thought about like, I'm not sure if I would ever really want to like go scan for all my gaps, and then make a FETCH for that. Like, it's more like, I mean, a lot of people do it simply, it's like, I hit the first gap, give up, get the rest of the group. Like, that's one simple strategy, and another one is like, I'll fill every hole. But like, finding all the holes is a long process, so. I mean, if you're going to do 20 FETCHes, you might as well put them in one FETCH. 
It still sings the same amount of work, that you've got to find, you know, you're going to figure those 20 out sooner or later. The mechanics is simpler, so the simpler one is just you want three ranges, make three three FETCHes. And you will get them. And does it Can it help the number of implementers? If they're like, do you mean I have to FETCH each individually? Yeah. I would just like to point out that in HTTP, we have range filters and they can be disjoint. And, and, like, this problem is, I don't want to say solved, but they are already like many implementations. It's not like pulling a gigabyte file to the edge to deliver the last 5 high-fives. And like, if it's, like, it may be split among multiple sessions, like, this problem is handled like standard switches. Okay. And I don't think anybody is like, I'm not super eager to implement it, but, okay. In, uh, how are they going to be delivered? Are they going to be delivered on one stream, or is it going to be like each range is like its own stream? I'm asking a question. What does... So, so, we have not really any PRs yet. I think, I think Victor and I were tasked to look at the spelling for the the FETCH. And my rough stab of it, which is not written down anywhere is, when you give a parameter, that is a stream. You're asking for one FETCH stream in a parameter. Within that parameter, our location filter can have multiple ranges. So, if people want four ranges on that one stream, then you're going to get a a FETCH stream with four ranges in it. And you'll send it as continuity in the middle of the of the stream. I mean, I guess it's okay because you do we think it's complicated, or do we think people are going to screw up the fact that like FETCH streams have implicit like gaps in things, and then and so like a person reading a stream is going to track of like, wait, so, I have I I asked for this, I got this, like, keep doing like incremental map, I don't know, maybe it's fine. 
Like, I'm not enthused about implementing it. The FETCH response should clarify what a gap is. The FETCH response is already required to say whether the gap is intentional or unknown or does not exist, right? So you can just say unknown. Yeah, are the ranges allowed to overlap, Group Compression Victor? Yeah. The classic, I would like objects 1 through 1000, 1, 2 through 1, 2, 2, 1, 1 through 1000, 2. Um, I would say no. That would duplicate everything, right? So, you would do the, you would do the other. Yeah, I think if you're going to put them on one stream and they overlap, I think it's wasted. Okay, so, yes, people want this, and we'll do it with the location filter in FETCH. Is that what I hear? Anybody want to object to that? Well, I mean, should we send each range on half a code or... There'll be one... I mean, you can always still send many FETCHes; no one's taking that away from you. But if you have multiple ranges in a FETCH, are they treated one by one, or are they treated at the same time? I think it's coming back on one stream. If you want them on multiple streams, make multiple FETCHes. If I send FETCH 1, 2 and 5, 6, are they sent simultaneously? Like, 1, 2... You'll get a stream with 1, 2, 5, 6. 1, 2, 5, 6, and we need to have like a continuous range in order to... They have to be fetched in order, because of the delta encoding in the filter. It's impossible to specify overlaps. So if you say 5, 6, 1, 2, you can't; you have to do it in order, because everything is delta encoded on the wire. We're out of time on this one. Okay, I think we'd... Ian, do you have it written down? I have it written down, but is the answer that we do want to support them, they come on one stream, and they can overlap? All right. Okay, not the outcome I was expecting, but at least we have a response. Um, renumber everything.
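[Editor's illustration] The remark in the multi-range FETCH discussion above, that delta encoding forces ranges to be listed in order and makes overlaps impossible to express, can be sketched as follows. This is a minimal sketch under stated assumptions: locations are simplified to plain integers rather than group/object pairs, each range is encoded as an unsigned gap from the previous range plus a length, and the function names are hypothetical. It is not the wire format from any PR.

```python
def encode_ranges(ranges):
    """Delta-encode inclusive (start, end) ranges as (gap, length) pairs.

    Both fields are treated as unsigned, so a range that starts at or
    before the end of the previous one cannot be represented.
    """
    encoded = []
    prev_end = -1
    for start, end in ranges:
        if end < start:
            raise ValueError("range end precedes start")
        gap = start - prev_end - 1
        if gap < 0:
            raise ValueError("ranges must be ascending and non-overlapping")
        encoded.append((gap, end - start))
        prev_end = end
    return encoded


def decode_ranges(encoded):
    """Invert encode_ranges."""
    ranges = []
    prev_end = -1
    for gap, length in encoded:
        start = prev_end + 1 + gap
        end = start + length
        ranges.append((start, end))
        prev_end = end
    return ranges


wire = encode_ranges([(1, 2), (5, 6)])   # the "1, 2 and 5, 6" example
round_trip = decode_ranges(wire)

try:
    encode_ranges([(5, 6), (1, 2)])      # out of order, as in "5, 6, 1, 2"
    out_of_order_ok = True
except ValueError:
    out_of_order_ok = False
```

With an encoding of this shape the receiver never has to check for overlap, because no sequence of unsigned gaps decodes to overlapping or descending ranges.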
Uh, I don't think we're going to do anything immediately, but yes, we plan to renumber everything. When the time comes, we will make sure all the enums start at zero, are contiguous, and are sorted in an appropriate order. So, we'll report this issue. Ian. Yes. Your favorite issue. Yes, um. Okay. Flow control. This PR is for flow control. We want flow control. It limits the total number of streams and the total number of bytes sent on a subscription. It uses control messages on the bidi stream. Now that we have the bidi stream, it doesn't require we send stream, matter of anything. It's actually quite straightforward. Alan wrote another PR that is actually, I think, a slightly better PR, but they're basically the same proposal. This is mine, and more performance in terms of property. They, I mean, it's not like why is my concern with the same things but with something in terms of, if you only want to do bytes, for example, you can only do bytes, and then you end up with something that's very similar to, you know, standard stream flow control on a single stream, but it's across multiple streams. So, in the past, the working group has not expressed a ton of enthusiasm for it. If we did want to do subscribe side flow control in this in the past, I think we probably have to do this. Um, yeah. And so, I mean, my takeaway from what the working group wants is that probably closing this or moving this to an extension or something makes the most sense. So I want to give it one more go-round before we go down that path. I'm not finding this PR, where is... What's the number? It's not 11, I'm sure. PR 11 is in my repo, afrind/moq-transport. Low numbers come from... You have to link them, yeah, sorry; if you click that link, you just go to GitHub, afrind/moq-transport. You'll find all kinds of secret stuff I work on, when I dream up things that I don't want to share with people.
But 1591 is this issue. If you're looking for the content, it should be fine. I mean, I guess I could put feedback on the message, but I guess the thing is, if we're going to do something like this, I think limiting the rate would also be really important. I'll have Magnus define how to measure the rate. That's possible, but that would be Magnus's problem. I mean, you can kind of limit the rate by limiting the bytes. Yes. But, like, I have no idea whether a conference call, when it starts, is going to last 5 minutes or 5 months. Like, literally, they go for 5 months. So I can't put in a reasonable limit for bytes, but I could put in a reasonable limit for rate. No, no. But you keep feeding it credit, so as you consume... say you have like a... Yeah, I understand. Okay. All right, so you just keep feeding it credit, so functionally it works. I mean, it's a little bit tedious, but you don't have to worry about... I mean, you have to ack packets too, to get congestion window back, so it's, you know, a similar amount of work. And I wrote one of these PRs but I don't remember anything about it. So, if you're out of credit on a subscribe, you still queue it, and then, if credit arrives, it just flushes late. Yes. And it's still subject to delivery timeout. Yes, and at some point you do deliver, like, too far behind. And then you can still get too far behind; that counts against your queue. Yes. So, I mean, it basically gives you the same flow control you get with a FETCH, but on subscribe; that's the intent. I'm imagining you had FETCH flow control on subscribe. I have not seen a lot of people jumping up and down saying, like, I really want to do this.
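[Editor's illustration] The behavior described in the flow-control discussion above (the sender queues when it is out of credit, flushes when credit arrives, and queued data stays subject to the delivery timeout) can be sketched roughly like this. It is a minimal sketch, not the logic of either PR: it tracks byte credit only and leaves out the stream limit, and the class name, method names, and time units are all assumptions.

```python
from collections import deque


class SubscriptionFlowControl:
    """Toy sender-side byte credit for a single subscription."""

    def __init__(self, byte_credit, delivery_timeout):
        self.byte_credit = byte_credit            # bytes the receiver has granted
        self.delivery_timeout = delivery_timeout  # max queueing age, same unit as `now`
        self.queue = deque()                      # (enqueue_time, payload)
        self.sent = []                            # payloads actually delivered

    def send(self, payload, now):
        # Always queue first; the payload leaves only if credit allows.
        self.queue.append((now, payload))
        self._flush(now)

    def add_credit(self, nbytes, now):
        # The receiver "keeps feeding it credit" as it consumes.
        self.byte_credit += nbytes
        self._flush(now)

    def _flush(self, now):
        while self.queue:
            enqueued, payload = self.queue[0]
            if now - enqueued > self.delivery_timeout:
                self.queue.popleft()              # expired while waiting for credit
                continue
            if len(payload) > self.byte_credit:
                break                             # out of credit: keep it queued
            self.queue.popleft()
            self.byte_credit -= len(payload)
            self.sent.append(payload)


fc = SubscriptionFlowControl(byte_credit=4, delivery_timeout=10)
fc.send(b"abcd", now=0)    # uses all four bytes of credit
fc.send(b"ef", now=1)      # queued, no credit left
fc.send(b"gh", now=5)      # queued behind it
fc.add_credit(2, now=12)   # b"ef" has expired; b"gh" flushes late
```

The rate-limit concern from the discussion shows up here as well: nothing bounds how fast the receiver grants credit, so a rate limit would have to be layered on by whoever calls `add_credit`.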
So, I think maybe the thing to do is have it in an extension and maybe people will play with it and find that there's useful and MoQ V2, you know, just like whatever, Speedy V2 didn't have flow control on application. We should have flow control. I sort of feel like this is like, you know, the other big topic that I think is the DDoS stuff we haven't really thought into the draft yet, or whatever. It's like, this may be a very plausible solution to resource exhaustion fights that we have identified in the DDoS draft. Seems like we haven't like, again, I mean, we need a design team for DDoS too, we have a design for DDoS. But they haven't talked about this, this is off in the this is off in the subscriber to the relay. This is not how we've you can sort of throttle them in. I I look I don't I I had always imagined there would be some way to rate-limit things, like rate-limit data, not not control. Control messages will not be how the systems kill. Um, so I thought there would be something you could do with rate limiting data and this seems like sort of it, though I can't read I haven't read the PR, obviously. Okay. Um, I don't know. We can repark it for like and actually like, we can plan to talk about next performance draft in more detail, I'm happy to. Like, I can make a slide that goes into more detail and like, update my PR to take whatever Alan's things that I think are good and, no. So, so, so, you're trying to simulate what you get with FETCH. So, you're just trying to aggregate all the sub-groups and saying, for this whole batch of sub-groups, here is your here is your limit. Yeah, here's the total number of bytes. Okay. You you can send. Yep, and also stream limit too if you want to use that. 
So, say say you're in a situation where like, I have 100 streams and I want to give this subscription 20 of them but not more than 20 then you basically can make sure that those subscriptions doesn't consume more than 20, which for a relay is is kind of nice, because you know, you're going upstream, you're trying to like, approximately fair share probably between multiple subscriptions. I mean, this is mostly realistically probably really valued with relays where you have, you know, a lot of people who are competing with on the same resources, you potentially, like someone downstream might like just stop consuming, or like, you know, there is bandwidth. Because right now you're only hammer is unsubscribed, right? Right now your only hand, well, you have you have flow, you have flow control on a subscribe, if you don't want the other person to send, you have flow control on stream. Every stream, yes. It stops and then, FETCH is a stream so it has flow control, right? Like, that's it just has it because it's is because it works. Okay. But subscribe is not. Um, do, I was going to say like, what are the like, since this has come up a few times and people have so, it sounds like maybe now people are more interested in this, like what can we do to make sure that we actually advance the ball? Um, that's why I say, do what, how about we we completely on this, I'd rather get our auth stuff straightened out first before we did did the DDoS. This stuff because I think that the auth stuff solves lots of that, so like, if I was sequencing these, I would get auth done in the draft as of something coming up fairly soon, once we got all the currents, I get all the current stuff we have in play landed, right? And then start the auth discussion, and then start the the the rate limiting discussion. Victor's in queue, also. What's up? I didn't see. Oh, I'm sorry. Okay. Ah, yes, this is the description, is sounds like a real problem, the solution does not solve it. Why? 
It does not authorize all, it does not prove that we do not have flow control for subscription. We do have flow control for subscription. The subscription are implicitly flow control for here, in a sense that the receiver has to consume data and as long as it's generated, or eventually you will get timeouts or so to far behind, and sensors because of nature of things which transmit by subscription you kind of have to consume it as of rate, and that's why is flow control because that's how the the rate of the receiver gets to like, the rate of the sender. Now, is there by the other issues with like, someone sending traffic that is too high rate for the network and things like that, and we might need to have support for that, but I don't think we've, like, that's that sort of out-of-sync problem like, you know, Ian was saying, use some sort of rate limit, or like, we we definitely need that, I mean even H3 do we do some stuff that mostly avoid that, but like it's a real problem at the... I can make a slide that goes into more detail and like, update my PR to take whatever Alan's things that I think are good, and... No, no, no.
Suhas Nandakumar: I'm on to the next.
Will Law: Oh, you're up on... These are your slides. Enjoy.
Colin Perkins: Again, I really like that proposal, but the implementation that I did 3 years ago, when I did the rewind window, I did it based on that. Exactly timestamps. The most complex and then I knew I was at second, and I requested the relay, second. So, but... When I see timestamp in the media space, close my mind, because, as you know, there are a lot of timestamps: related to transport, related to media, related to audio, related to video. There are a lot of clocks. Not a lot, but a few clocks, that you can sync. So... This proposal in general, I like it, but I think if we want to provide people some best practices for how to do it, we should be much more accurate in defining what a timestamp is in this. Yeah.
Torbjörn Einarsson: In MSF, we have defined the media timeline and the event timeline, which are specific tracks that give the media PTS location in wall-clock time. So it brings kind of exactly this, but in a separate track. So, if we need to have something like this, it would mean that the relay could act based on the property it gets from the object to do something, but I don't know if that is one of the use cases.
Ian Swett: Okay. I know there are mixed opinions. I'm just not sure exactly of the disposition. It sounds like people are interested in hearing more. Maybe we'll schedule it for future virtual or okay, maybe ask for the end time, I don't know. I just want to make sure people come prepared, so that we don't get the same discussion again.
Ian Swett: We can prepare and think at the end we can make a PR 11 and make a new PR out, and that will help us review, actually. I can make...
Suhas Nandakumar: It is because, we can we can write it in the...
Victor Vasiliev: Yes, is, is put in a...
Ian Swett: No, no, no, I think the other way is better, which is, we just need a way to, right? And let the auth design team figure it out, and I think it can be message-sized, and finally, yeah, it works. So, the other point I want to make in this space is that, the way we've structured it now, if it happens to be an immediate request error, we have this reason phrase, but if it's an error that happens later, it's a reset stream, which has no error phrase, and nobody really cares. So, why do we need a reason phrase? I don't know, I don't want to spend forever paying the lawsuit here, like, yeah, okay. I kind of think we ought to have it; I don't know what other people think. I guess, you know, H3 doesn't have them in certain places, or QUIC doesn't; they are in some places but not others. Someone just needs to make a call. I don't know, Colin, if we just said it's binary data, do your best, does that resolve all of your issues about UTF-8? It sure does, because then people don't try and print it. They'll print it anyway. I will. But, okay, at least you sort of have the security group, and the number of attacks we saw that relied on being able to reverse the bidirectional log message right over top of it and then go forward again; UTF-8 allows you to do all of those things. I see, okay. I think the text we have for UTF-8 in this particular thing was lifted straight from HTTP/2, because they have a place for a reason phrase to appear. So we've at least tried to copy somebody who's more past than we are. I'm not saying that we won't have more trouble. I don't want to, time is over. How should we resolve, chairs? Resolve this? Yeah. I think I heard a lot of, I think people asked for the debug stuff, I've heard that.
I don't think any of the reason phrase haters have a problem with keeping this through the interop phase. Right, Colin? I mean, I don't No, no, no. Here is the concrete proposal going forward. We log in the draft, we put a message in the draft before this gets before this is last called, we'll make we'll make a we'll make a call in the working group of whether we're going to remove it or not. Like, that's just a way, like, punt it to the like, a like, punt it. I mean, we're out of time and I want to give I want to give you time to talk about timestamps. So, like, I don't know in this I'm okay punting it a little bit, but like, who cares? I'm good. I mean, let's just make a decision. Yes, what? Do we want to remove the UTF-8 text? I guess is one question. That's a separate question. Okay, well, why don't I don't think anyone cares strongly. We we could flip a coin, we could we could raise hands, like, I mean, like I would like to see them go away at the end. I think Colin would, but, we're not going to die, we're not going to like, yuck anyone's yum if they really want to have them in there from... If, if, if we're going to remove it, then, then aren't some of the, like, what what I said, which is, we just need a, we just need a reason codes to be bigger and more detailed than just error, protocol violation, you know, go on. Good luck to you guys. Yeah, I guess that's my point. I need somewhere to put something more detailed than just like, "error" is too active. But, but, but, but our reason codes space is big, right? It's like, it's in a 55-m, you can make as many as many codes as we want. I'd rather not actually make more codes, I'd rather, whatever, we do something like what I said, which is, put a line, whether a line, like, this is the line, or see whatever. You can have 9,000. If, if we're going to have rich feedback, like, just leave it as it is. That's what I say. 
I have no, nobody said, I want the binary, I want the text or binary stuff, to have to have it. Nobody said that, right? I mean, I want no debug, no text. There's one one there, but, that's for expected behaviour. And, and we've written it as binary. But nobody cares about whether there is anything in the UTF-8 or binary, right? Nobody said, "I need data in there." right? It's "no" for everybody. So, so, you're saying, "Don't send reason phrases?" I absolutely do. Do, do people care about it? I care. At least, for now, I care. I have definitely, we've got at least one key, I mean, for H3, I do, like, I mean, I use them. So, all right, if, if, if the desire is to have this really rich feedback, uh, like, with a million codes, uh, I would say like, let's just leave it as it is, like the point is to to keep it, to find the feedback with something, and not mess with the protocol. We're just going to have to recreate as a new structure. So, just leave it as it is. That works for me, Colin. We're switching this to binary data allow us to make it in the protocol that we anticipate already. Oh, yeah, yeah, yeah, because I mean, if, if, if, if you like, oh, I mean, that, I'm just sort of saying that, the, the, the, the problems are is, when you use UTF-8, you're supposed to scribe, uh, normalization form, and we don't, like, if we want to do the things that you're supposed to do when you do UTF-8, like, have normalization form, and things like that, it'll go through no problem. We just haven't done all of the things that UTF is required to do. Which is why US ASCII would be much simpler because then you don't have those requirements, or binary. Or, or like, this is a binary data like, you already have a 9-byte binary field. Okay, I want to give the rest of the time to Ian, like, error code is the same 9-byte binary. You will love arguing about timestamp more than you love arguing about... All right, Ian, you're up, and you have, uh, 12 minutes. 
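To make the concern about printing UTF-8 reason phrases concrete, the following is a minimal Python sketch of how a receiver might render an untrusted reason phrase before writing it to a log. It is not taken from the draft or from any implementation mentioned in the session: the function name `render_reason_for_log` and the 256-byte cap are assumptions made for the example. It applies NFC normalization and escapes control and format characters, which includes the bidirectional overrides that can visually reorder a log line.

```python
import unicodedata


def render_reason_for_log(raw: bytes, max_len: int = 256) -> str:
    """Return a log-safe rendering of an untrusted reason phrase.

    Hypothetical helper: the name and the max_len default are assumptions.
    """
    # Treat the field as bytes first; invalid UTF-8 becomes U+FFFD.
    text = raw[:max_len].decode("utf-8", errors="replace")
    # Normalize so equivalent sequences render the same way.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        # Cc = control characters (newline, escape, ...).
        # Cf = format characters, which include the bidi overrides and
        # isolates such as U+202E and U+2066..U+2069.
        if unicodedata.category(ch) in ("Cc", "Cf"):
            out.append(f"\\u{ord(ch):04x}")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(render_reason_for_log("ok\u202egnp.exe".encode("utf-8")))
```

Treating the field as opaque bytes on the wire and escaping only at display time is one way to reconcile the "binary data, do your best" position with the observation that people will print it anyway.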
We've talked about this some in the past, but I think we now have more experience to have a way more informed conversation, and maybe we get a real direction out of it, so I will try again. Conversations around DTS, we believe, have been very helpful in providing some real-world sense of how people are actually going to do track switching and such. Okay. So, potential use cases for a timestamp across tracks: audio and video, for example, where you're trying to keep them approximately in sync. This is particularly valuable if you have a smallish jitter buffer, not like 15 milliseconds but more like 2 seconds, where it's very easy for the audio to get a good bit ahead of the video if you're not careful and you have a lot of audio. But it could also be very helpful for keeping tracks in sync that are from different productions. So, video conferencing, or any other time you have two different video feeds and you want to keep them approximately in sync from a delivery perspective. So again, this is all about delivery, this is not about playback. This is just there to try to make sure that you don't have buffer under-runs because you send all of one thing and none of the other. So, timestamps across tracks are a difficult-to-impossible problem in the extreme sense, but as I think Colin kind of alluded, perfection's probably not required here; this is just for delivery. And so, if we can get sufficiently accurate timestamps, it actually might still be a net win. And also, there are a lot of use cases where, even if the time is not 100% synchronized, maybe they're all from the same data center and the same region or same like rate, right? I mean, these clocks can be reasonably close, maybe where they're transcoded, things like that. 
So, there are a lot of use cases where we might be able to use this. And I wanted to say briefly on DTS: it currently uses group alignment, which seems to work fairly well. It probably could use a timestamp if it's available, but I think DTS has proven that, at least for that one use case, group alignment is probably sufficient. Timestamps aren't required, even if they could be used. Yeah, that was going to be my comment. DTS is just requiring group IDs to be consistent; they can all have completely different media times or something. It doesn't matter. Yeah. And also, when you're syncing media, you're not going to do it with the timestamp carried on the delivery item. You're going to do it with a much more accurate time signal that's embedded in the encoding, like MPEG systems have for that, or a time base or presentation set, and sync it right. So, this timestamp would be useful if we had, say, filters or something that is a property of transport: I only want objects that are signaling this timestamp. But that timestamp is just a number, and you can already define a property which is a number. So is this really about just standardizing a property and saying it is a timestamp and here's how you write it? Uh, yes, basically, things like that. Or is there some other issue?
Suhas Nandakumar: I agree with that. Yes, I'm in favor of some, I think we are not at the point of, I think we have options there, because in Privacy Pass, in the first part of the setup you will not get a challenge, and then the relay would ask you for some metadata. We wanted a way, in setup, when setup fails, to come back with the challenge. And there are two ways to do this, right? Do something like this, or define a new auth error. Kind of like this. Setup doesn't have a response. Setup is just, but I think this is, again, in the category of stuff where I was like, when we get to auth in the draft, we'll fix this, okay, but I don't think we should do this with an error. I mean, pretty much every auth token scheme that isn't a bearer token has a challenge phase. And so, anytime you send any request, you're going to need something that looks like an auth challenge to it, that needs to pass back an authorization. So, that's a separate message. But spec-wise, that's a different message. Yeah, or something like that. But we need to design for this one way or another; we have to support this, right? Okay. Yeah. Does that work for you if we say, okay. All right, so we'll close this. We're not going to do an error. I think we need to figure out how to spell this, but not this way. Okay, that's my goal. Okay, we'll close that PR. I think the issue is to say we need a way to have this, right? And let the auth design team figure it out.
Suhas Nandakumar: I think there is value in just standardizing something because it comes in different forms in different applications.
Suhas Nandakumar: Yeah, that's...
Suhas Nandakumar: Yeah, I I think this is a layer violation, um, but...
Suhas Nandakumar: All the use cases cited are applications for syncing media. And as Jordi just said, different applications write different timestamps to media. They are application-defined. Are we going to arbitrarily pick some of them and put them on the transport protocol, which is what we're defining here? As much as I use and need these timestamps, I don't think putting them into the transport protocol is best for them. They're an application utility. They can go in custom-defined properties that the application fills in.
Suhas Nandakumar: Fair question. But what if we want the transport protocol to try to deliver them more effectively? Would it be better then to just have the ability to point to an existing field and say that this field somehow allows you to synchronize between multiple tracks?
Suhas Nandakumar: You call it a sync field?
Suhas Nandakumar: Sure, fine, that's fine.
Suhas Nandakumar: Sync field.
Suhas Nandakumar: Okay. Or...
Suhas Nandakumar: And you can just point at anything. It's not time. Yeah, it's a sync, it's a tech-agnostic sync field. We like physics, right? We deliver boxes. We don't have this call closed.
Suhas Nandakumar: That, that would be fine by me. No, sure, I I think that's valid, Colin.
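The "sync field" idea discussed above can be illustrated with a small Python sketch. Nothing here comes from the draft: the `Obj` type, its `sync` attribute, and `interleave_by_sync` are hypothetical names, and the sketch assumes each track is already ordered by whatever opaque number the application designated as its sync field. The sender never interprets the value as time; it only compares values across tracks to decide what to deliver next.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Obj(NamedTuple):
    """Hypothetical object carrying an application-chosen sync value."""
    sync: int       # opaque to the transport; need not be a timestamp
    track: str
    payload: bytes


def interleave_by_sync(*tracks: Iterable[Obj]) -> Iterator[Obj]:
    """Yield objects from all tracks in non-decreasing sync order.

    Assumes each input track is already sorted by its sync value.
    On ties, objects from earlier-listed tracks come first.
    """
    return heapq.merge(*tracks, key=lambda o: o.sync)


if __name__ == "__main__":
    audio = [Obj(0, "audio", b""), Obj(20, "audio", b""), Obj(40, "audio", b"")]
    video = [Obj(0, "video", b""), Obj(33, "video", b"")]
    for obj in interleave_by_sync(audio, video):
        print(obj.sync, obj.track)
```

This only addresses delivery order, matching the point made in the session that the goal is to avoid sending all of one track and none of the other, not to drive playback.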
Suhas Nandakumar: The layer violation thing, I love it. I actually want to use this for something different, which is: we debug all the time using timestamps, trying to figure out what's happening across the relay networks, and between the more whatever, and often relays will statistically sample a small percentage of the timestamps and report them up to metrics servers, and things like that. Okay. And RTP explicitly designed the timestamp to not be encrypted from intermediaries; it's explicit, so intermediaries could see it, which is really wild. Highly non-encrypted: I don't even mean it's inside encrypted packets, it should be bare on the internet, even when you're using DTLS-SRTP, because it's such a useful debugging tool. Okay. So I'd sort of go the direction, and maybe it's a different extension, maybe we end up with two, that it'd be useful to be able to have an NTP absolute timestamp of what we think the time of this packet is, that you can statistically drop into some of the objects, or all of the objects if you felt like it, and then it can be used for this as well as for metrics and debugging processes; maybe I'm trying to combine two things that are different. And I agree it's a layer violation, but I think this one timestamp is so critical to debugging real-time flows that it is worth having the layer violation for timestamps.
Suhas Nandakumar: These are send timestamps, minted by the sender?
Suhas Nandakumar: Yes, at some point, you do deliver too far behind. And then you can still get too far behind; that counts against your queue. Yes. So, I mean, it basically gives you the same flow control you get with a FETCH, but on subscribe; that's the intent. Imagine you had FETCH flow control on subscribe. I have not seen a lot of people jumping up and down saying, I really want to do this. So I think maybe the thing to do is have it in an extension, and maybe people will play with it and find that it's useful, and MoQ v2, you know, just like, whatever, SPDY v2 didn't have flow control on application. We should have flow control. I sort of feel like this is, you know, the other big topic, which I think is the DDoS stuff we haven't really thought into the draft yet, or whatever. This may be a very plausible solution to the resource exhaustion fights that we have identified in the DDoS draft. Seems like we haven't, again, I mean, we need a design team for DDoS too; we have a design for DDoS. But they haven't talked about this; this is off in the subscriber to the relay. This is not how we've, you can sort of throttle them in. Look, I had always imagined there would be some way to rate-limit things, like rate-limit data, not control. Control messages will not be how the systems kill. So I thought there would be something you could do with rate limiting data, and this seems like sort of it, though I haven't read the PR, obviously. Okay. I don't know. We can repark it, and actually, we can plan to talk about next performance draft in more detail, I'm happy to. I can make a slide that goes into more detail and update my PR to take whatever Alan's things that I think are good and, no. So, you're trying to simulate what you get with FETCH. 
So, you're just trying to aggregate all the sub-groups and saying, for this whole batch of sub-groups, here is your limit. Yeah, here's the total number of bytes you can send. Okay. Yep, and also a stream limit too, if you want to use that. So, say you're in a situation where I have 100 streams and I want to give this subscription 20 of them but not more than 20; then you basically can make sure that this subscription doesn't consume more than 20, which for a relay is kind of nice, because you're going upstream and you're probably trying to approximately fair-share between multiple subscriptions. I mean, realistically this is probably mostly valuable with relays, where you have a lot of people who are competing for the same resources; potentially, someone downstream might just stop consuming, or, you know, there is bandwidth. Because right now your only hammer is unsubscribe, right? Well, you have flow control on a subscribe: if you don't want the other person to send, you have flow control on the stream. Every stream, yes. It stops, and then, FETCH is a stream so it has flow control, right? It just has it because it works. Okay. But subscribe is not. I was going to say, since this has come up a few times, and it sounds like maybe now people are more interested in this, what can we do to make sure that we actually advance the ball? That's why I say, how about we, on this, I'd rather get our auth stuff straightened out first before we did the DDoS stuff, because I think that the auth stuff solves lots of that. So if I was sequencing these, I would get auth done in the draft as something coming up fairly soon, once we get all the current stuff we have in play landed, right? 
And then start the auth discussion, and then start the rate limiting discussion. Victor's in queue, also. What's up? I didn't see. Oh, I'm sorry. Okay. Ah, yes, the description sounds like a real problem; the solution does not solve it. Why? It is not true that we do not have flow control for subscriptions. We do have flow control for subscriptions. Subscriptions are implicitly flow controlled here, in the sense that the receiver has to consume data as long as it's generated, or eventually you will get timeouts or fall too far behind, and since, because of the nature of things which are transmitted by subscription, you kind of have to consume it at that rate, that is the flow control, because that's how the rate of the receiver gets matched to the rate of the sender. Now, there are other issues, like someone sending traffic that is too high rate for the network and things like that, and we might need to have support for that, but I don't think we've, that's that sort of out-of-sync problem, like, you know, Ian was saying, use some sort of rate limit, or, we definitely need that; I mean, even in H3 we do some stuff that mostly avoids that, but it's a real problem at the... I can make a slide that goes into more detail and update my PR to take whatever Alan's things that I think are good, and... No, no, no.
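The per-subscription limit described above (a total byte budget across all sub-groups, plus an optional cap on streams) can be sketched as a simple credit counter. This is an illustration only, not the mechanism in the PR under discussion: the class `SubscriptionCredit`, its fields, and the choice to treat the stream limit as a cap on concurrently open streams are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SubscriptionCredit:
    """Hypothetical sender-side view of limits granted for one subscription."""
    max_bytes: int        # total bytes allowed across all sub-groups
    max_streams: int      # assumed to cap concurrently open streams
    sent_bytes: int = 0
    open_streams: int = 0

    def can_open_stream(self) -> bool:
        return self.open_streams < self.max_streams

    def open_stream(self) -> None:
        if not self.can_open_stream():
            raise RuntimeError("stream limit reached for this subscription")
        self.open_streams += 1

    def close_stream(self) -> None:
        self.open_streams = max(0, self.open_streams - 1)

    def sendable(self, want: int) -> int:
        """Return how many of `want` bytes fit in the remaining credit."""
        return max(0, min(want, self.max_bytes - self.sent_bytes))

    def on_sent(self, n: int) -> None:
        self.sent_bytes += n

    def grant(self, extra_bytes: int) -> None:
        """Receiver raises the byte limit, analogous to a window update."""
        self.max_bytes += extra_bytes


if __name__ == "__main__":
    credit = SubscriptionCredit(max_bytes=1000, max_streams=2)
    n = credit.sendable(600)
    credit.on_sent(n)
    print(credit.sendable(600))
```

A relay could keep one such counter per downstream subscription to approximate fair sharing, which is the use case raised in the session; whether this adds anything beyond the implicit pacing of a live subscription is exactly the point Victor disputes.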
Session Date/Time: 12 Jun 2026 08:00
Alan Frindell: Okay. Good morning, everyone. This is the beginning of day two of the Media over QUIC interim. Those of you who are new, welcome. We'll jump right into it.
This is the IETF Note Well. We have some new people here, so I'll run through it very briefly. The Note Well is a collection of policy documents that explain the intellectual property implications of you attending here at IETF, as well as outlining some of our code of conduct. If you're unfamiliar with these documents, I encourage you to type "note well" into your favorite search engine and read up on some of these details.
Okay, meeting logistics. We had a few hiccups today, basically because we were unfamiliar with using Meetecho in a hybrid interim environment, but I think those will be ironed out today. I encourage all of you, whether or not you're remote or present, to log into Meetecho, partly because you will need to do it for various queueing functions, but also it registers your attendance here for the so-called blue sheets, as well as giving you access to the chat.
So, there are numerous ways to log in. Possibly the easiest one is to use your device to scan that QR code on the left screen there, which will give you access to the light client, and will not serve you the video of the meeting you're physically attending. But it will give you the chat and other functions that you need. Alternatively, you could go to the Datatracker page, datatracker.ietf.org, and look for upcoming meetings, or go to the MOQ working group page, and that will give you a link to the full client that runs on your laptop. If you do run the full client, please mute your speaker, or else we will hear an echo of everything we say here coming out of your speaker.
Okay, so we're going to have the queueing work more or less like we did yesterday. We're going to allow some amount of spontaneous discussion here. If you just have a quick response, quick question, and there's not a queue building, go ahead and just speak out. Obviously, if you're remote, you're going to have to raise your hand using the "raise your hand" button on the tool. I think the principle that I think we've come up with is, if you feel like you need to raise your hand here because things are backing up, just raise your hand in the tool. That's a much easier way to try to manage it. We don't have a whiteboard or anything, so that's what we're going to do for queueing.
If you are remote and you choose to speak, I strongly encourage you to put on a headset. The echo cancellation properties of Meetecho are not great, and we'll have to mute you progressively if you're participating in a back-and-forth conversation. There's about a one-second or so lag when you unmute yourself, so do that a little bit before you plan to speak if you're a remote attendee.
This is today's agenda. There's going to be quite a bit of MOQ time, in keeping with our theme of trying to get through the majority of the outstanding MOQ issues we have. But before we get there, we're going to talk a little bit about the Top-N, DTS, and switch work that has been presented over the last four weeks or so: some updates on that, some follow-ups, and some outstanding issues. Then we're going to have MOQ. We have a lunch break where you're going to have to go out on the economy and find your own food. We're going to have a hard stop at 2:30 because one of our MOQ editors is leaving, and then we're going to spend the remainder of the time, the last 90 minutes or so, talking about some of the other adopted drafts we have. I made a mistake here: I believe we're going to do secure objects before privacy pass. I believe that's what we discussed yesterday; I don't know why I typed the opposite there. So, Suhas, at best you'll get about 10 minutes, and given how we go, you may not get any time at all, and thank you for being flexible with that. I want to thank all of the authors of those drafts for being very understanding about maybe punting some issues to later, so we can really focus on MOQ this week, which has been great, because I think we're all ready to have that document finally.
Okay, are there any other issues?
Cullen Jennings: No, not really. I mean, I think MOQ is going to fill the time allotted, and, unusually for this meeting, I'm going to be pretty strict about the deadlines on those early issues so we can start at 11:15, and then we'll go until we run out of time. But if we start with MOQ, we're not going to get off MOQ, that's kind of how this goes, right? So, I want to make sure these other people have an opportunity. Would anyone like to bash this agenda, as I'm eating into my speaking time?
Okay, with that, I am going to stop talking, and Mike will talk about how interop went this week.
Mike English: Alright, so Mike English from NetEase. This is the Interop Report for this interim. Our interop target this time was draft-18. Here is what the latest results from the automated interop test runner look like. We still have a good number of failures that are not necessarily protocol-related, but just kind of setup-related. So, we continue to resolve those issues. We had a number of things get fixed again, because we're building images and those types of things. So, we're working through things, but we have 16 tests that ran at target, meaning at the draft version that we're targeting. And a number of other tests that ran for previous drafts.
This is a very small picture of a lot of tests that were run. If we zoom in to some of these, which I can do in a moment, you can see a bit more detail about what the tests are and what draft versions the different implementations negotiated. These are some of the specific tests that were done with draft-18.
And because there's been some kind of pushback on putting things in the automated test runner immediately on, you know, kind of our first rev of doing interop testing on the new version, I also started a wiki page to kind of collect the ad hoc interop reports. Previously, we had a spreadsheet that was somewhere in between these two approaches. And we were trying to be meticulous about defining test cases and then having people self-reporting. But there was a lot of ambiguity about what the test cases really meant and inconsistency in reporting, so I felt like that was kind of a false sense of precision that we had in that spreadsheet.
So, the goal is still to get as much as possible into the automated test runner so that we have, you know, structured, you know, formalized results. We can tell exactly what version of what software, you know, ran which test, that was specified, and we can get something concrete from that. But I think it may be useful to also kind of collect this type of information that's just kind of, you know, people running whatever they're able to run, and noting what worked and what didn't.
So, two questions to discuss: Should we keep draft-18 as the interop target for Vienna? I am suggesting yes. And two, how can we make the automated test runner better or easier to use?
Suhas Nandakumar: On the second one. Maybe this is already there, I haven't checked, but is it easy to interop or run a web client into the interop runner? That was one of the missing things we wanted.
Mike English: Yes, that's a good point. That is a limitation that we still have. We don't have a good way to drive web-based clients in the interop runner.
Suhas Nandakumar: Second question: I've watched the interop, I don't know the exact APIs, but are we checking for the signaling handshakes only? At some point, we should also think about whether we have a media-centric test, sending media to the other side, because some of the things that I was working on in MoQ were streaming video. At some point, not right now, we should figure out how we get some of those.
Mike English: Right. Yes. So, the long-term goal would be to actually have a way to test further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: Um, I was going through draft-18. I hadn't actually read the draft since 14, and I was just reading draft-18, and I think I saw, roughly counting, somewhere between 50 and 70 of what I viewed as features or functionality things in draft-18. I think we're at the point, as we're starting to get closer to working group last call, where we should start identifying all of those features and then somehow, and I'm looking for your guidance on it, having a spreadsheet to check how many implementations of each we have, or whether they work, you know? Because I think a lot of our testing here is like, "Yeah, basic stuff worked. Yes."
Mike English: Um, yeah, and let me actually dig in a little to that. So, these tests, with they and running, if we zoom into one of these tests, is self-tests. This is, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, like, the Lorenzo's implementation, for example. So, we're able to exchange setup, but this test that's called "publish namespace only" fails. And this test that has "subscriber." So, the idea is that there are these specified test cases, and these are specified in the repo here. So, in docs, tests, this test cases file has each of the tests that's implemented specified. So, there's different features and, you know, how you actually exercise that systematically. So, the idea is that we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm. It says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. That was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
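[Editor's illustration] The "send this and expect that" idea described above can be sketched as a small table of test cases run against a peer. This is a minimal sketch under stated assumptions, not the actual interop runner: `InteropCase`, `run_case`, `toy_peer`, and the message strings are all hypothetical placeholders, and the strings are not the MoQ wire format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class InteropCase:
    """One specified test: what the test client sends and what it expects back."""
    name: str
    send: str
    expect: str


def run_case(case: InteropCase, peer: Callable[[str], str]) -> bool:
    """Send the case's message to the peer and compare the reply to the expectation."""
    return peer(case.send) == case.expect


def toy_peer(message: str) -> str:
    """Hypothetical implementation under test that only supports setup."""
    replies = {"SETUP": "SETUP_OK"}
    return replies.get(message, "ERROR")


# Hypothetical cases, loosely named after the tests mentioned in the talk.
CASES = [
    InteropCase("setup-only", send="SETUP", expect="SETUP_OK"),
    InteropCase("publish-namespace-only", send="PUBLISH_NAMESPACE",
                expect="PUBLISH_NAMESPACE_OK"),
]

results = {case.name: run_case(case, toy_peer) for case in CASES}
print(results)
```

With this toy peer, the setup case passes and the publish-namespace case fails, which is the shape of result described for one of the implementations above. Expanding feature coverage then amounts to adding rows to `CASES`.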
Jordi Cenzano: I want to make sure, yeah, I just got, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. So, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Mike English: Right. And yeah, and I want to think through, you know, how do we write the tests that have the most value first, and kind of slowly sequence those things in a way that like makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests, and like the tool that I wrote in like November is like, entirely data plane tests, right, where it checks all of the features, like, there's 41 different tests against like the different things you can do in the data plane. So, I mean, trying to find a way to merge that might be good. Because like we found like under TLS on my interop, we found a lot of like, "Oh, like, it works when the extensions bit is on, but not when it's off," and like, those kinds of little, it's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying like, those, there's a lot of, there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Mike. We're out of time. Next up is Ali, who's going to present his test results. Also note that at least three of you have not signed into Meetecho; please do so if you haven't already. If you're new, you may need a Datatracker account, which takes a little bit of time to get, but it's what you need to do.
Ali Begen: Good morning. I am Ali Begen from Ozyegin University, and these are some results from my students. This study was conducted just before some of the switch proposals came in, so that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get into it.
The way we have done these studies is, we have subscriber-side track switching, more specifically subscriber-initiated and relay-executed switching. And then on the other hand, one version of DTS where the relay is actually doing the switch for us, so we delegate the track switching to the relay. In the first paradigm, subscriber-initiated track switching, we have three methods: Forward State to Link, so forward 0 or 1; the Joining Fetch approach; and the Switch message. This Switch message is pretty close to what Gwendal has in his draft, but not identical. The methodology is more or less the same as something we have shown before.
Now, in the subscriber-side paradigm, since the relay doesn't know the catalog, it has no visibility into which tracks are related, so we need to let the relay know about this via a request. And the subscriber must manage the timing by itself, because it can't really know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, how they are aligned, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, in the DTS approach, the relay again doesn't know which tracks are related, so you need to let the relay know about this. But the timing can now be managed by the relay, because it knows exactly where the group boundaries are and the positions of the individual tracks.
Now, we would like to have a seamless switch, and by seamless switch we mean three things. No stalls during the switch, so no frame stalling or frame repeating. No gaps; this is also important, because especially when the tracks have some position issues, like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. You just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, traffic duplication is really bad, and you usually want to avoid that.
So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios, so we tried to grab the most important cases. We have two tracks here, so this is a simulated switch between track A and track B. Track A is a 720p, 2.5 megabits per second synthetic stream. Track B is 1080p, so higher resolution, a 4 megabits per second, higher bit rate synthetic video stream. The GOPs are 25 frames and 1 second in duration. And the jitter buffer on the player side is half a second, 500 milliseconds. We test with A to B, B to A, and different types of positions in all of the different tests.
Now, the scenarios we are looking at. First, the track alignment scenarios: whether the tracks are aligned on the relay at the time of the switch, or whether one track is ahead of the other track. They could be in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means there are two GOPs between them. In the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's track alignment only. In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully.
4.5 means track B is going to make it, and track A is going to make it, but if both of them are sent at the same time, they are not going to make it, right? It is less than the sum of both tracks. At 3 megabits per second, only track A is going to make it. And 7 megabits per second is sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay: the case where the delay from the relay to your subscriber is significant. We are testing that as well, with half a second in this case. So, it's a baby step iteration, but, you know, this is really too far, relay and that's where we are switching from.
And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. We really would like to have this as small as possible. This is a normalized value. 1 means one GOP of traffic has been redundant. 2 means twice the GOP duration of traffic has been redundant.
So, Forward State to Link: the subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and in total this is three control messages. The risks are gaps or duplicate data, depending on how the tracks are aligned.
Joining Fetch is using a fetch message to buffer the past GOPs. The subscriber first checks the largest object metadata to decide whether to do the fetch or not. Depending on this, it takes three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails.
And the Switch message is the atomic message. You tell it, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, at least in our implementation, is that if the relay cannot find an appropriate switch position, it will time out, and then it will send an error message. So, this table is describing what we have. In terms of number of control messages, delivery gap, and redundant traffic, the Switch message is clearly the winner here.
Now, let's look at some numbers. I haven't included all the results, just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, skip the right table for now and just focus on the first table on the left. This is just a track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP of redundant traffic. If the target is 2 seconds ahead, so we are switching from A to B and B is 2 seconds ahead, then the delay is still comparable, and the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay and no excess traffic, but it skips 1 GOP. And the Switch method has a bit less switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, again, bandwidth is not the issue here, so the client might be switching for another reason. And these are the results that we get from these three methods.
Now, if we look at the right table, this is where the bandwidth is limited, but the tracks are in sync. So, we are trying to fix some conditions while we vary the others. So, A to B: remember, A is 2.5 megabits per second, and B is 4 megabits per second. A to B is an upshift, and the bandwidth is 4.5. So, it is also easy to get B, but you cannot get A plus B at the same time. Forward State to Link is going to have about half a second of switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, because of the bandwidth, the delay is going to increase to 7 seconds. This is an extremely bad switch, because the switch takes 7 seconds, and 4.5 seconds of that 7 seconds is a stall; you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and then has very significant duplicate traffic because it can't really keep up with the switch. And then when switching from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, and you are not really introducing any extra traffic, because the bandwidth is sufficient and you know where the GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too.
Any questions so far? Yes.
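[Editor's illustration] The reasoning behind the three bandwidth limits can be checked with simple arithmetic. This is a minimal sketch that uses only the track bitrates and link limits quoted in the talk; the function names are hypothetical, and it ignores protocol overhead and burstiness, which a real testbed would not.

```python
# Track bitrates and link limits quoted in the talk, in Mbit/s.
TRACKS = {"A": 2.5, "B": 4.0}
LINK_LIMITS = [3.0, 4.5, 7.0]


def fits(link_mbps, *track_names):
    """True if the named tracks can be sent together within the link rate."""
    return sum(TRACKS[name] for name in track_names) <= link_mbps


def classify(link_mbps):
    """Which of A alone, B alone, and A plus B fit on a link of this rate."""
    return {
        "A": fits(link_mbps, "A"),
        "B": fits(link_mbps, "B"),
        "A+B": fits(link_mbps, "A", "B"),
    }


for limit in LINK_LIMITS:
    print(limit, classify(limit))
```

At 3 Mbit/s only track A fits, at 4.5 Mbit/s either track fits alone but not both together (2.5 + 4 = 6.5), and at 7 Mbit/s both fit at once, which is why any method that briefly delivers both tracks is penalized at the two lower limits.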
Magnus Westerlund: I try to understand, so, if the limited bandwidth is four and a half megabits, and A is four, A is 2.5, B is four megabits.
Ali Begen: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay at 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Ali Begen: Yeah, so basically, the pipe is 4.5, and B is 4, so it fits. If it's Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future that you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be used. The pipe is 4.5, you're changing from 4 to 3. There is no overlap, so the bandwidth is never higher than 4.5. So, why is that the result?
Ali Begen: Are you asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Ali Begen: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Ali Begen: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And you can't download four, because you only have three. So, you're, every second, you're backing up another second.
Ali Begen: The pipe is 4.5. I am downloading four, and everything is fine. And then we start to, we start to switch.
Cullen Jennings: We're getting in 2 minutes. Both slides are already, yes. Oh, yes. Oh, okay, we're almost on 25 slides. Okay, thank you. Yeah, well, we will have to upload, I think, those slides, and then they'll show up. Well, we, we're going to have to duplicate our slides for both places. And then the, the, the ad hoc draft slides, definitely not be this morning, so you can put those in the afternoon session. Oh, so, privacy pass, MSF, secure objects, folk, we're not going to have a lock. We don't have a lock.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Ali Begen: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Ali Begen: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Ali Begen: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Ali Begen: Till it can send the next one.
Magnus Westerlund: But what is being measured? What is the number that we see, the delay?
Ali Begen: The switching delay is from the time you make the switch request to the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Ali Begen: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Ali Begen: Okay.
Magnus Westerlund: Okay, I think that means there's data queued at the relay to send, and when you switch the forward state, it doesn't affect any of that queued data.
Ali Begen: Yes.
Magnus Westerlund: So, I thought about that too when I was trying to write up a switch, and that's why I added that: when you switch, even in hard mode, not only do you switch forward, you also cancel any queued data you have yet to send on the old track. And that'll get your delay down if you add that extra step.
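[Editor's illustration] The effect described above, where data already queued at the relay delays the new track unless it is cancelled, can be sketched with a toy model. This is a sketch under stated assumptions and not the testbed's behavior: it assumes a strict first-in-first-out queue with no prioritization, the 6 seconds of overload is a hypothetical input, and it is not meant to reproduce the 7-second figure from the slides.

```python
def backlog_mbit(track_mbps, link_mbps, overload_seconds):
    """Data queued at the relay after offering a track faster than the link drains it."""
    return max(0.0, track_mbps - link_mbps) * overload_seconds


def first_byte_delay(queued_mbit, link_mbps, cancel_queued):
    """Seconds until the first byte of the new track can leave the relay (FIFO queue)."""
    if cancel_queued:
        # The extra step suggested above: drop unsent data for the old track.
        return 0.0
    return queued_mbit / link_mbps


# Track B (4 Mbit/s) offered to a 3 Mbit/s link for a hypothetical 6 seconds.
queued = backlog_mbit(track_mbps=4.0, link_mbps=3.0, overload_seconds=6.0)
print(first_byte_delay(queued, 3.0, cancel_queued=False))  # 2.0
print(first_byte_delay(queued, 3.0, cancel_queued=True))   # 0.0
```

In this model the backlog grows by 1 Mbit for every second of overload, so the longer the subscriber waits before downshifting, the longer the old track's queued data holds back the first object of the new track unless the relay cancels it.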
Ali Begen: Yeah, a point of attention. If the subscriber switches immediately when the bandwidth is low, it's not completely realistic, because usually you check your buffer first and then make the switch, but this is something that we might improve.
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Magnus Westerlund: Right.
Ali Begen: Yeah, and I want to think through, you know, how do we write the tests that have the most value first, and kind of slowly sequence those things in a way that like, makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests, and like the tool that I wrote in like November is like, entirely data plane tests.
Ali Begen: Yeah, so basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Jordi Cenzano: Yeah, so basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Alan Frindell: Thanks, Mike. We're out of time. Next up is Ali, who's going to present his test results. Also note that at least three of you have not signed into Meetecho; please do so if you haven't already. If you're new, you may need a Datatracker account, which takes a little bit of time to get, but that's what you have to do.
Ali Begen: Good morning. I am Ali Begen from Ozyegin University, and these are some results from my students. This study was conducted just before the switch proposals came in, so that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get into it. The way we have done these studies is this: we have subscriber-side track switching, more specifically a subscriber-initiated and relay-executed switch. On the other hand, there is one version of DTS where the relay is actually doing the switch for us, so we delegate the track switching to the relay. In the first paradigm, subscriber-initiated track switching, we have three methods: Forward State to Link, so forward 0 or 1; the Joining Fetch approach; and the Switch message. This Switch message is pretty close to what Gwendal has in his draft, but not identical. The methodology is more or less the same as something we have shown before. Now, in the subscriber-side paradigm, since the relay doesn't know the catalog, it has no visibility into which tracks are related, so we need to let the relay know about this via the request. And the subscriber must manage the timing by itself, because it can't really know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, the subscriber doesn't know how they are aligned, or whether there is some latency between them. On the other hand, if you go to the DTS approach, the relay again doesn't know which tracks are related, so you need to let the relay know about this. But the timing can now be managed by the relay, because it knows exactly where the group boundaries are and the positions of the individual tracks.
Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so no frame freezing or frame repeating. No gaps is also important, because, especially when the tracks have some position issues, like one track being way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios, so we tried to grab the most important cases. We have two tracks here, track A and track B, so this is a simulated switch, and we switch between track A and B. Track A is a 720p, 2.5 megabits per second synthetic stream. Track B is 1080p, so high resolution, a 4 megabits per second, higher-bit-rate synthetic video stream. The GOPs are 25 frames and 1 second in duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, it is whether the tracks are aligned on the relay at the time of the switch, or whether one track is ahead of the other track. They could be in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's track alignment only.
In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 4.5 means track B is going to make it, track A is going to make it, but if both of them are sent at the same time, they are not going to make it, right? It is less than the sum of both tracks. At 3 megabits per second, only track A is going to make it. And 7 megabits per second is sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay: if the delay for your switch request, or from the relay to your subscriber, is significant. We are testing that as well, with half a second in this case. So, it's a baby-step iteration, but, you know, this is really a far relay, and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. We really would like to have this as small as possible. This is a normalized value: 1 means one GOP of traffic has been redundant, 2 means twice the GOP duration of traffic has been redundant. So, with Forward State to Link, the subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and in total this is three control messages. The risks are gaps or duplicate data, depending on how the tracks are aligned. Joining Fetch uses a fetch message to buffer the past GOPs. The subscriber first checks the largest object metadata to decide whether to issue the fetch or not. And depending on this, it is three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message.
You tell the relay, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, at least in our implementation, is that if the relay cannot find an appropriate switch position, it will time out, and then it will send an error message. So, this table is describing what we have. In terms of number of control messages, delivery gap, redundant traffic, and well-aligned switching, the Switch message is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, skip the right table for now and just focus on the first table on the left. This is just the track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But there is also half a GOP of redundant traffic. If the target is 2 seconds ahead, so we are switching from A to B and B is 2 seconds ahead, then the delay is still comparable and the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay and no excess traffic, but it skips 1 GOP. And the Switch method has a bit less switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, again, bandwidth is not the issue here, so the client might be switching for another reason. And these are the results that we get from these three methods. Now, if we look at the right table, this is where the bandwidth is limited, but the tracks are in sync.
So, we are trying to fix the other conditions while we vary one. So, A to B, remember, A is 2.5 megabits per second, and B is 4 megabits per second. A to B is an upshift, and the bandwidth is 4.5. So, it is easy to get A, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second of switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, because of the bandwidth, the delay is going to increase to 7 seconds. This is an extremely bad switch, because the switch takes 7 seconds, and 4.5 seconds of those 7 seconds is a stall: you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and it has very significant duplicate traffic because it can't really keep up with the switch. And then when switching from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, and you are not really introducing any extra traffic because the bandwidth is sufficient and you know where the GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too. Any questions so far? Yes.
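The bandwidth choices described in the talk come down to simple arithmetic, which the following minimal sketch spells out. The bitrates, pipe sizes, and 1-second GOP duration are the ones stated by the presenter; the function names are hypothetical, and normalizing the excess traffic by one GOP of the given track is an assumption about how the "average excess traffic ratio" is computed, not something the talk specifies.

```python
# Bitrates and pipe sizes as stated in the talk, in Mbit/s.
TRACKS = {"A": 2.5, "B": 4.0}
PIPES = [3.0, 4.5, 7.0]


def fits(pipe_mbps, *track_names):
    """Return True if the named tracks can be sent concurrently within the pipe."""
    return sum(TRACKS[name] for name in track_names) <= pipe_mbps


def excess_traffic_ratio(redundant_bits, track_mbps, gop_seconds=1.0):
    """Redundant traffic divided by the size of one GOP of the track.

    Assumed definition: 1.0 means one GOP's worth of redundant data.
    """
    gop_bits = track_mbps * 1e6 * gop_seconds
    return redundant_bits / gop_bits


# Which combinations fit in each pipe: A alone, B alone, A and B together.
for pipe in PIPES:
    print(pipe, fits(pipe, "A"), fits(pipe, "B"), fits(pipe, "A", "B"))
```

Under these assumptions, the 4.5 Mbit/s pipe carries either track alone but not both at once, the 3 Mbit/s pipe carries only track A, and the 7 Mbit/s pipe carries both.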
Magnus Westerlund: I'm trying to understand. So, the limited bandwidth is four and a half megabits, and A is 2.5, B is four megabits?
Ali Begen: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Magnus Westerlund: Yeah, so basically, the pipe is 4.5, and B is 4, so it fits. If it's Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future where you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be used. The pipe is 4.5, and you're changing from 4 to 3. There is no overlap, so the bandwidth is never higher than 4.5. So, why is that the result?
Ali Begen: Are you asking about the top-right corner?
Magnus Westerlund: B to A, yes.
Ali Begen: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Ali Begen: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Ali Begen: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Ali Begen: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Ali Begen: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Ali Begen: Till it can send the next one.
Magnus Westerlund: What is it measuring? What is the number that we see, the delay?
Ali Begen: The switching delay is from the time you make the switch request to the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Ali Begen: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Ali Begen: Okay.
Magnus Westerlund: Okay, I think that means there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Ali Begen: Yes.
Magnus Westerlund: So, I thought about that too when I was trying to write up a switch, and that's why I added that: when you switch, even in hard mode, you not only switch forward, you also cancel any queued data you have yet to send on the old track. And that'll get your delay down if you add that extra step.
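The queued-data explanation above can be illustrated with a deliberately simplified model: if the relay drains its backlog of old-track data in order before the new track's first object, the wait is the backlog divided by the pipe rate, and cancelling the backlog removes that wait. This is a sketch under stated assumptions only. The function name, the strict in-order drain, and the 9 Mbit backlog are hypothetical; the backlog is not a measured value and the model is not meant to reproduce the 7-second result from the slides.

```python
def first_byte_delay(queued_old_mbit: float, pipe_mbps: float,
                     cancel_queued: bool = False) -> float:
    """Seconds before the relay can start sending the new track.

    Simplified model: queued old-track data is drained in order at the
    pipe rate unless it is cancelled at switch time.
    """
    if pipe_mbps <= 0:
        raise ValueError("pipe_mbps must be positive")
    if cancel_queued:
        return 0.0
    return queued_old_mbit / pipe_mbps


backlog_mbit = 9.0  # hypothetical backlog of old-track data, not measured
pipe_mbps = 4.5     # the limited pipe discussed in the Q&A

print(first_byte_delay(backlog_mbit, pipe_mbps))                      # drain first
print(first_byte_delay(backlog_mbit, pipe_mbps, cancel_queued=True))  # cancel backlog
```

The point of the comparison is only the direction of the effect: whatever the real backlog is, dropping it at switch time removes that component of the switching delay.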
Ali Begen: Yeah, two points of attention. First, if the subscriber switches immediately when the bandwidth is low, it's not completely realistic, because usually you check your buffer and then make the switch, but this is something that we might improve. Second, most of what the interop runner does today is checking the signaling handshakes and so on. At some point, we should also think about having a file transfer, sending media to the other side, because some of the things that I was working on were streaming video. But, I mean, at some point, not right now, we should figure out how to get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: I was going through draft-18. I hadn't actually read the drafts since 14, and I was just reading draft-18, and I think I saw, roughly counting, somewhere between 50 and 70 of what I viewed as features or functionality things in draft-18. I think we're at the point, as we're starting to get closer to working group last call, where we should start identifying all of the features, and then somehow, and I'm looking for your guidance on it, having a spreadsheet to check how many implementations of each we have, do they work, and so on. Because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Ali Begen: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Ali Begen: Yeah, I want to think through, you know, how do we write the tests that have the most value first and kind of slowly sequence those things in a way that like, makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests and like the tool that I wrote in like November is like, entirely data plane tests, right, where it checks all of the features, like, there's 41 different tests against like the different things you can do on the data plane. So, I mean, trying to find a way to merge that might be good. Because like we found like under TLS on my interop, we found a lot of like, "Oh, like, it works when the extensions bit is on, but not when it's off," and like, those kinds of little, it's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying like, those, there's a lot of, there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
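The feature spreadsheet suggested in this discussion could be kept as a small machine-readable tally rather than a hand-edited sheet. The sketch below shows one possible shape, assuming test results are recorded per implementation and per feature; the implementation names, feature names, and results are all hypothetical placeholders, not actual interop outcomes.

```python
# Hypothetical results: (implementation, feature) -> did the test pass?
results = {
    ("impl-1", "setup"): True,
    ("impl-2", "setup"): True,
    ("impl-1", "publish-namespace"): False,
    ("impl-2", "publish-namespace"): True,
}


def coverage(results):
    """Map each feature to the sorted list of implementations that pass it."""
    passing = {}
    for (impl, feature), ok in results.items():
        impls = passing.setdefault(feature, [])  # keep features with zero passes
        if ok:
            impls.append(impl)
    return {feature: sorted(impls) for feature, impls in passing.items()}


for feature, impls in coverage(results).items():
    print(f"{feature}: {len(impls)} passing ({', '.join(impls) or 'none'})")
```

Keeping features with zero passing implementations in the output makes the gaps visible, which is the stated purpose of the spreadsheet.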
Alan Frindell: Thanks, Ali. We're out of time. Next up is Colin, who's going to present his, test results. Also note that at least three of you have not signed into Meetecho, please do so if you're not already. If you're new, you may need a Datatracker account, which is, which takes a little bit of time to get, but it's, again, it's what you do, what you do to do it.
Colin: Good morning. I am Colin from Ozyegin University. And there are some results from my students. This, this study was conducted just before some switch from proposals came in, so that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get input. So, the way we are, we have done these studies is, so we have subscriber-side track switching, and, you know, more specifically, this subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the relay to do the track switching for us. In the first paradigm, subscriber-initiated track switching, we have three methods. Forward State to Link, so forward 0 or 1. Joining Fetch approach, and the Switch message. And this Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. In this case, now, on the subscriber-side paradigm, the relay actually, since it doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via request. And the subscriber must manage the timing by itself, because it can't really know, know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, you know, how they are aligned, you know, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, so the relay actually again doesn't know which tracks are related, so you need to let the relay know about this. But the timing now can be managed by the relay, because it knows exactly, you know, where the group boundaries are and the positions of individual tracks. Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so fully no frame storing, frame repeating. 
No gaps, this is also important, because especially when the tracks are actually, they have some position issues like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, you know, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios. So, we tried to grab the most important cases. So, we have two tracks here, so this is a simulated switch, track A and B. And we do switch between track A and B. Track A is 720p, 2.5 megabits per second, synthetic stream. And track B is 1080p, so high resolution, 4 megabit, higher bit rate synthetic video stream. The GOPs are 25 frames, and 1-second duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, whether the tracks are aligned on the relay at the time of switch, or whether one track is ahead of the other track. It could be they are in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means like there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's just a track alignment only. In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 
4.5 means, you know, track B is going to make it, track A is going to make it, but both of them, if they are sent both of them at the same time, they are not going to make it, right? It is less than the summation of both tracks. 3 megabits per second, only track A is going to make it. And 7 megabits per second is something sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, if your switch request, or somehow, you know, from relay to your subscriber, the delay is significant. We are testing that as well, so half a, half a second in this case. So, it's a baby, baby step iteration, but, you know, this is really too far, relay and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. And we really would like to have this as small as possible. And this is a normalized value. 1 means one GOP of traffic has been redundant. 2 means twice the GOP duration traffic has been redundant. So, Forward State to Link, subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and totally this is three control messages. The risks are gaps or duplicate data, depending on where how the tracks are aligned. Joining Fetch is using a fetch message to buffer of the past GOPs. Subscriber first checks for largest object metadata to decide whether it is the fetch or not. And depending on this, three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message. 
You tell, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, atleast in our implementation, is if the relay cannot find an appropriate switch position, it will timeout, and then it will send an error message. So, this table is describing what we have. And in terms of number of control messages, delivery gap, redundant traffic, and well-aligned switch, is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, so just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, on the left-hand side, skip the right table for now, just focus on the first table on the left. This is just a track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP, redundant traffic. If, if there is 2 seconds, the target is 2 seconds ahead, we are switching from A to B, then B is 2 seconds ahead, then, you know, the delay is still comparable, and then the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay, but no excess traffic, but it skips the 1 GOP. And the Switch method has a bit less delay, switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, this is again, bandwidth is not the issue, so client might be switching for another purpose, for another reason. And these are the results that we get from these three methods. Now, if we look at, at the right table, this is where the bandwidth is limited, but the tracks are in sync. 
So, we are trying to fix the other conditions while we are playing the others. So, A to B, remember, A is 2.5 megabits per second, and B is 4 megabits per second. And A to B is an upshift, and the bandwidth is 4.5. So, it is easy, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, that's because of the bandwidth, the delay is going to increase to 7 seconds. This is extremely a bad switch, because you are switching at 7 seconds, 4.5 seconds of that 7 seconds is a stall, you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and then has very significant duplicate traffic because it can't really keep up with the switch. And then when switching to, from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, but you are not really introducing an extra traffic because the bandwidth is sufficient, and you know where GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too. Any questions so far? Yes.
Magnus Westerlund: I'm trying to understand. So, the limited bandwidth is four and a half megabits, and A is 2.5, B is four megabits.
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and B is 4, so it fits. With Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future where you say, "Hey, now I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be usable; the pipe is 4.5, and you're changing from 4 to 3. There is no overlap, so the bandwidth is never higher than 4.5. So, why is that the result?
Colin: You are asking about the top right corner, right?
Magnus Westerlund: B to A, yes.
Colin: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is being measured? What is the number that we see, the delay?
Colin: The switching delay is the time from when you make the switch request until the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think it's because that means there's, there's, there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I thought about that too when I was trying to write up a switch, and that's why I added that when you switch, even in hard mode, you not only switch forward, you also cancel any queued data you have yet to send on the old track. And that'll get your delay down if you add that extra step.
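The queueing effect discussed in this exchange can be sketched with a deliberately simplified model. It assumes the relay drains a single strict FIFO queue toward the subscriber, and the backlog figure is illustrative, not a measurement from the talk. The function name and parameters are hypothetical.

```python
def first_object_wait_s(queued_old_mbit: float,
                        link_mbps: float,
                        cancel_old_queue: bool) -> float:
    """Seconds the new track's first object waits behind old-track data.

    Simplified model (assumption): one FIFO queue at the relay, drained at
    the link rate, with no prioritization between tracks.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    # Cancelling the old track's queued data removes the backlog entirely.
    backlog_mbit = 0.0 if cancel_old_queue else queued_old_mbit
    return backlog_mbit / link_mbps


if __name__ == "__main__":
    # Illustrative backlog: 9 Mbit of old-track data on a 4.5 Mbit/s link.
    print(first_object_wait_s(9.0, 4.5, cancel_old_queue=False))
    print(first_object_wait_s(9.0, 4.5, cancel_old_queue=True))
```

Under this model the wait grows linearly with whatever old-track data is still queued, which is why cancelling the queue at switch time is expected to reduce the switching delay.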
Colin: Yeah, two points. First, on the subscriber side: switching immediately when the bandwidth is low is not completely realistic, because usually you check your buffer first and then make the switch, but this is something that we might improve. Second, most of the interop runner today is actually checking the signaling handshakes and so on. At some point, we should also think about whether we have a file transfer, sending media to the other side. Because some of the things that I was working on were streaming video. But, I mean, at some point, not right now, we should figure out how to get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: Um, I was going through draft-18. I hadn't actually read the drafts since 14, and I was just reading draft-18, and, roughly counting, I think I saw somewhere between 50 and 70 of what I viewed as features or functionality things in draft-18. I think we're at the point, as we're starting to get closer to working group last call, where we should start identifying all of the features, and then somehow, and I'm looking for your guidance on it, having a spreadsheet to check, you know, how many implementations of this do we have, do they work, you know? Because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Colin: Yeah, and I want to think through, you know, how do we write the tests that have the most value first, and kind of slowly sequence those things in a way that like, makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess, to make sure I understand correctly, the tests that are there today are primarily control plane tests, and the tool that I wrote in, like, November is entirely data plane tests, right, where it checks all of the features; there are 41 different tests against the different things you can do on the data plane. So, I mean, trying to find a way to merge that might be good. Because under TLS on my interop, we found a lot of, "Oh, it works when the extensions bit is on, but not when it's off," and those kinds of little things. It's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying there are a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. Um, we're out of time. Next up is Colin, who's going to present his test results. Also note that at least three of you have not signed into Meetecho; please do so if you haven't already. If you're new, you may need a Datatracker account, which takes a little bit of time to get, but, again, it's what you have to do.
Colin: Good morning. I am Colin from Ozyegin University, and these are some results from my students. This study was conducted just before the switch proposals came in, so those are not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get into it. The way we have done these studies is, we have subscriber-side track switching, and, more specifically, a subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the track switching to the relay. In the first paradigm, subscriber-initiated track switching, we have three methods: Forward State to Link, so forward 0 or 1; the Joining Fetch approach; and the Switch message. This Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same as something we have shown before. Now, in the subscriber-side paradigm, since the relay doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via a request. And the subscriber must manage the timing by itself, because it can't really know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, how they are aligned, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, the relay again doesn't know which tracks are related, so you need to let the relay know about this. But the timing can now be managed by the relay, because it knows exactly where the group boundaries are and the positions of the individual tracks.
Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so no frame freezing, no frame repeating. No gaps, which is also important, because especially when the tracks have some position issues, like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverably missing content. So, you just jump over one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios. So, we tried to pick the most important cases. We have two tracks here, track A and B, and this is a simulated switch. We switch between track A and B. Track A is a 720p, 2.5 megabits per second synthetic stream. And track B is a 1080p, so higher resolution, 4 megabits per second, higher bit rate synthetic video stream. The GOPs are 25 frames and 1 second in duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test A to B, B to A, and different relative positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, the question is whether the tracks are aligned on the relay at the time of the switch, or whether one track is ahead of the other. They could be in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's track alignment only.
In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 4.5 means track B alone is going to make it, track A alone is going to make it, but if both of them are sent at the same time, they are not going to make it, right? It is less than the sum of both tracks. At 3 megabits per second, only track A is going to make it. And 7 megabits per second is sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, this is the case where, for your switch request or for the data from the relay to your subscriber, the delay is significant. We are testing that as well, with half a second in this case. So, it's a baby-step iteration, but, you know, this is really a far-away relay, and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. We would really like to have this as small as possible. This is a normalized value: 1 means one GOP of traffic has been redundant, and 2 means twice the GOP duration of traffic has been redundant. So, with Forward State to Link, the subscriber subscribes to both tracks, track A and B, right? At the switch time, we send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and in total this is three control messages. The risks are gaps or duplicate data, depending on how the tracks are aligned. Joining Fetch uses a fetch message to buffer the past GOPs. The subscriber first checks the largest object metadata to decide whether to fetch or not. And depending on this, it takes three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message.
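The normalized excess traffic ratio mentioned above can be sketched as follows. The talk only states that a value of 1 corresponds to one GOP of redundant traffic; normalizing by the size of one GOP of the track that carried the redundant data is an assumption here, and the function names are hypothetical.

```python
def gop_size_mbit(track_mbps: float, gop_duration_s: float = 1.0) -> float:
    """Size of one GOP in Mbit (the testbed uses 1-second GOPs)."""
    return track_mbps * gop_duration_s


def excess_traffic_ratio(redundant_mbit: float,
                         track_mbps: float,
                         gop_duration_s: float = 1.0) -> float:
    """Redundant traffic expressed in units of one GOP.

    Assumption: the ratio is normalized by one GOP of the track whose
    data was delivered redundantly; the talk does not specify this.
    """
    return redundant_mbit / gop_size_mbit(track_mbps, gop_duration_s)


if __name__ == "__main__":
    # Illustrative: 2 Mbit of redundant data on track B (4 Mbit/s, 1 s GOP)
    # corresponds to half a GOP of excess traffic.
    print(excess_traffic_ratio(2.0, 4.0))
```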
You tell, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, atleast in our implementation, is if the relay cannot find an appropriate switch position, it will timeout, and then it will send an error message. So, this table is describing what we have. And in terms of number of control messages, delivery gap, redundant traffic, and well-aligned switch, is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, so just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, on the left-hand side, skip the right table for now, just focus on the first table on the left. This is just a track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP, redundant traffic. If, if there is 2 seconds, the target is 2 seconds ahead, we are switching from A to B, then B is 2 seconds ahead, then, you know, the delay is still comparable, and then the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay, but no excess traffic, but it skips the 1 GOP. And the Switch method has a bit less delay, switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, this is again, bandwidth is not the issue, so client might be switching for another purpose, for another reason. And these are the results that we get from these three methods. Now, if we look at, at the right table, this is where the bandwidth is limited, but the tracks are in sync. 
So, we are trying to fix the other conditions while we are playing the others. So, A to B, remember, A is 2.5 megabits per second, and B is 4 megabits per second. And A to B is an upshift, and the bandwidth is 4.5. So, it is easy, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, that's because of the bandwidth, the delay is going to increase to 7 seconds. This is extremely a bad switch, because you are switching at 7 seconds, 4.5 seconds of that 7 seconds is a stall, you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and then has very significant duplicate traffic because it can't really keep up with the switch. And then when switching to, from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, but you are not really introducing an extra traffic because the bandwidth is sufficient, and you know where GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too. Any questions so far? Yes.
Magnus Westerlund: I try to understand, so, if the limited bandwidth is four and a half megabits, and A is four, A is 2.5, B is four megabits.
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and the, the, the B is, B is 4, so it fits. If it's a Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future that you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be used, you've changed, the, the pipe is 4.5, you're changing from 4 to 3. There is no overlap, so never, bandwidth is never higher than 4.5. So, why that is the result?
Colin: Are you asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Colin: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is measuring? What, what, what is, um, the number that we see, the delay is?
Colin: The switching delay is by the time you make the switch request, and by the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think it's because that means there's, there's, there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I think, I mean, I thought about that too when I was trying to write up a switch, and that's why I added that, not only do you, when you switch even in hard mode, you switch forward, you cancel any queued data you have to send yet on the old track. And that'll get your delay down if you add that extra step.
Colin: Yeah, the, two points of attention. Subscriber is, if you switch immediately, uh, when the bandwidth is, uh, low, it's not completely realistic, because usually you check your, you check your buffer and see the way and then it makes switching, but, but this is something that we might, uh, improve. Second, second question is that, most of the interop, uh, the runner today is actually, uh, checking for the signaling handshakes and so on. At some point, we should also think about if we have, uh, a file transfer. Send media to the other side. Because some of the things that I was working on, work was streaming video. Um, but, I mean, at some point, not right now, figure out how to be get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: Um, I was going through draft-18, like I, like I hadn't actually read the drafts since 14, and, and I was just reading draft-18, right, and read it, and, you know, and I think I saw, I saw roughly counting somewhere between 50 and 70 sort of what I viewed as sort of features or functionality things in draft-18. Um, I think we're at the point as we're starting to get closer to working group last call, we should start being identifying all of the sort of features, and then somehow, and I don't, I like, I'm looking for your guidance on it, having a spreadsheet to check that we, you know, how many implementations of this do we have, or, you know, whatever, do they work, you know? Um, because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Colin: Yeah, and I want to think through, you know, how do we write the tests that have the most value first, and kind of slowly sequence those things in a way that like, makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests, and like the tool that I wrote in like November is like, entirely data plane tests, right, where it checks all of the features, like, there's 41 different tests against like the different things you can do on the data plane. So, I mean, trying to find a way to merge that might be good. Because like we found like under TLS on my interop, we found a lot of like, "Oh, like, it works when the extensions bit is on, but not when it's off," and like, those kinds of little, it's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying like, those, there's a lot of, there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. Um, we're out of time. Next up is Colin, who's going to present his, test results. Also note that at least three of you have not signed into Meetecho, please do so if you're not already. If you're new, you may need a Datatracker account, which is, which takes a little bit of time to get, but it's, again, it's what you do, what you do to do it.
Colin: Good morning. Um, I am Colin from Ozyegin University. And there are some results from my students. And, uh, this, this study was conducted just before, uh, switch from proposals came in. So, that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get input. So, the way we are, we have done these studies is, um, so we have subscriber-side track switching, uh, and, you know, more specifically, this subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the relay to do the track switching for us. In the first paradigm, subscriber-initiated track switching, we have three methods. Forward State to Link, so forward 0 or 1. Joining Fetch approach, and the Switch message. And this Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. In this case, now, on the subscriber-side paradigm, the relay actually, since it doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via request. And the subscriber must manage the timing by itself, because it can't really know, know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, you know, how they are aligned, you know, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, so the relay actually again doesn't know which tracks are related, so you need to let the relay know about this. But the timing now can be managed by the relay, because it knows exactly, you know, where the group boundaries are and the positions of individual tracks. 
Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so fully no frame storing, frame repeating. No gaps, this is also important, because especially when the tracks are actually, they have some position issues like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, you know, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios. So, we tried to grab the most important cases. So, we have two tracks here, so this is a simulated switch, track A and B. And we do switch between track A and B. Track A is 720p, 2.5 megabits per second, synthetic stream. And track B is 1080p, so high resolution, 4 megabit, higher bit rate synthetic video stream. The GOPs are 25 frames, and 1-second duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, whether the tracks are aligned on the relay at the time of switch, or whether one track is ahead of the other track. It could be they are in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means like there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's just a track alignment only. 
In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 4.5 means, you know, track B is going to make it, track A is going to make it, but both of them, if they are sent both of them at the same time, they are not going to make it, right? It is less than the summation of both tracks. 3 megabits per second, only track A is going to make it. And 7 megabits per second is something sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, if your switch request, or somehow, you know, from relay to your subscriber, the delay is significant. We are testing that as well, so half a, half a second in this case. So, it's a baby, baby step iteration, but, you know, this is really too far, relay and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. And we really would like to have this as small as possible. And this is a normalized value. 1 means one GOP of traffic has been redundant. 2 means twice the GOP duration traffic has been redundant. So, Forward State to Link, subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and totally this is three control messages. The risks are gaps or duplicate data, depending on where how the tracks are aligned. Joining Fetch is using a fetch message to buffer of the past GOPs. Subscriber first checks for largest object metadata to decide whether it is the fetch or not. And depending on this, three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message. 
You tell the relay, "I want to switch from this track to this track," and the relay finds the appropriate group, so it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, at least in our implementation, is that if the relay cannot find an appropriate switch position, it will time out and then send an error message. So, this table describes what we have. In terms of number of control messages, delivery gap, redundant traffic, and well-aligned switching, the Switch message is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, skip the right table for now and just focus on the first table, on the left. This is the track alignment scenario, and the bandwidth is not limited. When the tracks are in sync, so they arrive at the relay at the same time, the switching delay for Forward State to Link is about half a second, but there is also half a GOP of redundant traffic. If the target is 2 seconds ahead, so we are switching from A to B and B is 2 seconds ahead, then the delay is still comparable and the redundant traffic is the same, but obviously there will be skips. Right? Joining Fetch has similar delay and no excess traffic, but it skips 1 GOP. And the Switch method has a bit less switching delay, and far less switching delay in the second case because the data is already there, with no redundant traffic. Again, bandwidth is not the issue here, so the client might be switching for another reason. These are the results we get from the three methods. Now, if we look at the right table, this is where the bandwidth is limited, but the tracks are in sync.
So, we fix some of the conditions while we vary the others. So, A to B: remember, A is 2.5 megabits per second, and B is 4 megabits per second. A to B is an upshift, and the bandwidth is 4.5. So, it is easy to get A, it is also easy to get B, but you cannot get A plus B at the same time. Forward State to Link is going to have about half a second of switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, because of the bandwidth the delay increases to 7 seconds. This is an extremely bad switch: the switch takes 7 seconds, 4.5 seconds of that 7 seconds is a stall, you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and very significant duplicate traffic, because it can't really keep up with the switch. And when switching from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, and you are not really introducing any extra traffic, because the bandwidth is sufficient and you know where the GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excess traffic there too. Any questions so far? Yes.
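The bandwidth caps and the normalized excess-traffic metric described in the talk can be checked with a short sketch. The bitrates (2.5 and 4 Mbit/s), the caps (3, 4.5 and 7 Mbit/s) and the 1-second GOP come from the talk; the function names are hypothetical, and normalizing by the GOP size of a single chosen track is an assumption, since the talk does not say which track's GOP is used.

```python
# Track bitrates (Mbit/s) and link caps as stated in the talk.
TRACKS = {"A": 2.5, "B": 4.0}
LINK_CAPS = [3.0, 4.5, 7.0]


def fits(cap_mbps, *track_names):
    """Return True if the named tracks together fit within the link cap."""
    return sum(TRACKS[name] for name in track_names) <= cap_mbps


def classify(cap_mbps):
    """Which of A alone, B alone, and A plus B fit within the given cap."""
    return {
        "A": fits(cap_mbps, "A"),
        "B": fits(cap_mbps, "B"),
        "A+B": fits(cap_mbps, "A", "B"),
    }


def excess_traffic_ratio(redundant_bits, track_mbps, gop_seconds=1.0):
    """Redundant traffic divided by the size of one GOP of the given track.

    Assumption: the normalization uses one track's GOP size.
    """
    return redundant_bits / (track_mbps * 1e6 * gop_seconds)


if __name__ == "__main__":
    for cap in LINK_CAPS:
        print(cap, classify(cap))
```

Under these assumptions, the 4.5 Mbit/s cap is the interesting case: either track fits alone, but any period in which both are in flight exceeds the link.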
Magnus Westerlund: I'm trying to understand. So, the limited bandwidth is four and a half megabits, and A is 2.5, B is four megabits?
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5 and B is 4, so it fits. With Forward State to Link, if I understand properly, there is no bandwidth overlap: you are downloading 4, and then at some point in the future you say, "Hey, now I want track A." I don't understand why there is a stall and why there is a delay. The pipe is 4.5 and you're changing from 4 to 3, so there is no overlap, and the bandwidth is never higher than 4.5. So, why is that the result?
Colin: Are you asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Colin: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is being measured? What is the number that we see, the delay?
Colin: The switching delay is the time from when you make the switch request to when the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think it's because that means there's, there's, there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I thought about that too when I was trying to write up a switch, and that's why I added that when you switch, even in hard mode, you not only switch forward, you also cancel any queued data you have yet to send on the old track. That will get your delay down if you add that extra step.
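The effect of queued data on switching delay can be illustrated with a very simplified model. This is a sketch under stated assumptions, not the testbed's logic: it treats the relay's send queue as strict FIFO over a fixed-rate link, ignores congestion control, prioritization and the jitter buffer, and the function names and the example queue size are hypothetical.

```python
def drain_seconds(queued_bits, link_mbps):
    """Time to transmit the queued bits over a link of the given rate."""
    return queued_bits / (link_mbps * 1e6)


def first_object_delay(queued_old_bits, link_mbps, cancel_queued):
    """Wait before the first object of the new track can start sending.

    Assumption: FIFO queue, so old-track data must drain first unless the
    relay cancels it at switch time.
    """
    if cancel_queued:
        return 0.0
    return drain_seconds(queued_old_bits, link_mbps)


if __name__ == "__main__":
    # Hypothetical example: two 1-second GOPs of track B (4 Mbit/s) queued.
    queued = 2 * 4.0e6
    for cancel in (False, True):
        print(cancel, first_object_delay(queued, 4.5, cancel))
```

In this model the wait grows linearly with the amount of old-track data still queued, which is why cancelling that data at switch time would be expected to shorten the delay.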
Colin: Yeah, two points of attention. First, if the subscriber switches immediately when the bandwidth is low, it's not completely realistic, because usually you check your buffer first and then make the switch, but this is something that we might improve. Second, the interop runner today is mostly checking the signaling handshakes and so on. At some point, we should also think about something like a file transfer, sending media to the other side, because some of the things I was working on involved streaming video. But, I mean, at some point, not right now, we should figure out how to get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: I was going through draft-18. I hadn't actually read the drafts since 14, and reading draft-18, I counted roughly somewhere between 50 and 70 of what I viewed as features or functionality in draft-18. I think we're at the point, as we get closer to working group last call, where we should start identifying all of the features, and then somehow, and I'm looking for your guidance on it, having a spreadsheet to check how many implementations of each we have, and do they work, you know? Because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Colin: Yeah, and I want to think through how we write the tests that have the most value first, and sequence those in a way that makes sense, because I don't think we want to have somebody go off and spec 70 tests and then everybody's on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests, and like the tool that I wrote in like November is like, entirely data plane tests, right, where it checks all of the features, like, there's 41 different tests against like the different things you can do on the data plane. So, I mean, trying to find a way to merge that might be good. Because like we found like under TLS on my interop, we found a lot of like, "Oh, like, it works when the extensions bit is on, but not when it's off," and like, those kinds of little, it's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying like, those, there's a lot of, there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. Um, we're out of time. Next up is Colin, who's going to present his, test results. Also note that at least three of you have not signed into Meetecho, please do so if you're not already. If you're new, you may need a Datatracker account, which is, which takes a little bit of time to get, but it's, again, it's what you do, what you do to do it.
Colin: Good morning. Um, I am Colin from Ozyegin University. And there are some results from my students. And, uh, this, this study was conducted just before, uh, switch from proposals came in. So, that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get input. So, the way we are, we have done these studies is, um, so we have subscriber-side track switching, uh, and, you know, more specifically, this subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the relay to do the track switching for us. In the first paradigm, subscriber-initiated track switching, we have three methods. Forward State to Link, so forward 0 or 1. Joining Fetch approach, and the Switch message. And this Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. In this case, now, on the subscriber-side paradigm, the relay actually, since it doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via request. And the subscriber must manage the timing by itself, because it can't really know, know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, you know, how they are aligned, you know, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, so the relay actually again doesn't know which tracks are related, so you need to let the relay know about this. But the timing now can be managed by the relay, because it knows exactly, you know, where the group boundaries are and the positions of individual tracks. 
Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so fully no frame storing, frame repeating. No gaps, this is also important, because especially when the tracks are actually, they have some position issues like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, you know, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios. So, we tried to grab the most important cases. So, we have two tracks here, so this is a simulated switch, track A and B. And we do switch between track A and B. Track A is 720p, 2.5 megabits per second, synthetic stream. And track B is 1080p, so high resolution, 4 megabit, higher bit rate synthetic video stream. GOPs are 25 frames, and 1-second duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, whether the tracks are aligned on the relay at the time of switch, or whether one track is ahead of the other track. It could be they are in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means like there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's just a track alignment only. 
In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 4.5 means, you know, track B is going to make it, track A is going to make it, but both of them, if they are sent both of them at the same time, they are not going to make it, right? It is less than the summation of both tracks. 3 megabits per second, only track A is going to make it. And 7 megabits per second is something sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, if your switch request, or somehow, you know, from relay to your subscriber, the delay is significant. We are testing that as well, so half a, half a second in this case. So, it's a baby, baby step iteration, but, you know, this is really too far, relay and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. And we really would like to have this as small as possible. And this is a normalized value. 1 means one GOP of traffic has been redundant. 2 means twice the GOP duration traffic has been redundant. So, Forward State to Link, subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and totally this is three control messages. The risks are gaps or duplicate data, depending on where how the tracks are aligned. Joining Fetch is using a fetch message to buffer of the past GOPs. Subscriber first checks for largest object metadata to decide whether it is the fetch or not. And depending on this, three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message. 
You tell, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, atleast in our implementation, is if the relay cannot find an appropriate switch position, it will timeout, and then it will send an error message. So, this table is describing what we have. And in terms of number of control messages, delivery gap, redundant traffic, and well-aligned switch, is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, so just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, on the left-hand side, skip the right table for now, just focus on the first table on the left. This is just a track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP, redundant traffic. If, if there is 2 seconds, the target is 2 seconds ahead, we are switching from A to B, then B is 2 seconds ahead, then, you know, the delay is still comparable, and then the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay, but no excess traffic, but it skips the 1 GOP. And the Switch method has a bit less delay, switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, this is again, bandwidth is not the issue, so client might be switching for another purpose, for another reason. And these are the results that we get from these three methods. Now, if we look at, at the right table, this is where the bandwidth is limited, but the tracks are in sync. 
So, we are trying to fix the other conditions while we are playing the others. So, A to B, remember, A is 2.5 megabits per second, and B is 4 megabits per second. And A to B is an upshift, and the bandwidth is 4.5. So, it is easy, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, that's because of the bandwidth, the delay is going to increase to 7 seconds. This is extremely a bad switch, because you are switching at 7 seconds, 4.5 seconds of that 7 seconds is a stall, you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and then has very significant duplicate traffic because it can't really keep up with the switch. And then when switching to, from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, but you are not really introducing an extra traffic because the bandwidth is sufficient, and you know where GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too. Any questions so far? Yes.
Magnus Westerlund: I try to understand, so, if the limited bandwidth is four and a half megabits, and A is four, A is 2.5, B is four megabits.
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and the, the, the B is, B is 4, so it fits. If it's a Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future that you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be used, you've changed, the, the pipe is 4.5, you're changing from 4 to 3. There is no overlap, so never, bandwidth is never higher than 4.5. So, why that is the result?
Colin: Are you asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Colin: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is measuring? What, what, what is, um, the number that we see, the delay is?
Colin: The switching delay is by the time you make the switch request, and by the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think it's because that means there's, there's, there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I think, I mean, I thought about that too when I was trying to write up a switch, and that's why I added that, not only do you, when you switch even in hard mode, you switch forward, you cancel any queued data you have to send yet on the old track. And that'll get your delay down if you add that extra step.
Colin: Yeah, the, two points of attention. Subscriber is, if you switch immediately, uh, when the bandwidth is, uh, low, it's not completely realistic, because usually you check your, you check your buffer and see the way and then it makes switching, but, but this is something that we might, uh, improve. Second, second question is that, most of the interop, uh, the runner today is actually, uh, checking for the signaling handshakes and so on. At some point, we should also think about if we have, uh, a file transfer. Send media to the other side. Because some of the things that I was working on, work was streaming video. Um, but, I mean, at some point, not right now, figure out how to be get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: Um, I was going through draft-18, like I, like I hadn't actually read the drafts since 14, and, and I was just reading draft-18, right, and read it, and, you know, and I think I saw, I saw roughly counting somewhere between 50 and 70 sort of what I viewed as sort of features or functionality things in draft-18. Um, I think we're at the point as we're starting to get closer to working group last call, we should start being identifying all of the sort of features, and then somehow, and I don't, I like, I'm looking for your guidance on it, having a spreadsheet to check that we, you know, how many implementations of this do we have, or, you know, whatever, do they work, you know? Um, because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Colin: Yeah, and I want to think through, you know, how do we write the tests that have the most value first, and kind of slowly sequence those things in a way that like, makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests, and like the tool that I wrote in like November is like, entirely data plane tests, right, where it checks all of the features, like, there's 41 different tests against like the different things you can do on the data plane. So, I mean, trying to find a way to merge that might be good. Because like we found like under TLS on my interop, we found a lot of like, "Oh, like, it works when the extensions bit is on, but not when it's off," and like, those kinds of little, it's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying like, those, there's a lot of, there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. Um, we're out of time. Next up is Colin, who's going to present his, test results. Also note that at least three of you have not signed into Meetecho, please do so if you're not already. If you're new, you may need a Datatracker account, which is, which takes a little bit of time to get, but it's, again, it's what you do, what you do to do it.
Colin: Good morning. Um, I am Colin from Ozyegin University. And there are some results from my students. And, uh, this, this study was conducted just before, uh, switch from proposals came in. So, that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get input. So, the way we are, we have done these studies is, um, so we have subscriber-side track switching, uh, and, you know, more specifically, this subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the relay to do the track switching for us. In the first paradigm, subscriber-initiated track switching, we have three methods. Forward State to Link, so forward 0 or 1. Joining Fetch approach, and the Switch message. And this Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. In this case, now, on the subscriber-side paradigm, the relay actually, since it doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via request. And the subscriber must manage the timing by itself, because it can't really know, know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, you know, how they are aligned, you know, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, so the relay actually again doesn't know which tracks are related, so you need to let the relay know about this. But the timing now can be managed by the relay, because it knows exactly, you know, where the group boundaries are and the positions of individual tracks. 
Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so no frame stalling or frame repeating. No gaps, and this is also important, because especially when the tracks have some position issues, like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios, so we tried to pick the most important cases. We have two tracks here, so this is a simulated switch between track A and track B. Track A is a 720p, 2.5 megabits per second synthetic stream. And track B is 1080p, so a high-resolution, 4 megabits per second, higher bit rate synthetic video stream. GOPs are 25 frames and 1 second in duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, the question is whether the tracks are aligned on the relay at the time of the switch, or whether one track is ahead of the other track. They could be in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's track alignment only.
In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 4.5 means track B is going to make it, track A is going to make it, but if both of them are sent at the same time, they are not going to make it, right? It is less than the sum of both tracks. At 3 megabits per second, only track A is going to make it. And 7 megabits per second is sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, if the delay for your switch request, or from the relay to your subscriber, is significant. We are testing that as well, with half a second in this case. So, it's a baby-step iteration, but, you know, this is the case where the relay we are switching from is really far away. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. We really would like to have this as small as possible. And this is a normalized value: 1 means one GOP of traffic has been redundant, 2 means two GOP durations of traffic have been redundant. So, with Forward State to Link, the subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and in total this is three control messages. The risks are gaps or duplicate data, depending on how the tracks are aligned. Joining Fetch uses a fetch message to buffer the past GOPs. The subscriber first checks the largest object metadata to decide whether to issue the fetch or not. And depending on this, it takes three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message.
You say, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, at least in our implementation, is that if the relay cannot find an appropriate switch position, it will time out, and then it will send an error message. So, this table describes what we have. And in terms of number of control messages, delivery gap, redundant traffic, and well-aligned switch, the Switch message is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, skip the right table for now, and just focus on the first table on the left. This is just a track alignment scenario and the bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP of redundant traffic. If the target is 2 seconds ahead, so we are switching from A to B and B is 2 seconds ahead, then the delay is still comparable, and the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay and no excess traffic, but it skips 1 GOP. And the Switch method has a bit less switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, again, bandwidth is not the issue here, so the client might be switching for another reason. And these are the results that we get from these three methods. Now, if we look at the right table, this is where the bandwidth is limited, but the tracks are in sync.
So, we are trying to fix some conditions while we vary the others. So, A to B: remember, A is 2.5 megabits per second, and B is 4 megabits per second. And A to B is an upshift, and the bandwidth is 4.5. So, it is easy to get A, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second of switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, because of the bandwidth the delay is going to increase to 7 seconds. This is an extremely bad switch, because the switch takes 7 seconds, and 4.5 seconds of that 7 seconds is a stall; you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and has very significant duplicate traffic because it can't really keep up with the switch. And then when switching from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, and you are not really introducing any extra traffic because the bandwidth is sufficient and you know where the GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these methods, and there is excessive traffic there too. Any questions so far? Yes.
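The reasoning behind the three bandwidth caps can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the presented testbed: the bitrates and caps are the figures quoted in the talk, while the names `TRACK_MBPS`, `fits`, and `summarize` are hypothetical and introduced only for illustration.

```python
# Illustrative only: bitrates and link caps are the figures quoted in the talk.
# The helper names below are assumptions made for this sketch.
TRACK_MBPS = {"A": 2.5, "B": 4.0}
LINK_CAPS_MBPS = [3.0, 4.5, 7.0]


def fits(cap_mbps, tracks):
    """Return True if the listed tracks can be sent concurrently under the cap."""
    return sum(TRACK_MBPS[t] for t in tracks) <= cap_mbps


def summarize(cap_mbps):
    """Report which track combinations fit under a given link cap."""
    return {
        "A": fits(cap_mbps, ["A"]),
        "B": fits(cap_mbps, ["B"]),
        "A+B": fits(cap_mbps, ["A", "B"]),
    }


if __name__ == "__main__":
    for cap in LINK_CAPS_MBPS:
        print(cap, summarize(cap))
```

Under these figures, only track A fits at 3 Mbps, each track fits alone but not both together at 4.5 Mbps, and both fit together at 7 Mbps, which matches the scenario design described above.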
Magnus Westerlund: I'm trying to understand. So, the limited bandwidth is four and a half megabits, and A is 2.5, B is four megabits?
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A case at 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and B is 4, so it fits. If it's Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future where you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because the pipe is 4.5 and you're changing from 4 to 3. There is no overlap, so the bandwidth is never higher than 4.5. So, why is that the result?
Colin: You are asking about the top right corner, right?
Magnus Westerlund: B to A, yes.
Colin: Let's see. Because you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is being measured? What is the number that we see, the delay?
Colin: The switching delay is from the time you make the switch request to the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think that means there's data queued at the relay to send, and when you switch the forward state, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I thought about that too when I was trying to write up a switch, and that's why I added that when you switch, even in hard mode, you not only switch forward, you also cancel any queued data you have yet to send on the old track. And that'll get your delay down if you add that extra step.
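The effect described here, queued data on the old track delaying the first object of the new track, can be illustrated with a simple drain-time model. This is a rough sketch under stated assumptions, not a reproduction of the measured 7-second result: it ignores propagation delay, the jitter buffer, and congestion control, and the backlog and object sizes are hypothetical numbers chosen only to make the arithmetic easy to follow. The function name `time_to_first_object` is likewise an assumption.

```python
def time_to_first_object(link_mbps, backlog_mbit, first_object_mbit, cancel_queued):
    """Seconds until the new track's first object has been fully sent.

    Simplified model: the relay drains its send queue at the link rate.
    If queued old-track data is not cancelled, it must drain first.
    """
    if cancel_queued:
        pending_mbit = first_object_mbit
    else:
        pending_mbit = backlog_mbit + first_object_mbit
    return pending_mbit / link_mbps


if __name__ == "__main__":
    # Hypothetical values: 4.5 Mbps link (from the talk), 8.5 Mbit of queued
    # old-track data and a 0.5 Mbit first object (both invented for illustration).
    without_cancel = time_to_first_object(4.5, 8.5, 0.5, cancel_queued=False)
    with_cancel = time_to_first_object(4.5, 8.5, 0.5, cancel_queued=True)
    print(f"no cancel: {without_cancel:.2f} s, cancel: {with_cancel:.2f} s")
```

In this model the switching delay grows linearly with the backlog when queued data is left in place, which is why cancelling the old track's unsent data is suggested as the extra step.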
Colin: Yeah, two points of attention. First, if the subscriber switches immediately when the bandwidth is low, it's not completely realistic, because usually you check your buffer first and then make the switch, but this is something that we might improve. The second point is that the interop runner today is mostly checking for the signaling handshakes and so on. At some point, we should also think about whether we have a file transfer, sending media to the other side, because some of the things that I was working on were streaming video. But, I mean, at some point, not right now, we should figure out how to get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: I was going through draft-18. I hadn't actually read the drafts since 14, and I was just reading draft-18, and I think I saw, roughly counting, somewhere between 50 and 70 of what I viewed as features or functionality things in draft-18. I think we're at the point, as we're starting to get closer to working group last call, where we should start identifying all of those features, and then somehow, and I'm looking for your guidance on it, having a spreadsheet to check how many implementations of each we have, and whether they work. Because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Yeah, and let me actually dig in a little to that. So, these tests, with they and running, you know, if we zoom into one of these tests, these self-tests, this is post-ends, super implementation, so we get a couple of tests that passed against Lorenzo's implementation, for example. So we're able to exchange setup, but this test that's called "publish namespace only" fails, and this test that has "subscriber." So, the idea is that there are these specified test cases, and these are specified in the repo here. So, in docs, tests, this test cases file has each of the implemented tests specified. So, there are different features and how you actually exercise them systematically. The idea is that we would write down, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm; it says, "I'm going to send this and expect that." Right now we only have a handful of tests. That was to incentivize people to implement test clients and get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
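The "send this and expect that" structure of a specified test case can be sketched as data plus a small runner. This is a hypothetical illustration only, not the format used by the interop runner or its test cases file: the class names, the message strings, and the `make_fake_peer` stand-in are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    send: str    # message the test client sends (illustrative label)
    expect: str  # message the test client expects back (illustrative label)


@dataclass
class InteropCase:
    name: str
    steps: List[Step]


def run_case(case: InteropCase, peer: Callable[[str], str]) -> bool:
    """Run each step in order; the case passes only if every reply matches."""
    return all(peer(step.send) == step.expect for step in case.steps)


def make_fake_peer(replies: Dict[str, str]) -> Callable[[str], str]:
    """Stand-in for a remote implementation: a lookup table of canned replies."""
    return lambda message: replies.get(message, "ERROR")


if __name__ == "__main__":
    setup_only = InteropCase("setup only", [Step("SETUP", "SETUP_OK")])
    publish_ns = InteropCase(
        "publish namespace only",
        [Step("SETUP", "SETUP_OK"), Step("PUBLISH_NAMESPACE", "PUBLISH_NAMESPACE_OK")],
    )
    peer = make_fake_peer({"SETUP": "SETUP_OK"})  # this peer only handles setup
    for case in (setup_only, publish_ns):
        print(case.name, "PASS" if run_case(case, peer) else "FAIL")
```

Keeping each case as plain data means a feature list, such as the 50 to 70 items mentioned earlier, could be mapped to cases one by one without changing the runner.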
Jordi Cenzano: Just to answer your questions, I think that draft-18 for Vienna is absolutely the right call. Since we started going to draft-18, I feel like it's gone better for two reasons. One, the interop that we got on 18 was, to this point, kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, and they'll keep chasing it. And it also gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive go out and implement those odd drafts, learn something like, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. I'm thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback, and not necessarily have to fully commit. So I think that's a good target. And on the "how can we make the interop runner better" question, I definitely think that what Colin is getting at is right: we need to get from these five tests to 70, or whatever. And I don't know, that's a lot. Given that we're all out implementing the spec, I'm not sure how many people want to sit and think about what those 70 test cases look like, and then everybody has to write their interop client to cover those 70 tests as well.
Colin: Yeah, and I want to think through how we write the tests that have the most value first, and slowly sequence those things in a way that makes sense, because I don't think we want to have somebody go off and spec 70 tests and then everybody's on the hook to implement 70 tests.
Jordi Cenzano: And to make sure I understand correctly, the tests that are there today are primarily control plane tests, and the tool that I wrote in November is entirely data plane tests, where it checks all of the features; there are 41 different tests against the different things you can do on the data plane. So, trying to find a way to merge that might be good. Because under TLS on my interop, we found a lot of, "Oh, it works when the extensions bit is on, but not when it's off," and those kinds of little things. It's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying there are a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. Um, we're out of time. Next up is Colin, who's going to present his, test results. Also note that at least three of you have not signed into Meetecho, please do so if you're not already. If you're new, you may need a Datatracker account, which is, which takes a little bit of time to get, but it's, again, it's what you do, what you do to do it.
Colin: Good morning. Um, I am Colin from Ozyegin University. And there are some results from my students. And, uh, this, this study was conducted just before, uh, switch from proposals came in. So, that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get input. So, the way we are, we have done these studies is, um, so we have subscriber-side track switching, uh, and, you know, more specifically, this subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the relay to do the track switching for us. In the first paradigm, subscriber-initiated track switching, we have three methods. Forward State to Link, so forward 0 or 1. Joining Fetch approach, and the Switch message. And this Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. In this case, now, on the subscriber-side paradigm, the relay actually, since it doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via request. And the subscriber must manage the timing by itself, because it can't really know, know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, you know, how they are aligned, you know, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, so the relay actually again doesn't know which tracks are related, so you need to let the relay know about this. But the timing now can be managed by the relay, because it knows exactly, you know, where the group boundaries are and the positions of individual tracks. 
Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so fully no frame storing, frame repeating. No gaps, this is also important, because especially when the tracks are actually, they have some position issues like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, you know, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios. So, we tried to grab the most important cases. So, we have two tracks here, so this is a simulated switch, track A and B. And we do switch between track A and B. Track A is 720p, 2.5 megabits per second, synthetic stream. And track B is 1080p, so high resolution, 4 megabit, higher bit rate synthetic video stream. GOPs are 25 frames, and 1-second duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, whether the tracks are aligned on the relay at the time of switch, or whether one track is ahead of the other track. It could be they are in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means like there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's just a track alignment only. 
In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 4.5 means, you know, track B is going to make it, track A is going to make it, but both of them, if they are sent both of them at the same time, they are not going to make it, right? It is less than the summation of both tracks. 3 megabits per second, only track A is going to make it. And 7 megabits per second is something sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, if your switch request, or somehow, you know, from relay to your subscriber, the delay is significant. We are testing that as well, so half a, half a second in this case. So, it's a baby, baby step iteration, but, you know, this is really too far, relay and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. And we really would like to have this as small as possible. And this is a normalized value. 1 means one GOP of traffic has been redundant. 2 means twice the GOP duration traffic has been redundant. So, Forward State to Link, subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and totally this is three control messages. The risks are gaps or duplicate data, depending on where how the tracks are aligned. Joining Fetch is using a fetch message to buffer of the past GOPs. Subscriber first checks for largest object metadata to decide whether it is the fetch or not. And depending on this, three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message. 
You tell, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to have zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, atleast in our implementation, is if the relay cannot find an appropriate switch position, it will timeout, and then it will send an error message. So, this table is describing what we have. And in terms of number of control messages, delivery gap, redundant traffic, and well-aligned switch, is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, so just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, on the left-hand side, skip the right table for now, just focus on the first table on the left. This is just a track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP, redundant traffic. If, if there is 2 seconds, the target is 2 seconds ahead, we are switching from A to B, then B is 2 seconds ahead, then, you know, the delay is still comparable, and then the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay, but no excess traffic, but it skips the 1 GOP. And the Switch method has a bit less delay, switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, this is again, bandwidth is not the issue, so client might be switching for another purpose, for another reason. And these are the results that we get from these three methods. Now, if we look at, at the right table, this is where the bandwidth is limited, but the tracks are in sync. 
So, we are trying to fix the other conditions while we are playing the others. So, A to B, remember, A is 2.5 megabits per second, and B is 4 megabits per second. And A to B is an upshift, and the bandwidth is 4.5. So, it is easy, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, that's because of the bandwidth, the delay is going to increase to 7 seconds. This is extremely a bad switch, because you are switching at 7 seconds, 4.5 seconds of that 7 seconds is a stall, you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and then has very significant duplicate traffic because it can't really keep up with the switch. And then when switching to, from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, but you are not really introducing an extra traffic because the bandwidth is sufficient, and you know where GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too. Any questions so far? Yes.
Magnus Westerlund: I try to understand, so, if the limited bandwidth is four and a half megabits, and A is four, A is 2.5, B is four megabits.
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and the, the, the B is, B is 4, so it fits. If it's a Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future that you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be used, you've changed, the, the pipe is 4.5, you're changing from 4 to 3. There is no overlap, so never, bandwidth is never higher than 4.5. So, why that is the result?
Colin: Are you asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Colin: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is measuring? What, what, what is, um, the number that we see, the delay is?
Colin: The switching delay is by the time you make the switch request, and by the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think it's because that means there's, there's, there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I think, I mean, I thought about that too when I was trying to write up a switch, and that's why I added that, not only do you, when you switch even in hard mode, you switch forward, you cancel any queued data you have to send yet on the old track. And that'll get your delay down if you add that extra step.
Colin: Yeah, the, two points of attention. Subscriber is, if you switch immediately, uh, when the bandwidth is, uh, low, it's not completely realistic, because usually you check your, you check your buffer and see the way and then it makes switching, but, but this is something that we might, uh, improve. Second, second question is that, most of the interop, uh, the runner today is actually, uh, checking for the signaling handshakes and so on. At some point, we should also think about if we have, uh, a file transfer. Send media to the other side. Because some of the things that I was working on, work was streaming video. Um, but, I mean, at some point, not right now, figure out how to be get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: Um, I was going through draft-18, like I, like I hadn't actually read the drafts since 14, and, and I was just reading draft-18, right, and read it, and, you know, and I think I saw, I saw roughly counting somewhere between 50 and 70 sort of what I viewed as sort of features or functionality things in draft-18. Um, I think we're at the point as we're starting to get closer to working group last call, we should start being identifying all of the sort of features, and then somehow, and I don't, I like, I'm looking for your guidance on it, having a spreadsheet to check that we, you know, how many implementations of this do we have, or, you know, whatever, do they work, you know? Um, because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Colin: Yeah, and I want to think through how we write the tests that have the most value first, and slowly sequence those things in a way that makes sense, because I don't think we want to have somebody go off and spec 70 tests and then everybody's on the hook to implement 70 tests.
Jordi Cenzano: And, to make sure I understand correctly, the tests that are there today are primarily control plane tests, and the tool that I wrote in November is entirely data plane tests, right? It checks all of the features; there are 41 different tests against the different things you can do on the data plane. So trying to find a way to merge that might be good. Because under TLS on my interop, we found a lot of, "Oh, it works when the extensions bit is on, but not when it's off," and those kinds of little things. It's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. We're out of time. Next up is Colin, who's going to present his test results. Also note that at least three of you have not signed into Meetecho; please do so if you haven't already. If you're new, you may need a Datatracker account, which takes a little bit of time to get, but again, it's what you have to do.
Colin: Good morning. I am Colin from Ozyegin University, and these are some results from my students. This study was conducted just before the switch proposals came in, so that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get into it. The way we have done these studies is: we have subscriber-side track switching, more specifically a subscriber-initiated and relay-executed switch. And on the other hand, one version of DTS where the relay is actually doing the switch for us, so we delegate the track switching to the relay. In the first paradigm, subscriber-initiated track switching, we have three methods: Forward State to Link, so forward 0 or 1; the Joining Fetch approach; and the Switch message. This Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. Now, in the subscriber-side paradigm, the relay doesn't know the catalog, so it has no visibility into which tracks are related, and we need to let the relay know about this via a request. And the subscriber must manage the timing by itself, because it can't really know the positions of the tracks on the relay. So if we are going to switch from track A to track B, how they are aligned, whether there's some latency between them or not, the subscriber doesn't know that. On the other hand, if you go to the DTS approach, the relay again doesn't know which tracks are related, so you need to let the relay know about this. But the timing can now be managed by the relay, because it knows exactly where the group boundaries are and the positions of the individual tracks. Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so no frame stalling or frame repeating at all.
No gaps: this is also important, because especially when the tracks have some position issues, like one track being way ahead of the other, there could be some gaps. We call them skips, and this is irrecoverable missing content. You just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want any redundant traffic, and some of these methods will introduce redundant traffic. Especially when you are downshifting because of bandwidth issues, traffic duplication is really bad, and you usually want to avoid that. I need to tell you about the testbed first, because if you think of every scenario, there will be a lot of different scenarios, so we tried to grab the most important cases. We have two tracks here, track A and track B, and this is a simulated switch between them. Track A is a 720p, 2.5 megabits per second synthetic stream. Track B is a 1080p, so higher resolution, 4 megabits per second, higher bit rate synthetic video stream. GOPs are 25 frames and 1 second in duration. The jitter buffer on the player side is half a second, 500 milliseconds. We test A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at. First, the track alignment scenarios: whether the tracks are aligned on the relay at the time of the switch, or whether one track is ahead of the other. They could be in sync, or 1 second apart, or 2 seconds apart; 2 seconds apart means there are two GOPs between them. In the other scenario, we limit the bandwidth. So in the first case the bandwidth is not limited, and it's track alignment only. In the second case there is a bandwidth limitation, and we have chosen these bandwidths carefully.
4.5 means track B is going to make it, track A is going to make it, but if both of them are sent at the same time, they are not going to make it, right? It is less than the sum of both tracks. At 3 megabits per second, only track A is going to make it. And 7 megabits per second is sufficient for both tracks to be transmitted. So, different scenarios. The last scenario is downstream delay: what if the delay of your switch request, or from the relay to your subscriber, is significant? We are testing that as well, with half a second in this case. So, it's a baby, baby step iteration, but, you know, this is really too far, relay and that's where we are switching from. The metrics we are looking at are: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio, which we would really like to be as small as possible. This is a normalized value: 1 means one GOP of traffic has been redundant, 2 means twice the GOP duration of traffic has been redundant. So, Forward State to Link: the subscriber subscribes to both tracks, track A and B, right? At the switch time, we send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and in total this is three control messages. The risks are gaps or duplicate data, depending on how the tracks are aligned. Joining Fetch uses a fetch message to buffer the past GOPs. The subscriber first checks the largest object metadata to decide whether it needs the fetch or not. Depending on this, it takes three to five control messages. So there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message.
You say, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, at least in our implementation, is that if the relay cannot find an appropriate switch position, it will time out and then send an error message. So, this table describes what we have. In terms of number of control messages, delivery gap, redundant traffic, and well-aligned switching, Switch is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? Skip the right table for now and focus on the first table on the left. This is the track alignment only scenario, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the relay at the same time, the switching delay for Forward State to Link is about half a second. But that's also half a GOP of redundant traffic. If the target is 2 seconds ahead, so we are switching from A to B and B is 2 seconds ahead, then the delay is still comparable and the redundant traffic is the same, but obviously there will be skips. Right? Joining Fetch has similar delay and no excess traffic, but it skips 1 GOP. The Switch method has a bit less switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Again, bandwidth is not the issue here, so the client might be switching for another reason. And these are the results that we get from these three methods. Now, if we look at the right table, this is where the bandwidth is limited, but the tracks are in sync.
So, we are trying to fix some conditions while we vary the others. Remember, A is 2.5 megabits per second and B is 4 megabits per second. A to B is an upshift, and the bandwidth is 4.5. So it is easy to get A, it is also easy to get B, but you cannot get A plus B at the same time. Forward State to Link is going to have about half a second of switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, because of the bandwidth the delay is going to increase to 7 seconds. This is an extremely bad switch, because the switch takes 7 seconds, and 4.5 seconds of that 7 seconds is a stall; you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and has very significant duplicate traffic because it can't really keep up with the switch. And when switching from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, and you are not really introducing any extra traffic, because the bandwidth is sufficient and you know where the GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So that's really a bad condition for any of these methods, and there is excessive traffic there too. Any questions so far? Yes.
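The reasoning behind the three bandwidth caps, and the normalization of the excess traffic metric, can be checked with a few lines of arithmetic. This is a minimal sketch using the track bit rates and GOP duration stated in the talk; the names `Track`, `fits`, and `excess_traffic_ratio` are illustrative, and reading the ratio as "redundant seconds divided by GOP duration" is an assumption about how the metric was computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    name: str
    bitrate_mbps: float


TRACK_A = Track("A (720p)", 2.5)
TRACK_B = Track("B (1080p)", 4.0)
GOP_SECONDS = 1.0  # 25 frames per GOP, 1-second duration


def fits(link_mbps: float, *tracks: Track) -> bool:
    """True if the link can carry all of the given tracks at the same time."""
    return sum(t.bitrate_mbps for t in tracks) <= link_mbps


def excess_traffic_ratio(redundant_seconds: float,
                         gop_seconds: float = GOP_SECONDS) -> float:
    """Redundant media expressed in units of one GOP duration (assumed definition)."""
    return redundant_seconds / gop_seconds


for link in (3.0, 4.5, 7.0):
    print(f"{link} Mbps: A={fits(link, TRACK_A)} "
          f"B={fits(link, TRACK_B)} A+B={fits(link, TRACK_A, TRACK_B)}")
```

Under these numbers the 3 Mbps cap carries only track A, the 4.5 Mbps cap carries either track but not both, and the 7 Mbps cap carries both, which matches how the scenarios were described.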
Magnus Westerlund: I'm trying to understand. So, the limited bandwidth is four and a half megabits, and A is 2.5, B is four megabits?
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and B is 4, so it fits. If it's Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future where you say, "Hey, now I want track A." I don't understand why there is a stall and why there is a delay, because the pipe is 4.5 and you're changing from 4 to 3. There is no overlap, so the bandwidth is never higher than 4.5. So why is that the result?
Colin: Are you asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Colin: Let's see. Because you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is it measuring? What is the number that we see, the delay?
Colin: The switching delay is from the time you make the switch request to the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
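The measurement described in this exchange, from the switch request until object 0 of the desired track plays, can be sketched as a small timer. The class `SwitchTimer` and its method names are hypothetical, and timestamps are plain seconds supplied by the caller rather than taken from any real player API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchTimer:
    target_track: str
    requested_at: float                # time the switch request was made
    played_at: Optional[float] = None  # time object 0 of the target first played

    def on_object_played(self, track: str, object_id: int, now: float) -> None:
        # Only the first object 0 on the target track stops the clock.
        if self.played_at is None and track == self.target_track and object_id == 0:
            self.played_at = now

    @property
    def switching_delay(self) -> Optional[float]:
        if self.played_at is None:
            return None  # the switch has not completed yet
        return self.played_at - self.requested_at


timer = SwitchTimer(target_track="A", requested_at=10.0)
timer.on_object_played("B", 12, now=10.2)  # old track still playing
timer.on_object_played("A", 7, now=10.4)   # mid-group object: not counted
timer.on_object_played("A", 0, now=10.5)   # first object of a group on A
print(timer.switching_delay)
```

Anything still queued for the old track pushes `played_at` later, so with this definition a backlog at the relay shows up directly as switching delay.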
Magnus Westerlund: Okay, I think that means there's data queued at the relay to send, and when you switch the forward state, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I thought about that too when I was trying to write up a switch, and that's why I added that when you switch, even in hard mode, you not only switch forward, you also cancel any queued data you have yet to send on the old track. And that'll get your delay down if you add that extra step.
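The effect of that extra step can be illustrated with a toy model of the relay's send queue. This is a sketch under stated assumptions, not the testbed's relay: `RelaySendQueue` is a hypothetical class, the two-GOP backlog is an invented number for illustration, and the model ignores priorities, pacing, and anything already handed to the transport.

```python
from collections import deque


class RelaySendQueue:
    def __init__(self, link_mbps: float) -> None:
        self.link_mbps = link_mbps
        self.queue = deque()  # entries are (track_name, megabits)

    def enqueue(self, track: str, megabits: float) -> None:
        self.queue.append((track, megabits))

    def switch(self, old_track: str, cancel_queued: bool) -> None:
        """Stop forwarding old_track; optionally drop what is still queued for it."""
        if cancel_queued:
            self.queue = deque(item for item in self.queue if item[0] != old_track)

    def seconds_until_idle(self) -> float:
        """Time to drain the backlog before newly queued data can go out."""
        return sum(megabits for _, megabits in self.queue) / self.link_mbps


for cancel in (False, True):
    relay = RelaySendQueue(link_mbps=4.5)
    relay.enqueue("B", 4.0)  # hypothetical backlog: one 1-second GOP of track B
    relay.enqueue("B", 4.0)  # and a second one
    relay.switch("B", cancel_queued=cancel)
    print(f"cancel_queued={cancel}: {relay.seconds_until_idle():.2f} s before track A starts")
```

Without cancellation the new track waits behind the whole backlog; with it, the wait in this model drops to zero, which is the direction of improvement suggested in the discussion.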
Colin: Yeah, two points of attention. First, having the subscriber switch immediately when the bandwidth is low is not completely realistic, because usually you check your buffer first and then make the switch, but this is something that we might improve. Second, the interop runner today is mostly checking the signaling handshakes and so on. At some point, we should also think about whether we have a file transfer: send media to the other side. Because some of the things that I was working on were streaming video. But, I mean, at some point, not right now, figure out how to get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: I was going through draft-18. I hadn't actually read the drafts since 14, and I was just reading draft-18, and I think I saw, roughly counting, somewhere between 50 and 70 of what I viewed as features or functionality things in draft-18. I think we're at the point, as we're starting to get closer to working group last call, where we should start identifying all of the features, and then somehow, and I'm looking for your guidance on it, having a spreadsheet to check how many implementations of each we have, and whether they work. Because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Um, yeah, and let me actually, dig in a little to that. So, these tests, uh, with they and running, you know, if we zoom into one of these tests, um, these self-tests, uh, this, this is post-ends, super implementation, so we get, you know, a couple tests that passed against, uh, the Lorenzo's implementation, for example. Um, so we're able to exchange setup, but this test that's called "publish namespace only" fails, um, and this test that has "subscriber." So, the idea is that there are these specified test cases, um, and these are specified in the, repo here. So, in docs, uh, tests, this test cases file, um, has each of the tests that's implemented, specified. So, there's, you know, different features and, you know, how you actually exercise that systematically. Um, so the idea is that we would, we would write down like, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm, you know, it says, "I'm going to send this and expect that." And so, right now we only have a handful of tests. Um, that was to kind of incentivize people to implement test clients and kind of get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases to start to flesh out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: I want to make sure, yeah, I just, I got in to be answering your questions, I think that the draft-18 for Vienna is absolutely the right call. I think, since we started going to draft-18, I feel like it's, it's gone better for two reasons. One, the interop that we got on 18 was like, to this point, was kind of abysmal. And I think in our second go at it, it will be much more complete, because people got their aspirations in. They might make over line and, you know, know what they need to do, they'll keep chasing it. And it also sort of gives us a draft where we can put some stuff in that we think we want, and then some people who are very aggressive, they go out and implement those odd drafts, learn something about, "Oh, actually this thing probably won't work," and then you get a chance to pull it out before everybody's committed to it. Um, I'm kind of thinking along those lines for some of the stuff yesterday as well. Let's see if we can get it in draft-18, get some early feedback and, you know, don't have to necessarily fully commit. Um, so I, I think that's a good target, and I think on the "how can we make interop runner better," I definitely think that what Colin is going is like, yeah, we need to get from these five tests, we need to get to 70 like, or whatever, like, and I don't know, like, that's a lot of, like, given that we're all out implementing the spec, like, I'm not sure how many people want to, like, sit and think about what those 70 test cases look like, and then everybody has to write their interop client also to get those 70 tests.
Colin: Yeah, and I want to think through, you know, how do we write the tests that have the most value first, and kind of slowly sequence those things in a way that like, makes sense, because I don't think we want to like have somebody go off and spec 70 tests and then everybody's like on the hook to implement 70 tests.
Jordi Cenzano: And I guess I also, so, to make sure I understand correctly, the test that is there today are primarily control plane tests, and like the tool that I wrote in like November is like, entirely data plane tests, right, where it checks all of the features, like, there's 41 different tests against like the different things you can do on the data plane. So, I mean, trying to find a way to merge that might be good. Because like we found like under TLS on my interop, we found a lot of like, "Oh, like, it works when the extensions bit is on, but not when it's off," and like, those kinds of little, it's only one byte, but, no, I mean, I'm not throwing anybody under the bus. I'm just saying like, those, there's a lot of, there's a lot of bits and bobs in that as well, that if you haven't tested, you probably don't.
Alan Frindell: Thanks, Ali. Um, we're out of time. Next up is Colin, who's going to present his, test results. Also note that at least three of you have not signed into Meetecho, please do so if you're not already. If you're new, you may need a Datatracker account, which is, which takes a little bit of time to get, but it's, again, it's what you do, what you do to do it.
Colin: Good morning. Um, I am Colin from Ozyegin University. And there are some results from my students. And, uh, this, this study was conducted just before, uh, switch from proposals came in. So, that's not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get input. So, the way we are, we have done these studies is, um, so we have subscriber-side track switching, uh, and, you know, more specifically, this subscriber-initiated and relay-executed switch. And then on the other hand, one version of DTS where the relay is actually doing the switch for us. So, we delegate the relay to do the track switching for us. In the first paradigm, subscriber-initiated track switching, we have three methods. Forward State to Link, so forward 0 or 1. Joining Fetch approach, and the Switch message. And this Switch message is pretty close to what Gwendal has in his draft, but not identical. But the methodology is more or less the same, as something we have shown before. In this case, now, on the subscriber-side paradigm, the relay actually, since it doesn't know the catalog, it has no visibility into what tracks are related, so we need to let the relay know about this via request. And the subscriber must manage the timing by itself, because it can't really know, know the positions of the tracks on the relay. So, if we are going to switch from track A to track B, you know, how they are aligned, you know, whether there's some latency between them or not, the subscriber know that. On the other hand, if you go to the DTS approach, so the relay actually again doesn't know which tracks are related, so you need to let the relay know about this. But the timing now can be managed by the relay, because it knows exactly, you know, where the group boundaries are and the positions of individual tracks. Now, we would like to have a seamless switch, and by seamless switch we mean no stalls during the switch, so fully no frame storing, frame repeating. 
No gaps, this is also important, because especially when the tracks are actually, they have some position issues like one track is way ahead of another track, there could be some gaps. We call them skips, and this is irrecoverable missing content. So, you just jump one or two frames, and this is undesirable as well. And lastly, no wasted data. We don't want to have any redundant traffic, because some of these methods will introduce redundant traffic as well. Especially when you are downshifting because of some bandwidth issues, you know, traffic duplication is really bad, and you usually want to avoid that. So, I need to tell you about the testbed first, because if you think of every scenario, there will be really a lot of different scenarios. So, we tried to grab the most important cases. So, we have two tracks here, so this is a simulated switch, track A and B. And we do switch between track A and B. Track A is 720p, 2.5 megabits per second, synthetic stream. And track B is 1080p, so high resolution, 4 megabit, higher bit rate synthetic video stream. GOPs are 25 frames, and 1-second duration. And the jitter buffer on the player side is half a second, 500 milliseconds. So, we test with A to B, B to A, and different types of positions in all of the different tests. Now, the scenarios we are looking at: first, the track alignment scenarios. In this case, whether the tracks are aligned on the relay at the time of switch, or whether one track is ahead of the other track. It could be they are in sync, or 1 second apart, or 2 seconds apart. 2 seconds apart means like there are two GOPs between them. And in the other scenario, we limit the bandwidth. So, in the first case, the bandwidth is not limited, so it's just a track alignment only. In the second case, there is a bandwidth limitation, and we have chosen these bandwidths carefully. 
4.5 means, you know, track B is going to make it, track A is going to make it, but both of them, if they are sent both of them at the same time, they are not going to make it, right? It is less than the summation of both tracks. 3 megabits per second, only track A is going to make it. And 7 megabits per second is something sufficient for both tracks to be transmitted. So, different scenarios. And the last scenario is downstream delay. So, if your switch request, or somehow, you know, from relay to your subscriber, the delay is significant. We are testing that as well, so half a, half a second in this case. So, it's a baby, baby step iteration, but, you know, this is really too far, relay and that's where we are switching from. And the metrics we are looking at: switching delay, so how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio. And we really would like to have this as small as possible. And this is a normalized value. 1 means one GOP of traffic has been redundant. 2 means twice the GOP duration traffic has been redundant. So, Forward State to Link, subscriber subscribes to both tracks, track A and B, right? At the switch time, we are going to send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and totally this is three control messages. The risks are gaps or duplicate data, depending on where how the tracks are aligned. Joining Fetch is using a fetch message to buffer of the past GOPs. Subscriber first checks for largest object metadata to decide whether it is the fetch or not. And depending on this, three to five control messages. So, there is more control traffic in this case, and there are some corner cases where this completely fails. And the Switch message is the atomic message. 
You tell, "I want to switch from this track to this track," and the relay finds the appropriate group, and it's exactly one control message. The idea is to zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, atleast in our implementation, is if the relay cannot find an appropriate switch position, it will timeout, and then it will send an error message. So, this table is describing what we have. And in terms of number of control messages, delivery gap, redundant traffic, and well-aligned switch, is clearly the winner here. Now, let's look at some numbers. I haven't included all the results, so just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay? So, on the left-hand side, skip the right table for now, just focus on the first table on the left. This is just a track alignment scenario and bandwidth is not limited, so we have unlimited bandwidth. When the tracks are in sync, so they arrive at the same time at the relay, the switching delay for Forward State to Link is about half a second. But that's also half a GOP, redundant traffic. If, if there is 2 seconds, the target is 2 seconds ahead, we are switching from A to B, then B is 2 seconds ahead, then, you know, the delay is still comparable, and then the redundant traffic is the same, but then obviously there will be skips. Right? Joining Fetch has similar delay, but no excess traffic, but it skips the 1 GOP. And the Switch method has a bit less delay, switching delay, and far less switching delay in the second case because the data is already there, and no redundant traffic. Now, this is again, bandwidth is not the issue, so client might be switching for another purpose, for another reason. And these are the results that we get from these three methods. Now, if we look at, at the right table, this is where the bandwidth is limited, but the tracks are in sync. 
So, we are trying to fix the other conditions while we are playing the others. So, A to B, remember, A is 2.5 megabits per second, and B is 4 megabits per second. And A to B is an upshift, and the bandwidth is 4.5. So, it is easy, it is also easy to get B, but you cannot get A plus B at the same time. So, Forward State to Link is going to have about half a second switching delay, and that much redundant traffic. But when you switch from B to A, although they are in sync, that's because of the bandwidth, the delay is going to increase to 7 seconds. This is extremely a bad switch, because you are switching at 7 seconds, 4.5 seconds of that 7 seconds is a stall, you are missing the other content, and you are just stuck. Joining Fetch also has a very large delay, almost 5 seconds, and then has very significant duplicate traffic because it can't really keep up with the switch. And then when switching to, from B to A, so you are downshifting at a lower bandwidth, again the delay is significant. The Switch message has a good delay when you are switching from A to B, so you are upshifting and the bandwidth is sufficient, but you are not really introducing an extra traffic because the bandwidth is sufficient, and you know where GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. So, that's really a bad condition for any of these scenarios, and there is excessive traffic there too. Any questions so far? Yes.
Magnus Westerlund: I try to understand, so, if the limited bandwidth is four and a half megabits, and A is four, A is 2.5, B is four megabits.
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Colin: Yeah, so basically, the pipe is 4.5, and the, the, the B is, B is 4, so it fits. If it's a Forward State to Link, if I understand properly, there is no bandwidth overlap. Because you are downloading four, and then there is a point in the future that you say, "Hey, now, I want track A." I don't understand why there is a stall and why there is a delay, because that bandwidth should be used, you've changed, the, the pipe is 4.5, you're changing from 4 to 3. There is no overlap, so never, bandwidth is never higher than 4.5. So, why that is the result?
Colin: Asking about the right top corner, right?
Magnus Westerlund: B to A, yes.
Colin: Uh, let's see. Because you're, you're switching from a high bit rate, and can't complete it at the 3 megabit per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the subscriber, the bandwidth is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is measuring? What, what, what is, um, the number that we see, the delay is?
Colin: The switching delay is by the time you make the switch request, and by the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think it's because that means there's, there's, there's data queued at the relay to send, and when you switch the forward, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: So, I think, I mean, I thought about that too when I was trying to write up a switch, and that's why I added that, not only do you, when you switch even in hard mode, you switch forward, you cancel any queued data you have to send yet on the old track. And that'll get your delay down if you add that extra step.
Colin: Yeah, the, two points of attention. Subscriber is, if you switch immediately, uh, when the bandwidth is, uh, low, it's not completely realistic, because usually you check your, you check your buffer and see the way and then it makes switching, but, but this is something that we might, uh, improve. Second, second question is that, most of the interop, uh, the runner today is actually, uh, checking for the signaling handshakes and so on. At some point, we should also think about if we have, uh, a file transfer. Send media to the other side. Because some of the things that I was working on, work was streaming video. Um, but, I mean, at some point, not right now, figure out how to be get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: I was going through draft-18. I hadn't actually read the drafts since 14, and reading draft-18, I counted roughly somewhere between 50 and 70 of what I would view as features or pieces of functionality. As we get closer to working group last call, I think we should start identifying all of those features and then somehow, and I'm looking for your guidance on this, have a spreadsheet to check how many implementations of each we have and whether they work. Because I think a lot of our testing here is like, "Yeah, basic stuff worked."
Colin: Yeah, and let me actually dig in a little to that. If we zoom into one of these test runs [unclear], we get a couple of tests that passed against Lorenzo's implementation, for example. So we're able to exchange setup, but this test that's called "publish namespace only" fails, and this test that has "subscriber" as well. The idea is that there are these specified test cases, and they are specified in the repo here. In docs, tests, this test cases file has each of the implemented tests specified: the different features and how you actually exercise them systematically. So the idea is that we would write down, "Okay, here's this feature. Here's how we want to test that it's implemented." And the test client implements this algorithm; it says, "I'm going to send this and expect that." Right now we only have a handful of tests. That was to incentivize people to implement test clients and get the basic framework in place first. I think we're now at the point where we want to expand the number of test cases, to start to flesh it out and say, "Okay, what other feature coverage do we have?"
Jordi Cenzano: To answer your questions: I think that draft-18 for Vienna is absolutely the right call. Since we started targeting draft-18, I feel like it's gone better, for two reasons. One, the interop that we got on 18 was, to this point, kind of abysmal, and I think in our second go at it, it will be much more complete, because people got their aspirations in [unclear], they know what they need to do, and they'll keep chasing it. Two, it gives us a draft where we can put in some stuff that we think we want; then some people who are very aggressive go out and implement those odd drafts, learn something like, "Oh, actually this thing probably won't work," and you get a chance to pull it out before everybody's committed to it. I'm thinking along those lines for some of the stuff from yesterday as well: let's see if we can get it into draft-18, get some early feedback, and not necessarily fully commit. So I think that's a good target. And on "how can we make the interop runner better," I definitely think that where Colin is going is right: we need to get from these five tests to 70, or whatever the number is. But that's a lot. Given that we're all out implementing the spec, I'm not sure how many people want to sit and think about what those 70 test cases look like, and then everybody also has to write their interop client to get through those 70 tests.
Colin: Yeah, and I want to think through how we write the tests that have the most value first, and slowly sequence them in a way that makes sense, because I don't think we want somebody to go off and spec 70 tests and then have everybody on the hook to implement 70 tests.
Jordi Cenzano: And to make sure I understand correctly: the tests that are there today are primarily control plane tests, and the tool that I wrote in November is entirely data plane tests. It has 41 different tests against the different things you can do on the data plane. So trying to find a way to merge those might be good. Because under TLS on my interop, we found a lot of, "Oh, it works when the extensions bit is on, but not when it's off," and it's only one byte. I'm not throwing anybody under the bus. I'm just saying there are a lot of bits and bobs in there as well, and if you haven't tested them, they probably don't work.
Alan Frindell: Thanks, Ali. We're out of time. Next up is Colin, who's going to present his test results. Also note that at least three of you have not signed into Meetecho; please do so if you haven't already. If you're new, you may need a Datatracker account, which takes a little bit of time to get, but that's what you need to do.
Colin: Good morning. I am Colin from Ozyegin University, and these are some results from my students. This study was conducted just before the switch proposals came in, so those are not really included here. Can you go to presentation mode? Better? Yes, thank you. So, let's get into it.
The way we have done these studies is as follows. On one hand, we have subscriber-side track switching, or more specifically, subscriber-initiated and relay-executed switching. On the other hand, there is one version of DTS where the relay is actually doing the switch for us, so we delegate the track switching to the relay.
In the first paradigm, subscriber-initiated track switching, we have three methods: Forward State to Link, so forward 0 or 1; the Joining Fetch approach; and the Switch message. This Switch message is pretty close to what Gwendal has in his draft, but not identical. The methodology is more or less the same as what we have shown before.
In the subscriber-side paradigm, the relay doesn't know the catalog, so it has no visibility into which tracks are related, and we need to let the relay know about this via a request. The subscriber must also manage the timing by itself, even though it can't really know the positions of the tracks on the relay. If we are going to switch from track A to track B, the subscriber doesn't know how they are aligned or whether there is some latency between them.
In the DTS approach, the relay again doesn't know which tracks are related, so you need to let it know. But the timing can now be managed by the relay, because it knows exactly where the group boundaries are and the positions of the individual tracks.
Now, we would like to have a seamless switch, and by seamless we mean: no stalls during the switch, so no frame stalling and no frame repeating.
No gaps; this is also important, because when the tracks have position issues, for example one track is way ahead of the other, there could be some gaps. We call them skips, and this is irrecoverable missing content: you just jump one or two frames, which is undesirable as well. And lastly, no wasted data. We don't want any redundant traffic, and some of these methods do introduce redundant traffic. Especially when you are downshifting because of bandwidth issues, traffic duplication is really bad, and you usually want to avoid it.
I need to tell you about the testbed first, because if you consider every scenario, there are really a lot of them, so we tried to pick the most important cases. We have two tracks here, A and B, in a simulated setup, and we switch between them. Track A is a 720p, 2.5 megabits per second synthetic stream. Track B is a 1080p, 4 megabits per second synthetic video stream, so higher resolution and higher bit rate. GOPs are 25 frames and 1 second in duration. The jitter buffer on the player side is half a second, 500 milliseconds. We test A to B, B to A, and different relative positions in all of the tests.
Now, the scenarios we are looking at. First, the track alignment scenarios: whether the tracks are aligned on the relay at the time of the switch, or whether one track is ahead of the other. They could be in sync, 1 second apart, or 2 seconds apart; 2 seconds apart means there are two GOPs between them. In this first case the bandwidth is not limited, so it is track alignment only. In the second scenario, there is a bandwidth limitation, and we have chosen these bandwidths carefully.
4.5 megabits per second means track B alone is going to make it and track A alone is going to make it, but if both are sent at the same time, they are not going to make it, because 4.5 is less than the sum of the two tracks. At 3 megabits per second, only track A is going to make it. And 7 megabits per second is sufficient for both tracks to be transmitted. So, different scenarios.
The last scenario is downstream delay: the case where the delay from the relay to the subscriber is significant. We test that as well, with half a second in this case. It's a baby-step iteration, but this represents a relay that is really far away, and that's where we are switching from.
The metrics we are looking at are: switching delay, how quickly I can switch; stall duration, whether I am stalled during the switch; skip duration, whether I skip any content; and finally, average excess traffic ratio, which we really would like to be as small as possible. This is a normalized value: 1 means one GOP of traffic has been redundant, and 2 means twice the GOP duration of traffic has been redundant.
So, Forward State to Link: the subscriber subscribes to both tracks, A and B. At switch time, we send two request updates, 1 and 0, to switch on the relay side. There is no coordination, and in total this is three control messages. The risks are gaps or duplicate data, depending on how the tracks are aligned.
Joining Fetch uses a fetch message to buffer the past GOPs. The subscriber first checks the largest-object metadata to decide whether it needs the fetch or not. Depending on this, it takes three to five control messages, so there is more control traffic in this case, and there are some corner cases where this completely fails.
And the Switch message is the atomic message.
You say, "I want to switch from this track to this track," the relay finds the appropriate group, and it's exactly one control message. The idea is zero gap, zero overlap, zero redundant traffic. But one problem with the Switch message, at least in our implementation, is that if the relay cannot find an appropriate switch position, it will time out and then send an error message.
This table describes what we have. In terms of number of control messages, delivery gap, redundant traffic, and well-aligned switching, the Switch message is clearly the winner here.
Now, let's look at some numbers. I haven't included all the results, just a few of them to simplify this presentation, but if you don't understand anything, just stop me, okay?
Skip the right table for now and just focus on the first table on the left. This is the track alignment scenario, with unlimited bandwidth. When the tracks are in sync, so they arrive at the relay at the same time, the switching delay for Forward State to Link is about half a second, but there is also half a GOP of redundant traffic. If the target is 2 seconds ahead, that is, we are switching from A to B and B is 2 seconds ahead, then the delay is still comparable and the redundant traffic is the same, but obviously there will be skips. Joining Fetch has similar delay and no excess traffic, but it skips 1 GOP. The Switch method has a bit less switching delay, and far less in the second case because the data is already there, with no redundant traffic. Again, bandwidth is not the issue here, so the client might be switching for another reason. These are the results we get from the three methods.
Now, if we look at the right table, this is where the bandwidth is limited, but the tracks are in sync.
So, we are trying to fix some conditions while we vary the others. Remember, A is 2.5 megabits per second and B is 4 megabits per second. A to B is an upshift, and the bandwidth is 4.5, so it is easy to get B, but you cannot get A plus B at the same time. Forward State to Link is going to have about half a second of switching delay, and that much redundant traffic. But when you switch from B to A, although the tracks are in sync, because of the bandwidth the delay increases to 7 seconds. This is an extremely bad switch: of those 7 seconds, 4.5 seconds is a stall, so you are missing content and you are just stuck.
Joining Fetch also has a very large delay, almost 5 seconds, and very significant duplicate traffic, because it can't really keep up with the switch. And when switching from B to A, so downshifting at a lower bandwidth, again the delay is significant.
The Switch message has a good delay when you are switching from A to B, so upshifting with sufficient bandwidth, and you are not introducing extra traffic, because the bandwidth is sufficient and you know where the GOP boundaries are. But when you are downshifting at 3 megabits per second, there is also a delay. That is really a bad condition for any of these methods, and there is excess traffic there too. Any questions so far? Yes.
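The bandwidth caps in the talk can be checked with simple arithmetic. The sketch below is illustrative only: the track bitrates, the three caps, and the 1-second GOP are the values quoted in the presentation, while the function names are hypothetical and not taken from the presenter's testbed. The normalization of the excess traffic metric is an assumption: it treats the ratio as redundant media seconds divided by the GOP duration, which matches the statement that 1 means one GOP of redundant traffic.

```python
# Values quoted in the talk.
TRACK_MBPS = {"A": 2.5, "B": 4.0}   # track A: 720p, track B: 1080p
CAPS_MBPS = [3.0, 4.5, 7.0]         # bandwidth limits used in the tests
GOP_SECONDS = 1.0                   # 25 frames, 1 second


def tracks_that_fit(cap_mbps):
    """Return (tracks that fit alone, whether both fit at the same time)."""
    singles = sorted(name for name, rate in TRACK_MBPS.items() if rate <= cap_mbps)
    both = sum(TRACK_MBPS.values()) <= cap_mbps
    return singles, both


def excess_traffic_ratio(redundant_media_seconds, gop_seconds=GOP_SECONDS):
    """Assumed normalization: redundant media duration expressed in GOPs."""
    return redundant_media_seconds / gop_seconds


if __name__ == "__main__":
    for cap in CAPS_MBPS:
        singles, both = tracks_that_fit(cap)
        print(f"cap {cap} Mbps: fits alone {singles}, both at once: {both}")
    # Half a GOP of duplicated media gives a ratio of 0.5.
    print(excess_traffic_ratio(0.5))
```

At 4.5 Mbps each track fits on its own but the 6.5 Mbps sum does not, which is why any method that briefly carries both tracks is penalized in that scenario.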
Magnus Westerlund: I'm trying to understand. So, the limited bandwidth is four and a half megabits, A is 2.5, and B is 4 megabits?
Colin: A is 2.5, B is 4 megabits.
Magnus Westerlund: Okay, so, what I'm talking about is the B to A delay of 4.5. So, from the relay to the client, the subscriber, the bandwidth is 4.5, and the track bandwidths are 2.5 and 4.
Magnus Westerlund: So basically, the pipe is 4.5 and B is 4, so it fits. With Forward State to Link, if I understand properly, there is no bandwidth overlap, because you are downloading 4, and then at some point in the future you say, "Hey, now I want track A." I don't understand why there is a stall and why there is a delay, because the pipe is 4.5 and you're changing from 4 to 3. There is no overlap, so the bandwidth is never higher than 4.5. So why is that the result?
Colin: You are asking about the top-right corner, right?
Magnus Westerlund: B to A, yes.
Colin: Let's see. Because you're switching from a high bit rate, and it can't complete at the 3 megabits per second bandwidth.
Magnus Westerlund: But the pipe is four and a half.
Colin: I am talking about B to A, where the bandwidth to the subscriber is 4.5.
Magnus Westerlund: And the track bandwidths are 2.5 and 4.
Colin: Yes. So, basically, the pipe is 4.5, and the, the B is 4. So, it fits.
Magnus Westerlund: Yes, but if you're switching to A, which is 2.5.
Colin: Yes.
Magnus Westerlund: Why does it take 7 seconds?
Colin: Because it's trying to finish the segment it was sending.
Magnus Westerlund: Yes.
Colin: Till it can send the next one.
Magnus Westerlund: What is it measuring? What is the number that we see, the delay?
Colin: The switching delay is from the time you make the switch request to the time the new track plays.
Magnus Westerlund: What are you trying to play? The first object of the track? You have to see the first byte of the desired track?
Colin: Correct. It has to be object 0.
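The delay definition given in this exchange can be written down as a small measurement helper. This is a minimal sketch under stated assumptions: the `PlayedObject` record, its field names, and the player-clock timestamps are hypothetical and not part of the presenter's tooling. It only encodes the rule stated above, that the clock starts at the switch request and stops when object 0 of the new track plays.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PlayedObject:
    """One object as seen by the player (hypothetical record layout)."""
    track: str
    group_id: int
    object_id: int
    played_at: float  # seconds on the player clock


def switching_delay(request_time: float,
                    played: Iterable[PlayedObject],
                    new_track: str) -> Optional[float]:
    """Seconds from the switch request until object 0 of the new track plays.

    Returns None if no such object is played after the request.
    """
    for obj in sorted(played, key=lambda o: o.played_at):
        if (obj.track == new_track
                and obj.object_id == 0
                and obj.played_at >= request_time):
            return obj.played_at - request_time
    return None


if __name__ == "__main__":
    log = [
        PlayedObject("B", 5, 3, 10.1),   # old track still playing
        PlayedObject("A", 6, 1, 10.2),   # new track, but not object 0
        PlayedObject("A", 7, 0, 10.5),   # first object 0 on the new track
    ]
    print(switching_delay(10.0, log, new_track="A"))
```

Objects from the new track that arrive mid-group are ignored here because, as stated in the answer, the switch only counts once object 0 is available.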
Magnus Westerlund: Yes.
Colin: Okay.
Magnus Westerlund: Okay, I think that means there's data queued at the relay to send, and when you switch the forward state, it doesn't affect any of that queued data.
Colin: Yes.
Magnus Westerlund: I thought about that too when I was trying to write up a switch, and that's why I added that when you switch, even in hard mode, you not only switch forward, you also cancel any queued data you have yet to send on the old track. That'll get your delay down if you add that extra step.
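The effect of that extra step can be illustrated with a toy model of a relay send queue. This is a sketch under stated assumptions, not a description of any implementation discussed in the session: the queue contents and sizes are invented for illustration, and the function name is hypothetical. It assumes a single FIFO drained at the link rate.

```python
from collections import deque


def time_to_first_new_object(queue, link_mbps, old_track, cancel_old):
    """Seconds until the first new-track object starts leaving the relay.

    queue: iterable of (track, size_megabits) in send order.
    cancel_old: if True, queued old-track data is dropped at switch time.
    Returns None if the queue holds no new-track object.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    pending = deque(
        item for item in queue
        if not (cancel_old and item[0] == old_track)
    )
    elapsed = 0.0
    for track, size_megabits in pending:
        if track != old_track:
            return elapsed
        # Old-track data ahead in the queue must drain first.
        elapsed += size_megabits / link_mbps
    return None


if __name__ == "__main__":
    # Hypothetical backlog: 9 megabits of track B queued ahead of track A.
    backlog = [("B", 4.0), ("B", 4.0), ("B", 1.0), ("A", 2.5)]
    keep = time_to_first_new_object(backlog, 4.5, old_track="B", cancel_old=False)
    drop = time_to_first_new_object(backlog, 4.5, old_track="B", cancel_old=True)
    print(f"without cancel: {keep:.2f} s, with cancel: {drop:.2f} s")
```

In this model the wait grows linearly with the queued backlog of the old track, so a forward-state change that leaves the queue untouched pays the full drain time before the new track can start.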
Colin: Yeah, two points. First, on the subscriber side, switching immediately when the bandwidth is low is not completely realistic, because usually you check your buffer, wait, and then make the switch; but this is something that we might improve. Second, most of the interop runner today is actually checking the signaling handshakes and so on. At some point, we should also think about having a file transfer, sending media to the other side, because some of the things I was working on were streaming video. But at some point, not right now, we should figure out how to get some of this.
Magnus Westerlund: Right. Yeah. So, the long-term goal would be to actually have a way to test, you know, further up the stack to media that's going from one implementation to another, or through a relay and back into a client.
Jordi Cenzano: See, in my opinion, the subscriber doesn't have the option to unsubscribe from the old one until it sees the first frame of the new one, because of the delay. That's why you get the overlap on that chart.
Suhas Nandakumar: Yes, but is it possible for the client to send a join-fetch request, get the backlog data on the new track, and then send an unsubscribe to the old track? That way you minimize the overlap.
Jordi Cenzano: That's true. But you're still relying on the client to manage the overlap. With the Switch message, the relay manages it.
Cullen Jennings: If there are no more questions on this topic, let's move on. Ali, you're up next.
Ali Begen: Thank you. So, my second set of slides is about the draft on the DTS, the Decentralized Tracking System, which we also have some results for.
In DTS, we have the relay doing the track switching on behalf of the subscriber. The subscriber sends a request to the relay, saying "I want to receive the stream from participant X," and the relay translates that into subscriptions for the appropriate tracks. The subscriber doesn't need to know the names of the individual tracks, only the name of the participant.
This simplifies the subscriber's logic significantly, as they don't have to manage multiple subscriptions or track the timing of the switches. The relay handles all of that, ensuring that the switches occur on group boundaries and that there is no data wasted or skipped.
We evaluated this approach using the same testbed as before, with the same track alignment, bandwidth limitation, and downstream delay scenarios.
As you can see from the results on the slide, the switching delay for the relay-managed switch is consistently low, around half a second, regardless of whether it's an upshift or a downshift, and regardless of the track alignment.
More importantly, there is zero stall duration, zero skip duration, and zero excess traffic in all scenarios. Because the relay manages the switch, it can ensure that it always occurs on a group boundary, without any data overlap or gap.
Even in the bandwidth-constrained scenarios, the relay can manage the switch effectively, without causing any stalls or duplicate traffic. This is because the relay can stop sending the old track before starting to send the new track, at exactly the right moment.
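The group-boundary behavior described here can be sketched as a simple forwarding rule. This is an illustration under assumptions that the talk does not state: it assumes both tracks number their groups identically for the same media time, and the `SwitchPlan` type and function names are hypothetical. The relay finishes the old track through the group currently in flight and starts the new track at the next group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchPlan:
    last_old_group: int    # final group forwarded from the old track
    first_new_group: int   # first group forwarded from the new track


def plan_boundary_switch(old_group_in_flight: int) -> SwitchPlan:
    """Switch at the next group boundary: no overlap and no gap in group IDs."""
    return SwitchPlan(old_group_in_flight, old_group_in_flight + 1)


def forward(objects, old_track, new_track, plan):
    """Keep only the (track, group, object) tuples the relay would forward."""
    kept = []
    for track, group, obj in objects:
        if track == old_track and group <= plan.last_old_group:
            kept.append((track, group, obj))
        elif track == new_track and group >= plan.first_new_group:
            kept.append((track, group, obj))
    return kept


if __name__ == "__main__":
    # Both tracks available at the relay for groups 4..6, two objects each.
    available = [(t, g, o) for g in (4, 5, 6) for t in ("A", "B") for o in (0, 1)]
    plan = plan_boundary_switch(old_group_in_flight=5)
    sent = forward(available, old_track="A", new_track="B", plan=plan)
    print(sorted({(g, t) for t, g, _ in sent}))
```

Each group is forwarded from exactly one track, which corresponds to the zero-skip, zero-excess-traffic result reported above when the assumption of aligned group numbering holds.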
So, in conclusion, the relay-managed approach in DTS provides a much more robust and efficient way to handle track switching, especially in challenging network conditions. It simplifies the subscriber's implementation and ensures a much better user experience.
Any questions on DTS?
Cullen Jennings: If there are no more questions, thank you, Ali. That was very interesting. Next, we have the MOQ editors who will present their updates.