Session Date/Time: 14 Apr 2026 16:45
WEBVTT
1 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:00:03.819 --> 00:00:23.149 Here we go. So this is the AI Preferences working group meeting. If you're not here for AI preferences, you've made some bad travel choices. Travel choices? Well, and other choices perhaps. So we have an overview first. We'll kick off with the Note Well. For folks who are
2 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:00:23.149 --> 00:00:43.149 familiar with the IETF, this should be very familiar. For other folks, please take a look. This is what we call the Note Well. It's the policies under which we operate in the IETF regarding things like professional behavior, harassment, intellectual property, which is interesting to some folks here, and generally our process, and also including
3 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:00:43.149 --> 00:01:03.149 privacy, which informs the fact that we're recording the meeting. So if you're not familiar with this, please do take a look. At your favorite internet search engine you can search for "IETF Note Well" and find out more about this. We do take it seriously. And if you have any questions or any concerns, you can talk to
4 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:01:03.149 --> 00:01:12.030 Suresh and I as the chairs, and our, I guess, area director. Mike is here. You can also talk to Mike.
5 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:01:12.030 --> 00:01:32.030 So the agenda for this week: we're gonna do a brief overview. We're gonna talk about the use cases and impacts that we've started to collect on the wiki. Not for too long, but just to give an overview and freshen people up on those. And then we're gonna dive into the issues. We've organized it to talk about the
6 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:01:32.030 --> 00:01:52.030 things that we think we might be able to make some progress on first, and then go down the list here. So our intent is to work through all the issues that we have open that aren't editorial in this meeting if we possibly can, but we are gonna focus on the vocabulary because that's where we need to make the most progress. We
7 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:01:52.030 --> 00:02:16.290 did this in our interim meeting, what, about a month ago now, I think, to kind of prep for this, and we're hoping we'll make even better progress in this meeting. Any agenda bashing? Okay, we do need to select some people to try to take notes of the meeting. Can anyone volunteer to do that?
8 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:02:16.290 --> 00:02:33.300 Will anyone volunteer to do that? I suppose the other option is to feed the recording into Eggert's tools. Yeah, but I think like even like high level points in the.
9 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:02:33.300 --> 00:02:50.370 Not bad, it's like useful, right. So, and we'll be taking notes in the issues list as well. I think we just need a backup effectively. So don't feel that it's high pressure, that you need to write down everything that everyone says, but if we have somebody volunteer just to keep track of what we decide and what we talk about, that would be much appreciated.
10 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:02:50.370 --> 00:03:11.430 Yeah, please don't all rush at the same time to the studiously looking at their laptops, yeah. I don't know, pick up more see my ADHD, you really do not want me taking notes.
11 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:03:11.430 --> 00:03:30.720 Yeah. You should do that. Yep. All right. Well, why don't you open a note document somewhere on HedgeDoc and share.
12 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:03:30.720 --> 00:03:45.840 Alright, so we're going to open a document and share a link, and if folks could collaboratively take notes, we'll try to make sure we can capture stuff there. We'll see where we get. This doesn't bode well for the rest of the meeting.
13 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:03:45.840 --> 00:04:07.520 Okay, so yeah, at a very, very high level, the goal for this meeting: we want to try and find a path towards successfully shipping specifications. We were originally chartered to finish up last year. We extended that. We don't want to be doing this indefinitely, so we have to figure out how to get from here to there. That doesn't mean we're gonna be
14 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:04:07.520 --> 00:04:40.760 necessarily shutting down anytime soon, but we do need to demonstrate some progress. And one of the things we're concerned about as chairs is we seem to have good discussions in meetings, but outside of meetings things tend to grind to a halt. And so we need to identify concrete things that people can work on between the meetings so we can make progress, if that makes sense to folks. So please keep this in mind in all our discussions. And we wanted to give some reminders to folks as well, just to make sure we keep other things in mind as we talk about this. The IETF process does require
15 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:04:40.760 --> 00:05:00.760 consensus. And what that means is that it doesn't mean we need perfect unanimity, but we do need to seriously consider all the objections and to handle those objections. And that handling needs to stick through the process, which means that when it goes to our area director and then when it goes to the IESG, if there's not full consensus and full agreement and there are objections to things,
16 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:05:00.760 --> 00:05:20.760 they understand why we made the decisions we made and that we actually handled those objections well. And so please keep that in mind. We do see a bit of a pattern where folks are bringing proposals that make them happy, but they fail to address the objections that other people have, and we need to see that kind of meeting of minds to make
17 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:05:20.760 --> 00:05:40.760 progress. Likewise, the IETF code of conduct does require professional behavior, and so we're gonna very strongly ask people to criticize the ideas but not the people who are making suggestions or criticisms, or their backgrounds, and especially not to characterize other people's state of mind. For
18 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:05:40.760 --> 00:06:10.579 example, calling someone naive or not even naive is not helpful. It doesn't help us come to consensus. So please be careful about how you talk about things. This needs to be a collaborative environment. I know that's difficult because of the different forces we have at play here, but it's really important for the IETF process to work. And finally, our charter guides our scope. That's the document that tells us what we're doing here. We need to think about that constantly to make sure that we are fulfilling our charter and working within the boundaries that it sets. The charter can change.
19 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:06:10.579 --> 00:06:30.579 But that's not something we can do trivially. We need to get buy-in from the IESG to do that. One of the outcomes of this week could be that we come to an agreement in this room that, hey, let's try and change the charter because we think we'd be able to better make progress. We need to clear that with the actual working group and then take that to the IESG and get them to approve that charter. The entire IETF community.
20 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:06:30.579 --> 00:06:43.379 Right, it has to go through an IETF review, exactly. So it's not a trivial thing, but it is something that, you know, we can consider if it's necessary for us to complete our work.
21 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:06:43.379 --> 00:07:01.559 And that's all I have for the overview. I think, because we do have some new faces, as well as a bunch of folks who haven't seen each other for a while, we'll do a quick round of introductions just so everybody can remember who everyone is, and we'll include the online folks as well in that.
22 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:07:01.559 --> 00:07:20.879 Yeah, like if you go through the order on the Webex, like start to call people out. Yeah. Just, like, one more thing: there's some new proposals and stuff that got added. So, like, we talked about getting stuff in well ahead of the meeting time, and thanks to everybody who put in something earlier, but there's also some new stuff, so if you can kind of take a look at the GitHub
23 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:07:20.879 --> 00:07:40.879 to look at the issues and the stuff that's being posted there, that'll be good. So maybe you can do it during the meeting time as well; instead of browsing on some website, you can look at the GitHub issues. But I think, like, Mark and I emphasized this quite a bit, like, you know, getting ahead of time on this thing makes a meeting more productive.
24 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:07:40.879 --> 00:08:13.949 But at least, like, we have proposals, so that's like a buster thing, so thank you. Yeah, and then the proposals that have been made, we'll work those into the agenda to make sure that we get them in front of folks and have discussions. So thank you for that. Thanks. So I'll start. Do the camera views just cycle? I don't control the cameras. Yeah, so the speaker detection should work, but we kind of have this weird meeting room setup where we combine two rooms, so it may not work, like, perfectly, but ideally it would, like,
25 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:08:13.949 --> 00:08:29.039 pick up the speaker and point at them. So other than the chairs, I think the other people should be fine when they speak. Yeah, that seems to be happening back here, but the cameras that are facing toward you don't seem to be doing anything. Do we have access to the Cisco bug queue?
26 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:08:29.039 --> 00:08:49.039 I think I'll call Colin in the afternoon. You can manually switch it to presenter mode, which I think would activate those but well I'm hoping that we're not gonna be talking a lot so yeah you don't want to be If this meeting goes well. So.
27 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:08:49.039 --> 00:09:09.039 I'll start. I'm Mark Nottingham. I'm one of the working group chairs. I'm also a member of the Internet Architecture Board and I work for Cloudflare, but in this room I'm not representing Cloudflare. I don't actually work on the team that is working on product or any kind of proposals in this area. I'm effectively firewalled away from them. And my
28 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:09:09.039 --> 00:09:30.829 interactions with them are the same as the way I interact with any of you, which is helping you participate effectively in this work. Yeah, Suresh Krishnan, the other co-chair of AIPREF. I work for Cisco. Again, like, I'm here only at Cisco because I'm hosting all of you, but otherwise, like, you know, not representing them in other ways, and also a member of the IAB. And I'm really glad that you made the trek here to
29 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:09:30.829 --> 00:09:54.599 be here, and everybody who joined online as well, to be in whatever time zone you are in to make time for this, so thank you very much. Maybe down on this side, or you want to do a little, doing the room first. Okay, yeah, ok, so yeah, hi, I'm John Yarner. I'm from Google. I work on search, I've worked on robots.txt, and it was a pleasure to join me.
30 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:09:54.599 --> 00:10:12.719 Welcome John, thanks. Elaine Newton, I'm with AWS, I lead our AI standards team. Kathy Lee, Google DeepMind. I'm a product counsel. Hi, Malika, I'm at Meta working on the media and IP policy.
31 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:10:12.719 --> 00:10:28.289 Caleb Donaldson, I'm a copyright lawyer at Google. I'm Kevin Kelly, I'm a member of technical staff at Infront. I'm Aaron Simon, I'm product counsel for search at Google. Ryang, held Severe, their discoverability program.
32 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:10:28.289 --> 00:10:47.909 Principal engineer. Daniel like. I'm Patrick Wong and I work on the site. I'm Mike Jones. I'm an independent consultant in security and.
33 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:10:47.909 --> 00:11:03.749 Hi, I'm Leah Rom. I'm a product counsel at supporting application security and application. Public policy for advanced superior company here at the advanced local public nations based in Europe.
34 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:11:03.749 --> 00:11:18.929 I Amer from the European Digital Reading Lab in the digital book business. I'm Warren Kumari, I'm on the Internet Architecture Board, my day job is at Google, but I don't really do anything useful for them.
35 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:11:18.929 --> 00:11:38.639 My name is Justin. And I try to do useful things for Cisco when I can. My name is Paul Keller. I'm from Open Future, an independent think tank based in Amsterdam, and I'm one of the editors of the vocabulary.
36 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:11:38.639 --> 00:11:56.969 Martin Thomson, I'm at Mozilla and one of the editors. Just a reminder, please do raise your voice for the remote people. Nick's all of the Internet Architecture Board. Hi, Kevin Bankston, Center for Democracy and Technology.
37 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:11:56.969 --> 00:12:14.039 Hello, my name is Timid Robot, with Creative Commons. My name is Sebastian Posth. I'm the CEO of Liccium, and we work on registry-based solutions for AI preferences. Hi, I'm Rony Shalit, chief compliance and ethics officer of Bright Data.
38 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:12:14.039 --> 00:12:30.989 I'm Glenn Deen from Comcast NBCUniversal. I'm a technologist nerd. Mike Bishop, I'm employed at Akamai, but I'm here as the responsible area director. I'm Nate Hick, I'm the founder of Travel letting, a small and independent travel guy.
39 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:12:30.989 --> 00:12:52.069 Chad Goholic, I'm here representing protocols for publishers. Max Gundler, tech standards at newscope. Alex Gabson for New. And in the back, I'm Andrew Solomon and I didn't work for anyone. All right, let's.
40 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:12:52.069 --> 00:13:03.687 So I go online let's see. Leonard.
41 "Leonard Rosenthol" (1715724288) 00:13:03.687 --> 00:13:17.599 I'm happy to go. Leonard Rosenthol, content authenticity architect at Adobe. I also chair the technical working group for the C2PA, but I will not be representing them today.
42 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:13:20.159 --> 00:13:21.861 Chris, do you want to go next?
43 "Chris Needham" (1250734080) 00:13:21.861 --> 00:13:29.729 Yeah, thank you. Hi everybody. So I'm Chris Needham. I work on internet standards at the BBC.
44 "Lila Bailey " (856738816) 00:13:29.729 --> 00:13:31.309 Thanks.
45 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:13:31.309 --> 00:13:36.060 1st?
46 "Farzdusa" (1320906240) 00:13:36.060 --> 00:13:47.373 Hi everyone, my I'm about the, oh my god. My internet is not good. So I'm digital Medusa, thanks.
47 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:13:48.539 --> 00:13:50.527 Iris?
48 "Iris Johnson" (3769862144) 00:13:50.527 --> 00:14:00.219 Hi, hello, Iris, also lecturer at the University of Applied Sciences in the Netherlands. Thank you.
49 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:14:00.219 --> 00:14:03.652 And JK are you in the room or are you remote?
50 "Jure Kralj" (2778545920) 00:14:03.652 --> 00:14:11.648 I am remote. My name is Jure Kralj and I'm a legal and policy executive at ICMP, the global trade body representing the music publishing industry worldwide.
51 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:14:11.648 --> 00:14:17.103 Perfect. Thank you. Is it possible for you to edit your display name so we can keep it in the blue sheets.
52 "Jure Kralj" (2778545920) 00:14:17.103 --> 00:14:19.781 Will do, thanks. Thank you.
53 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:14:20.965 --> 00:14:40.859 Krishna Sood? Krishna, would you like to introduce yourself? Okay, skipping over. Lars?
54 "Lars Eggert" (2105437952) 00:14:40.859 --> 00:14:45.125 Hi, I'm Lars Eggert, I'm a senior engineer from Mozilla.
55 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:14:45.125 --> 00:14:48.760 Lila?
56 "Lila Bailey " (856738816) 00:14:48.760 --> 00:15:01.799 Hi, I'm Lila Bailey, senior policy counsel with the Internet Archive. This is my first time at the IETF. Thanks to everybody for all of your helpful explanations as we go through. Yeah.
57 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:15:01.799 --> 00:15:04.068 More?
58 "Mohibul Mahmud" (1757986560) 00:15:04.068 --> 00:15:18.659 Hey hi, my day job is at Microsoft but and also I'm a member of I Canallo for, yeah. Thanks.
59 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:15:18.659 --> 00:15:35.902 Tyler? Tyler Martin? Engine.
60 "Yingzhen Qu" (1746380032) 00:15:35.902 --> 00:15:44.643 Hi, everyone, this is from I'm a member of the AB here for my own interest. Thank you.
61 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:15:44.643 --> 00:15:48.490 Thanks. Thomas Aynaud?
62 "Thomas Aynaud" (629682944) 00:15:48.490 --> 00:15:55.659 Hi, I'm Thomas Aynaud, I'm CTO of Software Heritage. We do source code archiving.
63 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:15:55.659 --> 00:16:06.209 Perfect, thank you. Whoever is calling user two? Can you introduce yourself? Calling user two?
64 "EKR" (2135672832) 00:16:06.209 --> 00:16:10.468 I mean, it might be me, Eric Rescorla.
65 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:16:10.468 --> 00:16:28.187 Oh, hey, let me rename you. Thanks. Would you like to introduce yourself? Not that many people don't know you, but.
66 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:16:35.903 --> 00:16:38.724 Okay, Krishna, would you like to introduce yourself?
67 "Krishna Sood" (1201976064) 00:16:38.724 --> 00:16:46.979 How are you? Hi, I'm Krishna Sood. I'm a product counsel at Black Forest Labs. Perfect, thank you.
68 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:16:46.979 --> 00:16:51.869 Okay.
69 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:16:51.869 --> 00:17:08.609 Alright, let me try and share this quickly and it's fine.
70 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:17:08.609 --> 00:17:24.929 I'm getting used to Webex sharing. Excuse me for a moment.
71 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:17:24.929 --> 00:17:37.551 Okay, let's try this.
72 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:17:43.699 --> 00:18:04.649 Great, ok. So one of the things we identified a while back was that it would be helpful to our discussions to have a focused set of use cases that we can understand and agree upon, or at least inform our discussions with. And so we started this. It hasn't changed too much. I see a few people have made edits to it.
73 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:18:04.649 --> 00:18:19.769 It's got twelve revisions so far. Has everyone had a chance to look at this at least? I'm seeing some nodding, which is good.
74 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:18:19.769 --> 00:18:39.599 And so this breaks things down into training, and this is all from the perspective of expressing preferences, cause that's what our charter is about. Yes, we had as far as we now have a recipe based use case, which is good. I think that was added by Leonard if I remember correctly.
75 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:18:39.599 --> 00:18:54.779 And so it's expressing preferences about training and a few examples there about use and about the presentation of content.
76 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:18:54.779 --> 00:19:13.919 And so our current vocabulary, I think I'd characterize it as: our training term is relatively baked. It still needs some work, but we have the fewest number of issues against it. Use we have
77 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:19:13.919 --> 00:19:31.049 no current text about, and I think we've kind of gone back and forth on what that could be. So that's one of the things we'll discuss this week: what could succeed in that space. In the presentation, I think we're gonna have some discussions when we get especially to the search terms. I know we have Krishna's proposal.
78 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:19:31.049 --> 00:19:46.319 Which is display based preferences if that's how you characterize it still. We have a couple of new proposals from Aaron and also from I think it was from Chris.
79 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:19:46.319 --> 00:20:01.349 So we'll talk about those. Do people feel that these use cases capture what they need to? Is there anything else that we need to get into these?
80 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:20:01.349 --> 00:20:21.349 I think it would be helpful to acknowledge that there might be some granularity in, like, whose models or models for what purpose, which, you know, can be accomplished in various ways in the vocabulary, but I I don't think that the preferences are kind of all or nothing.
81 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:20:23.309 --> 00:20:43.309 So selectively applying to targeting effectively. Yeah. I I would add that in addition to images, videos or maybe we should make it more general, but it's not just images to all assets. Sure. So these aren't really meant to be a complete set.
82 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:20:43.309 --> 00:21:10.939 Like, more just kind of evocative, to get you thinking about how they could be expressed or what they would apply to, but absolutely yes. And I think the goal of the use cases was to kind of do a cross check after we have the vocabulary to see if it's rich enough to be able to, like, achieve what the use cases wanted to achieve, right? I think the important part of this is to make sure that we kind of have a list of what are the things people want to express.
83 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:21:10.939 --> 00:21:28.799 So, but we don't have to spend a lot of time on these. I just wanted to make sure that folks were aware of them and that, you know, we can refer back to them as we move along. Hi, I just had a, a question maybe about the scope of these use cases.
84 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:21:28.799 --> 00:21:44.879 They seem like a really good set of use cases for the preferences that the content holder or the website is expressing, but they don't capture I think all the different reasons that someone might train on that content.
85 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:21:44.879 --> 00:22:04.879 Is that gonna be a different part of the process or do you think that's a question for me here. If you're really trying to figure out if they were, like if these as a testing model as Suresh was saying, if they get different activities, not just preferences. We focused on preferences cause that's what we're scoped to define, I think.
86 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:22:04.879 --> 00:22:22.259 I'm a little wary of trying to capture all the different possible uses of AI, for example, if that's the direction that would go in. Like I said, in terms of the preferences, if you want to map them against the activities people are gonna do, I think you need to have some of those activities as well.
87 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:22:22.259 --> 00:22:43.789 What do people think? I think Aaron, you were also going in the same direction, right? Like talking about like how to make it granular, maybe like you two can talk and try to start at least throwing in something at the end to see if it makes sense because I, I think part of the issue Mark was talking about earlier is like everybody kind of needs to understand what the.
88 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:22:43.789 --> 00:23:08.489 Like the other side of the consumption or like production thing looks like so people can understand the issues and try to find a common ground. Max? Yeah, I have a question I think. Maybe it's a comment. It's unclear to me why some of the use cases include the phrase "unless I have a separate agreement" and others don't.
89 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:23:08.489 --> 00:23:28.489 I think that's a pretty overarching question about how we're doing this work. I think it relates frankly, to a lot of the context points that will come up later. So does anyone have a comment on why only some have unless I have a separate agreement, it's it's unclear to me why.
90 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:23:28.489 --> 00:23:44.309 It would only apply to some of these use cases and not to all the use cases. So I'm the culprit this morning I was looking through these and I saw one of them had, I unless I had just started adding them in because I was pretty, then I thought, well, actually.
91 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:23:44.309 --> 00:24:01.649 All of these are potentially subject to that caveat. So it's only because I think it should apply to all of them, not because it's there for some of them, but I think that, you know, I think in most cases people would say unless I want you to, which is the same thing they're saying.
92 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:24:01.649 --> 00:24:19.589 I have an agreement to allow you to do this. So sorry go ahead. No, you said it was only on one of them. It was, it happened to be this morning when I looked at it, it had only been on mentioned unless I have an agreement I think it was in relation to number.
93 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:24:19.589 --> 00:24:39.589 Two, the one that's not in square brackets. Okay. The website, so I just added them into the others. But of course, you know, there's no reason, but I think, to the point, there are many flavors of how these use cases may be part of, like, unless, if, you know.
94 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:24:39.589 --> 00:25:02.899 And that's I guess what we're gonna talk about going forward from here. I don't know, when I originally seeded this, if I just put it on one or not, but I think probably if I did, the mindset was: this isn't a requirements list, it's not, you know, algorithmic, it's just more, again, to be evocative, to say, ok, this is a scenario that we need to think about. You could
95 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:25:02.899 --> 00:25:19.946 refactor this as use cases for what someone might want to express, and then how they want to apply that in the context of other arrangements or whatever, for example.
96 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:25:21.239 --> 00:25:38.149 So, and if people have trouble editing the wiki, please talk to Suresh and I will help you with that, by the way. Kevin? Hi, Kevin Bankston, CDT, and like Lila, I thank you in advance. I'll have a lot of questions since this is my first time. I'm not sure I understand the
97 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:25:38.149 --> 00:26:01.799 role of distinguishing between types of content or being specific about things like blog entry versus other types of content. Is there a reason why we can't just say content, or do we need a separate taxonomy of types of content for some reason? No, it's not that there's an intent to, you know, have different treatments here. It's more just to get people thinking about
98 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:26:01.799 --> 00:26:19.905 how this might be used in the real world, to make it a little less abstract. Thank you. And just one other question: at least in one place there's reference to a general AI model, but in other places it's just model, and I'm wondering if that's deliberate.
99 "Leonard Rosenthol" (1715724288) 00:26:19.905 --> 00:26:25.503 Mark, do you want us to raise hands online or.
100 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:26:25.503 --> 00:26:33.864 Just I don't know. Go ahead Leonard, I'll I'll watch it, yeah, because I don't know how to watch it. Go ahead Leonard.
101 "Leonard Rosenthol" (1715724288) 00:26:33.864 --> 00:26:35.623 No, I just wanted to.
102 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:26:35.623 --> 00:26:38.999 Ah, of course.
103 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:26:38.999 --> 00:26:58.859 And just, Lila, I saw your question. This is right now just use cases for the vocabulary, not for attachment. Do you want to come off mute and ask a question to Brad?
104 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:26:58.859 --> 00:27:02.500 I don't know if Brad is following the chat or not.
105 "Farzdusa" (1320906240) 00:27:02.500 --> 00:27:24.129 I'm sorry sorry like the question, I just don't understand what does it mean how they want to apply these use cases to other arrangements? Yeah.
106 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:27:25.859 --> 00:27:53.789 So the original language had, "I prefer that my website not be used to train AI models unless I have a separate agreement." So I think that's clear. I think that in any of these instances there may be a preference for you not to do something unless there is a separate agreement. So I think there's nothing sort of magical or, like, special
107 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:27:53.789 --> 00:28:10.319 in relation to the other points. I think it was just meant to be the same expression: there could be different grades of how you want your content to be used, and you may have an agreement with the third party that is an exception to that preference.
108 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:28:10.319 --> 00:28:25.447 You could also not have that language there. I think that the idea was just to talk about, and I think the idea was just to understand like what types of preferences might exist. Thanks Brad. Chris, go ahead.
109 "Chris Needham" (1250734080) 00:28:25.447 --> 00:28:51.459 Yeah, could I perhaps suggest that we remove this part about the agreement because I I do agree that it's, I think it applies across all of these and so kind of having to state it in each one is, it, it creates confusion if it's stated against some and not others, and it's really a cross cutting concern I think across applies across all of it, so I would suggest perhaps removing this and kind of noting it elsewhere.
110 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:28:51.459 --> 00:29:12.479 Thanks, then Aaron? Yeah, I think rather than say unless I have a separate agreement, like, unless I choose to permit the particular use, like it seems to be a little bit narrow to assume that an agreement might be the only reason why somebody would want to make an exception for a particular.
111 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:29:12.479 --> 00:29:28.379 Model or other use. I'm getting the impression that people are gonna try to polish these. I don't think that's a useful exercise. I think the.
112 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:29:28.379 --> 00:29:45.749 Whole point of this exercise is to get some evocative examples that you could use without having to have that sort of fine level of judgment and accuracy. And so I think, insisting that these be very, very precise.
113 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:29:45.749 --> 00:30:05.749 And accurate whether they be too broad or too narrow or whatever is probably self defeating. And to be clear, this is not a deliverable to working group, it's not a gotcha where at the end of the process if we don't exactly meet these, then we have to go back to square one. This is just a tool that we use to guide our discussions.
114 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:30:05.749 --> 00:30:35.149 To help us converge on some sort of... and this can be wrong, right? So you can put something in here that you think is fine and other people will think is terrible. It would just be a discussion that we have at the other end of the process to say, yes, it doesn't meet this particular use case, but we don't think that use case really applies for the following reason. We learned something in the process, and so I would rather overcollect and.
115 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:30:35.149 --> 00:30:55.529 Have these be very imprecise than worry about the details. That's why it's a wiki. Kevin, and then. On the "unless I have a separate agreement" bit, does anybody disagree with the idea that these signals could be overridden by a private arrangement?
116 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:30:55.529 --> 00:31:15.529 Okay, we spent a bunch of time talking about that, but I can't imagine anybody would argue the other side of that. It is also in the current draft, right? Like the notion that any signal can be overridden by express agreements between parties is in the current draft, and I don't think it was ever controversial.
117 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:31:15.529 --> 00:31:36.859 So I think this can be removed because it is covered on the overarching level. Exactly. Thanks Paul. Thanks. Paul is that what you're like. Okay, Max? I was a little bit confused by what Martin was saying, so I wanted to sort of clarify here.
118 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:31:36.859 --> 00:31:54.779 Is it not useful for us to have a use case document that we all at least roughly understand as correct? I wasn't really clear. No, I think my point was that if someone here thinks that the use case is valid, it's more useful for us to have it captured.
119 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:31:54.779 --> 00:32:13.769 Than it is for us to all agree that that is a use case we'll attempt to address. I see. Oh yeah, it just sort of flattens... I understand what you're saying. I don't particularly agree, but... It's better if you agree, yes, but it's not necessarily a goal of this.
120 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:32:13.769 --> 00:32:41.505 Exercise. We could spend nine months or 18 months coming to agreement on a set of use cases and then go design a protocol and discover that, oh, we didn't consider X and we're back to square one. Sure. Chicken-and-egg situation. I understand. I don't love it, but yeah. Leonard and then Glenn. Glenn, whenever you can get on the Webex room, that'll be cool, but I have you next after Leonard.
121 "Leonard Rosenthol" (1715724288) 00:32:41.505 --> 00:32:50.129 Okay, yeah, so one of the things I wanted to say about the use cases was, I think to me, and I would recommend maybe to others.
122 "Leonard Rosenthol" (1715724288) 00:32:50.129 --> 00:33:17.560 The more important piece of the use cases is less the specifics within the high-level categories; rather, it's identifying those three top-level categories of training, use, and presentation. And I would say maybe a good question to the group is, does anyone think that there is a fourth category of use cases? Because if not, having at least those broad categories is very helpful in setting our next stages.
123 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:33:17.560 --> 00:33:46.879 That's a great question, Leonard. Given that we have a lot of people here who are not familiar with the IETF process, I wonder if there's potentially a misunderstanding about the use cases. They are meant to help guide our work in the development of the other documents. They are not meant ultimately as a limit on how you can use the future documents.
124 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:33:46.879 --> 00:34:11.149 That we're working on. In other words, after we develop the RFCs that get published, if somebody takes that RFC and uses it in a way that is outside of the use cases we identified, that's perfectly OK. And nobody's gonna go back and say, "Oh, you can't do that with that because it wasn't one of the use cases." This is just a guiding tool to help us make sure that we covered the needs that we want to cover.
125 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:34:11.149 --> 00:34:39.769 But it is not a limiting factor for adopters of the work that we eventually produce. Maybe that helps because we could be, we could be grand in our use cases that way. And I think like to Martin's point, right? Like once we have a listing of use cases here at the end, like when we go through it and say like we cannot meet one of those use cases, we can decide as a working group not to bother about that, right? I think that's the point like Martin was trying to make. So we collect all those things, we know about the things that this can be used for.
126 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:34:39.769 --> 00:34:46.559 Then we can decide collectively what we decide to address or not address.
127 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:34:46.559 --> 00:35:03.329 I think we've drained the... Yeah, so let's move on to the other wiki if I can find it. There we go. Aaron, you wanna elaborate on what you typed in there? Because that's in the draft right now.
128 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:35:03.329 --> 00:35:23.279 Yeah, so I was just sort of continuing the conversation a little bit about the granularity of those preferences and the possibility of making private agreements that, you know, take things out of the scope of the preferences. I.
129 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:35:23.279 --> 00:35:40.049 Completely agree that there wants to be kind of private ordering that overrides these preferences in particular categories, but that alone is insufficient to solve the granularity problem, because it may be that somebody wants to express a preference without.
130 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:35:40.049 --> 00:35:58.289 Having those private agreements. And so I just don't want our work to overlook that category or that situation because we're relying on the fact that somebody could always reach a private agreement that grants permission separately. Right, makes sense. Sure.
131 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:35:58.289 --> 00:36:17.039 I do think this goes slightly outside of the vocabulary scope, like it seems that you would address some of those things at the attachment level. We're defining the types of uses here in the vocabulary, and if you want to go granular with regards to specific users.
132 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:36:17.039 --> 00:36:37.039 Or specific classes of users, it might be that this is better addressed at the attachment level, but I get your point otherwise. It could be, but are the use cases only for the vocabulary and not for the kind of overall effort? Well, these are currently focused on the vocabulary. I think what we wanted to do was.
133 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:36:37.039 --> 00:36:50.728 To establish kind of the basic things people want to express and then expand that to cover, you know, how they get attached and what nuances are there. Okay. Okay, Leonard, yeah. No Leonard Leonard.
134 "Leonard Rosenthol" (1715724288) 00:36:50.728 --> 00:36:53.788 Was your hand lingering? Oh, sorry. That's an old one, I forgot to put it down.
135 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:36:53.788 --> 00:37:18.259 My apologies. Okay, Max, go. Just on the general idea that it was initially written to just one piece and then maybe it applies to all of them, should there be a separate category in this wiki that isn't, like, training or use, for sort of the overarching question? You're looking at me like that question didn't make sense. No, but what is the overarching... Well, to the point of like the.
136 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:37:18.259 --> 00:37:44.239 The private agreements or whatever, like do we need to find room for that in the use cases, so instead of attaching it to specific training use cases we say here's an overarching part to this document? Does that make sense? So Kevin's proposal was that, you know, there's nobody really disagreeing with that and it's already in the draft, so do we need to document it or not, right? Like I think it's better to document agreements than to, you know.
137 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:37:44.239 --> 00:38:06.859 But I also think that there are parts of that concept that are agreed upon and there are other parts that aren't. I think that comes up in the context arguments. That's my read on the current issues that are open. So I'm happy to hear that Kevin thinks there's no disagreement, but I think that there has been, at least in the last year at certain points. So I think it's worth documenting.
138 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:38:06.859 --> 00:38:26.859 If we have some sort of sense of agreement on that. Sure. So again, I think as Martin said, let's go ahead and be expansive. You know, if you think that something deserves to be in there, it's a wiki, let's get it in there. We don't all have to agree on the fine points of it. You know, again, the purpose of this is not to constrain our future action. It's to illuminate our way forward. So if you want to.
139 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:38:26.859 --> 00:38:37.786 Add a section we'll get it in there and we'll, we'll talk about it. If you think that'll help the discussion move forward. Thanks. Sounds good. Thanks. Chris then Caleb, and then Paul.
140 "Chris Needham" (1250734080) 00:38:37.786 --> 00:39:10.850 Yeah, thank you. So I think there are the wider use cases that we captured, and talking about this in the context of attachment, it might be that a website might not want to publicly signal where they have private arrangements, and so having granular preference expressions that are targeted to different consumers may be something that, you know, a site may not want to reveal publicly. So I mean that's kind of a wider use case than.
141 "Chris Needham" (1250734080) 00:39:10.850 --> 00:39:24.282 The, you know... I mean I'm happy to edit the wiki. Should I start to add these kind of additional broader use cases that touch on attachment also?
142 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:39:24.282 --> 00:39:51.280 I think that's fine. I would love if we start either a separate page for attachment use cases or, if we want to put it all on one page, I think we can be flexible about that. But if some folks want to volunteer to start trying to capture those use cases for attachment, I do think that keeping a distinction between attachment and vocabulary is still useful, although there are gonna be some things that are kind of both or overarching, as Max was.
143 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:39:51.280 --> 00:40:14.760 And maybe we can capture that as well. But, you know, for me, I would like people to think about Leonard's question: is there another top-level kind of categorization of use cases for the vocabulary that we haven't yet? And if so, it'd be really good to get that captured pretty soon. Nate, go after Andrew, so I think.
144 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:40:14.760 --> 00:40:33.030 Caleb's next and then Paul. Yeah, just to try to drive this home, like, isn't that a separate use case? I would like to be able to signal the first use case for all except for a specific number, and then also I would like to be able to do that in secret, to Chris's point, like.
145 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:40:33.030 --> 00:40:52.950 I'm not sure how, I guess the secrecy part goes to the attachment part, but as far as like what we want the specification to do, we want it to be granular enough so that you can carve out specifics like which is handled I mean in robots in general with.
146 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:40:52.950 --> 00:41:12.950 You know, right. But we we don't have a, we don't hope for that yet. Right. Well, don't over rotate on whether it's attachment or vocabulary. We'll, we'll figure it out. It's more important just to capture this. To your question and Leonard's point, like I think there are use cases and I don't know if it's useful to capture them.
147 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:41:12.950 --> 00:41:29.430 What people want to do is "I don't want training and I don't want use." The combined one, right? We know that there's a whole bunch of people not necessarily represented in this room who have this attitude towards no AI at all.
148 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:41:29.430 --> 00:41:49.170 Like I think that is an existing use case that some people want to be expressed. I don't know if we add this, if we want to include that here as an overarching training plus use plus potentially presentation, or if we keep that in mind as combinations of the use cases that we have here.
149 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:41:49.170 --> 00:42:10.340 But there's clearly people out there who want to say, like, no AI, period. Right. Thanks Paul. Andrew? I thought I understood what we were doing here until we had this conversation. So welcome. Thank you. It seems to me that there are two possibilities here.
150 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:42:10.340 --> 00:42:33.230 And it would be helpful if we picked one. One possibility is there's a bunch of cases and one of them is you get into this preference exchange language, and in that case we can cover all of those cases. The other one is we're trying to cover all of the preferences that people might express, and some of those are.
151 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:42:33.230 --> 00:42:53.700 Like we don't want to use your protocol. Which one of those are we in? Because one of these is we're trying to figure out like, oh, is there a private case here? Do we have a special side agreement and so on? And then we've got to talk, we've gotta have a way of expressing all of those things. The other one is there's lots of ways that you could talk about this. This is the internet, people can do whatever the hell they want.
152 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:42:53.700 --> 00:43:13.700 But if you're gonna talk our way of preferences, these are the ways you do it. In that case, you don't have to have all of the exception language because that's already covered by you doing something else. But I don't know which one of those we're doing. I think we are trying to do the first one, Andrew, right? Like we're trying to document what kind of preferences somebody would like to express, but I think people still want to.
153 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:43:13.700 --> 00:43:38.160 Say, like, hey, all these things could be overridden by something private, and like Max said, hey, we want it written down that we kind of agreed that there's other stuff that can happen on the side that'll override everything, right? That's pretty much it. But I don't think we're gonna come up with, like, hey, this is a nice out-of-band mechanism by which you can do it. It's not gonna be in the working group today. That's your question. Yeah, so that's sort of the point that I'm trying to make. I mean.
154 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:43:38.160 --> 00:43:58.160 You know, does the vocabulary, for instance, have to have a bunch of terms for "and we're on another planet"? Or is it just like we're in our vocabulary, we're in the IETF vocabulary. These are the things we're talking about. And in that case, of course, anything that's outside of that is, yeah, sure. You could be talking, you know.
155 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:43:58.160 --> 00:44:16.230 To your friends on Mars and they got a different way of doing this, and so what? So I'm just concerned that there's the potential here to say, well, we want to cover all of the cases that anybody ever could think of, and that.
156 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:44:16.230 --> 00:44:33.284 That is not the path to a working group that... So I think a slight modification: we don't want to cover everything. We want to know about all the use cases and then we can decide what to cover, like what Martin said. We'd rather know about all the things and decide what to cover than come up with something at a late point that we didn't know existed.
157 "Leonard Rosenthol" (1715724288) 00:44:33.284 --> 00:44:35.809 And and and part of this is, is that.
158 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:44:35.809 --> 00:44:59.586 That we, you know, all these things have been discussed to various degrees and brought forth by people to the group. We haven't necessarily reached consensus on addressing these use cases, but overlying that is a keen awareness, I think since we started, that we're working within a much larger context, and we're not trying to define that much larger context or constrain it, but we need to at least acknowledge it. That's good, thanks. Andrew, are you good?
159 "Leonard Rosenthol" (1715724288) 00:44:59.586 --> 00:45:25.340 The other thing I would add just to that, Andrew, is, and I know some people don't agree with this. We've been having a little bit of discussion of this in chat, which is that, remember, we're trying to define a vocabulary that's usable across multiple attachment methods and protocols. So this isn't about developing a protocol, it's about developing a vocabulary. The protocols, or what we've been calling attachments, are a secondary document.
160 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:45:25.340 --> 00:45:33.300 Thank you, thanks just Leonard, we do have a queue so like we'll just try to go through it. Thank you. Thanks.
161 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:45:33.300 --> 00:45:53.300 Paul, Nate and then Nate. Paul, Kevin and then Nate. I think this builds on what Leonard was just saying. Specifically, what we're developing here is a vocabulary of use case, sorry, vocabulary of uses, right? Like we describe types of use in the vocabulary. And I think the use.
162 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:45:53.300 --> 00:46:19.430 Cases that we have here are intended as testing the vocabulary: if it allows us to address these use cases or if it maybe potentially makes it impossible to express these use cases. It does not require the vocabulary to be complete in order to do all of the expression there. Some of the expression may take place elsewhere at the attachment level.
163 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:46:19.430 --> 00:46:35.700 Right, like the vocabulary is essentially types of uses that it defines. It doesn't define types of users. It doesn't define a lot of other things. It is, as it's constructed right now, types of uses. If we need to define users, we might need to have.
164 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:46:35.700 --> 00:46:52.350 Another vocabulary or another instrument that allows us to categorize them, but currently the vocabulary is not about classifying different types of users, for example. And I think that is something that we keep in mind. Robots.txt in its current form allows you to address specific users in.
165 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:46:52.350 --> 00:47:09.060 The form of addressing a specific bot, and it's a combination of the vocabulary and the attachment mechanism that enables us to address the use case or not, but not everything needs to be in the vocabulary. Kevin?
166 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:47:09.060 --> 00:47:30.800 I think Paul's point is one well made. I guess to go back to what Andrew was asking, I think it's my understanding from our previous conversations in this working group that we are not trying to create an exhaustive system that can express every possible thing we might want to agree on.
167 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:47:30.800 --> 00:47:49.680 We're trying to define something that can express some useful subset, and that the private agreement caveat that I think we are, well, I was about to say that there doesn't seem to be any disagreement with, but.
168 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:47:49.680 --> 00:48:05.820 Apparently there might be. I'd love to understand that when we have a chance. Is an escape valve in some sense, right? Like if you need to express an interest more complex than what we can agree on, at least initially, in these machine signals, that finds you a way out.
169 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:48:05.820 --> 00:48:28.040 Nate? Yeah, so I wanted to, first, I appreciate Paul's point about, it's important that we have a way to express just no to AI generally. I am one of those people who, with very rare exceptions, just want a "no AI in my life" button, and this room I think is.
170 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:48:28.040 --> 00:48:59.160 Probably skewed towards people who are more sort of early adopters of AI given the context. If you actually look at public polling, like, you did a poll and they compared like US adults versus what they call AI experts, like people who are really familiar with AI. AI experts have, like, a four times more positive view of the expectations for AI versus like general people. And so I think I'm actually more representative of the majority of like average people and internet users, and so I think it's very important that we have a way to just say no AI at all.
171 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:48:59.160 --> 00:49:19.160 Secondly, on the question earlier as to whether or not these three categories cover everything that we wanna cover. I just want to mention that the charter specifically talks about how we will standardize building blocks that allow for the expression of preferences about how content is collected and processed for artificial intelligence model development, deployment.
172 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:49:19.160 --> 00:49:37.110 And use. And the ways that we have it framed here, in the vocabulary, in this sort of thing, are training, use, and presentation. I don't particularly know if there's a difference there, but we don't cover deployment, so I just did want to flag that, if there's something else in there that we're missing. Thanks there.
173 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:49:37.110 --> 00:50:00.050 Yeah, let's go ahead and move on. The other document that we put together was trying to capture, and we've heard people concerned about, potential impacts of the expression of preferences. And so we have a couple of examples here around, for example.
174 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:50:00.050 --> 00:50:21.330 Accessibility tools or a local summarization tool. I don't think there's been much progress made on this. Are people still interested in trying to develop this? Or is there a link to this? I don't see it on the agenda. It's in the wiki next to the use cases document.
175 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:50:21.330 --> 00:50:41.970 Yeah, potential impacts. Thank you, Tim and Robot. So I guess we could flag this to folks that, you know, this is something also that's useful to capture to understand.
176 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:50:41.970 --> 00:51:00.690 The concerns people have. Kevin, on second one, sorry, did I interrupt here? No, go ahead. This reminds me of the conversation we had in.
177 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:51:00.690 --> 00:51:18.570 Sorry maybe about the situation where a user, let's say I, I have to read lots of very dry technical reports. Some of them are not fully available on the internet. If I have bought a license to one, I.
178 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:51:18.570 --> 00:51:34.440 Log into my account, I download that PDF, then I want to use a tool that I have also appropriately licensed to interact with that content, to find the bits I'm interested in in that 900-page PDF or whatever. Exactly.
179 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:51:34.440 --> 00:51:54.440 That seems like a use worth protecting, and the second point doesn't quite say that, but seems like it's moving in the same direction. Is that what your intent was? Please, if you want to develop these or add new ones, please go ahead. But that's the intent here, is to capture cases like that.
180 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:51:54.440 --> 00:52:20.810 Let's say that, you know, especially what's the collateral damage of a preferences framework if it's done badly? And I think that goes to, you know, when you have things like, if we just say, well, here's the no-AI signal. I think one of the cases that's been brought up as a concern is that if we have a no-AI signal and a software publisher for a summarization tool looks at that landscape and.
181 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:52:20.810 --> 00:52:32.806 Says, well, the safest thing for me to do is to never allow summarization if there's a preference that says no AI, you know, what is the outcome for the internet there? And so at some point we should talk through that. Thanks.
182 "Farzdusa" (1320906240) 00:52:32.806 --> 00:53:03.140 Yeah, so, I'm talking to a bunch of network operators and people who understand the internet. So, at the risk of being wrong, I have a use case about potential impact, and that's circumvention of censored domains. People who face censorship can potentially use LLMs.
183 "Farzdusa" (1320906240) 00:53:03.140 --> 00:53:18.349 To have access to content that is blocked, and that's one that I wanted to discuss, but I'm just gonna add it to the mailing list and we can discuss it.
184 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:53:18.349 --> 00:53:33.870 Okay, yeah, sounds good. Thanks, 1st. And I think like we did have some discussion in a meeting about like people with disabilities and accessibility and stuff as well, like was potentially something that.
185 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:53:33.870 --> 00:53:53.870 Would like be under the impact somehow, right? If you say no AI like that, and also some kind of machine translation as well. So some of them use like AI models and there's like, I think if you say no AI, probably that would be in the potential impacts as well, right? And I suspect we'll get to these when we get we talked later on. I.
186 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:53:53.870 --> 00:54:12.270 Like the context issues, the discussion of section 3.3, for example, will probably get back to this discussion. Glenn? Sure, I want to raise a bit of a concern here. In some of the comments we were discussing about like.
187 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:54:12.270 --> 00:54:28.230 We're worried about blocking this and we're worried about denying access to that in various use cases. I think there's also a very valid use case we need to consider with this, like was just raised a minute ago online. There are situations where in fact.
188 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:54:28.230 --> 00:54:45.990 We do want to deny access and we want to have a mechanism that is effective. It could be for things like blocked domains, but it also could be for things like network security reasons where people are using AI tools.
189 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:54:45.990 --> 00:55:05.990 As mechanisms to bypass blocks that were put in place because we are battling denial-of-service attacks and things like that that are really harmful to the internet. And so if we create a channel which says, look, this channel can never be blocked in any way, that will be an invitation to the bad actors.
190 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:55:05.990 --> 00:55:34.620 To make use of that channel, and we've seen this in the past as well where we've done things like that. So I do think we've got to err towards supporting all the different use cases, not just the one use case which is make everything available at all times without restriction. Thank you. Paul? I do want to point out that that seems out of charter to me. Blocking access, like, this is neither about access nor about blocking.
191 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:55:34.620 --> 00:55:54.620 Applies to your point, probably also applies to what was suggested. Like I'm not sure if I fully understood that, but this is about use of assets, of content that are accessible, that you have access to, right? Like this is not an instrument for blocking access.
192 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:55:54.620 --> 00:56:15.030 To anything or ensuring that access is, like, the other side, that access is provided. I put my hand back up, did it respond? Yeah, go ahead. I agree that the mechanism for blocking isn't in the charter, I agree with that. But equally so the mechanism for.
193 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:56:15.030 --> 00:56:35.030 The content is always available in certain use cases is also not in charter. And so I always raise my hand because we started talking about use cases where, you know, content must always be available and you can never have a preference that says this is not available to the AI systems. And that is the objection I was raising. We're putting mechanisms.
194 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:56:35.030 --> 00:57:00.860 In place to express preferences; we're not setting policy or enforcing policy about what can or cannot be accessed. This is, I think we agree, this is not access control either way; it is about the use of assets that you express preferences about. Yeah, and I think like Lila was saying it on the chat too, right? Like, so we don't do the enforcement. So I think it's clearly in the charter, but it's kind of, for whenever we have like.
195 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:57:00.860 --> 00:57:16.350 Newer participants we kind of need to enforce, like, you know, say it again, but I think really we're not working on the enforcement, like it's outside our scope. We just express the preferences, but I think your point, Glenn, right, is to be able to express this preference, like in case there's some legal thing like.
196 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:57:16.350 --> 00:57:36.350 Making you express that preference, right? Exactly, so I guess the takeaway here that I'm trying to say is, and maybe a thorough way is, we are neither expressing a policy that says it must be accessible nor are we trying to say it must not be accessible. We're not touching that at all. We're just expressing preferences of what is possible.
197 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:57:36.350 --> 00:57:44.287 Perfect. Sounds good. Thank you. Max?
198 "Max Gendler" (935176704) 00:57:44.287 --> 00:57:50.468 Just Nice.
199 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:57:50.468 --> 00:57:59.100 That's it. My bad. I instinctually clicked the unmute button on my laptop, that's my bad guys.
200 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:57:59.100 --> 00:58:19.100 Get you all awake in the morning. The point I wanted to make or sorry, the question I wanted to ask is, we're talking now that this isn't about access, but, and I know that we're trying to focus on the vocabulary draft 1st and yada yada, but the attachment draft specifically.
201 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:58:19.100 --> 00:58:40.410 Is about robots.txt. And robots.txt is specifically about access. So I understand like the point that Paul and Glenn are making, but I don't think it's entirely accurate given like the one step downstream we're planning to take as a working group. So robots.txt is not an access control mechanism.
202 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:58:40.410 --> 00:58:55.710 It is also a preference mechanism, it's just focused on how you would like crawlers to behave rather than how you would like people to treat your data. I'm not sure that's how it effectuates in the real world, but.
203 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:58:55.710 --> 00:59:10.920 You're saying that all bots always follow robots.txt? I'm not saying that all bots always follow robots.txt. I'm saying, it has been normalized as an access control even if it's not an effective access control.
204 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:59:10.920 --> 00:59:27.240 And I think that that's a useful thing to remember, even if we're saying, so, about access. That's a good hook, in that there are norms formed around robots.txt. They're not always honored and we should keep those, but what we are now looking at doing is forming a new set of norms.
205 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:59:27.240 --> 00:59:42.480 Around the use of data and and part of the concerns have been raised around what, what is the collateral damage of doing so? Or what are the unintended consequences of the risk that raised? Sure, and I I think we're saying similar things.
206 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:59:42.480 --> 00:59:59.760 Yeah, I think we're. Yeah, thanks. Thank you. So regarding access versus use, are we scheduled to talk about the issue about.
207 "TRN6-29-BANFF/speaker_1" (4268955648_1) 00:59:59.760 --> 01:00:18.240 Like, like robots traveling with the data today? We've scheduled all the issues. Okay. So we will. Early on and don't discuss anything else we'll we'll hopefully get to everything. Okay, because I think that.
208 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:00:18.240 --> 01:00:34.590 To me that goes directly to this issue of kind of how we are, as a working group reimagining the kind of the total use of robots.txt of it, of kind of extending how it's.
209 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:00:34.590 --> 01:00:54.590 How it's used. Thank you. So it's 10:30 now, do you think 15? Let's go ahead and take a 15 min break, let folks settle in, and then we'll come back and get into the issues list. So come back just after 10:45 local time.
210 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:00:54.590 --> 01:01:13.290 Right? All right. And there is coffee if you need to stay awake. So folks on Webex, we'll be back in 15 min from now.
211 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:01:13.290 --> 01:01:33.067 Okay. Okay. Can we get an extension point sorry? Much.
212 "Lars Eggert" (2105437952) 01:01:33.067 --> 01:01:38.258 Reminder that the mics are still on and they're pretty good at picking up chatter.
213 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:01:38.258 --> 01:01:48.763 So you might want to give you more people. No worries, your help. Let's see what we can do here. We need to unmute. I'll let them know, obviously.
214 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:17:34.550 --> 01:17:54.550 No, I'm over in Malter. Yeah, so I'm not visiting. So.
215 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:18:14.550 --> 01:18:37.607 No I was. Okay. So remote folks, can you hear us?
216 "Leonard Rosenthol" (1715724288) 01:18:37.607 --> 01:18:41.249 Yes, you're good. Rick.
217 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:18:41.769 --> 01:19:01.863 Let's go ahead and get started again. So we, oh, someone just got their mic on. So their microphone on. Okay. No, there's still an echo.
218 "Martin Thomson" (2567451136) 01:19:01.863 --> 01:19:03.847 Keep talking now?
219 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:19:03.847 --> 01:19:22.700 Webex should have a better thing. That's quite a microphone thing when I Sorry? Captured me from all the way over there. Yeah. So we thought we'd start with kind of the very high.
220 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:19:22.700 --> 01:19:45.500 High level issues about the overall model, and these three at the top here are the ones that are most obviously involved with that. This bucketing of these issues is at best imprecise, so don't overfixate on that. It's just a device that we're using to try and organize the discussion. I'm very sure we're gonna bleed into other issues as we have these two.
221 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:19:45.500 --> 01:20:09.320 Discussions. And, and so this cluster, you know, we've got the focus on purpose of use rather than time of ingestion. We've got the specific proposal I believe from Krishna, which is replace the current vocabulary with a display based preference vocabulary. And then hierarchical structure is problematic and introduces unnecessary complexity.
222 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:20:09.320 --> 01:20:33.350 I think the first two are somewhat, and perhaps we could just have an open discussion of where we think we are with these. If this is an approach that we think is promising. You know, as a reminder, right now the vocabulary we have is simply the training signal and then the search exception. And so.
223 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:20:33.350 --> 01:20:51.630 This could affect how we, formulate the search exception and it could also affect how we formulate other potential terms. Krishna, have you made any progress with this or do you have any, have you, have your thoughts developed or or is this still your current proposal?
224 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:20:51.630 --> 01:21:11.630 So in terms of the display based preferences, we still believe that it's the it's an appropriate way to address observability of how the preferences are expressed. We've had a lot of conversations with the community in the meanwhile, and the conversations have only solidified my belief that it is actually.
225 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:21:11.630 --> 01:21:32.090 A very viable route to expressing preferences. Maybe not all the exact terms that we are proposing, maybe there needs to be some minor changes there, but in general, I'm fairly convinced that some form of display based preference would actually help quite a bit with getting to a resolution.
226 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:21:32.090 --> 01:21:49.710 And I've not heard anything drastically negative that would undermine my confidence in these observations. So my recollection of, I think we first started talking about that in Zurich if I remember correctly. Some of the concerns expressed were how.
227 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:21:49.710 --> 01:22:05.400 General this framework was in that it makes sense in the context of, e.g., what we call search and acknowledging that that is a very contentiously fuzzy term. But when applied to other use cases for AI, keeping in mind that AI is a very broad kind of.
228 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:22:05.400 --> 01:22:25.400 Computer science technique, if anything. They become much less applicable or much fuzzier in their application. So have you, have you thought about that anymore? Yeah, so, even though we call it display based preference, we had a foundation model training category to address training scenarios that is not necessarily displayed the.
229 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:22:25.400 --> 01:22:48.680 Of that is still under debate in this community. But in terms of applicability of this across the scenarios that most people seem to care about, there are of course always scenarios that we've not considered. Even the use cases that we kind of discussed this morning, we could potentially map them onto the use cases and still.
230 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:22:48.680 --> 01:23:04.650 We've actually tried doing this, right? So internally we've when when when we were talking to a few people, we actually tried to take some of the use cases and say, ok, how would we apply these preferences onto those use cases? And we were able to come up with pretty.
231 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:23:04.650 --> 01:23:24.650 Good, ways of doing this without overspecifying something or underspecifying something. That doesn't mean that we've solved all the problems there but nothing so far seems to suggest that we should completely give up on that much so or that it's a really bad idea, right? Like.
232 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:23:24.650 --> 01:23:47.420 We've been really probing a lot on this, so, and and do you see considering the current state of things because it's not the state of things that was we had when when you made the proposal, is it something that's additive to what we have or is it replacing the current terms or or how would you see it being incorporated? I think the current draft has less to do about the use cases that.
233 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:23:47.420 --> 01:24:08.820 And about the foundation model training. I think in that sense it's additive, but there are some adjustments that we will have to make, particularly the one repeated feedback that we are hearing is around how we would define search, right? But setting that issue aside it would be additive given the current state of the draft.
234 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:24:08.820 --> 01:24:26.910 It's not a full hand based robot, like was there a lingering hand or did you want a system? There's a lingering hand. Sorry. Martin? So, after the discussion in Zurich, I sat down and tried to make the display based preferences work.
235 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:24:26.910 --> 01:24:43.410 And I wasn't able to, for exactly the reasons that Mark was talking about. I wasn't able to find a coherent description of what it meant that wasn't tied specifically to an application, and that I think is the primary challenge.
236 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:24:43.410 --> 01:24:59.550 Here, it's it's very easy to sort of conceptualize these things in the context of something like the search engines that people experience today and and think about the the sort of constraints that you had in that context. But in, in terms of.
237 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:24:59.550 --> 01:25:15.810 Thinking about this more generically, I was completely unable to do anything with, with what you had. And I, I raised some of those issues at the time and I don't think we've seen any progress on any of that. So how do you propose that we proceed?
238 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:25:15.810 --> 01:25:33.060 Yeah, we've not submitted a new draft as a, as a response to the feedback we've received. Obviously we can go back and do that. We are happy to do that. But to also represent the conversations that we've had, most.
239 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:25:33.060 --> 01:25:48.690 People that we've had a conversation with seem to be able to apply it and come away with a clear expression of what they want. We'd love to understand more as to where it's failing and maybe that's a good discussion to have.
240 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:25:48.690 --> 01:26:08.690 It's been a good long time at the meeting following the one that and that's all recorded. There's, there's a presentation deck I put together. Yeah, we'll go back and take a look for sure. But the people that we've been talking to, at least I mean, we may propose some changes in the form of the language of how we express it to make it clearer, we may.
241 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:26:08.690 --> 01:26:28.740 Remove one, add one, that's fine. But I'm happy to go back and revisit the conversation from Zurich and try to really adjust the draft, but we've not offered a draft because still a lot of people kept saying, hey, we wanna have more directions around it. So we didn't change the draft so that we have some stability around what people are reading.
242 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:26:28.740 --> 01:26:44.670 I'm happy to go back and revise the draft and make any changes, provide additions, clarify language, maybe add use cases, I don't know, but we can definitely take a look. I'm not opposed to any of those discussions at all. I don't have any dogma about it.
243 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:26:44.670 --> 01:27:04.670 I, I mean, we will be talking about the search terms later on and maybe if we make some progress on search, then it might be good to revisit and say you know I can see a possibility where if we have a good search term that we've agreed to then, then you can start offering refinements to that term or or specializations of that term. I think.
244 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:27:04.670 --> 01:27:34.470 And what I heard before, and I think what Martin is confirming, is that it's just when they're considered outside of the scope of search, it gets a little confusing for people. Yeah, I think that the definition of search will definitely have an impact on the actual terms that would end up in the final draft of something like this, right? And also, if search is really clearly defined, some of these terms may not be necessary.
245 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:27:34.470 --> 01:27:54.470 Which which would help us reduce the burden on preferences as well. So there's, there's much discussion to be had. Like I said, in general, the feedback has been, ok, this gets us most of the way there, but it's all also we we've talked to a lot about with the publisher community, we've talked a lot with smaller businesses, small.
246 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:27:54.470 --> 01:28:26.306 Scale of folks and it seems to get people there most of the way but like Martin is saying, there are scenarios that we probably are blind to that we should probably be addressing and we should go back. I think like Martin's point was like, you know, on a non search context it's not obvious, right? Like, you know, if, if search is the applicable, but if it's not, it's not obvious, I think is the debatable but sure. Chris Chris.
247 "Chris Needham" (1250734080) 01:28:26.306 --> 01:28:37.721 Oh, thanks, just struggling with the mute button there. So yeah, I mean my read of this was very similar to what Martin's just said. I found it very difficult to understand like how I would apply the.
248 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:28:37.721 --> 01:28:39.898 Yes. So.
249 "Chris Needham" (1250734080) 01:28:39.898 --> 01:28:56.620 Rather than, Krishna, as you were saying, like tell us where we're going wrong, I'd be very interested to hear from the people who are telling you that it works for them, because I'd be very interested to sort of hear that perspective.
250 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:28:56.620 --> 01:29:14.627 I'd second that, it'd be great to hear from those people. Farz. Yeah.
251 "Farzdusa" (1320906240) 01:29:14.627 --> 01:29:32.040 Sorry, Webex is very slow. So basically I was one of the ones, and Alissa Cooper and I, we mentioned that we need to revisit.
252 "Farzdusa" (1320906240) 01:29:32.040 --> 01:29:52.040 Krishna's proposal and that it has not been discussed in detail, and I was in the meeting and I didn't feel that we discussed it. And as we are talking, I keep hearing that it's not implementable, it's not, like, we face difficulties, but I don't hear details. Which part.
253 "Farzdusa" (1320906240) 01:29:52.040 --> 01:30:07.260 What is not implementable? What, what, what are the problems? Because Krishna's draft has some granular preferences that might actually address some of the concerns I'm blocking.
254 "Farzdusa" (1320906240) 01:30:07.260 --> 01:30:27.260 We have, I think that we need to, we need to set up like an hour and sit down, go through Krishna's draft. But before doing that, we actually have to consider what we are working.
255 "Farzdusa" (1320906240) 01:30:27.260 --> 01:30:44.220 On at the moment and then see how we can incorporate Krishna's Internet-Draft and how Krishna's Internet-Draft can help us. But at this stage, all I'm hearing is that no, we don't like this.
256 "Farzdusa" (1320906240) 01:30:44.220 --> 01:31:00.291 It was not, we worked on it, we don't like it. And I think that Krishna's draft actually has a lot of merits and it could address some of the problems that we face in our conversations.
257 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:31:00.291 --> 01:31:19.370 Thanks. Paul. So I think, I think there's at least one problem that has been clearly articulated here and that it seems to be very narrowed towards a specific sort of scenario and it is hard to imagine how this works mainly the search.
258 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:31:19.370 --> 01:31:21.450 The internet search scenario.
259 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:31:21.450 --> 01:31:41.450 And it is hard to see how this provides us with meaningful use cases in a lot of other contexts. Like I think that is sort of at least one sort of substantive point of criticism that has been raised by a number of people here and I think we should acknowledge that. The other thing, if I understood you correctly, I heard Krishna.
260 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:31:41.450 --> 01:31:49.350 Say, towards the end of your thing that you actually said if we were able to develop like a search category.
261 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:31:49.350 --> 01:32:09.350 Based on the other approaches that have been put forward, that would mitigate the need for some of the more granular things that you have proposed. So like I actually thought that was a way forward for us to first look at the other things and then maybe come back to this and say, are there still parts of that? If that was your suggestion.
262 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:32:09.350 --> 01:32:28.919 Understood that right? Like I think that is maybe a way forward for now, but I'm not sure if I fully understood you correctly well I mean it's if we have clarity on one aspect of the preference that is getting expressed. Let's say that we completely nailed down the definition of what search means and how we can express things there.
263 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:32:28.919 --> 01:32:46.769 Obviously any other proposals that have been put in place or have been proposed actually have to be reconsidered in the context of the clarity that we've just had, of course. So, in that sense, yeah, I mean, if we completely nail down the definition of search and we are able to offer something with clarity, then.
264 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:32:46.769 --> 01:33:01.799 Yeah, we'll have to go back and relook at what other proposals are there, not just this one, every other proposal needs to be reconsidered in the context of the clarity that we have. So I just just to follow on to that I, you know, from from listening to this, my my current thinking, which.
265 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:33:01.799 --> 01:33:21.799 Very happy to change is that, you know, the, the most productive thing might be to to get to the search on our agenda, have that discussion, see if we can make some progress, have that discussion with your proposal in the back of our minds. Sure. And then if we can make progress there, come back to it and say, ok, let's see how this could fit in and whether we're still interested in so.
266 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:33:21.799 --> 01:33:41.799 And so on. The approach. The caveat there is, is if you want it to apply to something greater than search, that brings in, I think the issues that that have been referred to and we have to think through what what does it mean to have a limit on the the extra length in, you know, other like a summarization use case or in a, you know.
267 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:33:41.799 --> 01:33:59.249 Accessibility use case or whatever. So I will say this, right? If we say that this has to have applicability in contexts outside of search, then it's incumbent on that argument to provide all the contexts where it should be applied, where it fails.
268 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:33:59.249 --> 01:34:19.249 And, it would be at least at least a handful of really key use cases that would fall outside that it doesn't meet needs to be provided and we need to work through them, right? Like it's just at that point it's just how you work through the technical details of what gets proposed. Right. We can.
269 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:34:19.249 --> 01:34:35.189 Or just say hey it applies to other scenarios and if other scenarios and you're not able to articulate it, that's not a good argument. But but yeah, we, we could specify it as just these are parameters on the search term or we could but we'll get to that it sounds like. Okay, thanks Nate.
270 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:34:35.189 --> 01:34:55.189 Yeah, so on that point, I think there are a couple things. One, I thought a lot about like output or display based expressions or ways of doing this. And the issue is that it doesn't capture like those of us who just don't want to contribute to this AI enterprise, maybe because we hold the ideological position that we just.
271 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:34:55.189 --> 01:34:58.799 Don't want our work used for something that we think is harmful for the world.
272 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:34:58.799 --> 01:35:18.799 And so if you're only focusing on display like you, you miss that sort of ability to express a preference for everything that leads up to the actual display. To put a concrete example on it, imagine a sketch artist. I I have a lot of artists, artists friends who care a lot about this. Imagine a sketch artist who has a very particular style.
273 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:35:18.799 --> 01:35:29.159 They may not want their work to be used at all by any sort of generative models, independent of what is displayed because they simply do not want to contribute.
274 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:35:29.159 --> 01:35:47.429 To that and they might have very valid reasons for doing that, and also with display in that context, it gets really hard, like at what point is that style being pulled into the output anyway, even if you were going with a display sort of approach, like because of the way that these models work, like everything that is included into the input affects the output.
275 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:35:47.429 --> 01:36:04.529 And so I think that anything that's focused on display or output is inherently limiting and doesn't capture those of us who wish to express the preference that we just don't want to participate in this AI endeavor. So we'll stop. So I just have a follow up on that. Doesn't the foundation model training part capture that? No.
276 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:36:04.529 --> 01:36:23.129 Because what I just talked about is a, is like a grounding example. That's the use. Yes, yeah. Yeah. We'll we'll get to that discussion as well. Yeah yep what's your name? Sorry. Martin? Yeah, so I'm I'm still unclear about whether this is scoped with it.
277 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:36:23.129 --> 01:36:40.919 Within the context of that one specific application or whether this is intended to be general. And I think this is a point of confusion in all of this. I didn't hear you very clearly, Krishna, when you spoke just now. I didn't get your position on this.
278 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:36:40.919 --> 01:36:59.579 You said a lot of words, but I'm not sure that really answered the question for me. Are these display based things limited to search applications and those adjacent to them, or are they intended to be more general? Because I think a few people have interpreted this as being more general.
279 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:36:59.579 --> 01:37:17.309 And when we try to interpret them as being more general, we're. So ok I'm gonna say a lot of words which won't make much sense to you because that's all I'm capable of with all the that I have right now. So we'll give it a try. Yeah, I'm gonna try and see if I fail at it. Okay.
280 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:37:17.309 --> 01:37:32.549 I think that display based preferences get to 90% of the use cases that people really, really care about. And any preference or anything that we write as specifications coming out of this body, any other body.
281 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:37:32.549 --> 01:37:52.549 It's impossible to cover a hundred percent of the cases and make everybody happy. If we can come up with something that can just literally make every single stakeholder happy, that's great. But in the absence of that, we have to satisfy 90 to 95% of the things that people want and are observing as trouble in the ecosystem.
282 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:37:52.549 --> 01:38:11.009 So the proposal is geared towards that. So the display based preferences definitely address that part of the problem, and it does go further and addresses some of the other use cases because we went and looked at it for code generation, e.g., right? And it does help there as well because of various.
283 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:38:11.009 --> 01:38:28.079 You know, options that we have there. Now, we can argue whether, you know, it satisfies one very specific use case you have in mind or not, but I am pretty sure that it satisfies 90 to 95% of the use cases that people have. So.
284 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:38:28.079 --> 01:38:48.079 If I could interject, I think the concerns that I've heard, if I'm understanding them correctly, are not so much about what use cases it covers, it's that they're difficult to interpret on their own. So, you know, we've defined a training term and you can visualize that as an application training a model and.
285 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:38:48.079 --> 01:38:52.169 That produces an output, and then the parameters that the the.
286 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:38:52.169 --> 01:39:12.169 Display based terms parameterize that output or how it can be used. Same with search. But if, if they're just floating out there on their own and you apply them to some application that somebody's using AI for, their application is perhaps less clear. And I, and I think that might be the nexus of the concern. Yeah I mean so we are getting into an attachment mechanism now, right? Like how you attach the.
287 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:39:12.169 --> 01:39:29.099 Preferences also has an impact on how people will be able to use it in different scenarios. No, I think Mark's point is more that, and we've talked about this in the context of search. Search is not a single thing. It's kind of a spectrum of uses of this technology that include.
288 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:39:29.099 --> 01:39:45.899 Finding stuff and go all the way to chatbots. The point here is that we might be able to look at a particular point in that space and say that the display based preferences apply in a particular way to this specific application.
289 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:39:45.899 --> 01:40:02.159 But we can't generically apply that to arbitrary uses of the technology. What would the arbitrary uses be? Well, that's exactly the point. But what would they be? A few examples would help.
290 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:40:02.159 --> 01:40:19.139 People use AI for a few things. Pick one: self-driving cars, stock prediction, medical imagery, you know, dealing with.
291 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:40:19.139 --> 01:40:35.639 Diagnosis in, in medical applications. Where, where do the display based preferences apply in the context of detecting tumors in a, in a radiology scan? I think we are conflating some of the model training aspects.
292 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:40:35.639 --> 01:40:51.299 And the grounding aspects with the display aspects. So, in the scenarios that you have identified under the display based preferences options, we do have things that allow you to not ground on something. So.
293 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:40:51.299 --> 01:41:06.899 I think that if you were to extend that logic, it is possible to express that certain images, image types cannot be used, certain inference types cannot be used. But if there are scenarios like that, there are, they represent gaps.
294 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:41:06.899 --> 01:41:25.109 Right? That doesn't say that all the other scenarios that we've described, which cover 90 to 95% of the cases, are wrong. It just means that we have to cover the 10% more. Yeah, Kevin, go.
295 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:41:25.109 --> 01:41:44.309 So when I've thought about display based preferences and approaches like them, which I have a little bit, I like them in the sense that they focus on the observable outcome, which has been one of the ways we've built consensus in the past. I thought about them more in the way that Mark described them, as parameters on a category, so that in effect you can say no.
296 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:41:44.309 --> 01:42:04.019 Or yes but with conditions, right? I think that the use cases we were trying to separate with those conditions, let's say search as an example, were things like the lederhosen-style traditional search, ten-blue-links-y thing and a full synthesis AI overview that doesn't provide attribution, right?
297 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:42:04.019 --> 01:42:20.489 That was the thought behind, behind providing those parameters, right? Our colleagues from Google made the point accurately that there are all kinds of AI and machine learning techniques used to deliver.
298 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:42:20.489 --> 01:42:39.089 The lederhosen-style traditional experience, and there isn't anything wrong with that. I think what we heard from our colleagues on the publishing side was that what they were actually upset about was.
299 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:42:39.089 --> 01:42:58.949 Was that the, the outcomer directed traffic to their websites in fact, to sort of what led into to Brad's proposal around substitution abuse, right? Which is part of it.
300 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:42:58.949 --> 01:43:14.789 And then the things leading up to that display text also, the things that might not be displayed. That could be another you know what Nate was saying in terms of, it might not display the same style in this context, but can you speak up?
301 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:43:14.789 --> 01:43:34.789 Yeah, I was just kind of saying, beyond the display, there's also kind of the lead up to that, the use of content data in generating information that may not be displayed, but it could be used for the purposes, right? So I was trying to relate that to what Nate was saying before about the style being extracted from a certain sketch artist, it may not be displayed in that.
302 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:43:34.789 --> 01:43:39.959 Context directly but it may derive that style and use it in another way.
303 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:43:39.959 --> 01:43:56.729 So that's kind of another concern that takes us back to the conversation we were having over coffee during the break, but I totally hear Chris's concern; I think it's a very valid one. On the other end of that particular pull, though, if we are expected to, you know.
304 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:43:56.729 --> 01:44:16.729 To index search results, for example, we need to be able to work with the content that we're scoring and indexing and retrieve it. It seems like we should be able to find a way to structure that, right? So that's why we were talking about AI use versus display and.
305 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:44:16.729 --> 01:44:21.929 Like how do you, what uses fall into that category versus outside of the fixed?
306 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:44:21.929 --> 01:44:41.929 So I usually ask people to expand acronyms. What does the acronym Lederhosen mean? So we were in Zurich. I was there. We got a little goofy; we were all locked in the room for a couple of days. You'll all get to experience that, if you didn't, in a day or two.
307 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:44:41.929 --> 01:44:46.379 And people kept saying traditional search like it was in the old days.
308 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:44:46.379 --> 01:45:06.379 And we noticed that every time somebody said search, different parties in the room like had a different association with it, went back to their conception of what search was and how it was built, what effects it has and stuff. And so I think Martin was Martin. No it was. Yeah.
309 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:45:06.379 --> 01:45:12.089 I think the intention was just to stick a word on it that was the thing we were talking about and didn't have any preconceptions attached to it.
310 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:45:12.089 --> 01:45:32.089 And we picked Lederhosen because they're traditional. You just get the history, you don't get the meaning. So it's the ten blue links, classical search of like 15 years ago, right? It's kind of the closest thing I can give as a definition for that. I think Aaron's
311 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:45:32.089 --> 01:45:48.329 proposal on GitHub from a couple of days ago, which calls it non-generative search results, is actually maybe a more suitable technical term for what we call Lederhosen, even though things like the snippets on it are probably.
312 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:45:48.329 --> 01:46:03.359 Generated. Well, we'll get to this, I think. I think it tries to point to the same spot, like to the same concept, and I like that better. Okay.
313 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:46:03.359 --> 01:46:20.249 So I'm glad to see we already have a Lederhosen reference just barely two hours in. That's good for the working group culture. I think from what I'm hearing it makes sense to maybe put a pin in this, keep it in mind as we have our other discussions, and come back to it.
314 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:46:20.249 --> 01:46:39.389 And see if we can make some progress there. Does that make sense to you, Krishna? That's fine. Okay, great. So then we have, in the same bucket, this focus on purpose of use rather than time of ingestion.
315 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:46:39.389 --> 01:46:59.389 What eventually loads. This was from Echo. I'm not sure this is terribly actionable at this point, in that in some ways we seem to have gone down this route, if you consider creating a foundation model a purpose.
316 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:46:59.389 --> 01:47:09.989 And if you consider hosting a search service, for some definition of search which we're still trying to nail down, also a purpose. Those are the two terms we have.
317 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:47:10.000 --> 01:47:19.039 I think this could be used to inform any future terms that we come up with. Is there anything else to talk about on, on this particular issue right now?
318 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:47:19.039 --> 01:47:31.510 This does seem to keep on coming up, and I haven't heard pushback against it conceptually, but in terms of what it actually means for action, we still need to get down to the details.
319 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:47:31.510 --> 01:47:51.510 I mean I tend to be of the opinion that the existing search definition, at least the current draft is more of like a display output based thing. I think that purpose is what matters. Like I think it's what matters to asset owners, and I also think that the person who is accessing the asset.
320 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:47:51.510 --> 01:48:13.390 Knows what their purpose is. So I think that it is a good approach for doing this because it is expressible, it is knowable, and so it meets both of those sort of things. I think that the foundation model training category is really good and I think we should just continue that forward logically with the same sort of structure as opposed to trying to shift to some sort of like a display based thing.
321 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:48:13.390 --> 01:48:32.470 I guess I'm not sure I agree that the current search definition is purpose based, but in general I think a purpose-based approach makes more sense. Just at a general philosophical level, does anybody wanna push back on this notion of trying to be more purpose based in the vocabulary terms we define?
322 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:48:32.470 --> 01:48:49.450 So if we go down that path, are we getting into the same realm we have today with a lot of privacy frameworks where if I'm a data collector.
323 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:48:49.450 --> 01:49:08.440 And I I've declared my primary purpose for use of that PII, my original collection purpose, much like if we declare the original use of the access in this framework. Under the privacy rules, if I want to use that stuff for a secondary purpose.
324 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:49:08.440 --> 01:49:25.120 I can't. I need to go back to the original party and say, I now want to use your data for this new purpose that wasn't originally declared. Do we end up in that same situation if we take this approach where there's an original purpose and original intent for the use.
325 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:49:25.120 --> 01:49:45.120 But if I wanted to use it again for a secondary way or an evolved way, I need to go back and re access it or reassess it. Well, I we're not creating a negotiation protocol. I understand that, I understand that, but but in our in our mindset, when we say we are going to have this framework that focuses on a purpose.
326 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:49:45.120 --> 01:50:05.190 What happens when the, the purpose changes or there's a new purpose that occurs? What is our expectation in our work that we're doing here of what that consequence of that new purpose is I guess where I'm asking. That that's definitely something we need to keep in mind and I think that's one of the reasons why.
327 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:50:05.190 --> 01:50:35.860 We've pared down the vocabulary so much. To contextualize this issue, Echo raised this at a time when our vocabulary was much more tilted towards the technology used at a particular time rather than the purpose it was used for. And I think that's what he's trying to highlight here. And honestly I'm not against Eker's observation. I'm not trying to argue against it. I'm trying to explore the consequences of going: yes, this is a better way of doing it. And I think that one of those consequences is this secondary purpose.
328 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:50:35.860 --> 01:50:53.020 Question then becomes, what do we do with that? Like, what is our intent to do with that? I'm not even trying to bias it or even say what the answer should be. I'm just raising that I think it creates a question that we have to address. Well, I think we need to keep in mind also where we are.
329 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:50:53.020 --> 01:51:09.190 Expressing preferences, so it's the question of if I want to honor your preferences and you've told me your preference is only X and I want to do Y The natural implication is is that I need to go talk to you or double check or. Sure, number of different pathways or I need to decide not to ignore your preference.
330 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:51:09.190 --> 01:51:28.840 So I think one of the things is that the temporal thing of this has not been solved, right? Like so the idea was like, you know, when you collect the data for like training or whatever, you kind of remember what you got in the robots.txt and that's what you have. That's the preference. So if you, if you're like, in addition to Mark's point, whether we are, so if you decide to.
331 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:51:28.840 --> 01:51:48.840 Like honor the preference, right? Like the idea would be like, you know, it's at the time of collection that the preference is expressed. So if you do another training run or something like that, you can go collect the stuff again and do it. But other than that, I don't think we are putting in any requirements for like going back and checking. Like Mark said, this is not a negotiation protocol, right? It's at the time of collection, and in cases like Tom's stuff, right?
332 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:51:48.840 --> 01:52:12.219 The idea is that this preference is gonna be transitively communicated, like you know, when somebody collects it and passes it on for use at some other time. I think that's the best we can do. Anything more than that is probably gonna be so much harder. Okay, and again, I'm trying to be very conscious of not biasing the outcome. I will say if that is the way we wanna go, then it would be very good for us to document that that is our expectation.
333 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:52:12.219 --> 01:52:27.759 And that's what we've decided, just so that people realize, oh, OK, that's what they meant in this context when it comes to secondary purpose. Sounds good. Thank you. Would it be possible for you to file an issue on this? I can try. Okay, thank you.
334 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:52:27.759 --> 01:52:47.759 Yeah, I think some of this is happening in the chat now. I see folks that are saying preferences might travel with the assets, so some of this is going to be addressed in terms of the attachment mechanism, and I think it just also points to the limitations of what a preference can or can't do. Secondary uses, you know, that's why we have terms
335 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:52:47.759 --> 01:53:17.409 Of service, that's why we have things that are more detailed in places at the point of collection, which can address what uses are permitted or not permitted in terms of the direct engagement between the collector and the provider. And I think that's, you know, another reason why we need to bear in mind that these preferences can. Right, can you speak up for people on the other side? Yeah, sorry, that these preference, sorry. These preferences are.
336 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:53:17.409 --> 01:53:37.409 If these preferences have their own limitations, so terms and conditions, that's why they're so important, that they continue to apply, because that will cover the secondary downstream, potentially the secondary downstream users, where intentional purposes could change, what end users might be able to have access.
337 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:53:37.409 --> 01:53:58.779 To. And yeah, we're definitely gonna get to attachment at some point, but I want us to make sure we stay on topic here. You know, I think this is really about purpose-based vocabulary versus technology-based vocabulary.
338 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:53:58.779 --> 01:54:18.779 Yeah, so to the purpose question, Mark. I think bearing it in mind and having purpose kind of infuse how we frame the vocabulary is really important, because ultimately that's what gives it meaning. But I also think we're going to have to balance ways in which the vocabulary is framed.
339 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:54:18.779 --> 01:54:40.329 So that it is well understood and not a matter of pure intent because that makes it rather sort of, you know, you know, amorphous when it comes to, do I have that intent? Is there an intent? Is there a uniform, you know, way to, to, you know.
340 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:54:40.329 --> 01:54:57.309 You know, to show that, and I don't know if there is. Warren? Yeah, I mean we've had very similar conversations in the past, and accurately describing the sort of purpose, I think, is gonna get very hard.
341 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:54:57.309 --> 01:55:15.039 So when you stop publishing something, you say this can be used for search as an example. Then people decide it would be really useful to be able to use some content for doing self driving cars. What you would like to do is, can I search through these images for stop signs?
342 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:55:15.039 --> 01:55:32.289 So I can train my car what a stop sign looks like. Is that search? Is that not? Or, it would be useful, you know, using an early example, to be able to train things based on images or radiology. Can I use stuff which I.
343 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:55:32.289 --> 01:55:47.319 Initially got for training things to be like this is a broken bone or a broken arm. I don't know. It was originally searched. It wasn't search radiological or similar, having a way to.
344 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:55:47.319 --> 01:56:03.519 Predict, in the future how AI might evolve seems incredibly difficult, and either you say NO you cannot use it for something so you opt out or we're going to end up in a case where you can't really evolve from this platform.
345 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:56:03.519 --> 01:56:20.229 Well, I think we're overcomplicating things. Like, we have at the moment two purpose-based definitions, right? And to your point, this is not about.
346 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:56:20.229 --> 01:56:37.929 Can or cannot. The thing that we're trying to clarify is: can I use an asset that I have access to for a specific purpose? Our current vocabulary draft contains two purposes: a somewhat ill-defined search purpose.
347 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:56:37.929 --> 01:56:55.929 And a somewhat better defined foundational model training purpose, so like you go there and you see the preference associated to that. Your AI model, your stop sign example, sounded to me pretty close to, like, I'm training a new foundational model for sort of course.
348 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:56:55.929 --> 01:57:15.929 If it says no, then you can't. If it says yes for search, then you can use it for that purpose. So like, we are at a purpose-based thing at the moment, and in that sense I think Echo's sort of like issue is actually resolved for the time being. We might need to reopen it if we come.
349 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:57:15.929 --> 01:57:37.599 With other non-purpose-based vocabulary terms. But as long as we have purpose-based vocabulary terms, like I think we've solved that. And I understand that. I think this is informed by our experience over the last year plus: we tried the technology-based approach and it came to insurmountable issues in that. I think that we're in violent agreement, Gary. Okay, perfect.
350 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:57:37.599 --> 01:57:53.889 Thank you. Thanks. Yeah, I I think we should be kind of flagging these things and then setting them aside for the attachment discussion because I agree, like if we have purpose based categories and, and somebody wants to.
351 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:57:53.889 --> 01:58:09.579 Use an asset for multiple purposes, then each of those is covered by the relevant category and and like how to retain that over time or write I think those are attachment questions and it's good to raise them, but then perhaps document them in the wiki and then.
352 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:58:09.579 --> 01:58:29.579 We'll move on. Thanks. Yep. Okay, I'm inclined to go ahead and close this one. Does anybody want to push back on that? Can I just ask? Yeah, Nick. What is the purpose of the current search? I'm just kind of surprised that people would characterize the current search definition as purpose based, so I just want to kind of align on like what.
353 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:58:29.579 --> 01:58:50.919 People view as the purpose of the current definition, like how is it purpose based? I think, like Paul did mention, it's kind of ill-defined, right? Like so I think that's something we need to work on. We acknowledge we need to work on it, definitely. Yeah, so one very important thing to keep in mind is that this specific search preference has a certain origin story to it, which is.
354 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:58:50.919 --> 01:59:10.059 Mostly as a secondary preference towards the no-training preference. So this comes from, I think, a use case that was identified relatively early in the thing, that people wanted to be able to ensure that when they say no to training, or foundational model training in our current things.
355 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:59:10.059 --> 01:59:27.129 That they could clearly express the fact that they do explicitly not want that to result in an exclusion from traditional search results. That was the, it's a bit of an odd one, like we think that many people who will actually want to.
356 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:59:27.129 --> 01:59:47.129 Signal negative, in the sense of do-not-do-this, preferences might want to couple them with a positive for search in that sense. Okay, I mean, I understand that. I think I just conceptualized it as more of like a display or output based thing, because it's really based on these conditions that are display based.
357 "TRN6-29-BANFF/speaker_1" (4268955648_1) 01:59:47.129 --> 02:00:04.719 So if it is purpose based, I think it's important that there's a clear articulation of a purpose, whatever purpose it is. I think the abstract purpose is really to be able, like we had that for a long time, train no, search yes, to kind of like decouple these two.
358 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:00:04.719 --> 02:00:20.199 Very abstract, not fully defined concepts from each other to give people the ability to do that. Okay. I'm gonna go ahead and close this one, I think. We can always reopen it.
359 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:00:20.199 --> 02:00:36.369 Closing it does not mean there's consensus; it's just note taking for Suresh and I, basically, to make sure we understand the state of the group. So that takes us to number 170, hierarchical structure is problematic, which I think is Krishna's issue as well.
360 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:00:36.369 --> 02:00:51.789 And I think the state we're in here is we don't actually have hierarchy, except that there are some lingering bits of text in the draft about hierarchy. Draft 05 doesn't have hierarchy, though.
361 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:00:51.789 --> 02:01:11.789 The terms are not, well, unless you consider there to be an overlap between search and foundation. Like, that issue clearly speaks about draft 03, in sort of like no to all and then a couple of these things, and I understood Krishna's comment.
362 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:01:11.789 --> 02:01:31.629 Would be about that, that was considered to be problematic. I think in the current state of the draft this might be less of an issue, Mark, because I think it's addressing an issue that was there but not. I think perhaps the best thing to say now is we'll come back to this if necessary.
363 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:01:31.629 --> 02:01:50.499 I saw your notes there. I think that's good. Okay. All right, let's leave that alone for now. That takes us to our old friend. Oh, Max. I just, I presume we'll come back to it as a matter of course, but I did want to note that.
364 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:01:50.499 --> 02:02:05.739 The arguments about the like automated processing category and the TDM level conversation, we'll probably deal with this one. I think that's what you meant by will probably come back to it at some point, but I just wanted to.
365 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:02:05.739 --> 02:02:24.909 Vocabulary terms we define. Right, and I know that that's, you know, that's one of the things that changed in draft five here was was the removal of that level of category and I know that that's not really a consensus position and we've had some argument about it, so. I just wanted to call it out, that's right.
366 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:02:24.909 --> 02:02:44.909 We're at the point now where we didn't have consensus to have that in the draft, so we're looking for proposals or for a term that can get consensus to re-add. Sure, sure. I think, if I remember correctly, there's some issues that address that later on. I don't think there are currently proposals to re-add a catch-all category, like, which is.
367 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:02:44.909 --> 02:03:14.739 Frankly, given the history of that conversation, somewhat surprising, but like looking at the current documentation that we have, nobody has proposed to re-add it. We've removed it from 04 to 05. Okay, so if you want to make a proposal. Sorry, I might have been conflating explicit GitHub issues versus conversation on the mailing list, which I think was still relatively strong disagreement to.
368 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:03:14.739 --> 02:03:34.739 Anyway, I hear what you're saying. Perhaps we just put a pin in that. So Max, there's like one or maybe it might help, maybe it won't, but I think the idea was like, you know, we're not gonna add stuff in the draft without consensus, right? So the reason this is not there is because we couldn't get consensus on it. Not that we had consensus to remove it, right?
369 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:03:34.739 --> 02:04:08.459 Could be that anything that gets added to the draft needs to go through a consensus process. So unless like somebody makes a proposal saying that, hey, and gets consensus on it, it's not going in the draft. So think of it that way, right? Like, you know, it's not that we are like putting in things that. So the idea is like getting the minimal things in that have consensus in there, right? Like, and respectfully, we don't have consensus on the search definition at the moment; the search definition is still there and the draft won't ship. I think like people are much further apart on the.
370 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:04:08.459 --> 02:04:42.839 Anything that, I don't think, at least as chairs we didn't see a way of getting consensus on it, but if you do have a proposal that we can try to ask for consensus, I'm sure, right? But I think it's much harder, at least from the last time around, right? Like, and we can keep it open, but I think the point still remains that for stuff to get into the draft and to get into working group last call, we need consensus on the text, whatever that is. Even for search, if we cannot get consensus on search, somebody at some point, we need to decide if just shipping the foundational model stuff is worth it or not, right? Maybe it's.
371 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:04:42.839 --> 02:04:53.889 Not, right? But I think the point is like everything that's in the draft at working group last call time hopefully should have consensus. Otherwise it's not gonna go there. In that case I think it just needs to stay open.
372 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:04:53.889 --> 02:05:13.889 Yeah, it is. Yeah, understood. Okay, it's just that right now it's not just, I got it. Okay, so let's move on to training. 183, I think we made some progress on in the side meeting, or sorry, not the side meeting, the interim that we had, the online interim that we had.
373 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:05:13.889 --> 02:05:30.369 And I think, yeah, we were talking about just removing the word large or language around the word large to come up, so I made a straw man proposal on the list, sorry, on the issues list.
374 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:05:30.369 --> 02:05:45.669 Does this address the concern? We have to ask Alyssa who has had three weeks to respond and hasn't. Indeed. I think it does.
375 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:05:45.669 --> 02:06:00.759 Understanding we have one other issue open. Well, we have some definition issues around AI, so if we can put those to the side just definition. And the narrow confines of this issue that Alyssa raised, do we think that this is an improvement that that is sufficient to address the issue?
376 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:06:00.759 --> 02:06:12.604 We're gonna complain about the word wide.
377 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:06:14.680 --> 02:06:39.909 How far are we away from the world where foundation models are not the only models that are going to be how far are we away from the world where foundation models are not the primary or only models where we should not be powering AI services small language models.
378 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:06:39.909 --> 02:06:57.429 You know, like where things are things are are rapidly evolving. I I've I've been a bit puzzled about the change that was made to, focus AI training only in relation to foundation models.
379 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:06:57.429 --> 02:07:14.799 I realize that there may not be a sort of clear way to articulate what sits outside of that, but I also don't know if there's a really strong reason to draw a circle around that now, if AI training is something that we think.
380 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:07:14.799 --> 02:07:31.839 The preference should exist for, you know, outside of the specific object of it, you know, of what's being trained, so.
381 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:07:31.839 --> 02:07:47.619 Yeah. I would maybe just argue the take that a little bit further, like I think this sort of like by focusing on one specific type of model, you create.
382 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:07:47.619 --> 02:08:04.389 A very strong incentive for entities intending to use assets to do something that resembles AI training to claim that the models they are producing are not foundation models. So this.
383 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:08:04.389 --> 02:08:20.949 This feels like asking for people trying to circumvent this particular thing, versus the previous AI model training probably provides more clarity in those edge cases?
384 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:08:20.949 --> 02:08:36.519 That is my main sort of conceptual doubt about what we have right now. So, so are the concerns mostly around this second sentence? I think it, it it is.
385 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:08:36.519 --> 02:08:54.219 The concern is anchoring the, the training use case on a specific type of model. So, assume that that I don't know foundation models is poutine or something, you know, it's just a word.
386 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:08:54.219 --> 02:09:09.639 What other parts of this definition, if you consider the self-contained definition, are problematic in that reading? I think the general implication that there's other models that are not foundation models and that by.
387 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:09:09.639 --> 02:09:25.119 By expressing a preference against this category, you are also implicitly not expressing that against everything else. So perhaps just AI models then. Yes, yeah. What's the like I think that's your question, what's the like.
388 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:09:25.119 --> 02:09:40.149 What's actually the, what's the delta between foundation models and all other, like what are all the other ones? You're not expressing preference against that. Is that do we have sufficient clarity about that?
389 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:09:40.149 --> 02:09:58.779 Are they all benign? Yeah. Well, would you mind updating the issue with like your concern? Because I think like otherwise, you know, we don't have visibility into what you're thinking. I think this is a different issue, to be honest, you know, the.
390 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:09:58.779 --> 02:10:28.425 Alyssa's, yeah, issue is quite narrow. Right, like, but the thing is like it's somehow connected to what Paul is raising, right? Because people might use this as a circumvention tool for like whatever they want to do, right? Like if you kind of do. I'm just trying to close issues, OK, Chris. I think I wouldn't necessarily even call it an issue, like it's a question I have, like if everybody else is. So let's see responses to that, because I'm interested too, yeah. Chris?
391 "Chris Needham" (1250734080) 02:10:28.425 --> 02:10:47.439 Yeah, so to Alyssa's very specific issue about quantified thresholds, I think she's right, and I tend to agree that perhaps we shouldn't attempt to quantify, you know, what is a large model versus another size of model. But I agree with.
392 "Chris Needham" (1250734080) 02:10:47.439 --> 02:11:06.429 Paul's comments earlier that, you know, anchoring this on foundation models, you know, creates the possibility that there are other kind of models that kind of fall outside of that scope and, you know, as technology evolves and definitions change and so on. So I think.
393 "Chris Needham" (1250734080) 02:11:06.429 --> 02:11:23.199 Having something that's perhaps a bit more general than foundation models would would would be helpful. And I'd also like to kind of respond to the, the point that Martin made earlier about the word wide because we're, we're adding here like some element of.
394 "Chris Needham" (1250734080) 02:11:23.199 --> 02:11:39.309 Sort of subjectivity into the definition, you know, what, what do we mean by a wide range of use cases? You know, a foundation model that's only put to a single purpose kind of falls outside of this definition. So I think that that concern around wide.
395 "Chris Needham" (1250734080) 02:11:39.309 --> 02:11:59.309 Foundational are perhaps separate to this particular issue, but I think I think we do need to sort of revise the definition but but to but to Alyssa's original point about quantification, yeah, I, I agree.
396 "Chris Needham" (1250734080) 02:11:59.309 --> 02:12:01.439 With that.
397 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:12:01.439 --> 02:12:18.969 Thanks, thanks for this. Timmy Robot, did you want to say something? You had your hand up before. My name is Timmy Robot. I would also love to see it be made more generic. I think Mark, you asked besides the.
398 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:12:18.969 --> 02:12:40.089 Besides the very large number of assets, I would also say, the 1st sentence, large models, I don't see large as having a lot of helpful meaning. That's actually Alyssa's issue, to get rid of large.
399 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:12:40.089 --> 02:12:58.179 Okay, and also, yeah, I think just AI models would be sufficient. So the text I have hopefully on screen, I removed that second sentence and changed foundation models to AI models.
400 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:12:58.179 --> 02:13:13.989 Just get rid of AI models in the 1st sentence, because it's a circular definition, yeah. So that's wrong, that's objectively wrong.
401 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:13:13.989 --> 02:13:31.689 AI models don't always possess generative capabilities. And so you... What is wrong? What is that? The 2nd sentence, AI models typically possess generative capabilities, is what it says, and that's not true. Okay. That was my next question.
402 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:13:31.689 --> 02:13:48.879 Yeah. Is this what we're getting to? This is the problem we face, right? Right. Because a lot of attention that was here.
403 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:13:48.879 --> 02:14:07.749 Derived from the fact that people were concerned that the definition we had encompassed things like, you know, ordinary least squares regression that you perform, which is effectively developing a model with maybe one parameter or two parameters, and that still.
404 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:14:07.749 --> 02:14:27.749 Fits the definition. And so we went with foundation model because it has, sort of location that it has these broad applicability things. And I think there's a few things that need to be included in that. One is the.
405 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:14:27.749 --> 02:14:44.649 Potential for use, not the actual use, because of Paul's concern, and the other is the generative aspect of the whole thing, and the sidebar on the bitter lesson, I think, is relevant here.
406 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:14:44.649 --> 02:15:02.409 A lot of the applications that people are looking to use are going to be more amenable to use by these generic models because they are that much more capable. So do you have a suggestion? I'm actually kind of OK with where it was.
407 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:15:02.409 --> 02:15:19.629 Post Alyssa's edit. Post the large tweak. Yep. I think the 2nd sentence I would tweak, but that's probably not relevant for this discussion. Well, if that tweak makes people more comfortable, it is relevant.
408 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:15:19.629 --> 02:15:39.629 No, I was just waiting. The I think we have a huge queue to go through. Kevin? I was gonna throw into the mix, is it possible that the thing we're interested in here is.
409 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:15:39.629 --> 02:15:59.439 The idea that the model could be used for arbitrary purposes, potentially unrelated to whatever purpose is making the person expressing preferences express the preferences they're expressing, rather than having to model this or anything else. It's the fact that the use is.
410 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:15:59.439 --> 02:16:16.419 Arbitrary. I'm gonna interject, and I'm very conscious that I don't hear anyone pushing back on getting rid of large, but I do hear people picking apart other parts of the text here.
411 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:16:16.419 --> 02:16:35.409 So, is anybody in queue to push back on getting rid of large? I think the biggest concern I have there is just that it seems like it sweeps in a whole bunch of other stuff. It extends the scope of the conversation quite a bit, if we go in this direction and get rid of large.
412 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:16:35.409 --> 02:16:54.639 Yeah. What would be the last so unless like you're specifically commenting on it, Nick? Yeah, I think the subjective terms, especially large, are problematic, but.
413 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:16:54.639 --> 02:17:12.549 I mean, I think this idea of focusing on foundation models might have made sense with the way the products were presented two years ago. Today most of the AI systems are integrated with multiple models, multiple foundation models, RAG systems.
414 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:17:12.549 --> 02:17:29.289 Data retrieval that's not based on deep learning. It's maybe based on hierarchical memory. So I wonder if, by focusing on foundation models, we're focusing on one atomic component of.
415 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:17:29.289 --> 02:17:45.189 What are sort of larger product systems? To be clear, we've had many discussions about runtime use of data and other aspects. So this is in some ways intentionally compartmentalized, but that's good to keep in mind.
416 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:17:45.189 --> 02:18:04.509 And specifically we did have a separate vocabulary term. Yeah, no, I'm aware of this, but foundation models themselves are becoming less foundational to these products and more of.
417 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:18:04.509 --> 02:18:19.839 One component in a larger system, and I think focusing strictly on them would potentially reduce the applicability. So would just a simple change of terminology help here, or is it something deeper?
418 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:18:19.839 --> 02:18:39.839 I guess that the foundation models themselves, what would that technically be? Would that just be a.
419 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:18:39.839 --> 02:18:59.499 Set of weights? Is that what we're talking about here? Whether it ultimately by talking about producing a set of. If you just say that you end up with the, well, I don't have a great suggestion. I'm.
420 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:18:59.499 --> 02:19:15.579 This has the risk of being too broad and also too narrow. Simultaneously. But I agree with removing the large service. We've discovered that.
421 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:19:15.579 --> 02:19:34.839 Yeah Alexy of new issues, we might be able to close one, so Mike? So speaking not as the AD here, we, we said that we've kind of moved in a direction of expressing purpose and I think that's true for search.
422 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:19:34.839 --> 02:19:51.579 But I think the reason that we have a foundation model category, like, we said we were looking at purpose based because we wanted to abstract away the question of how did you build the thing you built? If you use AI internally.
423 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:19:51.579 --> 02:20:08.829 With some exceptions, we had previously said maybe we don't care how you arrive at it. We care what you do with it. And so the foundation model really means the thing that you are building.
424 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:20:08.829 --> 02:20:26.079 You will deliver to someone else and you don't know what they'll do with it. So it, maybe this is just wildcard. Like if you, if you don't have a restriction on how your content is used, it can be used for one of these models.
425 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:20:26.079 --> 02:20:42.279 Otherwise, you are expressing some kind of restriction and that restriction then would have to flow with the data or with the model. That is Mike. Sebastian, go ahead. I saw you dropped off, so.
426 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:20:42.279 --> 02:21:02.279 Well, like in 2nd draft we had the distinction between models that have the capacity to generate like synthetic content or, models that are specialized for the purpose of generating images and so on. So we had both and now.
427 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:21:02.279 --> 02:21:25.269 I was wondering what the reason was to drop that, so that would maybe be addressing the issue. I think this specific one is about quantifying the large thing, right? So that's this specific issue, but not about the.
428 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:21:25.269 --> 02:21:45.269 Okay, we're also having a secondary discussion about the term in general. We haven't logged an issue about that yet. We're still getting what the scope of that issue is. Paul. So I'm also in favor of sort of dropping the large and maybe closing this. The question I think we need to separately.
429 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:21:45.269 --> 02:22:07.979 Answer is, if we had a discussion this morning quite extensively about a use case, which was I don't want my assets to be used to train AI models, why are we coming up with a vocabulary which defines a use case that is not train AI models but train foundation models? Maybe there's good reasons for this. But to me this.
430 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:22:07.979 --> 02:22:34.359 This implies at least that this is a subcategory of the thing that we had, I don't think anybody really pushed back against that particular use case, but how does this particular definition help us successfully address the use case that we identified this morning? That's the question we can separately answer. On on this issue without large, it's a better definition, so let's change that and maybe close the issue.
431 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:22:34.359 --> 02:22:53.619 So Paul, if I dig a little bit into it, are you concerned that somebody cannot express a preference on AI models that are not foundational models? Would that be your underlying concern? I'm trying to dig a little bit deeper on that. So my concern is we had this morning like there's four things which basically says don't use this to train an AI model.
432 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:22:53.619 --> 02:23:13.619 Right, and now we're, we're that with the use cases, right? Like this is the thing that we want to facilitate with the vocabulary. And now we're giving people the ability to say don't use this to train a foundation model. Either a foundation model and AI model are exactly one and the same thing, then we've successfully addressed.
433 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:23:13.619 --> 02:23:34.539 The use case. If foundation model is bigger than AI model, we've also successfully addressed the use case. If foundation model is smaller than AI model, we haven't addressed the complete use case. That is my point. That implies we need another category at least, or we would need to rejigger the category or the use case.
434 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:23:34.539 --> 02:23:50.349 So perhaps, I I think part of the issue is in the use case, you know, I don't want somebody to train an AI model with my data is relatively imprecise and what we need to do is define vocabulary terms that are precise enough to be implemented and whether or not.
435 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:23:50.349 --> 02:24:06.069 That's reflecting the intent is I think the question we need to answer. Thank you. Brad. That doesn't seem imprecise to me; it seems broad, but it doesn't seem imprecise.
436 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:24:06.069 --> 02:24:23.529 I'm just saying the general use case and preference of I don't want my content to be used to train an AI model, period. And that includes foundational models or whatever form a model might take.
437 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:24:23.529 --> 02:24:40.239 Small or big, generative, non-generative. Some models might be trained to perform a function of a profession, and someone in that profession doesn't want their content used to train a model that might replace them one day.
438 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:24:40.239 --> 02:24:56.199 I mean, it may be inevitable, but there may be a valid reason for them to express that preference. Then I think we're butting up against.
439 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:24:56.199 --> 02:25:13.839 The problem that we are trying to define some narrow categories, like Mark expresses, but we have the problem that these terms of art are still evolving. I don't know if the term foundation model will still even be in use two years from now.
440 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:25:13.839 --> 02:25:32.050 And while you could say, well, let's put a stake in the ground, I think one of the issues being raised by people is they said, well, does that mean this? Does that mean this other thing that may come in two years that replaces foundation models because some researchers have decided there's a new way of doing it?
441 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:25:32.050 --> 02:25:49.750 I like Mike's observation that it's less about defining the hard term foundation model and more about describing the meaning behind that purpose or that terminology in a more generic way than trying to reuse a current.
442 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:25:49.750 --> 02:26:05.770 State of the art term that may not stick or evolve really well. And so maybe I'm suggesting I don't care if it's big or large. I don't think that question's important, but I do think nailing down a more broad term.
443 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:26:05.770 --> 02:26:25.770 That is gonna have a longer life than a current term of art is maybe a better approach. Yeah, I think it's a good intent, right? But we kind of didn't make progress. I think, Brad, you wrote the substitutive use cases thing, right? But it still doesn't cover native case.
444 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:26:25.770 --> 02:26:58.000 Where he says it's not about substitutive use, I just don't want it to be trained at all, right? So that that does so what about because I've been following this, we've had this conversation for a while now. I think we keep stumbling up on when we start narrowing down to something where we say, oh, we think we hit this, we have voices that pop up and say, oh, but that I think will exclude or it did or my use case that I really care about. And I'm not sure those are productive.
445 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:26:58.000 --> 02:27:13.060 Maybe a different approach is have a broader thing like Mike proposes, and then if there are narrowings that we need to capture, maybe the 1st step is get it into the wiki where we log them, say here's a thing that doesn't cover. Document in the wiki.
446 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:27:13.060 --> 02:27:29.080 But we either then create a distinct case if that's necessary, or we expand this thing to include it or exclude it as appropriate. But it seems like we're going around the same cycle of that's too broad, it doesn't cover my special use case.
447 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:27:29.080 --> 02:27:49.080 We need a different approach to break the logjam here a bit. Just an observation. Yeah, makes sense Glandra. I think one of the thinking, you know, with the initial hierarchical approach was we would do the more generic things now and the specific things can come later, but since we don't have the hierarchy anymore, we cannot go down that path anymore, right? So I think that's your point is.
448 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:27:49.080 --> 02:28:11.650 Thank you. Thanks. And I apologize, I do need to jump after we're on a call. I will be back. Take your way. Thanks. Sorry I got a little bit distracted by the last point. Let me gather my thoughts briefly. I think one of the things that, that we're talking about here with like is this too, is this.
449 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:28:11.650 --> 02:28:27.490 Broad and the difficulties that a couple of people have now mentioned, right? Is there's a lot of different perspectives that site owners, publishers, however you want to say, are gonna bring to this.
450 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:28:27.490 --> 02:28:44.050 And it's very difficult if not impossible to give them one definition of a foundation model that's going to address their specific need. I think when we look at Alyssa's issue with the word large.
451 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:28:44.050 --> 02:28:59.320 It is entirely plausible that different publishers will want to say, this is what I mean by large, this is what someone else means by large. And it's a fundamental, I think issue for this group that we need to be very cognizant of because we can't.
452 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:28:59.320 --> 02:29:19.320 Over prescribe, I think, what a publisher's decision is here. And I know that's a tension here, but I think if I'm understanding Paul's points earlier, that's partially tied to it when we're talking about people trying to avoid a specific definition if we lean into it too much or.
453 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:29:19.320 --> 02:29:40.170 Maybe they or everything sort of gets flattened in that way. And I think that's something that's quite critical on this issue. Thanks Max, thank you. I think we exhausted the queue on this? Yeah, so I'm not hearing any pushback on just dropping large, acknowledging that there are other issues here. So what I think we should do is I'm gonna.
454 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:29:40.170 --> 02:30:00.730 To check with Alyssa to make sure that she doesn't have any further feedback and then we'll perhaps ping the list and then instruct the editors to incorporate a suggestion. And I don't think this is something where we need to do a full consensus call. I hope we're not at that point. So let's do that. And then we have.
455 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:30:00.730 --> 02:30:18.880 This one is Does this capture what we were just discussing? I don't think I'm necessarily suggesting I'm happy to drop this if that specific definition of AI models, but like I think the.
456 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:30:18.880 --> 02:30:38.880 That 1st sentence, does that capture your issue now? Okay, so we do need some further discussion about this and some proposals, especially if people maybe want to noodle over those during lunch. We can come back to this because I would love to leave this meeting with the training term, whatever it is called.
457 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:30:38.880 --> 02:30:58.120 Nailed down a little bit more. So if folks have further thoughts about this, please do conspire over lunch and bring some proposals to us and we'll give you time. And that is for reference issue number 1908. Yay! That's not very many issues.
458 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:30:58.120 --> 02:31:14.560 Coming to the HTTP working group, we're up to three thousand. We'll get that Shut out. And the other training issue we had was use of fine tune in the foundation AI model production category.
459 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:31:14.560 --> 02:31:33.400 This may be obviated by that discussion. I don't have that in there at the moment. Yeah. And this, is this already overcome by events? I think so. Now, I don't know that the fundamental issue has been resolved. I think that there's still an open question about whether.
460 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:31:33.400 --> 02:31:50.170 The use of, I don't know, whatever systems people use to produce, you know, refined weights or or what have you for specific purposes. That has not really been very well addressed in what we have.
461 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:31:50.170 --> 02:32:05.800 I think because we aren't very specific, it probably is included in the foundation model training yeah so it does say fine tune, so that is.
462 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:32:05.800 --> 02:32:23.830 But I think we need to work on this one in the same way that we need to work on Paul's problem. Yeah. Ultimately, what we're looking for here is sort of two things. One is the production of weights.
463 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:32:23.830 --> 02:32:51.306 And then how we structure the name of that thing, because more than just producing weights, there's gotta be some more conditions on it other than producing a set of weights, because the set of weights covers ordinary least squares and other techniques like that that aren't what people intuitively understand as AI.
464 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:32:52.340 --> 02:33:17.850 Right, just a question for those who can explain to me: is fine tuning ever not part of what is broadly considered to be training? Yes, a number of people have more expansive understandings of what fine tuning is.
465 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:33:17.850 --> 02:33:39.540 So some people consider fine tuning to include things like sort of system prompt engineering that happens where you adjust the prompt that system. It also, for some people. Yeah, is making that face maybe not because I disagree that some people might think that yeah yeah it's a.
466 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:33:39.540 --> 02:34:09.570 Little changing the context without changing the weights. Yeah, changing context, those sorts of things. There's also techniques like LoRA and whatnot that involve the production of weights that are based on a system of things that don't ultimately end up changing the weights in the core model itself. So there's a lot to it. Given the what I see is that the the.
467 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:34:09.570 --> 02:34:28.870 Is there a distinction that we need to make tuning for the purposes of expressing it? I don't think we need to. I think that there's probably a version of this that simply focuses on the production of model weights.
468 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:34:28.870 --> 02:34:47.710 As long as we get the conditions on that correct. And that would include the type of fine tuning and that would naturally include all the sorts of things that involve what you understand to be fine tuning. I'm leaving aside the tweaking of the context stuff.
469 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:34:47.710 --> 02:35:09.930 So Krishna, this was your issue, any, any thoughts along those lines? Yeah, I think the point that I was trying to make is fine tune is like a very specific thing. You can still produce the model without fine tuning it at all. This sounds like it's going the right direction to you. Yeah I think what what I used words that essentially makes sense to or we agree.
470 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:35:09.930 --> 02:35:36.960 Yeah, I think we're alright on this one. Okay, just need to do some, someone needs to come up with a proposal. That again. Okay. All right. So those two, that new issue as well as this one, both need proposals for us to make progress. We need to discuss them at lunch I think. Yeah, I think that's a nice lunchtime discussion. Hopefully we'll I'm very happy to give folks who have.
471 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:35:36.960 --> 02:35:56.800 Proposals especially if they have involved multiple people and perspectives, we can give them time to bring those proposals to the group. If it's a single person with a great idea, please go talk to other people 1st. Yeah. So next up is search, but there's hands up. Oh, we have hands up. Yeah, and Aaron.
472 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:35:56.800 --> 02:36:16.800 Nope, I can talk to the folks who are gonna work on this. Okay, so show of hands, who's interested in kind of working on that little cluster of issues? 1, 2, 3, 4, 5. I think I'm gonna follow the names so like it shows up, can you keep your hands up, so.
473 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:36:16.800 --> 02:36:44.380 Getting the transcript, Kevin, Martin, Paul, Brad, Aaron, the other Kevin, the original Kevin, Kayleb, Kathy and then Krishna. Lovely. Thank you all. Thank you for your service. Okay, we've got a lunch together. So, so, and there was someone else, there is Max.
474 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:36:44.380 --> 02:37:01.540 I have what I think is a technical question, but if the organizing group wants to say, Sorry, if the organ, if the group that's self organized there wants to say we'll get back to you after the lunch break, I think that's fine. But I guess reading through the original proposal.
475 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:37:01.540 --> 02:37:20.260 What I'm not entirely clear on is yes, there are some technical differences I think in in the weeds on how fine tuning looks in these different types of fine tuning, but practically, from the perspective of an outside body and the outcome of that fine tuning.
476 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:37:20.260 --> 02:37:40.260 It's unclear to technical differences, and so if there isn't really a output level difference, I don't see what the the reason for being against fine tuning is here. Maybe I'm not understanding something in the weeds here, but I think it's it's it's unclear to me. I don't think that the position is, and correct me if I'm wrong.
477 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:37:40.260 --> 02:38:05.050 It's not that it's against fine tuning, it's that the terminology isn't precise and not sufficient to cover all the different possible inputs or more cases. Let me try to rephrase. I don't think from a technical outputs perspective that the output of these different fine tuning methods is substantially different in order to say.
478 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:38:05.050 --> 02:38:20.470 These fine tunings are not groupable, or these elements are not groupable. I think from a technical perspective, that's how it seems to me. So I'd like to hear more about that from the technical perspective.
479 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:38:20.470 --> 02:38:40.470 Does that make sense? So I think the kind of direction like, you know, that this is going in is like kind of to say like, hey, the foundational models and anything refined from them, what's the direction this was going in? So that kind of goes in the same direction as what you're proposing, right? Like not put a fine point on it, but just say like anything that's some kind of derivative of that.
480 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:38:40.470 --> 02:38:52.348 Broadly is covered would be the direction it goes and I actually would like to speak to that. I I think that that kind of reopens the foundation model category.
481 "Tori Noble" (765022208) 02:38:52.348 --> 02:38:54.739 Into an alternate account.
482 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:38:54.739 --> 02:39:21.730 Into what category into an all training, all model training because the ambiguity that fine tuning introduces is whether you are doing the fine tuning on something that at the end of that process is still a foundation model within the foundation model category or whether it is any subsequent training of a foundation model which could then specialize it into something that is not a foundation model.
483 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:39:21.730 --> 02:39:41.730 And I think that that is an important distinction that we should not be like that introduces pretty significant differences, and I think the point of having a foundation model category is to be able to target something that does have those multiple applications and is fairly general.
484 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:39:41.730 --> 02:40:01.270 And fine tuning is not a necessary component of that because if we define kind of training as just adjusting model weights or something else more generic, then it doesn't really matter like pre post, you know, but that ambiguity about when something stops being a foundation model is important.
485 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:40:01.270 --> 02:40:20.740 Max, would you like to sit in on this, like join this like self organized group and kind of put in your concerns when they talk about it or do you want to wait till they come back with something? I'm happy to wait until they come back. I think, I think there's enough cooks in that kitchen and then just I'm happy to get the summarized output and and try to go from there.
486 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:40:20.740 --> 02:40:40.740 I'm frankly, Aaron's follow up didn't clarify a lot for me. That might be for a number of reasons. I'm not entirely sure, so I'm happy to wait until after lunch, and since it's clear we'll come back to it, I don't think we need to hash it out now. Okay, sounds good. Chris? I'm glad that.
487 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:40:40.740 --> 02:40:42.884 Password?
488 "Chris Needham" (1250734080) 02:40:42.884 --> 02:40:59.560 Yes, so if the process of fine tuning, as Aaron perhaps described it, takes a foundation model and turns it into something that we categorize as not a foundation model, like from a.
489 "Chris Needham" (1250734080) 02:40:59.560 --> 02:41:15.490 From a creator or a publisher's perspective, I'm not sure that's a distinction that is terribly meaningful to us, and I think we would want to be able to express a preference that encompasses both of those sort of activities.
490 "Chris Needham" (1250734080) 02:41:15.490 --> 02:41:31.639 Yeah, that was just on that point, and then the other thing I wanted to mention is, being remote, I'm happy to help with definitions if there's any sort of technical way to involve remote participants, but I'll understand if that's not practical for you.
491 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:41:31.639 --> 02:41:50.560 Thanks for that. Yeah. Okay, so it's quarter after twelve more or less. Our next big topic is search. So I'm inclined to say let's call that lunch.
492 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:41:50.560 --> 02:42:06.520 Let's give folks ample time both to get lunch and have some meaningful conversations. Both that conversation we just talked about as well as any other kind of folks who wanna get together, talk about proposals or come to a better understanding of what we're doing here.
493 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:42:06.520 --> 02:42:21.850 So let's come back here at, what do you think? 1:30? Yeah. Come back here at 01:30 local time and we'll continue, get an update from that group to see if they've got anything to bring to us and then move on to search. Okay.
494 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:42:21.850 --> 02:42:39.370 The lunch plan is you are in the city of Toronto. Nothing has been provided by our gracious hosts. They've provided us with a space, so I believe there are plenty of options nearby.
495 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:42:39.370 --> 02:42:59.370 There's a large food court on the main level downstairs that has lots of options there right down at the bottom of the elevator. So just just go straight down straight down at the bottom the elevator to the front of the building, make a left, you'll see an escalator going straight up. There's more options in there than you can find than you. Sounds good. And and is is it safe to leave our computers in this room? Yes, we have cameras and all that neat stuff there, if you.
496 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:42:59.370 --> 02:43:19.370 You place to eat, you can eat down there or if you wanna bring it up, use the cafeteria up here, you're welcome to do that as well. If you want to eat in this room, you absolutely can. Thank you. Great. Okay. Yeah, and we'll stay on the same Webex so Webex doesn't have the 5 h limitations and we'll stay on the same Webex. Thank you. And, yeah, please come back before 01:30 we'd like.
497 "TRN6-29-BANFF/speaker_1" (4268955648_1) 02:43:19.370 --> 02:43:42.623 Actually start at 01:30. So keep that in mind and see if ours hopefully you can join again tomorrow. I know it's.
498 "Glenn Deen" (2079563776) 02:57:37.219 --> 02:57:50.884 I was always.
499 "TRN6-29-BANFF/speaker_1" (4268955648_1) 03:59:25.819 --> 03:59:42.379 Welcome back, we're about to start. Did you on mute? Yeah. Again sorry.
500 "TRN6-29-BANFF/speaker_1" (4268955648_1) 03:59:42.379 --> 04:00:02.379 It's the same link. Where do you want? Where do you want? It's on the mail it's on the mailing list as well. If you put on the mailing good email Yeah.
501 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:00:02.379 --> 04:00:12.319 The session too, it's the session or like share the link so.
502 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:01:16.119 --> 04:01:36.129 Okay. All right. Yeah.
503 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:01:36.129 --> 04:01:53.089 Okay, so, I hope everybody had a good lunch. We are recording again too. Yep, ok great. We had a group of people talking about what we closed before lunchtime. Do they have anything to report?
504 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:01:53.089 --> 04:02:09.649 And who have they say to bring up the issue? I can do that. Let me foundation models been too specific, 1908.
505 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:02:09.649 --> 04:02:29.269 Sorry, you said this is a PR. This is a PR issue one 90 issues I think one of the things we talked about was exactly what.
506 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:02:29.269 --> 04:02:45.559 Filer commented in that, in that issue about which is that the potential here for us to focus too heavily on foundation model creates a, a I guess a loophole, a gap.
507 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:02:45.559 --> 04:03:00.859 In in the preferences for those people who don't want the production of a model, the fact that it's focusing exclusively on the foundation models and not the specialized models was, was problematic. And so we came up with this one which is.
508 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:03:00.859 --> 04:03:16.099 To start with the production or refinement of weights of models, and then we have this split into two parts. A general purpose model that can generate content in one or more modalities, text, image, audio, whatever.
509 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:03:16.099 --> 04:03:32.719 Or a specialized model that is designed to generate content in one of those modalities one or more modalities. So that basically says that if you're generating a general purpose model, we're focusing on the capabilities of the model.
510 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:03:32.719 --> 04:03:52.719 Whether or not it's actually used for that purpose doesn't matter. If it's capable, then it applies. Or if you're producing a specialized model that has the specific purpose of generating content in one or more modalities, then it's covered. And this is very close to the.
511 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:03:52.719 --> 04:04:11.059 Definition that we had of generative AI, I think from from before and I think that's Paul's insight from the discussion there helped us get past a a hurdle where the focus on capabilities was, was a challenge.
512 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:04:11.059 --> 04:04:31.059 So, that's the 1st part. The other one is, we spent a lot of time thinking about how this definition is potentially problematic in the sense that it could sweep up more than what people intend in terms of things like simple classifiers or the, the examples that we were given before.
513 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:04:31.059 --> 04:04:55.069 Were regression analysis, all those sorts of things that just produce very simple models or produce very simple outputs. And so the intent is to exclude things that produce the simple metrics like yes/no answers or simple labels or probabilities across different dimensions, that sort of thing.
514 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:04:55.069 --> 04:05:14.347 So, that's what we have right now. I think we can probably iterate on a couple of aspects of this, but it'd be interesting to hear what people think about where we landed. So Ekr, do you want to ask that to the room? Chat.
515 "Eric Rescorla" (696034816) 04:05:14.347 --> 04:05:33.680 Yeah, I I don't really understand the difference between being capable of generating content and being designed to generate content. The way these models are trained does not seem to me to actually meaningfully distinguish the two.
516 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:05:33.680 --> 04:05:38.907 Yes, ok, that's a, that's a comment. I'm not sure what to do about that comment though.
517 "Eric Rescorla" (696034816) 04:05:38.907 --> 04:05:58.339 I mean, I guess I just don't understand why you're distinguishing. Why don't you just erase half this text and write "a model that can generate content in one or more modalities"? I just don't understand what the special/general purpose split is doing.
518 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:05:58.339 --> 04:06:15.679 Do you want to talk to that, Aaron? Yeah, I think the concern is that this would cover models that are used for a particular use case but have capabilities that are not being used.
519 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:06:15.679 --> 04:06:38.266 But I am kind of thinking about how it works more in the context of a, like, a system that only has a training control rather than like a, you know, other used categories besides search, so I I'm actually sort of, I think, persuadable to what you said.
520 "Eric Rescorla" (696034816) 04:06:38.266 --> 04:07:03.920 Let me take a quick example, right? You know, these models were not designed to find vulnerabilities in software. If they do so. Now we know they do so. If it's a general model and I fine-tune it, now is it specialized or general?
521 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:07:03.920 --> 04:07:24.709 That's a good question. Yeah. Which is why I was originally suggesting just what Ekr did. So, so significantly simpler. Oh yeah, sorry. I remember why we did this. So the idea is that we want to capture foundation models.
522 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:07:24.709 --> 04:07:40.669 Which are defined based on their capabilities. Foundation models can do many many things, and we want to capture generative models that are used in the generation of content and not capture models that are neither.
523 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:07:40.669 --> 04:07:56.479 And the reason to leave out models that are neither is there are some use cases we think that this work should not disturb or accidentally sweep in, like, you know, classification.
524 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:07:56.479 --> 04:08:23.789 Or or other kind of like that the focus of this should be really generative and if you say that any model with the capacity to generate content is in scope, then you have essentially swept in a lot of models unintentionally that are not used for generation because generative capability can be a byproduct of training a model to do other things. Does that help?
525 "Eric Rescorla" (696034816) 04:08:23.789 --> 04:08:38.260 I surely agree with that claim, but, I mean, wasn't this the whole reason why we started trying to talk about how the models were used as opposed to what the capabilities of the models were, to dodge this precise problem?
526 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:08:38.260 --> 04:08:54.229 Yes, but we, I think are collapsing a lot of things now into the training category that were not envisioned to be in it before, that is this will not just be a foundation model category.
527 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:08:54.229 --> 04:09:24.761 It will be a foundation and generative model category. And in order to do that, you, you sort of do need the two part definition because we can't think of a better way to define foundation model without reference to capability, but, I think we're definitely open to that again. If you have a quick, a quick introduction, Andrew will help clarify this, go for it, otherwise, wait for Chris. Oh, I mean somebody else can go ahead. Okay Chris, go ahead.
528 "Chris Needham" (1250734080) 04:09:24.761 --> 04:09:48.066 Yeah, so I agree with the intent here, to sort of orient this on models that generate content. It's really not clear to me, this distinction between the general purpose model and a specialized model. I'm wondering if the definition.
529 "Chris Needham" (1250734080) 04:09:49.101 --> 04:10:11.000 The intent is to exclude classifiers like should we not include classification as part of the definition of like this is that this is excluded as part of this definition so kind of make the, you know, the, the classifier part, kind of, part of the definition itself.
530 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:10:11.000 --> 04:10:20.621 Yeah, we can add words to that effect, but we didn't get round to drafting them because it took us long enough to get to this point.
531 "Chris Needham" (1250734080) 04:10:20.621 --> 04:10:22.919 Okay.
532 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:10:22.919 --> 04:10:42.609 Thanks, Andrew? It seems to me that you're trying to maintain a distinction that is just never gonna work. That that fundamentally, I, I mean, you know, compare this to other kinds of cases, right? So, you know, automobiles are not intended to be used as weapons, but they are sometimes. And we don't, you know, when.
533 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:10:42.609 --> 04:11:10.399 They're used, we don't exclude them from, you know, the classification of murder. We say, well, in that case you did a murder. So I think the same thing's gonna be true here. And, you know, the problem then is that you've got things that are gonna move categories from time to time and that sucks but too bad. Did you want to say something? I guess.
534 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:11:10.399 --> 04:11:26.269 If you could Alright robot. Sorry, can we have one conversation, please? I'm wondering about whether this could just be re If you could talk more about why this isn't just like.
535 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:11:26.269 --> 04:11:43.879 Anything that's capable of generating content in one or more modalities, like whether it's a specialized or general purpose, right? Like it like it seems like this could.
536 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:11:43.879 --> 04:12:03.879 Be simpler and it would still catch both use cases and so I don't understand why, why it it's this complex. The definition is this complex. I'm looking to defend that, I'm just representing. Okay. The idea was that a lot of the technologies involved.
537 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:12:03.879 --> 04:12:10.789 Produce models that are nominally capable of generation, even if that is not their intended use.
538 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:12:10.789 --> 04:12:30.789 Okay, so it's really designed to is the the part that that we're, we're caring about here is designed to generate content versus capable of generating content. Paul, did you want to clarify something there? Yeah, the the example that was used when we 1st came.
539 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:12:30.789 --> 04:12:58.760 Up with this thing is the classifier like the horse classifier by virtue of being trained to recognize horses, it almost certainly also has the capacity to generate images of horses. That's the that's the thing you're trying to edgate or that you're trying to exclude. And I guess my my follow up question is, I like I think if, if I had content and I didn't want things to be generated.
540 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:12:58.760 --> 04:13:15.020 Based on that content, I wouldn't care what the intent of the initial people were, right? Like if a non generating model is created and put on HuggingFace.
541 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:13:15.020 --> 04:13:35.020 And then it's used to generate content, like I don't really like I don't really care that it, that people change the purpose of that model. I care that it, you know, my content is part of that of those weights now. Right. I, I see discussion in chat about changing from.
542 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:13:35.020 --> 04:13:56.690 What it's designed to, to what it's used to do. So I guess the problem is that the decision has to be made before use. And so there's no way that I know of to guarantee how a model is gonna be used.
543 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:13:56.690 --> 04:14:12.380 But there is a way potentially a way for me to say, I don't want to be involved at all in like generative stuff. If I could interject, is that really true? Because if I'm creating an email classifier, I can, you know.
544 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:14:12.380 --> 04:14:32.380 Assured that that is only used for email classification if I can control the if you can control the distribution, if you control the distribution, yeah. Well, yeah, which is the reason why back when we 1st introduced when this was called generative AI training, it was coupled with.
545 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:14:32.380 --> 04:14:53.920 An AI training category where you also had the opportunity, so this was an attempt to nudge people by giving some extra space but there was the backfill category of AI training pure and simple, like you use this to produce weights, you can say no against that and no discussion about nothing. So.
546 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:14:53.920 --> 04:15:14.330 So that that is a thing which arises from the fact that it is sort of like a s sort of proposed as a standalone thing here. You you can capture that by adding an additional category that is even broader, but I don't think that is where this group is currently going, so that's a I I think.
547 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:15:14.330 --> 04:15:33.080 A valid point that you're making. Thank you. Mike. So I feel like there are two potential paths here, at least, and we are currently dancing over the line between them and that doesn't scale well as the paths get further apart.
548 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:15:33.080 --> 04:15:53.080 Either, we say that the thing you are outputting is a model and we don't know or once the model leaves its creator's hands, you don't know what it can be put to, it can be put to any use and therefore we have to talk about capabilities if it's capable of generation.
549 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:15:53.080 --> 04:16:10.070 But then that's what it does. Or you can say that along with the model goes the preferences of the data that it was trained on, and it is up to the person who gets the model.
550 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:16:10.070 --> 04:16:29.987 To honor those preferences, and these are still only preferences, not enforcement, so if downstream somebody chooses to ignore your preferences, they could have done that anyway. And I think ultimately we have to pick. Ekr.
551 "Eric Rescorla" (1519360768) 04:16:29.987 --> 04:16:46.840 I mean, yeah, I was gonna say roughly what Mike just said, it seems to me that the structure of the document as currently written is: you can say don't use my thing for training at all. But then once you've allowed it to be used for training, then the preferences need to get carried along with the model.
552 "Eric Rescorla" (1519360768) 04:16:46.840 --> 04:17:17.300 And it's the responsibility of people downstream to either comply with or not comply with those preferences, but because of the nature of these models, trying to find distinctions about what kind of models can be trained won't work very well. I thought that was sort of where we came to last time. If that's not where people think about the way that things should be structured, then probably we should discuss that philosophical question as Mike suggested rather than discuss the text here.
553 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:17:17.300 --> 04:17:40.910 Thank you. Warren? Yeah, I mean seeing as somebody mentioned email a minute ago, I think we want to be able to build classifiers that can say things like, is this spam, yes/no, right? And you don't want people to go to opt out from abuse type classifiers. The problem is for any classifier you build, you can also turn it into a generative use case.
554 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:17:40.910 --> 04:17:59.600 Does this look like spam? Yes/no. I perturb it slightly, does it look like spam now? Yes/no. So I think we still run into the problem of, as long as we are accepting the fact that we want to go to train classifiers on stuff, you also are generating generators at the same time.
555 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:17:59.600 --> 04:18:18.290 Which sort of gets back into the training versus use type discussion because you don't want people to opt out from their content being used for various types of classification, right? Not just spam, but also, does this look like CSAM?
556 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:18:18.290 --> 04:18:38.290 Oh, Similar sorts of things, so I don't know where we end up with that. Thanks, Andrew? No, I think it's been overtaken. Okay, cool. Nate, I want to move on to other things and I don't want to derail this, but I do want to point out, I don't actually agree with the idea that classification.
557 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:18:38.290 --> 04:19:02.110 Is always a good use case or something that we should assume people might not have a valid preference against. Classification can be used for censorship, particularly in the context of emails. I mean, there have been cases in the United States with major political candidates who have felt that their content was being classified in a way that was censorship towards them. Classification can also be used for discrimination and when we're dealing with inputs.
558 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:19:02.110 --> 04:19:17.600 That includes photos of people and things like this, like what is spam becomes a decision that that is is really important so people might not want to opt into that, so I think it argues for either a broader definition here or a top level definition of some sort.
559 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:19:17.600 --> 04:19:33.170 Yes. Well, for four months we go in circles trying to define, and when we put ourselves in the shoes of a rights owner.
560 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:19:33.170 --> 04:19:49.460 If we cannot define it, they can't define it. And if they can't define it, when we think about what they want or don't want, they don't want their content to become embeddings or weights. And so maybe we should stop.
561 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:19:49.460 --> 04:20:05.030 It's talking about and defined an AI model, which was the NO NO AI training before, keep only this row level and its preference after all, if if there is.
562 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:20:05.030 --> 04:20:24.050 A law that goes against the the the preference or a specific contract where the preference will not be solved, but maybe we should stop trying to define what?
563 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:20:24.050 --> 04:20:40.190 Yeah, I think I might have missed something. It sounded like there was an assertion that one should be able to say, you can't use my content, like e.g. spam or abusive content to train a classifier.
564 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:20:40.190 --> 04:20:56.240 Which just means that anybody who's generating spam or abusive content will be like, you can't train on me, that feels like we're then gonna lose the abuse. You're assuming the question, but you're assuming that you have the right of a s of knowing what spam is.
565 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:20:56.240 --> 04:21:16.240 And I'd love to share a brief personal story here. So I was among a number of large websites that were classified as spam by a large search engine, and more than half of my industry was classified as spam by a large search engine. We were later invited to that search engine's office and apologized to in front of their entire search team and told.
566 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:21:16.240 --> 04:21:34.280 Nope, that was a misclassification. So I do think that there are many people that might say that like they don't want to contribute to that. And I feel like, you know, these things are inherently wrong, like all AI models, I use that as a marketing term, make mistakes.
567 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:21:34.280 --> 04:21:52.790 And so, if you, if, and furthermore, like the idea of spam, there's warms. Like you're coming at it with your assumption of what spam is, one man's spam might be another man's censorship, and so I realize that this makes it difficult to kind of like plot a path forward.
568 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:21:52.790 --> 04:22:12.790 But I'm really concerned about the centralization of control and if we're just gonna say like, you know, I think that there are valid reasons why people might not want to contribute to that actually. Thanks Kevin? Yeah, I just wanted to note that people are bringing up things like spam filtering and CSAM and whatnot.
569 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:22:12.790 --> 04:22:28.340 If we all agree or really necessary. I think that highlights the fact that, you know, we want, we are focused on wanting to create preferences. I think there are some folks in the room, including myself who are worried that whatever we set as a preference may become a requirement.
570 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:22:28.340 --> 04:22:48.340 So it may actually be a feature rather than a bug if the preferences we set up have to be ignored in some cases such as in the case of security. I think there's no enforcement mechanism that we have, so what you said is necessarily true, Sometimes they'll be ignored, right? That's kind of why they're called preferences. We start out.
571 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:22:48.340 --> 04:23:03.980 But what I'm saying is folks are concerned that what we set as preferences may be adopted by someone else as a requirement, say legally, and so having a set of preferences that we all agree actually have to be ignored in some cases.
572 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:23:03.980 --> 04:23:19.550 May actually be a feature rather than if you're saying lean into that harder rather than trying to avoid. It's a thought It's just a thought. No yep Max. I wanted to to go back to what Warren was saying for a second because I think.
573 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:23:19.550 --> 04:23:35.480 I think you and Nate are kind of talking past each other a little bit and I hope I can bring a clarifying thought here, which is saying no to using content for training.
574 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:23:35.480 --> 04:23:52.310 Might make your model or your classifier less effective, but it doesn't stop you from using that classifier in your products. Like, and I think your concern of like people that create spam will just say opt out.
575 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:23:52.310 --> 04:24:07.850 Well, it's not like, sorry, forgive me, you you said you were at Google so is it ok if I use Gmail as an example? Okay, like Gmail will still have the spam filter as part of its product offering.
576 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:24:07.850 --> 04:24:25.880 Even if there are people that say externally to that, don't use my content for training because anything that comes through Gmail gets checked, right? You're still gonna use the classifier to check, but we're talking about the training of that classifier, not about the use of that classifier.
577 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:24:25.880 --> 04:24:43.130 But to build the classifier, I need to train it on stuff. I guess CSAM might be an easier example. To train the classifier to say this is not CSAM, I need to show it CSAM and non-CSAM material. So.
578 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:24:43.130 --> 04:24:59.420 A either have existed for a while, frankly or B, have other methods of like acquiring that content for training, right? Like, I don't see how you come next. I think the concern here is that.
579 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:24:59.420 --> 04:25:16.820 New classes of spam in email. Yep. If they're all labeled with an opt out, then become invisible to the classifier. And so classifiers built into a service that gets that content like ported through. All of them.
580 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:25:16.820 --> 04:25:36.820 I think that's the concern. But hold on hold on to to I I hear what you're saying. I think the answer to that is anything that comes through Gmail servers has some sort of terms of use associated with it that Google is going to use, what comes through its servers in order to improve the Gmail product. So you're not gonna respect the purpose.
581 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:25:36.820 --> 04:25:57.710 Ignored the We ignore these I think so, yeah. I think that's gmail's thing. This goes back to I think the point which hold on, which I think is when we talk about like what is the agreements for using a Gmail as similar to what we talked about, what does it mean to go to a website publisher and respect their terms of service? I think these are all connected.
582 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:25:57.710 --> 04:26:15.350 I don't think I'm saying anything contradictory to itself. But can I jump in here? The person using Gmail is not necessarily the sender of the email. Yeah, and so the sender of the email has not consented to Gmail doing anything. I think what you're.
583 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:26:15.350 --> 04:26:30.560 You're either saying that the attachment mechanism here will simply not apply to email or you're saying that Gmail will have to ignore the preference embedded in the email. And.
584 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:26:30.560 --> 04:26:50.560 I think one of those is, is an assumption I'm not sure we're entitled to make yet because we have not had the attachment discussion. And, and the other one, I mean, I I take Kevin's point about building in things that are, should clearly be ignored because they would be harmful, but I would much rather design a system that doesn't.
585 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:26:50.560 --> 04:27:11.570 Doesn't have to have those things to work, because, you know, like a user who receives a spam email that is not currently detected can flag it as spam. That teaches the classifier that this is more likely to be spam next time. But that is true across other domains from email too.
586 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:27:11.570 --> 04:27:27.740 And there will be websites with content on them that people need to police for various kinds of abusive material. Or even just kind of classify to be more useful, and I think we should.
587 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:27:27.740 --> 04:27:45.380 Think about how these preferences are going to account for that, whether they are going to be out of scope as this definition proposes or if the consensus is that any kind of use of an AI model on content should be in scope of this work.
588 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:27:45.380 --> 04:28:00.380 I'm Timmer Robot. I think that defining high level exclusions would make this definition easier, that instead of trying to.
589 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:28:00.380 --> 04:28:17.240 Create a training definition that that protects kind of convolutes itself to protect certain things that we just define those exclusions at a, a top level so that the training definition category doesn't have to do that work.
590 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:28:17.240 --> 04:28:37.240 So, like, I think we did try, right? Like so I think like there's a bunch of texts that used to be there like about like potential exceptions, right? Like I don't know whether there's version three of the draft or not, right? Like we talked about like a whole bunch of things, at least there was a proposal from Lealah and Yang, right?
591 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:28:37.240 --> 04:28:46.280 So I think like that's kind of part of the thing, but it cannot be exhaustive, right? Like I think that is the issue.
592 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:28:46.280 --> 04:29:05.720 Yeah, well, can I make a procedural suggestion and I think that would be to take like to put a pin into this, like, I think most of the feedback that we've heard is that this is not perfect at the moment, that there's sort of like.
593 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:29:05.720 --> 04:29:22.670 Things around the edges that people have various issues, but like it doesn't seem to be terribly skewed into one or the other direction and see if we can resolve some of the other issues and come back to this, like specifically going towards search.
594 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:29:22.670 --> 04:29:41.570 Like we had some part of the discussion that we had in the small lunch group was also like, ok, can we then deal this bit of overlap between sort of like the search stuff and, and, and this potentially others potentially like take a look again at either specific proposal, which I think was a.
595 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:29:41.570 --> 04:30:01.570 Sort of like the way I read it was a 45 version of section 3.2 which is still in the draft. Go through there, like I think a lot of the things sort of like that that get raised again and again also arise from us looking narrowly at one specific bit of text, like I don't think any of these.
596 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:30:01.570 --> 04:30:26.410 Categories can be read without at the same time always having 3.2 in mind. So do a bit of that before because like we've been very good at proposing something, things and then destroying them by looking at them for too long and so moving our gaze and then coming back might, might end up being more productive. I I think that's reasonable. I I have.
597 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:30:26.410 --> 04:30:43.370 Heard other proposals or referenced other proposals for different ways to spell the resolution to this issue. I'd encourage folks to comment in the issue if you have a different way you'd like to spell it. But I do hear you, and I do agree that we're getting to diminishing returns in this conversation.
598 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:30:43.370 --> 04:30:59.690 Yeah, Aaron last word. Okay, you have your hand back. Oh, sorry, that was an old hand. Could I'd also, I would love to see whether this definition is better than the previous one.
599 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:30:59.690 --> 04:31:28.586 That's an interesting question. That's something that we haven't been talking about. We've just been talking about: is this perfect? Is it better? It is an interesting question. I mean just to go back to what's again. We don't have polling in this, do we? I can start a poll. Okay, let's go.
600 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:31:40.500 --> 04:32:07.780 Yeah, I know. Oh yeah, it does. Like a bad. We apparently are oh yeah Mark asked me to like I was not for this. You opted out?
601 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:32:07.780 --> 04:32:39.440 I can ask a question too. You can ask a question. You can also. Why do you need this? There is a survey in the Webex, if you can find it. If you if you cannot see it, like you had to go to apps and see Slido.
602 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:32:39.440 --> 04:32:56.060 I mean there are others. Well, and to clarify, is the old one what is in the draft or the one that.
603 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:32:56.060 --> 04:33:12.680 Just posted it. But like that's suboptimal. I think "very large numbers" has just disappeared. Oh yeah, the "very large" is gone. I think we've agreed to that, so assume that that's gone. How would we vote if we're neutral?
604 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:33:12.680 --> 04:33:27.770 Yeah. Mark, I was hoping for palm. Double chat.
605 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:33:27.770 --> 04:33:47.770 Is it possible just to compare them side by side? That might make it a little easier. How would one do that?
606 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:33:47.770 --> 04:34:07.870 Okay, so, nobody's saying it's worse. That's all I can. Yeah, you need to share the results I think. Oh.
607 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:34:07.870 --> 04:34:23.780 Well. So it's 50/50-ish between yes and unsure, and no noes. So I'll take that as a win.
608 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:34:23.780 --> 04:34:43.780 Some form of progress, yeah. So somebody just said NO, can we get more clarity on the just because I said that we Sorry? I can hear you, are we trying to get more clarity on the split, whether the split is helpful, yeah. That was my, my question is folks wanted to add another.
609 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:34:43.780 --> 04:35:03.350 The Option to the comments, yeah. Well I mean yeah it seems like there are a few people arguing for basically just look at the capability and whether that would be an improvement and get the reasons for the split, but.
610 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:35:03.350 --> 04:35:24.610 The way the question is worded, you could think they're equal and you wouldn't know the answer. And sure, if they're equal, you would say no then, wouldn't you? Okay, this is not going anywhere. Where's held? I think yeah.
611 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:35:24.610 --> 04:35:44.240 Oh, so does, do you want to run a poll for that one? Sure. So noticed that you made that decision I didn't.
612 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:35:44.240 --> 04:36:06.100 So Martin, like split the definition or not is the question? Give me a minute and I'll type up something and put it in the issue. And asks if the term would still be foundation model or some other term and I think.
613 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:36:06.100 --> 04:36:15.770 That's a new term, that's a good question. I think Leila, the the answer that Martin just said, if you didn't hear was that we'd probably come up with a new term.
614 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:36:15.770 --> 04:36:33.950 There did seem to be some discomfort with the foundation model. You just scroll down. It's a very short Okay so maybe the same poll then is, is Martin's proposed text better than the existing text.
615 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:36:33.950 --> 04:36:51.860 Yes, I'm concerned the note that we have, the insane generate content, the intent is blah blah, blah. It feels like that's doing a huge amount of heavy lifting here and when we try and fold that into the definition, we're gonna run back into the same set of problems, right?
616 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:36:51.860 --> 04:37:07.400 Sorry, you're talking about the the one up here? No, in both cases. In both cases, in both cases, because can generate content, any classifier can generate content. That's basically any model. Yes. Yeah. Yeah. Yes.
617 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:37:07.400 --> 04:37:27.400 So this is the one we're asking about now. So Warren, I think, made the argument that a classifier is a generator. A classifier is a generator. Is this a cat? Yes, no. Right, and the idea that's been put out there, as I heard it, was that maybe we allow people to express a preference and for certain.
618 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:37:27.400 --> 04:37:49.450 Use cases that preference is ignored and that feels worse than just a s like standard slope. Like, I will choose to ignore all of the preferences because I might want to use stuff as a classifier and so now I'll just ignore them very quickly. Yeah.
619 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:37:49.450 --> 04:38:09.290 Our charter explicitly puts enforcement out of scope. Yes. What would I as Google do? I will ignore the preferences because I need to use email for training a classifier. So if I'm ignoring it, I should ignore it for other things too.
620 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:38:09.290 --> 04:38:26.060 Right, that feels like this is, I think one of the interesting things to think about here is is that there's complexity and nuance here and the question is where does that, where does the application of that nuance fall? So who's required to do the work? I think what that.
621 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:38:26.060 --> 04:38:41.330 There is, anything that is a model, anything that's a classifier, is also something that can generate content. So any preference that is expressed, I know it all just feels.
622 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:38:41.330 --> 04:38:57.080 So I think what you as Google would do is you would want the consent of the mailbox holder, not the consent of the sender. Well, the problem is so then comes in.
623 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:38:57.080 --> 04:39:15.890 Can I use a random email comes in? Can I use that for training or do I have to create a model for every end user mailbox? If the terms on your email service are such that.
624 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:39:15.890 --> 04:39:32.420 Any email sent to the service, any user of that service has to consent to the use of of training of their emails for spam classification. You send email to me, you send email to Warren@mari.com. My email happens to be hosted on Google. Sure, so.
625 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:39:32.420 --> 04:39:47.900 You as the email sender, have no idea that you've sent it to Google. You're the content creator. Right. You have sent it to me, so I as the user, I'm like, oh, I will, I will set Martin's preferences. No, but.
626 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:39:47.900 --> 04:40:05.180 You sent it to you, you do with it what you will. Yeah, I think Mike's right. As the operator of the service, you receive an email, and an email may not be for any one of your subscribers. It may be, which is often the case, right? Spam.
627 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:40:05.180 --> 04:40:25.180 And you say as a general rule that the mail that I received is gonna be used to. And so then but that also means that for everything that one creates, you have to have a separate model trained. No? No, no. I don't know how you reach that. I think we're rat-holing here and.
628 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:40:25.180 --> 04:40:45.700 I think. I think like it's still useful to answer the question if this one is better than the previous clear that sort of like from that scenario, maybe it's not better from what Nate expressed, it's probably better to see how like the general, like I think the positions are staked out, like I think people understand.
629 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:40:45.700 --> 04:41:09.820 The consequences of having like a thing that is focused only on generative capacity, let's see like how we feel about it. So like let's ask this question for this one too. Same question. Okay. Relative to the existing one or relative to the text in the draft minus the "very large". So I'm looking for linear progress. Yeah. We made linear progress on.
630 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:41:09.820 --> 04:41:35.650 Yeah the first proposal. So you want to do it relative to the first one, whether it's better than the first one. Okay, sure, sure. Okay, so between the two on the screen then, well not almost on the screen. Yeah well in the issue, I can try to collapse the first one. How about this? I'm gonna throw the two texts as options and ask people to pick one of them or neither because like I think like they're kind of losing the thread because people are not able to.
631 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:41:35.650 --> 04:41:54.410 Relative to each other. We still right? Yeah. Okay. Is is this one better than the last one we discussed? No about that. Okay, what is this? The two on the screen. So it's the one from Martin here, more this guy here. Oh, thank you Martin. Call this B.
632 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:41:54.410 --> 04:42:10.280 And call this what I wanted to I just wanted to like question yeah ask it to be better than a. It's stuff on the screen.
633 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:42:10.280 --> 04:42:25.730 You should not give me control of things that are on the screen, but then people don't ask what doesn't.
634 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:42:25.730 --> 04:42:45.730 So yes.
635 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:42:45.730 --> 04:43:15.640 Okay here comes the poll. Apparently. Right.
636 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:43:15.640 --> 04:43:35.640 It's all his fault, Suresh, all his take on no preference. I'm unsure. It's not that I have no preference, I think this process. Does that make sense? I'll take note both, that's fine.
637 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:43:35.640 --> 04:43:57.140 If that makes sense mike. There's an AI assistant, thank god. Is that a foundation? That's the 1st one say that.
638 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:43:57.140 --> 04:44:13.760 You not be able to see the app or like you're not on the there's a button. She must be on Redo. Oh, this one shows.
639 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:44:13.760 --> 04:44:29.780 What if you think one is slightly less bad than the other but keeping them? It's just relative. It's just relative.
640 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:44:29.780 --> 04:44:47.420 You prefer or on the other one, the less bad one. And obviously, if you don't like them, the responsibility is on you that sounds suspicious. So we are are are very very evenly split here. Yeah. Yep.
641 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:44:47.420 --> 04:45:05.450 Between the two? It's vibrating between text A and text B having predominance right now text B is slightly ahead, but certainly not consensus. Not clear win out, I would say Yes yes yes yes yes yes. Standard IETF disclaimers apply.
642 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:45:05.450 --> 04:45:27.310 Okay, close that. Alright, so this needs more discussion, but it does feel like we're getting to diminishing returns on the discussion in the room right now. So one of these will probably win over what we presently have, but then I do take Timod I think it was Tim at Robott's point that perhaps.
643 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:45:27.310 --> 04:45:54.920 Some other discussions might bring out or or or or people comfort on the related issues and then it have a more productive discussion down the road, so let's put a pin in this one for now and now I need to find my way back to the agenda So I think that takes us.
644 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:45:54.920 --> 04:46:10.820 To search. Search. So we have issue 1703, which is kind of the overarching issue here around the definition of issues around search.
645 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:46:10.820 --> 04:46:31.670 And then we had, 1801 which is I think largely a restatement of some of the issues there as is 1807. And then we recently had 1906, which is a proposal around search. And I think there was also a proposal made on.
646 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:46:31.670 --> 04:46:47.240 73 recently. Yes. So, let's maybe talk through these two proposals and see if they get us anywhere. Aaron, do you want to start with yours with 1906?
647 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:46:47.240 --> 04:47:04.310 It's great. Should we just go straight to the proposal? Yeah, there, there is more explanation in the GitHub issue if you want it, but basically I was trying to kind of re.
648 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:47:04.310 --> 04:47:24.310 Articulate the search category that was in Paul's draft, to be a little bit more flexible in the way that the kind of traditional blue links are understood, because a limitation to purely just verbatim quotations and.
649 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:47:24.310 --> 04:47:55.480 The link is not the way I think any search engine has really worked even, you know, 20 years ago. So what this does is tries to draw the line at a particular amount of generation that is more than like a handful of words, to capture the kind of long form summaries or overviews or other kind of like texts that people might be concerned is substitutive.
650 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:47:55.480 --> 04:48:18.910 Without using those really loaded keywords, and the way that I did that was to define that as substantial generation, and then give a definition of substantial that I am sure we will debate at length, which, you know, can be the hook for deciding where that de minimis line is. And I also baked into.
651 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:48:18.910 --> 04:48:38.540 To the definition that changing the modality of the content from like speech to text or text to speech, or translating from one language to another because those are kind of accessibility aspects that those would not count as generative.
652 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:48:38.540 --> 04:48:55.010 And baked in that looking at an asset and deciding not to use it does not count as a use even though the asset was considered by the application. And then kind of persisted the same concept of training.
653 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:48:55.010 --> 04:49:10.430 As we had talked about before, and I think Paul made the good note that I omitted the word necessary, but that was just an oversight and we should add it back in. You're saying roughly in here. Necessary.
654 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:49:10.430 --> 04:49:30.430 Yeah, only the use necessary to do. Yep. I like this. Discussion, reactions, thoughts? So I do, as I think folks know from the mailing list, I had a lot of issues with the prior definition.
655 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:49:30.430 --> 04:49:49.430 I do think in some ways this moves us forward, but I would say we need to strike the word new from new content because that's loaded with respect to the copyright debates as to whether or not these summaries are in fact new content or not. Like that's one of the core issues in that litigation. So we shouldn't, we shouldn't assume that. I think that should be.
656 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:49:49.430 --> 04:50:07.340 Find some other way to phrase it. The other issue is I'm just not sure why, and this kind of goes a bit to my point earlier, but like I'm not sure why there should be any sort of exception for producing the models that are used in the search application, like to assess and rank an asset by asset.
657 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:50:07.340 --> 04:50:24.140 You just need a model. You don't need my asset to train that model. And so I just don't understand why you could use that model, but like I don't understand why like the foundational train is that last bullet I just think should be struck and I just think that the search.
658 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:50:24.140 --> 04:50:41.690 It should just be focused only on search. Just can I clarify, so would it give you comfort if it was very specific that that model if it were, you know, if it were used to train that model would were only used for the search application?
659 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:50:41.690 --> 04:51:01.690 No, because the concern is is about the competitive effects on the marketplace. We already have a situation, we have one search engine that has access to three times the content as everyone else we know from government trials in the US that this leads to a major competitive advantage in the sense that they have a model that is like substantially better.
660 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:51:01.690 --> 04:51:18.140 And because of the way that that search engine doesn't split their crawlers, there's no way for websites to opt out of that. So we're right back in this same sort of leveraging position. Like really what I'm concerned about is that this preference can't be used to manufacture consent.
661 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:51:18.140 --> 04:51:38.140 For what is currently a very contested issue as to whether or not like people say search and AI are blending, but really I think what's happening is that some search engines are leveraging search into the AI marketplace and where that line is matters a lot and like I just think that any like we have to we I just I don't see how.
662 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:51:38.140 --> 04:51:58.810 So and I also just don't see why it's necessary, like I I I don't haven't seen, I looked through all the history of this group even before I joined. I don't see any reason why a search engine needs my content in order to be able to rank my website. You don't produce a new search model every day and yet you're able to rank new websites, you're able to rank new pages. So I, I don't see this seems like a.
663 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:51:58.810 --> 04:52:16.790 Giveaway, it doesn't give any, like why does the person expressing this preference care about the ability to train a search? I mean maybe someone can enlighten me on why that is necessary, but I just, I don't see it. Yeah, I can respond to that unless somebody else wants to. Is that ok?
664 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:52:16.790 --> 04:52:34.040 Yeah. Yeah, so I mean, in order to be kind of like used in the application, the application has to be able to use your content. And so opting out of training the models that are.
665 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:52:34.040 --> 04:52:50.420 You know, that operate the service really means not allowing the service to consider how best to understand or rank or surface your content. And.
666 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:52:50.420 --> 04:53:06.230 So I think it is necessary that one go with the other, and I also think it's important to avoid unintended consequences for users of these preferences because I think many providers of the service.
667 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:53:06.230 --> 04:53:26.230 Will say there is a bucket of uses that have to go together for us. And if you opt out of one of them, you're opting out of the whole bucket. So if training is not permitted, if training of any kind is not permitted, then I think what will happen is that a lot of people who believe that they are opting into being.
668 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:53:26.230 --> 04:53:51.910 Used in these systems will actually not be able to because they have opted out of the training and that is not like a use case that would be beneficial. But why? I don't understand why, because you can rank new webpages all the time and they are not included in the training. They might be. If I write something right now, I publish it, it gets indexed, and you're telling me that.
669 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:53:51.910 --> 04:54:08.900 In the span of 30 minutes, a whole new model is being trained? Like we're talking about foundational model training. We're talking about model training for the search application to produce the models. Or the search. Okay.
670 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:54:08.900 --> 04:54:28.900 But like, so we're talking about the base models or we're talking about like the grounding sort of context window of stuff like because produce the models to me sounds like it's like a base model production, not just a search application for anything else. That's why I asked the question I did is if the use of the model that's trained here in this case is constrained to just.
671 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:54:28.900 --> 04:54:53.120 Serving search and not for reuse in any other application. Do you still have the same problem? Yes, because of the anti-competitive effects of this, in that we're in a situation where right now you have a dominant player in many markets and in many of those markets has been ruled to have an illegal monopoly and like you don't have a way to, like that leads to, like we're basically just, we're basically.
672 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:54:53.120 --> 04:55:13.120 Giving a veneer of legitimacy to something that is hotly disputed. So perhaps maybe we can have an offline discussion about the anti-competitive effects, because I don't understand how that affects other markets if the use is constrained. But it's probably more helpful here to not talk about anti-competitive effects, but to talk about central.
673 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:55:13.120 --> 04:55:40.190 Which is something deeply. Yeah. Yeah, I mean, I think that the concern is, is that you're gonna just whoever has the biggest index will then have, like, will then have the best product with this and this will naturally lead to like a situation where whoever has the biggest index is gonna have and has the most power to basically, I I guess whoever has the most power you know inc like get material into their indexes get is gonna have.
674 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:55:40.190 --> 04:55:59.660 The better product and it's it's naturally going to lead to a winner take all dynamic. And I and but more importantly that, why I don't want there still hasn't been an answer as to why it is necessary to train a base model on an individual website to rank that individual website. I understand why a base, like something needs to be.
675 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:55:59.660 --> 04:56:15.500 Used in order to create a search model, but I don't understand why my content needs to be used because again I could write a new post right now and and these models would still be able to rank it like they are able, the whole point I thought of this AI thing is that it's able to generalize.
676 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:56:15.500 --> 04:56:35.500 It's able to go beyond its inputs. So why must I contribute to the inputs in order for it to be usable on my content? Thanks, Nate. Sebastian. The question that I would have is, is it supposed to be included as an option to say yes or no? Like non gen.
677 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:56:35.500 --> 04:57:03.610 Or would the non generative be kind of excluded from the, as a general kind of description? So then could say no if that. I think this is like the search definition, right? Let's call it non search probably, right? It's not gonna be a separate one from this, right? This category is roughly equivalent to the laterhosen category.
678 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:57:03.610 --> 04:57:25.280 Or the search category or the ten blue links category, right? The things that are not the generative features. And I think we've just been struggling with how to define that category so that we kind of capture people's expectations, but that category is and has been for many years, maybe, you know, decades, operated by AI models.
679 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:57:25.280 --> 04:57:43.910 And so that's why the training is an important aspect of this. Now it's possible, like it's true that if only a single piece of content opts out of training and everybody else opts into training in this category, then the search engine would probably not have a problem handling that one piece of content.
680 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:57:43.910 --> 04:57:59.720 But we can't define categories on the assumption that they won't be used. And we can't limit the number of people who can opt out before like the quota is hit and then you no longer have a search application after that point.
681 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:57:59.720 --> 04:58:16.340 Right, so either there is a kind of, you know, understanding that wanting to be included in the non generative search application means being used to develop and improve the non generative search engine.
682 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:58:16.340 --> 04:58:32.990 Or, it doesn't, but I am predicting that most operators of a non generative search engine or those features will have to teach their systems how to handle the content that comes into them.
683 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:58:32.990 --> 04:58:48.560 And therefore cannot make a blanket promise that they will be able to adjust rank handle content that they were not able to train on. Thank you, Aaron. Sebastian. You good?
684 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:58:48.560 --> 04:59:08.560 Yeah, Glenn? I worry that we're creating a specification that is both very narrow in that it's true right now, but in three years people are gonna go, Oh yeah, search doesn't work that way anymore, so I don't even know what that means anymore. But I think.
685 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:59:08.560 --> 04:59:37.360 If we're creating a specification where the users of this thing, which is gonna be the internet, are sort of expected to have a deep understanding of what search does and how it works, that is gonna be really beyond 99.999% of the users and we're creating something which is unusable for that reason. So I mean this whole general direction that we're going here in getting this expertise in the room seems to be.
686 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:59:37.360 --> 04:59:55.550 Rough here. If I look at the user community outside of us, it's, it's horrible. This is a bad direction. I think that's something we need to really consider. Thanks. Dominoba? I I too would like clarity on whether.
687 "TRN6-29-BANFF/speaker_1" (4268955648_1) 04:59:55.550 --> 05:00:10.640 For this last bullet, basically whether this is something that the training models is something that's say google search because I think that's what we're all talking about, like Google search like needs this to function.
688 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:00:10.640 --> 05:00:28.550 Or wants this to function. Because I I I hear both, I I hear I've heard both in, in my time, in this working group that, that Google can't, can't currently.
689 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:00:28.550 --> 05:00:46.190 Provide search results without training happening, right? Like that's been a statement that's been made. And then I've also heard that it it can, but basically doesn't want to or it could in the future, but it's not the way it currently works.
690 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:00:46.190 --> 05:01:01.910 And I think that getting clarity is important because, I think that they're, you know, kind of go going to what Glenn is saying, if, if it's this, if it's currently the, the reality.
691 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:01:01.910 --> 05:01:21.830 That Google could separate their pipelines in a way that training isn't necessary in the future, but that's not the way it currently works that and and we we publish this in a way that is basically incompatible with Google indexing.
692 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:01:21.830 --> 05:01:39.530 Right, that we're giving, we're kind of we're giving people a tool that will potentially blow up in their face or that will have consequences that they don't understand, right? That they, that they're, they think they're opting out of just.
693 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:01:39.530 --> 05:01:59.530 A a a feature of Google search when in fact they are in reality opting out of google search completely. And I think that we have to understand kind of what the impact of this category will be. Yeah, I think like one of the things like I think Aaron mentioned was that in the use case.
694 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:01:59.530 --> 05:02:22.000 Stuff, right? Like, you know, the use cases like is currently defining like what the the expressors of the preference like talk about, right? So Aaron promised to write up like how the consumers of this preference understand those things, right? Like, so I think at that point like you could probably say something and Krishna could say something else, right? Or like maybe you both would say the same thing saying like if.
695 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:02:22.000 --> 05:02:48.590 If this non generative search with this training thing is "no", then you won't show up in the results, is something they can put in or not. It's kind of outside the scope of this, but I think the way that like we are talking about how people express the preference, how they understand it and how the people consuming the preference understand it, I think that could give more clarity to what you said if you kind of get there. I think it's generally true, it's generally true of search.
696 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:02:48.590 --> 05:03:08.590 That if we cannot actually work with a certain piece of content, then any guarantees about it showing up in the search results page, however, traditional blue links or ten blue links or five blue links or whatever we call it, we cannot guarantee that because think about a simple scenario.
697 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:03:08.590 --> 05:03:30.710 Somebody launching a new page and that page is super popular. Would you want that to show up in the search or not? If we cannot touch that content, if we cannot actually determine what happened with the content in terms of people actually looking at it and using it as some part of model to evaluate whether that content should surface in that page or not, would you want that happening?
698 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:03:30.710 --> 05:03:47.900 So this is a key question. I mean, we can talk a lot about, hey, we don't want our content to be showing up over here. The flip side of this is the number of people who complain about things not showing up in results is also very, very high. Thank you. Elaine?
699 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:03:47.900 --> 05:04:03.440 I have a similar comment. My understanding is that the reason we're talking about a search category is that the site owners wanted it, and they want to be able to say, omit my don't crawl or don't use the content for training.
700 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:04:03.440 --> 05:04:20.540 So the search category is a way to make sure that they show up in search results. But site owners, please correct my understanding that's I know you were one of them. Yeah. There is, yeah. Paul.
701 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:04:20.540 --> 05:04:42.310 I, I just sort of like I think this is hard to get a joint understanding even of the situation on it. Like I'm a little bit confused by your use of the language of guarantee because like I don't think you guarantee anybody to show up in search results but well the the thing I will say is if you don't want us to use the content at all in any search.
702 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:04:42.310 --> 05:04:59.510 The application, there is a guarantee you won't show up. Yeah ok but like the other way around like I think you're not because that's that's easy to get. I I just and I get the con like I get the concerns and I flagged that earlier in the response. I just wonder if this is sort of like maybe something where we can tweak the language a little bit to make this.
703 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:04:59.510 --> 05:05:15.110 Make a little bit more comfortable, e.g., instead of produce, say something like improvement, there is some it it I don't know if that makes any difference, if this or if this is really fundamentally an in out question.
704 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:05:15.110 --> 05:05:35.110 Which I'm afraid we can't really resolve because we have sort of two divergent understandings of what's actually going on in this space or two the the question again like the the the purpose like the use, the use case why we brought this where we always brought this, this this this case.
705 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:05:35.110 --> 05:05:56.990 What we need to be in is in order to be able to say train AI no or whatever train foundation model no like the training no, but that shouldn't mean that you can exclude me from search based on that training no. Search, it's a backstop thing like and I think we need to give it some.
706 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:05:56.990 --> 05:06:13.340 That changed. Some flexibility because it's a backstop, right? One? So, maybe some of an example might make something more understandable. So I've got a fairly esoteric set of interests.
707 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:06:13.340 --> 05:06:33.340 One of them is a specific type of cesium clock manufactured by a specific manufacturer. There is only about ten people in the world who are interested in this topic and only a small number of people would page aside. We would like them, presumably to be ranked in a way that it is useful for other people.
708 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:06:33.340 --> 05:06:49.849 Some way. As I say, it's a very unique concept, and so if myself and nine other people want our content as, you can't use this for training, then there's no reasonable way for a model to say this content is better than that content.
709 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:06:49.849 --> 05:07:06.829 And I think that that's some of the whole of this information needs to be forwarded into the model and why, how does the model get updated? Sure for something like what's better, cats or dogs, there's an almost infinite amount of stuff on the internet for it to learn from.
710 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:07:06.829 --> 05:07:22.849 But for more specific things, you know, things that I happen to think are interesting in about if there's not enough content, you need people to be providing some set of input for the model to learn on. Otherwise, there's no useful ranking signal other than like.
711 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:07:22.849 --> 05:07:42.849 Somebody linked to the page, which is, as we've seen, not a great signal. Nate and then Chris F and then I think we're gonna go and look at the other issue and have a bigger discussion. There's a lot of things to respond to. First, if you want people to use, if you want to use somebody's.
712 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:07:42.849 --> 05:08:16.839 Content because you need it for those purposes, great, you should prove to them that it's worthwhile for them to contribute to that endeavor, and that's the point of preferences. What we really have here are two separate things and one that is being tied to the other one. The main purpose if there's like this idea of let's allow asset owners to express an exception to the general sort of train no thing and to still be evaluated for search, and I definitely think like this I'm not saying we should strike the evaluation bullet, I'm saying we should strike the produce the model the last.
713 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:08:16.839 --> 05:08:37.479 Bullet. And that last bullet about producing the model is really a separate thing, it should be its own separate vocabulary point if we're gonna do that because really what you're saying is that there's a need for a, you know, that it would be nice for this one company to be able to have or whatever company to be able to have access to as.
714 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:08:37.479 --> 05:08:57.479 Much content as possible to train a search specific model. The problem is, is that the lines between AI and search are blurring. We've heard this like a hundred times here, right? And so where this naturally is that whoever has the biggest search index will therefore have the best and most competitive AI model. And so it's.
715 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:08:57.479 --> 05:09:22.699 It's this is this cannot, these two things should not be tied together. And by the way, I like I don't really, I'm not even sure we need any search category at all. If if all we're gonna do is train, if we have some sort of a grounding or RAG or output category, absolutely get it in that context. But again, it's all we've established here that you do not need my content in order to build a model that is capable of generalizing and evaluating that content.
716 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:09:22.699 --> 05:09:40.819 And I think that that's all that that this definition really needs to do and if there we need a separate vocabulary definition because some companies would really like to be able to train on, on, on a broader set of data and would like to get, you know, would like to get a separate exception for that, then that should be a separate thing.
717 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:09:40.819 --> 05:10:00.819 So Nate, I just want to respond to one of your things. So it's not about like companies training. So the reason we have search is there's people who want to opt into like training for the search application, but not for generic use. So like yeah, so I think that is why we have a search thing separated out, not because like to give a blanket permission for somebody to train on it. The people say like, hey, I want to be able to ok.
718 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:10:00.819 --> 05:10:23.139 Able to use my content for training as long as it's only for the search application, but not for general Gemini, e.g., right? I think that's why the search thing got pulled out. Not because like this is like some weird people into giving permission. I think No, no, no, but that's the genesis of like why we have search as separate thing. Because like, you know, there's like quite a bunch of people here who express saying like.
719 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:10:23.139 --> 05:10:55.899 I want to be included in search, whatever it entails, right? But not training a general purpose model. It's like something somebody wants to express. So you may not want to express that, you can say NO to everything, but there's other people who are saying like, hey, I want to be able to opt into search but not find anything else. What does that mean being opting into search? It means having your content show up and search, right? But the evaluation covers that. I'm not arguing against the evaluation though like the foundational model training, the producing the model, it is we have established it's not necessary to use my content, so that we have a.
720 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:10:55.899 --> 05:11:23.799 Established this last bullet is actually not necessary for this use case. This has been agreed to. So, so Let's talking about foundation model training here. Hold on. Let's produce the model. Let's drain the queue and then I think maybe do a poll on this last bullet specifically talk about 1703 and then talk about the overall. Yeah, sounds good. Chris, yeah. So I think, I mean I think there's a lot of good things about this proposal, I like search way better than the other. Yeah, I think there's a lot of confusion on that last bullet.
721 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:11:23.799 --> 05:11:47.269 Good point I think Lenn makes good point too about users being able to understand this if they want to use it. And they ask good points too about just the kind of like the confusion about the vocabulary about models and training what's foundational, what's in the search application itself, so maybe there's a way to either simplify it or make it more specific so that you're not.
722 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:11:47.269 --> 05:12:04.489 The search application using the content in order to rank it, I don't know. There's another way to word it maybe that might make it easier for everyone to understand. I don't know what that looks like, but seems like a good direction to go that everyone can agree on maybe.
723 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:12:04.489 --> 05:12:19.969 Hey, I think I think there's a, it it is difficult to separate using the content and a search that a search engine accesses.
724 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:12:19.969 --> 05:12:37.759 To alter the weights of the model that helps decide what sites to show in the future because in part, site, you know, these engines rely on reinforcement learning, e.g. So did this page help you or not? If the answer is no, I need to know what's on the page in order to know it wasn't helpful.
725 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:12:37.759 --> 05:12:57.759 And I think the the concern here we're talking a little bit past each other. I don't believe this is a diabolical plot to try and include more stuff into models used for other purposes. This is designed to capture the fact that when you say yes, I want to be in search, you mean I want to be in search on an equal basis with all the other people whose stuff is.
726 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:12:57.759 --> 05:13:22.489 Being used in that way to help refine the algorithm. I think it is harder and harder and has been, I mean, you say that the lines are blurring, but they're really, you know, they're all it's it's been this way for a long time. And so it it doesn't strike, it doesn't seem like, it seems like this would require a huge amount of engineering to build a product where.
727 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:13:22.489 --> 05:13:41.021 One could guarantee that the same ranking would happen whether or not you could use your model in reinforcement learning with your pages in reinforcement learning or not. And to promise otherwise would be very dangerous. And so that's the central, like I think that might be the hang up here.
728 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:13:43.160 --> 05:13:59.329 I think Aaron, you were next. Yeah, I was gonna make a very similar point, which is that I think it's important not to conflate the idea of training the model used in the application with training a model that is used for other things.
729 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:13:59.329 --> 05:14:17.689 And Nate I I kept hearing you say the base model or the foundation model, and I think that if your objection is to the reuse of the model outside of the application, that is a different discussion than an objection to training of any kind in any way.
730 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:14:17.689 --> 05:14:33.199 And I would rather we be clear about which one we're talking about because one has a drafting solution and the other is kind of a philosophical difference. Go ahead Samon. Yeah, because I think the, like.
731 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:14:33.199 --> 05:14:52.309 A search application or or kind of any application needs to be able to, learn how to handle particular pieces of content, cannot guarantee that that will happen if enough of that kind of content has opted out of the training of the model that does that thing.
732 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:14:52.309 --> 05:15:12.309 And I think Caleb was also bringing up the point that even like user feedback would have to be prohibited if a like if a training prohibition prevented the model from accepting the feedback and then training a model to understand.
733 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:15:12.309 --> 05:15:18.409 Not that like this was a good result for this query and this one wasn't.
734 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:15:18.409 --> 05:15:38.409 Okay, Nate and then and then after that I think we're gonna put a bit in this discussion. So I I mean look like one possible way forward would be to say would be to specify that this model cannot be used for generative search, that it can only be used for non generative search, it cannot be used in any way.
735 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:15:38.409 --> 05:16:08.679 To generate something. If it can be used to generate something, then, you know, then that's that's the problem. I, I also would just say as a, as a strict matter, like, it's not necessary, like it's just not necessary to use my content because you can still evaluate it. The whole point is that these models just generalize and in the copyright debates, the model developers have taken the opposite position that this is content, that that's why I object to the new content thing because the idea is, is that there isn't actually like the inputs and the outputs are separate, right? Like that's, I thought was.
736 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:16:08.679 --> 05:16:24.259 With the legal position on the copyright issue, so it's interesting that now we're saying that there's actually a mirror on both sides. So I just think that it's unnecessary to the definition if we're going to do it we should specify that those models must only be used for non generative search. Can I.
737 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:16:24.259 --> 05:16:42.169 Please. It is technically necessary. And if you want to separate those things, what you really want to do is free ride on the search engine. You want to get the benefit of being included and ranked and have people find your content as a result without in any way contributing to that being possible.
738 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:16:42.169 --> 05:17:03.519 I mean, I would say that you're free riding on my content by putting photos of me in AI summaries and literally like like like taking my travel advice and turning into things I didn't even say. This is my cloud generated content. This is non generative. I are blending though and like this is what I'm saying is that this is the unless we specific.
739 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:17:03.519 --> 05:17:23.419 I mean if we'll agree to say this only applies to non generative search, the model, the model can only be used for non generative search. So Nate, my interpretation was that was assumed in the definition. Is that correct? It doesn't say that is. We can refine the way that it is written. But that was your intent by adding.
740 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:17:23.419 --> 05:17:43.419 Are used in the non generative search application? Would that that's what you're asking for. Any model that is trained for this purpose should only be used for non generative search. Sure yeah that that's take that as a method. Okay, ok. Do you want to take a stab at the new text before we do it.
741 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:17:43.419 --> 05:18:07.639 I have a standard any text below if you want. Yeah. Ignore this. This was a GitHub problem. Yeah the tool. I, I do blame the tool actually. Yeah. Proposal in light of any amendments that might be made, nothing like there kind of coming at the smoke.
742 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:18:07.639 --> 05:18:26.239 So I'm sorry can you, it might be a good idea in considering how to amend this, but also looking atumstreams. Yes. Yeah, we're we're getting there. Yeah. And then we'll do this side by side. So I'm I'm hearing that as a friendly amendment and some wordsmithing necessary.
743 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:18:26.239 --> 05:18:44.359 But it sounds like the intent is that it's necessary to train those models and those models have restricted use. Yeah, so that's right, that's what the words there say, including the asset in the training of models used by the non generative search application. Right.
744 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:18:44.359 --> 05:19:02.884 Okay, so I I don't think we need that poll now actually. Let's, let's leave that for the time being. Put a pin in this. Let's move on to in 1703, we had a recent proposal.
745 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:19:04.700 --> 05:19:09.987 Chris Needham, do you want to walk us through this?
746 "Chris Needham" (1250734080) 05:19:09.987 --> 05:19:39.259 Okay, thank you. Yeah. So this, this is trying to solve the same problem of, of, you know, how we have a, a definition of search that that works but it it takes something of a different a different approach. And what we're trying to do here is sort of very narrowly scope the, like how the, how the assets are used. So, so in essence, what we're saying is, you know, it's it's the use of assets to create the search index.
747 "Chris Needham" (1250734080) 05:19:39.259 --> 05:19:57.289 Which has the purpose of providing search results in the form of links to to to website content and and this may include back end processing, so it's the the sort of the internal processing of the assets to do the indexing and ranking and, and.
748 "Chris Needham" (1250734080) 05:19:57.289 --> 05:20:17.289 And retrieval. So that may use AI models, but only to the extent necessary for those search functions, namely the indexing and ranking. The second part to this is around the display and the output, and what I've done here is essentially to carry forward.
749 "Chris Needham" (1250734080) 05:20:17.289 --> 05:20:46.969 Part of the, the, the definition that's in the current draft around including direct links to the original locations and also verbatim excerpts. And then the final part is to really explicitly state that this does not include generative outputs which are, in some sense sort of substitutive in nature, so.
750 "Chris Needham" (1250734080) 05:20:46.969 --> 05:21:05.985 Reducing the need to to sort of visit the, the original asset. So it's, yeah, as I say, it's it's trying to solve a similar problem but coming at it from a, you know, a slightly different angle by really trying to, you know, focus the.
751 "Chris Needham" (1250734080) 05:21:05.985 --> 05:21:13.819 On, on the kinds of processing that we see as supporting the search application.
752 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:21:13.819 --> 05:21:30.000 Oh, I I'm last I thought. But happy to respond to this. Oh, that was an old day. Okay, I see. Mike.
753 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:21:30.500 --> 05:21:43.807 When it says utilized, do you believe, do you consider that to include training the model or only RAG?
754 "Chris Needham" (1250734080) 05:21:43.807 --> 05:22:05.569 So I think so bearing in mind the previous discussion, yeah, the thinking here was that it is very narrowly scoped training for those purposes, and it's, and I, and I think part of the the concern that what, you know, that we heard about the previous definition was, you know, potential for sort of wider use.
755 "Chris Needham" (1250734080) 05:22:05.569 --> 05:22:13.567 So the attempt here is to, you know, to define as narrowly as possible the extent of, of, of the training.
756 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:22:28.799 --> 05:22:45.480 Who's up next? Aaron? Yeah, I, I have a couple of concerns with this. One is I don't think it's appropriate for the preference statement to be saying what, like, processes AI can and can't be used for.
757 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:22:45.480 --> 05:23:05.480 In the like application, I think it is better to keep it at a purpose of the use rather than kind of because you know what is necessary and what is efficient and what is optimal, I think there's just a lot of room for debate there and I don't think it's appropriate.
758 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:23:05.480 --> 05:23:27.500 For a content or a preference declarer to be articulating the way a service must be architected. But I also think that the the last bit of this definition, I put this in the chat, it refers to outputs which materially reduce the need to access the original asset.
759 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:23:27.500 --> 05:23:57.180 And I think that that statement is going to discourage any provider from believing or or conceding that their use is in the scope of this because the like substituting for the need to consume the original is an element of the US fair use analysis and I think making definitions hinge on conceding aspects of legal defenses is just gonna discourage their use.
760 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:23:57.180 --> 05:24:15.600 Yeah. So I would agree with Aaron that necessary is like probably a pretty bad drafting place just because you're gonna do all sorts of reasonable things that aren't like strictly speaking necessary and if this comes down to litigating.
761 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:24:15.600 --> 05:24:35.600 Whether actual deviation or discussion, whether something is strictly necessary, I think that's not gonna be productive, so you'd have to talk about something that's like reasonable or efficient or something like that. 2nd part I think is the opt out in the end, the search category does not include uses that generate materially reduced need to access the.
762 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:24:35.600 --> 05:25:00.060 Original sweeps out this huge range of just normal factual content that search returns that would now be not possible. So if you Google, like, what time does the Alexandria City public library close, right? You get that data, but that's not something that I think is a reasonable restriction. Like that isn't generative. It's just returning this clear fact, and so I think.
763 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:25:00.060 --> 05:25:18.090 Both parts of that make this, make search much, much less useful for end users if it were not in. Thanks. Brad. Thanks. I I think there's gotta be a kind of meeting in the middle of here. I don't know if.
764 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:25:18.090 --> 05:25:33.120 A preference necessarily needs to unconditionally dictate the architecting of a product or service, but nor should the way that a product or service is architected dictate what a preference can be. I I feel like.
765 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:25:33.120 --> 05:25:49.050 That's, that sort of rubs me the wrong way a little bit. So I I hope that we can see things a little differently if we look at these things from each other's perspectives. I think that.
766 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:25:49.050 --> 05:26:04.529 If you have suggestions for the kinds of functions under the back end processing category that we might wanna add so that it feels more fulsome and, and covers, kind of the things that.
767 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:26:04.529 --> 05:26:22.469 Might be used in relation to improving, refining, implementing AI related aspects of a search, of a search or of a search engine that includes a model. I think everybody wants their content to be used in a model that works well.
768 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:26:22.469 --> 05:26:40.139 For the provider it obviously works well for how they might expect their content to be surfaced. So there's obviously a mutual interest in that regard. It's not a zero sum goal. And I do think the last sentence.
769 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:26:40.139 --> 05:26:56.579 In spirit kind of, you know, conveys the sort of general concern about so whether we address this as part of another category or address it as part of search, I don't really.
770 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:26:56.579 --> 05:27:12.179 Have a strong opinion but I do think it needs to be there one way or another. And I think for the purposes of just figuring out what that means to the room, we may as well address it in the context of this definition because we should know where that line is. And wherever we might say the words.
771 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:27:12.179 --> 05:27:27.209 Possibly is a question of like, you know, what category we, we decide to include or not. Yeah, thanks, another.
772 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:27:27.209 --> 05:27:48.619 Just piece to pile on the last sentence. Sorry Chris I think this is very helpful and, and useful to think about it. Sorry speak up, please. Oh, gosh, I've never been told to speak up. You're welcome. Yes, exactly. Thank you, Warren. Is that it's hard to tell a priori whether an output is going to materially reduce the need to access the.
773 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:27:48.619 --> 05:28:08.866 Original asset and so that's just a difficult, like I don't know whether I'm following these preferences or not until I see whether my users click on the thing, which is, it sounds like, yeah, maybe I know it when I see it, but, you know, that's, that's just a hard thing to have to apply.
774 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:28:13.279 --> 05:28:26.609 Nick. So I just wanted to second what Brad said, and then secondly, I wanted to add, in response to the concern that maybe some search applications would.
775 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:28:26.609 --> 05:28:46.609 You know, would see their product deteriorated or they would want access to more content than, that maybe would be provided by by the web in total. There's nothing here in fact, there's nothing here to preclude other more granular consent mechanisms. In fact, that was specifically contemplated.
776 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:28:46.609 --> 05:29:12.199 In at the beginning of the discussion today. So if something else is needed from any other asset owner, you can just go out and ask them and offer them a different way, a different way of control. So these are not the only controls that would be available and if there is something more specific and you're a model provider and you need, you know, you need, you need something else, you know, you, you could offer more there's nothing to prevent you from offering more granular controls and if we had.
777 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:29:12.199 --> 05:29:33.839 That maybe we wouldn't have as much trouble here. I don't know if this was addressed in my comment, but if so, my prompt point was that we wouldn't know until after whether the use was ok. And so that makes it hard to even know whether, like, did it materially reduce the need to access the original asset? Like I I don't know until I did it, but I've already done it once I find out and then I can't go back and ask permission.
778 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:29:33.839 --> 05:29:51.089 Yeah, you know? Yeah agree. Go ahead. Just to add to the pile on to the last sentence, sorry, but.
779 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:29:51.089 --> 05:30:11.089 This would sweep in much of search now. I mean regular search results often materially reduce the need for me to access the original assets so I just don't see how, how this would work. I'd love to zero in on exactly what the debate here is. It doesn't seem like there's a big difference in how the two definitions would handle.
780 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:30:11.089 --> 05:30:35.429 It'll display and output, so it seems to be, do we want to specify what individual back end processes are allowed or do we want to use a more general, anything that is not generative search? Any anything that is not generative that is search is allowed, the latter seems simpler and clearer.
781 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:30:35.429 --> 05:30:53.249 But I'd love to hear from Chris or anyone else, what are you worried about them doing with it that is like uncomfortably between those two definitions? Like, can you articulate what, what the concern is?
782 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:30:53.249 --> 05:31:10.083 Why isn't the pledge to not use it for anything but non generative search sufficient?
783 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:31:13.559 --> 05:31:34.709 Okay, maybe coming back later so Justin, go ahead. Just to speak with, I I don't know that I don't know that their reasoning is important if they don't want it to be used for that purpose, I think that's somewhat their choice, but to I I see I see I hear three different things emerging that we're kind of talking about too. To to Nate's point, he's saying I want to have.
784 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:31:34.709 --> 05:31:51.419 My stuff being indexed if I hear what you're saying, I want nothing to do with whatever you want to use generatively. I hear search has a generative function and we can use that to perform a better search, and and I hear Nate saying that's fine. I I don't want that, I only want indexing.
785 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:31:51.419 --> 05:32:07.319 And that's and that's why I think there's three ways of it. It's I'm fine with it. I I'm ok with search with AI, but I'm or I'm good with all AI. So I I I hear three, not two preferences in the room. Thank you.
786 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:32:07.319 --> 05:32:25.979 Yeah, I do also like I think Kevin here said articulated most of the questions I have at this point very well. I'm really, really uncomfortable with the sort of last part of the last sentence, like this is sort of like who is going, like.
787 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:32:25.979 --> 05:32:42.929 What does need even mean in this case? And can you assume that sort of like different users have the same needs or a so like like there's so much language in here which is just almost paternalistic towards users that sort of like.
788 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:32:42.929 --> 05:33:05.705 Yeah, that sort of like I don't see much of a difference in the intent of both of these things, but, but this for me dies with the last couple of words in there Launch the poll. Okay. I think Chris is back, in the Chris, yeah, you wanna, you want to follow up?
789 "Chris Needham" (1250734080) 05:33:05.705 --> 05:33:26.239 I, I think, I think the key to this is is like the the two bullet points like the the last sentence I, I'm I I can see the concerns. I I think don't be distracted by what the, the last sentence is doing. I think, I think that was inspired.
790 "Chris Needham" (1250734080) 05:33:26.239 --> 05:33:43.829 By like previous attempts that we've had with the rank category where, you know, sort of Brad had put forward the, the substitutive use, right? Which is, well, how do you know that something is substitutive until it happens and then you can determine whether it happened or not. And, and so that.
791 "Chris Needham" (1250734080) 05:33:43.829 --> 05:34:03.829 You know, we didn't end up sort of adopting that into the in in the group. So I, I think this this final part of this, you know, definition is, was kind of coming from similar, a similar line of thinking there, but I I think, you know, I would say don't throw the baby out with the bath water.
792 "Chris Needham" (1250734080) 05:34:03.829 --> 05:34:17.219 Like, it's the preceding part which I think is more important in this definition overall.
793 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:34:17.219 --> 05:34:44.449 Okay, Nate. So I would just point out something, which is I think that this full search definition would become a lot easier if we had a grounding AI output sort of other category, because, you know, if you've ever seen those things where there's an artistic rendering and it's a line and there's two faces and you can look at it two ways. There's two shapes there. Every time you draw a shape, you don't just draw the.
794 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:34:44.449 --> 05:35:07.219 Shape, you draw the relief of it. And so I think that we need to define both what is generative search and non-generative search, or AI output and search. I just think that they go together. It's both or neither. So to your point, we did try to go for an AI use thing and it didn't go very well, so I.
795 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:35:07.219 --> 05:35:22.685 I think we had problems with that than without it. So, so not just to say that we did try like you know that that it would help, but it didn't. Okay so.
796 "Gisele" (3787759872) 05:35:22.685 --> 05:35:44.599 Hi, I was, I just wanted to say, I feel like I would like us not to throw the whole thing away like Chris said, just because perhaps there's some sentences that might not be quite as, I guess clear and future proofed as they might be, but I feel about.
797 "Gisele" (3787759872) 05:35:44.599 --> 05:36:01.199 As websites being able to just establish a preference in terms of how this content that is gonna be, I guess even ingested for a second is gonna be displayed.
798 "Gisele" (3787759872) 05:36:01.199 --> 05:36:22.406 It's kind of similar to what Aaron was saying where, you know, the web is free riding on the search engine. I guess it's the flip side of that, which is us being able to ensure that the search, the crawlers and all of the different bots that might be relying.
799 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:36:22.406 --> 05:36:25.087 On the website to be able to function.
800 "Gisele" (3787759872) 05:36:25.087 --> 05:36:56.509 Not just being free riding on this information, regardless of whether for a user it's a lot more convenient to get an answer to the question. If you're gonna say at what time something closes or opens and that comes from crawling a website, that website should be displayed or that listing should be displayed to give the user the ability to, actually it's better for the user to be able to make sure that that information is correct. If they wanna send a message, they have a place where to.
801 "Gisele" (3787759872) 05:36:56.509 --> 05:37:15.302 Do it without having this intermediary consistently capturing that, that, that traffic on on those clicks just because it's more convenient or because it's the only way that they can actually function. Whereas websites can only function if, if people visit them. So that's that's all I.
802 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:37:15.302 --> 05:37:31.859 Yeah, thanks, Gisele. I think the point is that both Aaron's proposal and Chris's proposal require modification, so where do we prefer to start from? So we launched a poll for that after Martin goes next. I just wanted to ask Chris.
803 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:37:31.859 --> 05:37:48.629 What he saw as the difference between the two proposals because from my perspective, I'm reading both of these as refinements to what we have in the current draft. Yep. And that's good. I think both of them refine it in different ways.
804 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:37:48.629 --> 05:38:15.329 So points from earlier and about the addendum on proof of one, I think that's a distraction and if we struck that, then that would be a strict improvement. But curious as to what you see as the difference between the two of them. Chris, do you want to respond to that or should we just launch the poll and go after it?
805 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:38:16.500 --> 05:38:20.745 Thank.
806 "Chris Needham" (1250734080) 05:38:20.745 --> 05:38:26.909 Sorry, I'm here yeah just just finding the mute button. So, so I haven't.
807 "Chris Needham" (1250734080) 05:38:26.909 --> 05:38:46.909 Sort of come into this with a detailed sort of comparison in mind. So I'm kind of off the top of my head, right? So the 1st obvious thing here is that the, like, Aaron's definition includes a specific mention of transcription and translation, which, which this does not.
808 "Chris Needham" (1250734080) 05:38:46.909 --> 05:39:04.259 I think the definition that I've proposed has a preferable, narrower definition of the AI-related usage that is within scope.
809 "Chris Needham" (1250734080) 05:39:04.259 --> 05:39:20.489 I think much of the other parts are the same like this the the other difference that I'm seeing is around verbatim. And so we've included verbatim.
810 "Chris Needham" (1250734080) 05:39:20.489 --> 05:39:43.561 In this definition, Aaron's has, you know, includes what I would describe as, yeah, non non non substantial sort of, you know, rewording of text perhaps the yeah so so I think the the sort of the verbatim sort of part of the.
811 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:39:43.561 --> 05:39:46.629 The definition is is one aspect, that, that we all.
812 "Chris Needham" (1250734080) 05:39:46.629 --> 05:40:11.525 Ought to look at. You know, the search results being linked to associated content is by and large the same. So, yeah, I mean, the good thing is I think they're both pointing in a similar direction and I think perhaps an amalgamation of the two, you know, could get us to something that is, you know, perhaps better overall. Yeah.
813 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:40:11.525 --> 05:40:27.929 I would like us to, once we get past the poll, to concentrate a little bit more on this question of the verbatim aspect, and the, those other differences because I think that might help us decide.
814 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:40:27.929 --> 05:40:47.929 Well, and it's just editorial, right? I am very conscious that we we should not fall into the trap of trying to vote for specific words right because whatever happens here, the editor's gonna go off and work on this and we as a group are going to work on this and it is going to change over time. And if we pick one of these.
815 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:40:47.929 --> 05:41:07.929 Starting point going forward, we will doubtless consider the other one and incorporate bits of it and evolve it over time. So I don't want people to get stuck on, you know, a particular version of this as a do or die kind of thing, but I think it would be useful to kind of get a sense of where people are thinking it.
816 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:41:07.929 --> 05:41:34.459 Most, beneficial for us to start. Keeping in mind that we have to come to consensus. So, you know, either way, what, what is the the you know easiest path for us as a group? You know, the the working group meetings will continue until consensus is achieved. So let's let's go ahead and do the poll, see where that gets us. So just for clarity, are we assuming that with Aaron's definition we are adding.
817 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:41:34.459 --> 05:41:48.349 That non generative yes. This one we are cutting the last sentence. I don't think we we came to a conclusion about cutting the last sentence, but I did hear a lot of doubt and and Chris, you seem to, to express that would need some heavy.
818 "Chris Needham" (1250734080) 05:41:48.349 --> 05:41:52.708 I would say, yeah, take this but cut cut the.
819 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:41:52.708 --> 05:42:10.859 Final sentence. Okay. Thank you Chris, that makes this easier. Do you want me to edit it? I think it would be useful to actually see the text and I know that that's annoying. That is annoying. Yep, but like otherwise. Is there anybody from Microsoft here on stuff? I can figure it.
820 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:42:10.859 --> 05:42:27.089 173173. So we have, well we have Aaron's, which is in a different issue. Aaron's is 1906, right? I'm not gonna modify it too much, but if you Can you do strike through and mark that's if you refresh. Oh, look at that.
821 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:42:27.089 --> 05:42:43.949 It's just structure. Okay, so that's that one. And then circuit in briefly. And then so there we go. 1906. Yeah.
822 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:42:43.949 --> 05:43:06.829 Yeah, ask one one question. So because the my before we strike the last sentence I think it's important to remember that that's just a kind of a an expression of something that's happening in the world today. There's probably a better way to say it, but to maybe consider what that, what that's all.
823 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:43:06.829 --> 05:43:31.549 Probably a better way to work it. So I like the idea of the too, but just to try to forget about it. It's a starting point, Chris. The discussions will continue, yeah. But I think we should start with the deletion. Let's start with the imperfect version of that and I think Chris said Let's start with the stuff stricken off. That's why.
824 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:43:31.549 --> 05:43:51.889 The this strike not happened. Yeah, Bastian, I thought about whether I received an answer on my previous question, so maybe I'm jet-lagged or I don't understand it, but so the non generative search, is that considered as an example.
825 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:43:51.889 --> 05:44:14.959 Section or is that considered as a choice, as a preference choice where you would say yes or no? Because if I read that Yes, it is a separate case that you can address like all other use cases by either saying yes or no to. So you would say no to a non generative search. You can't like it's relatively unlike.
826 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:44:14.959 --> 05:44:30.899 Scenario to happen but like it is also relatively unlikely to happen well so my proposal would be to I mean we'd only have the two categories if I see that correctly, like the model training in general, but I always.
827 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:44:30.899 --> 05:44:47.279 Saw that kind of as an exception for something that we may or may not have or like which is the use. And the way to express that is to say no to training and yes to this. Thereby you specify the exception.
828 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:44:47.279 --> 05:45:05.189 Like, let's not complicate things by adding another type of category which is an exception to another one. The entire vocabulary is constructed so that there are categories of use.
829 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:45:05.189 --> 05:45:25.189 That you can either express, you can express no preference about, you can express a yes or no. And that is how we're treating both of these proposals at the moment. Okay, so the poll isn't conclusive, it is NO youth. Please do, yeah.
830 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:45:25.189 --> 05:45:40.619 I was listening to the recent arguments. Sorry. That's why you have ears and fingers. Well, I have to use the ears 1st and then the fingers.
831 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:45:40.619 --> 05:46:00.619 Sorry, are we not using the the poll inside the Webex? Yes, we are. It's like it's yeah next to the participants list on the.
832 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:46:00.619 --> 05:46:26.009 I understand that it has. Okay, so we've got a fair representation of the room. It is slightly leaning towards Aaron's, but only slightly. I think, you know, there are a couple of potential paths forward. I am wary of saying, well, the two groups should go off and develop their proposals cause then people get invested in their babies.
833 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:46:26.009 --> 05:46:42.029 And that's not what we're trying to do here. So what I would suggest is if the editors are willing, perhaps they can take both proposals and, and try and come up with something that they can bring back to the group. So I would like greater clarity on the points of difference that Chris identified.
834 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:46:42.029 --> 05:47:01.049 And that includes the thing that was struck from, from Sure, I think this question of transcription translation, speech to text, text to speech, and new content is a major sticking point that I don't think the editors can resolve without the input from the room.
835 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:47:01.049 --> 05:47:16.679 Well, the rest of it I think we can manage. What I'm looking for is a piece of text that then we can raise issues on. If it were up to me at this point, I would take Aaron's apart from that question, but we can raise issues on it. We can also raise issues to add things to it.
836 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:47:16.679 --> 05:47:36.629 Chris's proposal or elsewhere. Yep. Okay, I think that I think that question of verbatim use as opposed to non substantial modifications like translation or what have you, is a, is a.
837 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:47:36.629 --> 05:47:52.199 Real meaningful difference between the two of them and if we don't resolve that here, I mean, why aren't we even bothering coming to a meeting? So, if we can get that on the agenda at some point better. All right. I think this is good on the agenda.
838 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:47:52.199 --> 05:48:09.089 So it's 03:20. Do we want to take a quick 10-minute break and then come back and continue these discussions? The back at 03:30 short? Yep.
839 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:48:09.089 --> 05:48:25.499 And we'll go to we can get a 04:30. It's just.
840 "TRN6-29-BANFF/speaker_1" (4268955648_1) 05:48:25.499 --> 05:48:46.205 Yeah, which is Webex.
841 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:00:31.739 --> 06:00:51.680 What are the guardrails? Hey, people are clicking in. Mark is in the bathroom, so I'm being.
842 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:00:51.680 --> 06:01:21.560 So he's delegated like being to you guys Yeah that's ok. It's clear that I think. I don't know.
843 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:01:21.560 --> 06:01:49.100 Oh hang on for it. Yeah. I'm sure I'm sure that we're off my now.
844 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:01:49.100 --> 06:02:09.450 The old story about finding contests.
845 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:02:29.450 --> 06:02:49.450 It's not just your phone, yeah.
846 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:02:49.450 --> 06:03:11.370 Let us recommence. It's it's a difficulty so I think the editors are gonna come back with with some text that we can noodle on a little bit there, but in the meantime, let's go back to our agenda.
847 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:03:11.370 --> 06:03:28.080 So I think that we're trying to address all of these issues as a, as, as a set. Let's take a look down while Mark works on that.
848 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:03:28.080 --> 06:03:43.680 Perhaps try and pick off some of the terminology issues here. 01:51, the definition of AI, we remind ourselves, we talked about this in the March interim.
849 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:03:43.680 --> 06:04:01.230 So there's a question of, yeah, alignment with other definitions, e.g. the NIST definition, the OECD definition.
850 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:04:01.230 --> 06:04:16.230 And Martin talked about, that being a good definition, but I think we we ended up making some modifications to it so that it made sense in our context. Yeah, I don't know why GitHub does that, but.
851 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:04:16.230 --> 06:04:36.230 If it is single comment, but it's the same So do we want to talk more about this? I mean the original issue is pointed out, it's extremely problematic and would effectively sweep in any statistical technique, and this touches on a bit of the.
852 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:04:36.230 --> 06:04:57.740 We've had before. Our definition of AI is an engineered system of sufficient complexity that, for a given set of human-defined objectives, learns from data to generate outputs such as content, predictions, recommendations, or decisions. I think perhaps the distinction here is that we have further refinements of the application of our in the individual terms. So it's not necessary.
853 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:04:57.740 --> 06:05:20.180 Necessarily that our terms apply to everything encompassed by this definition. So how do we currently feel about this issue? So Anchor is stating in the chat to focus on the behavior of the system rather than.
854 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:05:20.180 --> 06:05:37.260 Defining the whole thing, but if that's something somebody finds valuable, please follow up on that. Right, and I think even more important for that, why do we need even to define AI? I guess the question is, can we get rid of this entirely and just, let the rest of the text stand?
855 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:05:37.260 --> 06:05:51.425 I think we're down here by by trying to define the AI part of the term AI model, right? We are defining AI itself.
856 "Eric Rescorla" (1353541632) 06:05:51.425 --> 06:05:55.053 Why we can do that.
857 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:05:55.053 --> 06:05:57.304 Yeah, go for it, Ekr.
858 "Eric Rescorla" (1353541632) 06:05:57.304 --> 06:06:17.630 I mean, so I think this gets a little bit to what I was saying earlier, which is like the sort of structure of the current vocabulary is that we have this very specific search application thing and then this AI category basically serves to say you can't really do anything with this text at all.
859 "Eric Rescorla" (1353541632) 06:06:17.630 --> 06:06:41.400 With this content at all, especially as like a catch-all for some very, very broad category of things. And so I just wonder, like, is there some other way to frame that intuition that doesn't like try to define like what AI is, right? You know, I mean maybe to give you know a.
860 "Eric Rescorla" (1353541632) 06:06:41.400 --> 06:06:56.880 Perhaps less kind of virtual example, you know, you know, before there were AI reviews, there were like on search engines there was like, we just turned this stuff out of Wikipedia and put it in a box. And, and, and like.
861 "Eric Rescorla" (1353541632) 06:06:56.880 --> 06:07:19.540 You know, whether that like used AI or not, like, I think people were not very happy about it for the same reason they weren't happy with search overviews, no matter how it's constructed. So I don't like have like an answer to this question, but I'm just like trying to absorb the function of AI is sort of this catch-all in this, and maybe there's some way to capture that that doesn't require having like a definition.
862 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:07:19.540 --> 06:07:29.547 Well, our our current definition of foundation model production doesn't use AI at all. It uses models which maybe in 1st AI.
863 "Eric Rescorla" (1353541632) 06:07:29.547 --> 06:07:40.879 I mean, certainly, sure. Well sir. Is there anywhere, is there anything that's text now that depends on the AI? No. So, so so maybe we can start by striking that definition.
864 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:07:40.879 --> 06:08:07.620 I would also in the same way note that nowhere in our discussion so far today did anybody say that depends on how we define AI, which I think is an indication that we may just not need this definition given the current status of the text. For me it is like, are there terms that we need to define is probably a late-stage editorial pass over this thing and not something that we can.
865 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:08:07.620 --> 06:08:24.030 Make a very well informed decision at this point. Sure. At this point, I don't think we need it because our our our foundation model definition currently relies more on deep learning and machine learning than it does on, on that definition.
866 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:08:24.030 --> 06:08:41.460 Just point out it is the term AI is in the charter so it doesn't doesn't mean we didn't define that in the. Oh it's weave.
867 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:08:41.460 --> 06:08:56.760 It means when you submit something you have to be able to justify that it concerns AI. All right. Well, perhaps the editors can dispose of that text knowing that we can resurrect it if necessary.
868 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:08:56.760 --> 06:09:15.119 Is that ok? Oh I'm very happy to remember. Which text? Any text? The the definition, all of the text can go. I think at least this. Can all of the text go? Getting to that time of the day aren't.
869 "Eric Rescorla" (1353541632) 06:09:15.119 --> 06:09:17.662 Going to. You'll need to renew AI training too.
870 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:09:17.662 --> 06:09:33.000 Why you're in there. That's, that's what I was wondering. I think that's that's a good observation. Let's pare it down and then if we need it, we can always add it back and have the discussion then rather than spending too much time on it now. Doesn't it need to say AI for SEO?
871 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:09:33.000 --> 06:09:49.530 We have the keywords for that. Are we discussing adding new terms to this? Because.
872 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:09:49.530 --> 06:10:10.730 Learning. These are things that we refer to in the draft now. Well, I think when we get more confidence in in our vocabulary, I think that that will be a step that we'll need to take. Similarly, 01:52 use of machine learning.
873 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:10:10.730 --> 06:10:14.304 Was this another record? Oh, so this is, you already got us there.
874 "Eric Rescorla" (1353541632) 06:10:14.304 --> 06:10:32.430 Yeah, so I mean I think, so I mean it as you suggested, 4.1 uses deep learning or the machine learning techniques, which is kind of a train wreck text as it is. But you can but my you wouldn't just be able to like remove.
875 "Eric Rescorla" (1353541632) 06:10:32.430 --> 06:10:47.879 That's the definition of machine learning or that changing 4.1. So I don't know, like, I mean, can someone like who was like really an AI expert, which is not me, maybe try to find that like that that quote I just put in the chat in some way that like makes more sense.
876 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:10:47.879 --> 06:10:54.586 I'll tell you I'm wrong how it makes sense. I mean, we just say machine learning, would that work?
877 "Eric Rescorla" (1353541632) 06:10:54.586 --> 06:10:59.840 It seems like someone tell me, I have.
878 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:10:59.840 --> 06:11:07.142 And it's a very broad term, so I think I think we need something very broad here anyway. Yeah.
879 "Eric Rescorla" (1353541632) 06:11:07.142 --> 06:11:25.860 I mean that's what I would have probably written it seems to be the remainder, it seems to me that like the, the, the, the th the the part that is like, I mean if you just read, if you just read this text, this, this paragraph in context, this four point paragraph graph in context, right? There's.
880 "Eric Rescorla" (1353541632) 06:11:25.860 --> 06:11:55.919 What is the restrictive text? It's a model that is produced using machine learning techniques, is trained on a large set of assets, and typically possesses generative capabilities. That's not restrictive, right? So any model, I mean, is there any model that people would not want to have trained, that people wouldn't categorize as a foundation model, that used machine learning techniques and had a large set of assets? I think the answer is probably no, but maybe someone who knows more than me can answer that question.
881 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:11:55.919 --> 06:12:14.724 So I think we're going back to the definition of the foundational model training or whatever we're gonna call the training term. Yeah, I think for the disposition of this issue, probably the best thing to do is to remove the definitions until we have a vocabulary term that we're happy with and then create definitions as necessary.
882 "Eric Rescorla" (1353541632) 06:12:14.724 --> 06:12:34.677 Okay. Well, so I think I I've been arguing you just need machine learning previously, but I think I mean I don't see how you can have it. I mean somewhat just remember you're gonna need it for 4.1 because it's not like an inherently, it's not it's like not it's not a term that is clear unless you have a definition. So either if it Doesn't this stays in 4.1 then we'll have we'll need a definition later.
883 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:12:34.677 --> 06:12:55.205 Is it not like something that we could rely on people having an understanding of without defining it? Like machine learning is that that's what I'm ambient at this point. I mean it's kind of cop out, but I don't know that there's a whole lot of debate about that.
884 "Eric Rescorla" (1353541632) 06:12:55.205 --> 06:12:58.248 I guess if you remove deep learning, you'd be, you might be ok.
885 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:12:58.248 --> 06:13:18.740 Right. Yeah, yeah. Okay. I think learning I can strike easily. Did. No, it did save. See, this is.
886 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:13:18.740 --> 06:13:33.810 That's what happens to me. Replacing digital assets with digital content. I think we discussed this before. This turned out to be somewhat contentious.
887 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:13:33.810 --> 06:13:51.870 Yeah then we seem to to come to some agreement in in the March interim about changing the term.
888 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:13:51.870 --> 06:14:03.847 I will say that in an IETF context, asset is an unusual term, it implies ownership, which I.
889 "Leonard Rosenthol" (1805905664) 06:14:03.847 --> 06:14:09.419 I don't think it's a property percent.
890 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:14:09.419 --> 06:14:10.201 Really?
891 "Leonard Rosenthol" (1805905664) 06:14:10.201 --> 06:14:14.340 Yeah, why would you believe that Ownership.
892 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:14:14.340 --> 06:14:25.727 Because I've never heard it in a context where it didn't have that connotation? It means thing in media workflow. In media workflows it's just a thing.
893 "Leonard Rosenthol" (1805905664) 06:14:25.727 --> 06:14:29.219 It's a thing, Right, I agree.
894 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:14:29.219 --> 06:14:31.683 I think the whole.
895 "Leonard Rosenthol" (1805905664) 06:14:31.683 --> 06:14:41.024 It's the whole thing whereas content is just the visible stuff and doesn't include things like metadata where.
896 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:14:41.024 --> 06:14:44.645 Oh, you come from them. You come from a very different world. Yeah.
897 "Leonard Rosenthol" (1805905664) 06:14:44.645 --> 06:14:48.447 Yeah, I come from the media world oddly enough, right?
898 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:14:48.447 --> 06:15:00.150 Right, right. So, I think what we need to do, whatever happens here is define a term and make sure that it speaks to all the audiences for the document.
899 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:15:00.150 --> 06:15:19.710 I don't know that we need to spend a lot of working group time because this is a fairly editorial issue that does feel like a bit of a bike shed. But it is a somewhat vexing editorial issue because like I think it very much depends where you come from. Sure. And.
900 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:15:19.710 --> 06:15:39.710 There's sort of like I wonder what, what, like Leonard or others were throwning to I consider myself like I also think it's the more natural term to use in this context from where I come from, but I can also not necessarily quantify what we're losing if we're using a different term. And I think Leonard, if you could specify.
901 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:15:39.710 --> 06:15:50.365 Your concerns related that that would arise from the use of another term, then sort of that will help us in editing this.
902 "Leonard Rosenthol" (1805905664) 06:15:50.365 --> 06:16:08.802 I'm not saying that if we can come up with another term, that we're all ok with, I'm fine with it as well. It's content, which is the only other term that's been proposed is not a viable option, but I'm happy to give some thought to other terms that are in common use that might be better suited.
903 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:16:08.802 --> 06:16:14.343 And can you make it crisp why content is not viable.
904 "Leonard Rosenthol" (1805905664) 06:16:14.343 --> 06:16:34.680 Because in the media world content refers only to the visible content, like the pixels of an image or the frames of a video or the pages of a PDF, but doesn't refer to the metadata or other pieces that make up the entire asset.
905 "Leonard Rosenthol" (1805905664) 06:16:34.680 --> 06:16:54.680 From a binding perspective, like, you know, I wouldn't use the word, but think about it as the difference between container and stuff in the container. Somebody in chat suggested file. The reason why file is better is it's closer to asset, but.
906 "Leonard Rosenthol" (1805905664) 06:16:54.680 --> 06:17:03.239 File implies a storage model, which we certainly don't want to imply. Right, part of it.
907 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:17:03.239 --> 06:17:07.868 Alright.
908 "Leonard Rosenthol" (1805905664) 06:17:07.868 --> 06:17:14.083 Yeah, but I'm happy to look for other terms. We can certainly leave this one open and move on to other.
909 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:17:14.083 --> 06:17:34.820 Okay, where are we? Terminology. So I'm conscious that it's three.
910 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:17:34.820 --> 06:17:51.720 50. Martin, did you have that PR ready? There's a PR up. It's not very good, but, oh waiting as a starting point. I think that these contacts issues I'd rather face when we're fresh tomorrow morning, to be honest.
911 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:17:51.720 --> 06:18:09.150 So what's the PR? Oh, do you want me to give you a number? 1909. What's there? It wants 200. Should we close the working group at 200?
912 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:18:09.150 --> 06:18:25.380 So why don't you walk us through this? Why don't you find the text and we can walk through the text. There we go.
913 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:18:25.380 --> 06:18:41.460 Oh, this is terrible. Yeah, I could make it maybe side by side. I don't know sorry there's like 15 things on my screen right now. Yeah, I know. And that's gonna be too small. Oh, let's see here.
914 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:18:41.460 --> 06:18:57.060 So I tried to avoid the new content thing, but I'm not sure that that's been successful there. That's no better at all. Oh no, that's not so bad. That's not so bad.
915 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:18:57.060 --> 06:19:17.010 Yeah, so the the green stuff. Yeah, the green stuff. So basically the the one phrase tweak that I made there was the talent of that, not really sure about output, but maybe that avoids the new content.
916 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:19:17.010 --> 06:19:33.480 Sort of which I think is just a word choice problem, and then it basically took the points that, that Aaron had plus some text from what what Chris had starting with the substantial quantities about that's helpful.
917 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:19:33.480 --> 06:19:54.830 Screen went by, I think I've forgotten. Search results of the links and associated content, that's straight from Aaron's. Verbatim excerpts and snippets was directly from Chris's one, but then modified that with the adaptation
918 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:19:54.830 --> 06:20:13.260 Stuff the translations, the transcriptions, the text to speech, speech to text modifications, which covers the accessibility piece that Aaron's one had, and then tweaks to the last two points on Aaron's one. So this uses text from Chris's one.
919 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:20:13.260 --> 06:20:29.520 Internal processing of assets to perform indexing ranking and retrieval can use AI models. And the final one is internal processing includes use of the asset in training models, but only models exclusively used by the search application under this definition.
920 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:20:29.520 --> 06:20:45.450 So basically, you can use models for all this stuff, but, if you're training it, it can only be used for these things and not other things. And yes, it's far more complicated than the previous one.
921 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:20:45.450 --> 06:21:01.530 But I think that covers the points and allows for disagreement on the, on the points that we didn't agree on.
922 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:21:01.530 --> 06:21:16.740 Yeah, well, you know, Asset we can we can use things if you prefer. Routine. So, discuss.
923 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:21:16.740 --> 06:21:33.660 Yeah, I obviously like parts of this that are derived from my definition. But I think it's a good blend. I am a little concerned about the second-to-last bullet,
924 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:21:33.660 --> 06:21:50.520 for the same reasons as I was concerned about that in Chris's original write-up. Here it is not phrased as exclusively, which I think is an improvement, but that introduces an ambiguity.
925 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:21:50.520 --> 06:22:10.520 Right? Because it doesn't say whether the internal processing is limited to those functions and I don't think it should be. So I question the utility of a statement that just kind of acknowledges that those are possible and other things are also.
926 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:22:10.520 --> 06:22:30.210 Possible and I wouldn't want to clarify it in the direction of saying only those things are possible. Yeah, I, I that that would be unbiased. I I agree, but this is really just a statement of facts more than anything else. It doesn't really read on the, on the definition. It's just as a reminder that this this definition doesn't.
927 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:22:30.210 --> 06:22:49.290 Say anything one way or the other. Maybe that's the right way to frame that then. Yeah, perhaps we could add that as a clarification, just to avoid the ambiguity, like saying after the alternative, as text after the list, as a note. Yeah. Well, would it be as simple as, you know, to perform tasks such as.
928 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:22:49.290 --> 06:23:05.987 Yep. Yep. All right. Yep. I'll.
929 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:23:23.500 --> 06:23:30.990 We're hearing some pushback in the chat on this sentence-level term.
930 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:23:30.990 --> 06:23:50.990 It's a little unclear in here. Yeah, I just don't know what it means. I'm not pushing back. I just don't know what it means. It also reminds me of some objections that Brad had during our conversation bunch. So Leah, do you want to talk about
931 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:23:50.990 --> 06:24:11.520 Your update on this? Sure, yeah, I spent the break also trying to split this complicated baby. Can you speak up a little bit? Yeah, sorry. I spent the break also trying to split this baby and I think the way I tried to structure it was to 1st say what is included, which is.
932 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:24:11.520 --> 06:24:31.520 Showing the asset in search output with the direct reference or link, and then allowing for excerpts or snippets, removing the verbatim requirement and just keeping it to excerpts or snippets. It does include the indexing, ranking and retrieval language that Aaron has
933 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:24:31.520 --> 06:24:48.780 Challenged in Chris's language, but I tried to sort of say that It's and to enable the search application without that like strictly necessary language, and that separately I wrote that the category doesn't include.
934 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:24:48.780 --> 06:25:06.960 Essentially generative AI outputs like AI summaries but without that title, a little bit more granular definition explaining that the asset can be used, summarized aggregated, you know, reproduced to generated summary or new content, which I think gets rid of the substantial quantities language.
935 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:25:06.960 --> 06:25:25.950 And I did note that the category does not include evaluating an asset for potential use. So it kind of takes pieces from both to kind of break out the definition to what is included and I just still on the screen.
936 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:25:25.950 --> 06:25:49.625 Where? Sorry, in the Webex chat, there's a link to the comment. There we go, 1703, yeah. Okay.
937 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:26:21.040 --> 06:26:31.210 What's this one? Nope.
938 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:26:34.860 --> 06:26:52.830 So what, what do we think the differences are between these two approaches? So again, the non substantive modifications, the transcriptions and.
939 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:26:52.830 --> 06:27:12.390 Translations is a major point of difference. Obviously the way it's spelled. Classification category. I think this has a lot less flexibility for those de minimis kind of uses. I also wonder what incorporated and aggregated are doing?
940 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:27:12.390 --> 06:27:30.300 We can keep it to just be used, that's easier. I was just trying to, like, take language; to just say used is fine. Can't hear anything. Oh, sorry. I said to just say used is fine if that's easier, to get rid of, like. Just delete that.
941 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:27:30.300 --> 06:27:46.466 So Drop the text that's highlighted. I'm sorry. The AI models do tend to be verbose.
942 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:27:54.679 --> 06:28:23.190 This is better, much better than the last one. One other note about, like, what I think the difference is that maybe we need to play out: generate new content is a pretty broad brush. Not to say that there's not a lot correct there, but e.g., internal labels that no one would ever see is probably new content as constructed.
943 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:28:23.190 --> 06:28:40.187 And I think that's what everyone was trying to solve for with the sentence level, and maybe sentence level's the wrong bar. I'm not sure, but I think as broad a brush as new content could create the wrong incentives for the sorts of uses that would be valuable.
944 "Eric Rescorla" (1353541632) 06:28:40.187 --> 06:28:49.659 I mean, doesn't the training process, if you train the model, involve generating new content beyond snippets?
945 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:28:49.659 --> 06:28:59.426 I just responded to that point in the chat. Maybe the clarification required is like display and output of the search results of that.
946 "Eric Rescorla" (1353541632) 06:28:59.426 --> 06:29:03.402 I think that'd be fine.
947 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:29:03.402 --> 06:29:21.300 In my just as a note in my original draft, each of the bullets had like a little title like display and output blah blah blah but then they became a little bit too imprecise so I took them out because I was worried that they would kind of If it's helpful we can put back in like display now.
948 "Eric Rescorla" (1353541632) 06:29:21.300 --> 06:29:35.199 I also don't understand what generates summaries or new content beyond excerpts or snippets means; like, excerpts or snippets by definition are not generated.
949 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:29:35.199 --> 06:29:38.585 Right, right. It's, it's saying beyond what's referenced above this.
950 "Eric Rescorla" (1353541632) 06:29:38.585 --> 06:29:58.945 But I guess I don't understand. Like, my point is, you don't generate any snippets. So, like, what is the beyond, what is beyond doing in essence?
951 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:30:00.459 --> 06:30:04.344 You write an entire summary that's beyond a snippet.
952 "Eric Rescorla" (1353541632) 06:30:04.344 --> 06:30:11.907 Yeah, I understand, but then any generation is the problem here, because snippets are verbatim. Not necessarily.
953 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:30:11.907 --> 06:30:16.041 Okay that probably needs to be clarified. We took out verbatim.
954 "Eric Rescorla" (1353541632) 06:30:16.041 --> 06:30:27.379 Sorry. It doesn't say verbatim, but I mean, what is a snippet? A snippet is a subset. Is it generated or not?
955 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:30:27.379 --> 06:30:42.630 I think like we're trying to get to the point of, Meredith I think saying, e.g., when does the public library open? It opens at 09:00 A.M. like that might be something that's generated from the source, but it is.
956 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:30:42.630 --> 06:30:59.907 Like a small snippet of information, right? If it were to be like, this library that was built in such and such year by such and such, you know, and it's blah blah blah, it becomes like a long kind of substitutive piece of information that's different than. So.
957 "Eric Rescorla" (1353541632) 06:30:59.907 --> 06:31:11.042 I understand excerpts or snippets to be subsetting the existing text by striking out words. And, well, you just tried to, I mean, you just tried to say it differently.
958 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:31:11.740 --> 06:31:25.348 I really are like the what you were saying is the answering thing that Nato's bringing up, right? Like right. And and I think you're trying to find a different way to to deal with what Aaron was trying to deal with the sentence.
959 "Eric Rescorla" (1353541632) 06:31:25.348 --> 06:31:27.266 Level of question before.
960 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:31:27.266 --> 06:31:58.530 Well Leah, how would you feel if if the highlighted word displayed were changed to generated what I'm trying to avoid is like the question that Eric asked, right, of like, what about generation that might happen on the back end here as part of this process? Like I I think not trying to cut that off and so I'm talking specifically, it's I'm trying to get a little bit at the substitute of this without.
961 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:31:58.530 --> 06:32:16.500 Using substitute use of what's displayed to the user and will or want to incentivize them to go to the site versus just leaving. So I don't have a str I'm not at all married to any of this text again thrown together in 10 min, but if there's a better way to get at it.
962 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:32:16.500 --> 06:32:36.500 Happy to do that. So I think it's moving in the right direction. What I like about it is that 1st off, I think it's helpful to structure.
963 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:32:36.500 --> 06:33:03.990 this with, like, one sentence; the top sentence makes that clear, and then you have the specific inclusions and exclusions, and I think that helps with any kind of drafting. And that 1st bullet in the does not include list is really important, because many of these applications will put a hundred different assets in a context window but only generate
964 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:33:03.990 --> 06:33:23.990 Links to three and the other 97 are still contributing to the output, even if they're not included. And so that bullet is important to sort of draw that line between non generative search and whatever else. So I think certainly there's probably some ironing out that needs to be done, but I think it's generally this is the the best definition I've seen in term.
965 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:33:23.990 --> 06:33:29.460 Is it moving in the right direction?
966 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:33:29.460 --> 06:33:49.179 Sorry to return to this, Tim at robot. In 2nd bullet point, or I'm sorry, the 3rd bullet point, if we strike excerpts or snippets as reference to above, do we lose anything?
967 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:33:50.939 --> 06:34:16.560 I don't know, well, like, it's broader. Is the intention that the excerpts or snippets have to be, you say you took out verbatim, but this can be read as requiring it to be verbatim or not. It's a little ambiguous, I think. And, like, I don't work.
968 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:34:16.560 --> 06:34:31.920 I'm neither a publisher nor a search engine company person, so I have no knowledge here, but it sounds like there are ways to provide non-substitutive snippets that are not verbatim.
969 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:34:31.920 --> 06:34:51.920 That requires some sort of processing. And then it sounds like there's a way to use the asset to generate what we would call a substitutive summary that gives enough information that the user never visits the site. Right. What bullet point 3 is trying to get at can be reworded to get at, and I think
970 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:34:51.920 --> 06:35:17.970 It's something that would be like a net new summary that is not like a short piece of information that requires you to go to the website to get more. I want to avoid the language that's like materially replaced or, you know, the stuff that we were saying beforehand. So maybe summary is a better point. Well, we do say summary, so maybe to say that the asset is not being used to generate
971 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:35:17.970 --> 06:35:36.210 Summaries or other new content. I think the reason for keeping beyond excerpts or snippets is just so it's not contradicting the excerpts or snippets section, but if there's a more graceful way to do that, I'm absolutely happy to do that. I just don't want it to read in a way that it's countered like.
972 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:35:36.210 --> 06:35:55.710 But as a line drawing exercise, you know, it sounds like we're coming to a place where on the one side, for the small excerpts or snippets or whatever we call this, you know, the sentence level, that's something we want to allow to be generated, because we acknowledge that, you know, you don't necessarily want a verbatim
973 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:35:55.710 --> 06:36:15.710 excerpt in the search results; that's often awkward. But you don't want to create something from whole cloth that is substitutive, or however we define that. I'd be interested to hear if anyone disagrees with that, that they want to draw the line in a different place, and acknowledging that we need to come
974 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:36:15.710 --> 06:36:36.360 up with some words here to figure out where that line exactly is. So I think Justin? Yeah, I think we don't need the text after the words generates summaries in bullet three. That's been addressed above, right? So the ambiguity that you're, you know, is very clear that.
975 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:36:36.360 --> 06:36:53.910 One of them, you create the ambiguity when you add it to bullet three. It might be, so yeah, I think what you're saying is, if a result or excerpt may be displayed from the source asset, you're saying verbatim in that statement, in my opinion, and in the bottom you're saying, but you can't generate new stuff.
976 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:36:53.910 --> 06:37:10.470 It's pretty clear if we just remove that. So as we have this conversation, I'm acutely aware that we need to work through the nuances to make sure that we're on the same page, but at the same time, we do not want to engage in group editing because that is where working groups go to die.
977 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:37:10.470 --> 06:37:30.470 Sebastian regarding the 4th bullet point. So it's in the category, the category does not include and then we have the 4th bullet point that says does not constitute. Wouldn't it be easier to strip the last part and move the 4th point to the 1st category.
978 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:37:30.470 --> 06:37:52.830 And say this category includes evaluation of an asset for potential use for stuff. Is that the intention? That that was from Aaron's draft, right? Kind of clearing the way for understanding that an asset might be used sorry might be evaluated what will be used, but.
979 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:37:52.830 --> 06:38:12.830 I think Sebastian's proposal is, if you just say evaluating an asset for potential use as an included thing, would it solve your problem, is what he's asking, if I understood correctly? Yeah, move it to the negation. It's two negations. We are working very often with two negations.
980 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:38:12.830 --> 06:38:29.490 And that causes problems. It can be reworded so that it's not part of what you're saying yes to; I think it's more of a clarification of what you're not saying. Okay, thank you.
981 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:38:29.490 --> 06:38:46.350 Thanks, Meredith. So my I just would like to reiterate something you said earlier, which is that I think looking at whether or not people have to click through is like a very bad path here is what people ultimately care about. Like, do people click through to their websites?
982 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:38:46.350 --> 06:39:01.470 But at the same time, it's a, it's a really bad test, and so I think instead of trying to like come back over and over again, we have to figure out a different evaluation of which types of wanting people to click through.
983 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:39:01.470 --> 06:39:21.470 Are we actually trying to affect here, right? Because you might want someone to click through a really basic factual data because they would see some unrelated thing on your website. That's been very, very hard to reduce between meaningful tests here but where in the context does that discussion about what.
984 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:39:21.470 --> 06:39:48.500 That means is a snippet is stuff that doesn't replace clicking through. It's not written down because it's hard to write down, but we keep talking about it because it is fundamentally the behavior people care about to go for it. I think like Andrew's point is like on the chat is kind of the thing like it's.
985 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:39:48.500 --> 06:40:08.220 And it's kind of hard to measure, I think, but yeah. So Mark, earlier you talked about how only allowing verbatim results in, like, awkward composition, and that there's
986 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:40:08.220 --> 06:40:24.060 An idea that we want to allow kind of almost verbatim, is what I heard you say? I think there are many ways one can spell that. Okay. And we need to get there somehow, but what I was interested in is, does anyone disagree with
987 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:40:24.060 --> 06:40:41.970 Allowing something like that. Okay. But disallowing whole cloth, you know. Right, because right now the way I read this is that only verbatim is allowed. Right. And so I brought that up earlier. Yeah, if we want to allow kind of almost verbatim, then we have to figure out how to
988 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:40:41.970 --> 06:40:58.860 State that, right. And that that's why I I'd love to leave the room today with some sense that we all agree on, you know, roughly where the line is with knowing that we still need to describe it with words, but that it's not all the way shoved at one end or the other.
989 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:40:58.860 --> 06:41:13.950 Right. And I'm not hearing any pushback on that. That's what I'm that that's the signal I'm taking. So I would I would love a specific question especially to people representing like media companies and content producing.
990 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:41:13.950 --> 06:41:30.180 Organizations or whatnot, like that specific question because like I, I've I feel like I've heard only verbatim, but not clearly. And so if we could.
991 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:41:30.180 --> 06:41:45.270 Yeah, I didn't hear that well. Okay. I just want to jump in on a quick point. Just want to jump in on a quick point that I made in the chat. I had in an original in an earlier version of this, I had included.
992 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:41:45.270 --> 06:42:05.270 Like, quote, non-substitutive functional transformations like translation, transcription or accessibility adaptations, as something that is permitted, but, yeah, I have been chatting a little bit with Brad and with Chris and there was a concern that that starts to look like a US-centric fair use carve-out, which is why
993 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:42:05.270 --> 06:42:24.360 I took it out, but if it's helpful to include some sort of language to that extent to help make that distinction between what is an AI overview or AI summary versus what is a transformation of the text that is not substitutive, then we can also work on that, but
994 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:42:24.360 --> 06:42:43.110 Just want to make that note, but I agree with Tim and for a lot that we should hear from publishers around that versus transformative versus. Chris F I was just gonna bring up the so I mean I agree with those directions this is going in.
995 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:42:43.110 --> 06:43:03.110 There's one point about the snippets or the excerpts, and I agree about verbatim or near-verbatim being allowable, but I mean there has to be a limit on that too, so of course, you know, if the excerpt is five paragraphs long, that could be substantive as well. So I think Aaron was trying to get at that with the one sentence thing; I don't know if that's the right solution.
996 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:43:03.110 --> 06:43:21.990 Question either, but there has to be some kind of limit introduced. So I just wanted to bring that up. So I agree with Chris what Chris just said, and mostly that's what I was intending to say. I think.
997 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:43:21.990 --> 06:43:41.990 For the context of the group, this is a somewhat new, not, not entirely new, but it's taken on new new changes in some major search applications. And I, at least for myself, and I think there, I think that there's a split internally for me and I also think among other publishers I've heard about kind of.
998 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:43:41.990 --> 06:44:03.290 On the one hand, it does involve rewriting, you know, if there it's like if you're rewriting a title very like very tightly to reference something that is not included in that publisher's title but is on that page, that might assist the user in actually navigating that page and using it because that title now includes something.
999 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:44:03.290 --> 06:44:33.300 That's like deeper down in the page that's relevant to their specific query, so I I can understand why you might want that, but again to Chris's point you, it's gotta be really tight, really close and near verbatim or down the slippery slope that we've done for all the other definitions. I put a pull request another pull request for this one. Okay, it's a tweak to it, but what's the number Martin? Scrolls down and.
1000 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:44:33.300 --> 06:44:49.500 We've got 200? I got 200 as well. I get all all the numbers. Yeah, fantastic again. There we go, it's not that big a document, so you get to.
1001 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:44:49.500 --> 06:45:10.170 You need to scroll up just a little bit, otherwise you're gonna miss the 1st sentence, which is already off the top. Oh nice. Thank you. Because apparently where's text is almost identical to the text that we had for the 1st sentence before.
1002 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:45:10.170 --> 06:45:35.030 Do we feel that these two proposals are converging more than they were? Yeah, I think so. I think we still have that translation transcription question, right? But I'd rather spend our time improving one of them and addressing the issues.
1003 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:45:35.030 --> 06:45:42.525 And making sure that we incorporated everything we're talking about rather than going back and forth between the two.
1004 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:45:44.799 --> 06:46:04.370 We still have a queue? Yeah, right? Andrew? Yeah, so I I I wanna come back to the question of why we are doing this because there's an awful lot here that sounds very much like I'm giving you instructions of what you.
1005 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:46:04.370 --> 06:46:21.630 You may or may not do. Elsewhere in the document, it's just like, you know, and you can throw this away whenever you want. So it's really weird to give people instructions about what they must do when they just ignore it. I, I don't understand how those fit together.
1006 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:46:21.630 --> 06:46:41.630 To be frank, Andrew, right? The idea is to communicate whatever you want to communicate as a preference clearly, so that it's understood by both sides, right? Then whatever they want to do is, like, outside the scope of this working group, but at least make sure that what you want to communicate is communicated clearly, and that's why we are, like, noodling on the text, because it's at least clear what you want to communicate because it.
1007 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:46:41.630 --> 06:47:02.030 It was not. Okay, good. I'm glad you said that because this is this actually gives me the the opening that I wanted. Ok, I I think the problem here is that we're attempting to draw very, very clearly a bunch of text for what it amounts to. I have some preferences and I want you to.
1008 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:47:02.030 --> 06:47:29.640 Understand them and we're gonna agree to do those things. And because the recipient of these preferences can ignore them at any time. Well, that case is the case where, like, this person just doesn't care what I want. So we can assume that those are cases where you've just got divergence anyway. So the only cases you care about are where you're aligned. In that case, we don't need to be really precise on this.
1009 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:47:29.640 --> 06:47:44.970 But it needs to be as, you know, close enough, it's good enough because, you know, we understand, ok, well, you know, we're trying to get to the right thing and so whether it's exactly verbatim or whether it's like kind of verbatim but not precisely, that doesn't matter that much because the idea is that you're convergent.
1010 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:47:44.970 --> 06:48:04.970 On the on the goals of this of this effort. I think it's like at a, at a higher bit, right? One of them is I'm saying like when I say, because like we have vocabulary, right? Like it's like a short description of the thing. We're not talking about all these things in the actual expression of the vocabulary itself, right? And the idea of like making this.
1011 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:48:04.970 --> 06:48:22.680 Precise is that, like, when somebody expresses a preference, the person consuming it as a good faith actor also understands exactly the same thing. And I think what I'm saying is we do this all the time in English or in any other natural language; people get it wrong, and when they do,
1012 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:48:22.680 --> 06:48:39.390 It's a good faith effort to converge, but when you've got people who are not doing things in good faith, convergence doesn't happen, and there's a flavor to some of the discussion around this that we have to fix those cases where we've got the bad faith actors. Can't fix that problem with this.
1013 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:48:39.390 --> 06:48:57.690 So I think we should just stop worrying about that part. Thank you. Give me to? Again, I wanna advocate for potentially returning to this after we talk about the top level exclusions, because I think we're
1014 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:48:57.690 --> 06:49:14.585 Trying to put too much into the categorical definition so we could solve at a higher level. Thank you. Specifically regarding like the translation.
1015 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:49:16.580 --> 06:49:19.901 Okay.
1016 "Gisele" (3787759872) 06:49:19.901 --> 06:49:50.010 Hi, I'm gonna have to leave in a few minutes, but I just wanted to share my view as a publisher or a creator, as it was asked, in terms of the verbatim or almost verbatim, or why even make a distinction. And I think, from where I see it, there's a few different overlapping reasons why one would want as close to verbatim as possible. One of them being that
1017 "Gisele" (3787759872) 06:49:50.010 --> 06:50:10.010 Obviously whatever it is that is put in front of a searcher is a representation of our website, our knowledge, our brand, if that is not exactly what we said, and it's not, there is a chance that there it's gonna be the generation might not be accurate.
1018 "Gisele" (3787759872) 06:50:10.010 --> 06:50:28.260 Which, you know, it's it's what happens with these models eh at least now, who knows what's gonna happen in the future, so that perception could be incorrect that we are presenting of the brand, and it's something that is happening right now. I know publishers in the in the food.
1019 "Gisele" (3787759872) 06:50:28.260 --> 06:50:48.260 A space who have been getting recipes associated to them in the in this generation, in these summaries that are incorrect, that if people were to follow those steps, they would think that website is terrible. They could go and open a trust pilot and say it's a terrible website which could then end their business.
1020 "Gisele" (3787759872) 06:50:48.260 --> 06:51:21.350 Over time. So I think there's a point there towards having content being as close as possible to to what we have on our website that is is actually important and there's a level of accuracy that there's gonna be much higher when it comes directly from the source. And then on the other end eh, I think also because of how, how this works where lots of websites are visited and I guess there's a, there's a, a decision being made around what's best and then that might be.
1021 "Gisele" (3787759872) 06:51:21.350 --> 06:51:46.940 Attributed to one or two websites even though the information might be coming from different places creates this Frankenstein I guess possibility of, of just a summary that not only doesn't represent one website, it doesn't represent free websites and it still mentions them or perhaps links to them that might make people not wanna click either. And then the.
1022 "Gisele" (3787759872) 06:51:46.940 --> 06:52:04.500 Last point is somebody was asking what's the difference between or how, how do we separate between like an excert snippet and and a summary? And from where I see it is the difference between having something that might encourage somebody to read more?
1023 "Gisele" (3787759872) 06:52:04.500 --> 06:52:24.500 That might present a little bit of information that they it might answer their question as we've had it with Google say with snippets, which is something that has been happening for a very long time versus getting an an answer that is competing directly. It's a generation that is competing directly, which I guess is.
1024 "Gisele" (3787759872) 06:52:24.500 --> 06:52:38.685 Where this point of new content comes from competing directly with with the with the websites. So just, just that just to share my view of of the verbatim or or not verating. To me the closer to verbatim, the better.
1025 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:52:40.399 --> 06:53:08.430 Thank you. Max. I wanna go back just a few comments to the sort of little back and forth that Suresh and Andrew had. Suresh, you said the phrase, what are we trying to express? And I think it's useful to talk about who's gonna be making these expressions, which is publishers, site owners, content owners, or however you want to phrase that, right? There is,
1026 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:53:08.430 --> 06:53:27.569 you know, a clear class of people that will be creating the expression, right? And I think from that perspective, and this is a bit of a point that attaches to context, and I know we'll get to that much more tomorrow, but, like, fundamentally, when we think about these definitions, we need to be thinking about
1027 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:53:27.569 --> 06:53:43.349 publisher choice, and I think that's really critical, and I think that's partially what Andrew was saying earlier as well. I think, like, unless we take that as where these definitions start from, it'll be very difficult for us to actually
1028 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:53:43.349 --> 06:53:59.699 make this work useful. So I wanted to just clarify that, from my perspective, the "we" in "what are we expressing" should start from the publisher or site owner perspective. I think Andrew's point was not to, like,
1029 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:53:59.699 --> 06:54:15.419 spend too much time perfecting this, because I understand that, right? Like, I think we've drained the queue. And we've drained the queue. Okay, so I think Aaron asked in chat if we can
1030 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:54:15.419 --> 06:54:35.419 go back to what are the substantive differences, or what are the outstanding issues in these? I think that's a helpful way to look at this. I don't want this to be a beauty contest between two different proposals. I'd much rather, you know, identify the open issues or the things that we need to resolve and work towards an amalgamated
1031 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:54:35.419 --> 06:54:53.339 text. I think we need to put time to just discuss that. Sure. I don't think we're ready to incorporate anything into the draft yet, not considering our past experiences.
1032 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:54:53.339 --> 06:55:12.959 But I would like to raise some issues against the proposals and understand them a little more deeply and make sure we understand where they sit. So, if folks can go and think about these and try and figure out what needs to be discussed on them more, I think we'll come back to them after some other discussions
1033 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:55:12.959 --> 06:55:29.939 tomorrow. I mean, we're doing reasonably well in our agenda. I can find it here somewhere. Let's just go this way.
1034 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:55:29.939 --> 06:55:50.399 So, we've kind of at least touched on everything down to here. Tomorrow morning, maybe we should go through the context issues; we'll touch briefly on the use issues.
1035 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:55:50.399 --> 06:56:06.449 And then, tomorrow afternoon or whenever we finish that, we can go back to both this cluster of search issues as well as the training issues and see if we can make some progress on those two definitions. Does that make sense? Sounds good.
1036 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:56:06.449 --> 06:56:22.529 And I think, like, can we take the 201 PR as the basis for, like, people to start issues, or, like, where do people start? I've got two of them now. There's the two PRs. I think we need to
1037 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:56:22.529 --> 06:56:41.129 pick one. I would like to pick one. I would say 201 is like the... well, perhaps we should see if people change their minds. Yeah, I'd like to see what the room thinks. So as a reminder, I believe this is, yeah, this is
1038 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:56:41.129 --> 06:57:01.129 200. No, 1906, yeah. 199, I think. 199 and 201; 200 is the other thing. Martin just wanted to take 200, so we just did. Oh, the 200 moves the definition of AI.
1039 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:57:01.129 --> 06:57:18.089 Oh. So this is 200 199. Sorry, four dotty dollar, thank you. Gotcha liked it all be talking about my call.
1040 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:57:18.089 --> 06:57:34.649 So suck this in. This is 199, and that's a... I'm not gonna associate that person's name with it because I hope we're past that, right? So this one's...
1041 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:57:34.649 --> 06:57:53.129 What version? Sorry, this is one version that attempts to have a single-sentence definition and then expand on what the words in that mean. These are gonna have issues, and adjustments need to be made. Yeah.
1042 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:57:53.129 --> 06:58:08.579 And this is 201, which is a little awkward because the diff is split, but there's the first sentence and then here's the rest. Yeah. And I think the first sentence changes much.
1043 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:58:08.579 --> 06:58:24.869 Okay, so maybe you want to do a poll? Should I? 199, 201, or don't care. And then we can go home. Sorry? Which is the one that we're looking at right now?
1044 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:58:24.869 --> 06:58:44.869 201. 199 or 201 or don't care. The one we're looking at right now is 201, sorry. Sorry, I might have missed this, but this is slightly different than the original submission because it aims to show the difference
1045 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:58:44.869 --> 06:59:05.959 between submission and my submission. This is slightly different that wasn't the comment. Yes. Yep. I think I just missed. I was just trying to track the conversation. I got some feedback from people and you were like yes. And so that's where we ended up now. Both of them have evolved.
1046 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:59:05.959 --> 06:59:35.489 Oh, live now, because I'm still seeing the old one. I don't know what to type into the question, into the poll. 199, 201, or 201. This is the we don't care. I don't care. And your three options are 199, 201, don't care, as a starting point. This is not "we are adopting text." This is
1047 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:59:35.489 --> 06:59:50.939 what is the basis of our continuing discussions? And we're not throwing the other one away; we'll still use it as a reference. Launch that poll. I mean.
1048 "TRN6-29-BANFF/speaker_1" (4268955648_1) 06:59:50.939 --> 07:00:07.709 This is the vaguest question ever, but everybody seems to be happy answering it, so number three, do you wanna go home?
1049 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:00:07.709 --> 07:00:23.519 Then we get to pick. Whatever you'd like. Records. Hey, make your preferences heard.
1050 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:00:23.519 --> 07:00:42.029 I'm sure "no preference" is pretty strong. Another 30 seconds. You do know there's more than 30 people in the room. You can see. I know.
1051 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:00:42.029 --> 07:01:02.519 Wow, go "no preference". Oh man. You could not have a clearer answer. No preference is to express.
1052 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:01:02.519 --> 07:01:22.519 If this poll isn't determinative, we're gonna leave it up to the editors to come with an amalgamated proposal to the group. Do we have to synthesize the two of them? You do what you feel necessary. Oh, things would be.
1053 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:01:22.519 --> 07:01:40.289 I know we're. You take as input the two proposals and the existing text and you generate an output. Right. It's deterative yes. You're assuming that I'm gonna be doing this work? Maybe doing this work. What's the outcome of the poll?
1054 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:01:40.289 --> 07:02:00.289 The outcome of the poll is, like, evenly split three ways. So it's like 1st then no information. Well, what if they just... OK, here's a better question, Suresh. Is anyone opposed? And let's do this by, yeah, if... No, no, it's not a poll. It's a question.
1055 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:02:00.289 --> 07:02:10.319 Question. Yeah I'll stop this and yeah because so stop that one. That's just, that's an embarrassment.
1056 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:02:10.319 --> 07:02:28.379 Martin, do you wanna do that as opposed to a particular option, or just opposed to giving the editors discretion to come up with a... I think I'd like to hear about what the problems with each one of these would be. Like, who would think one of these is not. So that's effectively an issue-gathering exercise.
1057 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:02:28.379 --> 07:02:45.119 No, no, I think we start with if one of these is unacceptable. Okay. Then that gives us, we can go to the other one. Because you just asked for preferences, and preferences don't really give us enough to. Who would find it unacceptable? Yeah, beautiful.
1058 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:02:45.119 --> 07:03:03.089 And then we might ask people to say why? Do you find it unacceptable? Yeah, yeah. The last one was a beautiful testing.
1059 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:03:03.089 --> 07:03:23.089 Really, the beauty contest didn't really give us enough. All the children are beautiful. Thank you massports. So an objection, just to, like, move us forward, with 201 is it doesn't talk about some things like translation
1060 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:03:23.089 --> 07:03:44.189 that are in Aaron's version that I think have value. So I would say that's the difference between the two. So you would object to 201 as a starting point? Yes. Perfect. Okay. But we could have had that to 201 but we had a starting point. And as they are, I'd have to object to one and approve of one starting point. Okay.
1061 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:03:44.189 --> 07:04:00.059 Since I raised my hand first, I get frustrating. Oh, of course. It's going exactly the same with that. Yeah. Information is information. You have enough objections.
1062 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:04:00.059 --> 07:04:20.059 Why are you doing one of the sites? Mark asked me to because like I I I think you're not gonna get an answer. Yeah. Okay, so this is not going anywhere, so. No, ok. Okay, that.
1063 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:04:20.059 --> 07:04:47.189 That poll is not useful, so, like... Can we do a poll if that poll is useful? I think people want to go home at this point. I think you're right. So I heard, because I think the point is, Martin, right, like, so when people go home today, right, they need to start issues on something. Do you want them to start issues on both of them or, like, just one of them? No I think that's helpful at this point. I think we need to pick one of these and then we can start building on it.
1064 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:04:47.189 --> 07:05:04.139 To point, if the translation thing is important to you, and that's causing you to pick 199 over 201, then, you know, we either pick 199 or we pick 201 and make sure we have an issue for the.
1065 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:05:04.139 --> 07:05:24.439 And that's the thing: we can pick either and then create issues against it. And so we don't need to face plant on picking one. That's why I would suggest we just give the editors discretion to come up with a proposal that looks like something that they think is defensible, and then we'll start creating issues against it. Yeah. Chris, did you want to say something?
1066 "Chris Needham" (1410254336) 07:05:24.439 --> 07:05:42.379 Yeah, I think we need to work through this thing about translations, like, that's probably my main objection to starting with 199, that and also the lack of clarity around sentence level. So this is why I'm so.
1067 "Chris Needham" (1410254336) 07:05:42.379 --> 07:05:56.640 You're preferring 201, as I don't think that the search category should really be saying anything about translations or sort of modality transformation.
1068 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:05:56.640 --> 07:06:07.079 Can I suggest something, like, because I think we won't go anywhere if you just kind of pick one. I would just say, like, if you have issues with one of these, either 199 or 201, file an issue on it.
1069 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:06:07.079 --> 07:06:24.059 Right, and then we can kind of enumerate the issues tomorrow because then we don't know what we're referring. No, but, like, issue and then we'll pick one of the two, right? Like, it's clear that the translation issue is strong; it needs to be discussed. Yep.
1070 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:06:24.059 --> 07:06:40.229 Yeah, yeah. I'll file an issue with. I don't think it's helpful to focus on the individual proposals; we need to focus on the issues that are involved in both of them. But is that the one? I suspect others will be at north.
1071 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:06:40.229 --> 07:07:02.429 So what is the homework? We need to... The homework is to look at both proposals and to, like, file issues, if you have any, explaining what you need to see change in either or both. Yep. Is there any homework related to the foundation model or whatever model training? Where did we end up with that?
1072 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:07:02.429 --> 07:07:21.029 In text. Which one?
1073 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:07:21.029 --> 07:07:41.029 There's text in there, but we didn't pick one of the two. We also have two options on the training side, so think about which one you prefer there. Yep. Which is the split-out one and the one which
1074 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:07:41.029 --> 07:08:02.579 only focuses on generative capacity. Right, and they're both in issue number whatever this one is. 198. So issue 198, and PR numbers 199 and 201, are the things we should be looking at. Yep. Okay, let's call it for today.
1075 "TRN6-29-BANFF/speaker_1" (4268955648_1) 07:08:13.640 --> 07:08:31.703 Okay, we're done for today. It's the same time tomorrow morning? Yep. You all tomorrow morning? Hopefully everybody's less jet lagged.
1076 "Leonard Rosenthol" (1805905664) 07:08:31.703 --> 07:08:35.141 Some folks tomorrow.