Session Date/Time: 07 May 2026 13:00
RASPRG
Session: RASPRG Interim Meeting
Date: May 2026 (Interim)
Chairs: Alvaro Retana, Ignacio Castro
Participants: 19 attendees (as per bluesheet)
Summary
The RASPRG interim meeting focused on reviewing current research efforts regarding the analysis of standards development organization (SDO) data, discussing methodologies for participation tracking, and exploring the development of common tools for data normalization. Key presentations covered cross-organizational comparisons between the IETF and W3C, behavioral predictors within the IESG, and technical challenges related to entity resolution and data ethics.
Key Discussion Points
1. Introduction and Administration
Ignacio Castro opened the meeting by reviewing the IRTF Note Well and the RASPRG charter. He emphasized that the group’s goal is to understand the standardization process through evidence-based, reproducible work, rather than to influence IETF operations directly. The slides for this segment are available as the Chair slides.
2. Measuring the IETF and W3C (sodestream)
Stephen McQuistin presented "sodestream: Measuring the IETF and W3C", research from the sodestream project.
- Project Goals: Streamlining social decision-making and identifying bottlenecks in standards development.
- Data Scope: The project analyzes ~130,000 participants across the IETF and W3C, using 4.3 million emails and 1,800 GitHub repositories.
- Observations: While IETF participation has leveled off (~4,000 active participants annually), the W3C has seen growth, largely driven by its wholesale shift to GitHub in the mid-2010s.
- Intersection: Topic modeling shows significant overlap between the two organizations, highlighting the need to study them as a shared community to avoid duplicated efforts.
- Discussion: Priyanka Sinha inquired about data cleaning (spam/duplicates). Stephen McQuistin confirmed that data is cleaned for consistency, noting that IETF archives are generally cleaner than others.
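The duplicate-removal step raised in the discussion can be illustrated with a minimal sketch. It is not the sodestream pipeline: the function name `dedupe_messages` and the fallback key (sender, subject, date) are assumptions made for illustration, and only Python's standard `email` package is used.

```python
from email import message_from_string
from email.utils import parseaddr


def dedupe_messages(raw_messages):
    """Parse raw RFC 5322 messages and keep the first copy of each one.

    Hypothetical helper: messages are keyed by Message-ID, falling back to
    (sender address, subject, date) when the Message-ID header is missing.
    """
    seen = set()
    unique = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg.get("Message-ID") or "").strip()
        # parseaddr tolerates many malformed From headers and returns
        # ("", "") when it cannot extract an address.
        _, sender = parseaddr(msg.get("From", ""))
        fallback = (sender.lower(), msg.get("Subject", ""), msg.get("Date", ""))
        key = msg_id or fallback
        if key in seen:
            continue  # duplicate of a message already kept
        seen.add(key)
        unique.append(msg)
    return unique
```

Keying on Message-ID first is a design choice that catches cross-posted copies; the fallback is deliberately conservative and may miss duplicates whose headers were rewritten by list software.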
3. Regional Participation and LLM Tools
Marcelo Santos discussed ongoing efforts to analyze Brazilian and Latin American participation in the IETF.
- Predictive Analysis: Researching features that predict whether a draft will become an RFC versus expiring.
- Onboarding Tool: Developing a beta software interface (using LLMs) to help researchers find relevant working groups or mailing list discussions based on their expertise areas.
- Focus: A proposal for the upcoming Vienna meeting will examine "expired drafts", analyzing why documents fail to progress.
4. Linking Email Archives and AI Agents
Jie Bian described research focused on bridging the gap between email discussions and document revisions.
- AI Agent: Developing a chatbot that allows users to query specific details of an Internet-Draft or RFC; the agent searches the mailing list archives to provide context for technical decisions.
- Call for Feedback: A prototype will be shared on the mailing list to test for accuracy and potential "hallucination" in AI-generated technical replies.
5. IESG Behavioral Study
Sue Hares presented a longitudinal study on the "back end" of the standards process (post-WG Last Call).
- Methodology: A behavioral study of IESG minutes and deliberations from 1990–2016, currently being extended to 2026.
- Predictors: Research identifies how group cohesion and leadership (ADs "going the extra mile") predict the time taken for a draft to move from approval to RFC publication.
- Key Finding: The IESG process is effective but highly dependent on the "character" of the participants and the IETF Chair.
- Discussion: Michael Richardson and Sue Hares discussed the application of "psychohistory" (qualitative social science) to organizational behavior.
6. Datatracker and API Improvements
Jennifer Richards provided updates from the tools team regarding the IETF Datatracker Statistics.
- New Statistics: Eric Vyncke has contributed new plots for country and affiliation statistics.
- API Evolution: A new API version is planned to move away from raw data access (V1) toward a more stable, user-friendly interface based on OpenAPI schemas. This aims to insulate researcher tools from internal Datatracker database changes.
7. Analysing SDO Data (Draft Discussion)
Colin Perkins presented Analysing Internet Standards Development Organisation Data, corresponding to draft-perkins-rasprg-analysis-data.
- Sociotechnical Context: Analysis must account for unwritten rules, informal hallway discussions, and the influence of reputation, which are not captured in technical artifacts.
- Data Processing Challenges:
  - Entity Resolution: Identifying individuals across decades and different email addresses.
  - Affiliation Mapping: Managing 280+ variants of a single company name (e.g., Huawei).
  - Mailing List Parsing: Handling 30 to 40 years of malformed email headers.
- Ethics: Researchers must be careful with public data to avoid disrupting the standards process or livelihoods of participants.
- Discussion: Sue Hares suggested adding "economic environment" (e.g., data center growth) as an exogenous factor in the sociotechnical loop. Priyanka Sinha suggested incorporating external data like patent filings or market share.
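As an illustration of the affiliation-mapping challenge listed above, the following is a minimal sketch of string normalization, not the method used in the draft. The suffix list `LEGAL_SUFFIXES` and the function names are hypothetical, and a real mapping would also need a curated alias table for subsidiaries, renames, and mergers.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical, deliberately short list of trailing tokens to drop.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "co", "corp", "corporation",
                  "gmbh", "limited", "technologies"}


def affiliation_key(raw: str) -> str:
    """Reduce an affiliation string to a crude comparison key."""
    text = unicodedata.normalize("NFKC", raw).casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # punctuation becomes whitespace
    tokens = text.split()
    # Strip trailing legal-form tokens, but never empty the name entirely.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_affiliations(strings):
    """Group raw affiliation strings that share the same key."""
    groups = defaultdict(set)
    for s in strings:
        groups[affiliation_key(s)].add(s)
    return dict(groups)
```

For example, `group_affiliations(["Huawei Technologies Co., Ltd.", "HUAWEI", "Huawei, Inc"])` places all three strings under the single key `"huawei"`. Rule-based stripping of this kind is easy to audit but will over-merge or under-merge in edge cases, which is one reason shared test cases would be useful.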
Decisions and Action Items
- Data Normalization: There is a clear need for common methodologies or tools to handle affiliation strings and entity resolution.
- Draft Contribution: Sue Hares will provide references to organizational culture frameworks and human-subject data standards for inclusion in draft-perkins-rasprg-analysis-data.
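To make the entity-resolution part of the data normalization item concrete, here is a minimal sketch under stated assumptions: each record is a hypothetical `(name, email)` pair, and two records are linked when they share a normalized name or an email address. The group did not agree on this approach; the sketch only shows one possible baseline, built on a union-find structure.

```python
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Case-fold and collapse whitespace; a deliberately naive normalizer."""
    return " ".join(name.casefold().split())


def resolve_entities(records):
    """Cluster (name, email) records; returns a list of sets of indices."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    first_seen = {}
    for idx, (name, email) in enumerate(records):
        keys = (("name", normalize_name(name)),
                ("email", email.strip().lower()))
        for key in keys:
            if not key[1]:
                continue  # never link records through an empty field
            if key in first_seen:
                union(first_seen[key], idx)
            else:
                first_seen[key] = idx

    clusters = defaultdict(set)
    for idx in range(len(records)):
        clusters[find(idx)].add(idx)
    return list(clusters.values())
```

Linking on an exact name match will merge distinct people who share a name, so in practice the output would need manual review or additional evidence (for example, overlapping activity periods) before use in analysis.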
Next Steps
- Hackathon Planning: Colin Perkins and Ignacio Castro discussed organizing a hackathon for the Vienna meeting (IETF 126) to focus on entity resolution and data cleaning "test cases."
- Mailing List: Participants are encouraged to discuss which specific research tools should be hosted on the RASPRG GitHub.
- Upcoming Meetings: The chairs will monitor mailing list demand for a second interim in June; otherwise, the group will reconvene in Vienna.