Session Recording
Session Date/Time: 06 Nov 2025 18:00
IETF Session: Evolution of Internet Performance Metrics to Quality of Outcome
Summary
This session discussed the limitations of traditional network speed tests for non-technical users in modern, multi-gigabit, multi-device home environments. The presenter detailed an evolution of measurement approaches, from gateway speed tests and device-specific performance estimates to a new metric called "Quality of Outcome" (QO). QO aims to provide users with a more meaningful, application-centric understanding of their internet experience by assessing the probability that a specific application session will be flawless, based on statistical analysis of latency distributions. Early trials of the QO metric showed positive user reception. The discussion highlighted the need to move beyond simple throughput measurements towards metrics that reflect actual user experience and application needs, and explored challenges in scaling and presenting these complex metrics to end-users.
Key Discussion Points
- Limitations of Traditional Speed Tests:
- Users often misunderstand speed test results (e.g., a gauge that reads "full" at 100 Mbps leaves users satisfied regardless of the tier they actually purchased).
- A single speed test doesn't reflect the complexity of multi-gig, multi-device home networks, Wi-Fi extenders, and varying device capabilities (e.g., an iPhone 4 on 2.4 GHz Wi-Fi will not achieve gigabit speeds).
- Discrepancy between network-delivered speed (at the gateway) and user-perceived speed (at a device via Wi-Fi) causes confusion and dissatisfaction.
- Simple throughput doesn't correlate directly with application performance (e.g., 4K streaming often needs far less than the maximum available speed).
- Evolution of Measurement Approaches:
- Gateway Speed Test: Measures speed delivered to the network gateway to differentiate ISP network performance from in-home issues.
- Device Fingerprinting: Identifies the user's testing device to provide expected speed ranges for that specific device, helping set realistic expectations.
- Wi-Fi Blaster: A pseudo-speed test using UDP packets to estimate Wi-Fi speeds to every connected device in the home, including IoT devices, to identify poor internal connections.
- Whole Home Test: Combines an initial outage check, gateway speed test, and individual device tests, adjusting expected performance based on device type (e.g., Nest thermostat vs. 4K TV) and classifying connections as "good," "weak," or "fair."
- Need for a New Metric – Quality of Outcome (QO):
- Current methods still only estimate connection quality and don't reflect what users are doing or experiencing on their devices.
- Users consume applications, not raw megabits or milliseconds of latency; they want to know if Netflix or a video call is working.
- Network conditions are highly variable (e.g., microwave use, physical obstructions impacting Wi-Fi).
- Speed alone is insufficient; latency significantly impacts application performance (e.g., web page load times).
- Introduction of Quality of Outcome (QO):
- Defined as the likelihood that a specific application is used "without error," bridging the gap between Quality of Experience (QoE, subjective user interpretation) and Quality of Service (QoS, raw network measurements).
- Technical Basis: Utilizes a cumulative distribution function (CDF) of packet latencies for an application over a period. Application-specific requirements define thresholds for flawless performance (e.g., X% of packets must arrive within Y milliseconds).
- QO Score: Calculated from the "worst intersection point" where the application's latency CDF meets the defined thresholds, yielding a probability of a flawless session (e.g., "80% probability of a flawless gaming session"); a worked sketch follows this list.
- Composability: The QO metric can be broken down to identify the root cause of performance issues (e.g., Wi-Fi congestion, access network variability, server distance), guiding targeted troubleshooting advice.
- Early Trial Results and User Feedback:
- A small initial trial with 25 users, each given a router that reported QO scores, showed:
- 86% felt it reflected their experience.
- 100% preferred it over gateway speed tests.
- 86% would recommend the QO tool.
- Participants emphasized that throughput doesn't matter as much as it used to (an analogy was drawn to camera megapixels versus other features such as low-light sensitivity). Users primarily care whether they are "getting what they pay for," even if throughput isn't the cause of a poor experience.
- Challenges and Future Considerations:
- User Psychology and UX: How to present probabilistic scores (e.g., "80% likely to be flawless") to non-technical users effectively (e.g., as "good," "fair," or "weak," or as a scaled number), while avoiding stress when scores rarely reach 100%.
- Validation: Ensuring the QO number genuinely correlates with user-reported quality of experience at scale.
- Application Collaboration: Potential for applications to provide feedback on their performance (e.g., video buffer emptying) and collaborate with networks to optimize steering or troubleshooting. This suggests a need for open standards in this area.
- Independent Measurement: A participant noted the high desire for independent measurement tools, similar to regulatory-mandated speed tests, to provide trustworthy and unbiased assessments of QO.
- Impact of Encryption: A participant raised concern that protocols like QUIC, with strong encryption, might degrade the ability to monitor network performance, impacting user experience tools.
- Customer Education and Placement: The importance of educating customers on proper router placement and other user-addressable issues was reiterated, as these frequently contribute to perceived performance problems.
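To make the "worst intersection point" idea concrete, below is a minimal Python sketch. It is not the presenter's implementation or any standardized formula: application requirements are simplified to (required fraction, latency bound) pairs, the measured latencies are turned into an empirical CDF, and the weakest requirement point determines the score. The "gaming" thresholds and latency sample are invented for illustration only.

```python
"""Toy illustration of a Quality-of-Outcome-style score.

A simplified sketch of the approach described above, not the exact
formula presented: requirements are (required_fraction, latency_bound_ms)
points, the empirical latency CDF is evaluated at each bound, and the
worst-performing point governs the final score.
"""

from bisect import bisect_right


def qo_score(latencies_ms, requirements):
    """Return a 0-100 score; 100 means every requirement point is met.

    latencies_ms  -- measured per-packet latencies in milliseconds
    requirements  -- list of (required_fraction, latency_bound_ms) pairs,
                     e.g. (0.99, 100.0) means "99% of packets within 100 ms"
    """
    sample = sorted(latencies_ms)
    n = len(sample)
    worst = 1.0
    for required_fraction, bound_ms in requirements:
        # Empirical CDF at the bound: fraction of packets within bound_ms.
        achieved = bisect_right(sample, bound_ms) / n
        # Ratio of achieved to required coverage; the score is set by the
        # worst intersection point across all requirements.
        worst = min(worst, achieved / required_fraction)
    return round(100 * min(worst, 1.0), 1)


if __name__ == "__main__":
    # Hypothetical "cloud gaming" requirements: 95% of packets within
    # 60 ms and 99.9% within 120 ms (illustrative numbers only).
    gaming = [(0.95, 60.0), (0.999, 120.0)]
    measured = [18, 22, 25, 30, 35, 41, 48, 55, 62, 140]  # toy sample
    print(f"QO-style score: {qo_score(measured, gaming)}")
```

In a real deployment the latency distribution would be gathered continuously per application flow, and per-segment distributions (Wi-Fi, access network, path to the server) could be scored separately to support the composability point above.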
Decisions and Action Items
- No formal IETF working group decisions were made during this presentation.
- Presenter: Expressed interest in continued engagement with the IPPM working group, including returning to present further findings and learnings.
- IPPM Working Group Chair (Marcus Ihlar): Invited the presenter to share further learnings and findings with the IPPM working group in the future.
Next Steps
- Scaling and UX Refinement: The presenter's organization plans to work out how to scale the QO metric, refine its user interface, and determine the most effective way to present probabilistic scores to end-users (e.g., through user research and testing).
- Large-Scale Data Collection: Collect data and user feedback at scale to further fine-tune the QO thresholds and presentation.
- Collaboration with Application Providers: Explore potential mechanisms for applications and network providers to share information and work together to improve end-user experience (e.g., application-informed network steering).
- Consideration for Public/Independent Tools: Continue to consider the implications and potential for making such quality-of-outcome tools independently or publicly available, given the strong user desire for independent validation.