Session Recording
Session Date/Time: 21 Jan 2025 10:00
NMRG
Summary
This interim online meeting of the Network Management Research Group (NMRG) focused on the critical role of data in network management, particularly in the context of machine learning. The session featured three technical presentations exploring challenges and solutions related to data descriptors, data quality assessment for intrusion detection systems, and the use of synthetic radio data for environment sensing. A subsequent open discussion aimed to consolidate past efforts and define future research directions for the group concerning data quality, usability, and collection. Key outcomes included a strong interest in developing a draft outlining requirements for data descriptors and exploring common platforms for evaluating data-centric algorithms.
Key Discussion Points
1. Data Descriptors and Topologies in Decentralized Learning (Arash)
- Problem Statement: Current AI model training is largely centralized, leading to challenges with increasing costs (model/data volume, retraining), privacy concerns (especially in Europe), and communication overhead. Decentralized learning (Federated, Gossip, Split Learning) offers a promising alternative.
- Decentralized Learning Challenges: A key challenge is efficiently identifying relevant data nodes for model training. Existing parameters (data size, type, compute/storage capabilities) are insufficient as they neglect data characteristics and model architecture.
- Proposed Solution: Data Descriptors: Introduction of quantitative and qualitative "data descriptors" to characterize data at each node. These descriptors must capture:
- Data Characteristics: Quality (how to define from a model training perspective?), age, dynamics (evolution, rate of change), variance, and relevance (to the model).
- Model Architecture: Model size, capacity, structure, and parameters are crucial for interpreting data measures. The coupling between data and model is non-trivial and often revealed only during training.
- Training Methods: Data dynamics can influence optimal training methods (e.g., faster data variation rate for Federated learning, slower for sequential learning).
- Objective: To enable mindful selection of training data from a global data corpus, leading to better model performance and reduced data transfer.
- Impact of Descriptors: Such descriptors would facilitate the development of data comparison metrics and the creation of "knowledge network data topologies." These topologies, which are model-specific rather than global, could then be used for tasks like node selection, path computation, and addressing catastrophic forgetting.
- Discussion: There was agreement that data descriptors cannot be standalone but must be considered in conjunction with the model that will use the data. The concept extends beyond machine learning to other network functions and automation.
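As a sketch of the idea discussed above, the descriptor fields (quality, age, dynamics, variance, relevance) could be grouped into a per-node record and used for naive node selection. All field names, units, and thresholds below are illustrative assumptions, not a format proposed in the session:

```python
from dataclasses import dataclass

@dataclass
class DataDescriptor:
    """Hypothetical per-node data descriptor (fields are illustrative)."""
    node_id: str
    size: int             # number of samples held at the node
    quality_score: float  # domain-defined quality, in [0, 1]
    age_seconds: float    # time since the data was last refreshed
    change_rate: float    # samples changing per hour (dynamics)
    variance: float       # aggregate feature variance
    relevance: float      # similarity to the target task/model, in [0, 1]

def select_nodes(descriptors, min_quality=0.5, max_age=3600.0):
    """Naive selection: filter by quality and freshness, rank by relevance."""
    eligible = [d for d in descriptors
                if d.quality_score >= min_quality and d.age_seconds <= max_age]
    return sorted(eligible, key=lambda d: d.relevance, reverse=True)
```

As the discussion noted, a realistic version of `relevance` and `quality_score` would have to be computed jointly with the model architecture, not stored as static node attributes.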
2. Evaluating IDS Data Sets within a Systematic Evaluation Framework (Benoit)
- Problem Statement: The effectiveness of ML-based Network Intrusion Detection Systems (NIDS) critically depends on data quality. Many existing NIDS datasets are derived from simulations, potentially unrepresentative of real-world scenarios, and have documented errors (e.g., mislabeling, correlated features). This calls into question the validity of models trained on such data.
- Approach: Proposing a systematic evaluation framework for NIDS datasets based on Measurement Theory. This involves translating abstract quality characteristics (like diversity, realism) into concrete, measurable constructs, followed by assessing their reliability and validity.
- Inspiration: The work builds on insights from "Bad Design Smells" in NIDS datasets, which identified suspicious patterns.
- Focus on Diversity: The presentation highlighted diversity as a key data quality characteristic, crucial for model generalization and preventing overfitting.
- V-Disc Metric: The framework proposes using the "V-Disc" score, a generic measure of diversity inspired by ecology, which quantifies the effective number of unique elements in a dataset. It is flexible, allowing different similarity functions (e.g., cosine, Euclidean) and incorporating sensitivity to rare or frequent elements.
- Application Example: Demonstrated the V-Disc score on a subset of the CICIDS2018 dataset, showing how different feature categories and similarity functions impact the diversity score. The diversity profile (V-Disc vs. sensitivity parameter) can be used to compare and improve datasets.
- Discussion: Clarified that V-Disc provides a quantitative measure of diversity, unlike "repetitive data point" metrics that quantify the absence of diversity. The framework aims to systematically link metrics to defined quality constructs and validate whether they truly measure the intended quality. Computational expense of V-Disc on large datasets was acknowledged. Future work includes correlating diversity measures with model performance for validation.
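The exact V-Disc formula was not given in the minutes; the sketch below implements the closely related similarity-sensitive diversity of Leinster and Cobbold from ecology, which matches the description above: an "effective number of unique elements" with a pluggable similarity function (cosine here) and a sensitivity parameter q (low q emphasizes rare elements, high q frequent ones). Treat it as an assumed stand-in, not the presented metric:

```python
import numpy as np

def diversity_profile(X, qs=(0.0, 1.0, 2.0)):
    """Effective number of distinct elements in X at each sensitivity q.

    X: (n, d) array of feature vectors, one row per data point.
    Uses cosine similarity; q=0 favours rare elements, large q frequent ones.
    """
    n = X.shape[0]
    p = np.full(n, 1.0 / n)                      # uniform point weights
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Z = np.clip(Xn @ Xn.T, 0.0, 1.0)             # similarity matrix in [0, 1]
    zp = Z @ p                                   # "ordinariness" of each point
    out = {}
    for q in qs:
        if np.isclose(q, 1.0):                   # q -> 1 limit: exp of entropy
            out[q] = float(np.exp(-np.sum(p * np.log(zp))))
        else:
            out[q] = float(np.sum(p * zp ** (q - 1)) ** (1.0 / (1.0 - q)))
    return out
```

Plotting the returned values against q gives the diversity profile mentioned above; note the O(n²) similarity matrix, which reflects the computational-cost concern raised in the discussion.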
3. Classification with Synthetic Radio Data for Real-Life Environment Sensing (Marine)
- Motivation: Next-generation mobile networks (5G/6G) leverage radio data for sensing capabilities (e.g., real-life environment sensing). However, collecting and storing large volumes of quality real-world data for ML model training is prohibitively costly and constrained.
- Proposed Solution: Train ML models using synthetic data generated by generative models. The idea is to learn a generative model from real data at a source node, then transfer only the small generative model (or its descriptor) over the network to a remote node, where synthetic data is generated on demand for training. This avoids transferring raw, massive data.
- Use Case: Indoor/Outdoor Detection (IOD): Focused on detecting a user's environment (indoor/outdoor) using unsupervised clustering models based on standard 3GPP radio signals (RSRP, RSRQ).
- Framework: An automatic framework was proposed to learn, select (based on synthetic data quality), and transfer the best generative model. The IOD ML model is then trained exclusively with this generated synthetic data.
- Generative Models Tested: GAN, VAE, V-GAN, and GMM.
- Quality Metrics for Synthetic Data: Jensen-Shannon Divergence and V-Dist for similarity, and PR-gO (Precision-Recall-generated Original) for variability and variance.
- Results: GMM consistently demonstrated superior performance in generating high-quality synthetic data. Importantly, it was shown that training IOD models with synthetic data generated by well-selected generative models can maintain detection performance (F1 score) comparable to training with original raw data, even when the generative model itself was trained on a relatively small subset of the original data. This confirms the potential for significant resource savings.
- Discussion: The primary goal is to maintain (not necessarily improve) performance while achieving substantial resource gains in data collection, transfer, and storage. The approach also offers privacy benefits by avoiding direct transfer of sensitive raw data. The generalizability of this approach to other datasets and scenarios was confirmed, and interest was expressed in integrating this with Arash's concept of data descriptors for node selection.
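A minimal sketch of the workflow described above, using scikit-learn: fit a compact GMM on a small subset of "real" measurements at the source node, transfer only the model, regenerate synthetic data at the remote node, and train the unsupervised IOD model on synthetic data alone. The (RSRP, RSRQ) distributions below are invented stand-ins for illustration, not the data from the presentation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for real (RSRP, RSRQ) measurements in two environments (dB/dBm)
indoor  = rng.normal([-100.0, -15.0], [4.0, 2.0], size=(500, 2))
outdoor = rng.normal([-80.0,  -8.0],  [4.0, 2.0], size=(500, 2))
real = np.vstack([indoor, outdoor])

# Source node: learn a small generative model from a subset of the data
subset = real[rng.choice(len(real), size=200, replace=False)]
gmm = GaussianMixture(n_components=2, random_state=0).fit(subset)

# Remote node: generate synthetic data on demand (no raw data transfer)
synthetic, _ = gmm.sample(1000)

# Train the unsupervised IOD model exclusively on synthetic data
iod = KMeans(n_clusters=2, n_init=10, random_state=0).fit(synthetic)
pred = iod.predict(real)  # apply to real measurements
```

Only the GMM's means, covariances, and weights cross the network, which is the resource saving (and privacy benefit) the presentation emphasized.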
4. Open Discussion: Next Steps for Data Topic in NMRG
- Past Work Review: The chairs summarized past NMRG activities related to data, including workshops on flow-based measurements, discussions on AI challenges for network management, and recent technical talks on data quality. The existing draft on AI challenges already highlights issues like what data to collect, data representation, usability, quality, and compliance.
- Future Research Avenues: Identified potential directions for the group:
- Methods and metrics for assessing data quality for different network management applications.
- Strategies for collecting data with expected quality (quality by design).
- Reviewing the quality of datasets used in research.
- Description of datasets (formats, models, semantics) to match them with AI-based solutions.
- Frameworks and data enablers for network automation.
- Collective Work: The chairs invited proposals for how the group could work together (research papers, white papers, drafts, workshops, interim meetings).
Decisions and Action Items
- Decision: There was general agreement and support for initiating a collective work on defining requirements and objectives for "data descriptors" in network management.
- Action Item: Arash to propose an initial outline for a draft on data descriptors, describing what such descriptors should enable (objective-oriented requirements rather than specific formats). This draft should be socialized on the mailing list and further discussed at IETF 122 in Bangkok.
- Decision (Exploratory): The idea of a common platform or testbed for evaluating data-related algorithms was discussed as a potential future direction.
- Action Item: Arash to provide more details about the platform and vision developed in his lab at IETF 122, which could inform the discussion on a common platform.
Next Steps
- Mailing List Engagement: Participants are encouraged to continue discussions on the NMRG mailing list regarding data descriptors, data quality, and potential collaborative activities.
- IETF 122 (Bangkok): Further discussions on these data-related topics, including updates on the proposed draft and Arash's platform vision, are anticipated.
- Exploration of Platform: Investigate possibilities for a common platform/testbed to evaluate data-centric algorithms, potentially leveraging existing open platforms, shared datasets, and models.
- Supervis Project Learnings: Consider contributing learnings from projects like "Supervis" on general data quality characterization to future NMRG discussions.