Session Date/Time: 04 Nov 2025 14:30
NMRG
Summary
The NMRG session focused on Intent-Based Networking (IBN), with a strong emphasis on the application of Artificial Intelligence, particularly Large Language Models (LLMs) and Generative AI, for network and service management. Discussions covered updates on existing IBN drafts, innovative neurosymbolic approaches, and test-driven development methodologies for network configuration using LLMs, highlighting challenges like natural language ambiguity and LLM hallucinations, alongside proposed solutions for verification and reliability.
Key Discussion Points
- Opening Remarks:
- The session began with reminders about IRTF's adherence to IETF IPR rules, meeting recording policies, privacy, and the code of conduct (referencing RFCs 7154, 7776, and 9775).
- NMRG, as an IRTF research group, focuses on longer-term research issues related to the Internet, distinct from IETF's short-term engineering and standards-making. Research groups publish informational or experimental RFCs.
- The morning session was dedicated to Intent-Based Networking, featuring five presentations: an update on the IBN Use Cases draft, three technical presentations from a recent CNSM conference, and a draft on Generative AI for IBN.
- "Use Cases and Practices for Intent-Based Networking" (Paul Jung):
- Presented updates from version 00 to 02 of the draft, noting that approximately 70% of co-chair Jerome's comments from the adoption call have been addressed.
- New sections include "IBN for green service management" (contributed by Luis) and enhanced discussions on AI agents for IBN, as well as security considerations (insider and outsider attacks).
- Key updates covered:
- Adding references to technologies mentioned in Section 2 (IBN system).
- Partially linking methodologies in Section 2 to use cases.
- Polishing the self-contained description of each use case for better readability.
- Clarifying relationships with RFC 9315 concepts and definitions, including a new subsection on mapping between IBN systems and the intent lifecycle.
- Planned addition of intent classification taxonomy (RFC 9316, Section 3) with a table for solution intent, user type, and network scope.
- Enhancing Section 4 (practice & learning) with more use cases and numerical/colored results.
- Clarifying the difference between policy verification and validation in Section 2.2.
- Expanding intent translator options to include non-graphical user interface tools like LLMs (e.g., FLAN-T5, GPT-3) for prompt learning, and domain-specific languages (DSLs) like NDL and NEMO.
- Enhancing the "on-path telemetry" use case with IBN.
- "Extending Test-Driven Development to Intent-Based Networking with LLMs" (Davide):
- Proposed applying Test-Driven Development (TDD) principles from software engineering to IBN, leveraging LLMs for network configuration.
- Highlighted LLMs' ability to generate code but also their susceptibility to "hallucinations" (producing incorrect or non-functional outputs), especially with complex tasks (e.g., IPv6 regex parsing).
- Introduced a test-feedback loop system where tests are designed before configuration generation.
- Architecture: User provides intent -> LLM generates configuration (e.g., for an ONOS SDN controller) -> a policy verifier (Technicium) tests the configuration against predefined policies -> if tests fail, feedback is provided to the AI for re-generation. A human remains in the loop for initial requirements and final configuration acceptance.
- Evaluated with three test cases: blocking HTTP traffic, allowing all traffic, and enforcing waypoints (traffic from A to C must pass through B).
- Results showed high consistency (over 90%) in LLM output, a 50-60% success rate for LLM interaction with ONOS, and about 50% of the generated configurations passing the tests.
- Discussion covered the need to formalize roles in prompt/test development and the use of a minimal formal language for policy expression in Technicium (e.g., reachability, waypoints).
- The TDD approach helps prevent hallucinations by providing immediate validation; policy conflicts are identified as verification failures.
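The test-feedback loop described above can be sketched as follows. This is a toy illustration, not the presenters' code: the LLM and the policy verifier are stubbed, and all function names are invented for the example.

```python
def llm_generate(intent, feedback=None):
    """Stub for the LLM: returns a candidate configuration for the intent;
    given test feedback, it 'repairs' the config (here, flipping the action)."""
    if feedback:
        return {"rules": [{"match": "tcp/80", "action": "deny"}]}
    return {"rules": [{"match": "tcp/80", "action": "allow"}]}

def verify_policies(config, policies):
    """Stub policy verifier: returns the list of policies the config fails."""
    return [p for p in policies if p not in config["rules"]]

def tdd_configure(intent, policies, max_rounds=3):
    """TDD style: the tests (policies) exist before generation; failures
    are fed back to the model for re-generation."""
    feedback = None
    for _ in range(max_rounds):
        config = llm_generate(intent, feedback)
        failures = verify_policies(config, policies)
        if not failures:
            return config          # a human still accepts the final config
        feedback = f"failed policies: {failures}"
    return None                    # escalate to the human in the loop

result = tdd_configure("block HTTP traffic",
                       [{"match": "tcp/80", "action": "deny"}])
```

In this sketch the first generation fails the predefined test and the second, feedback-driven attempt passes, mirroring how the loop turns hallucinations into caught verification failures.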
- "Large Language Models in Network Management" (Anwar Lakimi):
- Addressed the challenge of manual, complex SDN configuration, proposing natural language as a simplified control layer through an LLM agent interacting with SDN controllers via REST APIs.
- Acknowledged LLM capabilities (understanding complex patterns, generating human text) but also challenges: natural language ambiguity and unreliable generation (hallucinations).
- Presented a pipeline to tackle these:
- Context Injection: Automatically adds static, required configuration parameters (e.g., SDN controller type, OpenFlow version, data path ID) to the user's natural language prompt.
- Intent Regulation (Chat with LLM): Allows users to interactively refine the LLM-generated YAML configuration until ambiguities are resolved and the user is satisfied.
- Retrieval Augmented Generation (RAG): Retrieves relevant API endpoints and JSON fields from a vector database to enrich the LLM's context.
- JSON Payload Generation & Execution: The LLM generates a JSON payload for an API request to the SDN controller (e.g., Ryu).
- Flow Table Verification: An LLM checks the SDN controller's flow table to confirm successful installation.
- Error Recovery: If an API call fails, a dedicated LLM attempts to correct the JSON payload and retries the configuration up to three times.
- Evaluation used Mininet and the Ryu controller, with a complexity score based on match/action fields. Experiments with GPT-4o and Google Gemini Flash showed up to 96% accuracy for simple flow rules, with accuracy decreasing as complexity increased.
- An ablation study indicated the pipeline's resilience, with individual components contributing to overall performance.
- Future work includes multi-domain (e.g., optical) and multi-controller setups, exploring more LLM models (especially open-source), and addressing the challenge of generating large, high-quality datasets for fine-tuning LLMs.
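The context-injection and error-recovery stages of the pipeline above can be sketched as follows. The controller call and both model calls are stubs; the parameter names and payload fields are illustrative, not Ryu's actual API.

```python
STATIC_CONTEXT = {             # parameters the user should not have to state
    "controller": "Ryu",
    "openflow_version": "1.3",
    "datapath_id": "0000000000000001",
}

def inject_context(user_prompt):
    """Context injection: prepend the required static parameters."""
    ctx = ", ".join(f"{k}={v}" for k, v in STATIC_CONTEXT.items())
    return f"[context: {ctx}] {user_prompt}"

def send_to_controller(payload):
    """Stub REST call: rejects payloads with missing required fields."""
    missing = {"dpid", "match", "actions"} - payload.keys()
    if missing:
        return False, f"missing fields: {sorted(missing)}"
    return True, "ok"

def configure(user_prompt, llm, max_retries=3):
    """Generate a JSON payload; on API failure, a dedicated repair step
    retries with the error message, up to three times."""
    prompt, error = inject_context(user_prompt), None
    for _ in range(max_retries):
        payload = llm(prompt, error)
        ok, msg = send_to_controller(payload)
        if ok:
            return payload
        error = msg
    return None

def toy_llm(prompt, error):
    """Stand-in model: forgets 'actions' first, adds it when corrected."""
    flow = {"dpid": "0000000000000001", "match": {"tcp_dst": 80}}
    if error:
        flow["actions"] = []   # empty action list, i.e., a drop rule
    return flow

flow_rule = configure("block HTTP traffic to the web server", toy_llm)
```

The first attempt fails validation for the missing field, and the retry with error feedback succeeds, which is the essence of the error-recovery stage.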
- "Neurosymbolic Approach for Intent-Based Service Management" (Lorenzo Colombo):
- Proposed a neurosymbolic approach for Zero Touch Network and Service Management (ZSM), combining the flexibility of LLMs with the explainability and determinism of symbolic AI.
- Acknowledged LLMs' strengths in text understanding but also their non-deterministic nature and proneness to hallucinations, contrasting with symbolic AI's reliability but limited flexibility.
- Architecture:
- Natural Language to JSON: An LLM translates natural language intent into a structured JSON file.
- Syntax Checker: Validates the JSON for correct syntax, required keys, appropriate data types, and legal values. If invalid, the LLM is re-prompted with the syntax error (up to three "shots").
- JSON to ASP Facts: Valid JSON is parsed into Answer Set Programming (ASP) facts, a declarative programming language for search and optimization.
- ASP Solver (Clingo): Computes all "stable models" (answer sets), representing possible service deployment solutions, using facts from the intent and current cluster metrics (e.g., from Kubernetes).
- Experiments:
- LLM Validation: A small dataset of 100 simple intents was created (due to lack of public datasets). Few-shot prompting (re-prompting with syntax errors) improved accuracy. Gemma 2B (2.7 billion parameters) achieved near-perfect accuracy after two shots, with a mean inference time of less than 3 seconds.
- ASP Solver Performance: Clingo's inference time was measured, showing acceptable performance (under 5 seconds) even with 30 clusters, particularly with multi-threaded execution.
- The current LLM validation focused solely on syntax correctness, and the ambiguity feedback loop is still theoretical.
- Future plans include expanding reasoning for more complex deployment rules, incorporating optimization functions (e.g., multi-objective optimization, reinforcement learning), testing the full architecture, and expanding the dataset.
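The syntax-check/re-prompt loop and the JSON-to-ASP step above can be sketched as follows. The intent schema and fact names are invented for this example; the actual system feeds such facts, plus cluster metrics, to the Clingo solver.

```python
import json

SCHEMA = {"service": str, "replicas": int, "max_latency_ms": int}

def check_syntax(text):
    """Return (intent, None) if valid, else (None, error) for re-prompting."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc}"
    for key, typ in SCHEMA.items():
        if key not in obj:
            return None, f"missing key: {key}"
        if not isinstance(obj[key], typ):
            return None, f"key {key!r} must be of type {typ.__name__}"
    return obj, None

def to_asp_facts(intent):
    """Parse a validated intent into ASP facts for the solver."""
    return [f'service("{intent["service"]}").',
            f'replicas({intent["replicas"]}).',
            f'max_latency({intent["max_latency_ms"]}).']

def translate(nl_intent, llm, shots=3):
    """Re-prompt the LLM with the syntax error, up to `shots` attempts."""
    error = None
    for _ in range(shots):
        obj, error = check_syntax(llm(nl_intent, error))
        if obj is not None:
            return to_asp_facts(obj)
    raise ValueError(f"intent rejected after {shots} shots: {error}")

def toy_llm(nl_intent, error):
    """Stand-in model: emits broken JSON first, fixes it when corrected."""
    if error is None:
        return '{"service": "cdn", "replicas": "three"}'
    return '{"service": "cdn", "replicas": 3, "max_latency_ms": 50}'

facts = translate("deploy a CDN with 3 replicas under 50 ms", toy_llm)
```

As in the reported experiments, the second "shot" (carrying the syntax error back to the model) produces a valid intent, which is then deterministically compiled into facts.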
- "Generative AI for Intent-Based Networking" (Giuseppe Fioccola):
- Provided an update on the draft, a collaborative effort with the University of Cagliari, CNR, and Serious Technologies.
- Scope: To describe how to specialize AI models, particularly using transfer learning techniques, to create generative models for IBN.
- Highlighted Low-Rank Adaptation (LoRA) as an example for scalable transfer learning, allowing fine-tuning with significantly fewer parameters, thus reducing storage and computational load.
- Introduced the concept of an "Adapter Hub" for storing, indexing, and sharing different LoRA adapters, promoting modularity and reusability.
- Described "Adapter Flow" for combining multiple adapters to form composite models for complex intents.
- Outlined a life cycle for these models: generation (fine-tuning), evaluation (accuracy, latency, resources), and deployment, including feedback loops for continuous adaptation using network telemetry.
- Logical architecture: Intent reception -> model fusion and composition (querying the Adapter Hub) -> composite model deployment -> telemetry feedback -> possible re-specialization.
- The latest version includes a new section on "network digital twin AI-enabled IBN architecture," where digital twins can abstract service requirements, integrate top-down/bottom-up feedback, and continuously learn to validate intents before deployment.
- The draft is not limited to LoRA but uses it as a current implementation example.
- Future updates will include detailed technical formulations, drawing from a paper currently submitted to Infocom. IETF/3GPP data models (XML, YAML, JSON) and knowledge graphs for computational policy are also under consideration for future work.
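The parameter savings behind LoRA's "significantly fewer parameters" claim can be made concrete with back-of-the-envelope arithmetic: instead of updating a full d x k weight matrix, LoRA trains two low-rank factors B (d x r) and A (r x k) with r much smaller than d and k. The sizes below are illustrative, not from the draft.

```python
def full_params(d, k):
    """Trainable weights when fine-tuning one full d x k matrix."""
    return d * k

def lora_params(d, k, r):
    """Trainable weights for the low-rank factors B (d x r) and A (r x k)."""
    return r * (d + k)

d = k = 4096          # hidden size of one projection matrix (illustrative)
r = 8                 # LoRA rank

full = full_params(d, k)        # 16,777,216 weights to train
lora = lora_params(d, k, r)     # 65,536 weights to train
reduction = full / lora         # 256x fewer parameters per adapted matrix
```

This is also why an Adapter Hub is attractive: each stored adapter is a few hundred times smaller than the matrix it specializes, making indexing and sharing cheap.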
Decisions and Action Items
- Paul Jung: Will continue to address the remaining comments on the "Use Cases and Practices for Intent-Based Networking" draft (draft-jung-nmrg-ibn-use-cases-practices), aiming for an IRTF last call by the July Vienna meeting.
- Giuseppe Fioccola: Plans to provide further technical details and updates to the "Generative AI for Intent-Based Networking" draft (draft-fioccola-nmrg-genai-ibn) after the associated paper is published in Infocom.
Next Steps
- NMRG Chairs: Convene the second NMRG session in the afternoon.
- Paul Jung: Incorporate outstanding comments into the IBN use cases draft and prepare for the IRTF last call.
- Davide: Continue work on formalizing the proposed test-driven development approach, particularly clarifying roles in test and prompt development, and exploring LLM-based test generation.
- Anwar Lakimi: Focus on extending the LLM-based network management pipeline to multi-domain (e.g., optical) and multi-controller setups, and investigate the use of various LLM models, including open-source options. Explore strategies for generating synthetic datasets to support fine-tuning efforts.
- Lorenzo Colombo: Expand the system's reasoning capabilities to manage more complex deployment rules, implement optimization functionalities (e.g., multi-objective algorithms, reinforcement learning), and integrate the theoretical feedback loop for ambiguity resolution. The overall architecture will be tested, and the dataset expanded.
- Giuseppe Fioccola: Await publication of the Infocom paper to integrate its technical formulations into the Generative AI for IBN draft, and continue welcoming comments.
Session Date/Time: 04 Nov 2025 22:00
NMRG
Summary
The second session of NMRG featured a rich set of presentations and discussions centered on Network Digital Twins (NDTs), Agentic AI, Large Language Models (LLMs) in network management, and authorization policy sharing. Presentations covered a visualization tool for NDTs, an NDT-based architecture for AI-driven operations, a data and agent-aware network framework for AI training, problem statements for agentic AI in network management, semantic routing for LLMs, an LLM-assisted human-in-the-loop management framework, and a model for distributed authorization policy sharing. The session concluded with a discussion on efficient data indexing for YANG-Push to message brokers and its relevance to NDT scalability. Recurring themes included the challenges of scalability, security (especially agent reliability and data privacy), and the need for standardized, semantic interoperability across various layers of an AI-driven network.
Key Discussion Points
- Network Digital Twin Visualization (Felix):
- Presented DixieWIS, a real-time, multi-vendor, fine-grained visualization tool for NDTs, utilizing gNMI data.
- Key features include link colorization, bandwidth overlay, and a "time machine" for recalling historical events with granular detail.
- Noted the use of traffic shaping to mitigate bandwidth constraints inherent in NDT virtualization, proportionally scaling traffic to preserve network characteristics.
- Benefits: Enhanced operator cognition for application traffic engineering, "what-if" analysis, stability testing, and policy validation.
- Scalability Concerns: Python deque for historical data (120 snapshots/60 seconds) does not scale well; proposed Apache Kafka and time-series databases for long-term storage and improved performance. Discussed integrating compression, deduplication, and flow data (e.g., sFlow), along with tagging/thresholds for event highlighting.
- Discussion:
- A participant raised concerns about the overall usability vs. complexity of NDTs, questioning if the advantages justify the data collection, processing, and resource overhead. Felix clarified that NDTs are often used to abstract parts or characteristics of a real network rather than a complete, full-fidelity replica.
- The traffic shaping approach was clarified as dynamic, scaling traffic to a suitable fraction of the NDT's maximum capable throughput (e.g., 7 Mbps on an 80 Mbps capable virtual device) while preserving network characteristics.
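The proportional scaling just described can be sketched as simple arithmetic: per-link utilization is preserved while absolute rates shrink to a target the virtualized NDT can carry. The function and the numbers are illustrative, not from DixieWIS.

```python
def scale_rates(real_rates_mbps, real_capacity_mbps, ndt_target_mbps):
    """Shrink real traffic onto the NDT (e.g., to 7 Mbps of an 80 Mbps-capable
    virtual device) while preserving the ratios between links."""
    factor = ndt_target_mbps / real_capacity_mbps
    return [round(r * factor, 3) for r in real_rates_mbps]

# Three links on a 1 Gbps real network, mapped to a 7 Mbps NDT budget:
scaled = scale_rates([400.0, 200.0, 100.0],
                     real_capacity_mbps=1000.0,
                     ndt_target_mbps=7.0)
# The 4:2:1 ratio between links survives the scaling.
```

Keeping the ratios intact is what lets "what-if" analysis on the twin remain meaningful despite the bandwidth constraints of virtualization.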
- NDT-based Architecture for AI-driven Network Operations (Sheng):
- Introduced an architecture integrating NDTs with agentic AI to enable AI-driven network operations, highlighting a shift towards a proactive operational lifecycle.
- The core autonomous domain comprises Network Digital Twin, Network Agent (defined as LLM + plan + memory + tooling + action), and a Knowledge Base, forming a closed-loop system.
- Updates to the draft include sections on security, technology, clarification of knowledge and tooling relationships, AI agent characteristics, and collaboration between small and large AI models.
- Next Steps: Further exploration of agent use cases, interaction logic between agents, knowledge base, and tools, and enhancing agent security mechanisms.
- Data and Agent-Aware Training Network (DITN) (Hisham):
- Proposed a framework for an "intelligent, multi-plane network" specifically designed to meet the unique demands of AI systems (training, inference, agentic interaction), as traditional networks are not suited for distributed data handling.
- Key Components:
- MTRCE (Model Training and Root Compute Engine): An intelligent entity that takes a model, identifies suitable distributed data using "data descriptors," plans training, and manages model distribution.
- DRRT (Data and Reachability Topology): Maps network information to data needs, enabling the selection of appropriate datasets.
- Presented experiments in continual learning using neural architecture search-inspired (NWT) and Fisher algorithms, utilizing mini-batches as initial data descriptors, showing improved performance over random data selection.
- Research Questions: Explored challenges in defining data descriptors (beyond revealing samples), generalizing MTRCE algorithms across different learning frameworks (e.g., federated learning, knowledge distillation), and building effective DRRTs.
- Discussion:
- A participant questioned the framework's integration with existing transport layer issues and how it addresses agent-related security concerns like hallucinations. Hisham clarified DITN as an overlay but requiring underlay integration for agent and data discovery. He also suggested that DITN's intelligent components could themselves be implemented as agents, leveraging agentic protocols.
- It was emphasized that data descriptors and data topology are priority areas, particularly for enabling data governance without revealing sensitive information, potentially attracting private datasets for training.
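The data-descriptor idea discussed above can be illustrated with a toy selection routine: each site publishes a compact descriptor (here, a label histogram) instead of revealing samples, and the training planner picks the sites whose descriptors best cover what the model still needs. All names and the greedy heuristic are invented for this sketch, not the MTRCE algorithms from the talk.

```python
def coverage_gain(needed, descriptor):
    """How much of the still-needed label mass this dataset can supply."""
    return sum(min(needed.get(lbl, 0), cnt) for lbl, cnt in descriptor.items())

def plan_training(needed, site_descriptors, k=2):
    """Greedy pick of the k sites with the largest coverage gain."""
    ranked = sorted(site_descriptors.items(),
                    key=lambda item: coverage_gain(needed, item[1]),
                    reverse=True)
    return [site for site, _ in ranked[:k]]

sites = {  # descriptors published into the data/reachability topology
    "site-a": {"cat": 900, "dog": 10},
    "site-b": {"dog": 500, "bird": 400},
    "site-c": {"cat": 50, "dog": 40},
}
picked = plan_training({"dog": 600, "bird": 300}, sites)
```

Even this crude histogram descriptor beats random selection for the stated need, which is the governance-friendly property the discussion emphasized: useful selection without exposing samples.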
- Motivations and Problem Statement of Agent AI for Network Management (Yongshan):
- Presented an early-stage draft aiming to identify if agentic AI is suitable for network management and what problems it addresses, rather than proposing solutions.
- Highlighted six problems with existing technologies: architectural bottlenecks due to centralization, absence of agent-to-agent (A2A) semantic interoperability, lack of dynamic trust, real-time data validity issues, rigidity of IBN intent translation engines, and oversimplification of Autonomic Service Agents (ASAs).
- Defined five objectives for agentic AI in network management: autonomous operation, intelligent resource orchestration, predictive security, novel network service models, and autonomous fidelity/action awareness.
- Discussion:
- Semantic interoperability in A2A was clarified as going beyond syntactic compatibility to ensure shared understanding.
- Concerns about "who watches the watchman" regarding agent security (hallucinations, disobedience) were raised, with the presenter acknowledging it as a complex problem for security experts to address collaboratively.
- It was noted that many of the proposed objectives were general to automation and autonomous networking, suggesting that Agent AI might be viewed more as a solution approach to existing challenges rather than generating entirely new objectives.
- VLM Semantic Router for LLM Network Access (Huang Min):
- Presented a two-part proposal for a "Semantic Router" to facilitate LLM access, addressing vendor lock-in, lack of standardization for LLM API metadata, security awareness, and performance/cost trade-offs.
- Part 1 (Provider Side): An analyzer/semantic router classifies LLM prompt content (e.g., category, security sensitivity, complexity) using an LLM itself. This classification is then encoded into standardized HTTP headers. Downstream providers or routing intermediaries use these headers for intelligent routing and model selection, optimizing for security, performance, and cost.
- Part 2 (Application Side): Focuses on negotiation between application providers and the gateway. Proposed "auto" parameters (e.g., auto for model, tools, and reasoning mode) and application-facing extension headers. This aims for vendor neutrality and seamless upgrades by allowing the gateway to dynamically select the best LLM model and capabilities.
- Discussion:
- Significant security concerns were raised regarding the trust required for the semantic router (especially if it handles sensitive data like PII) and the susceptibility of HTTP headers to various types of injection attacks (both traditional and AI-specific prompt injections). The presenter acknowledged these concerns and indicated a willingness to revise the draft to incorporate more robust security considerations.
- It was suggested that locating the router more locally (e.g., at the enterprise edge as an egress gateway) could mitigate some injection risks and facilitate local policy decisions regarding data exposure.
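The Part 1 flow above (classify the prompt, encode the result in headers, route on them) can be sketched as follows. The header names, categories, and the keyword classifier standing in for the LLM analyzer are all invented for this example, not taken from the draft.

```python
def classify(prompt):
    """Stand-in for the LLM-based analyzer (keyword rules, not a model)."""
    sensitive = any(w in prompt.lower() for w in ("ssn", "password", "patient"))
    return {"category": "code" if "def " in prompt else "general",
            "sensitivity": "pii" if sensitive else "none",
            "complexity": "high" if len(prompt.split()) > 50 else "low"}

def to_headers(cls):
    """Encode the classification into hypothetical extension HTTP headers."""
    return {"X-LLM-Category": cls["category"],
            "X-LLM-Sensitivity": cls["sensitivity"],
            "X-LLM-Complexity": cls["complexity"]}

def route(headers, models):
    """Gateway policy: PII never leaves the premises; otherwise trade
    cost against complexity when selecting a model."""
    if headers["X-LLM-Sensitivity"] == "pii":
        return models["local"]
    if headers["X-LLM-Complexity"] == "high":
        return models["large"]
    return models["small"]

models = {"local": "on-prem-model", "small": "cheap-hosted", "large": "frontier"}
choice = route(to_headers(classify("summarize this patient record")), models)
```

The "PII stays local" branch also illustrates the mitigation raised in the discussion: an egress gateway at the enterprise edge can enforce data-exposure policy before any header crosses a trust boundary.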
- LLM-assisted Network Management with Human-in-the-Loop (LLM-HiL) (Ming Che):
- Introduced a framework for LLM agents to assist network management while explicitly keeping human operators in the loop, recognizing that human participation will remain crucial even in highly autonomous networks.
- The framework comprises an enhanced telemetry model (injecting semantics), an LLM Agent Decision Model (for task execution and configuration generation with access control), and the Human Operator.
- Updates: Detailed discussions on task agent communication (a Model Context Protocol (MCP) for tool invocation, and an Agent-to-Agent (A2A) protocol, treating humans as special agents) and a task agent lifecycle (creation, updates, deletion).
- An implementation in a simple network environment with four basic agents (including intent understanding, policy/configuration generation, and resource evaluation) using the LangGraph framework was mentioned.
- Discussion: Security issues related to A2A protocols were noted as an area for further discussion and inclusion in the draft.
- Model for Distributed Authorization Policy Sharing (Lucia):
- Addressed the need for dynamic, context-aware authorization policies in distributed, automated systems, given the fragmentation of existing policy languages and tools.
- Proposed a unifying model using YAML as a canonical representation for policies, treating them "as code" to allow for fine granularity (even at the data level) and dynamic adaptation to context (topology, risk, request info).
- The framework aims to cover the entire policy lifecycle.
- Discussed requirements for distributed policy management: granularity, context awareness, token alignment, lifecycle control, and interoperability.
- Explored the transformation of operator "intents" into enforceable policy code, requiring a shared semantic layer (actor, action, context) and mechanisms for trustability in this translation.
- Discussion: Clarified the role of a Policy Administration Point (PAP) in validating YAML policy schemas, extracting policy code, and distributing it to Policy Decision Points (PDPs), which then make authorization decisions.
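The PAP/PDP split clarified above can be sketched as follows. The policy is shown as the dict a YAML loader would produce from the canonical representation; the schema, field names, and decision logic are invented for this example.

```python
POLICY = {  # as loaded from the canonical YAML policy-as-code representation
    "id": "db-read-lowrisk",
    "actor": {"role": "analyst"},
    "action": "read",
    "resource": "customer-db",
    "context": {"max_risk": 3},
    "effect": "permit",
}

def pap_validate(policy):
    """PAP: validate the policy schema before distributing it to PDPs."""
    required = {"id", "actor", "action", "resource", "context", "effect"}
    missing = required - policy.keys()
    if missing:
        raise ValueError(f"invalid policy, missing: {sorted(missing)}")
    return policy

def pdp_decide(policy, request):
    """PDP: context-aware authorization decision for a single request."""
    if (request["role"] == policy["actor"]["role"]
            and request["action"] == policy["action"]
            and request["resource"] == policy["resource"]
            and request["risk"] <= policy["context"]["max_risk"]):
        return policy["effect"]
    return "deny"

decision = pdp_decide(pap_validate(POLICY),
                      {"role": "analyst", "action": "read",
                       "resource": "customer-db", "risk": 2})
```

The risk field in the request is what makes the decision context-aware: the same actor and action are denied when the contextual risk score exceeds the policy's threshold.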
- Message Keying for YANG-Push to Message Broker (Thomas):
- Presented a third document (complementing existing NEMOP drafts on architecture and schema) focused on efficient data taxonomy and indexing for YANG-Push to message brokers, allowing SQL-like querying without direct network access.
- Detailed how YANG-Push subscription IDs (local) are mapped to network-significant schema IDs, enabling efficient data organization within message broker topics (using schema name + subscription type) and message keys for indexing.
- Benefits: Allows consumption of specific data subsets (e.g., IETF interfaces for certain nodes), topic compaction, and streamlined discovery via a stream catalog.
- Questions for NMRG:
- Does this solution address the "large-scale challenge" of data acquisition and storage complexity in the NDT Architecture draft?
- Should the NDT Architecture's data collection section be revised to include data mesh integration, with YANG-Push to message broker as a proposed solution for organizing and maintaining data?
- Discussion:
- A participant from the NDT architecture group indicated existing individual drafts on data collection methods and data generation/optimization using AI/LLM for complexity.
- It was acknowledged that while these address collection, the integration, organization, and maintenance of data within a big data context for NDTs still represents a "long way to go."
- The general sense was that the proposed solution offers a possible and scalable approach for data collection and indexing. It was suggested that the NDT Architecture document might benefit from focusing more on requirements for data handling, allowing such solutions to be evaluated against those needs. The NDT Architecture document is expected to have another last call, inviting further input.
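The keying scheme summarized above can be sketched as follows: a node-local subscription ID is resolved to a network-wide schema ID, messages land in a topic named from schema plus subscription type, and the message key lets consumers (and topic compaction) select per-node, per-schema subsets. The field layout and registry are illustrative, not the draft's wire format.

```python
SCHEMA_REGISTRY = {  # (node, local subscription id) -> network-wide schema id
    ("pe1", 17): "ietf-interfaces",
    ("pe2", 42): "ietf-interfaces",
}

def topic_name(schema_id, subscription_type):
    """Topic = schema name + subscription type, e.g. ietf-interfaces.periodic."""
    return f"{schema_id}.{subscription_type}"

def message_key(node, schema_id):
    """Message key used for indexing and topic compaction."""
    return f"{node}/{schema_id}"

def publish(node, sub_id, subscription_type, payload, broker):
    """Resolve the local subscription ID, then place the update in the broker
    (modeled here as a nested dict: topic -> key -> latest payload)."""
    schema_id = SCHEMA_REGISTRY[(node, sub_id)]
    topic = topic_name(schema_id, subscription_type)
    broker.setdefault(topic, {})[message_key(node, schema_id)] = payload

broker = {}
publish("pe1", 17, "periodic", {"eth0": "up"}, broker)
publish("pe2", 42, "periodic", {"eth0": "down"}, broker)
# A consumer interested only in IETF interfaces for certain nodes reads one
# topic and filters by message key, with no direct network access needed.
```

Keeping only the latest payload per key, as the dict does here, mimics the topic-compaction benefit mentioned in the presentation.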
Decisions and Action Items
- NDT Architecture Draft: All interested parties are encouraged to provide input during the upcoming last call for the NDT Architecture document, especially concerning the requirements for data integration, organization, and maintenance.
Next Steps
- DixieWIS (Felix):
- Replace Python deque with Kafka and a time-series database for improved scalability, performance, and longer historical data recall.
- Integrate compression and deduplication algorithms for efficient data storage.
- Augment interface data with flow data (e.g., sFlow) for deeper application traffic insights.
- Implement a tagging feature with thresholds/triggers for automatic highlighting of events of interest.
- NDT-based AI Operations (Sheng):
- Continue developing use cases for network agents and their interaction logic with knowledge bases and tools.
- Enhance agent security mechanisms within the proposed architecture.
- DITN (Hisham):
- Seek feedback and collaboration from the community on defining more advanced "data descriptors" and developing MTRCE algorithms, with priority on DRRT (Data and Reachability Topology) and data descriptors.
- Agent AI for Network Management (Yongshan):
- Bring more implementation examples and practical use cases to future presentations.
- Refine the stated objectives to clearly distinguish between general automation goals and specific objectives derived from Agent AI.
- VLM Semantic Router (Huang Min):
- Revise the draft to incorporate comprehensive security considerations, particularly regarding trust in the router and mitigating HTTP header/prompt injection attacks.
- Collaborate on addressing redundancy and failover mechanisms for gateway deployments.
- LLM-HiL (Ming Che):
- Add more detailed discussion on security issues related to Agent-to-Agent (A2A) protocols within the draft.
- Release the code for the LangGraph-based multi-agent implementation.
- Distributed Authorization Policy Sharing (Lucia):
- Solicit feedback on the requirements and scope of the proposed model.
- Explore integration with existing intent-based networking (IBN) efforts, including alignment with intent translation drafts and mechanisms for policy exchange.
- Provide references for the Policy Administration Point (PAP) concept in the draft.
- Message Keying for YANG-Push (Thomas):
- Review existing NMRG drafts on NDT data collection, generation, and optimization to identify potential alignment or complementary aspects.
- Advocate for the NDT Architecture document to focus more on abstract data handling requirements, against which various solutions (including YANG-Push to message broker) can be evaluated.