**Session Date/Time:** 06 Dec 2022 20:00

# [TOOLS](../wg/tools.html)

## Summary

The TOOLS Working Group interim meeting discussed strategies for consolidating and improving access to IETF artifacts, including RFCs, Internet-Drafts, and meeting proceedings. Key themes included creating a unified online presence (e.g., `docs.ietf.org`), managing archival URLs, enhancing search engine optimization (SEO), ensuring clear status signaling for documents, and supporting both human and programmatic consumption of content. There was a general sense of agreement on the benefits of consolidation and clear status labeling, while acknowledging challenges with legacy links, branding, and consistency for various document types.

## Key Discussion Points

* **Consolidation of IETF Artifacts**:
    * A proposal was made to consolidate all IETF work products (RFCs, Internet-Drafts, meeting proceedings, slides, status changes, etc.) into a single, unified online location (e.g., `docs.ietf.org`).
    * **Pros**: Improved discoverability for external users, better SEO, more consistent user experience, and simpler programmatic linking for external entities.
    * **Concerns**:
        * **Archival URLs**: Strong desire to avoid changing existing archival RFC URLs. A proposed solution was to use redirects from existing RFC Editor URLs to the new consolidated domain, ensuring MIME type consistency for tools that follow redirects.
        * **Branding**: Potential issues with branding different streams (IETF, IRTF, Independent) if all reside under a single `ietf.org` domain, possibly blurring distinctions (e.g., between research and standards).
        * **Document Status**: Risk of confusing users if approved RFCs and provisional Internet-Drafts are hosted on the same domain, particularly if the domain name implies approval (e.g., `rfc-editor.org`). A general sense of those present was that `docs.ietf.org` would be a neutral name.
* **Content-Forward Landing Pages vs. Raw Content**:
    * General agreement that default document views should be content-forward, human-readable landing pages, clearly labeled with document metadata (status, stream, relationships).
    * A need for separate, canonical URLs for "unadulterated archive raw content" was emphasized for machine-based tools and archival integrity (e.g., `docs-raw.ietf.org`).
    * The current RFC Editor practice of not displaying content directly on landing pages was noted as a factor in poor SEO.
* **"Current" Version Pronouns and Document Relationships**:
    * Discussion on providing mechanisms to link to the "most recent" version of a document (e.g., an Internet-Draft).
    * **Concerns**: Distinguishing between guiding users to a newer, obsoleting RFC versus an unrelated, ongoing Internet-Draft. Avoiding automatic redirects that could link an RFC to an unapproved draft.
    * **Proposed Solution**: Utilize clear metadata in a top bar on the document page to indicate relationships (e.g., obsoleted by, updated by) and point to newer versions, rather than automatic redirects for RFCs. For Internet-Drafts, a "latest" pronoun is useful, but the transition from individual to working group drafts can be confusing due to numbering.
* **Document Formats and Presentation Evolution**:
    * Agreement that the default presentation should be the "best human-readable format available" (e.g., HTML from XML), with links to other specific raw formats.
    * Acknowledged that the "best" format for a document at a given URI might change over time (e.g., from `htmlized` text to XML-generated HTML), which humans can adapt to, but machines might struggle with.
    * **References in Documents**: A proposal was made to dynamically alter references within documents when displayed on `docs.ietf.org` to point to the `docs.ietf.org` version of referenced RFCs, while keeping the original archival RFC Editor links in the underlying archival form.
      This approach was met with reservations due to the potential for "not true" URLs and confusion.
* **SEO Optimization vs. Community Workflows**:
    * The importance of SEO was recognized, with `rel=canonical` HTTP headers suggested as a mechanism to direct search engines to the intended archival location (e.g., RFC Editor for RFCs). This would help prevent alternative versions (e.g., `tools.ietf.org` copies) from outranking the authoritative source.
    * A general sense that SEO should be a high priority without unduly constraining internal community workflows.
* **Signaling Internet-Draft Status**:
    * The current situation where `www.ietf.org` hosts individual contributions before they are official working documents was noted as a source of confusion.
    * Placing Internet-Drafts on `docs.ietf.org` could exacerbate this by making them appear more official.
    * **Proposed Solution**: Clear status labeling within a visible top bar on the document page (e.g., "This is an Internet-Draft, not an IETF standard"). The idea of a separate `internet-drafts.org` domain was briefly floated but not pursued, favoring consolidation with clear internal labeling.
* **Programmatic Document Retrieval and APIs**:
    * A distinction was made between human-consumable web pages and machine-consumable data. The human-readable `docs.ietf.org` pages can prioritize human usability, allowing changes in presentation.
    * Programmatic access should be served by a consistent API (e.g., a JSON API, potentially leveraging `RFCxxxx.json` files for metadata) that provides the necessary information for retrievals, rather than relying on strict, unchanging URL naming conventions for human-facing pages.
    * Challenges with legacy programmatic links embedded in documents over decades were acknowledged as a constraint on future changes.
* **Archival Persistence of URLs**:
    * Three classes of URLs were identified:
        1. **Archival URIs**: Intended to be immutable (e.g., published RFCs).
        2. **"Please try not to change" URIs**: Expected to be stable but might evolve (e.g., persistent draft versions).
        3. **"Totally up for grabs" URIs**: Expected to change (e.g., latest version of a draft).
    * A commitment was made to be clear about the status of new URLs and their expected longevity.
* **Bulk Document Retrieval and Mirroring**:
    * Consensus to continue providing services for individuals and organizations to mirror all IETF documents (e.g., via HTTP, rsync).
    * **Explored mechanisms**: Checking documents into a Git repository (e.g., GitHub) was suggested as a potential backend for managing document evolution (especially for smaller artifacts like charters) and as a source for building rsync mirrors. Elasticsearch was mentioned as an alternative for search-focused access. The primary concern for Git was managing static artifacts, not necessarily full document revision control.
* **Document Signing and Timestamping**:
    * Reaffirmed the importance of continuing to sign and timestamp Internet-Drafts and RFCs.
    * Timestamping is critical for establishing prior art in legal contexts. Signing is the most cost-effective way to achieve this.

## Decisions and Action Items

**Decisions Made:**

* Those present generally agreed on consolidating IETF artifacts into a single access point (e.g., `docs.ietf.org`), with appropriate redirects for existing archival RFC URLs.
* Landing pages will be content-forward, human-readable, and clearly labeled with metadata, aiming to improve SEO.
* Separate, canonical URLs for raw, unadulterated document content will be provided for machine consumption.
* Mechanisms for providing the "latest" version of a document will be implemented, using metadata on the document page rather than aggressive redirects for published RFCs.
* Services for bulk document retrieval and mirroring will continue to be provided.
* The practice of signing and timestamping Internet-Drafts and RFCs will continue.
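
One possible reading of the redirect and canonical-link points above is sketched here in Python. The host names `docs.ietf.org` and `docs-raw.ietf.org` were mentioned in the discussion only as examples; the path layouts, the regular expression, and the function names `redirect_target` and `canonical_link_header` are assumptions made for illustration and were not decided at the meeting.

```python
import re
from typing import Optional

# Hosts and path layouts are illustrative assumptions, not decided URL schemes.
LANDING_HOST = "docs.ietf.org"   # consolidated, human-facing landing pages
RAW_HOST = "docs-raw.ietf.org"   # unmodified archival content for tools

_LEGACY_RFC = re.compile(
    r"^https?://www\.rfc-editor\.org/rfc/rfc(\d+)(\.txt|\.html|\.xml)?$"
)


def redirect_target(legacy_url: str) -> Optional[str]:
    """Map a legacy RFC Editor URL to a hypothetical consolidated URL.

    A request that names a concrete format is sent to the raw host with the
    same extension, so a tool following the redirect still receives the
    media type it asked for. A request without an extension is sent to the
    landing page. Unrecognized URLs return None.
    """
    match = _LEGACY_RFC.match(legacy_url)
    if match is None:
        return None
    number, ext = match.group(1), match.group(2)
    if ext:
        return f"https://{RAW_HOST}/rfc/rfc{number}{ext}"
    return f"https://{LANDING_HOST}/rfc/rfc{number}"


def canonical_link_header(archival_url: str) -> str:
    """Build a Link header value that marks archival_url as canonical."""
    return f'<{archival_url}>; rel="canonical"'


print(redirect_target("https://www.rfc-editor.org/rfc/rfc9000.txt"))
# → https://docs-raw.ietf.org/rfc/rfc9000.txt
print(canonical_link_header("https://www.rfc-editor.org/rfc/rfc9000"))
# → <https://www.rfc-editor.org/rfc/rfc9000>; rel="canonical"
```

Sending format-specific requests to the raw host is only one way to keep media types consistent across redirects; a deployment could equally keep them on a single host and rely on content negotiation.
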

**Action Items:**

* The chair will compile and circulate these meeting minutes.
* Participants are encouraged to send any further thoughts or detailed opinions to the tools mailing list.

## Next Steps

* Further discussions are needed to address the details of "findability" (search and navigation aspects) within the consolidated platform.
* Refine the technical architecture for the consolidated platform, considering the balance between human usability, programmatic access, and archival integrity, especially regarding the management of URL consistency and document format evolution.
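
As input to the URL-consistency work, the three URL classes identified under "Archival Persistence of URLs" could be made explicit in tooling. The Python sketch below is illustrative only: the `UrlClass` enum, the `classify` function, and the naming patterns (`rfcNNNN`, `draft-...-NN`, and a bare `draft-...` name for the "latest" pronoun) are assumptions, not an agreed scheme.

```python
import re
from enum import Enum


class UrlClass(Enum):
    ARCHIVAL = "archival; intended to be immutable"
    STABLE = "please try not to change"
    UP_FOR_GRABS = "totally up for grabs; expected to change"


def classify(name: str) -> UrlClass:
    """Classify a document name into one of the three URL classes.

    Assumed patterns: "rfcNNNN" is a published RFC, "draft-...-NN" is a
    specific draft version, and a bare "draft-..." name stands for the
    latest version of that draft. A draft whose base name happens to end
    in "-NN" would be misread as versioned; a real system would need
    metadata rather than pattern matching to tell the two apart.
    """
    if re.fullmatch(r"rfc\d+", name):
        return UrlClass.ARCHIVAL
    if re.fullmatch(r"draft-[a-z0-9-]+-\d{2}", name):
        return UrlClass.STABLE
    if re.fullmatch(r"draft-[a-z0-9-]+", name):
        return UrlClass.UP_FOR_GRABS
    raise ValueError(f"unrecognized document name: {name!r}")


print(classify("rfc9000").name)                       # → ARCHIVAL
print(classify("draft-ietf-quic-transport-34").name)  # → STABLE
print(classify("draft-ietf-quic-transport").name)     # → UP_FOR_GRABS
```
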