Session Date/Time: 26 Jul 2024 16:30
mlcodec
Summary
The mlcodec working group met to discuss updates on several drafts, covering the Opus extension mechanism, Deep Redundancy (DRED), and speech coding enhancements. Tim presented a proposal for a new "repeat these extensions" (RTE) extension to reduce overhead. Jean-Marc discussed progress on the DRED draft, focusing on how to specify it and on the inclusion of neural network weights. Jan provided an update on speech coding enhancements, presenting test results and addressing concerns about performance with out-of-domain signals.
Key Discussion Points
- Opus Extension Draft (Tim):
  - Tim presented a new "repeat these extensions" extension (RTE) to reduce overhead, especially for multi-frame packets.
  - Concerns were raised about implementation complexity.
  - The working group agreed that a spec text draft should be provided.
  - Discussion of the extension ID numbering proposal and different ranges for short, long, and structural extensions.
- Opus DRED Draft (Jean-Marc):
  - Discussion of how to specify DRED, including a normative feature decoder.
  - Discussion of the binary weight format, fixed-point vs. floating-point representation, and where the weight file will be published.
  - Proposal not to make the vocoder normative. Options for handling the vocoder were presented, including defining minimal objective metrics.
  - Concerns were raised regarding reproducibility and confidence in the weights.
  - The working group agreed on creating a reproduction method for the model data.
- Speech Coding Enhancements (Jan):
  - Update on Opus 1.5 and its speech enhancement methods.
  - Testing and requirements for speech coding enhancements, including tests on music and double talk.
  - Discussion of potential new datasets to use for testing.
Decisions and Action Items
- Action Item (Tim): Write up spec text for the "repeat these extensions" extension and post it to the mailing list.
- Action Item (Jean-Marc): Draft the DRED specification based on his recommendations, including sparse decoders, and post the draft to the mailing list.
- Action Item (Jean-Marc): Provide information to reproduce the DRED weights, either on the mailing list or in a Git repository.
- Action Item (Chairs): Consult with the AD about how to reference a binary blob of weights in an RFC.
- Action Item (Jan): Conduct further testing on music using a standardized music sample set called "SQAM".
Next Steps
- Tim will provide the spec text for the RTE extension.
- Jean-Marc will provide information to reproduce the DRED training and weights, and will draft the DRED specification.
- The Working Group will evaluate the proposals on the mailing list.
- The chairs will consult with the AD about how to handle the normative reference to the weights.
- Continue discussion on the mailing list.