Opus audio decoding support #5802

Open
vadimkantorov opened this issue Feb 3, 2025 · 4 comments
Labels
question Further information is requested

Comments

@vadimkantorov

vadimkantorov commented Feb 3, 2025

Is the Opus codec (in an Ogg container, typically with the .opus file extension) supported by DALI for audio decoding?

https://docs.nvidia.com/deeplearning/dali/user-guide/docs/examples/audio_processing/audio_decoder.html says that the "procedure can be used for most of the well-known digital audio coding formats as well", and Opus is very popular because WebRTC streams produce it.

https://docs.nvidia.com/deeplearning/dali/user-guide/docs/operations/nvidia.dali.fn.decoders.audio.html says that "It supports the following audio formats: wav, flac and ogg". Does "ogg" here mean the Ogg container format or a specific codec? The wording is a bit ambiguous.

Thank you!


My original use case: using https://github.com/triton-inference-server/dali_backend/ and DALI for performant audio decoding in Triton.

@jantonguirao
Contributor

Yes, DALI should support both OGG Vorbis and OGG Opus for audio decoding. I agree that the wording in the documentation is unclear. Thank you for bringing this to our attention; we will update the documentation to clarify this.

Please let us know if you find any issues with it.
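
For readers who want to check which codec an .ogg or .opus file actually carries before handing it to a decoder, here is a minimal stdlib-only sketch. It is not part of DALI and says nothing about how DALI detects formats; the names `sniff_ogg_codec` and `make_first_page` are hypothetical helpers made up for this illustration. It relies only on the Ogg page layout (a 27-byte header starting with `OggS`, followed by a segment table) and on the well-known magic bytes at the start of the first packet (`OpusHead` for Opus, `\x01vorbis` for Vorbis).

```python
import struct

# Magic bytes found at the start of the first packet of an Ogg stream.
OGG_CODEC_MAGICS = {
    b"OpusHead": "opus",
    b"\x01vorbis": "vorbis",
    b"\x7fFLAC": "flac",
    b"Speex   ": "speex",
}


def sniff_ogg_codec(data: bytes) -> str:
    """Guess the codec from the first Ogg page in `data` (hypothetical helper).

    Returns a codec name, "unknown" for an unrecognised Ogg payload,
    or "not-ogg" if the data does not start with an Ogg page.
    """
    # The fixed part of an Ogg page header is 27 bytes and begins with "OggS".
    if len(data) < 27 or data[:4] != b"OggS":
        return "not-ogg"
    n_segments = data[26]            # byte 26: number of entries in the segment table
    payload_start = 27 + n_segments  # the payload follows the segment table
    payload = data[payload_start:payload_start + 8]
    for magic, name in OGG_CODEC_MAGICS.items():
        if payload.startswith(magic):
            return name
    return "unknown"


def make_first_page(payload: bytes) -> bytes:
    """Build a synthetic single-segment Ogg page for demonstration only.

    The CRC field is left as zero, so this is not a valid file for real decoders.
    """
    assert len(payload) < 255, "single-segment demo page only"
    header = struct.pack(
        "<4sBBqIIIB",
        b"OggS",  # capture pattern
        0,        # stream structure version
        0x02,     # header type: beginning of stream
        0,        # granule position
        1,        # bitstream serial number (arbitrary)
        0,        # page sequence number
        0,        # checksum (not computed in this sketch)
        1,        # number of segments
    )
    return header + bytes([len(payload)]) + payload


if __name__ == "__main__":
    print(sniff_ogg_codec(make_first_page(b"OpusHead" + bytes(11))))  # prints: opus
```

For a real file, reading the first 512 bytes (`open(path, "rb").read(512)`) is enough for this check, since the header plus the largest possible segment table is 282 bytes.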

@jantonguirao
Contributor

jantonguirao commented Feb 3, 2025

Here is the PR for the documentation changes

@jantonguirao jantonguirao added the question Further information is requested label Feb 3, 2025
@vadimkantorov
Author

It may also be worth dropping the vague "most well-known audio codecs" wording.

E.g. are mp3/aac/m4a/mka supported? Even if they are not, it would be better to list all supported codecs and containers explicitly (and maybe even provide a battery of example audio files that are known to decode). If a popular codec is not supported, it would be better to list that explicitly too, as people will have questions about the codec support matrix anyway.

@JanuszL JanuszL added this to the Release_1.47.0 milestone Feb 3, 2025
@vadimkantorov
Author

For the speech commands example docs, maybe it is best to put a hyperlink to the codec support matrix there? If the matrix evolves, this page would otherwise need to be updated as well, and that is easy to forget.


3 participants