SoundStream: An End-to-End Neural Audio Codec

This repository is an implementation of the paper of the same name.

The Residual Vector Quantizer (RVQ) relies on lucidrains' repository.
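
For readers unfamiliar with residual vector quantization, here is a minimal pure-Python sketch of the idea: each codebook quantizes whatever the previous ones left over. It is an illustration only and does not use or mirror the API of the library this repository depends on; the function names (`nearest`, `rvq_encode`, `rvq_decode`) and the toy codebooks are assumptions made for this example.

```python
def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(codebooks, vec):
    """Quantize vec with a cascade of codebooks, each coding the previous residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Whatever this stage could not represent is passed to the next stage.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(codebooks, indices):
    """Sum the selected codewords; passing fewer indices gives a coarser result."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical toy codebooks: two stages, two 2-dimensional codewords each.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # coarse stage
    [[0.0, 0.0], [0.5, 0.5]],  # refinement stage
]
codes = rvq_encode(toy_codebooks, [1.4, 1.6])
reconstruction = rvq_decode(toy_codebooks, codes)
```

Decoding with only the first few indices still yields a valid, coarser reconstruction, which is the property the paper exploits for variable bitrate.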

I built this implementation to serve my own needs, so some features of the original article are missing.

Missing pieces

Denoising: this implementation is not built to denoise, so there is neither a conditioning signal nor any Feature-wise Linear Modulation (FiLM) blocks.
Bitrate scalability: quantizer dropout is not implemented yet.
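
As a pointer for anyone who wants to add the missing bitrate scalability, the sketch below illustrates quantizer dropout as described in the paper: during training, the number of active quantizers is drawn uniformly between 1 and the total, and only the first ones are used. This is not code from this repository. It quantizes a single scalar to stay short, and `quantize_with_dropout` and the toy codebooks are hypothetical names chosen for the example.

```python
import random


def quantize_with_dropout(codebooks, value, rng, training=True):
    """Residual quantization of a scalar using only the first n_q codebooks.

    In training mode n_q is sampled uniformly from 1..len(codebooks);
    otherwise every codebook is used.
    """
    n_q = rng.randint(1, len(codebooks)) if training else len(codebooks)
    residual, quantized, indices = value, 0.0, []
    for codebook in codebooks[:n_q]:
        # Pick the codeword closest to the current residual.
        idx = min(range(len(codebook)), key=lambda i: abs(codebook[i] - residual))
        indices.append(idx)
        quantized += codebook[idx]
        residual -= codebook[idx]
    return quantized, indices


# Hypothetical scalar codebooks, ordered from coarse to fine.
toy_codebooks = [[-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5], [-0.25, 0.0, 0.25]]
rng = random.Random(0)
q_train, idx_train = quantize_with_dropout(toy_codebooks, 0.8, rng, training=True)
q_full, idx_full = quantize_with_dropout(toy_codebooks, 0.8, rng, training=False)
```

Because the dropped quantizers are always the last ones, the indices produced in training are a prefix of the full set, so a single model can be decoded at several bitrates by truncating the list of codes.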

Citations

@misc{zeghidour2021soundstream,
    title   = {SoundStream: An End-to-End Neural Audio Codec},
    author  = {Neil Zeghidour and Alejandro Luebs and Ahmed Omran and Jan Skoglund and Marco Tagliasacchi},
    year    = {2021},
    eprint  = {2107.03312},
    archivePrefix = {arXiv},
    primaryClass = {cs.SD}
}
