SuperCodec: A Neural Speech Codec with Selective Back-Projection Network

Updates

Code release. (Jul. 27, 2024)
Online demo on GitHub: see here. (Aug. 13, 2023)
Supports 16-48 kHz at variable bitrates. (Jul. 27, 2024)

In this paper, we present SuperCodec, a neural speech codec that replaces the standard feedforward up- and downsampling layers with Selective Up-sampling Back Projection (SUBP) and Selective Down-sampling Back Projection (SDBP) modules. The proposed method preserves information efficiently while extracting rich features from the lower to the higher layers of the network. Additionally, we propose a selective feature fusion block in the SUBP and SDBP modules to consolidate the input feature maps.
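To make the back-projection idea concrete, here is a minimal sketch of a generic up-projection step: project up, project back down, and correct the result with the up-projected residual. It is not the SUBP module. The learned layers are replaced by fixed interpolation and pair averaging, the selective feature fusion block is omitted, and all function names are hypothetical.

```python
def upsample(x):
    """Double the length by linear interpolation (the last sample is repeated)."""
    out = []
    for i, v in enumerate(x):
        nxt = x[i + 1] if i + 1 < len(x) else v
        out += [v, (v + nxt) / 2]
    return out


def downsample(y):
    """Halve the length by averaging adjacent pairs."""
    return [(y[i] + y[i + 1]) / 2 for i in range(0, len(y), 2)]


def up_projection(x):
    """Upsample, then add back the information lost on the round trip."""
    h0 = upsample(x)                            # first high-resolution estimate
    l0 = downsample(h0)                         # project it back down
    residual = [a - b for a, b in zip(x, l0)]   # what the round trip lost
    return [h + r for h, r in zip(h0, upsample(residual))]


def round_trip_error(x, h):
    """Sum of absolute differences between x and the downsampled h."""
    return sum(abs(a - b) for a, b in zip(x, downsample(h)))


signal = [0.0, 1.0, 4.0, 9.0]
plain = upsample(signal)
corrected = up_projection(signal)
# The corrected estimate is more consistent with the input than plain upsampling.
print(round_trip_error(signal, plain), round_trip_error(signal, corrected))
# prints 2.25 0.5625
```

In a learned module the up and down operators would presumably be trainable layers, so the residual correction is learned rather than fixed as it is here.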


Pre-requisites

Clone this repo: git clone https://github.com/exercise-book-yq/Supercodec.git
Change into the repo directory: cd Supercodec
Install the Python requirements: pip install -r requirements.txt

Training Example

# train
python train.py --config config_v1.json

Inference Example

# inference
python inference.py --checkpoint_file [generator checkpoint file path]

Additional Experiments

Objective evaluation on our VCTK test set at a 16 kHz sampling rate. We compare the proposed method with several existing codecs trained with the same configuration.

Model	Bitrate	ViSQOL	STOI(%)	WARP-Q(↓)
Supercodec	1 kbps	3.118	84.80	2.219
TiCodec	1 kbps	2.490	80.21	2.578
HiFiCodec	1 kbps	2.060	75.19	2.840
EnCodec	1 kbps	2.202	76.53	2.687
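For readers unfamiliar with how a codec bitrate such as 1 kbps arises, the sketch below shows the usual arithmetic for a residual vector quantizer: frames per second times codebooks per frame times bits per codebook index. The hop length, codebook count, and codebook size used here are illustrative assumptions, not SuperCodec's published settings, and `rvq_bitrate_bps` is a hypothetical helper.

```python
import math


def rvq_bitrate_bps(sample_rate_hz, hop_length, n_codebooks, codebook_size):
    """Bitrate in bits per second of a residual-VQ token stream."""
    frames_per_second = sample_rate_hz / hop_length
    bits_per_frame = n_codebooks * math.log2(codebook_size)
    return frames_per_second * bits_per_frame


# Assumed values for illustration only: 16 kHz audio, one frame every
# 320 samples, 2 codebooks of 1024 entries (10 bits each).
print(rvq_bitrate_bps(16000, 320, 2, 1024))  # 1000.0 bits/s, i.e. 1 kbps
```

Under the same assumptions, dropping or adding codebooks changes the bitrate in steps of 500 bits/s, which is one common way such codecs offer variable bitrates.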

Objective evaluation on our VCTK test set at a 24 kHz sampling rate. We compare the proposed method with several existing codecs trained with the same configuration.

Model	Bitrate	ViSQOL	STOI(%)	WARP-Q(↓)
Supercodec	1.5 kbps	3.322	85.61	2.147
TiCodec	1.5 kbps	2.639	79.03	2.539
HiFiCodec	1.5 kbps	2.026	76.80	2.761
EnCodec	1.5 kbps	2.202	79.81	2.569

All models are non-causal and trained on LibriTTS.
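The tables can be checked mechanically. The sketch below copies the reported numbers, picks the best model per metric (lower is better for WARP-Q), and computes the relative ViSQOL margin over the strongest baseline. The helper names are hypothetical; only the numbers come from the tables above.

```python
# Rows copied from the tables above: (ViSQOL, STOI %, WARP-Q).
RESULTS = {
    "16 kHz, 1 kbps": {
        "Supercodec": (3.118, 84.80, 2.219),
        "TiCodec": (2.490, 80.21, 2.578),
        "HiFiCodec": (2.060, 75.19, 2.840),
        "EnCodec": (2.202, 76.53, 2.687),
    },
    "24 kHz, 1.5 kbps": {
        "Supercodec": (3.322, 85.61, 2.147),
        "TiCodec": (2.639, 79.03, 2.539),
        "HiFiCodec": (2.026, 76.80, 2.761),
        "EnCodec": (2.202, 79.81, 2.569),
    },
}

# (metric name, higher_is_better); WARP-Q is a distance, so lower is better.
METRICS = (("ViSQOL", True), ("STOI(%)", True), ("WARP-Q", False))


def best_model(rows, index, higher_is_better):
    """Name of the model with the best value in column `index`."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda name: rows[name][index])


def visqol_margin(rows, model="Supercodec"):
    """Relative ViSQOL gain of `model` over the best other model."""
    best_other = max(v[0] for name, v in rows.items() if name != model)
    return (rows[model][0] - best_other) / best_other


for setting, rows in RESULTS.items():
    winners = {m: best_model(rows, i, hib) for i, (m, hib) in enumerate(METRICS)}
    print(setting, winners, f"ViSQOL margin: {visqol_margin(rows):.1%}")
```

On the reported numbers, Supercodec is the best model on all three metrics in both settings.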

