This is a restructured and rewritten version of bshall/UniversalVocoding. The main difference is that the model is exported as a TorchScript module during training, so it can be loaded for inference anywhere without Python dependencies.
Since the pretrained models are exported as TorchScript, you can load a trained model anywhere. You can also generate multiple waveforms in parallel, e.g.
```python
import torch

vocoder = torch.jit.load("vocoder.pt")
mels = [
    torch.randn(100, 80),
    torch.randn(200, 80),
    torch.randn(300, 80),
]  # (length, mel_dim)
with torch.no_grad():
    wavs = vocoder.generate(mels)
```
Empirically, with the default architecture you can generate around 30 waveforms at the same time on a GTX 1080 Ti.
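If you have more utterances than fit in a single batch, you can chunk the list and save the results as you go. Below is a minimal sketch, assuming `generate` returns one 1-D waveform tensor per input mel and that the sample rate matches your preprocessing settings (16 kHz here is an assumption; check your config):

```python
import torch
import torchaudio

vocoder = torch.jit.load("vocoder.pt")
mels = [torch.randn(100 + 10 * i, 80) for i in range(100)]  # dummy inputs

batch_size = 30  # roughly what fits on a GTX 1080 Ti with the default model
sample_rate = 16000  # assumption: use the rate from your preprocessing config

with torch.no_grad():
    for start in range(0, len(mels), batch_size):
        wavs = vocoder.generate(mels[start:start + batch_size])
        for offset, wav in enumerate(wavs):
            # torchaudio.save expects a (channels, samples) tensor
            torchaudio.save(f"wav_{start + offset}.wav",
                            wav.unsqueeze(0).cpu(), sample_rate)
```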
Multiple directories containing audio files can be preprocessed in a single run, e.g.
```bash
python preprocess.py \
    VCTK-Corpus \
    LibriTTS/train-clean-100 \
    preprocessed  # the output directory of preprocessed data
```
Then train the model on the preprocessed data, e.g.
```bash
python train.py preprocessed
```
With the default settings, training to 100K steps takes around 12 hours on an RTX 2080 Ti.