MelSpecVAE

Author: Moisés Horta Valenzuela, 2021

Español:

English:

MelSpecVAE is a Variational Autoencoder that synthesizes Mel-spectrograms, which can then be inverted into raw audio waveforms. Currently, you can train it on any dataset of .wav audio files with a 44.1 kHz sample rate and 16-bit depth.
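Before training, it can help to confirm that each file matches the format stated above. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav` and the constants are illustrative and are not part of this repository.

```python
import wave

EXPECTED_RATE = 44100        # Hz, the sample rate stated above
EXPECTED_SAMPLE_WIDTH = 2    # bytes per sample, i.e. 16-bit audio


def check_wav(path):
    """Return (ok, details) for a PCM .wav file.

    Hypothetical helper: it only inspects the file header and does not
    resample or convert anything.
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        width = wf.getsampwidth()
    ok = rate == EXPECTED_RATE and width == EXPECTED_SAMPLE_WIDTH
    return ok, {"sample_rate": rate, "bit_depth": width * 8}
```

Files that fail the check would need to be resampled or converted with an audio tool of your choice before being added to the dataset.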

Listen to audio examples here: https://soundcloud.com/h-e-x-o-r-c-i-s-m-o-s/sets/melspecvae-variational

Features:

Interpolate between two points in the latent space and synthesize the 'in-between' sounds.
Generate short one-shot audio samples.
Synthesize arbitrarily long audio by generating seeds and sampling from the latent space. The noise types available for generating Z-vectors are uniform, Perlin and fractal.
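The first and last features can be illustrated with a short NumPy sketch: draw two Z-vectors from uniform noise, then walk linearly from one to the other. This is an illustration under stated assumptions, not the code in generate.py. The latent size of 64, the sampling range of [-1, 1] and the function names are all hypothetical, and Perlin and fractal noise are not covered.

```python
import numpy as np

LATENT_DIM = 64  # assumed value; use the latent size of your trained model


def uniform_seeds(n, latent_dim=LATENT_DIM, low=-1.0, high=1.0, seed=None):
    """Draw n Z-vectors from uniform noise (range is an assumption)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high, size=(n, latent_dim))


def interpolate(z_start, z_end, steps):
    """Linearly interpolate between two latent vectors, endpoints included."""
    t = np.linspace(0.0, 1.0, steps)[:, None]   # shape (steps, 1)
    return (1.0 - t) * z_start + t * z_end      # shape (steps, latent_dim)


seeds = uniform_seeds(2, seed=0)
path = interpolate(seeds[0], seeds[1], steps=8)  # 8 latent points
```

Each row of `path` would then be passed through the trained decoder to obtain one Mel-spectrogram, and the resulting spectrograms inverted to audio and joined to form a longer sample.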

Credits:

VAE neural network architecture coded following 'The Sound of AI' YouTube tutorial series by Valerio Velardo.
Some utility functions are taken from Marco Pasini's MelGAN-VC Jupyter Notebook.

