Implementation of WaveNet: A Generative Model for Raw Audio

huyouare/WaveNet-Theano

WaveNet

Implementation of WaveNet: A Generative Model for Raw Audio

figure

Dependencies

Theano
Lasagne
Keras

##### Implementation includes:

  • Causal convolutional layers (implemented by masking)
  • Dilated (à trous) convolutional layer blocks
  • 256-class softmax
  • Sample generation
  • Downsampling from 48 kHz to 24 kHz
  • Reduction of bit depth from 16-bit to 8-bit samples via the μ-law algorithm
  • Gated convolution (tanh * sigmoid) [TODO]
  • Conditional distribution (speaker, text) [TODO]
  • Context stacks [TODO]
  • Testing on the VCTK (Yamagishi, 2012) data set [In progress]
  • Testing on music datasets [TODO]
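
To illustrate the 16-bit to 8-bit conversion listed above, here is a minimal sketch of μ-law companding followed by quantization into 256 classes (the targets of the 256-class softmax). It is not taken from this repository: the function names `pcm16_to_float`, `mu_law_encode`, and `mu_law_decode` are hypothetical, and the sketch assumes input samples are first scaled to the range [-1, 1].

```python
import math

MU = 255  # 255 gives 256 quantization levels (classes 0..255)

def pcm16_to_float(sample):
    """Scale a signed 16-bit PCM sample to roughly [-1, 1] (assumed convention)."""
    return sample / 32768.0

def mu_law_encode(x, mu=MU):
    """Compress a sample in [-1, 1] with the mu-law curve and return a class in [0, mu]."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Map [-1, 1] to [0, mu] and round to the nearest integer class.
    return int((compressed + 1.0) / 2.0 * mu + 0.5)

def mu_law_decode(index, mu=MU):
    """Invert mu_law_encode: map a class in [0, mu] back to a sample in [-1, 1]."""
    compressed = 2.0 * index / mu - 1.0
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)
```

Because the curve is logarithmic, quiet samples get finer quantization steps than loud ones, which is why 8 bits are usually considered enough for intelligible speech.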

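The causal and dilated convolutions from the list above can be sketched in plain Python as follows. This is an illustrative sketch, not this repository's Theano code: the helper names (`causal_mask`, `masked_dilated_conv1d`, `receptive_field`) are hypothetical, and it assumes a centred filter whose future taps are zeroed by a mask, with the present tap left visible (that is, the input is assumed to be shifted already so the layer never sees the sample it predicts).

```python
def causal_mask(filter_width):
    """Return 1.0 for taps at or before the centre, 0.0 for taps that look ahead."""
    centre = filter_width // 2
    return [1.0 if i <= centre else 0.0 for i in range(filter_width)]

def masked_dilated_conv1d(signal, weights, dilation=1):
    """Zero-padded, same-length 1-D correlation with future taps masked out."""
    width = len(weights)
    centre = width // 2
    mask = causal_mask(width)
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i in range(width):
            # Tap i reads the input (i - centre) * dilation steps away from t.
            src = t + (i - centre) * dilation
            if 0 <= src < len(signal):
                acc += mask[i] * weights[i] * signal[src]
        out.append(acc)
    return out

def receptive_field(dilations):
    """Receptive field of stacked layers that each have two unmasked taps."""
    return 1 + sum(dilations)
```

With a width-3 filter the mask leaves two live taps, so a stack with dilations 1, 2, 4, 8 covers 16 input samples, and doubling the dilation at each layer grows the receptive field exponentially with depth.
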
##### DeepMind Blog Post
https://deepmind.com/blog/wavenet-generative-model-raw-audio/

##### Paper
https://arxiv.org/pdf/1609.03499.pdf

##### Parts adapted from
https://github.com/igul222/pixel_rnn and
https://github.com/kundan2510/pixelCNN
