Releases: keonlee9420/Comprehensive-Transformer-TTS

v0.2.1

06 Mar 04:02

Fix and update codebase & pre-trained models with demo samples

  1. Fix the variance adaptor so that it works with all combinations of building block and variance type/level
  2. Update the pre-trained models and demo samples for LJSpeech and VCTK, using the "transformer_fs2" building block and "cwt" pitch conditioning
  3. Share the results of ablation studies comparing "transformer" vs. "transformer_fs2", each paired with three types of pitch conditioning ("frame", "ph", and "cwt")

v0.2.0

18 Feb 10:49

A lot of improvements with new features!

  1. Prepare two different types of data pipeline in the preprocessor to get the most out of unsupervised and supervised duration modeling

  2. Adopt wavelet for pitch modeling & loss

  3. Add fine-trained duration loss

  4. Apply var_start_steps for better model convergence, especially under unsupervised duration modeling

  5. Remove dependency of energy modeling on pitch variance

  6. Add the "transformer_fs2" building block, which is closer to the original FastSpeech2 paper

  7. Add two types of prosody modeling methods

  8. Loss comparison on the validation set:

    • LJSpeech - blue: v0.1.1 / green: v0.2.0

    • VCTK - skyblue: v0.1.1 / orange: v0.2.0

v0.1.1

27 Sep 13:49
multi-speaker aligner

v0.1.0

24 Sep 17:20
Pre-release
Initial commit