
Adding OverFlow #2183

Merged
merged 32 commits into from
Dec 12, 2022

Conversation

shivammehta25
Collaborator

@shivammehta25 shivammehta25 commented Dec 4, 2022

This is the model from the paper: https://arxiv.org/abs/2211.06892
Audio samples: https://shivammehta25.github.io/OverFlow/

@CLAassistant

CLAassistant commented Dec 4, 2022

CLA assistant check
All committers have signed the CLA.

@king-dahmanus

Good idea, from the samples I've heard the thing is quite good. What about adding neural HMM, or is it the same thing but just upgraded?

@shivammehta25
Collaborator Author

shivammehta25 commented Dec 5, 2022

It shares neural HMM as its core instead of attention. The benefits of neural HMM TTS are that it has almost half the number of parameters and it works very well even in a low-resource setting, i.e. when we don't have enough data to train on. Once we merge this, it would require very little change to add neural HMM TTS into the system, which I plan to do as well.

@king-dahmanus

That's nice. How fast is it on the CPU? I previously suggested improving its speed for screen readers, but I didn't realize how foolish that was until recently. So how fast is it, or how much faster/slower, compared to Tacotron2 with HiFi-GAN, or to VITS?

@shivammehta25 shivammehta25 changed the title [WIP] Adding OverFlow Adding OverFlow Dec 9, 2022
@shivammehta25 shivammehta25 marked this pull request as ready for review December 9, 2022 09:52
@shivammehta25 shivammehta25 requested a review from erogol December 9, 2022 10:11
@erogol
Member

erogol commented Dec 9, 2022

Cool, the PR is ready. I'll first try the LJSpeech recipe and let you know how it goes.

# Process Autoregression
h_memory, c_memory = self._process_ar_timestep(t, ar_inputs, h_memory, c_memory)
# Get mean, std and transition vector from decoder for this timestep
# Note: Gradient checkpointing currently doesn't work with multiple GPUs inside a loop
Member

If this is a blocker to using multi-GPU, we should explain it in the model docstring and the docs too.

Member

Is this a model-specific issue, or is it rooted in torch?

Collaborator Author

@shivammehta25 shivammehta25 Dec 10, 2022

This is a torch issue: gradient checkpointing in a loop is currently not supported in DDP. It works fine for multi-GPU if we turn off the flag with use_grad_checkpointing=False, but that will significantly increase memory usage while training. This is because, to compute the actual data likelihood (not an approximation using MAS/Viterbi), we must use all the states at the previous time step during the forward pass to decide the probability mass at the current step.
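The all-states dependency described above is just the HMM forward recursion; it can be sketched with a toy forward algorithm (a hypothetical stand-alone illustration in plain Python, not the OverFlow code — names and numbers are made up):

```python
# Toy HMM forward algorithm. The alphas at step t depend on *all* alphas
# at step t-1, which is why every timestep's activations must be kept
# (or recomputed via gradient checkpointing) during backprop.
# Hypothetical 3-state left-to-right example, not the actual OverFlow code.

def forward_likelihood(trans, emit_probs):
    """trans[i][j]: P(next state j | state i); emit_probs[t][j]: P(obs_t | state j)."""
    n_states = len(trans)
    # Flat start: uniform initial state distribution.
    alpha = [emit_probs[0][j] / n_states for j in range(n_states)]
    for t in range(1, len(emit_probs)):
        # Each new alpha[j] sums over every previous state i.
        alpha = [
            emit_probs[t][j] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)  # total data likelihood P(o_1..o_T)

trans = [[0.5, 0.5, 0.0],
         [0.0, 0.5, 0.5],
         [0.0, 0.0, 1.0]]
emits = [[0.9, 0.1, 0.1],
         [0.2, 0.8, 0.1],
         [0.1, 0.2, 0.9]]
likelihood = forward_likelihood(trans, emits)
```

Because `alpha` at step `t` is consumed whole at step `t + 1`, there is no per-timestep independence for DDP to exploit, unlike an MAS/Viterbi approximation that commits to a single alignment path.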

Collaborator Author

I have added some information; please take a look and see if it needs any more explanation.

@shivammehta25 shivammehta25 requested a review from erogol December 10, 2022 09:44
@Edresson
Contributor

Great PR @shivammehta25! Thanks for the contribution :).

@shivammehta25 shivammehta25 requested review from Edresson and erogol and removed request for erogol and Edresson December 10, 2022 16:58
@shivammehta25
Collaborator Author

shivammehta25 commented Dec 10, 2022

Oh my bad! I clicked the request-for-review button one too many times. Sorry for the spam xP.
And @Edresson Thank you very much :D Glad you liked it!

@erogol
Member

erogol commented Dec 12, 2022

@shivammehta25 how do you compute lj_parameters.pt?

@shivammehta25
Collaborator Author

Inside TTS/tts/layers/overflow/common_layers.py there is OverflowUtils.get_data_parameters_for_flat_start, which is called in on_init_start of the model. It loads the training data and computes the mean and std (a single scalar each) over the whole training set, to standardise the data during training (the model has mean and std as registered buffers). These are used during initialization in OutputNet, where the weights of the last layer are set to zeros and the biases are set to 0 and 1 (a cool hack from fixup initialization). It also computes an average transition probability to start with an "ideal diagonal alignment". I sort of "cache" these lj_parameters.pt near the cached phonemes.
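The flat-start computation described above can be sketched roughly like this (a simplified, hypothetical stand-alone illustration; the function and variable names are made up and this is not the actual OverflowUtils code):

```python
# Hypothetical sketch of a flat-start parameter computation, not the
# actual OverflowUtils.get_data_parameters_for_flat_start.

def flat_start_parameters(mel_values, total_text_len, total_mel_len):
    """mel_values: flat list of all mel-spectrogram values in the training set.
    Returns (mean, std, transition_prob): scalars to standardise the data and
    to initialise the HMM with an 'ideal diagonal alignment'."""
    n = len(mel_values)
    mean = sum(mel_values) / n
    var = sum((x - mean) ** 2 for x in mel_values) / n
    std = var ** 0.5
    # On a diagonal alignment, each of the total_text_len states emits
    # total_mel_len / total_text_len frames on average, so the chance of
    # leaving a state at any given frame is text_len / mel_len.
    transition_prob = total_text_len / total_mel_len
    return mean, std, transition_prob

mean, std, p = flat_start_parameters([0.0, 2.0, 4.0],
                                     total_text_len=50, total_mel_len=200)
```

Standardising with a single global mean/std (rather than per-feature statistics) matches the "single scalar" wording above; the transition probability just encodes the average number of frames per state.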

@erogol
Member

erogol commented Dec 12, 2022

Ok thanks. I missed it for some reason. I also trained an LJSpeech model. It works great.

But I think we also need to train a vocoder, which we can do separately.

Merging it now 👍

@erogol erogol merged commit 3b8b105 into coqui-ai:dev Dec 12, 2022
@shivammehta25
Collaborator Author

I tried synthesising waveforms with the universal HiFi-GAN vocoder and it works pretty well!
