
Add alphas to control language and speaker balancer #1216

Merged: 11 commits, Mar 10, 2022
Conversation

Edresson (Contributor) commented Feb 8, 2022

This PR fixes the issue/suggestion reported in #1185 .

It allows combining weights for speaker and language balancing. I normalize the language and speaker weights and add an independent alpha for each. The final weights are the sum of the language and speaker weights, each multiplied by its respective alpha. This way, we can control the influence of speaker and language in the batch balancer, and we can easily add a new balancer in the future if necessary.
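The scheme described above can be sketched roughly as follows. This is a minimal illustration with made-up data; the helper names and the normalization detail are assumptions for clarity, not the exact coqui-ai/TTS code (the real implementation feeds the combined per-sample weights into a PyTorch `WeightedRandomSampler`):

```python
from collections import Counter

def balancer_weights(labels):
    """Per-sample inverse-frequency weights, L2-normalized so that the
    alphas of different balancers are on a comparable scale (a sketch,
    not the exact coqui-ai/TTS implementation)."""
    counts = Counter(labels)
    w = [1.0 / counts[label] for label in labels]
    norm = sum(x * x for x in w) ** 0.5
    return [x / norm for x in w]

def combined_weights(languages, speakers, lang_alpha=1.0, spk_alpha=1.0):
    # Sum of each balancer's weights, scaled by its own alpha.
    lang_w = balancer_weights(languages)
    spk_w = balancer_weights(speakers)
    return [lang_alpha * l + spk_alpha * s for l, s in zip(lang_w, spk_w)]

# Hypothetical metadata: 4 samples of language "en", 2 of "pt".
languages = ["en", "en", "en", "en", "pt", "pt"]
speakers  = ["s1", "s1", "s2", "s3", "s4", "s5"]
weights = combined_weights(languages, speakers)
```

Samples from the rarer language and from under-represented speakers both receive higher weights, so they are drawn more often when these weights drive a weighted random sampler.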

vince62s commented Feb 8, 2022

Hi @Edresson. I am not sure I fully understand the logic.
Say language_alpha = 10 and speaker_alpha = 1: how does it behave exactly from a language-balancing standpoint, from a speaker-balancing standpoint, and from a combined language/speaker standpoint?
Thanks.

Edresson (Contributor, Author) commented Feb 8, 2022

> Hi @Edresson. I am not sure I fully understand the logic. Say language_alpha = 10 and speaker_alpha = 1: how does it behave exactly from a language-balancing standpoint, from a speaker-balancing standpoint, and from a combined language/speaker standpoint? Thanks.

This way, we can control the language and speaker influence. In this case (language_alpha = 10 and speaker_alpha = 1), language will generate the highest weights and speaker small ones. During sample selection within a language X, a speaker with few samples will be more likely to be chosen than a speaker with many samples, but the language balancing stays intact (because language has a high alpha). In theory, the alphas can be used to create levels of balancing: balancers with higher alphas take priority (in this case, language).
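This priority effect can be checked with a small numeric illustration (made-up corpus, illustrative helper; the L2 normalization is an assumption, not necessarily the library's exact formula):

```python
from collections import Counter

def normed_inverse_freq(labels):
    # Inverse-frequency weight per sample, L2-normalized (illustrative sketch).
    counts = Counter(labels)
    w = [1.0 / counts[label] for label in labels]
    norm = sum(x * x for x in w) ** 0.5
    return [x / norm for x in w]

# Hypothetical corpus: language "en" has 4 samples, "pt" has 2.
languages = ["en", "en", "en", "en", "pt", "pt"]
speakers  = ["s1", "s1", "s1", "s2", "s3", "s3"]

lang_w = normed_inverse_freq(languages)
spk_w = normed_inverse_freq(speakers)

# With language_alpha = 10 and speaker_alpha = 1, the language term
# dominates: every "pt" sample outweighs every "en" sample, regardless
# of which speaker it belongs to.
weights = [10 * l + 1 * s for l, s in zip(lang_w, spk_w)]
```

Within each language, the speaker term still nudges rare speakers above frequent ones, which is the "levels of balancing" behavior described above.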

erogol (Member) commented Feb 11, 2022

I like the idea of computing weights and then sampling 👍

erogol (Member) commented Feb 11, 2022

I am about to make the formatters return List[Dict]; I guess your changes are affected by that too. So maybe we should wait until I push those changes and then rebase this PR.

CLAassistant commented Feb 23, 2022

CLA assistant check: all committers have signed the CLA.

Edresson force-pushed the dev branch 2 times, most recently from 24c0511 to ee61689, March 7, 2022 19:35
erogol merged commit 917f417 into coqui-ai:dev, Mar 10, 2022
vince62s commented
@Edresson, sorry to revive this discussion, but 0.6.2 is about to be merged including this.

As is, let's say we have a very abnormal distribution of speakers, e.g. a few speakers with a very small number of samples. Those will be balanced with the same probability as another speaker with a huge amount of data. Am I correct?
I think there could be some edge cases where some datasets include noisy speakers. I may be wrong though.

Edresson (Contributor, Author) commented Mar 11, 2022

> @Edresson, sorry to revive this discussion, but 0.6.2 is about to be merged including this. As is, let's say we have a very abnormal distribution of speakers, e.g. a few speakers with a very small number of samples. Those will be balanced with the same probability as another speaker with a huge amount of data. Am I correct? I think there could be some edge cases where some datasets include noisy speakers. I may be wrong though.

Yeah, the objective of the speaker balancer is that all speakers appear at a similar frequency in the batch. If the user has noisy speakers, they need to remove those speakers from the dataset. Unfortunately, we can't control that :(.

vince62s commented
Well, somehow it is a change in behavior versus the previous version, I think, so it might be good to be specific about it in the docs.

Edresson (Contributor, Author) commented Mar 11, 2022

> Well, somehow it is a change in behavior versus the previous version, I think, so it might be good to be specific about it in the docs.

By default, use_speaker_weighted_sampler and use_language_weighted_sampler are disabled. So if the user does not enable them, nothing changes between the versions :). The only thing this PR changes is that now we can enable the language weighted sampler and the speaker weighted sampler together (and use multiple GPUs with these samplers).
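For readers wondering how this looks in practice, here is a hedged sketch of enabling both samplers together. The two boolean flag names appear in the discussion above; the `*_alpha` field names are assumptions made for illustration and should be checked against the actual config in TTS/config/shared_configs.py:

```python
# Sketch of a training-config fragment (plain dict for illustration;
# the real project uses dataclass-based config objects).
config = {
    "use_speaker_weighted_sampler": True,
    "speaker_weighted_sampler_alpha": 1.0,    # assumed field name
    "use_language_weighted_sampler": True,
    "language_weighted_sampler_alpha": 10.0,  # assumed field name
}
```

With both flags off (the defaults), training behaves exactly as before this PR.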
