Trick to Upsampling to High sampling rates using VITS model #1456

Edresson · 2022-03-28T19:42:52Z

No description provided.

Edresson · 2022-04-21T20:13:46Z

To fix the zoo tests, we need to merge this coqpit PR.

TTS/tts/models/vits.py

erogol · 2022-04-22T09:28:33Z

TTS/tts/models/vits.py

-        assert batch["spec"].shape[2] == batch["mel"].shape[2], f"{batch['spec'].shape[2]}, {batch['mel'].shape[2]}"
+
+        if not self.args.TTS_part_sample_rate:
+            assert batch["spec"].shape[2] == batch["mel"].shape[2], f"{batch['spec'].shape[2]}, {batch['mel'].shape[2]}"


Can you also implement the corresponding assert for the case when upsampling is used?
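For illustration, a minimal sketch of what such a check might look like. It is not the code merged in this PR. The helper name `check_frame_alignment` and the `interpolate_factor` argument are hypothetical, and the sketch assumes that the spectrogram is computed at the lower encoder rate, so it should have roughly `mel_frames / interpolate_factor` frames.

```python
def check_frame_alignment(spec_frames, mel_frames, interpolate_factor=None):
    """Check that spectrogram and mel frame counts agree.

    Hypothetical helper, not part of the TTS code base.
    spec_frames / mel_frames stand in for batch["spec"].shape[2] and
    batch["mel"].shape[2]. interpolate_factor is assumed to be
    output_sample_rate / encoder_sample_rate, or None without upsampling.
    """
    if interpolate_factor is None:
        # No upsampling: both features must have the same number of frames.
        expected = mel_frames
    else:
        # Upsampling (assumption): the spectrogram lives at the lower
        # encoder rate, so it is shorter by the interpolation factor.
        expected = int(mel_frames / interpolate_factor)
    assert spec_frames == expected, f"{spec_frames}, {mel_frames}"
    return True


# Without upsampling the lengths must match exactly.
check_frame_alignment(100, 100)
# With a 2x factor the spectrogram is expected to be half as long.
check_frame_alignment(50, 100, interpolate_factor=2.0)
```

Keeping both branches in one place means the original equality check still runs when upsampling is disabled.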

TTS/server/server.py

TTS/tts/models/vits.py

TTS/utils/synthesizer.py

erogol · 2022-04-22T09:39:15Z

TTS/vocoder/configs/hifigan_config.py

@@ -104,6 +104,7 @@ class HifiganConfig(BaseGANVocoderConfig):
            "resblock_type": "1",
        }
    )
+    discriminator_model_params: dict = field(default_factory=lambda: {"periods": [2, 3, 5, 7, 11]})


HifiGAN changes must be in a separate PR.

Yeah, my bad. I removed these commits; they are now in PR #1526.

erogol

In general it looks good. I asked for a bunch of changes.

erogol · 2022-04-25T09:03:14Z

recipes/vctk/hifigan/train_hifigan.py

+from TTS.vocoder.datasets.preprocess import load_wav_data
+from TTS.vocoder.models.gan import GAN
+
+output_path = "/home/julian/workspace/train"


Why is it pointing to Julian's workspace?

Why is there a HiFiGAN recipe update in a VITS PR?

My bad. I removed these commits; they are now in PR #1526.

erogol · 2022-04-25T09:04:26Z

tests/tts_tests/test_vits_speaker_emb_train_upsampling_interpolation_approach.py

@@ -0,0 +1,90 @@
+import glob


We don't need these costly tests.

Just write unit tests in test_vits.py.

Indeed, good catch. Done :)

erogol · 2022-04-25T09:05:09Z

tests/tts_tests/test_vits_speaker_emb_train_upsampling_vocoder_approach.py

@@ -0,0 +1,90 @@
+import glob


We don't need this one either if you test things in test_vits.py.

erogol · 2022-04-26T09:47:24Z

Awesome PR

Edresson force-pushed the VITS-upsample branch 3 times, most recently from c583a8a to a5f5eba Compare March 28, 2022 22:05

Edresson force-pushed the VITS-upsample branch from ec7f8e7 to b8fabec Compare April 21, 2022 12:14

Edresson marked this pull request as ready for review April 21, 2022 19:58

Edresson requested a review from erogol April 21, 2022 19:58

erogol requested changes Apr 22, 2022


erogol reviewed Apr 25, 2022


Edresson added 14 commits April 25, 2022 08:03

Add upsample VITS support

99ecf35

Fix the bug in inference

faec639

Fix lint checks

18d110e

Add RMS based norm in save_wav method

17b6486

Style fix

9252b3c

Add the period for VITS multi-period discriminator in model_args

adcc2f8

Bug fix in speaker encoder load in inference time

c32082a

Add unit tests

984e2d6

Remove useless detach_z_vocoder parameter

1e75942

Add docs for VITS upsampling

d495e45

Fix the docs

3f3efe8

Rename TTS_part_sample_rate to encoder_sample_rate

b3e2c58

Add upsampling_init and upsampling_z methods

ce7138d

Add asserts for encoder_sample_rate part

f4e5329

Edresson force-pushed the VITS-upsample branch from 1c883f0 to f4e5329 Compare April 25, 2022 11:04

Move upsampling tests to test_vits.py

af98ec8
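The commits above mention `encoder_sample_rate` and an `upsampling_z` method. As a rough illustration of the idea only: the sketch below assumes the latent sequence is stretched in time by the ratio of the output sample rate to the encoder sample rate. The function names are hypothetical, and nearest-neighbour repetition is swapped in for whatever interpolation the model actually uses.

```python
def interpolate_factor(audio_sample_rate, encoder_sample_rate):
    """Ratio between output and encoder sample rates (assumed definition)."""
    if not encoder_sample_rate:
        # No separate encoder rate configured: nothing to upsample.
        return 1.0
    return audio_sample_rate / encoder_sample_rate


def upsample_nearest(frames, factor):
    """Stretch a sequence of latent frames along time by `factor`.

    Nearest-neighbour repetition, chosen here for simplicity; it is a
    stand-in, not the interpolation mode used in the PR.
    """
    out_len = int(len(frames) * factor)
    last = len(frames) - 1
    # Map each output index back to the source frame it falls into.
    return [frames[min(int(i / factor), last)] for i in range(out_len)]


# Hypothetical setup: encoder at 22050 Hz, waveform decoder at 44100 Hz.
factor = interpolate_factor(44100, 22050)
z = [1, 2, 3]
z_up = upsample_nearest(z, factor)
```

With a factor of 2, each latent frame is repeated twice, so the decoder sees twice as many frames and can emit audio at the higher rate.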

Edresson requested a review from erogol April 25, 2022 16:07

erogol self-assigned this Apr 26, 2022

erogol approved these changes Apr 26, 2022


erogol merged commit 8d228ab into dev Apr 26, 2022

erogol deleted the VITS-upsample branch April 26, 2022 09:47
