Add duration predictor training #10
Conversation
Hi, I will check and review the code ASAP.
How are the results? Can you share some samples? No need to share the weights, just wav samples if possible, to see the output quality. Thanks!
Thanks for the samples. They do sound good. Can I ask if you transferred VITS-1 weights to VITS-2 or trained VITS-2 from scratch?
I trained VITS-2 from scratch. Here are my configs: vits-2-configs.json. I trained it on 4× RTX 3090 GPUs (24 GB VRAM each).
Interesting! Can I add your samples to the README of this repo? I would still advise adding the discriminator and training the model.
Thanks for your suggestions. I'm planning to train VITS-2 with the LJSpeech dataset next week. I will send you the checkpoint for LJSpeech and the generated samples.
Hi, I updated the code with 2 discriminators; please check it if you are interested. |
Thank you so much for adding the new discriminators. I will test and train with them. I'll share the results with you as soon as possible.
@ductho9799 hello. What improved in the speech? I'm curious whether it was just the pronunciation or other characteristics of the voice as well.
@p0p4k @egorsmkv Hello, I trained a version of VITS-2 with the LJSpeech dataset. I am sharing the weights, config, and audio samples of VITS-2 in VITS-2. Can you help me evaluate the quality of VITS-2 on the LJSpeech dataset? I trained VITS-2 for 390 epochs and the duration predictor for 200 epochs.
@ductho9799 Please change the access permissions on your drive file. Thanks.
Yes, please try it again.
Thanks for sharing the checkpoints! The samples don't sound bad! Can you train the latest code with the duration discriminator and the HiFi-GAN discriminator (multi-period discriminator) with nosdp?
I am booting a cloud GPU right now to train as well. I want to check whether the duration discriminator is working or not (no NaN/Inf values, etc.).
If the training works well, I will share the checkpoints so you can continue training from them; otherwise, I will try to fix the code before the weekend.
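For reference, a minimal sketch of the kind of NaN/Inf sanity check mentioned above, in plain PyTorch. The loss names in the commented usage lines are placeholders, not variables from this repository:

```python
import torch

def assert_finite(name, tensor):
    """Raise early if a loss or gradient tensor contains NaN or Inf values."""
    if not torch.isfinite(tensor).all():
        raise RuntimeError(f"{name} contains NaN/Inf values")

# Illustrative use inside a training step; `loss_dur_disc` and `loss_dur_gen`
# are hypothetical names for the duration-discriminator losses:
# assert_finite("loss_dur_disc", loss_dur_disc)
# assert_finite("loss_dur_gen", loss_dur_gen)
```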
@ductho9799 The checkpoints are in the README on the main page. Good luck!
@ductho9799 Can you share your symbols.py file? I trained on the InfoRe dataset but the results are not good. I used the same config as you. :(( All of the config, model, and train.log are in the drive. Can you give me some advice? Thank you very much.
@ductho9799 have you tried with an external embedding extractor? |
Did you mean Bert-VITS2?
@ductho9799 Can you please share your symbols? Because while trying to run inference with it, I am getting this error.
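The symbols.py being requested is the text symbol table, which must match between training and inference or the token embedding indices will be wrong. A minimal sketch in the style of the original VITS text/symbols.py is shown below; the character sets are illustrative assumptions, not the ones used for the shared checkpoints:

```python
# text/symbols.py — minimal sketch in the style of the original VITS symbol table.
# These character sets are assumptions for illustration; they must match the set
# the checkpoint was trained with, otherwise inference will map tokens incorrectly.
_pad = "_"
_punctuation = ';:,.!?¡¿—…"«»“” '
_letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
_letters_ipa = "ɑɐɒæɔəɛɜɪʊʌŋɹʃʒθð"  # abbreviated IPA set, for illustration only

# Index 0 is the padding token; the text embedding size is len(symbols).
symbols = [_pad] + list(_punctuation) + list(_letters) + list(_letters_ipa)
```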
Hi @p0p4k, thanks for this great repository. I have trained the VITS-2 model for the Bangla language for 100k steps and got some promising results. You can listen to the audio samples at this link: Bangla Audio Samples
Hello p0p4k! Your repository is awesome. I trained VITS-2 with your code on my private data, and I have implemented duration predictor training code. You can test it.
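The PR code itself is not reproduced in this thread, so as rough orientation only: a common way to train a non-stochastic duration predictor (the nosdp case discussed above) is to regress predicted log-durations against the per-token durations produced by monotonic alignment search. The sketch below assumes that recipe and uses placeholder tensor names; it is not the code from this PR:

```python
import torch

def duration_predictor_loss(logw_hat, durations, x_mask):
    """Masked MSE between predicted and target log-durations.

    logw_hat:  predicted log-durations, shape [B, 1, T_text]
    durations: target durations per input token (e.g. from monotonic
               alignment search), shape [B, 1, T_text]
    x_mask:    1.0 for real tokens, 0.0 for padding, shape [B, 1, T_text]
    """
    # Work in the log domain; add a small epsilon to avoid log(0) on padding.
    logw = torch.log(durations.float() + 1e-6) * x_mask
    # Average the squared error over real (unpadded) token positions only.
    return torch.sum((logw_hat - logw) ** 2 * x_mask) / torch.sum(x_mask)
```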