Add duration predictor training #10
Conversation
Hi, I will check and review the code ASAP.
How are the results? Can you share some samples? No need to share the weights, just wav samples if possible, to see the output quality. Thanks!
Thanks for the samples. They do sound good. Can I ask if you transferred VITS-1 weights to VITS-2 or trained VITS-2 from scratch?
I trained VITS-2 from scratch. Here are my configs: vits-2-configs.json. I trained it on 4× RTX 3090 GPUs (24 GB VRAM each).
Interesting! Can I add your samples to the README of this repo? I would still advise adding the discriminator and training the model.
Thanks for your suggestions. I'm planning to train VITS-2 with the LJSpeech dataset next week. I will send you the checkpoint for LJSpeech and the generated samples.
Hi, I updated the code with 2 discriminators; please check it if you are interested. |
Thank you so much for adding the new discriminators. I will test and train with them. I'll share the results with you as soon as possible.
@ductho9799 hello. What improved in the speech? I'm curious whether it was just the pronunciation or other characteristics of the voice as well.
@p0p4k @egorsmkv Hello, I trained a version of VITS-2 with the LJSpeech dataset. I am sharing the weights, config, and audio samples of VITS-2 in VITS-2. Can you help me evaluate the quality of VITS-2 on the LJSpeech dataset? I trained VITS-2 for 390 epochs and the duration predictor for 200 epochs.
@ductho9799 Please change the access permissions on your drive file. Thanks.
Yes, please try it again.
Thanks for sharing the checkpoints! The samples don't sound bad! Can you train the latest code with the duration discriminator and the HiFi-GAN discriminator (multi-period discriminator) with nosdp?
I am booting a cloud GPU right now to train as well. I want to check whether the duration discriminator is working or not (no NaN/Inf values, etc.).
If the training works well, I will share the checkpoints so you can continue training from them; otherwise, I will try to fix the code before the weekend.
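For reference, a minimal sketch of the kind of NaN/Inf sanity check mentioned above, in plain PyTorch. The loss names in the commented usage lines are placeholders, not variables from this repository:

```python
import torch

def assert_finite(name, tensor):
    """Raise early if a loss or gradient tensor contains NaN or Inf values."""
    if not torch.isfinite(tensor).all():
        raise RuntimeError(f"{name} contains NaN/Inf values")

# Illustrative use inside a training step; `loss_dur_disc` and `loss_dur_gen`
# are hypothetical names for the duration-discriminator losses:
# assert_finite("loss_dur_disc", loss_dur_disc)
# assert_finite("loss_dur_gen", loss_dur_gen)
```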
@ductho9799 The checkpoints are in the README on the main page. Good luck!
@ductho9799 Can you share your symbols.py file? I trained on the InfoRe dataset but the results are not good. I used the same config as you. :(( All of the config, model, and train.log are in the drive. Can you give me some advice? Thank you very much.
@ductho9799 have you tried with an external embedding extractor? |
Did you mean Bert-VITS2?
@ductho9799 Can you please share your symbols? Because while trying to run inference with it, I am getting this error.
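The symbols.py being requested is the text symbol table, which must match between training and inference or the token embedding indices will be wrong. A minimal sketch in the style of the original VITS text/symbols.py is shown below; the character sets are illustrative assumptions, not the ones used for the shared checkpoints:

```python
# text/symbols.py — minimal sketch in the style of the original VITS symbol table.
# These character sets are assumptions for illustration; they must match the set
# the checkpoint was trained with, otherwise inference will map tokens incorrectly.
_pad = "_"
_punctuation = ';:,.!?¡¿—…"«»“” '
_letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
_letters_ipa = "ɑɐɒæɔəɛɜɪʊʌŋɹʃʒθð"  # abbreviated IPA set, for illustration only

# Index 0 is the padding token; the text embedding size is len(symbols).
symbols = [_pad] + list(_punctuation) + list(_letters) + list(_letters_ipa)
```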
Hi @p0p4k, thanks for this great repository. I have trained the VITS-2 model for the Bangla language for 100k steps and got some promising results. You can listen to the audio samples at this link: Bangla Audio Samples
Hello p0p4k! Your repository is awesome. I trained VITS-2 with your code on my private data, and I have implemented duration predictor training code. You can test it.
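The PR code itself is not reproduced in this thread, so as rough orientation only: a common way to train a non-stochastic duration predictor (the nosdp case discussed above) is to regress predicted log-durations against the per-token durations produced by monotonic alignment search. The sketch below assumes that recipe and uses placeholder tensor names; it is not the code from this PR:

```python
import torch

def duration_predictor_loss(logw_hat, durations, x_mask):
    """Masked MSE between predicted and target log-durations.

    logw_hat:  predicted log-durations, shape [B, 1, T_text]
    durations: target durations per input token (e.g. from monotonic
               alignment search), shape [B, 1, T_text]
    x_mask:    1.0 for real tokens, 0.0 for padding, shape [B, 1, T_text]
    """
    # Work in the log domain; add a small epsilon to avoid log(0) on padding.
    logw = torch.log(durations.float() + 1e-6) * x_mask
    # Average the squared error over real (unpadded) token positions only.
    return torch.sum((logw_hat - logw) ** 2 * x_mask) / torch.sum(x_mask)
```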