Possible bug that may prevent your code from achieving the best performance (background noise, etc.) #131
Comments
Thank you for your suggestions!
BTW, how was the final quality? Was it improved, or is the change only related to convergence speed?
I think that makes sense: the discriminator should use the newest y_ for training, which lets it penalize the generator better. You can refer to two implementations on GitHub (https://github.com/seungwonpark/melgan/blob/master/utils/train.py/#L85) and the official code (https://github.com/descriptinc/melgan-neurips/blob/master/scripts/train.py/#L156). Note that the official code trains the discriminator before the generator, but it still re-computes D-fake when training the generator. The quality for my language (Vietnamese) is improved significantly. I'm training on LJSpeech now with MelGAN only; I have validation audio at 1.4M steps here. I believe the same improvement can be achieved with PWG once the training order is done correctly (see the sketch below).
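To make the ordering described in this comment concrete, here is a minimal, self-contained PyTorch sketch (it is not code from either linked repository; the tiny linear modules, hinge losses, and optimizer settings are placeholders chosen only for illustration). The discriminator is updated first on the detached fake output, and D-fake is then re-computed through the just-updated discriminator for the generator's adversarial loss.

```python
import torch
import torch.nn as nn

# Tiny stand-in modules; the real models are the MelGAN generator/discriminator.
generator = nn.Linear(80, 256)       # placeholder "generator"
discriminator = nn.Linear(256, 1)    # placeholder "discriminator"
optim_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
optim_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

x = torch.randn(4, 80)               # placeholder conditioning features
y = torch.randn(4, 256)              # placeholder real targets

y_hat = generator(x)                 # fake output from the current generator

# --- discriminator step first (official-MelGAN-style order); the fake is
#     detached so this step does not touch the generator ---
d_real = discriminator(y)
d_fake = discriminator(y_hat.detach())
d_loss = torch.relu(1.0 - d_real).mean() + torch.relu(1.0 + d_fake).mean()  # hinge loss
optim_d.zero_grad()
d_loss.backward()
optim_d.step()

# --- generator step: D-fake is re-computed with the just-updated
#     discriminator instead of reusing d_fake from above ---
g_loss = -discriminator(y_hat).mean()
optim_g.zero_grad()
g_loss.backward()
optim_g.step()
```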
Thank you for sharing samples!
BTW, I'm creating a TF framework for speech, currently focused on TTS :)), just like your ESPnet :D. The TF version now trains faster than PyTorch with the exact same parameters, batch_size, and batch_max_steps, on top of the improvement in inference speed you already knew about :D.
That sounds nice :)
Haha, BTW, what is the license of this repo and your ESPnet :))? I can't retrain everything, since there are many models and datasets. I plan to provide some training scripts and pretrained models in my TF framework, convert the rest from your PyTorch pretrained models, and support those for inference only :D. Can I do that?
That's very cool.
OK, that's nice. I will keep the TF models and PyTorch models synchronized and use the same preprocessing as here, so at least users can use PyTorch for training and TF for inference :D. Again, thanks for your hard work :)).
Hi @kan-bayashi, I think there is a bug that prevents your code from achieving the best performance for both MelGAN and PWG. After training the generator for one step, you should re-compute y_ and use this fresh y_ for the discriminator, but it seems your code does not do that. In my experiments, re-computing y_ is crucial for obtaining the best quality. I have my own TensorFlow code for MelGAN, and I can get the same performance as your code in only around 2M steps from scratch (without needing the PWG auxiliary loss to help convergence speed), but if I don't recompute y_, my TF code at 2M steps is not as good as at 2M steps with the recomputation. A sketch of the intended ordering is shown below.
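Here is a minimal, self-contained sketch of the proposed fix (it is not the actual ParallelWaveGAN training code; the linear modules, L1 auxiliary loss, hinge losses, and the lambda_adv value are placeholders for illustration). The key point is that after the generator update, y_ is re-generated with the updated generator before the discriminator step, instead of reusing the stale output.

```python
import torch
import torch.nn as nn

generator = nn.Linear(80, 256)       # placeholder for the MelGAN/PWG generator
discriminator = nn.Linear(256, 1)    # placeholder for the discriminator
aux_loss = nn.L1Loss()               # stands in for the multi-resolution STFT loss
optim_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
optim_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
lambda_adv = 4.0                     # adversarial loss weight (illustrative value)

x = torch.randn(4, 80)               # placeholder conditioning features
y = torch.randn(4, 256)              # placeholder target waveform

for step in range(2):
    # ---- generator step ----
    y_hat = generator(x)
    g_loss = aux_loss(y_hat, y) + lambda_adv * (-discriminator(y_hat).mean())
    optim_g.zero_grad()
    g_loss.backward()
    optim_g.step()

    # ---- discriminator step ----
    # Re-compute y_ with the generator that was just updated, instead of
    # reusing the stale y_hat produced before the generator step.
    with torch.no_grad():
        y_hat = generator(x)
    d_real = discriminator(y)
    d_fake = discriminator(y_hat)
    d_loss = torch.relu(1.0 - d_real).mean() + torch.relu(1.0 + d_fake).mean()
    optim_d.zero_grad()
    d_loss.backward()
    optim_d.step()
```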