
Possible bug: may prevent the code from achieving its best performance (background noise, etc.) #131

Closed
dathudeptrai opened this issue Apr 24, 2020 · 9 comments · Fixed by #132
Labels
bug Something isn't working

Comments

@dathudeptrai
Contributor

dathudeptrai commented Apr 24, 2020

Hi @kan-bayashi, I think there is a bug that prevents your code from achieving the best performance for both MelGAN and PWG. After each generator training step, you should recompute y_ and use that recomputed y_ for the discriminator step, but your code does not seem to do that. In my experiments, recomputing y_ is crucial for obtaining the best quality. My TensorFlow MelGAN implementation reaches the same quality as your code in only about 2M steps from scratch (without needing the PWG auxiliary loss to speed up convergence), but if I don't recompute y_, my TF code at 2M steps is not as good as it is at 2M steps with recomputation.
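The training order being reported can be sketched as below. The stub classes and names (`StubGenerator`, `train_one_batch`, etc.) are hypothetical illustrations, not this repository's actual trainer API; the `version` counter merely stands in for network weights so the ordering is visible.

```python
# Minimal sketch of the suggested training order (hypothetical stubs, not the
# repo's real code). The key point: after the generator's parameter update,
# regenerate y_ so the discriminator trains on output of the *updated* generator.

class StubGenerator:
    """Stands in for the MelGAN/PWG generator; `version` mimics its weights."""
    def __init__(self):
        self.version = 0

    def __call__(self, mel):
        # A real generator would synthesize a waveform from the mel spectrogram.
        return ("fake_audio", self.version)

    def step(self):
        # Stands in for optimizer.step() on the generator.
        self.version += 1


class StubDiscriminator:
    """Records which generator output it was trained on."""
    def __init__(self):
        self.trained_on = []

    def train_step(self, y, y_):
        self.trained_on.append(y_)


def train_one_batch(gen, dis, mel, y):
    y_ = gen(mel)            # 1) generator forward
    # ... adversarial + auxiliary losses, backprop ...
    gen.step()               # 2) generator parameter update
    y_ = gen(mel)            # 3) IMPORTANT: recompute y_ with the updated generator
    dis.train_step(y, y_)    # 4) discriminator trains on the fresh fake


gen, dis = StubGenerator(), StubDiscriminator()
train_one_batch(gen, dis, mel="mel_frames", y="real_audio")
print(dis.trained_on[-1])    # ('fake_audio', 1): fake from the updated generator
```

Without step 3, the discriminator would train on `('fake_audio', 0)`, i.e. a sample from the stale, pre-update generator, which is the reported bug.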

@kan-bayashi kan-bayashi added the bug Something isn't working label Apr 24, 2020
@kan-bayashi
Owner

Thank you for your suggestions!
I'm not sure which is standard for GAN training, but it is better to change the update manner since your experiments show better performance.
I will make a PR.

@kan-bayashi
Owner

BTW, how was the final quality? Is it improved, or is the change just related to convergence speed?

@dathudeptrai
Contributor Author

dathudeptrai commented Apr 24, 2020

I think that makes sense: the discriminator should train on the newest y_, which lets it penalize the generator better. You can refer to two implementations on GitHub (https://github.com/seungwonpark/melgan/blob/master/utils/train.py/#L85) and the official code (https://github.com/descriptinc/melgan-neurips/blob/master/scripts/train.py/#L156). Note that the official code trains the discriminator before the generator, but it still recomputes D-fake when training the generator. The quality on my language (Vietnamese) improved significantly. I'm training on LJSpeech now with MelGAN only; I have validation audio at 1.4M steps here. I believe the same improvement can be achieved with PWG once the update manner is corrected.
1460000steps.zip
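The alternative ordering mentioned above (discriminator updated first, as in the official melgan-neurips script) can be sketched the same way. The stubs below are illustrative only, not the actual code at the linked lines; the point is that even with D updated first, the fake score fed into the generator loss is re-queried after D's update.

```python
# Illustrative stubs (not the actual melgan-neurips code) for the
# discriminator-first ordering: D is updated on the current fake, and the
# generator loss then queries the *updated* D (recomputed D-fake).

class StubNet:
    def __init__(self):
        self.version = 0        # stands in for the network's weights

    def step(self):
        self.version += 1       # stands in for optimizer.step()


def train_one_batch(gen, dis, log):
    y_ = ("fake", gen.version)                       # G forward
    log.append(("D trained on G version", y_[1]))
    dis.step()                                       # discriminator update first
    # Generator loss re-queries D after its update (recomputed D-fake):
    log.append(("G loss used D version", dis.version))
    gen.step()                                       # generator update


log = []
train_one_batch(StubNet(), StubNet(), log)
print(log)  # [('D trained on G version', 0), ('G loss used D version', 1)]
```

In both orderings the principle is the same: whichever network is updated second should see outputs or scores produced by its freshly updated opponent, not stale ones.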

@kan-bayashi
Owner

Thank you for sharing samples!
They sound very good :) I will follow your suggested update order!

@dathudeptrai
Contributor Author

dathudeptrai commented Apr 24, 2020

BTW, I'm creating a TF framework for speech, currently focused on TTS :)), just like your ESPnet :D. My TF code now trains faster than PyTorch with the exact same parameters, batch_size, and batch_max_steps, and you already know about the inference speed improvement :D.

@kan-bayashi
Owner

That sounds nice :)
But I love PyTorch, so I want Facebook to make it faster 🚀

@dathudeptrai
Contributor Author

dathudeptrai commented Apr 24, 2020

haha BTW, what are the licenses of this repo and your ESPnet :))? I can't retrain everything; there are many models and datasets. I plan to provide some training scripts and pretrained models in my TF framework, and the rest will be converted from your PyTorch pretrained models and supported for inference only :D. Can I do that?

@kan-bayashi
Owner

That's very cool.
This repository is MIT licensed and ESPnet is Apache 2.0, so you can do it!

@dathudeptrai
Contributor Author

OK, that's nice. I will keep the TF and PyTorch models synchronized and use the same preprocessing as here, so at least users can train with PyTorch and run inference with TF :D. Again, thanks for your hard work :)).
