During inference, the output of noisy gate is nan. #180

Open
zqhang opened this issue Dec 3, 2023 · 5 comments

Comments


zqhang commented Dec 3, 2023

Training proceeds smoothly, but inference fails: noise_stddev becomes zero when self.training is False, which leads to an error (NaN output) when computing the load. Should NoisyGate refrain from adding noise during inference?
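
A minimal sketch of the failure described above, in plain Python rather than the library's tensor code. The function names, the default `noise_epsilon = 1e-2`, and the scalar inputs are assumptions made for illustration; only the expression `(softplus(raw) + eps) * training` is taken from the thread. It shows the standard deviation collapsing to exactly zero at inference, so any later division by it is undefined.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def noise_stddev_reported(raw_noise_stddev: float, training: bool,
                          noise_epsilon: float = 1e-2) -> float:
    # Form reported in the issue: the whole sum, epsilon included,
    # is multiplied by the training flag (a bool acts as 0 or 1).
    return (softplus(raw_noise_stddev) + noise_epsilon) * training


def standardized_gap(clean: float, threshold: float, noise_stddev: float) -> float:
    # Hypothetical stand-in for the load computation, which divides the
    # distance to the top-k threshold by the noise standard deviation.
    return (clean - threshold) / noise_stddev


train_std = noise_stddev_reported(0.5, training=True)   # strictly positive
eval_std = noise_stddev_reported(0.5, training=False)   # exactly 0.0

try:
    # With zero noise the noisy logit equals the clean logit, so a selected
    # expert sits exactly on its own threshold: the division is 0 / 0.
    standardized_gap(1.0, 1.0, eval_std)
    division_failed = False
except ZeroDivisionError:
    division_failed = True

print(train_std, eval_std, division_failed)
```

Plain Python raises `ZeroDivisionError` for `0.0 / 0.0`, whereas tensor libraries that follow IEEE 754 return NaN for the same operation, which matches the NaN in the issue title.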

Owner

laekov commented Dec 4, 2023

@Sengxian Can you please shed some light on why we are multiplying the noise with self.training here?

Owner

laekov commented Dec 4, 2023

I suppose it should be raw_noise * training + eps instead of (raw_noise + eps) * training

Author

zqhang commented Dec 4, 2023

Do I accurately comprehend your statement: noise_stddev = self.softplus(raw_noise_stddev) * self.training + self.noise_epsilon ?

Owner

laekov commented Dec 4, 2023

> Do I accurately comprehend your statement: noise_stddev = self.softplus(raw_noise_stddev) * self.training + self.noise_epsilon ?

Yes, I think that can help fix your NaN issue. But as I am not an algorithm person, I am not sure whether this is how the noisy gate is expected to behave during inference.
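
A minimal sketch of the proposed form, again in plain Python with hypothetical names and an assumed `noise_epsilon = 1e-2`; only the expression `softplus(raw) * training + eps` comes from the thread. It illustrates that the standard deviation stays at epsilon during inference, so a normal-CDF style load term remains finite.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def noise_stddev_proposed(raw_noise_stddev: float, training: bool,
                          noise_epsilon: float = 1e-2) -> float:
    # Form proposed in the thread: only the learned part is gated by the
    # training flag, so epsilon survives when training is False.
    return softplus(raw_noise_stddev) * training + noise_epsilon


def normal_cdf(z: float) -> float:
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_above_threshold(clean: float, threshold: float, noise_stddev: float) -> float:
    # Hypothetical stand-in for the load term: probability that the noisy
    # logit exceeds the top-k threshold.
    return normal_cdf((clean - threshold) / noise_stddev)


eval_std = noise_stddev_proposed(0.5, training=False)    # equals noise_epsilon
p_equal = prob_above_threshold(1.0, 1.0, eval_std)       # logit on the threshold
p_above = prob_above_threshold(1.2, 1.0, eval_std)       # logit well above it

print(eval_std, p_equal, p_above)
```

Under this form the inference-time standard deviation is small but nonzero, so if the gate still samples noise with it, the logits receive a slight perturbation. Whether that is the intended inference behavior is the question left open in the comment above.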

Author

zqhang commented Dec 4, 2023

Thank you for your help


2 participants