how to use balance loss? #176

Open
Heihaierr opened this issue Oct 26, 2023 · 1 comment

Comments

@Heihaierr
Contributor

How do I apply the balance loss? Could you add it to the 'transformer-xl' example?

@laekov
Owner

laekov commented Nov 6, 2023

Sorry for the late reply.

The BaseGate module has methods including set_loss, get_loss and has_loss. A customized gate (or any FastMoE gate with a balance loss) calls self.set_loss to store the loss value in the module. You can then read that value back with the gate module's get_loss and add it to the final training loss (e.g. by adding them to the get_loss function in Megatron-LM).
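As a rough illustration of that pattern, here is a minimal, dependency-free sketch. `StandInGate` is a hypothetical stand-in that only mimics the set_loss / get_loss / has_loss interface described above; it is not the FastMoE class. The helper `add_balance_loss`, the `weight` coefficient, and the clear-on-read behaviour of the stand-in are assumptions made for the example. Plain floats are used so it runs without PyTorch; in real training the stored values would be tensors.

```python
class StandInGate:
    """Hypothetical stand-in for a gate exposing set_loss/get_loss/has_loss."""

    def __init__(self):
        self.loss = None

    def set_loss(self, loss):
        # A gate would call this inside forward() after computing its balance loss.
        self.loss = loss

    def get_loss(self, clear=True):
        # Assumption of this sketch: reading the loss clears it by default,
        # so a stale value is not reused on the next step.
        loss = self.loss
        if clear:
            self.loss = None
        return loss

    @property
    def has_loss(self):
        return self.loss is not None


def add_balance_loss(task_loss, modules, weight=0.01):
    """Return task_loss plus weight times each loss stored on gate-like modules."""
    total = task_loss
    for m in modules:
        # Skip anything that does not expose the gate loss interface.
        if not (hasattr(m, "get_loss") and hasattr(m, "has_loss")):
            continue
        has = m.has_loss
        # Works whether has_loss is a property or a plain method.
        if callable(has):
            has = has()
        if has:
            total = total + weight * m.get_loss()
    return total


# Usage: one gate stored a loss, one did not, one object is not a gate.
gates = [StandInGate(), StandInGate(), object()]
gates[0].set_loss(2.0)
total = add_balance_loss(1.0, gates, weight=0.5)
```

With a real model, the same helper could presumably be fed the model's submodules (for a PyTorch model, the iterator returned by `model.modules()`), called once per step after the forward pass and before `backward()`. The right value for `weight` depends on the gate and the task.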

We will add this to our documentation.
