The BaseGate module provides set_loss, get_loss and has_loss. A customized gate (or any FastMoE gate with a balance loss) calls self.set_loss to store the loss value in the module. That value can then be read back with the gate module's get_loss and added to the final loss (e.g. by adding it in the get_loss function in Megatron-LM).
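As a rough illustration of the pattern described above, here is a minimal sketch in plain Python. ToyGate is a hypothetical stand-in that only mimics the set_loss / get_loss interface, not FastMoE's actual gate. The helper collect_balance_loss and the coefficient balance_weight are likewise made-up names. The sketch also assumes that get_loss returns None when no loss was stored and that reading the loss clears it.

```python
class ToyGate:
    """Hypothetical stand-in for a gate exposing set_loss / get_loss."""

    def __init__(self):
        self.loss = None

    def set_loss(self, loss):
        # A real gate would call this inside forward() with its balance loss.
        self.loss = loss

    def get_loss(self):
        # Assumption: reading the loss also clears it, so a stale value
        # from an earlier step is never counted twice.
        loss, self.loss = self.loss, None
        return loss


def collect_balance_loss(modules):
    """Sum the losses stored on every gate-like object in `modules`."""
    total = 0.0
    for m in modules:
        if hasattr(m, "get_loss"):
            loss = m.get_loss()
            if loss is not None:
                total = total + loss
    return total


gates = [ToyGate(), ToyGate(), ToyGate()]
gates[0].set_loss(0.25)
gates[1].set_loss(0.5)   # the third gate stores no loss and is skipped

task_loss = 2.0
balance_weight = 0.01    # hypothetical coefficient, tune for your model
final_loss = task_loss + balance_weight * collect_balance_loss(gates)
```

In a PyTorch training loop the same helper could presumably be called with model.modules() after the forward pass and before backward(), so that the gate losses become part of the loss that is backpropagated.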
How do I apply the balance loss? Can you add it to the 'transformer-xl' example?