Switching the contrastive loss from (softmax) cross-entropy to a sigmoid loss would let us compute it independently on each device in a multi-GPU setup, without the all-gather operations that cost extra memory and compute.
What is chunked sigmoid loss? See this Twitter thread, which includes implementation code (albeit in JAX): https://twitter.com/borisdayma/status/1663269506289135616
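For anyone who wants the idea without opening the thread, here is a rough NumPy sketch. It is not the code from the thread. It assumes the usual pairwise sigmoid formulation, where every image-text pair is an independent binary classification (label +1 for matching pairs, -1 otherwise). The names `sigmoid_loss_block` and `chunked_sigmoid_loss`, and the `scale` and `bias` parameters, are made up for illustration. Because the loss is a plain sum over pairs with no softmax normalization across the batch, it splits into per-chunk blocks that can be computed separately and added up:

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def sigmoid_loss_block(img, txt, scale, bias, positives_on_diagonal):
    # Summed pairwise sigmoid loss for one (image chunk, text chunk) block.
    logits = scale * (img @ txt.T) + bias
    labels = -np.ones_like(logits)          # every pair is a negative by default
    if positives_on_diagonal:
        np.fill_diagonal(labels, 1.0)       # matching pairs sit on the diagonal
    return -log_sigmoid(labels * logits).sum()

def full_sigmoid_loss(img, txt, scale, bias):
    # Reference: one block covering the whole batch.
    return sigmoid_loss_block(img, txt, scale, bias, True) / img.shape[0]

def chunked_sigmoid_loss(img, txt, scale, bias, num_chunks):
    # Same loss, accumulated block by block.
    # Assumes the batch size is divisible by num_chunks.
    n = img.shape[0]
    img_chunks = np.split(img, num_chunks)
    txt_chunks = np.split(txt, num_chunks)
    total = 0.0
    for i, img_chunk in enumerate(img_chunks):
        for j, txt_chunk in enumerate(txt_chunks):
            # Positives only appear when both chunks cover the same rows.
            total += sigmoid_loss_block(
                img_chunk, txt_chunk, scale, bias, positives_on_diagonal=(i == j)
            )
    return total / n
```

In a multi-device setting each device would own one image chunk and see the text chunks one at a time. How those text chunks get moved between devices is a separate question that this sketch does not cover.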