
Does VICReg loss work correctly in a multi-GPU setting? #535

Closed
NoTody opened this issue Oct 8, 2022 · 2 comments
NoTody (Contributor) commented Oct 8, 2022

Hi, Kevin

I noticed that the current implementation of VICReg doesn't include a gather layer for the input embeddings. Does it still work as expected when the mini-batch is split across multiple GPUs, i.e. are the embeddings from all GPUs used to compute the loss? See https://github.com/facebookresearch/vicreg/blob/a73f567660ae507b0667c68f685945ae6e2f62c3/main_vicreg.py#L200 for the original VICReg implementation.

Regards,
Notody
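To illustrate why the missing gather matters: VICReg's variance and covariance terms are batch statistics, so computing them on each GPU's local shard of the batch gives a different loss than computing them on the full, gathered batch. A minimal numpy sketch of the variance term (illustrative only, not the library's code):

```python
import numpy as np

def variance_term(z, eps=1e-4):
    # Hinge on the per-dimension standard deviation, as in the VICReg paper.
    std = np.sqrt(z.var(axis=0) + eps)
    return float(np.mean(np.maximum(0.0, 1.0 - std)))

rng = np.random.default_rng(0)
full_batch = 0.5 * rng.normal(size=(8, 4))   # full batch of 8 embeddings
shards = np.split(full_batch, 2)             # as if sharded across 2 GPUs

global_v = variance_term(full_batch)                          # gathered batch
local_v = float(np.mean([variance_term(s) for s in shards]))  # per-GPU average

# The two quantities disagree, so skipping the gather changes the loss.
print(global_v, local_v)
```

Without a gather, each GPU effectively trains against the statistics of its own smaller shard, which is not the loss the VICReg paper describes.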

KevinMusgrave (Owner) commented
Yeah, I don't think it does. For me, the ideal solution would be to make it compatible with DistributedLossWrapper. That means either:

- making it have the same input format as BaseMetricLossFunction (I don't know if there's a way to do this that makes sense), or
- adding an if-statement in DistributedLossWrapper to catch the VICReg loss type.
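For reference, the gather the question points to is an all-gather whose backward pass routes gradients back to each rank's local shard, modeled on the FullGatherLayer in the original VICReg repo. A hedged sketch (the names here are illustrative, not the pytorch-metric-learning API):

```python
import torch
import torch.distributed as dist

class GatherLayer(torch.autograd.Function):
    """All-gather that keeps gradients flowing to each rank's local shard."""

    @staticmethod
    def forward(ctx, x):
        out = [torch.zeros_like(x) for _ in range(dist.get_world_size())]
        dist.all_gather(out, x)
        return tuple(out)

    @staticmethod
    def backward(ctx, *grads):
        all_grads = torch.stack(grads)
        dist.all_reduce(all_grads)  # sum gradient contributions from every rank
        return all_grads[dist.get_rank()]

def gather_embeddings(x):
    # Gather across ranks when distributed training is active; otherwise a
    # no-op, so the same code path works in single-GPU runs.
    if dist.is_available() and dist.is_initialized():
        return torch.cat(GatherLayer.apply(x), dim=0)
    return x
```

Wrapping the embeddings with something like `gather_embeddings` before the loss computation would make the batch statistics global, which is what DistributedLossWrapper does for the other losses.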

KevinMusgrave added this to the v2.0 milestone on Jan 21, 2023
KevinMusgrave (Owner) commented

In v2.0 this will work:

```python
loss_fn = DistributedLossWrapper(loss=VICRegLoss())
loss = loss_fn(embeddings, ref_emb=ref_emb)
```
