
LightningLite not updating DeviceDtypeModuleMixin correctly #10556

Closed
justusschock opened this issue Nov 16, 2021 · 3 comments · Fixed by #10559
justusschock commented Nov 16, 2021

🐛 Bug

When using LightningLite and moving the _LiteModule to CPU, the attributes of DeviceDtypeModuleMixin (such as device) are not updated.

To Reproduce

import torch

from pytorch_lightning.core.mixins import DeviceDtypeModuleMixin
from pytorch_lightning.lite import LightningLite


class SomeDummy(DeviceDtypeModuleMixin):
    def __init__(self):
        super().__init__()
        self.a = torch.nn.Linear(1, 1)


class MyClass(LightningLite):
    def run(self):
        model = SomeDummy()
        model, optimiser = self.setup(model, torch.optim.Adam(model.parameters()))

        # do some stuff

        # now clean up gpu memory for later stages
        model.cpu()
        assert str(model.module.device) == 'cpu'  # fails: still reports the old device


MyClass(accelerator='gpu', devices=1).run()

Expected behavior

model.module.device should be torch.device('cpu')

Additional context

Could probably be solved by using DeviceDtypeModuleMixin as a base class for the _LiteModule, since the root cause is that the to function only calls _apply on all child tensors instead of calling .to on every child module.
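
A minimal sketch of that suggestion (_LiteModuleSketch is a hypothetical name; the real _LiteModule wrapper has additional responsibilities such as precision handling):

import torch

from pytorch_lightning.core.mixins import DeviceDtypeModuleMixin


class _LiteModuleSketch(DeviceDtypeModuleMixin):
    def __init__(self, module: torch.nn.Module):
        super().__init__()
        self.module = module  # the user's model

    def forward(self, *args, **kwargs):
        return self.module(*args, **kwargs)


# with the mixin as a base, wrapper.cpu() also updates the tracked
# device attribute on every DeviceDtypeModuleMixin child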

cc @carmocca @justusschock @awaelchli

justusschock added the bug and fabric labels on Nov 16, 2021

awaelchli commented Nov 16, 2021

That's an issue we have even without Lite, e.g., it could occur with torchmetrics too. This example shows it:

import torch

from pytorch_lightning.core.mixins import DeviceDtypeModuleMixin


class SomeDummy(DeviceDtypeModuleMixin):
    def __init__(self):
        super().__init__()
        self.a = torch.nn.Linear(1, 1)


class WrapperModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.module = SomeDummy()  # this could be a torchmetric


w = WrapperModule().cuda()
print(w.module.device)  # prints cpu !!! should be cuda:0
w.cpu()
print(w.module.device)  # prints cpu

As you said, the only solution is to add DeviceDtypeModuleMixin to the bases of the wrapper class, because of how to() traverses child modules.
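
A sketch of that fix applied to the example above (FixedWrapperModule is a hypothetical name, reusing SomeDummy from the snippet):

class FixedWrapperModule(DeviceDtypeModuleMixin):
    def __init__(self):
        super().__init__()
        self.module = SomeDummy()  # this could be a torchmetric


w = FixedWrapperModule().cuda()
print(w.module.device)  # now prints cuda:0
w.cpu()
print(w.module.device)  # prints cpu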

awaelchli added this to the 1.5.x milestone on Nov 16, 2021

awaelchli commented Nov 16, 2021

I noticed that in torchmetrics they abandoned the DeviceDtypeModuleMixin and override the _apply method instead:

https://github.com/PyTorchLightning/metrics/blob/93cb842f24d15804dd2e7677ca7fc6631b234773/torchmetrics/metric.py#L466-L490
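
A condensed sketch of the idea behind that override (paraphrased, not the verbatim torchmetrics code): _apply receives the function that .to()/.cpu()/.cuda() would apply to every tensor, so applying it to a probe tensor reveals the target device:

import torch


class MetricLike(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self._device = torch.device("cpu")

    @property
    def device(self) -> torch.device:
        return self._device

    def _apply(self, fn):
        this = super()._apply(fn)  # moves parameters/buffers as usual
        # apply fn to a probe tensor to find out where everything moved
        probe = fn(torch.zeros(1, device=self._device))
        this._device = probe.device
        return this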


justusschock commented Nov 16, 2021

@awaelchli yes, but without Lite this is not our concern :D

From our offline chat (for completeness):
For torchmetrics, removing it is fine since they don't nest modules there. And overriding _apply achieves exactly what the mixin does :)
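
For reference, a condensed sketch of how the mixin achieves the same effect (paraphrased from the PL 1.5-era DeviceDtypeModuleMixin, not verbatim): it overrides to() and friends and pushes the parsed target device down to every tracking submodule via Module.apply:

import torch


class DeviceTrackingMixin(torch.nn.Module):
    _device: torch.device = torch.device("cpu")

    def to(self, *args, **kwargs):
        # the same argument parsing torch uses internally for Module.to
        device, *_ = torch._C._nn._parse_to(*args, **kwargs)

        def update(module):
            if isinstance(module, DeviceTrackingMixin) and device is not None:
                module._device = device

        self.apply(update)  # reaches all tracking submodules, however deep
        return super().to(*args, **kwargs)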
