Auto move input to proper device for inference #1412
It may be a nice feature to have DDP/TPU also at inference time...
We should definitely automate this!
@tcwalther pls let me bring some light to it... what is the goal: always use the best hardware you have available (GPU/TPU)? Thinking about inference, the case I see could be to have it as a parameter... because in my case (using a notebook) I have a GPU, but most of the "production" models won't fit there, so it would crash.
Whenever a LightningModule is used, we put `x` on the proper device... @tcwalther this is what you mean, no?
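A minimal sketch of what that automation might look like (hypothetical; this is not Lightning's actual implementation): intercept `__call__` and move any tensor arguments to the device the module's parameters live on before `forward` runs.

```python
import torch
import pytorch_lightning as pl

class AutoDeviceModule(pl.LightningModule):
    # Hypothetical sketch: move tensor inputs to whatever device
    # this module's parameters live on, then dispatch to forward().
    def __call__(self, *args, **kwargs):
        device = next(self.parameters()).device
        args = tuple(a.to(device) if torch.is_tensor(a) else a for a in args)
        kwargs = {k: (v.to(device) if torch.is_tensor(v) else v)
                  for k, v in kwargs.items()}
        return super().__call__(*args, **kwargs)
```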
Does it also mean that we should estimate whether the model fits on the available device? @tcwalther ^^
@tcwalther @JonathanSchmidt1 it is close to #1467, right?
Yes, I assume fixing this would also have fixed my problem, although a recent update of PyTorch Lightning already fixed it and I could remove my code.
Does PyTorch Lightning provide abstractions for inference? In particular, does it provide ways of automatically handling the transfer to/from the GPU when I call `model(x)`, or do I need to roll my own code for that?

Example Use Case
I have a use case where I train a model on slices of a sliding window of an audio spectrogram (i.e., let's say 1-second chunks). When training is finished, I'd like to see the performance of the model on an entire file. Pseudocode:
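The pseudocode itself did not survive extraction; the following is a minimal reconstruction from the surrounding description. `make_slices`, `slice_file`, and `plot` are hypothetical helpers, and `model` is assumed to be a LightningModule trained on the 1-second slices.

```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

# Training: the model only ever sees 1-second slices of the spectrogram.
X, Y = make_slices(training_files)  # hypothetical helper
trainer = pl.Trainer(gpus=1)
trainer.fit(model, DataLoader(TensorDataset(X, Y), batch_size=64))

# Inspection: run the model over every slice of one whole file,
# then stitch the predictions back together for plotting.
model.eval()
with torch.no_grad():
    predictions = [model(x) for x in slice_file(test_file)]  # hypothetical helper
plot(test_file, torch.cat(predictions))                      # hypothetical helper
```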
Notice that during training, the notion of a file is entirely gone, but when I plot my test file, I reintroduce it. Of course, in my real code, my training data `X, Y` is split into training, validation, and test sets, as usual. The plotting step is an additional verification; sort of like putting the pieces together.

Problem
When the model runs on the GPU, the last part of the code becomes:
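This snippet was likewise lost; going by the `.to(device)` and `.cpu()` calls mentioned below, it presumably looked roughly like this:

```python
device = torch.device("cuda")
model = model.to(device)
model.eval()
with torch.no_grad():
    predictions = [model(x.to(device)).cpu()        # manual transfer to the GPU...
                   for x in slice_file(test_file)]  # ...and back for plotting
plot(test_file, torch.cat(predictions))
```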
This isn't the end of the world, but it's not as nice as the other code that PyTorch Lightning helped me refactor. I also can't call `x.type_as(...)`, since in that loop I have no reference tensor that lives on the CPU/GPU that I could refer to (or maybe I can, but I haven't figured it out).

A workaround is to save the model and load it again on a CPU.
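A sketch of that workaround, assuming the LightningModule subclass is called `MyModel` and using a placeholder checkpoint path:

```python
# Save the trained weights, then reload the model on the CPU for inspection.
trainer.save_checkpoint("model.ckpt")
cpu_model = MyModel.load_from_checkpoint("model.ckpt", map_location="cpu")
cpu_model.eval()
with torch.no_grad():
    predictions = [cpu_model(x) for x in slice_file(test_file)]
plot(test_file, torch.cat(predictions))
```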
While this removes the noise of the `.to(device)` and `.cpu()` calls, it adds the overhead of having to save the model every time. I also still have to manually call `model.eval()`. The use case of running my model on an entire audio file is not for metrics but for visual inspection; as such, I only ever sample a few audio files, so running the model on a CPU instead of a GPU for inference isn't a problem.

Question
Is there a more elegant way to achieve the above?