PyTorch/XLA currently has no way to clear cached memory, i.e. nothing analogous to `torch.cuda.empty_cache()`. This is relevant for benchmarking a model on both PyTorch CUDA and XLA:CUDA, since memory that stays cached from one run distorts the measurements of the next.
For comparison, using PyTorch CUDA, we have:
```python
>>> import gc, torch
# Memory usage rises to 4G
>>> a = torch.rand(1024, 1024, 1024, device="cuda")
>>> del a
>>> gc.collect()
# Memory goes back to ~200M
>>> torch.cuda.empty_cache()
```
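The same behavior is visible from inside the process; here is a minimal sketch using the standard `torch.cuda` memory-stat APIs (the ~4G figure matches 1024³ float32 elements):

```python
import gc
import torch

a = torch.rand(1024, 1024, 1024, device="cuda")   # 1024**3 float32 ≈ 4 GiB
print(torch.cuda.memory_allocated())  # bytes held by live tensors: ~4 GiB
print(torch.cuda.memory_reserved())   # bytes held by the caching allocator

del a
gc.collect()
print(torch.cuda.memory_allocated())  # ~0: the tensor is gone
print(torch.cuda.memory_reserved())   # still ~4 GiB: the cache is kept

torch.cuda.empty_cache()
print(torch.cuda.memory_reserved())   # drops: cached blocks returned to the driver
```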
In contrast, using XLA:CUDA with `PJRT_ALLOCATOR_PREALLOCATE=false`, we have:
```python
>>> import gc, torch
>>> import torch_xla.core.xla_model as xm
# Memory usage rises to 4G
>>> a = torch.rand(1024, 1024, 1024, device=xm.xla_device())
>>> a.cpu()  # force the lazy tensor to materialize on device
>>> del a
>>> gc.collect()
# Memory usage never goes back...
```
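There is also no good way to confirm the allocator state from Python on the XLA side. A sketch of the closest thing available, assuming `xm.get_memory_info` returns meaningful numbers for XLA:CUDA (it is primarily used with TPU, so this is an assumption):

```python
import gc
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
a = torch.rand(1024, 1024, 1024, device=device)
a.cpu()  # force execution so device memory is actually allocated
print(xm.get_memory_info(device))  # e.g. {'kb_free': ..., 'kb_total': ...}

del a
gc.collect()
# kb_free does not recover: the allocator keeps the memory cached and
# there is no empty_cache() equivalent to hand it back.
print(xm.get_memory_info(device))
```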
Beyond that, it would also be nice to have an API call for releasing the pre-allocated memory.
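Until such an API exists, the only workaround for benchmarking is to restart the process per configuration and steer the allocator through environment variables set before torch_xla initializes the device; a minimal sketch, where `PJRT_ALLOCATOR_FRACTION` is my assumption and only `PJRT_ALLOCATOR_PREALLOCATE` is confirmed above:

```python
import os

# Must be set before torch_xla creates the PJRT client.
os.environ["PJRT_ALLOCATOR_PREALLOCATE"] = "false"  # skip up-front preallocation
os.environ["PJRT_ALLOCATOR_FRACTION"] = "0.5"       # assumed knob, not confirmed

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # allocator picks up the settings at first use
```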
cc @miladm @JackCaoG