PyTorch/XLA currently has no way to clear cached memory, i.e. nothing analogous to `torch.cuda.empty_cache()`. This is relevant for benchmarking a model on both PyTorch CUDA and XLA:CUDA, since memory that stays cached from one run distorts the measurements of the next.
For comparison, using PyTorch CUDA, we have:
```python
>>> import gc, torch
# Memory usage rises to 4G
>>> a = torch.rand(1024, 1024, 1024, device="cuda")
>>> del a
>>> gc.collect()
# Memory goes back to ~200M
>>> torch.cuda.empty_cache()
```
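The same behavior is visible from inside the process; here is a minimal sketch using the standard `torch.cuda` memory-stat APIs (the ~4G figure matches 1024³ float32 elements):

```python
import gc
import torch

a = torch.rand(1024, 1024, 1024, device="cuda")   # 1024**3 float32 ≈ 4 GiB
print(torch.cuda.memory_allocated())  # bytes held by live tensors: ~4 GiB
print(torch.cuda.memory_reserved())   # bytes held by the caching allocator

del a
gc.collect()
print(torch.cuda.memory_allocated())  # ~0: the tensor is gone
print(torch.cuda.memory_reserved())   # still ~4 GiB: the cache is kept

torch.cuda.empty_cache()
print(torch.cuda.memory_reserved())   # drops: cached blocks returned to the driver
```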
In contrast, using XLA:CUDA with `PJRT_ALLOCATOR_PREALLOCATE=false`, we have:
```python
>>> import gc, torch
>>> import torch_xla.core.xla_model as xm
# Memory usage rises to 4G
>>> a = torch.rand(1024, 1024, 1024, device=xm.xla_device())
>>> a.cpu()  # force the lazy tensor to materialize on device
>>> del a
>>> gc.collect()
# Memory usage never goes back...
```
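There is also no good way to confirm the allocator state from Python on the XLA side. A sketch of the closest thing available, assuming `xm.get_memory_info` returns meaningful numbers for XLA:CUDA (it is primarily used with TPU, so this is an assumption):

```python
import gc
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
a = torch.rand(1024, 1024, 1024, device=device)
a.cpu()  # force execution so device memory is actually allocated
print(xm.get_memory_info(device))  # e.g. {'kb_free': ..., 'kb_total': ...}

del a
gc.collect()
# kb_free does not recover: the allocator keeps the memory cached and
# there is no empty_cache() equivalent to hand it back.
print(xm.get_memory_info(device))
```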
Beyond that, it would also be nice to have an API call for releasing the pre-allocated memory.
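Until such an API exists, the only workaround for benchmarking is to restart the process per configuration and steer the allocator through environment variables set before torch_xla initializes the device; a minimal sketch, where `PJRT_ALLOCATOR_FRACTION` is my assumption and only `PJRT_ALLOCATOR_PREALLOCATE` is confirmed above:

```python
import os

# Must be set before torch_xla creates the PJRT client.
os.environ["PJRT_ALLOCATOR_PREALLOCATE"] = "false"  # skip up-front preallocation
os.environ["PJRT_ALLOCATOR_FRACTION"] = "0.5"       # assumed knob, not confirmed

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # allocator picks up the settings at first use
```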
cc @miladm @JackCaoG