Memory leak using LSUN dataset #619
I believe the LSUN dataset leaks memory. I expect the memory usage of a process that simply iterates over a dataset with a DataLoader to stay constant, and that is the case with FakeData and CIFAR10. With LSUN, however, the memory usage grows steadily. This caused problems while training a GAN, because I always ran out of host memory at some point during training.

I created the following script to reproduce the issue (a sketch of it appears after the output listings below); uncomment the different datasets to test them.

Output for LSUN:

Output for CIFAR10:

Output for FakeData:
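The script itself is not preserved in this extract. A minimal sketch of the kind of reproduction loop described above, assuming local dataset paths, an LSUN `bedroom_train` split, and a Linux-only RSS helper (all of these are assumptions, not the author's actual code):

```python
import os

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def rss_mib():
    # Current resident set size, read from /proc/self/statm (Linux-specific).
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 2**20

transform = transforms.Compose([
    transforms.Resize(64),
    transforms.CenterCrop(64),
    transforms.ToTensor(),
])

# Uncomment one dataset at a time to compare memory behaviour.
dataset = datasets.LSUN("./lsun", classes=["bedroom_train"], transform=transform)
# dataset = datasets.CIFAR10("./cifar10", download=True, transform=transform)
# dataset = datasets.FakeData(size=100_000, transform=transform)

loader = DataLoader(dataset, batch_size=128, num_workers=0)

for epoch in range(10):
    for i, _batch in enumerate(loader):
        if i % 100 == 0:
            print(f"epoch {epoch} batch {i}: RSS {rss_mib():.0f} MiB")
```

With FakeData and CIFAR10 the printed RSS should plateau after the first epoch, while with LSUN it reportedly keeps climbing.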
Comments

After some research, I think this is due to the way LMDB manages memory: https://lmdb.readthedocs.io/en/release/#memory-usage (LMDB memory-maps the database, so pages read from disk stay mapped and count toward the process's resident set). However, this is still a problem for me. We use a resource management system (SLURM) that reserves RAM for jobs and kills them when they exceed their limit. If the apparent LMDB memory usage just keeps growing, I cannot use it unless I reserve essentially all available memory, thereby blocking other users from submitting jobs. Can anybody with LMDB knowledge help?
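One way to test that explanation (an editorial sketch, not from the thread; Linux-only, and the `data.mdb` path fragment is an assumption): sum the Rss and clean/dirty page counts over the LMDB file's mappings in /proc/self/smaps. If the resident pages are almost all clean, the growth is reclaimable page cache rather than a true leak, even though RSS-based job accounting may still count it against the limit.

```python
import re

def lmdb_mapping_stats(path_fragment="data.mdb"):
    """Sum Rss, clean, and dirty sizes (kB) over mappings of the LMDB data file."""
    rss = clean = dirty = 0
    in_target = False
    with open("/proc/self/smaps") as f:
        for line in f:
            # Lines like "7f2c...-7f2d... r--s ... /path/data.mdb" start a new mapping.
            if re.match(r"^[0-9a-f]+-[0-9a-f]+\s", line):
                in_target = path_fragment in line
            elif in_target:
                key, _, rest = line.partition(":")
                if key in ("Rss", "Shared_Clean", "Private_Clean",
                           "Shared_Dirty", "Private_Dirty"):
                    kb = int(rest.split()[0])
                    if key == "Rss":
                        rss += kb
                    elif key.endswith("Clean"):
                        clean += kb
                    else:
                        dirty += kb
    return rss, clean, dirty

# Call periodically while iterating the dataset, e.g.:
# rss, clean, dirty = lmdb_mapping_stats()
# print(f"LMDB mappings: {rss} kB resident ({clean} kB clean, {dirty} kB dirty)")
```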
I looked into this a long time ago, back in 2015. There isn't a way (AFAIK) to make the kernel reclaim "clean" pages from a particular process either.
That's very unfortunate. Thanks for letting me know.
@soumith Hi, do you have any recommended database for PyTorch?