My object detector is running out of CPU memory after a few iterations.
I traced the issue down to the fact that I use varying aspect ratios at the input.
(My dataset has varying aspect ratios and I want to train the network to handle them properly. Since my training is distributed, I use a batch size of 1, which makes this possible.)
The memory keeps increasing on the forward pass. Training with varying sizes but a constant aspect ratio, or putting the model in evaluation mode, solves the issue (i.e. constant memory at 1171 MB). After a few iterations the memory oscillates but still trends upwards.
I'm not blocked by this, but I still wanted to know whether this is expected behavior and whether there is a quick fix for it.
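The code below illustrates the issue. It is a minimal sketch along the lines of my training loop: the `fasterrcnn_resnet50_fpn` model, the dummy boxes/labels, and the `psutil` RSS readout stand in for my actual setup.

```python
import psutil
import torch
import torchvision

# Any torchvision detection model shows the behavior for me;
# Faster R-CNN here stands in for my actual detector.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
model.train()  # the memory growth only shows up in training mode

process = psutil.Process()

for i in range(10):
    # batch size 1, with a different aspect ratio every iteration
    h = 600 + 32 * (i % 5)
    w = 1000 - 32 * (i % 7)
    images = [torch.rand(3, h, w)]
    targets = [{
        "boxes": torch.tensor([[10.0, 10.0, 100.0, 100.0]]),
        "labels": torch.tensor([1], dtype=torch.int64),
    }]

    loss_dict = model(images, targets)
    sum(loss for loss in loss_dict.values()).backward()

    # resident set size in MB
    print("Current memory:", process.memory_info().rss / 1024 ** 2)
```

Output: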
Current memory: 1301.67578125
Current memory: 2059.98046875
Current memory: 2895.0234375
Current memory: 3668.140625
Current memory: 3556.203125
Current memory: 4434.28515625
Current memory: 4428.390625
Current memory: 4432.43359375
Current memory: 3522.76953125
Current memory: 3351.46484375
Environment
Output of collect_env.py
Collecting environment information...
PyTorch version: 1.3.1
Is debug build: No
CUDA used to build PyTorch: None
OS: Mac OSX 10.14.6
GCC version: Could not collect
CMake version: Could not collect
Python version: 3.7
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Versions of relevant libraries:
[pip3] numpy==1.17.4
[pip3] torch==1.3.1
[pip3] torchvision==0.4.2
Thank you for pytorch!