Out of memory when running the example #174

Closed
nuptwuchen opened this issue Dec 28, 2017 · 4 comments

Comments

@nuptwuchen

model [CycleGANModel] was created
create web directory ./checkpoints/maps_cyclegan/web...
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1513363039688/work/torch/lib/
Traceback (most recent call last):
File "train.py", line 27, in <module>
model.optimize_parameters()
File "/home/wuchen/Downloads/pytorch-CycleGAN-and-pix2pix/models/cycle_gan_mod
self.backward_G()
File "/home/wuchen/Downloads/pytorch-CycleGAN-and-pix2pix/models/cycle_gan_mod
rec_B = self.netG_A(fake_A)
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/modu
result = self.forward(*input, **kwargs)
File "/home/wuchen/Downloads/pytorch-CycleGAN-and-pix2pix/models/networks.py",
return nn.parallel.data_parallel(self.model, input, self.gpu_ids)
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/dat
return module(*inputs[0], **module_kwargs[0])
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/modu
result = self.forward(*input, **kwargs)
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/cont
input = module(input)
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/modu
result = self.forward(*input, **kwargs)
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv
self.padding, self.dilation, self.groups)
File "/home/wuchen/anaconda2/lib/python2.7/site-packages/torch/nn/functional.p
return f(input, weight, bias)
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytcu:58

@ssnl
Collaborator

ssnl commented Jan 2, 2018

Unfortunately, your GPU ran out of memory. IIRC, CycleGAN with batch size 1 at 256x256 resolution needs 3-5 GB of GPU memory with cuDNN. Please check that you have cuDNN installed, and consider reducing the batch size and/or resolution.

Btw, which PyTorch version are you using?
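
Both questions above (the PyTorch version and whether cuDNN is usable) can be answered with a short script. This is only a minimal sketch, assuming a reasonably recent PyTorch where `torch.backends.cudnn.is_available()` and `torch.backends.cudnn.version()` exist; the helper name `describe_environment` is made up for illustration and is not part of this repository.

```python
def describe_environment():
    """Return the PyTorch version and CUDA/cuDNN availability.

    Hypothetical helper for illustration; returns {"torch": None}
    when PyTorch is not installed at all.
    """
    try:
        import torch
    except ImportError:
        return {"torch": None}
    return {
        "torch": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cudnn_available": torch.backends.cudnn.is_available(),
        # version() returns None when cuDNN is not available
        "cudnn_version": torch.backends.cudnn.version(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

Pasting the printed values into the issue makes it easier to tell whether a missing cuDNN or simply a small GPU is the likely cause.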

@junyanz
Owner

junyanz commented Jan 14, 2018

It takes 3.8 GB on my GTX 1080. You probably need a larger GPU, or you can train models at a lower resolution (128p).
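
To see why a lower resolution helps, here is a rough back-of-the-envelope sketch of how the memory for a single float32 feature map scales with image size. The 64-channel layer is an assumed example, not a measurement of this model, and the real footprint also includes weights, optimizer state, and the CUDA context, which do not shrink with resolution.

```python
def activation_bytes(batch, channels, height, width, bytes_per_element=4):
    """Memory for one dense feature map (float32 by default)."""
    return batch * channels * height * width * bytes_per_element


# Assumed example: one 64-channel feature map at full and half resolution.
full = activation_bytes(1, 64, 256, 256)
half = activation_bytes(1, 64, 128, 128)

print(full, half, full / half)  # 16777216 4194304 4.0
```

Halving the side length cuts each activation to a quarter of its size, so the activation share of memory drops by about 4x, while the overall saving will be somewhat smaller.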

@lxj616

lxj616 commented Mar 5, 2018

For anyone who searched for 'out of memory' and ended up here:

  1. At 256x256 resolution it needs exactly 4010 MB on a GTX 745 (usage grows from 3800 MB to 4010 MB over one epoch, then stays flat). If you only have a 4 GB GPU, try shutting down your X server so that the full 4 GB of GPU memory is available for training.
  2. A weird thing is that if you use the CPU instead of the GPU, it takes 20 GB+ of memory to run at 256x256. I don't know what's wrong.

So... I shut down my X server and gave the full 4 GB of the GTX 745 to training at 256x256. I cannot go any further without a better GPU : (
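
To check how much GPU memory is already taken (for example by the X server) before training, something like the following sketch may help. It assumes `nvidia-smi` is on the PATH and supports the `--query-gpu` option; the function names are made up for illustration, and some GPUs report `[Not Supported]` for these fields, which this simple parser does not handle.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' lines (MiB, no header, no units) into tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every GPU, in MiB."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_memory_csv(output)


# Made-up sample line, only to show the parsed shape:
print(parse_memory_csv("3800, 4096\n"))  # [(3800, 4096)]
```

Running `query_gpu_memory()` once with the X server up and once with it stopped shows how much memory the desktop was holding.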

@mrluker

mrluker commented Apr 10, 2018

@lxj616 When you use CPU mode, it cannot perform as many calculations in parallel as a GPU can, and thus has to hold more in memory.
