multi gpu error #23
Hi @yja1
I deleted that when using multiple GPUs.
This is my code. You may refer to it and modify your own code accordingly.
model = torch.nn.DataParallel(model, device_ids=gpu_ids).cuda()
ignored_params = list(map(id, model.module.model.fc.parameters())) + list(map(id, model.module.classifier.parameters()))
base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
# Observe that all parameters are being optimized
optimizer_ft = optim.SGD([
    {'params': base_params, 'lr': 0.01},
    {'params': model.module.model.fc.parameters(), 'lr': 0.1},
    {'params': model.module.classifier.parameters(), 'lr': 0.1}
], momentum=0.9, weight_decay=5e-4, nesterov=True)
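For readers who want to try the pattern outside the repository, here is a minimal self-contained sketch of the same idea: wrap the model in `DataParallel`, reach the real network through `.module`, and give the head layers a higher learning rate than the rest. `TinyNet` and `build_optimizer` are hypothetical names invented for this illustration and are not part of the project's `train.py`; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.optim as optim


class TinyNet(nn.Module):
    """Hypothetical stand-in for the repo's model: a backbone plus two heads."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)
        self.fc = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, 2)

    def forward(self, x):
        return self.classifier(self.fc(self.backbone(x)))


def build_optimizer(model):
    # DataParallel stores the wrapped network under .module; unwrap it if present
    # so the same function works with and without multi-GPU wrapping.
    net = model.module if isinstance(model, nn.DataParallel) else model
    head_params = list(net.fc.parameters()) + list(net.classifier.parameters())
    head_ids = {id(p) for p in head_params}
    # Everything that is not a head parameter gets the lower learning rate.
    base_params = [p for p in net.parameters() if id(p) not in head_ids]
    return optim.SGD(
        [
            {"params": base_params, "lr": 0.01},
            {"params": head_params, "lr": 0.1},
        ],
        momentum=0.9,
        weight_decay=5e-4,
        nesterov=True,
    )


model = TinyNet()
if torch.cuda.device_count() > 1:
    # Only wrap when more than one GPU is visible to this process.
    model = nn.DataParallel(model).cuda()
optimizer = build_optimizer(model)
group_sizes = [len(g["params"]) for g in optimizer.param_groups]
```

Unwrapping once inside a helper avoids scattering `model.module` through the training script, so the code runs unchanged on a single GPU or on CPU.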
I looked at your code above and changed train.py from line 303 onward to this:
Hi @xujian0
model = torch.nn.DataParallel(model, device_ids=gpu_ids).cuda()
Thanks for your reply, it solved my problem!
CUDA_VISIBLE_DEVICES=6,7 python train.py --PCB --batchsize 60 --name PCB-64 --train_all
but I get an error in forward:
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/THC/THCTensorCopy.cu:204