Is this the GPU-cluster version of Caffe? #9
Comments
Yes, it can.
Thanks
Can you provide installation instructions for OpenMPI?
Can you give a code snippet in Python showing how to set multiple GPU devices? I find that caffe::SetDevice accepts only a single integer.
Multi-GPU configuration is done through the command line. The Python interface cannot set multiple GPU devices.
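For illustration, a minimal sketch of the command-line launch this reply refers to; the binary path and solver filename are placeholders, and, per the discussion later in this thread, the per-process GPU IDs are listed in the solver settings rather than set from Python:

```sh
# Hedged sketch: start two MPI processes, one per GPU; each process reads its
# device ID from the solver settings. Paths and filenames are assumptions.
mpirun -np 2 ./build/install/bin/caffe train --solver=solver.prototxt
```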
Hi @zimenglan-sysu-512, to install OpenMPI, please see https://www.open-mpi.org/faq/?category=building#easy-build. In the configure step, add the --enable-mpi-thread-multiple option (the flag referenced later in this thread) for optimal performance.
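A minimal sketch of that build, following the linked "easy build" FAQ; the version placeholder and install prefix are assumptions:

```sh
# Unpack an OpenMPI source release, then configure with the thread-multiple
# option mentioned above and install.
tar xf openmpi-<version>.tar.gz
cd openmpi-<version>
./configure --prefix=/usr/local/openmpi --enable-mpi-thread-multiple
make -j4
sudo make install
```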
Excellent work!
May I ask how this data-parallel approach differs, in implementation and efficiency, from the one in Caffe master? (http://caffe.berkeleyvision.org/tutorial/interfaces.html: "Parallelism: the -gpu flag to the caffe tool can take a comma separated list of IDs to run on multiple GPUs. A solver and net will be instantiated for each GPU so the batch size is effectively multiplied by the number of GPUs. To reproduce single GPU training, reduce the batch size in the network definition accordingly.")
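For reference, the stock usage the quoted documentation describes (the solver filename is a placeholder):

```sh
# BVLC Caffe master: one solver and net is instantiated per listed GPU, so the
# effective batch size is the prototxt batch size times the number of GPUs.
./build/tools/caffe train --solver=solver.prototxt --gpu 0,1
```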
@sunnyxiaohu
@yjxiong I have encountered "unrecognized options: --enable-mpi-thread-multiple". How can I solve it?
@pkuCactus
May I ask: if the cluster uses Intel MPI, can this be used? I would like to switch to OpenMPI, but I do not know exactly how, because CMake always auto-detects Intel MPI, and my edits in ccmake may be wrong.
@zzy123abc Intel MPI is not tested. You can manually modify the cache variables (search for the MPI entries in ccmake) to point the build at OpenMPI.
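A hedged sketch of setting those cache variables from the command line instead of ccmake; the OpenMPI install prefix is an assumption, while MPI_C_COMPILER and MPI_CXX_COMPILER are the standard FindMPI cache entries:

```sh
# Point CMake's MPI detection at the OpenMPI wrappers instead of Intel MPI.
cmake .. \
  -DMPI_C_COMPILER=/usr/local/openmpi/bin/mpicc \
  -DMPI_CXX_COMPILER=/usr/local/openmpi/bin/mpicxx
```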
Thanks. May I ask whether you tested single-node multi-GPU or multi-node multi-GPU? As someone asked above: how can gpu0 and gpu1 on node gpu02 work together with gpu0 and gpu1 on node gpu03? Writing 0,1,2,3 in the solver settings does not seem to work; can I change it to 0,1,0,1?
Yes. Just as you said, [0, 1, 0, 1] works.
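A hedged sketch of such a two-node launch; the hostnames gpu02 and gpu03 come from the question above, while the binary path, solver filename, and host list syntax are assumptions:

```sh
# Four MPI ranks, two per node; each rank picks its GPU from the solver's
# [0, 1, 0, 1] device list.
mpirun -np 4 -host gpu02,gpu02,gpu03,gpu03 \
  ./build/install/bin/caffe train --solver=solver.prototxt
```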
Hi @yjxiong, I ran into a problem where MPI mode is disabled. I have one PC with multiple GPUs. As you can see in the log above, the program runs in non-parallel mode. The problem should be that ...
I fixed the problem.
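Since the fix itself is not spelled out above, here is a hedged guess, assuming this fork exposes a USE_MPI switch in its CMake configuration (an assumption, not confirmed in this thread): rebuild with MPI support enabled so the tool stops falling back to non-parallel mode.

```sh
# Reconfigure and rebuild with MPI enabled (USE_MPI is an assumed option name).
cmake .. -DUSE_MPI=ON
make -j8
```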
Can this synchronized batch norm be used in the one-device-multi-GPU setting?
Can this version of Caffe run on a GPU cluster? If so, are there any requirements for the cluster? Thanks.