Advice on Multi-GPU support? #121
Comments
Thanks for the effort! You will first need to dump the dataset to some TFRecord file; tf.slim has great support for multi-GPU training. I have been trying to do this for a long time but haven't really gotten to it yet.
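For reference, the TFRecord-dumping step mentioned above usually looks something like this minimal TF 1.x sketch; the `samples` list and the feature keys are placeholders, not this repository's actual format:

```python
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

# `samples` stands in for whatever loader yields (encoded_image, label) pairs
samples = [(b'\x00fake_jpeg_bytes', 1), (b'\x00fake_jpeg_bytes', 0)]

with tf.python_io.TFRecordWriter('train.tfrecord') as writer:
    for image_bytes, label in samples:
        # serialize each sample as a tf.train.Example protobuf record
        example = tf.train.Example(features=tf.train.Features(feature={
            'image/encoded': _bytes_feature(image_bytes),
            'image/label': _int64_feature(label),
        }))
        writer.write(example.SerializeToString())
```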
I see.
Can you give some suggestions on how to use tf.slim to implement a multi-GPU version based on this branch? It seems tricky because your network is defined in a class...
It seems that ...
I think tf.py_func may be a bottleneck. But I am not sure whether it supports multi-GPU.
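For context, tf.py_func splices a plain Python function into the TF 1.x graph; a minimal sketch of the pattern follows, with `py_nms` as a stand-in for whatever CPU-side routine the code actually wraps. The wrapped function runs in the Python interpreter, serialized by the GIL and pinned to the host, which is why it can become a bottleneck when several GPU towers all call it:

```python
import numpy as np
import tensorflow as tf

def py_nms(boxes, scores, thresh):
    # placeholder body: keep every box; a real NMS would filter by IoU
    return np.arange(boxes.shape[0], dtype=np.int32)

boxes = tf.placeholder(tf.float32, [None, 4])
scores = tf.placeholder(tf.float32, [None])
# py_nms receives NumPy arrays and must return the declared dtype (int32)
keep = tf.py_func(py_nms, [boxes, scores, tf.constant(0.7)], tf.int32)
```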
So has anyone implemented a version that supports multi-GPU?
...so why is py_func a bottleneck? What is the problem?
Are your GPUs the same type?
I recently wrote one with multi-GPU support.
Wow, thanks so much @ppwwyyxx! This looks amazing! Closing this.
It seems like the errors are caused by the nms() used in tf.py_func. When I changed it to py_nms, the errors were resolved. However, the running time increased a lot.
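The py_nms referred to here is presumably a pure-NumPy NMS along the lines of the classic py_cpu_nms from the Fast R-CNN codebase; a sketch of that algorithm is below. It is quadratic in the number of boxes in the worst case, which is consistent with the slowdown reported versus a compiled NMS:

```python
import numpy as np

def py_nms(dets, thresh):
    # dets: (N, 5) array of [x1, y1, x2, y2, score]
    x1, y1, x2, y2 = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3]
    scores = dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # highest-scoring box first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of box i with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # discard boxes that overlap box i above the threshold
        order = order[np.where(iou <= thresh)[0] + 1]
    return keep
```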
Hi Ender, thanks for your work!
There have been some requests for multi-GPU support (e.g. #51). I am now trying to write a multi-GPU version based on your code.
However, after looking into the code, it seems that the current structure does not support multi-GPU well. For example, if I modify train_val.py in this way:
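A rough sketch of what such a per-GPU tower loop might look like in plain TF 1.x; the `Network` class here is a minimal hypothetical stand-in for the repo's class-based network, not its actual API. The sketch also shows the usual workaround for the clash described below: build one instance per tower with shared variables, rather than reusing a single instance whose `self.image` would be overwritten on every tower:

```python
import tensorflow as tf

class Network(object):
    """Minimal stand-in for the repo's class-based network (hypothetical)."""
    def create_architecture(self, mode):
        # each instance owns its own input placeholder; a single shared
        # instance would overwrite self.image on every tower
        self.image = tf.placeholder(tf.float32, [1, None, None, 3])
        conv = tf.layers.conv2d(self.image, 8, 3, name='conv1')
        return tf.reduce_mean(conv)  # dummy loss for the sketch

num_gpus = 2  # assumed
optimizer = tf.train.MomentumOptimizer(learning_rate=0.001, momentum=0.9)
tower_grads = []
for gpu_id in range(num_gpus):
    with tf.device('/gpu:%d' % gpu_id):
        # share weights across towers, but build a fresh tower per GPU
        with tf.variable_scope(tf.get_variable_scope(), reuse=(gpu_id > 0)):
            loss = Network().create_architecture('TRAIN')
            tower_grads.append(optimizer.compute_gradients(loss))
# the per-tower gradients would then be averaged and applied once
```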
It cannot work, because the network class has only one "self.image", so an error will be thrown.
Can you give any advice on how to implement a multi-GPU version of this code?
Many thanks.