running on multiple cores? #18
There are basically two important loops that should be straightforward to parallelize: https://github.com/lvdmaaten/bhtsne/blob/master/sptree.cpp#L385 The first loop is embarrassingly parallel, so it should be trivial. The second loop may be trickier because different iterations may access the same nodes of the tree, so it is less predictable what speed-up you can get there.
@lvdmaaten thanks for the info. I'll look a bit deeper into the loops and play around with it as soon as I find some spare time for it :)
I have an OpenMP-based version here: https://github.com/maximsch2/bhtsne. I don't like the binary file interface, so I'm also modifying it to build as a shared library and expose a simple C API.
I'm no OpenMP expert, but it looks good to me. What kind of speed-ups are you seeing compared to the non-OpenMP code?
I haven't benchmarked it on big datasets yet, but I think I get around 1.3-1.5x on two cores on a smallish dataset (takes a couple of seconds to build).
Nice! In an earlier version of the code, I had hardcoded the output dimension. Indeed, that was a lot faster than the version that is currently in the repo.
@maximsch2
I don't have easy access to a Windows machine, but I don't think there is anything Unix-specific that I've added there. I think MinGW supports OpenMP, so you should be able to build it using gcc just like you would on Linux. EDIT: I've just checked and apparently there is even a description of how to build it on Windows. I haven't updated the EDIT2: I've pushed a version with updated
@lvdmaaten
This is the total time, including reading a file, building a tree (currently not parallelized; takes around 25 seconds in this case), and doing 1000 iterations of embedding learning.
Thanks, seems to be working :)
Nice!
I was able to get this version working on Windows, but the multicore version by maximsch2 just returns with a "non-zero return code" pretty quickly. It doesn't give any more information even with verbose=True.
@baobabKoodaa If you want, I can try running your script/data here on Linux to see if this is a Windows-specific issue.
@maximsch2 Thank you for the idea and for the kind offer. I will try running it on a Linux machine at some point; for now I can make do with the single-core version.
Since it hasn't been mentioned yet, see https://github.com/DmitryUlyanov/Multicore-TSNE
see: lvdmaaten/bhtsne#18. These changes are inspired by https://github.com/maximsch2/bhtsne. This approach was selected as it requires minimal changes to parallelize the algorithm. In particular, these changes correspond roughly to the changes in maximsch2/bhtsne@08d8a2a
Hey,
this is more a question than an actual issue: I'm mapping a dataset with 32 dimensions x 900,000 items with t-SNE on a multi-core machine, but as t-SNE is single-threaded, I'm only using one core. Do you have any tips or tricks for how I can split the dataset to parallelize the computation?
Thanks in advance!