Add nmslib indexer #2417
Thanks @masa3141! Ping @searchivarius – would you have capacity to review this PR (choice of NMSLIB APIs, default parameters, etc.)? Cheers.
@piskvorky yes, will do soon. Thanks for letting me know!
Ping @searchivarius
Please bear with me. What seems to be a serious bug was reported recently, and I need to fix it first.
I see. Thank you!
@searchivarius
@masa3141 sorry for the delay. I still haven't resolved the issue with the recently reported NMSLIB bug. In fact, knnQuery might not work, though I don't yet know why; knnQueryBatch seems to be working fine.
Thank you for your support in your busy time. Please review at your convenience.
Ok, I have started working on this. Should be ready soon!
Hi @masa3141, there does indeed seem to be an issue with knnQuery. Could you slightly change your code so that it uses knnQueryBatch instead? I am going to fix knnQuery very soon, but it will take time, and the older NMSLIB versions will still be broken. Please make the query matrix a 2-dimensional numpy array; an example is here: Thank you!
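For readers following along, the change requested above might look roughly like the sketch below. It is not the code from this PR: the helper name knn_query_via_batch is hypothetical, and casting to a C-contiguous float32 array is an assumption about what a dense NMSLIB index expects. The only NMSLIB call used is knnQueryBatch, which returns one (ids, distances) pair per query row.

```python
import numpy as np


def knn_query_via_batch(index, vector, k=10):
    """Query a single vector through knnQueryBatch instead of knnQuery.

    `index` is assumed to be an already built nmslib index (or any object
    exposing a compatible knnQueryBatch method).
    """
    # Turn the 1-D vector into a 2-D matrix with a single row.
    # float32 and C-contiguous layout are assumptions, not taken from the PR.
    queries = np.ascontiguousarray(vector, dtype=np.float32).reshape(1, -1)
    # knnQueryBatch returns a list with one (ids, distances) pair per row.
    ids, distances = index.knnQueryBatch(queries, k=k)[0]
    return list(zip(ids, distances))
```

With a real index, this would be called as knn_query_via_batch(index, some_vector, k=5) after the index has been built with createIndex.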
Hi @searchivarius, thank you for the review!
Did the test fail due to a MemoryError?
@masa3141 no, it's related to using the wrong memory layout for numpy array data. I am quite surprised it wasn't caught before.
How do I fix it? Does it happen only in Python 2.7?
@masa3141 ohhh, I missed that you had already changed to knnQueryBatch. I will review shortly.
Hi @masa3141 and @piskvorky, I have reviewed the querying part. It looks fine to me, many thanks! It would be nice, though, to double-check that everything works once things are merged.
@searchivarius Thank you for the review!
No idea. CC @mpenkov.
I changed the CI to install nmslib only when the Python version is 3.0 or later.
@piskvorky @mpenkov
@masa3141 Not in the immediate future. Let us work out why Appveyor is failing. I'll get back to you once that's done.
@mpenkov Thanks for fixing the Appveyor issue! When will this feature be merged?
Thank you for your patience. I finally got around to reviewing this. Please have a look at the comments.
@mpenkov
Hmm, looks like a dependency problem:
Interesting. It didn't happen before; it only started today.
OK, leave it with me. Once I fix this, I will ping you.
Picked up one more thing. While I'm messing with Appveyor, please use underscores instead of camel case in test method names.
I fixed a couple myself, leaving the rest up to you.
I found that many other test method names also use camel case (e.g. in test_nmf.py and test_sklearn_api.py).
Yes, we should, but those files are out of scope for this PR. Let's handle the nmslib test cases only for now, and leave the rest for another PR.
Upgrade pip
I see. I changed them to use underscores.
Opened a new ticket with support: https://help.appveyor.com/discussions/problems/24208-unable-to-use-the-latest-version-of-pip-for-my-tox-builds
@mpenkov
@masa3141 Finally merged. Thank you for your contribution and your patience!
Thanks a lot!
Hurray, thanks everybody!
@mpenkov Hi, does the 3.8.0 release not contain this nmslib feature? The release notes don't mention it. Thanks
@masa3141 Thank you for pointing it out. Looks like our changelog generation script missed the nmslib PR. The indexer itself is definitely there: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/similarities/nmslib.py I will update the changelog.
Hi, I added an nmslib indexer.
Some benchmarks suggest that nmslib performs better than the annoy indexer:
https://erikbern.com/2018/06/17/new-approximate-nearest-neighbor-benchmarks.html
https://www.benfrederickson.com/approximate-nearest-neighbours-for-recommender-systems/
This is my first contribution to gensim. If I missed something, please let me know.