Remove NMSLIB dependency #473
Comments
Hi @nanthony007, replacing nmslib with another approximate nearest neighbor search library is certainly doable, but is a bit more involved than you might realize. The candidate generator ( scispacy/scispacy/candidate_generation.py Line 148 in 4f9ba09
scispacy/scispacy/candidate_generation.py Line 365 in 4f9ba09
That being said, I have recently installed nmslib successfully on Windows Subsystem for Linux with Python 3.10. Python 3.11 likely does not work, as you say. |
Ideally, I wanted to include scispacy as a dependency of a package that gives more novice programmers simple access to biomedical NER. Using WSL and/or navigating dependency versions (Python, scispacy, etc.) seems like mental overhead I want to avoid. Is there a way this model could be retrained using spaCy's new entity linker itself? Could that accomplish the same NEL while benefiting from scispacy's models? |
I wonder if annoy could be a good fit for an alternative ANN index? |
Please see #481 |
Closing due to no clear direction forward... |
@nanthony007 I was able to build scispacy for Python 3.11 by using the latest pybind11 (2.10.4) and building nmslib from the master branch, e.g.:
( |
Thanks @phaeta ! Could you share what OS you are on? |
@dakinggg macOS Ventura |
Unfortunately I am unable to replicate this. Copying your git install command resulted in git not finding the revision. Upon removing the trailing "/" pip attempts to build the wheels and install but fails during the Clang build. @phaeta are you on M1 or Intel? Are you using conda python? The build errors I am getting appear to be around SIMD and Scalars...
|
@nanthony007 Try this: Regarding architecture, I'm using an Intel Mac. I'm using python@3.11 from Homebrew. Also the master-branch nmslib build works for me on Linux (Ubuntu 20.04 (aarch64), Python 3.11 built from source). I'll play around with this in a container and put together a Dockerfile. Also, I have access to an M1 Mac Mini; I'll try things there too. Stay tuned |
Okay thanks! That command also does not work so maybe it's something with M1? My main concern is M1 and Windows 11 support since I think most students will likely be on those platforms. |
I can confirm that this works for Windows 11 and Python 3.11. |
Unfortunately, this solution isn't working for my Intel machine. I'm running Debian 12 with Python 3.11. Has anyone tested this on Linux?
|
hey @nanthony007 and @umayerr, could you try installing with mamba as per #520 (comment)? I'm looking to see if it works for others. |
Unfortunately I'm not in a position to let one package dictate my package manager selection, so mamba will just be a pass/no for me and my use case. Thanks for the follow-up on this though. I've resorted to just isolating scispacy processes into completely separate VMs from the other services. |
I'm not sure if this would be possible or what alternatives may even exist, BUT, due to years of inactivity and unresponsiveness on the primary nmslib maintainer's side (not faulting him), the nmslib dependency makes scispacy very inaccessible to new users and, in fact, it will remain completely inaccessible to users on new operating systems (Windows 11) or running modern versions of Python (3.11).
Are there any possible alternatives for the few lines of code where this package uses nmslib?
From what I can see, those are primarily two calls to nmslib.init() and otherwise type annotations.
Please advise; if possible I would love to help here, but I am not comfortable writing robust production C++ code, nor am I an expert on the scispacy models themselves.
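To make the question above concrete, here is a minimal sketch of the kind of narrow interface a replacement nearest-neighbor backend might need to satisfy. Everything in it is an assumption for illustration: the names NeighborIndex and BruteForceIndex are hypothetical, the (ids, distances) return shape is chosen for the example, and none of this is scispacy's or nmslib's actual API. The backend shown is an exact, pure-Python cosine search, so it is far slower than a real approximate index and only shows where a different library could be plugged in.

```python
import math
from typing import List, Protocol, Sequence, Tuple


class NeighborIndex(Protocol):
    """Hypothetical minimal interface for a nearest-neighbor backend."""

    def add(self, vectors: Sequence[Sequence[float]]) -> None: ...

    def query(self, vector: Sequence[float], k: int) -> Tuple[List[int], List[float]]: ...


class BruteForceIndex:
    """Exact cosine-distance search; a slow stand-in, not an approximate index."""

    def __init__(self) -> None:
        self._unit_vectors: List[List[float]] = []

    @staticmethod
    def _normalize(vector: Sequence[float]) -> List[float]:
        # Scale to unit length so a dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0.0:
            return [0.0 for _ in vector]
        return [x / norm for x in vector]

    def add(self, vectors: Sequence[Sequence[float]]) -> None:
        # Ids are assigned implicitly by insertion order, starting at 0.
        self._unit_vectors.extend(self._normalize(v) for v in vectors)

    def query(self, vector: Sequence[float], k: int) -> Tuple[List[int], List[float]]:
        # Compare the query against every stored vector and keep the k closest.
        q = self._normalize(vector)
        scored = [
            (1.0 - sum(a * b for a, b in zip(q, u)), i)
            for i, u in enumerate(self._unit_vectors)
        ]
        scored.sort()
        top = scored[:k]
        return [i for _, i in top], [d for d, _ in top]


# Usage: index three toy vectors and look up the two nearest to a query.
index: NeighborIndex = BruteForceIndex()
index.add([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ids, distances = index.query([1.0, 0.1], k=2)
print(ids)  # [0, 2]
```

If the calling code depended only on an interface like this, swapping libraries would mostly mean writing one more small class, though any index files already built and serialized with nmslib would presumably still need to be rebuilt for the new backend.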