EntityLinker import hangs #520

Closed
mezarque opened this issue Jul 30, 2024 · 2 comments

@mezarque

I've been trying to import EntityLinker but running into an unusual issue where the kernel hangs for a very long time (so far I've let it run up to 93 minutes) without dying or producing an error.

I know there are some previous issues that were related to nmslib (e.g. #365, #372, #437, #446). These seemed to result in a zsh: illegal hardware instruction error, which I don't seem to be encountering.

I eventually figured out how to resolve this, but wanted to share my solution, in case anyone else runs into the same problem.

Hardware / OS

I'm using a 2021 MacBook Pro with an Apple M1 Pro chip, running macOS Ventura 13.1.

Steps

  1. Create a conda environment using conda create -n scispacy python=3.9. I'm using conda 24.7.1.

  2. conda activate scispacy

  3. pip install scispacy

  4. pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.4/en_core_sci_sm-0.5.4.tar.gz

  5. Start an interactive Python session with python

  6. Run the following code:

    import spacy
    nlp = spacy.load("en_core_sci_sm")

    Receive warning:

    /Users/dennis/miniconda3/envs/scispacy/lib/python3.9/site-packages/spacy/language.py:2195: FutureWarning: Possible set union at position 6328
      deserializers["tokenizer"] = lambda p: self.tokenizer.from_disk(  # type: ignore[union-attr]
    
  7. Continue with

    doc = nlp("Alterations in the hypocretin receptor 2 and preprohypocretin genes produce narcolepsy in some animals.")

    No problems.

  8. Run the following code:

    import spacy
    
    from scispacy.abbreviation import AbbreviationDetector
    
    nlp = spacy.load("en_core_sci_sm")
    
    # Add the abbreviation pipe to the spacy pipeline.
    nlp.add_pipe("abbreviation_detector")
    
    doc = nlp("Spinal and bulbar muscular atrophy (SBMA) is an \
               inherited motor neuron disease caused by the expansion \
               of a polyglutamine tract within the androgen receptor (AR). \
               SBMA can be caused by this easily.")
    
    print("Abbreviation", "\t", "Definition")
    for abrv in doc._.abbreviations:
        print(f"{abrv} \t ({abrv.start}, {abrv.end}) {abrv._.long_form}")

    No problems.

  9. Run the following code:

    import scispacy

    No problems.

  10. Run the following code:

    from scispacy.linking import EntityLinker

    Kernel hangs for a very long time without dying.
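A hang like the one in step 10 can be confirmed without tying up the interactive session by running the import in a child interpreter with a time limit. The following is a minimal sketch, not part of scispacy; `probe_import` is a hypothetical helper name, and the timeout value is an arbitrary choice.

```python
import subprocess
import sys


def probe_import(statement: str, timeout_s: float) -> str:
    """Run `statement` in a fresh interpreter and classify the outcome.

    Returns "ok" if the child exits cleanly, "error" if it exits with a
    non-zero status (for example an ImportError or a crash), and "timeout"
    if it is still running after `timeout_s` seconds.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", statement],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"
```

In the environment described above, `probe_import("from scispacy.linking import EntityLinker", 120)` would presumably return "timeout", whereas an illegal hardware instruction crash like the one in the earlier issues would show up as "error".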

Attempts

  1. I've encountered the same behavior in an interactive Python session, as well as when running the code within a Jupyter notebook.
  2. I tried uninstalling nmslib with pip uninstall nmslib and reinstalling with each of the following strategies:
     • pip install --no-binary :all: nmslib (suggested here)
     • CFLAGS="-mavx -DWARN(a)=(a)" pip install nmslib (suggested here)

Solution

Installing nmslib using conda (I used mamba) appeared to solve the issue.

mamba install nmslib

This installed nmslib 2.1.1, which appears to be a newer version than what is specified in requirements.in and setup.py (nmslib>=1.7.3.6). Might upgrading the version there be a good idea? I'm not sure what other issues that would introduce.
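To check which nmslib version an environment actually ended up with, and how it compares with the declared minimum, something like the sketch below may help. It assumes the installed package exposes standard distribution metadata (which may not hold for every conda build), and the helper names are hypothetical. The comparison only handles plain dotted numeric versions such as 2.1.1 or 1.7.3.6.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if no metadata is found."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def numeric_version(v: str) -> Tuple[int, ...]:
    """Parse the leading dotted-integer part of a version string."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(found: str, minimum: str) -> bool:
    """Compare two dotted numeric versions component by component."""
    return numeric_version(found) >= numeric_version(minimum)
```

For the versions mentioned here, `meets_minimum("2.1.1", "1.7.3.6")` is True, so the conda-installed version already satisfies the existing `nmslib>=1.7.3.6` constraint; `installed_version("nmslib")` reports what the current environment has.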

@dakinggg
Collaborator

Wow, thank you! This solution seems to work for me on both Windows and Linux with Python 3.11, which hasn't previously worked. Thank you for sharing! I will respond to some other issues to see if it works for others, and then update the installation instructions.

@dakinggg
Collaborator

dakinggg commented Sep 6, 2024

I just added a support matrix based on what I'm able to test or glean from previous GitHub issues, so I'm going to go ahead and close this issue. Thanks again for the suggestion!

@dakinggg dakinggg closed this as completed Sep 6, 2024