embeddings.search hangs indefinitely on Mac #576

Closed
tf13 opened this issue Oct 13, 2023 · 5 comments

Comments

@tf13

tf13 commented Oct 13, 2023

Running macOS 12.7 (Monterey) on a MacBook Air (M1, 2020) with Python 3.11.5.

The script is below. It should be effectively identical to the first example at https://neuml.hashnode.dev/tutorial-series-on-txtai

It hung in Spyder and also from the command line in Terminal (both with python3 scriptname.py and line by line after running python3 on the command line).

I then tried the same script on a Linux box I have access to, running Ubuntu 22.04.3 LTS. It worked fine as-is from the command line.

from txtai import Embeddings

# temp data
data = [
  "US tops 5 million confirmed virus cases",
  "Canada's last fully intact ice shelf has suddenly collapsed, forming a Manhattan-sized iceberg",
  "Beijing mobilises invasion craft along coast as Taiwan tensions escalate",
  "The National Park Service warns against sacrificing slower friends in a bear attack",
  "Maine man wins $1M from $25 lottery ticket",
  "Make huge profits without work, earn up to $100,000 a day"
]

# create embeddings
embeddings = Embeddings(path="sentence-transformers/nli-mpnet-base-v2")

# create index for list of texts
embeddings.index(data)

print(f"{'Query':20} Best Match")
print("-" * 50)


for query in ("feel good story", "climate change", "public health story", "war",
              "wildlife", "asia", "lucky", "dishonest junk"):
    # Extract uid of first result
    # search result format: (uid, score)
    uid = embeddings.search(query)[0][0]

    # Print text
    print(f"{query:20} {data[uid]}")
@davidmezzetti
Member

What happens if you change this line:

embeddings = Embeddings(path="sentence-transformers/nli-mpnet-base-v2")

to

embeddings = Embeddings(path="sentence-transformers/nli-mpnet-base-v2", gpu=False)

@tf13
Author

tf13 commented Oct 15, 2023

It froze just as before. This time I left it running to see whether it would ever resolve. More than 11 hours later, it is still hung after printing the headers.

@davidmezzetti
Member

Ok, not exactly sure what could be going on. Outside of managing the build script, I don't run txtai in a macOS environment frequently.

Perhaps you can try some of the things seen in the build script, such as a different Python version or setting the OMP_NUM_THREADS=1 environment variable. There are also a couple of ideas in the FAQ.

@davidmezzetti
Member

Closing this due to inactivity. Please re-open or open a new issue if there are further questions.

@tf13
Author

tf13 commented Nov 3, 2023

Sorry, I should have reported back: setting OMP_NUM_THREADS=1 does seem to do the trick.
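
For anyone landing here later, a minimal sketch of applying this workaround from inside the script rather than from the shell. It assumes the variable has to be in place before txtai (and the native libraries it loads) are imported, which is why the import is deferred. `build_embeddings` is a hypothetical helper name used for illustration, not part of txtai.

```python
import os

# Assumption: OpenMP-backed libraries read this variable when they are first
# loaded, so it is set before txtai is imported anywhere in the process.
os.environ["OMP_NUM_THREADS"] = "1"


def build_embeddings(path="sentence-transformers/nli-mpnet-base-v2"):
    """Hypothetical helper: create the Embeddings instance from the script above."""
    # Deferred import so the environment variable is already set at load time.
    from txtai import Embeddings

    return Embeddings(path=path)
```

Calling `embeddings = build_embeddings()` then replaces the `Embeddings(...)` line in the original script. Running `OMP_NUM_THREADS=1 python3 scriptname.py` from Terminal should be equivalent.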
