gensim installed with pip on Mac with python 3.7 not finding C extension #2802

Closed
hsimpson22 opened this issue Apr 21, 2020 · 23 comments
Labels
bug Issue described a bug impact HIGH Show-stopper for affected users reach MEDIUM Affects a significant number of users testing Issue related with testing (code, documentation, etc)

@hsimpson22

Problem description

I am trying to train a w2v model on my local machine (Mac OS 10.14.5), but I am getting a message about needing to install a C compiler:

/Users/hsimpson/envs/py3/lib/python3.7/site-packages/gensim/models/base_any2vec.py:743: UserWarning: C extension not loaded, training will be slow. Install a C compiler and reinstall gensim for fast training.

Steps/code/corpus to reproduce

Any call to gensim.models.Word2Vec.

I am working in a Python 3 virtual env (name = py3).

I tried the following to fix:

  • pip install --upgrade gensim
    The response is:

    Requirement already up-to-date: gensim in /Users/hsimpson/envs/py3/lib/python3.7/site-packages (3.8.2)

    and I get "Requirement already satisfied" messages for all dependencies.

  • pip uninstall gensim; pip install gensim
    This installed the same version, 3.8.2.

Neither solved the problem.

I know installing conda might fix this, as it did for user guo18306671737 in #2572, but I don't use conda anymore because it has caused issues for me with paths etc., so I would really like pip install to work. For that user, the problem was attributed to being Windows-specific, but I am on a Mac, so I thought it was worth letting you know. That user was also on Python 3.7.

Versions

On the CLI and in PyCharm (both with virtual env py3):

Darwin-18.6.0-x86_64-i386-64bit
Python 3.7.5 (default, Nov  1 2019, 02:16:32) 
[Clang 11.0.0 (clang-1100.0.33.8)]
NumPy 1.16.5
SciPy 1.3.1
gensim 3.8.1
FAST_VERSION -1

In Jupyter notebook, with either the Python3 or py3 kernel selected:

Darwin-18.6.0-x86_64-i386-64bit
Python 3.7.5 (default, Nov  1 2019, 02:16:32) 
[Clang 11.0.0 (clang-1100.0.33.8)]
NumPy 1.16.4
SciPy 1.4.1
gensim 3.2.0
FAST_VERSION -1

So, strangely, the gensim version pip says it installed (3.8.2), the version in CLI/PyCharm (3.8.1), and the version in the Jupyter notebook kernel (3.2.0) do not match, and I am not sure why. In my notebook I had to prepend the path /Users/hsimpson/envs/py3/lib/python3.7/site-packages (where pip says it is installing gensim 3.8.2) to get spacy to import, but it still says it is running gensim 3.2.0.

However, since both the 3.8.1 and 3.2.0 versions report FAST_VERSION -1, I am not sure that matters.
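
When the CLI, the IDE, and a notebook kernel disagree about the installed version, it can help to ask each interpreter which gensim it would actually load. The following is a minimal sketch using only the standard library. It assumes Python 3.8 or later for `importlib.metadata` (on the 3.7 setup described above, the `importlib_metadata` backport would be needed), and `locate_package` is a hypothetical helper name, not part of gensim or pip.

```python
import importlib.metadata
import importlib.util
import sys


def locate_package(name):
    """Report where `name` would be imported from, without importing it."""
    # find_spec returns None when the top-level package cannot be found.
    spec = importlib.util.find_spec(name)
    try:
        # Version recorded in the installed distribution's metadata.
        installed = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        installed = None
    return {
        "interpreter": sys.executable,
        "location": spec.origin if spec is not None else None,
        "metadata_version": installed,
    }


if __name__ == "__main__":
    for key, value in locate_package("gensim").items():
        print(f"{key}: {value}")
```

Running this once from the terminal, once from PyCharm, and once in a notebook cell should show whether the three environments share an interpreter and a site-packages directory, or whether an older copy of the package is being picked up from elsewhere.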

@edwardxugang

Same problem on the same system. I solved the issue with:
conda install -c anaconda gensim

@krisdigital

Some things to try:

pip install numpy --no-cache-dir --no-binary :all:

then

pip install gensim --no-cache-dir --no-binary :all:

(Worked for me)

@piskvorky piskvorky added this to the 3.8.3 milestone Apr 25, 2020
@piskvorky
Owner

piskvorky commented Apr 25, 2020

@hsimpson22 there are two separate issues here:

  1. Version 3.8.2 incorrectly reports itself as 3.8.1. See "Inconsistency between pip version (3.8.2) and installed version (3.8.1)" (#2796). Probably not a real problem.

  2. Your pip install failed to install the optimized version of word2vec.

For 2), can you post your installation log? What did pip install show you when you were installing gensim?

In particular, I'd expect OSX to install Gensim from the pre-compiled py3.7 wheel, so you don't need any C compiler.

Your pip install output should show us exactly what went wrong. You definitely don't need conda to install Gensim.

@piskvorky piskvorky added the need info Not enough information for reproduce an issue, need more info from author label Apr 25, 2020
@polyrand

I got the same error too. In my case it is not a problem since I'm not going to train anything. But I found this issue, so here is some info:

When importing gensim I get a warning:
unable to import 'smart_open.gcs', disabling that module

I re-installed it using pip install -v, since there's a lot of output I put it in a gist to avoid cluttering the issue:

https://gist.github.com/polyrand/057d44c4a65e342630ca4f4e42e2883e

Let me know if you need anything else.

@amehtaSF

I am getting the same error as well

@mpenkov
Collaborator

mpenkov commented May 1, 2020

@polyrand the smart_open.gcs warning is not relevant, you may ignore it. If you're interested, it got fixed in the smart_open repo (https://github.com/RaRe-Technologies/smart_open/blob/develop/CHANGELOG.md).

@krisdigital

krisdigital commented May 1, 2020

@piskvorky

For 2), can you post your installation log? What did pip install show you when you were installing gensim?

In my case it did not say anything special, just that it installed the wheel version 🤷‍♂️ My guess would be that it has something to do with the Mac OS version, because I run 10.14.6 on really old hardware (2009) and @hsimpson22 reports 10.14.5. Is there a wheel without optimized w2v? 🧐

@piskvorky
Owner

piskvorky commented May 1, 2020

@krisdigital @hsimpson22 Can you try installing from the .tar.gz sources? I.e. not from the precompiled wheel.

You'll need a working compiler but at least you'll see exactly how the compilation goes. It's possible there's some binary incompatibility on account of your older OS version.

@krisdigital

@piskvorky That is what I did and it worked (see #2802 (comment)). The numpy step is probably not needed.

@mpenkov
Collaborator

mpenkov commented May 1, 2020

I cannot reproduce the problem locally on my setup, identical to that of @hsimpson22:

  • MacOS Mojave 10.14.5
  • Python 3.7.6
  • gensim 3.8.2 installed via pip (from wheels)

@krisdigital

krisdigital commented May 1, 2020

I tried again and it uses gensim-3.8.2-cp37-cp37m-macosx_10_9_x86_64.whl on my machine. Looks wrong to me? Other than that, no useful output.

Using the wheel, I get the error again during training.
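
For reading a wheel filename like the one above, here is a small sketch that splits it into the components defined by the wheel naming convention (PEP 427). `parse_wheel_filename` is a hypothetical helper written for illustration, not a pip or gensim function. As far as I understand the convention, a `macosx_10_9_x86_64` platform tag marks a build that targets macOS 10.9 or later, so pip selecting it on 10.14 is expected and the filename alone does not explain the missing C extension.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives 5 or 6 fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build_tag": parts[2] if len(parts) == 6 else None,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


if __name__ == "__main__":
    wheel = "gensim-3.8.2-cp37-cp37m-macosx_10_9_x86_64.whl"
    for key, value in parse_wheel_filename(wheel).items():
        print(f"{key}: {value}")
```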

@piskvorky
Owner

piskvorky commented May 1, 2020

Luckily, I have an old Macbook! Still on 10.11.6 El Capitan :)

Running pip install --upgrade gensim in a clean environment installed Gensim from gensim-3.8.2-cp37-cp37m-macosx_10_9_x86_64.whl without issues.

Training a word2vec model from the examples:

from gensim.models import Word2Vec
from gensim.test.utils import common_texts
model = Word2Vec(common_texts, size=100, window=5, min_count=1, workers=4)

/Volumes/work/workspace/vew/tst/lib/python3.7/site-packages/gensim/models/base_any2vec.py:743: UserWarning: C extension not loaded, training will be slow. Install a C compiler and reinstall gensim for fast training.
  "C extension not loaded, training will be slow. "

So, I can confirm the issue. After much pomp, the automated wheels testing process is not catching even critical compilation issues like this.

import numpy; print("NumPy", numpy.__version__)
NumPy 1.18.3
import scipy; print("SciPy", scipy.__version__)
SciPy 1.4.1
import gensim; print("gensim", gensim.__version__)
gensim 3.8.1
from gensim.models import word2vec; print("FAST_VERSION", word2vec.FAST_VERSION)
FAST_VERSION -1

@piskvorky piskvorky added bug Issue described a bug reach MEDIUM Affects a significant number of users impact HIGH Show-stopper for affected users testing Issue related with testing (code, documentation, etc) and removed need info Not enough information for reproduce an issue, need more info from author labels May 1, 2020
@krisdigital

I remember El Capitan :) Good times..

So then it seems to be a problem with older hardware, independent of the macOS version 🤔

@amehtaSF

amehtaSF commented May 1, 2020

I have a 2018 MacBook Pro running macOS Catalina 10.15.4 and I am running into the same issue where it does not find the C compiler. The installation looks normal, and I tried @krisdigital's suggestion to install with the extra arguments.

@krisdigital

@amehtaSF Are you saying it still does not work after running pip uninstall gensim and then pip install gensim --no-cache-dir --no-binary :all:?

@piskvorky
Owner

piskvorky commented May 1, 2020

I confirm installing from source with

pip install gensim --no-cache-dir --no-binary gensim

fixed this.

It's only the precompiled (OSX only? all?) wheels that are at fault.
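
Given the version mismatches reported earlier in this thread, one way to make sure the source install lands in the interpreter you actually run is to invoke pip as `python -m pip` from that same interpreter. A hedged sketch follows: `reinstall_from_source` is a hypothetical helper, the pip flags are the ones quoted in this thread plus `uninstall --yes`, and a working C compiler is still required for the build.

```python
import subprocess
import sys


def reinstall_from_source(package="gensim", dry_run=True):
    """Build (and optionally run) pip commands that reinstall from source."""
    # Using sys.executable ties pip to the interpreter running this script.
    pip = [sys.executable, "-m", "pip"]
    commands = [
        pip + ["uninstall", "--yes", package],
        # --no-binary <package> makes pip skip the precompiled wheel.
        pip + ["install", package, "--no-cache-dir", "--no-binary", package],
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only print what would be executed.
    for command in reinstall_from_source():
        print(" ".join(command))
```

Restart the interpreter (or the IDE and notebook kernel) afterwards, since an already imported gensim will not pick up the new build.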

@amehtaSF

amehtaSF commented May 1, 2020

@krisdigital Ah I apologize! I just tried again and it is now working. I had initially only re-run the import statement for gensim and it didn't work, but this time I completely restarted PyCharm and it did work after I did that. Thank you for your help!!

@piskvorky
Owner

piskvorky commented May 1, 2020

I also confirm gensim 3.8.1 didn't have this problem – the precompiled OSX wheel works. It must be something recent.

@mpenkov this may be blocking for 3.8.3. Please let me know when the final wheels are ready. I'd like to try them manually first, before the "public" release.

@mpenkov
Collaborator

mpenkov commented May 2, 2020

@piskvorky The wheels built by the latest gensim-wheels run are in s3://gensim-wheels. You can obtain a wheel with, e.g.:

aws s3 cp s3://gensim-wheels/gensim-3.8.3-cp37-cp37m-macosx_10_9_x86_64.whl .

(Ignore wheels from other gensim versions in that S3 bucket)

@menshikh-iv
Contributor

@piskvorky I already fixed that and added a sanity check to the wheel build (the build fails on a "bad" fast version), so it should work now.
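
For readers wondering what such a sanity check could look like, here is an illustrative sketch. It is not the actual check added to the wheel build, which is not shown in this thread. It assumes, based on the outputs posted above, that `FAST_VERSION == -1` means the C extension failed to load and that any non-negative value means it loaded. Both function names are hypothetical.

```python
def fast_version_exit_code(fast_version):
    """Map FAST_VERSION to a process exit code: 0 is good, 1 is bad."""
    # Assumption: -1 marks the slow pure-Python fallback.
    return 0 if fast_version >= 0 else 1


def check_installed_gensim():
    """Import gensim from the current environment and check FAST_VERSION."""
    try:
        from gensim.models import word2vec
    except ImportError:
        print("gensim is not importable in this environment")
        return 1
    print("FAST_VERSION", word2vec.FAST_VERSION)
    return fast_version_exit_code(word2vec.FAST_VERSION)
```

A build step could install the freshly built wheel into a clean environment and then call `sys.exit(check_installed_gensim())`, so that a wheel without the compiled extension fails the build instead of being published.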

@piskvorky
Owner

piskvorky commented May 2, 2020

Awesome. I confirm s3://gensim-wheels/gensim-3.8.3-cp37-cp37m-macosx_10_9_x86_64.whl works on El Capitan.

@menshikh-iv can you link to your fix of this bug here? Just for completeness & archival – I'll close this ticket.

@menshikh-iv
Contributor

@piskvorky of course,

Also, the related "bad" fast version issue: #2794

@menshikh-iv
Contributor

@piskvorky @mpenkov Yesterday I also fixed an issue related to nmslib in the release-3.8.3 branch, so now we are ready to release (I guess?)
