numpy 1.19.2 incompatible with gensim 4.1.0 #3226

Closed
martinobertoni opened this issue Sep 1, 2021 · 60 comments

@martinobertoni

martinobertoni commented Sep 1, 2021

Problem description

When importing gensim I get the following error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mbertoni/software/miniconda3/envs/test/lib/python3.7/site-packages/gensim/__init__.py", line 11, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
  File "/home/mbertoni/software/miniconda3/envs/test/lib/python3.7/site-packages/gensim/corpora/__init__.py", line 6, in <module>
    from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
  File "/home/mbertoni/software/miniconda3/envs/test/lib/python3.7/site-packages/gensim/corpora/indexedcorpus.py", line 14, in <module>
    from gensim import interfaces, utils
  File "/home/mbertoni/software/miniconda3/envs/test/lib/python3.7/site-packages/gensim/interfaces.py", line 19, in <module>
    from gensim import utils, matutils
  File "/home/mbertoni/software/miniconda3/envs/test/lib/python3.7/site-packages/gensim/matutils.py", line 1024, in <module>
    from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
  File "gensim/_matutils.pyx", line 1, in init gensim._matutils
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Steps/code/corpus to reproduce

conda create --name=test python=3.7 -y
conda activate test
conda install -y numpy==1.19.2
pip install gensim

Versions

Linux-5.11.0-25-generic-x86_64-with-debian-bullseye-sid
Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0]
Bits 64
NumPy 1.19.2
SciPy 1.7.1
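
A report like the one above can be produced with a short script. The following is a minimal sketch, not necessarily the exact snippet used here; the helper name `collect_versions` is hypothetical. It catches import errors so that a broken gensim install still yields a usable report.

```python
import platform
import struct
import sys
from importlib import import_module


def collect_versions(packages=("numpy", "scipy", "gensim")):
    """Return human-readable version lines suitable for a bug report."""
    lines = [
        platform.platform(),
        "Python " + sys.version.replace("\n", " "),
        "Bits %d" % (8 * struct.calcsize("P")),
    ]
    for name in packages:
        try:
            module = import_module(name)
            lines.append("%s %s" % (name, getattr(module, "__version__", "unknown")))
        except Exception as err:  # ImportError, or ValueError from a binary mismatch
            lines.append("%s failed to import: %s" % (name, err))
    return lines


if __name__ == "__main__":
    print("\n".join(collect_versions()))
```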

@GuillemGSubies

Same error here. Any plans to solve it soon?

@subbuvidyasekar

subbuvidyasekar commented Sep 1, 2021

I also have the same issue.
The umap package requires numpy 1.19.2.
My gensim version is 3.8.3.
It worked for me initially, but importing gensim stopped working on 29th August.

@piskvorky
Owner

piskvorky commented Sep 1, 2021

Thanks all for reporting. Possible duplicate of #3097.

@mpenkov I thought we fixed this in #3095 (Gensim 4.0.1). Is this a regression?

@subbuvidyasekar upgrade Gensim, the old 3.8.3 version is no longer supported.

@mpenkov
Collaborator

mpenkov commented Sep 1, 2021

@mpenkov I thought we fixed this in #3095 (Gensim 4.0.1). Is this a regression?

Hard to say at this stage. The previous fix may have been incomplete (e.g. #3097 (comment)).

We'll need to investigate.

@piskvorky
Owner

piskvorky commented Sep 1, 2021

Maybe something to do with conda? I see that mentioned above – it's like a separate distribution center, so god knows what they do (or don't do) with numpy. We only support standard packages (PyPI, pip).

@martinobertoni can you try replicating without conda? Using just pip. Thanks.

@martinobertoni
Author

Hi, and thanks for looking into this.
Avoiding conda and using venv gives the same error:

python -m venv test
source test/bin/activate
pip install numpy==1.19.2
pip install gensim

@martinobertoni
Author

found a fix, let me know if it works for you:

pip install gensim --no-binary :all:

This apparently recompiles things against whatever numpy is on the system.
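
To tell this particular failure apart from other import problems before reaching for the workaround, a small check can help. This is a minimal sketch under stated assumptions: the helper names `diagnose_import` and `suggested_fix` are hypothetical, and the check only looks for the "binary incompatibility" text seen in the traceback in this thread.

```python
import importlib


def diagnose_import(module_name, importer=importlib.import_module):
    """Try to import a module and classify the failure, if any.

    Returns "ok", "binary-mismatch", or "other-error".
    """
    try:
        importer(module_name)
    except ValueError as err:
        # The error in this thread is a ValueError mentioning binary incompatibility.
        if "binary incompatibility" in str(err):
            return "binary-mismatch"
        return "other-error"
    except Exception:
        return "other-error"
    return "ok"


def suggested_fix(module_name, status):
    """Map a status from diagnose_import to the workaround quoted in this thread."""
    if status == "binary-mismatch":
        return "pip install %s --no-binary :all:" % module_name
    return None
```

Calling `suggested_fix("gensim", diagnose_import("gensim"))` returns the pip command only for the mismatch case. Note that `--no-binary :all:` also applies to dependencies, so pip may try to build numpy and scipy from source as well; `--no-binary gensim` restricts the source build to gensim alone.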

@piskvorky
Owner

piskvorky commented Sep 2, 2021

Thanks @martinobertoni . According to https://github.com/scipy/oldest-supported-numpy/blob/master/setup.cfg#L46 , the binary wheel for Gensim on Python 3.7 should be compiled against numpy 1.14.5. Is that correct @mpenkov ? And since 1.14.5 < 1.19.2 (backward compatible), the wheel should work… but doesn't.

I forgot what the numpy kerfuffle was, they changed their binary compatibility somehow. I'd have to re-read #3095 and #3097.

EDIT: @mpenkov it looks like with 4.0.0, we had to do a quick bugfix release 4.0.1 because the numpy-oldest-version resolution failed somehow: #3095 (comment). And you edited the numpy version manually.

@mpenkov
Collaborator

mpenkov commented Sep 2, 2021

Yeah, I've also purged that information from my mind since the previous release, so it'll be a fresh start for me too. I'll try to cut out some hours during the weekend (or next week) and sit down to deal with this.

@piskvorky
Owner

piskvorky commented Sep 6, 2021

Another report: https://twitter.com/nutniti1/status/1433702323335749637

Looks like 4.1.0 affected many people.

@J535D165

J535D165 commented Sep 8, 2021

Yes, I can confirm. 4.0.1 works fine on Python 3.8 on Ubuntu; however, 4.1.0 fails. The NumPy version is 1.21.2.

@mpenkov
Collaborator

mpenkov commented Sep 8, 2021

Thank you everyone for your patience.

I think I've tracked down the problem. Our github actions workflow (a recent contribution 959f2dd) was not using oldest-supported-numpy. I think this is what caused the problem.

Can you please confirm that the wheels built off the most recent develop head work?

http://gensim-wheels.s3-website-us-east-1.amazonaws.com/

mpenkov self-assigned this Sep 9, 2021
@mpenkov
Collaborator

mpenkov commented Sep 9, 2021

Unfortunately, my efforts did not help resolve the issue. I can still reproduce the problem, even with the newly built wheels.

I'm not sure what's going on here. We're definitely using oldest-supported-numpy to build the wheels:

Build wheel
  Building libraries...
  Building libraries finished.
  Collecting oldest-supported-numpy
    Downloading oldest_supported_numpy-0.10-py3-none-any.whl (3.8 kB)
  Collecting numpy==1.14.5
    Downloading numpy-1.14.5-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (4.7 MB)
  Installing collected packages: numpy, oldest-supported-numpy
  Successfully installed numpy-1.14.5 oldest-supported-numpy-0.10
  Processing /Users/runner/work/gensim/gensim
    DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
     pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
  Building wheels for collected packages: gensim
    Building wheel for gensim (setup.py): started
    Building wheel for gensim (setup.py): finished with status 'done'
    Created wheel for gensim: filename=gensim-4.1.1.dev0-cp37-cp37m-macosx_10_9_x86_64.whl size=23965679 sha256=3ed8975ace2985c7ac614110e05bf8a2afc039fa8b192284d4893ed840edf469

(from https://github.com/RaRe-Technologies/gensim/runs/3545153699?check_suite_focus=true#step:6:109)

but it doesn't seem to be helping:

(test.env) sergeyich:~ misha$ python --version
Python 3.7.11
(test.env) sergeyich:~ misha$ pip show numpy
Name: numpy
Version: 1.19.2
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email:
License: BSD
Location: /Users/misha/git/gensim/test.env/lib/python3.7/site-packages
Requires:
Required-by: scipy
(test.env) sergeyich:~ misha$ pip install ~/Downloads/gensim-4.1.1.dev0+20210908135409-cp37-cp37m-macosx_10_9_x86_64.whl
Processing ./Downloads/gensim-4.1.1.dev0+20210908135409-cp37-cp37m-macosx_10_9_x86_64.whl
Requirement already satisfied: smart-open>=1.8.1 in ./git/gensim/test.env/lib/python3.7/site-packages (from gensim==4.1.1.dev0+20210908135409) (5.2.1)
Requirement already satisfied: numpy>=1.17.0 in ./git/gensim/test.env/lib/python3.7/site-packages (from gensim==4.1.1.dev0+20210908135409) (1.19.2)
Requirement already satisfied: scipy>=0.18.1 in ./git/gensim/test.env/lib/python3.7/site-packages (from gensim==4.1.1.dev0+20210908135409) (1.7.1)
Installing collected packages: gensim
Successfully installed gensim-4.1.1.dev0
WARNING: You are using pip version 21.2.3; however, version 21.2.4 is available.
You should consider upgrading via the '/Users/misha/git/gensim/test.env/bin/python -m pip install --upgrade pip' command.
(test.env) sergeyich:~ misha$ python -c 'import gensim'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/__init__.py", line 11, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/corpora/__init__.py", line 6, in <module>
    from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/corpora/indexedcorpus.py", line 14, in <module>
    from gensim import interfaces, utils
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/interfaces.py", line 19, in <module>
    from gensim import utils, matutils
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/matutils.py", line 1024, in <module>
    from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
  File "gensim/_matutils.pyx", line 1, in init gensim._matutils
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

@piskvorky
Owner

Yeah that's weird. Does it work with numpy 1.14.5 (used during wheel building here), instead of numpy 1.19.2?

@mpenkov
Collaborator

mpenkov commented Sep 9, 2021

No, there's a different problem there:

$ python -c 'import gensim'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/__init__.py", line 11, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/parsing/__init__.py", line 4, in <module>
    from .preprocessing import (  # noqa:F401
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/parsing/preprocessing.py", line 26, in <module>
    from gensim import utils
  File "/Users/misha/git/gensim/test.env/lib/python3.7/site-packages/gensim/utils.py", line 61, in <module>
    default_prng = np.random.default_rng()
AttributeError: module 'numpy.random' has no attribute 'default_rng'

Installing gensim explicitly requires a more recent version of numpy

https://github.com/RaRe-Technologies/gensim/blob/fb0c722a66c50a2b42ffa1b5e4e77f1eaa6d529a/setup.py#L318

Could this be the cause of the problem? It's a bit late here, I might have to revisit this in the next couple of days with a fresh head.

@piskvorky
Owner

piskvorky commented Sep 9, 2021

I don't think that's a problem; 1.19.2 >= 1.17.0.

Unless someone has an idea, we might have to open a ticket at numpy again, ask what's going on.

But seeing as we had the same problem with 4.0.0 (fixed in 4.0.1), I suspect the same fix would help 4.1.0 too. @mpenkov do you remember what you did there?

Since 4.0.1 still works by all accounts, even with numpy 1.19.2, I suspect the issue is somewhere on our side rather than numpy's.

@mpenkov
Collaborator

mpenkov commented Sep 9, 2021

@mpenkov do you remember what you did there?

The windows wheel build was ignoring the oldest-supported-numpy dependency, so I made sure to explicitly install it prior to building the wheel.

piskvorky/gensim-wheels@ad9fbf1

@mpenkov
Collaborator

mpenkov commented Sep 10, 2021

Since 4.0.1 still works by all accounts, even with numpy 1.19.2, I suspect the issue is somewhere on our side rather than numpy's.

There are several potential problem points, in order of rabbit-hole depth:

  • our build system
  • multibuild
  • manylinux
  • oldest-supported-numpy
  • numpy
  • pip
  • python

The most likely by far is our build system (scripts we use for building wheels). It uses multibuild, which in turn uses manylinux.

One thing to try would be to reproduce the problem with the fewest moving parts:

  • Build a wheel using oldest-supported-numpy in your local environment (e.g. MacOS, Py3.7)
  • Try installing that wheel in an environment with a newer numpy

If there's still a problem, we may eliminate multibuild and manylinux from the above list of problems. It would also be a step in the direction of producing a reproducible example to show to other people.

If there's no problem, then something in our build system is messed up, and needs further investigation.

I can't think of anything better at the moment, so I'll try the above and report back when I have results. If anyone else is willing to play along at home, then please be welcome to do so ;)

@mpenkov
Collaborator

mpenkov commented Sep 10, 2021

@piskvorky @gojomo I've reproduced the problem locally, without using our CI/multibuild/manylinux. I think something is wrong with our setup.py and we need to look there.

The steps are:

  1. Build wheel against oldest_supported_numpy under Py3.7 (build env)
  2. Test wheel against numpy==1.19.2 under Py3.7 (test env)
  3. boom!

Can you please try reproducing this?

build.sh

#!/usr/bin/env bash
set -euxo pipefail

numpy_str="${1:-oldest-supported-numpy}"

rm -rf wheel-builder.env
virtualenv -p /usr/local/opt/python\@3.7/bin/python3.7 wheel-builder.env
source wheel-builder.env/bin/activate

python --version
pip --version
pip install "$numpy_str"
pip freeze
pip -v wheel .

rm -rf wheel-builder.env

test.sh

#!/usr/bin/env bash
set -euxo pipefail

wheel_path="$1"
numpy_str="$2"

virtualenv -p /usr/local/opt/python\@3.7/bin/python3.7 wheel-tester.env
source wheel-tester.env/bin/activate

python --version
pip --version

cp "$wheel_path" wheel-tester.env/
pushd wheel-tester.env
pip install "$numpy_str"
pip install gensim-*.whl
pip freeze
python -c 'import gensim;print(gensim.__version__)'
popd

rm -rf wheel-tester.env

@mpenkov
Collaborator

mpenkov commented Sep 11, 2021

I've tried building against various versions of numpy locally (MacOS, Py3.7) and then testing with 1.19.2:

  • 💥 1.14.5 (same as oldest-supported-numpy)
  • 💥 1.15.1
  • 💥 1.16.1
  • 💥 1.16.5
  • 🎉 1.17.0
  • 🎉 1.17.1

(the above is purely trial and error)
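
The trial-and-error results above can be recorded and queried in a few lines. This is only an illustrative sketch: the data are the outcomes listed in this comment, and the helper names `version_key` and `oldest_working` are hypothetical.

```python
# Build-time numpy version -> did the wheel import under numpy 1.19.2?
# These are the outcomes reported in the comment above (MacOS, Py3.7).
RESULTS = {
    "1.14.5": False,
    "1.15.1": False,
    "1.16.1": False,
    "1.16.5": False,
    "1.17.0": True,
    "1.17.1": True,
}


def version_key(version):
    """Turn '1.16.5' into (1, 16, 5) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def oldest_working(results):
    """Return the oldest build-time version whose wheel imported, or None."""
    working = sorted((v for v, ok in results.items() if ok), key=version_key)
    return working[0] if working else None


print(oldest_working(RESULTS))  # prints 1.17.0
```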

This is surprising to me. According to the official numpy recommendation, code built against oldest-supported-numpy (1.14.5 in this particular case) should work with all future versions of numpy. The results above contradict that, so it may be worth raising this with the numpy guys.

In the immediate sense, how about we build against 1.17.0 instead of oldest-supported-numpy? That would allow us to make a bugfix release quickly.

@piskvorky @menshikh-iv @gojomo What are your thoughts?

@menshikh-iv
Contributor

menshikh-iv commented Sep 11, 2021

I also ran some tests on Ubuntu with Py3.7, and I'm +1 for using 1.17 as the base numpy version for builds. We also have this one in setup.py: https://github.com/RaRe-Technologies/gensim/blob/develop/setup.py#L318, so there's no reason to use something lower than that.

@mpenkov
Collaborator

mpenkov commented Sep 11, 2021

Do we build against numpy 1.17.0 for all Python versions, on all platforms? Or is there a better way to go?

@menshikh-iv
Contributor

Logic from oldest-supported-numpy

python_requires = >=3.5
install_requires = 
	
	numpy==1.16.0; python_version=='3.5' and platform_system=='AIX'
	numpy==1.16.0; python_version=='3.6' and platform_system=='AIX'
	numpy==1.16.0; python_version=='3.7' and platform_system=='AIX'
	
	numpy==1.18.5; python_version=='3.5' and platform_machine=='aarch64'
	numpy==1.19.2; python_version=='3.6' and platform_machine=='aarch64'
	numpy==1.19.2; python_version=='3.7' and platform_machine=='aarch64'
	numpy==1.19.2; python_version=='3.8' and platform_machine=='aarch64'
	
	numpy==1.21.0; python_version=='3.8' and platform_machine=='arm64' and platform_system=='Darwin'
	numpy==1.21.0; python_version=='3.9' and platform_machine=='arm64' and platform_system=='Darwin'
	
	numpy==1.13.3; python_version=='3.5' and platform_machine!='aarch64' and platform_system!='AIX'
	numpy==1.13.3; python_version=='3.6' and platform_machine!='aarch64' and platform_system!='AIX' and platform_python_implementation != 'PyPy'
	numpy==1.14.5; python_version=='3.7' and platform_machine!='aarch64' and platform_system!='AIX' and platform_python_implementation != 'PyPy'
	numpy==1.17.3; python_version=='3.8' and (platform_machine!='arm64' or platform_system!='Darwin') and platform_machine!='aarch64' and platform_python_implementation != 'PyPy'
	numpy==1.19.3; python_version=='3.9' and (platform_machine!='arm64' or platform_system!='Darwin') and platform_python_implementation != 'PyPy'
	numpy==1.21.2; python_version=='3.10' and platform_python_implementation != 'PyPy'
	
	numpy==1.19.0; python_version=='3.6' and platform_python_implementation=='PyPy'
	numpy==1.20.0; python_version=='3.7' and platform_python_implementation=='PyPy'
	
	numpy; python_version>='3.11'
	numpy; python_version>='3.8' and platform_python_implementation=='PyPy'
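
For the common case, the listing above reduces to a small lookup table. The sketch below is a simplification, not the package's own logic: it assumes CPython on a platform that is not AIX, aarch64, or Apple arm64, and the names `OLDEST_NUMPY` and `oldest_numpy_for` are hypothetical.

```python
# Pins copied from the listing above, restricted to CPython on platforms that
# are not AIX, aarch64, or Apple arm64 (an assumption of this sketch).
OLDEST_NUMPY = {
    "3.5": "1.13.3",
    "3.6": "1.13.3",
    "3.7": "1.14.5",
    "3.8": "1.17.3",
    "3.9": "1.19.3",
    "3.10": "1.21.2",
}


def oldest_numpy_for(python_version):
    """Return the pinned numpy for 'major.minor', or None if it is unpinned."""
    return OLDEST_NUMPY.get(python_version)
```

For example, `oldest_numpy_for("3.7")` gives `"1.14.5"`, the version seen in the wheel-build log earlier in this thread, while `"3.11"` returns `None` because the listing leaves numpy unpinned there.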

@mattip

mattip commented Sep 11, 2021

If you wish to build and test against NumPy 1.14.5, you cannot use numpy.random.default_rng that was introduced in NumPy 1.17.
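
One hypothetical way for a library to tolerate both older and newer NumPy here is to fall back to the legacy generator when `default_rng` is missing. This is a sketch only, not what Gensim does; the helper name `make_default_prng` is an assumption, and it takes the `numpy.random` module as an argument so the logic can be exercised without NumPy installed.

```python
def make_default_prng(random_module):
    """Return a random generator from a numpy.random-like module.

    Uses default_rng() when the module provides it (NumPy 1.17 and later)
    and falls back to the legacy RandomState() otherwise.
    """
    factory = getattr(random_module, "default_rng", None)
    if factory is not None:
        return factory()
    return random_module.RandomState()
```

With NumPy installed this would be called as `make_default_prng(numpy.random)`. The two generator types do not share an identical API (for example, `Generator.integers` versus `RandomState.randint`), so calling code would need to account for that; requiring NumPy 1.17 or later avoids the problem entirely.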

@mpenkov
Collaborator

mpenkov commented Sep 17, 2021

Thank you @PrimozGodec!

https://pypi.org/project/gensim/4.1.2/

@gojomo
Collaborator

gojomo commented Sep 17, 2021

@PrimozGodec Do you happen to know if this updated constraint in #3236 allows the build to work on the new Apple processors (M1 etc)? (Or, if this doesn't do it, any ideas on the minimal build-params update that might help that work?)

@PrimozGodec
Contributor

@gojomo this is new to me; I only noticed during this conversation that separate wheels for the new Mac processors exist. I googled a bit, and it seems that the x86_64 wheels most packages provide are not compatible with these processors. Packages that build platform-specific wheels will slowly need to start building arm64 wheels. For now, users with the new processors will need to build the package themselves (when calling pip install package, the .tar.gz will be downloaded and the package will be built if the required libraries exist on the computer).

@piskvorky
Owner

@gojomo IIRC Gensim already builds aarch64 wheels – is that not enough?

@mpenkov
Collaborator

mpenkov commented Sep 21, 2021

They are for Linux. We don't build them for MacOS yet.

@piskvorky
Owner

piskvorky commented Sep 21, 2021

Okay. Is it a matter of just building them, or something more fundamental?

@gojomo did you try building those wheels, any errors? I see some work was done in #3026 but I'm not sure how deep this goes, no way to test.

@mpenkov
Collaborator

mpenkov commented Sep 21, 2021

It may be as simple as just building them. The devil's always in the details, though, and once we start it's possible that previously unseen obstacles will come up.

If there's demand, we can start looking into it for the next release.

@jmichaelschmidt

found a fix, let me know if it works for you:

pip install gensim --no-binary :all:

This apparently recompiles things against whatever numpy is on the system.

This solution worked for me too...

@piskvorky
Owner

piskvorky commented Sep 22, 2021

@mpenkov can we close this? AFAIK Gensim 4.1.2 fixed all the numpy issues.

Support for Apple's M1 is a separate ticket (assuming it is an issue at all, pending @gojomo feedback).

@mpenkov
Collaborator

mpenkov commented Sep 23, 2021

I think so.

mpenkov closed this as completed Sep 23, 2021
@gojomo
Collaborator

gojomo commented Sep 23, 2021

I've never built wheels, nor do I have an M1 processor. But people with M1 processors have asked on the project discussion list, and M1 processors (& their imminent followups coming in newer Macs) are reported to have some exciting performance gains – so it'd be a good thing to get/confirm working, as soon as a capable dev with the right system/tools can do so.

If pip install gensim --no-binary :all: generally gets it working, it'd be good to post that recommendation to the list thread at https://groups.google.com/g/gensim/c/4x0YipvR6-A/m/K-yxRSaQBAAJ.

davidshumway added a commit to davidshumway/stellargraph that referenced this issue Dec 25, 2021
[x] installs gensim==4.1.2 (stellargraph#2010 and piskvorky/gensim#3226)
[x] fixes typo (?): notebook states training on "75%" but `train_test_split` ratio set to "0.1", which also affects final accuracy (85% new / 72% old)
[x] sets `n_jobs=4` as default value in LogisticRegressionCV is resulting in non-convergence
@GuillemGSubies

I'm having this problem again with gensim 4.1.2 and python3.10

@gojomo
Collaborator

gojomo commented Mar 31, 2022

I'm having this problem again with gensim 4.1.2 and python3.10

Can you be more specific about the exact numpy version involved, and the exact error message you're seeing, after which steps-to-trigger?

@GuillemGSubies

@gojomo run the following command

conda create -n replicate python=3.10 -y ; conda activate replicate ; pip install gensim==4.1.2 numpy==1.21.5 ; python -c "import gensim"

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/guillem.garcia/miniconda3/envs/borrar/lib/python3.10/site-packages/gensim/__init__.py", line 11, in <module>
    from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
  File "/home/guillem.garcia/miniconda3/envs/borrar/lib/python3.10/site-packages/gensim/corpora/__init__.py", line 6, in <module>
    from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
  File "/home/guillem.garcia/miniconda3/envs/borrar/lib/python3.10/site-packages/gensim/corpora/indexedcorpus.py", line 14, in <module>
    from gensim import interfaces, utils
  File "/home/guillem.garcia/miniconda3/envs/borrar/lib/python3.10/site-packages/gensim/interfaces.py", line 19, in <module>
    from gensim import utils, matutils
  File "/home/guillem.garcia/miniconda3/envs/borrar/lib/python3.10/site-packages/gensim/matutils.py", line 1024, in <module>
    from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
  File "gensim/_matutils.pyx", line 1, in init gensim._matutils
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

@gojomo
Collaborator

gojomo commented Apr 1, 2022

Thanks for the 1-liner to replicate!

It looks like this is a new combination of versions causing a similar binary incompatibility: python==3.10; numpy==1.21.5; gensim==4.1.2.

As of gensim==4.1.2, it doesn't look like Gensim's wheels configuration even contemplates python==3.10 – see https://github.com/RaRe-Technologies/gensim/blob/4.1.2/.github/workflows/build-wheels.yml – so perhaps an interim answer/workaround is: "Gensim isn't yet tested & officially supported in Python 3.10; try Python 3.9." Similarly, in Python 3.9, our wheel-building spec only specifies numpy==1.19.3 – so forcing a later numpy, unless absolutely necessary, may contribute to the issue.

Of course, we'd eventually want to fix this both for Python 3.10, and if at all possible more automatically handle normal Python & numpy version-increments without requiring manual Gensim config-twiddling - but achieving such additional version-obliviousness may deserve a creation/discussion in a new issue.

Another workaround possibility might be to force the local recompilation of Gensim binaries, rather than relying on a wheel that may be mismatched – via an option like the pip install gensim --no-binary :all: variant above.

Relatedly: should perhaps Gensim's default installs go back to not including pre-compiled binaries, especially on OSes (like linux) where that proceeds relatively smoothly? We might then only rely upon the extra specificity imposed by binary wheels in situations, such as MSWindows installs, where the process needs that extra hand-holding.

@piskvorky
Owner

piskvorky commented Apr 1, 2022

I think precompiled wheels are good, it helps users a lot. I'd prefer to keep them for Gensim.

TBH I don't understand why numpy gives us so much trouble. I thought they created one binary incompatibility way back, which wreaked havoc downstream (incl. Gensim) at one point, but that it was exceptional.

Instead, binary numpy incompatibilities look like a common occurrence now :(

Python 3.10 is coming in the next Gensim release, including wheels. The release is imminent but I have no idea how that affects the numpy mess. I also don't know how conda is implicated (which we don't support).

@gojomo
Collaborator

gojomo commented Apr 1, 2022

My impression was that, pre-wheels, installs were still pretty straightforward on Unixes (albeit slightly slower), & the biggest benefit has been sparing Windows users from confusing buildchain choices there. But wheels also seem to require foreseeing/prebuilding-for all these varied configs, to avoid these binary-format mismatches.

I don't believe conda is an essential element of this current problem report, as the recipe-to-reproduce uses pip to install both numpy and gensim – I suspect you'd see the same error with a vanilla Python3 venv environment.

@mpenkov
Collaborator

mpenkov commented Apr 2, 2022

Yeah, I don't understand what the problem here is, either.

If numpy continues to give us pain, though, perhaps we should present users with a more helpful error message when this happens? e.g.

"Under Python {your_python_version}, gensim {gensim_version} requires numpy {minimum_numpy_version} or above, but your numpy version is {current_numpy_version}. Consider upgrading numpy"

What do you think?
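
As a rough illustration of the proposal, the message could be assembled as below. This is a sketch only: the function name `format_numpy_hint` is hypothetical, the placeholder names are taken from the proposed text, and whether "upgrade" is the right advice is questioned in the reply that follows.

```python
import sys

# Template wording taken from the proposal above.
TEMPLATE = (
    "Under Python {your_python_version}, gensim {gensim_version} requires "
    "numpy {minimum_numpy_version} or above, but your numpy version is "
    "{current_numpy_version}. Consider upgrading numpy"
)


def format_numpy_hint(gensim_version, minimum_numpy_version,
                      current_numpy_version, your_python_version=None):
    """Fill in the proposed message; defaults to the running Python version."""
    if your_python_version is None:
        your_python_version = "%d.%d" % sys.version_info[:2]
    return TEMPLATE.format(
        your_python_version=your_python_version,
        gensim_version=gensim_version,
        minimum_numpy_version=minimum_numpy_version,
        current_numpy_version=current_numpy_version,
    )
```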

@piskvorky
Owner

piskvorky commented Apr 2, 2022

Consider upgrading numpy

Isn't it the opposite? According to the above, we build with numpy==1.19.3 but the user had a problem with numpy==1.21.5.

@mattip

mattip commented Apr 2, 2022

@GuillemGSubies I cannot reproduce your situation: for me (on Ubuntu 20.04 with conda 4.10.3) that one-liner succeeds in building gensim from source and then using it. I think you should open a new issue with the complete log of your build. If you do so, please ping me there. One theory for the failure might be that you have numpy 1.22 installed somewhere and the gensim build is picking it up, but it is hard to tell without the log (even then it may be difficult to debug). The message you see indicates that gensim is being built with numpy 1.22 (the "Expected 96 from C header" part of the error), and being run with numpy 1.20 or 1.21 (the "got 88 from PyObject" part of the error). Here are the sizes of np.ndarray for reference:

version                  size
up to 1.19 inclusive     80
1.20 - 1.21 inclusive    88
1.22                     96
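
The sizes above make it possible to decode the error message mechanically. The following is a minimal sketch: the helper name `explain_size_mismatch` is hypothetical, and the mapping covers only the three sizes listed here, so anything else is reported as unknown.

```python
import re

# ndarray struct sizes by NumPy version range, copied from the table above.
SIZE_TO_VERSIONS = {
    80: "up to 1.19",
    88: "1.20 - 1.21",
    96: "1.22",
}

_PATTERN = re.compile(r"Expected (\d+) from C header, got (\d+) from PyObject")


def explain_size_mismatch(message):
    """Return (built_with, running_with) version ranges, or None if no match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    expected, got = int(match.group(1)), int(match.group(2))
    return (
        SIZE_TO_VERSIONS.get(expected, "unknown"),
        SIZE_TO_VERSIONS.get(got, "unknown"),
    )
```

Applied to the error at the top of this issue ("Expected 88 from C header, got 80 from PyObject"), the same lookup reads as built with 1.20 - 1.21 and run with a version up to 1.19.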

@conduit242

I'm still getting this a year later with Python 3.8.8 and NumPy 1.20.1 on macOS Monterey 12.6 with an i7 processor.

@PrimozGodec
Contributor

@conduit242 I discovered the same thing a few days ago and proposed a PR that should fix it: #3467.

The problem is that in version 4.3.1, they accidentally started to build wheels on the newest Numpy. Until they accept the PR, you can use Gensim 4.3.0, which should work.

cskaandorp pushed a commit to cskaandorp/asreview that referenced this issue Apr 22, 2024