Issue with initial tests #2994

Closed
botzill opened this issue Nov 30, 2018 · 7 comments
Labels
feat / serialize (Feature: Serialization, saving and loading); 🔮 thinc (spaCy's machine learning library Thinc); third-party (Third-party packages and services)

Comments

botzill commented Nov 30, 2018

Hi.

I'm trying to get started with spacy and I get this issue when running simple code:

import spacy

spacy.prefer_gpu()
nlp = spacy.load('en_core_web_sm')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/spacy/__init__.py", line 21, in load
    return util.load_model(name, **overrides)
  File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 114, in load_model
    return load_model_from_package(name, **overrides)
  File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 135, in load_model_from_package
    return cls.load(**overrides)
  File "/usr/local/lib/python3.6/dist-packages/en_core_web_sm/__init__.py", line 12, in load
    return load_model_from_init_py(__file__, **overrides)
  File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 173, in load_model_from_init_py
    return load_model_from_path(data_path, meta, **overrides)
  File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 156, in load_model_from_path
    return nlp.from_disk(model_path)
  File "/usr/local/lib/python3.6/dist-packages/spacy/language.py", line 647, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 511, in from_disk
    reader(path / key)
  File "/usr/local/lib/python3.6/dist-packages/spacy/language.py", line 643, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk(p, vocab=False)
  File "pipeline.pyx", line 643, in spacy.pipeline.Tagger.from_disk
  File "/usr/local/lib/python3.6/dist-packages/spacy/util.py", line 511, in from_disk
    reader(path / key)
  File "pipeline.pyx", line 626, in spacy.pipeline.Tagger.from_disk.load_model
  File "pipeline.pyx", line 627, in spacy.pipeline.Tagger.from_disk.load_model
  File "/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/model.py", line 335, in from_bytes
    data = msgpack.loads(bytes_data, encoding='utf8')
  File "/usr/local/lib/python3.6/dist-packages/msgpack_numpy.py", line 184, in unpackb
    return _unpackb(packed, **kwargs)
  File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 1792000 exceeds max_bin_len(1048576)


My server is powerful (16 GB of RAM and 6 CPUs). What could be the issue?

@ajaymaity

I am getting the same issue. It was working fine before today.

ines added labels on Nov 30, 2018: third-party (Third-party packages and services); 🔮 thinc (spaCy's machine learning library Thinc); feat / serialize (Feature: Serialization, saving and loading)
Member

ines commented Nov 30, 2018

Looks like this is related to today's update of msgpack 😞 Working on this, see my comment in #2995:

Looks like it might be related to an update of the msgpack library that was released today and is used in our library thinc, which spaCy depends on. So when you installed spaCy, that new version was pulled in and apparently it includes a change to the limit?

We'll investigate this and hopefully push an update to thinc soon. In the meantime, try downgrading msgpack:

pip install msgpack==0.5.6
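
To tell whether an environment needs this workaround, a small check along these lines might help. This is only a sketch: the names `LAST_KNOWN_GOOD`, `parse_version` and `needs_downgrade` are made up for illustration, and the only assumption about msgpack itself is the one taken from this thread, namely that 0.5.6 still works and anything newer may not.

```python
import re

# Assumption taken from the workaround in this thread: 0.5.6 is the last
# msgpack release known to load the model without the max_bin_len error.
LAST_KNOWN_GOOD = (0, 5, 6)


def parse_version(version):
    """Turn a version string such as '0.5.6' into a tuple of ints.

    Only the leading digits of each dot-separated piece are kept, so a
    suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_downgrade(installed_version):
    """Return True if the given msgpack version is newer than LAST_KNOWN_GOOD."""
    return parse_version(installed_version) > LAST_KNOWN_GOOD


print(needs_downgrade("0.5.6"))  # the pinned version from the workaround
print(needs_downgrade("0.6.0"))  # a hypothetical newer version
```

The version string to pass in can be read from the "Version:" line printed by pip show msgpack.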

ines closed this as completed on Nov 30, 2018
@ivyleavedtoadflax
Contributor

I've started experiencing this today as well, whenever I load the en_core_web_sm model, either with:

import en_core_web_sm
nlp = en_core_web_sm.load()

or, having downloaded the model with spacy download, with:

import spacy
nlp = spacy.load('en_core_web_sm')

Interestingly, when I run python -m spacy info --markdown, it returns **Models:** en_core_web_lg, despite my having loaded en_core_web_sm.

Info about spaCy

  • spaCy version: 2.0.17/2.0.13
  • Platform: Linux-4.15.0-39-generic-x86_64-with-debian-buster-sid
  • Python version: 3.6.4
  • Models: en_core_web_sm

h-oll commented Dec 18, 2018

Getting the same kind of issue with msgpack 0.5.6:

File "msgpack/_unpacker.pyx", line 200, in msgpack._unpacker.unpackb
ValueError: 2681947787 exceeds max_bin_len(2147483647)
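
The limit in this message is not the one from the original report, which is easier to see once the numbers are pulled out of both messages. Below is a rough sketch that does this with a regular expression. The helper name `describe_limit_error` is hypothetical, and the pattern is only assumed to match the message format quoted in this thread.

```python
import re

# Matches the message format seen in the tracebacks in this thread,
# e.g. "1792000 exceeds max_bin_len(1048576)".
LIMIT_ERROR = re.compile(r"(\d+) exceeds (max_\w+_len)\((\d+)\)")


def describe_limit_error(message):
    """Extract the payload size, limit name and limit value from a message.

    Returns None if the message does not match the expected format.
    """
    match = LIMIT_ERROR.search(message)
    if match is None:
        return None
    size = int(match.group(1))
    limit = int(match.group(3))
    return {
        "limit_name": match.group(2),
        "size_bytes": size,
        "limit_bytes": limit,
        "over_by_bytes": size - limit,
    }


first = describe_limit_error("ValueError: 1792000 exceeds max_bin_len(1048576)")
second = describe_limit_error("ValueError: 2681947787 exceeds max_bin_len(2147483647)")

# The original report hits a limit of exactly 1 MiB.
print(first["limit_bytes"] == 1024 * 1024)
# The report with msgpack 0.5.6 hits a limit of 2 GiB minus one byte.
print(second["limit_bytes"] == 2**31 - 1)
```

Both comparisons hold. The first payload is well under 2 MiB, while the second is larger than 2 GiB, so the two reports may have different causes even though the error text looks the same.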

aloncohen1 commented Dec 18, 2018

I am getting the same error when I am trying to run:
nlp = spacy.load('en_core_web_sm')

I am getting:
ValueError: 1792000 exceeds max_bin_len(1048576)

@kylehiroyasu

Same issue...

lock bot commented Jan 19, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked this as resolved and limited conversation to collaborators on Jan 19, 2019