Read First: FAQ / Common Issues / Troubleshooting Guide #8226
polm started this conversation in Help: Best practices
This is a guide to issues that multiple people have experienced. Some of them are basic Python environment issues, some of them are recent bugs in or out of spaCy we're dealing with, and some we're not sure about yet.
Be sure to also check out the troubleshooting section in the docs!
How to ask a good question
If your question isn't answered here, feel free to open a new discussion. Some things to keep in mind to help us help you:
Include the output of spacy info and the versions of any related packages you are using.
Are you using an old spaCy version?
Be sure to upgrade to the most recent spaCy version to get fixes for older bugs. If you're using v2.x, we're still releasing updates for it, so try those too. If you can't upgrade your version for some reason, let us know why so we understand your situation better.
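When reporting versions or checking whether an upgrade is due, it can help to collect them in one place. The following is a minimal sketch using only the standard library; the helper name package_versions and the package list are illustrative assumptions, not part of spaCy, and the output of spacy info remains the more complete report.

```python
from importlib import metadata


def package_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# The package list is an assumption; adjust it to the packages you actually use.
report = package_versions(["spacy", "thinc", "spacy-transformers", "torch"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to the spacy info output in a new discussion gives a quick picture of the environment.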
Installing trained pipelines
Jupyter and Google Colab
I installed spaCy in Jupyter but I get the error "no such module"
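One common cause of this error is that the notebook kernel runs a different Python environment from the one where spaCy was installed. As a rough sketch for checking this (the helper describe_module is a hypothetical name, and this may not be the cause in every case), you can run the following in a notebook cell to see which interpreter the kernel uses and whether it can find the module:

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and where (if anywhere) it finds a module."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        # None means this interpreter cannot import the module.
        "location": spec.origin if spec is not None else None,
    }


info = describe_module("spacy")
print("Interpreter:", info["interpreter"])
print("spacy found at:", info["location"] or "not found in this environment")
```

If the module is not found, installing from inside the notebook with %pip install spacy usually targets the kernel's own environment.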
GPU Issues on Google Colab
KeyError: 'packaging'
It seems to be possible to fix this by using a clean environment and not installing cupy.
GPU Support
I have an AMD card...
We do not currently have a guide for this. CuPy has experimental support for AMD GPUs that you can try. If you've gotten spaCy working with an AMD card, please let us know! You can open a Discussion on the topic.
Training a ML model
Check the Example Projects!
Did you know we have example projects? These are complete examples of using the NER, Text Categorizer, Entity Linking, and other models. They include all necessary training data, so you can check out the code, train a model, and use their configs and conversion scripts as reference for your own models. If you have a question about how everything fits together in practice, check here first!
Preprocessing Text
My retrained model forgot pretrained entities
Can I continue training from the latest epoch of the previous training run?
I'm having trouble with binary classification in textcat
I want to add non-textual features
As of 3.2, nlp() and nlp.pipe() accept Docs as input, so what you can do is create a simple Doc and add your data as custom attributes, and then pass that Doc to another pipeline.
Here are some older workarounds for this:
make_doc to pass arbitrary extra data.
What are iterations, steps, epochs....? When does training stop?
E and #
Hyperparameters
Performance
Incorrect pre-trained model predictions
spaCy is too slow
I'm getting Out of Memory errors
doc_cleaner component at the end of the pipeline in spaCy v3.2.1+ to automatically remove doc._.trf_data to reduce the memory required during the training evaluate steps and the size of any saved output docs.
Windows
I get the error Microsoft Visual C++ 14.0 is required
I get the error ImportError: DLL load failed: The specified module could not be found.