FAQ: What to do when spaCy is too slow? #8402
polm started this conversation in Help: Best practices
Sometimes people find that spaCy is taking too long for their intended use and want to speed up processing. We have a set of standard suggestions for improving processing speed that you should try first before getting creative. This FAQ will show you how to make spaCy fast, especially for large amounts of data or if you're only using some components. If you find these don't help you, please feel free to open a Discussion with more details of your problem - see the bottom of this post for details on that.
Use nlp.pipe
When you create a doc, spaCy has to acquire and then release a number of resources. By default these are not re-used between calls of `nlp(text)`, but there's a slightly different interface, `nlp.pipe`, that does re-use them and can often give you a significant speedup. You can use `nlp.pipe` like this:
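Here's a minimal sketch, assuming the small English pipeline and a list of strings called `texts` (both are just example placeholders):

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# texts can be any iterable of strings; a short list is used here as a placeholder
texts = ["This is the first document.", "And here is another one."]

# nlp.pipe streams the texts through the pipeline and re-uses resources
# between documents instead of setting them up for every call
for doc in nlp.pipe(texts):
    print([(ent.text, ent.label_) for ent in doc.ents])
```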
Using this is often enough to resolve speed problems, but if it doesn't cut it for you, read on.
Disable components you aren't using
If you're using spaCy just for the NER predictions, then time spent running the parser is wasted. Be sure to disable or avoid loading components you won't use.
If you aren't using a component at all, you can avoid ever running it like this:
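As a sketch, assuming you only need NER from the small English pipeline (the component names below are assumptions; check `nlp.pipe_names` to see what your pipeline actually contains):

```python
import spacy

# exclude keeps the listed components from even being loaded;
# tok2vec is kept because the NER component depends on it
nlp = spacy.load(
    "en_core_web_sm",
    exclude=["tagger", "parser", "attribute_ruler", "lemmatizer"],
)

doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
print([(ent.text, ent.label_) for ent in doc.ents])
```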
If you want to use components sometimes, but not in a particular hot loop, you can disable them just for a single call to `nlp.pipe` like this:
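For example, assuming the parser and lemmatizer are the components you don't need inside the hot loop:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
texts = ["One document to process.", "Another document to process."]

# disable skips these components for this call only; they stay loaded
# and still run in ordinary calls like nlp(text)
for doc in nlp.pipe(texts, disable=["parser", "lemmatizer"]):
    print([(ent.text, ent.label_) for ent in doc.ents])
```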
There are more options too; see the docs. Also see this section for examples of disabling everything except NER, and other specific cases.
Use a smaller model
If the above options don't help, you might want to consider using a smaller model. In particular, CPU-based models will typically be faster than GPU-based models.
While larger models will almost always be more accurate, sometimes the difference between models is quite small. If speed is a priority for your application, it's definitely worth evaluating smaller models. Thanks to the way spaCy pipelines are constructed, trying out a smaller model can be as simple as changing a single line in a training config, so testing smaller models doesn't require much developer or machine time.
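Outside of a training config, swapping pipelines at load time is also a one-line change. A sketch, assuming both packages are installed:

```python
import spacy

# Transformer pipeline: most accurate, but needs more compute
nlp_accurate = spacy.load("en_core_web_trf")

# Small CPU pipeline: less accurate on some tasks, but much faster
nlp_fast = spacy.load("en_core_web_sm")
```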
Use rule-based components
If you need sentence tokenization or lemmas, you can use rule-based alternatives to the default models for a speedup. See the docs for details. Here's an example of how to turn off the parser and use the Sentencizer instead for faster, rule-based sentence tokenization:
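A minimal sketch, assuming the small English pipeline:

```python
import spacy

# Load the pipeline without the parser and add the rule-based Sentencizer instead
nlp = spacy.load("en_core_web_sm", exclude=["parser"])
nlp.add_pipe("sentencizer")

doc = nlp("This is a sentence. This is another one.")
print([sent.text for sent in doc.sents])
```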
Check out the Cython API
spaCy has a Cython API that allows you to access internal data structures with less overhead. While the API is thoroughly documented, it's designed for speed over safety, so proceed with caution when using it.
Consider your application architecture
Sometimes the best way to deal with a performance issue is to change your plans so it doesn't come up in the first place.
Maybe you're parsing data when a user requests it, but it ends up being too much data. Can you instead process the data when you first get it and save the results for later?
Alternately, maybe you're processing all your data as soon as it arrives, but that's taking forever. If users only see some of your data, can you try processing it on demand and caching spaCy's output somewhere?
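Either way, one way to save or cache spaCy's output is to serialize processed docs with `DocBin`. A sketch, where the file path and texts are placeholders:

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.load("en_core_web_sm")
texts = ["First incoming document.", "Second incoming document."]

# Process once and save the results
doc_bin = DocBin(store_user_data=True)
for doc in nlp.pipe(texts):
    doc_bin.add(doc)
doc_bin.to_disk("processed_docs.spacy")

# Later, reload the cached docs without running the pipeline again
cached_docs = list(DocBin().from_disk("processed_docs.spacy").get_docs(nlp.vocab))
```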
There's not always an architectural answer - sometimes you just have a lot of input and want answers fast - but it's good to check whether you've missed anything, and to note in your docs or code why the current approach is the best it can be.
Using multiprocessing with nlp.pipe
`nlp.pipe` takes an argument `n_process` which can be used for multiprocessing. However, multiprocessing does not guarantee a speedup. Here are some tips for using this option; a minimal usage sketch follows the list.

- If you want to know whether there's a speedup from using multiprocessing, try `n_process=2` as a comparison to single-process first, rather than jumping straight to `n_process=-1`. If `2` is faster than `1`, then try increasing `n_process`.
- Windows and macOS have different multiprocessing behavior than Linux. Because of this, you may not see a speedup with `n_process>1` on those operating systems.
- Combining `n_process>1` with tools that also use multiprocessing (e.g. the standard library's multiprocessing, concurrent.futures, or something with service workers) can quickly lead to significant coordination overhead.
- With larger models, memory limits are a concern. Be sure you have enough memory for a copy of the model in each process. `memory_profiler` can be helpful here, using a command like `mprof run --include-children --multiprocess <your_program.py>`.
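A minimal sketch of trying `n_process=2`, where the pipeline name and texts are placeholders; the `__main__` guard matters on Windows and macOS, where new processes are spawned:

```python
import spacy

def main():
    nlp = spacy.load("en_core_web_sm")
    # A large iterable of strings; repeated dummy text used as a placeholder
    texts = ["Some text to process."] * 10_000

    # Compare n_process=2 against a single process before scaling up further
    for doc in nlp.pipe(texts, n_process=2, batch_size=1000):
        pass

if __name__ == "__main__":
    main()
```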
Helpful Discussions:
If the above doesn't solve your performance issues, then feel free to open a Discussion, but be sure to include an overview of your whole usage pattern and a description of what components you are using.