FAQ: What to do when spaCy is too slow? #8402
polm started this conversation in Help: Best practices
Sometimes people find that spaCy is taking too long for their intended use and want to speed up processing. We have a set of standard suggestions for improving processing speed that you should try first before getting creative. This FAQ will show you how to make spaCy fast, especially for large amounts of data or if you're only using some components. If you find these don't help you, please feel free to open a Discussion with more details of your problem - see the bottom of this post for details on that.
Use nlp.pipe
When you create a doc, spaCy has to acquire and then release a number of resources. By default these are not re-used between calls of `nlp(text)`, but there's a slightly different interface, `nlp.pipe`, that does re-use them and can often give you a significant speedup. You can use `nlp.pipe` like this:
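Here's a minimal sketch, assuming the small English pipeline and a list of strings called `texts` (both are just example placeholders):

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# texts can be any iterable of strings; a short list is used here as a placeholder
texts = ["This is the first document.", "And here is another one."]

# nlp.pipe streams the texts through the pipeline and re-uses resources
# between documents instead of setting them up for every call
for doc in nlp.pipe(texts):
    print([(ent.text, ent.label_) for ent in doc.ents])
```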
Using this is often enough to resolve speed problems, but if it doesn't cut it for you, read on.
Disable components you aren't using
If you're using spaCy just for the NER predictions, then time spent running the parser is wasted. Be sure to disable or avoid loading components you won't use.
If you aren't using a component at all, you can avoid ever running it like this:
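As a sketch, assuming you only need NER from the small English pipeline (the component names below are assumptions; check `nlp.pipe_names` to see what your pipeline actually contains):

```python
import spacy

# exclude keeps the listed components from even being loaded;
# tok2vec is kept because the NER component depends on it
nlp = spacy.load(
    "en_core_web_sm",
    exclude=["tagger", "parser", "attribute_ruler", "lemmatizer"],
)

doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
print([(ent.text, ent.label_) for ent in doc.ents])
```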
If you want to use components sometimes, but not in a particular hot loop, you can disable them just for a single call to `nlp.pipe` like this:
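For example, assuming the parser and lemmatizer are the components you don't need inside the hot loop:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
texts = ["One document to process.", "Another document to process."]

# disable skips these components for this call only; they stay loaded
# and still run in ordinary calls like nlp(text)
for doc in nlp.pipe(texts, disable=["parser", "lemmatizer"]):
    print([(ent.text, ent.label_) for ent in doc.ents])
```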
There are more options too; see the docs. Also see this section for examples of disabling everything except NER, and other specific cases.
Use a smaller model
If the above options don't help, you might want to consider using a smaller model. In particular, CPU-based models will typically be faster than GPU-based models.
While larger models will almost always be more accurate, sometimes the difference between models is quite small. If speed is a priority for your application, it's definitely worth evaluating smaller models. Thanks to the way spaCy pipelines are constructed, trying out a smaller model can be as simple as changing a single line in a training config, so testing smaller models doesn't require much developer or machine time.
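Outside of a training config, swapping pipelines at load time is also a one-line change. A sketch, assuming both packages are installed:

```python
import spacy

# Transformer pipeline: most accurate, but needs more compute
nlp_accurate = spacy.load("en_core_web_trf")

# Small CPU pipeline: less accurate on some tasks, but much faster
nlp_fast = spacy.load("en_core_web_sm")
```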
Use rule-based components
If you need sentence tokenization or lemmas, you can use rule-based alternatives to the default models for a speedup. See the docs for details. Here's an example of how to turn off the parser and use the Sentencizer instead for faster, rule-based sentence tokenization:
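A minimal sketch, assuming the small English pipeline:

```python
import spacy

# Load the pipeline without the parser and add the rule-based Sentencizer instead
nlp = spacy.load("en_core_web_sm", exclude=["parser"])
nlp.add_pipe("sentencizer")

doc = nlp("This is a sentence. This is another one.")
print([sent.text for sent in doc.sents])
```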
Check out the Cython API
spaCy has a Cython API that allows you to access internal data structures with less overhead. While the API is thoroughly documented, it's designed for speed over safety, so proceed with caution when using it.
Consider your application architecture
Sometimes the best way to deal with a performance issue is to change your plans so it doesn't come up in the first place.
Maybe you're parsing data when a user requests it, but it ends up being too much data. Can you instead process the data when you first get it and save the results for later?
Alternately, maybe you're processing all your data as soon as it arrives, but that's taking forever. If users only see some of your data, can you try processing it on demand and caching spaCy's output somewhere?
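Either way, one way to save or cache spaCy's output is to serialize processed docs with `DocBin`. A sketch, where the file path and texts are placeholders:

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.load("en_core_web_sm")
texts = ["First incoming document.", "Second incoming document."]

# Process once and save the results
doc_bin = DocBin(store_user_data=True)
for doc in nlp.pipe(texts):
    doc_bin.add(doc)
doc_bin.to_disk("processed_docs.spacy")

# Later, reload the cached docs without running the pipeline again
cached_docs = list(DocBin().from_disk("processed_docs.spacy").get_docs(nlp.vocab))
```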
There's not always an architectural answer - sometimes you just have a lot of input and want answers fast - but it's good to check whether you've missed anything, and to note in your docs or code why the current approach is the best it can be.
Using multiprocessing with nlp.pipe
`nlp.pipe` takes an argument `n_process` which can be used for multiprocessing. However, multiprocessing does not guarantee a speedup. Here are some tips for using this option; a minimal usage sketch follows the list.

- If you want to know whether there's a speedup from using multiprocessing, try `n_process=2` as a comparison to single-process first, rather than jumping straight to `n_process=-1`. If `2` is faster than `1`, then try increasing `n_process`.
- Windows and macOS have different multiprocessing behavior than Linux. Because of this, you may not see a speedup with `n_process>1` on those operating systems.
- Combining `n_process>1` with tools that also use multiprocessing (e.g. the standard library's multiprocessing, concurrent.futures, or something with service workers) can quickly lead to significant coordination overhead.
- With larger models, memory limits are a concern. Be sure you have enough memory for a copy of the model in each process. `memory_profiler` can be helpful here, using a command like `mprof run --include-children --multiprocess <your_program.py>`.
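A minimal sketch of trying `n_process=2`, where the pipeline name and texts are placeholders; the `__main__` guard matters on Windows and macOS, where new processes are spawned:

```python
import spacy

def main():
    nlp = spacy.load("en_core_web_sm")
    # A large iterable of strings; repeated dummy text used as a placeholder
    texts = ["Some text to process."] * 10_000

    # Compare n_process=2 against a single process before scaling up further
    for doc in nlp.pipe(texts, n_process=2, batch_size=1000):
        pass

if __name__ == "__main__":
    main()
```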
Helpful Discussions:
If the above doesn't solve your performance issues, then feel free to open a Discussion, but be sure to include an overview of your whole usage pattern and a description of what components you are using.