Memory leak when using spaCy with FastAPI #10496
Replies: 2 comments 2 replies
-
The processes are killed because no free memory remains. The interesting part here is the memory usage of the gunicorn processes, so I dumped the memory usage during operation:
As you can see, each gunicorn process uses ~2 GB RAM (the PSS column) and an additional ~500 MB of swap space. After a restart of the spaCy FastAPI service, it still uses 2 GB RAM, but nearly no swap space.
-
Hi @MariamRiaz, the increase in memory may be caused by the growing vocab as your server accepts many requests, although it is unusual that it grows that quickly. It's hard to pinpoint the exact cause: it could be the gunicorn server itself rather than the logic behind getting the dependency information. One thing you can try is to force-restart a worker after a certain number of requests. If you're using FastAPI with gunicorn/uvicorn, you can set `--max-requests` to some number to alleviate the memory leak.
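For reference, the worker-recycling suggestion above can be applied via gunicorn's command line. This is a minimal sketch; `main:app`, the worker count, and the request limits are placeholders you'd tune for your own deployment:

```shell
# Recycle each worker after roughly 1000 requests so per-worker memory
# growth is capped; the jitter staggers restarts so workers don't all
# recycle at the same moment.
gunicorn main:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --max-requests 1000 \
  --max-requests-jitter 100
```

Restarting a worker drops its entire process memory, including spaCy's accumulated vocab, at the cost of a brief model reload.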
-
Hi All,
I am using FastAPI to serve two spaCy models, "en_core_web_lg" and "de_core_news_lg", as a REST API. We send our text data to this REST API to get dependency information from the models. While testing on our large dataset, we experienced a memory leak in this process once the data reached a million data points. (Keep in mind that our texts are around 30-60 tokens on average, not lengthy documents.) Upon further investigation, it seems that the process's memory consumption increases gradually with the number of data points it processes. Any help in resolving this issue would be much appreciated.
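The gradual growth described above is consistent with spaCy's string store behavior: every distinct token a pipeline sees is interned and never evicted, so a long-running worker's memory grows with the number of unique strings processed. A minimal sketch demonstrating the effect (using a blank pipeline so no trained model download is needed; trained models additionally cache lexeme data):

```python
import spacy

# Blank English pipeline -- enough to show the effect, since string
# interning happens during tokenization.
nlp = spacy.blank("en")

before = len(nlp.vocab.strings)

# Simulate many requests that each contain previously unseen strings,
# e.g. IDs, names, or typos in user-submitted text.
for i in range(10_000):
    nlp(f"request-{i} contains a fresh token")

after = len(nlp.vocab.strings)
print(f"strings before: {before}, after: {after}")

# The StringStore only grows; entries are never evicted, so memory
# accumulates in proportion to the distinct tokens seen by the worker.
```

This is why the growth is per-worker and disappears after a restart: the vocab lives in process memory and is rebuilt fresh when the model is reloaded.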