Steady memory use increase when processing not so large files #437
Comments
I agree that changing …
Thanks for the example, it's nice of you to have taken the time to create it. And to be clear, I'm not suggesting there is an issue in the library; I'm quite convinced it comes from my own code. The example is not using the …
You can use my code and convert it to …
Thanks for your patience, here is a test script that gives the following memory usage graph. I replaced the use of the …
You really need to run the process for a long time, e.g. 10,000,000 records, to make sure the memory is stable. It is normal for the memory to ramp up slightly in the meantime.
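For instance, memory stability can be checked without any input file; a minimal sketch along these lines (synthetic records, output discarded, not the script from the linked commits):

```js
// Sketch only: feed 10 000 000 synthetic records through parse/stringify,
// discard the output, and sample the heap every million records.
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');
const { createWriteStream } = require('node:fs');
const { parse } = require('csv-parse');
const { stringify } = require('csv-stringify');

function* lines(count) {
  yield 'a,b,c\n';
  for (let i = 0; i < count; i++) {
    if (i % 1_000_000 === 0) {
      const mb = Math.round(process.memoryUsage().heapUsed / 1024 / 1024);
      console.log(`record ${i}: heapUsed ${mb} MB`);
    }
    yield `${i},foo,bar\n`;
  }
}

pipeline(
  Readable.from(lines(10_000_000)),
  parse({ columns: true }),
  stringify({ header: true }),
  createWriteStream('/dev/null') // discard the output (POSIX)
).then(() => console.log('done'));
```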
Just to confirm, I created a script similar to yours in commit 9d62219 and ran it for some time. Memory is stable; here are 3 memory dumps:
Look at bec3f12: it inserts a transformer to stop the pipeline once a certain number of processed records is reached.
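The idea, roughly, is a limiting transform like the following sketch (not the exact code from that commit):

```js
// An object-mode Transform that forwards records until a limit is reached,
// then ends its readable side so everything downstream flushes and finishes.
const { Transform } = require('node:stream');

const limit = (max) => {
  let count = 0;
  return new Transform({
    objectMode: true,
    transform(record, _encoding, callback) {
      if (count < max) {
        count += 1;
        callback(null, record); // pass the record through
      } else {
        this.push(null); // signal end-of-stream to the downstream consumers
        callback();      // keep accepting (and discarding) upstream records
      }
    },
  });
};

// e.g. source.pipe(parse({ columns: true })).pipe(limit(10000)).pipe(stringifier)
```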
Happy to learn about the root cause, closing.
2 million lines results in 50 GB memory usage...
This is JavaScript. Memory usage is the heap size, around 10 to 20 GB; 50 GB is more about the allocated/reserved memory. It includes the Node.js runtime and the buffer, whose size is configurable. The parser in itself consumes no memory at all.
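For example, sampling `process.memoryUsage()` during the run shows the difference between the heap and what the OS has reserved (a minimal sketch):

```js
// heapUsed/heapTotal is what JS objects occupy; rss is everything the OS
// has reserved for the process (runtime, buffers, native allocations).
const timer = setInterval(() => {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  const mb = (n) => `${Math.round(n / 1024 / 1024)} MB`;
  console.log(`rss=${mb(rss)} heapTotal=${mb(heapTotal)} heapUsed=${mb(heapUsed)} external=${mb(external)}`);
}, 1000);
timer.unref(); // do not keep the process alive just for the sampling
```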
I used …

I fixed it by using the readline module to process the file in batches of 10,000; now the memory issue is gone. Maybe worth adding as a feature if the detected file size is too large?
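Something along these lines (a sketch, with header handling omitted; `handleRecords` is a placeholder for the actual per-batch work):

```js
// Read the file line by line with node:readline, collect batches of 10 000
// lines, and parse each batch with the csv-parse sync API.
const { createReadStream } = require('node:fs');
const readline = require('node:readline');
const { parse } = require('csv-parse/sync');

async function processInBatches(path, batchSize = 10000) {
  const rl = readline.createInterface({
    input: createReadStream(path),
    crlfDelay: Infinity,
  });
  let batch = [];
  for await (const line of rl) {
    batch.push(line);
    if (batch.length === batchSize) {
      await handleRecords(parse(batch.join('\n'))); // handleRecords: placeholder
      batch = [];
    }
  }
  if (batch.length > 0) {
    await handleRecords(parse(batch.join('\n')));
  }
}
```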
Detecting a large file size is out of scope for the library.
Hello,
I'm using csv-parse to process a CSV file and csv-stringify to create an output CSV file out of that process with the following pseudo code:
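In outline it looks like this (simplified; `processRecord()`, `shouldAbort()` and the output file name stand in for the real code):

```js
// Simplified sketch of the pipeline: unzipper entry -> csv-parse -> per-record
// processing -> csv-stringify -> output file.
const { parse } = require('csv-parse');
const { stringify } = require('csv-stringify');
const { createWriteStream } = require('node:fs');

// inside an async function
const outputStringifyer = stringify({ header: true });
outputStringifyer.pipe(createWriteStream('output.csv'));

const parser = sourcefile.stream().pipe(parse({ columns: true })); // assuming the unzipper entry exposes .stream()
for await (const record of parser) {
  const processed = processRecord(record); // placeholder for the actual processing
  outputStringifyer.write(processed);      // note: the return value (backpressure) is not checked
  if (shouldAbort(processed)) break;       // placeholder for the abort condition
}
outputStringifyer.end();
```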
`sourcefile` is a dictionary entry from a zip file read with `unzipper`, which conveniently gives a stream to work with.

Tracing the results from `process.memoryUsage()`, I get the following graph (the horizontal axis is the record index):
If I comment out the call to `outputStringifyer.write()`, then I get the following graph. The memory increase is much smaller, if present at all, in that case.
The examples above are from a 10k-line file; with a 100k-line file I reach the default memory limit in Node.js 18, which then triggers this error:
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Passing the `--max-old-space-size=4096` option allows the 100k-line file to be processed just fine, but I'm worried that this just pushes the limit further down the line.

I'm quite convinced there is a fundamental misunderstanding in my code, but right now it eludes me.
Maybe I would need to change the async-iterator approach to a stream approach, but I'm not sure how that would be done, let alone whether it would allow me to abort in the middle like I do in the example above.
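I imagine something along these lines, though I haven't verified it (the same placeholders as above):

```js
// Sketch: the same steps as a pipeline, with an AbortController to stop
// in the middle. processRecord()/shouldAbort() are placeholders.
const { Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');
const { parse } = require('csv-parse');
const { stringify } = require('csv-stringify');
const { createWriteStream } = require('node:fs');

// inside an async function
const ac = new AbortController();
const transformRecords = new Transform({
  objectMode: true,
  transform(record, _encoding, callback) {
    const processed = processRecord(record);
    if (shouldAbort(processed)) ac.abort(); // stops the whole pipeline
    callback(null, processed);
  },
});

try {
  await pipeline(
    sourcefile.stream(),        // assuming the unzipper entry exposes .stream()
    parse({ columns: true }),
    transformRecords,
    stringify({ header: true }),
    createWriteStream('output.csv'),
    { signal: ac.signal },
  );
} catch (err) {
  if (err.name !== 'AbortError') throw err; // an abort rejects the pipeline promise
}
```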
Thanks a lot for any suggestion.