Don't load CAR file into memory during upload #193

Open
makew0rld opened this issue Jun 10, 2024 · 5 comments

@makew0rld

Command: w3 up --car my_file.car.

When running this command with a large file, for example a CAR file created from a 4 GiB file of random data, I noticed my memory usage would go up by ~4 GiB, indicating w3 is loading the entire file into memory during the upload process.

I'm not sure exactly where in the code this is coming from, but it's not ideal and could cause OOM crashes/issues in some scenarios. Please let me know if a fix for this is possible.

@alanshaw
Member

How are you measuring this, please? I believe Node.js by default allows up to ~4 GB of memory to be allocated even if the memory actually in use is less.
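
One way to separate allocator headroom from data the process is actually holding would be to log process.memoryUsage() from inside the process while the upload runs. A minimal sketch (note that file bytes held in Buffers show up under external/arrayBuffers rather than heapUsed):

```js
// Print the process's own view of its memory once per second.
// rss = total resident memory (what htop's RES column reports)
// heapUsed = live JS objects on the V8 heap
// external / arrayBuffers = Buffer data held outside the V8 heap
const mib = (n) => (n / 1024 / 1024).toFixed(1) + ' MiB'
setInterval(() => {
  const { rss, heapUsed, external, arrayBuffers } = process.memoryUsage()
  console.error(`rss=${mib(rss)} heapUsed=${mib(heapUsed)} external=${mib(external)} arrayBuffers=${mib(arrayBuffers)}`)
}, 1000)
```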

@makew0rld
Author

makew0rld commented Jun 12, 2024

I am measuring this with htop and free. Here is a screenshot as an example:

[screenshot: htop output with the w3 process showing over 5 GiB in the RES column]

The screenshot shows multiple threads, but looking just at the top entry, the RES column shows w3 using over 5 GiB of memory. I could also observe this by watching how overall memory usage (in htop or free) changed from before I started the test to while w3 was running: it went up by about 4-5 GiB.

I don't think this is related to Node.js memory allocation settings; w3 doesn't do this with smaller files. To me it looks like some code in the stack is loading the entire file into memory to do the upload, instead of reading it in small chunks (say, 32 KiB) and uploading those. An uploader CLI should never need gigabytes of memory.
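
To illustrate the difference I mean, here is a minimal Node sketch (the endpoint URL is a placeholder, and I'm not claiming this is where w3 buffers): fs.promises.readFile materializes the whole file as one Buffer, while fs.createReadStream plus a streaming request body keeps in-process buffering near the stream's highWaterMark.

```js
import fs from 'node:fs'
import { Readable } from 'node:stream'

// Buffering: the whole ~4 GiB CAR becomes one Buffer before the upload starts.
// const body = await fs.promises.readFile('my_file.car')

// Streaming: only ~32 KiB is buffered in the process at a time.
const file = fs.createReadStream('my_file.car', { highWaterMark: 32 * 1024 })
await fetch('https://example.invalid/upload', { // placeholder endpoint
  method: 'PUT',
  body: Readable.toWeb(file), // convert the Node stream to a web ReadableStream
  duplex: 'half',             // required by Node's fetch for streaming bodies
})
```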

Please let me know if you are not able to reproduce this, as it reproduces consistently on my machine.


Edit: To be clear, the file uploads fine; it works. But the excessive memory usage is still a bug.

@makew0rld
Author

@alanshaw any update on this?

@SethDocherty

SethDocherty commented Sep 20, 2024

@makew0rld have you had any luck uploading even larger files? My colleague was running into timeout issues trying to upload a 32 GB CAR file yesterday. After the 8th timeout he tried another CAR file that was 2.9 GB, and that one succeeded on the first try.


After some additional research, I did find this note, which gives some details on how a CAR file larger than 4 GB is split up into smaller chunks.

I notice that the command you ran doesn't pass the --shard-size option. @alanshaw, does the upload command default to a shard size (this for reference) if the file is greater than 4 GB?
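
Related to that: the CAR format itself can be consumed block-by-block, so sharding shouldn't require holding the whole file in memory. A rough sketch with @ipld/car that just tallies where shard boundaries would fall (the 100 MiB target is an arbitrary stand-in for whatever --shard-size defaults to, and it ignores per-shard CAR header overhead):

```js
import fs from 'node:fs'
import { CarBlockIterator } from '@ipld/car'

const SHARD_SIZE = 100 * 1024 * 1024 // hypothetical target, in bytes

// Iterate the CAR one block at a time; memory stays bounded by block size,
// not file size.
const blocks = await CarBlockIterator.fromIterable(fs.createReadStream('my_file.car'))

let shard = 1
let shardBytes = 0
for await (const { bytes } of blocks) {
  if (shardBytes + bytes.length > SHARD_SIZE && shardBytes > 0) {
    console.log(`shard ${shard}: ~${shardBytes} bytes`)
    shard++
    shardBytes = 0
  }
  shardBytes += bytes.length
}
console.log(`shard ${shard}: ~${shardBytes} bytes`)
```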

@makew0rld
Author

I haven't tested with files larger than 4 GB.
