S3 streaming with s3 cp uses several GB of memory on upload #923
Comments
Interesting. Will look into it. Also, on a side note, make sure you use
It's 10,000 parts max according to the documentation: http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html so that's a limit of ~50 GB at the minimum part size.
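To make the arithmetic behind that ~50 GB figure concrete, here is a rough sketch. It assumes the documented S3 limits of 10,000 parts per upload and a 5 MiB minimum part size; the function names are made up for illustration and are not part of the CLI.

```python
# Limits from the S3 multipart upload documentation.
MAX_PARTS = 10_000
MIN_PART_SIZE = 5 * 1024 ** 2  # 5 MiB minimum part size (last part may be smaller)


def max_upload_size(part_size=MIN_PART_SIZE, max_parts=MAX_PARTS):
    """Largest object (in bytes) that fits in max_parts parts of part_size."""
    return part_size * max_parts


def min_part_size_for(total_bytes, max_parts=MAX_PARTS):
    """Smallest part size (hypothetical helper) that keeps an upload of
    total_bytes within max_parts parts, never below the S3 minimum."""
    needed = -(-total_bytes // max_parts)  # ceiling division
    return max(MIN_PART_SIZE, needed)


if __name__ == "__main__":
    limit = max_upload_size()
    print(f"limit at minimum part size: {limit} bytes ({limit / 1024 ** 3:.2f} GiB)")
    print(f"part size needed for 100 GiB: {min_part_size_for(100 * 1024 ** 3)} bytes")
```

Anything larger than that limit needs a bigger part size to stay under the part count, which is why per-part memory grows for very large uploads.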
Yep, that's right. The good news is that I have confirmed the bug, and it is a very easy fix: the wrong constant was being used to limit the amount of data held in memory. It must have been changed when I rebased off develop to merge the original pull request. For a streaming upload, the maximum memory usage you should expect is around 90 MB. Fast producers, like cat, will tend to hit that ceiling; slower producers will use noticeably less. Memory usage does increase if the file is over 50 GB, because the chunk size is bumped up. Thanks for the catch! I will send out a pull request soon.
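For anyone curious how a cap on in-memory data works in general, here is a minimal sketch of a bounded queue between a stream reader and a consumer. This is not the CLI's actual code; `stream_in_chunks`, `handle_chunk`, `chunk_size`, and `max_queued` are hypothetical names and values chosen for illustration. With this layout, roughly `(max_queued + 2) * chunk_size` bytes are held at once: the queued chunks, plus one in the reader and one in the consumer.

```python
import io
import queue
import threading


def stream_in_chunks(stream, handle_chunk, chunk_size=1024, max_queued=4):
    """Read `stream` in fixed-size chunks and pass each to `handle_chunk`
    through a bounded queue. Returns the total number of bytes read."""
    chunks = queue.Queue(maxsize=max_queued)  # the bound is what caps memory
    done = object()                           # sentinel marking end of stream
    errors = []

    def consumer():
        while True:
            chunk = chunks.get()
            if chunk is done:
                break
            if errors:
                continue  # keep draining so the reader never blocks forever
            try:
                handle_chunk(chunk)
            except Exception as exc:  # remember the failure, re-raise later
                errors.append(exc)

    worker = threading.Thread(target=consumer)
    worker.start()
    total = 0
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            chunks.put(chunk)  # blocks while the queue is full
    finally:
        chunks.put(done)
        worker.join()
    if errors:
        raise errors[0]
    return total


if __name__ == "__main__":
    received = []
    n = stream_in_chunks(io.BytesIO(b"x" * 10_000), received.append)
    print(n, len(received))
```

A fast reader fills the queue and sits at the ceiling; a slow one rarely has more than a chunk or two in flight, which matches the behavior described above. If the bound is missing or far too large, the reader can pull the whole stream into memory before the consumer catches up.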
It's a closed issue, but still commenting. For some reason, this is not working on the EMR instance I'm using. Could you please let me know what might be wrong?
What version of the CLI are you using? In what way is it not working? Do you have more information you can share?
The version of the CLI is:
Here is the error:
I just spun up a new EMR instance and upgraded the AWS CLI to 1.7. This feature is working as expected. Sorry for the false alarm. Thanks!
While testing the streaming upload feature implemented in #903, I found that it reads the entire stream into memory, causing large memory usage for the tool. On an Ubuntu EC2 instance running the latest master branch, uploading a 9 GB file resulted in 6.5-6.9 GB of real memory usage.
Test command: