
Zipped and unzipped deployment adding up to more than 500 MB and failing deployment #881

Closed
neo01124 opened this issue May 23, 2017 · 18 comments


@neo01124

Using "slim_handler": true, the zipped version of my deployment is 120Mb and the unzipped version is 436Mb. The combined size is more than the lambda limit for ephemeral storage which fails the deployment.

A streaming mode for unzipping the file would mitigate this.

@mcrowson
Collaborator

So this is a little trickier than I thought. Because there are multiple files in a zip, we can't exactly stream-unzip (from what I can tell reading online). Would the gzip format lend itself better?

If so, we run into another challenge implementing this. The current approach for creating the application zip with the slim_handler mirrors the application zip needed by Lambda, and Lambda requires a zip file. So we'd have to separate that functionality if we go with gzip for the slim_handler application.
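A minimal sketch of why gzip lends itself to streaming while zip does not (standard library only; the payload here is synthetic): a gzip stream can be decompressed strictly front to back with `zlib`, whereas a .zip archive's central directory sits at the end of the file, so a reader needs random access.

```python
import gzip
import zlib

payload = b"x" * 100_000
compressed = gzip.compress(payload)

# Decompress the gzip stream chunk by chunk, as if it were arriving over the
# network. wbits = 16 + zlib.MAX_WBITS tells zlib to expect a gzip wrapper.
decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
out = bytearray()
for i in range(0, len(compressed), 4096):
    out.extend(decomp.decompress(compressed[i:i + 4096]))
out.extend(decomp.flush())

assert bytes(out) == payload
```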

mbeacom added a commit to mbeacom/Zappa that referenced this issue May 23, 2017
@mcrowson
Collaborator

Putting Slack convo w/ @mbeacom here: The commit above brings the whole response into memory.

@bharling

I think I'm hitting exactly the same issue myself - #924

mbeacom added a commit to mbeacom/Zappa that referenced this issue Jun 15, 2017
@mcrowson
Collaborator

Could be made possible by boto/botocore#1195

@mbeacom
Contributor

mbeacom commented Jun 15, 2017

I'm going to implement this for now. If they ultimately handle it, we can update accordingly.

@nils-braun

Is there any news on this that I have missed (sorry for my stupid question)? I have seen your branch, @mbeacom, and I have implemented it in a similar way, and it seems to work (at least in my case). Do you want to open a PR, or is there already one?

@mbeacom
Contributor

mbeacom commented Jul 24, 2017

@nils-braun The branch I have out currently dumps the zip to memory (which can be a problem with large deployments, since it is restricted to the memory allocated to the function). I haven't submitted a PR for this yet because we're looking to stream the zip file, uncompressing the contents on the fly. This will free up both memory and disk resources.

@nils-braun

nils-braun commented Jul 28, 2017

Very good idea - maybe something like https://stackoverflow.com/a/28766502/3158566 may help. I am definitely willing to help you with this PR if you want, or if you do not have the time to work on it now :-)

@mcrowson
Collaborator

So it looks like zip files require random access. I understand this to mean that with a .zip file it is impossible to download and unzip on the fly, as the whole zip is required to decompress each file.

Alternatives could be to make gzipped tarballs instead for large applications (keeping the handler as a .zip), or to sidestep compression altogether and upload all the files to S3, from which it is trivial to stream/download individual files to /tmp.
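A sketch of the gzipped-tarball alternative (the archive is built in memory here to stand in for an S3 object body): `tarfile`'s `"r|gz"` mode reads the archive strictly front to back, so it works on a non-seekable stream, which is exactly what .zip cannot do.

```python
import io
import os
import tarfile
import tempfile

# Build a small .tar.gz in memory to stand in for the S3 download stream.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    data = b"print('hello')\n"
    info = tarfile.TarInfo(name="app/handler.py")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
buf.seek(0)

# mode="r|gz" is the streaming reader: it never seeks backwards, so a
# non-seekable network stream could be passed as fileobj instead of buf.
project_folder = tempfile.mkdtemp(prefix="project_")
with tarfile.open(fileobj=buf, mode="r|gz") as tar:
    tar.extractall(project_folder)

extracted = os.path.join(project_folder, "app", "handler.py")
```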

@mbeacom
Contributor

mbeacom commented Jul 31, 2017

The random access issue is what I was running into previously with this. @mcrowson If you guys aren't opposed to the gzip idea, I could have this out tonight. I have it pretty much set up already and just need to format/commit it.

@olirice
Contributor

olirice commented Aug 2, 2017

@mbeacom streaming gzip sounds great. Could you post a quick status and/or PR timeline when you get the chance?

There is an open PR that has been waiting on movement in this thread since your last post.

#1022 is a small update to download the regular old zip into a memory file and then unzip it to /tmp.

It isn't as nice as a streaming .tar.gz solution but changing the slim_handler file format requires sweeping changes in cli.py, core.py and testing. To my eye, those updates look like they might take more than an evening :)
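A rough sketch of the #1022 approach (the S3 download is replaced here by an in-memory zip; with boto3 it would be something like `s3.get_object(...)["Body"].read()`): the whole archive is held in memory, and only the uncompressed files ever touch the ephemeral disk.

```python
import io
import os
import tempfile
import zipfile

# Stand-in for downloading the archive into memory -- the trade-off #1022
# accepts is that the full zip occupies the function's RAM.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("app/handler.py", "print('hello')\n")
archive.seek(0)

# Extract straight from the memory buffer to a temp dir (on Lambda: /tmp),
# so zipped and unzipped copies never coexist on disk.
project_folder = tempfile.mkdtemp(prefix="project_")
with zipfile.ZipFile(archive) as zf:
    zf.extractall(project_folder)
```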

FYI @dswah

@mcrowson mcrowson mentioned this issue Aug 4, 2017
@ConorMcGee

Sorry folks but I'm confused. Is the merge attached above expected to have fixed this issue? I'm still experiencing it.

@mcrowson
Collaborator

That certainly was the intent. Got any logs or details?

@ConorMcGee

ConorMcGee commented Feb 22, 2018

I'll make sure I haven't done something wrong on my end tomorrow - I'm sure I have... I just wanted to double-check since the other issues referenced had been closed but not this one. Thanks!

@ConorMcGee

So, it seems that with a .tar.gz of 125 MB I was still running out of space. I managed to get that down to 115 MB by clearing out .pyc files, and that seems OK.

Error in CloudWatch for the 125 MB tar:

Error
[Errno 28] No space left on device: OSError
Traceback (most recent call last):
  File "/var/task/handler.py", line 509, in lambda_handler
    return LambdaHandler.lambda_handler(event, context)
  File "/var/task/handler.py", line 237, in lambda_handler
    handler = cls()
  File "/var/task/handler.py", line 102, in __init__
    self.load_remote_project_archive(project_archive_path)
  File "/var/task/handler.py", line 166, in load_remote_project_archive
    t.extractall(project_folder)
  File "/var/lang/lib/python3.6/tarfile.py", line 2007, in extractall
    numeric_owner=numeric_owner)
  File "/var/lang/lib/python3.6/tarfile.py", line 2049, in extract
    numeric_owner=numeric_owner)
  File "/var/lang/lib/python3.6/tarfile.py", line 2119, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "/var/lang/lib/python3.6/tarfile.py", line 2168, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
OSError: [Errno 28] No space left on device

Duration: 10683.37 ms Billed Duration: 10700 ms Memory Size: 3008 MB Max Memory Used: 601 MB

Sorry I'm kind of just dumping this here without time to really dive into the issue - just mentioning it in case it's of concern. Thanks!

@olirice
Contributor

olirice commented Feb 23, 2018

Check the extracted size of your project. It sounds like it might be over 500 MB.

You could try stripping unused symbols from any .so files within the deployment zip.
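A rough sketch of that idea (the helper names are hypothetical, and it assumes binutils `strip` is available on the PATH, as it is on most Linux build hosts):

```python
import os
import subprocess

def find_shared_objects(root):
    """Collect every .so file under root (e.g. the package dir to be zipped)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".so"):
                paths.append(os.path.join(dirpath, name))
    return paths

def strip_shared_objects(root):
    """Run binutils `strip` on each shared object to drop unneeded symbols."""
    for path in find_shared_objects(root):
        # check=False: some .so files may refuse stripping; skip them quietly.
        subprocess.run(["strip", "--strip-unneeded", path], check=False)
```

Run `strip_shared_objects` on the project directory before zipping; numeric libraries like numpy and scipy often shrink noticeably.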

@mcrowson
Collaborator

The goal of the fix was to stream-download the gzipped file and unzip it on the fly, so we are not holding both the zipped and unzipped project in /tmp at the same time. I suspect your unzipped project might be close to 500 MB (you don't actually get all 500 MB; AWS seems a little inconsistent on this limit, and the handler function itself takes up about 7 MB, I think).

@mcrowson
Collaborator

mcrowson commented May 2, 2018

Closing due to lack of activity.

@mcrowson mcrowson closed this as completed May 2, 2018