
gzippin #1037

Merged
Miserlou merged 3 commits into Miserlou:master from mcrowson:gzipped_project on Aug 14, 2017

Conversation

mcrowson
Copy link
Collaborator

@mcrowson mcrowson commented Aug 4, 2017

Description

Using gzipped tarballs for the slim_handler's package. This allows the project to be downloaded and unzipped into /tmp on the fly.
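
On the packaging side, the idea is roughly this (a hypothetical sketch, not the actual core.py code from this PR): tar the project up with gzip compression and push it to S3, so the handler can stream it back into /tmp at cold start.

import boto3
import tarfile

def package_and_upload(project_dir, bucket, key="package.tar.gz"):
    # Sketch: build a gzipped tarball of the project and upload it to S3.
    # project_dir, bucket and key are hypothetical names for illustration.
    archive_path = "/tmp/" + key
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname="." keeps files at the archive root instead of nesting
        # them under the local directory name.
        archive.add(project_dir, arcname=".")
    boto3.client("s3").upload_file(archive_path, bucket, key)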

Would love help testing, especially on Windows, as I attempted to address
https://github.com/Miserlou/Zappa/blob/master/zappa/core.py#L568

from PR #716 with the tarball approach.

GitHub Issues

#961
#881
#1020

mcrowson added 2 commits August 4, 2017 12:02
@coveralls
Copy link

coveralls commented Aug 4, 2017

Coverage Status

Coverage increased (+0.1%) to 74.165% when pulling 2696d9b on mcrowson:gzipped_project into a21c973 on Miserlou:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-15.9%) to 58.137% when pulling 2696d9b on mcrowson:gzipped_project into a21c973 on Miserlou:master.

@coveralls
Copy link

coveralls commented Aug 4, 2017

Coverage Status

Coverage increased (+0.2%) to 74.236% when pulling 6551e34 on mcrowson:gzipped_project into a21c973 on Miserlou:master.

@mbeacom
Copy link
Contributor

mbeacom commented Aug 4, 2017

Beat me to it! Thanks for getting it done, though!

@dswah
Copy link

dswah commented Aug 5, 2017

so stoked!

@olirice
Copy link
Contributor

olirice commented Aug 5, 2017

Looks great, nice work!

I just tested deploying a bare bones project with python 2.7.12 and am getting an error from handler.py

[1501892918179] 'StreamingBody' object has no attribute 'tell': AttributeError
Traceback (most recent call last):
  File "/var/task/handler.py", line 505, in lambda_handler
    return LambdaHandler.lambda_handler(event, context)
  File "/var/task/handler.py", line 239, in lambda_handler
    handler = cls()
  File "/var/task/handler.py", line 104, in __init__
    self.load_remote_project_archive(project_archive_path)
  File "/var/task/handler.py", line 167, in load_remote_project_archive
    with tarfile.open(fileobj=archive_on_s3['Body'], mode="r:gz") as t:
  File "/usr/lib64/python2.7/tarfile.py", line 1691, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/usr/lib64/python2.7/tarfile.py", line 1745, in gzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/lib64/python2.7/tarfile.py", line 1721, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib64/python2.7/tarfile.py", line 1587, in __init__
    self.firstmember = self.next()
  File "/usr/lib64/python2.7/tarfile.py", line 2356, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib64/python2.7/tarfile.py", line 1251, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/usr/lib64/python2.7/gzip.py", line 268, in read
    self._read(readsize)
  File "/usr/lib64/python2.7/gzip.py", line 295, in _read
    pos = self.fileobj.tell() # Save current position
AttributeError: 'StreamingBody' object has no attribute 'tell'

If you put the tarfile in streaming mode 'r|gz' (instead of 'r:gz'), it quiets that error.

with tarfile.open(fileobj=archive_on_s3['Body'], mode="r|gz") as t:
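
For context, a minimal end-to-end sketch with hypothetical bucket/key names: the pipe modes only ever call read() on the file object, so boto3's StreamingBody works, while 'r:gz' goes through GzipFile, which needs seek()/tell().

import boto3
import tarfile

# Hypothetical bucket/key. "r|gz" treats the body as a forward-only stream,
# so no seek()/tell() is ever called on the StreamingBody.
body = boto3.client("s3").get_object(Bucket="my-bucket", Key="package.tar.gz")["Body"]
with tarfile.open(fileobj=body, mode="r|gz") as t:
    t.extractall(path="/tmp/project")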

After that update, handler.py errors out saying the module has no attribute your_entrypoint.

[1501896837626] 'module' object has no attribute 'app': AttributeError
Traceback (most recent call last):
  File "/var/task/handler.py", line 513, in lambda_handler
    return LambdaHandler.lambda_handler(event, context)
  File "/var/task/handler.py", line 247, in lambda_handler
    handler = cls()
  File "/var/task/handler.py", line 134, in __init__
    wsgi_app_function = getattr(self.app_module, self.settings.APP_FUNCTION)
AttributeError: 'module' object has no attribute 'app'

If you check the .tar.gz on S3, the files are in the right places but they all have a size of 0 bytes.

I haven't had time to look into that yet. It might be related to tarfile.TarFile.addfile expecting a file object, rather than a file path.
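
A minimal illustration of that difference (hypothetical file names, not the PR's packaging code): addfile() needs an open file object alongside the TarInfo, otherwise only the header gets written and the member ends up 0 bytes.

import tarfile

# Hypothetical example of why addfile() can produce 0-byte members.
with tarfile.open("package.tar.gz", "w:gz") as archive:
    path = "zappa_settings.json"          # any existing file, for illustration
    info = archive.gettarinfo(path)

    # archive.addfile(info)               # wrong: header only, 0-byte member
    with open(path, "rb") as f:
        archive.addfile(info, fileobj=f)  # right: contents are copied from f

    # archive.add(path)                   # or just let add() handle the path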

@coveralls
Copy link

coveralls commented Aug 6, 2017

Coverage Status

Coverage increased (+0.2%) to 74.272% when pulling c466a0b on mcrowson:gzipped_project into a21c973 on Miserlou:master.

@mcrowson
Copy link
Collaborator Author

mcrowson commented Aug 6, 2017

OK, try it now. Works for me with a 200M project on a 128M RAM Lambda.

Test environments were OS X py2.7 and OS X py3.6.

@olirice
Copy link
Contributor

olirice commented Aug 6, 2017

Working now, 2.7 and 3.6 on Ubuntu 16.04.

Here are a couple of cold start times on different memory size instances.

414 MB project
Memory Size | Gzip Extract Time (s)
1536 MB | 3.05
512 MB | 12.49
256 MB | 24.79
128 MB | Error (timeout)

For the cases I could check (when the project + zip is less than 500 MB), cold start performance is at least as fast as zip on disk, and usually better.

225 MB project
Memory Size | Gzip Extract Time (s) | Zip Extract Time (s)
1536 MB | 1.88 | 1.86
512 MB | 4.10 | 5.99
256 MB | 9.56 | 10.01
128 MB | 16.89 | 20.74

Given the slow cold start times on small instances, the win here will be huge deployments on large instances versus big-ish deployments on tiny instances.

Notes:

  • Precompiled packages were turned off to make sure extracted sizes matched the local size.
  • All times are the slowest of 3 attempts.

@mcrowson
Copy link
Collaborator Author

mcrowson commented Aug 6, 2017 via email

@olirice
Copy link
Contributor

olirice commented Aug 6, 2017

Yes, all tests went through API Gateway and timed out after 30 seconds. I'm sure it would have completed over a direct invocation if you cranked the Lambda timeout up to 5 minutes.

@mcrowson
Copy link
Collaborator Author

mcrowson commented Aug 6, 2017 via email

@olirice
Copy link
Contributor

olirice commented Aug 7, 2017

Completely agree. I'm not clear where the bottleneck is yet though. I tried removing gzip compression and stream extracting an uncompressed tarball to see if CPU usage during extraction was the issue. Speeds were pretty similar to gzip so it wasn't a helpful test.

We need to get a better understanding of disk/network/CPU performance at each Lambda memory size to figure out if there's more we can do to improve cold starts.

I think disk is most likely to be the problem. I'll run some tests this week and get back to you, but (as you say) it's outside the scope of this PR.

@olirice
Copy link
Contributor

olirice commented Aug 7, 2017

Any thoughts on a failover strategy to zip or an uncompressed tarfile when zlib isn't available?

Zlib is technically an optional component, so using gzip without a fallback breaks slim_handler for certain valid Python builds.

Too niche to worry about?
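
One possible guard, as a rough sketch (hypothetical, not code from this PR): probe for zlib when building the package and fall back to an uncompressed tarball if it's missing, since a plain tarball still streams fine from S3.

import tarfile

# gzip support in tarfile requires zlib, which is technically optional in CPython.
try:
    import zlib  # noqa: F401
    mode, suffix = "w:gz", ".tar.gz"
except ImportError:
    mode, suffix = "w", ".tar"  # uncompressed tarball fallback

with tarfile.open("package" + suffix, mode) as archive:
    archive.add("my_project", arcname=".")  # hypothetical project directory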

@GeorgianaPetria
Copy link

Hi all,

Is it already possible to use this fix?

I am getting the following error:

[Errno 28] No space left on device: IOError
Traceback (most recent call last):
  File "/var/task/handler.py", line 491, in lambda_handler
    return LambdaHandler.lambda_handler(event, context)
  File "/var/task/handler.py", line 240, in lambda_handler
    handler = cls()
  File "/var/task/handler.py", line 102, in __init__
    self.load_remote_project_zip(project_zip_path)
  File "/var/task/handler.py", line 169, in load_remote_project_zip
    z.extractall(path=project_folder)
  File "/usr/lib64/python2.7/zipfile.py", line 1040, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib64/python2.7/zipfile.py", line 1028, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib64/python2.7/zipfile.py", line 1084, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib64/python2.7/shutil.py", line 52, in copyfileobj
    fdst.write(buf)
IOError: [Errno 28] No space left on device

@mcrowson
Copy link
Collaborator Author

mcrowson commented Aug 7, 2017

Got any details about the project? Size of zip? RAM size on Lambda? Size of the project unzipped? etc.

Oh, just reading your trace: you're using the current code and want this new code, not saying that this PR is broken.

@GeorgianaPetria
Copy link

GeorgianaPetria commented Aug 7, 2017

Sure:
Size of the zip: 107M
Project unzipped: 487M
Max memory used on Lambda: 512M

@mcrowson
Copy link
Collaborator Author

mcrowson commented Aug 7, 2017

Check out this PR and give it a shot. I'd love to see how the 487M project works. Also, if you're on Windows, we'd love to see it succeed there as well.

@GeorgianaPetria
Copy link

I'm on Ubuntu.
Sorry, how can I check out the PR? I'm using zappa version 0.43.1.

It's actually working for me now, not sure if I already have your modified version though.
I'm doing find . | grep -E "(__pycache__|\.pyc|\.pyo$)" | xargs rm -rf before deploying.

I believe the reason it wasn't working is that the unzipped project was sometimes larger than 525M when I still had all the compiled Python files. But after deleting them (the project is 487M) it works.

@Miserlou Miserlou merged commit ec39035 into Miserlou:master Aug 14, 2017
@dswah
Copy link

dswah commented Aug 15, 2017

@Miserlou @mcrowson I saw that this is merged! Awesome! But has it been released?
If so, how do I pass archive_format to Zappa to tell it to use a tarball?

@dswah
Copy link

dswah commented Aug 15, 2017

pip install -U zappa did it :)
