Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: switch from "compress/gzip" to more optimal library #233

Merged
merged 1 commit into from
Aug 3, 2018

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Mar 14, 2018

One of the largest resource hogs in umoci is compression-related, and it
turns out
that Go's ordinary gzip implementation is nowhere near as
fast as other modern gzip implementations. In order to help our users
get more efficient builds, switch to a "pure-Go" implementation which
has x64-specific optimisations. We cannot use zlib wrappers like vitess
because of issues with "os/user" and static compilation (so we wouldn't
be able to release binaries).

This very simplified benchmark shows the positive difference when
switching libraries (using "tensorflow/tensorflow:latest" as the image
base):

% time umoci unpack --image tensorflow:latest bundle # before
39.03user 7.58system 0:45.16elapsed
% time umoci unpack --image tensorflow:latest bundle # after
40.89user 7.99system 0:34.89elapsed

But the real benefit is when it comes to compression and repacking (in
this benchmark the changes were installing FireFox and doing a repo
refresh):

% time umoci repack --image tensorflow:latest bundle # before
78.54user 13.71system 1:26.66elapsed
% time umoci repack --image tensorflow:latest bundle # after
53.11user 4.87system 0:36.75elapsed

That's almost 3x faster, just by having a more optimised compression
library!

Closes #225
Signed-off-by: Aleksa Sarai asarai@suse.de

@cyphar cyphar added this to the 0.4.1 milestone Mar 14, 2018
@cyphar
Copy link
Member Author

cyphar commented Mar 14, 2018

Static compilation is going to be a problem with this change. This is going to
be our first cgo dependency in umoci -- which is going to come with some
potential problems. If we drop CGO_ENABLED=0 from the build then we will
still get static binaries but the compiler will complain (because go-mtree uses
os/user which then uses getgrouplist and similar methods) and produce
binaries that will try to load glibc at runtime.

@jonboulle
Copy link
Contributor

surely you can't drop CGO_ENABLED from the build because then the vitess bindings won't work at all.
How about toggling cgzip/gzip based on another build flag? (or if you don't want to create a 2x2 matrix with one unusable case, then just toggle off the cgo flag?)

@cyphar
Copy link
Member Author

cyphar commented Mar 19, 2018

By "drop CGO_ENABLED" I mean to remove CGO_ENABLED=0 which will enable cgo. The main problem is that the statically linked binary you get won't like it, because then parts of github.com/vbatts/go-mtree pull in os/user. We could probably patch that to fix it...

@cyphar
Copy link
Member Author

cyphar commented Jun 7, 2018

Given the above cgo problems, I'm going to have to close this for now and move on with a release.

@cyphar cyphar closed this Jun 7, 2018
@cyphar
Copy link
Member Author

cyphar commented Jul 5, 2018

This was effectively blocked by golang/go#23265. So in Go 1.11 we can actually build static binaries.

@cyphar cyphar reopened this Jul 5, 2018
@tych0
Copy link
Member

tych0 commented Jul 30, 2018

What if instead we use pgzip? https://github.com/klauspost/pgzip

No static linking, and it is generally much faster because it works in parallel (at least for compressing).

@flx42
Copy link
Contributor

flx42 commented Jul 30, 2018

IIRC it provided no benefit to decompression, but you can prove me wrong :)

@tych0
Copy link
Member

tych0 commented Jul 30, 2018 via email

@flx42
Copy link
Contributor

flx42 commented Jul 30, 2018

I would argue that half the process (compression) might be far from half the use cases. I would guess that that more people use umoci for decompression than compression.

@flx42
Copy link
Contributor

flx42 commented Jul 30, 2018

But yeah, if compression is part of your use cases, then it's better than status quo :)

@cyphar
Copy link
Member Author

cyphar commented Aug 1, 2018

Regarding decompression (which is the purpose for which a lot of people would be using umoci -- since by design images are consumed more often than they are built), it looks like they claim there is a performance improvement for decompression as well -- but this improvement is mainly because you can "do other work" (I imagine this is referring to readahead).

I'm not sure what zlib does to make their decompression faster (since gzip decompression is not parallelisable). Is it just that it's written in C -- in which case the x86 speedups in pgzip might tap into that same speed improvement?

I think going with pgzip now is a better approach than vitess for the moment (at least, until Go 1.11 is something that we are using everywhere). We can always change it later.

@flx42
Copy link
Contributor

flx42 commented Aug 1, 2018

Fine for me.

@flx42
Copy link
Contributor

flx42 commented Aug 2, 2018

Somehow, the default compress/gzip seems as fast as pgzip with Go 1.11 beta 2. It's a single data point for now, I hope I will have time to test more.
Edit: for decompression.

@cyphar cyphar force-pushed the cgzip-migration branch 2 times, most recently from fdb3edb to c2c47d3 Compare August 2, 2018 23:47
@cyphar
Copy link
Member Author

cyphar commented Aug 2, 2018

Yeah, I think you're right @flx42. After some benchmarking, it looks like decompression is still mostly a crapshoot (though I do get more minor positive results with pgzip -- though that might just be luck). Compression is insanely fast though -- entire umoci repack runs with lots of changes can be up to 3x faster (which is much larger than I expected).

@tych0
Copy link
Member

tych0 commented Aug 2, 2018

I'd run it on a machine with 48 cores and set the number of blocks to runtime.GOMAXPROCS() or whatever the function is. We basically see linear speedup with the number of cores, which is cool.

@cyphar
Copy link
Member Author

cyphar commented Aug 3, 2018

@tych0 Are you sure you'd want GOMAXPROCS and not just nproc?

@tych0
Copy link
Member

tych0 commented Aug 3, 2018

https://golang.org/pkg/runtime/#NumCPU Is what I want, I think.

@cyphar
Copy link
Member Author

cyphar commented Aug 3, 2018

Though, the documentation recommends using double the number of CPUs you wish to use:

You should at least have a block size of 100k and at least a number of blocks that match the number of cores your would like to utilize, but about twice the number of blocks would be the best.

@cyphar cyphar changed the title *: switch from "compress/gzip" to zlib library *: switch from "compress/gzip" to more optimal library Aug 3, 2018
One of the largest resource hogs in umoci is compression-related, and it
turns out[1] that Go's ordinary gzip implementation is nowhere near as
fast as other modern gzip implementations. In order to help our users
get more efficient builds, switch to a "pure-Go" implementation which
has x64-specific optimisations. We cannot use zlib wrappers like [2]
because of issues with "os/user" and static compilation (so we wouldn't
be able to release binaries).

This very simplified benchmark shows the positive difference when
switching libraries (using "tensorflow/tensorflow:latest" as the image
base):

  % time umoci unpack --image tensorflow:latest bundle # before
  39.03user 7.58system 0:45.16elapsed
  % time umoci unpack --image tensorflow:latest bundle # after
  40.89user 7.99system 0:34.89elapsed

But the real benefit is when it comes to compression and repacking (in
this benchmark the changes were installing FireFox and doing a repo
refresh):

  % time umoci repack --image tensorflow:latest bundle # before
  78.54user 13.71system 1:26.66elapsed
  % time umoci repack --image tensorflow:latest bundle # after
  51.14user 3.25system 0:30.30elapsed

That's almost 3x faster, just by having a more optimised compression
library!

[1]: golang/go#23154
[2]: https://github.com/vitessio/vitess/tree/v2.2/go/cgzip

Signed-off-by: Aleksa Sarai <asarai@suse.de>
@cyphar
Copy link
Member Author

cyphar commented Aug 3, 2018

LGTM. @tych0 if you have any better ideas of how to use SetConcurrency feel free to send a pull request or open an issue.

@cyphar cyphar merged commit 25e50d3 into opencontainers:master Aug 3, 2018
cyphar added a commit that referenced this pull request Aug 3, 2018
  *: switch from "compress/gzip" to more optimal library

LGTMs: @cyphar
Closes #233
@cyphar cyphar deleted the cgzip-migration branch August 3, 2018 10:34
@cyphar cyphar mentioned this pull request Jul 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants