*: switch from "compress/gzip" to more optimal library #233
Conversation
Static compilation is going to be a problem with this change. This is going to …
surely you can't drop …
By "drop …
Given the above …
This was effectively blocked by golang/go#23265. So in Go 1.11 we can actually build static binaries.
What if instead we use pgzip (https://github.com/klauspost/pgzip)? No static linking issues, and it is generally much faster because it works in parallel (at least for compressing).
IIRC it provided no benefit to decompression, but you can prove me wrong :) |
It does read ahead, but yes, there's no parallel decompression. Still,
having half of the process faster seems better, no?
I would argue that half the process (compression) might be far from half the use cases. I would guess that more people use …
But yeah, if compression is part of your use cases, then it's better than the status quo :)
Regarding decompression (which is the purpose for which a lot of people would be using …), I'm not sure what zlib does to make its decompression faster (since gzip decompression is not parallelisable). Is it just that it's written in C -- in which case the x86 speedups in … I think going with …
Fine for me. |
Somehow, the default …
Force-pushed from fdb3edb to c2c47d3
Yeah, I think you're right @flx42. After some benchmarking, it looks like decompression is still mostly a crapshoot (though I do get minor positive results with …
I'd run it on a machine with 48 cores and set the number of blocks to …
@tych0 Are you sure you'd want …
https://golang.org/pkg/runtime/#NumCPU is what I want, I think.
Though the documentation recommends using double the number of CPUs you wish to use: …
One of the largest resource hogs in umoci is compression-related, and it
turns out[1] that Go's ordinary gzip implementation is nowhere near as
fast as other modern gzip implementations. In order to help our users
get more efficient builds, switch to a "pure-Go" implementation which
has x64-specific optimisations. We cannot use zlib wrappers like [2]
because of issues with "os/user" and static compilation (so we wouldn't
be able to release binaries).

This very simplified benchmark shows the positive difference when
switching libraries (using "tensorflow/tensorflow:latest" as the image
base):

% time umoci unpack --image tensorflow:latest bundle # before
39.03user 7.58system 0:45.16elapsed
% time umoci unpack --image tensorflow:latest bundle # after
40.89user 7.99system 0:34.89elapsed

But the real benefit is when it comes to compression and repacking (in
this benchmark the changes were installing Firefox and doing a repo
refresh):

% time umoci repack --image tensorflow:latest bundle # before
78.54user 13.71system 1:26.66elapsed
% time umoci repack --image tensorflow:latest bundle # after
51.14user 3.25system 0:30.30elapsed

That's almost 3x faster, just by having a more optimised compression
library!

[1]: golang/go#23154
[2]: https://github.com/vitessio/vitess/tree/v2.2/go/cgzip

Signed-off-by: Aleksa Sarai <asarai@suse.de>
LGTM. @tych0 if you have any better ideas of how to use …
One of the largest resource hogs in umoci is compression-related, and it
turns out that Go's ordinary gzip implementation is nowhere near as
fast as other modern gzip implementations. In order to help our users
get more efficient builds, switch to a "pure-Go" implementation which
has x64-specific optimisations. We cannot use zlib wrappers like vitess
because of issues with "os/user" and static compilation (so we wouldn't
be able to release binaries).
This very simplified benchmark shows the positive difference when
switching libraries (using "tensorflow/tensorflow:latest" as the image
base):
% time umoci unpack --image tensorflow:latest bundle # before
39.03user 7.58system 0:45.16elapsed
% time umoci unpack --image tensorflow:latest bundle # after
40.89user 7.99system 0:34.89elapsed
But the real benefit is when it comes to compression and repacking (in
this benchmark the changes were installing Firefox and doing a repo
refresh):
% time umoci repack --image tensorflow:latest bundle # before
78.54user 13.71system 1:26.66elapsed
% time umoci repack --image tensorflow:latest bundle # after
53.11user 4.87system 0:36.75elapsed
That's almost 3x faster, just by having a more optimised compression
library!
Closes #225
Signed-off-by: Aleksa Sarai <asarai@suse.de>