Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added -zstd_implementation=cgo option. #485

Closed

Conversation

glukasiknuro
Copy link
Contributor

@glukasiknuro glukasiknuro commented Oct 15, 2021

Allows to select cgo implementation of zstd if compiled in.

The cgo implementation is more performant than pure go implementation,
see #482 for actual benchmarks and dicussion.

Stats show that with cgo implementation the overhead of zstd compression itself is much lower - 40% vs 160% for downloading. It increases possible bandwidth for downloading from bazel-remote from 100MB/s to 190MB/s per single cpu the bazel-remote runs on, so roughly 2x.

For uploading the difference is smaller - 310% vs 490%, the overhead is big mainly because a) compression is slower than decompression in zstd case, and b) bazel-cache needs to perform also sha256 computation, which from basic observations is similarly expensive as zstd compression at faster levels. It increases upload of whole bazel-remote from 45MB/s to 64MB/s.

See https://github.com/glukasiknuro/go-zstd-benchmarks/tree/master/bazel-remote-load-test#results---download

@glukasiknuro glukasiknuro marked this pull request as ready for review October 15, 2021 20:09
@glukasiknuro glukasiknuro changed the title [WIP] Added -zstd_implementation=cgo option. Added -zstd_implementation=cgo option. Oct 15, 2021
Allows to select cgo implementation of zstd if compiled in.

The cgo implementation is more performant than pure go implementation,
see buchgr#482 for actual
benchmarks.
@glukasiknuro
Copy link
Contributor Author

Created bug for bazel-gazelle with question how to properly support cgo zstd library - currently using a patch which potentially can be improved going forward: bazelbuild/bazel-gazelle#1120

Copy link
Collaborator

@mostynb mostynb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created bug for bazel-gazelle with question how to properly support cgo zstd library - currently using a patch which potentially can be improved going forward: bazelbuild/bazel-gazelle#1120

IIRC bazel-remote is a few releases behind on bazel-gazelle.

&cli.StringFlag{
Name: "zstd_implementation",
Value: "go",
Usage: "ZSTD implementation to use - either \"go\" or \"cgo\n (cgo may not be compiled in).",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any downside to always compiling cgo support in, apart from binary size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current bazel-remote images are compiled with no cgo support at all (linux-build has CGO_ENABLED=0, and images have pure="on"). Not sure what's the rationale behind it - maybe not to bring any unexpected cgo deps? Wanted to preserve the status quo, but if you think it's worth reconsidering it, maybe we can compile it into both amd and arm images.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like cgo has been turned off since early in bazel-remote's history, and we probably just haven't needed to reconsider it until now.

Copy link
Collaborator

@mostynb mostynb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good in general, I guess the most important thing we need to do next is check what the implications of always building with cgo enabled would be. eg does performance suffer to the native go zstd implementation mode? And how much larger are the docker images with cgo enabled?

err = cli.ShowAppHelp(ctx)
_ = cli.ShowAppHelp(ctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

&cli.StringFlag{
Name: "zstd_implementation",
Value: "go",
Usage: "ZSTD implementation to use - either \"go\" or \"cgo\n (cgo may not be compiled in).",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like cgo has been turned off since early in bazel-remote's history, and we probably just haven't needed to reconsider it until now.

@@ -382,6 +357,7 @@ func GetZstdReadCloser(f *os.File, expectedSize int64, offset int64) (io.ReadClo
if chunkNum > 0 {
_, err = f.Seek(h.chunkOffsets[chunkNum], io.SeekStart)
if err != nil {
f.Close()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another good catch.

@@ -24,6 +25,13 @@ func WithStorageMode(mode string) Option {
}
}

func WithZstdImplementation(impl string) Option {
return func(c *Cache) error {
c.zstd = zstdimpl.Get(impl)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe do some error checking here?

package zstdimpl

import (
"github.com/valyala/gozstd"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: please put standard library imports together first, then an empty line and then this import

"io/ioutil"
)

const compressionLevel = 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you experimented with different levels?

"log"
)

var registry map[string]ZstdImpl
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only have two values, I wonder if we should just use a bool and if/else?

mostynb pushed a commit to mostynb/bazel-remote that referenced this pull request Jun 26, 2022
This bugfix was extracted from PR buchgr#485 since it can land separately.
mostynb pushed a commit to mostynb/bazel-remote that referenced this pull request Jun 26, 2022
This bugfix was extracted from PR buchgr#485 since it can land separately.
@mostynb
Copy link
Collaborator

mostynb commented Jul 17, 2022

This landed in #559.

@mostynb mostynb closed this Jul 17, 2022
@glukasiknuro
Copy link
Contributor Author

Uh, thanks Mostyn for taking care of it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants