Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: Add support for using SHA1, MD5 or SHA256 hashes. #175

Closed
wants to merge 1 commit into from

Conversation

bentonj-vmw
Copy link
Contributor

We are working on adding caching to SCons using the bazel-remote server, for this we require support for MD5 and SHA1 hashes. Unfortunately for the s3 backend it only supports base64 encoded MD5 or SHA256 for upload hash verification so for SHA1 we have to rehash the uploaded data to generate a SHA256.

Additionally we want to avoid pulling in python protobuf dependencies into SCons, and thus it is much better for us if we can use the JSON protobuf encoding. Here I add support for that to the http server by using the "Accept:" header for a GET request and a "Content-Type" header for a PUT.

ps. sorry if my Go is not idiomatic, I am not familiar with it!

@mostynb
Copy link
Collaborator

mostynb commented Feb 5, 2020

Hi, thanks for the PR. I have a work in progress that made the assumption of sha256 even more concrete, so I'm glad to have seen this first! I think it's reasonable to add support for hashes that are easily distinguishable, but if we do so then I would prefer it to be implemented for both the HTTP and gRPC APIs.

I'm less sure about adding json support to bazel-remote, but it looks like the code required is fairly small. @buchgr: what do you think?

@bentonj-vmw
Copy link
Contributor Author

fwiw remote_execution.proto seems to have the capability to indicate support for SHA256, SHA1, MD5, VSO, SHA384, SHA512

https://github.com/bazelbuild/remote-apis/blob/2846a67ac8feb5001e9f704b66f5acc1e90f1ade/build/bazel/remote/execution/v2/remote_execution.proto#L1463

@mostynb
Copy link
Collaborator

mostynb commented Feb 5, 2020

fwiw remote_execution.proto seems to have the capability to indicate support for SHA256, SHA1, MD5, VSO, SHA384, SHA512

Yes, and it seems fairly easy to support a subset which are easily distinguishable by hash length like sha256/sha1/md5.

@buchgr
Copy link
Owner

buchgr commented Feb 5, 2020

I think this should really be split into two pull requests:

  1. Add support for different hash functions
    Instead of length distinction we could also make the name of the hash function part of the URL for HTTP requests and default to SHA256 if none is specified. If necessary we can make the default configurable.

  2. Add support for a HTTP+JSON caching.
    I fully agree that we should have this.

@bentonj-vmw bentonj-vmw changed the title http: Add support for json protobuf encoding and SHA1/MD5 hashes. http: Add support for using SHA1, MD5 or SHA256 hashes. Feb 5, 2020
@mostynb
Copy link
Collaborator

mostynb commented Feb 5, 2020

Instead of length distinction we could also make the name of the hash function part of the URL for HTTP requests and default to SHA256 if none is specified. If necessary we can make the default configurable.

If we use the "instance" (aka "pool" from bazel-remote's README.md) to specify the hash, then we can support any of the REAPI hashes, though VSO would require a new hash implementation. But since the contents of this field is implementation defined, we might need to work around other implementations' requirements- do you know of any?

@bentonj-vmw
Copy link
Contributor Author

Sure, I changed this PR to just hash types and opened a new one for JSON encoding #180 .

I don't mind if we use hash type in URL or by length, the only advantage of using string length was that it required no changes to the Cache (now CacheProxy) interface.

@bentonj-vmw
Copy link
Contributor Author

One problem to think about when using the URL instead of string length is for hashes which are stored inside an ActionResult for validation. Unless we want /ac/md5/ to assume all hashes inside the ActionResult are md5.

@bentonj-vmw
Copy link
Contributor Author

After speaking with a colleague who is working on the SCons part of it, we think we could get away with dropping this patch and using SHA256 hashes for /ac/ and /cas/, but only if we can use ActionResult.OutputFile.NodeProperties for storage of like a "SconsMd5" hash (as we would still be using md5/sha1 in SCons internally, but just rehashing with sha256 for caching).

This would be on the assumption that it will always be safe to put any K/V we want in NodeProperties outside of the strings defined in https://github.com/bazelbuild/remote-apis/blob/master/build/bazel/remote/execution/v2/nodeproperties.md

We would still need the JSON patch though :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants