Add compressed writes to cas.go. #232
Conversation
These are both prerequisites for adding write compression support to the remote-apis-sdks. Chunker.Next should be generic, so that a reader that compresses on the fly can be dropped in. I kept the special case of caching data in memory, since it avoids extra memory copies. Making chunker reads independent of the digest size is also useful: it means we don't have to pre-compute the digest of the compressed blob. The current draft of the RE API never requires the digest of the compressed blob at any point, which saves us from having to read the data twice. Note that this implies the chunker no longer matches the data it reads against the supplied digest; the digest is now carried purely as metadata rather than being needed by the chunker logic. As a caveat, for simplicity, files are only cached in memory when they are smaller than the *chunk* size rather than the IO buffer size.
Follow-up to #228.
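To make the reader-agnostic idea concrete, here is a minimal sketch (not the actual remote-apis-sdks code; the `Chunk`, `Chunker`, and `New` names are hypothetical). The point is that `Next` depends only on an `io.Reader`, so a compressing reader can be swapped in, and the digest is never verified by the chunker itself:

```go
package chunker

import "io"

// Chunk is one piece of a blob to be streamed over ByteStream.
type Chunk struct {
	Offset int64
	Data   []byte
}

// Chunker reads fixed-size chunks from an arbitrary io.Reader. Because it
// depends only on the reader, a reader that compresses on the fly can be
// dropped in without changing the chunking logic, and the blob's digest is
// carried purely as metadata rather than being checked here.
type Chunker struct {
	r      io.Reader
	buf    []byte
	offset int64
	done   bool
}

// New wraps any reader; chunkSize bounds the length of each chunk's Data.
func New(r io.Reader, chunkSize int) *Chunker {
	return &Chunker{r: r, buf: make([]byte, chunkSize)}
}

// HasNext reports whether Next may still return data.
func (c *Chunker) HasNext() bool { return !c.done }

// Next returns the next chunk, reading at most chunkSize bytes. It never
// compares the bytes it reads against a supplied digest.
func (c *Chunker) Next() (*Chunk, error) {
	n, err := io.ReadFull(c.r, c.buf)
	if err == io.EOF || err == io.ErrUnexpectedEOF {
		// A short read means the underlying reader is exhausted.
		c.done = true
		err = nil
	}
	if err != nil {
		return nil, err
	}
	data := make([]byte, n)
	copy(data, c.buf[:n])
	chunk := &Chunk{Offset: c.offset, Data: data}
	c.offset += int64(n)
	return chunk, nil
}
```

With this shape, compression becomes purely a question of which reader gets passed to `New`.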
Force-pushed from 6d54e76 to ece84c4.
This follows the current tentative API being worked on in bazelbuild/remote-apis#168. While there's technically room for it to change, it has reached a somewhat stable point worth implementing.
Force-pushed from ece84c4 to 13788e3.
@@ -135,7 +142,26 @@ func (c *Client) WriteProto(ctx context.Context, msg proto.Message) (digest.Dige
 func (c *Client) WriteBlob(ctx context.Context, blob []byte) (digest.Digest, error) {
 	ch := chunker.NewFromBlob(blob, int(c.ChunkMaxSize))
 	dg := ch.Digest()
-	return dg, c.WriteChunked(ctx, c.ResourceNameWrite(dg.Hash, dg.Size), ch)
+	name, err := c.maybeCompressBlob(ch)
I think it will be cleaner if you create the right type of chunker to begin with, instead of using the CompressChunker function. That way, you won't need the function at all (and probably not the .compressed variable either). It's simpler, because then the created chunker never has to change.
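A hedged sketch of what deciding the chunker type at construction time could look like, building on the hypothetical `Chunker`/`New` above. `zstd` is github.com/klauspost/compress/zstd, and the `compressed` parameter stands in for whatever client option ends up controlling this:

```go
import (
	"bytes"
	"io"

	"github.com/klauspost/compress/zstd"
)

// NewFromBlob decides once, at construction time, whether the chunker reads
// the raw bytes or a compressed stream. The created chunker never has to
// switch modes afterwards, so no CompressChunker-style mutation is needed.
func NewFromBlob(blob []byte, chunkSize int, compressed bool) (*Chunker, error) {
	if !compressed {
		return New(bytes.NewReader(blob), chunkSize), nil
	}
	// Compress on the fly: a goroutine feeds zstd output into a pipe that
	// the chunker reads from, so the compressed blob is never fully buffered.
	pr, pw := io.Pipe()
	enc, err := zstd.NewWriter(pw)
	if err != nil {
		return nil, err
	}
	go func() {
		_, wErr := enc.Write(blob)
		if wErr == nil {
			wErr = enc.Close()
		}
		pw.CloseWithError(wErr)
	}()
	return New(pr, chunkSize), nil
}
```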
@@ -284,7 +284,7 @@ func TestWrite(t *testing.T) {
 	}

 	for _, tc := range tests {
-		t.Run(tc.name, func(t *testing.T) {
+		testFunc := func(t *testing.T) {
Instead, please add the new parameters to the tc struct.
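For example, a sketch only (the fields are hypothetical, and it reuses the `NewFromBlob` from the earlier sketch): putting the compression setting into the test case struct keeps the loop body a plain `t.Run` call.

```go
import "testing"

func TestWrite(t *testing.T) {
	tests := []struct {
		name       string
		blob       []byte
		compressed bool // the new parameter lives in the tc struct itself
	}{
		{name: "small plain", blob: []byte("hello"), compressed: false},
		{name: "small compressed", blob: []byte("hello"), compressed: true},
	}
	for _, tc := range tests {
		tc := tc
		t.Run(tc.name, func(t *testing.T) {
			ch, err := NewFromBlob(tc.blob, 1024, tc.compressed)
			if err != nil {
				t.Fatalf("NewFromBlob: %v", err)
			}
			// ... write ch via the client under test and verify the result ...
			_ = ch
		})
	}
}
```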
@@ -135,7 +142,26 @@ func (c *Client) WriteProto(ctx context.Context, msg proto.Message) (digest.Dige
 func (c *Client) WriteBlob(ctx context.Context, blob []byte) (digest.Digest, error) {
 	ch := chunker.NewFromBlob(blob, int(c.ChunkMaxSize))
 	dg := ch.Digest()
-	return dg, c.WriteChunked(ctx, c.ResourceNameWrite(dg.Hash, dg.Size), ch)
Changing this alone is not enough if we want all writes to be compressed; writes don't usually go through this intermediate-layer function, and most use cases go through UploadIfMissing. I'd suggest creating a Client wrapper function for creating a Chunker from a file (one that initializes the correct type of chunker based on client parameters) and using it everywhere Chunkers are created.
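A hedged sketch of such a wrapper, reusing the hypothetical `New` constructor from the earlier sketch. `CompressedBytestreamThreshold` is an assumed client option (not necessarily the real field name), and file-handle lifecycle handling is simplified for brevity:

```go
import (
	"io"
	"os"

	"github.com/klauspost/compress/zstd"
)

// NewChunkerFromFile would be the single place where every call path
// (UploadIfMissing, WriteBlob, ...) obtains a chunker for a file, so the
// compress-or-not decision is driven by client parameters in one spot.
func (c *Client) NewChunkerFromFile(path string) (*Chunker, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	info, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	chunkSize := int(c.ChunkMaxSize)
	if c.CompressedBytestreamThreshold < 0 || info.Size() < c.CompressedBytestreamThreshold {
		// Below the threshold (or compression disabled): plain chunker over the file.
		return New(f, chunkSize), nil
	}
	// At or above the threshold: the chunker reads a zstd stream produced on the fly.
	pr, pw := io.Pipe()
	enc, err := zstd.NewWriter(pw)
	if err != nil {
		f.Close()
		return nil, err
	}
	go func() {
		_, cErr := io.Copy(enc, f)
		if cErr == nil {
			cErr = enc.Close()
		}
		f.Close()
		pw.CloseWithError(cErr)
	}()
	return New(pr, chunkSize), nil
}
```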
It follows the spec proposed in bazelbuild/remote-apis#168 and is similar to #232. While the API still has room to change, it is mostly finalized and worth implementing. A caveat of this implementation: the `offset` in reads refers to the uncompressed bytes, while the `limit` refers to the compressed bytes.
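To illustrate that caveat, here is a hedged sketch of building a compressed-read request. The `compressed-blobs/zstd` resource name layout follows the proposal in bazelbuild/remote-apis#168 and, like the helper name itself, should be treated as an assumption rather than the SDK's actual code:

```go
import (
	"fmt"

	bspb "google.golang.org/genproto/googleapis/bytestream"
)

// compressedReadRequest is a hypothetical helper showing how the two units
// differ: ReadOffset is counted in uncompressed bytes, ReadLimit in
// compressed bytes. The hash and size in the resource name are those of the
// *uncompressed* blob, so the compressed digest is never needed.
func compressedReadRequest(instance, hash string, uncompressedSize, uncompressedOffset, compressedLimit int64) *bspb.ReadRequest {
	return &bspb.ReadRequest{
		ResourceName: fmt.Sprintf("%s/compressed-blobs/zstd/%s/%d", instance, hash, uncompressedSize),
		ReadOffset:   uncompressedOffset, // skip this many bytes of the *uncompressed* stream
		ReadLimit:    compressedLimit,    // return at most this many *compressed* bytes
	}
}
```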
Force-pushed from 0371e47 to 6c40fe7.
Closing in favor of #240.