Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial work on s3 blobstore #1863

Merged
merged 26 commits into from
May 21, 2019
Merged

Conversation

jontro
Copy link
Contributor

@jontro jontro commented May 16, 2019

I've created an initial implementation of the blobstore interface for s3.

Some notes

  • Credentials are assumed to be configured according to the section in "Configuring Credentials" https://docs.aws.amazon.com/sdk-for-go/api/
  • All buckets have to be in the same region, it would maybe make sense to be able to configure this per bucket

Is there an easy way to test archival of a workflow in a development environment?

Copy link
Contributor

@andrewjdawson2016 andrewjdawson2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this task on. I have left comments on your review. One comment that I did not leave in code is it would be awesome if there was a way to run this locally.

For example a local docker image running S3 that we could connect to. Then we can include config in development.yaml to run this S3 based implementation on local dev.

common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
common/blobstore/s3store/config.go Outdated Show resolved Hide resolved
// Config describes the configuration needed to construct a blobstore client backed by file system
Config struct {
Region string `yaml:"region"`
DefaultBucket BucketConfig `yaml:"defaultBucket"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason filestore has DefaultBucket and CustomBucket is on start up it actually creates these buckets. But in the case of S3 blobstore we won't be doing that. There should not be any DefaultBucket or CustomBuckets as part of this config. S3 blobstore should just know how to take a bucket name in its public APIs and attempt to operate on that bucket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about the use case of having different buckets in different regions? Or is one sufficient? Doing this lookup on demand is quite expensive but could of course be cached if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it for now

common/blobstore/s3store/config.go Outdated Show resolved Hide resolved
common/service/config/config.go Outdated Show resolved Hide resolved
Copy link
Contributor

@andrewjdawson2016 andrewjdawson2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better. Thanks for working on this.

common/blobstore/s3store/client.go Show resolved Hide resolved
common/blobstore/s3store/README.md Show resolved Hide resolved
)

var (
// ErrListFiles could not list files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these are common error types between filestore and s3 store. Let's factor these out and put them in interface.go. Can you also remove remove these error types from filestore. That way we only have these errors in one place.

common/blobstore/s3store/client.go Show resolved Hide resolved
common/blobstore/s3store/client.go Outdated Show resolved Hide resolved
*/

func (s *ClientSuite) TestUploadDownload_Success_CustomBucket() {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove extra enter

downloadBlob, err := client.Download(context.Background(), customBucketName, key)
s.NoError(err)
s.NotNil(downloadBlob)
s.assertBlobEquals(map[string]string{}, "blob body", downloadBlob)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include some tests with non-empty tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this, though maybe this whole test should be removed as a custom bucket name isn't anything special with the s3store any more

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed. Let's remove.

common/blobstore/s3store/config_test.go Show resolved Hide resolved
common/blobstore/s3store/README.md Show resolved Hide resolved
cmd/server/server.go Show resolved Hide resolved
@jontro
Copy link
Contributor Author

jontro commented May 20, 2019

Hopefully I have fixed all feedback and left comments otherwise. One thing, I can't see why the tests are failing, locally the unit tests I've touched passes.

Copy link
Contributor

@andrewjdawson2016 andrewjdawson2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I have a few small nits then let's get this landed. Thanks for the great work on this.

@@ -150,3 +150,8 @@ ignored = ["github.com/uber/cadence/.gen"]
[[constraint]]
name = "github.com/valyala/fastjson"
version = "1.4.1"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra enter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, somehow merging with master this automatically. Good find!

return tags, nil
}

// Exists returns false when the bucket does not exist or the bucket and key does not exist
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filestore currently returns error if bucket does not exist (e.g. bucket not existing is an error but key not existing is just false, nil) I think both APIs are reasonable but I want them to be the same. Lets change this to return an error if bucket does not exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is it would be more inefficient. I.e. I would need to call a different api method. The same error is returned from the api whether or not the bucket exists. What do you think?

})
if err != nil {
aerr, ok := err.(awserr.Error)
if ok && (aerr.Code() == "NotFound") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this hardcoded string? Is there no error code or constant in s3 lib we can check against?

}
retentionDays := 0
for _, v := range lifecycleResults.Rules {
if *v.Status == "Enabled" && retentionDays < int(*v.Expiration.Days) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any S3 library constant we can use instead of this hardcoded string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

return nil, err
}
retentionDays := 0
for _, v := range lifecycleResults.Rules {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we iterating through these rules in chronological order? If so will retention days always be correct at the end of this loop? The way I am reading this code is we select the highest Expiration.Days that has ever been seen even if that is higher than the current Expiration.Days? Maybe I am just misunderstanding these APIs?

}

func (c *client) IsRetryableError(err error) bool {
return c.isStatusCodeRetryable(err) || request.IsErrorRetryable(err) || request.IsErrorThrottle(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :)

common/blobstore/s3store/client.go Show resolved Hide resolved
downloadBlob, err = client.Download(context.Background(), defaultBucketName, key)
s.NoError(err)
s.NotNil(downloadBlob)
s.assertBlobEquals(map[string]string{"key": "value"}, "body version 2", downloadBlob)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

downloadBlob, err := client.Download(context.Background(), customBucketName, key)
s.NoError(err)
s.NotNil(downloadBlob)
s.assertBlobEquals(map[string]string{}, "blob body", downloadBlob)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed. Let's remove.

Copy link
Contributor

@andrewjdawson2016 andrewjdawson2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome.

@andrewjdawson2016 andrewjdawson2016 merged commit ffa6a99 into uber:master May 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants