-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(fs/s3): Add support for prefix filtering #1938
Conversation
* This allows for using part of an s3 bucket for feature flags, using the S3 native `Prefix` filter
0dd0a47
to
4ff0e1c
Compare
Codecov Report
@@ Coverage Diff @@
## main #1938 +/- ##
==========================================
+ Coverage 71.07% 71.13% +0.06%
==========================================
Files 67 67
Lines 6541 6572 +31
==========================================
+ Hits 4649 4675 +26
- Misses 1629 1634 +5
Partials 263 263
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks! Definitely worth considering. Just having a thought before we go any further though.
The .flipt.yml
index file we already parse and respect when using the filesystem backend might actually give you the capabilities you already need without a code change.
Flipt will attempt to fetch and parse this file first, it includes both an include and exclude list. If the file does not exist, it will use a default index file:
version: "1.0"
include:
- "**/features.yaml"
- "**/features.yml"
- "**/*.features.yaml"
- "**/*.features.yml"
exclude: []
However, it will respect any provided override of that file instead. So, given you just want to source from a prefix today, e.g. someprefix
you could add a .flipt.yml
with:
version: "1.0"
include:
- "someprefix/**/features.yaml"
- "someprefix/**/features.yml"
- "someprefix/**/*.features.yaml"
- "someprefix/**/*.features.yml"
exclude: []
I haven't tried this particular example yet though, but that would be my assumption at least.
Obviously, the one advantage to a prefix server side like you have here is the size of the list response fetched from S3 can be reduced. But I don't know how much of a problem it is that we might need that.
What do you think @ahobson ? Do you think server side prefixed fetching is still valuable regardless of the index capability? i.e. to reduce the number of keys fetched.
My one thought originally around using prefixed search was just to make the fs.FS
implementation behave like a much like an FS as possible. Have it e.g. tokenize on /
and walk prefixes, but after reflecting on it, I dont see any real value in that. What you implemented and proposed already seems pretty perfect already.
TL; DR: for my specific use case, the prefix would be helpful, but I'm not sure how common that use case is.
I hadn't realized that!
As I explored how we might deploy flipt for one of our applications, I realized we already had an S3 bucket configured, but we use it for multiple purposes, including user uploads. The result set could be quite large. ... and that has made me realize I didn't implement any support for dealing with truncated responses because the bucket has more than 1000 objects. I should probably implement that anyway. However, if I could set a prefix, the result size would be much more manageable. I'm not sure how common a situation it is to have an s3 bucket used for multiple purposes like this. |
I was thinking about this a bit more and my understanding is that filtering happens at the strorage level, not at the filesystem level. I believe that means for all filesystems, the entire directory tree will be walked regardless of Supporting a prefix for S3 means that only the objects that match the prefix will be returned, limiting the size of the results and the number of calls to AWS. I suppose that's just an optimization, but to me it seems like a pretty important one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the slow review @ahobson , I was away end of last week.
One question around how the prefix in enforced internally. Otherwise, this looks great and the reasoning makes sense.
internal/s3fs/s3fs.go
Outdated
// If a prefix is provided, cannot Open files without the prefix | ||
if f.prefix != "" && !strings.HasPrefix(name, f.prefix) { | ||
return nil, pathError | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thought on this:
I wonder if instead of enforcing that all keys provide the prefix, we should instead forcibly prepend the prefix.
My reasoning is for e.g. .flipt.yml
if we dont do this, given a prefix is defined, the user will never be able to override the .flipt.yml
. Since the storage layer will always request it without the prefix. This might be a confusing and easy mistaken to make.
However, if we forcibly apply the prefix, then the user can still put a .flipt.yml
at the prefix in their bucket and get the same experience of controlling the overrides that way.
What do you think @ahobson ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Nice catch! Yes, supporting opening the .flipt.yml
file in s3 under the prefix seems like an important use case.
The current implementation of ReadDir
in the s3fs returns the Key
that includes the prefix and thus Open
gets the full path.
I think it would be less confusing (and less error prone in the code) if the includes
and excludes
in the .flipt.yml
file matched against the S3 Key
instead of the s3fs removing the prefix.
If a path provided to Open
does not have a prefix, then one is prepended before the S3 GetObject
call. That would support the .flipt.yml
file, but still allow everything else to operate on the full Key
.
@GeorgeMac I think I probably just restated your proposal, but I wanted to make sure we are on the same page about the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh yeah I think I see what you're saying there r.e. the subtle interaction there.
Kind of append if not already prefixed?:
if !strings.HasPrefix(path, prefix) {
path = prefix + path
}
That sounds great if what I am saying equates with what you're thinking 👍
…-io#1962) Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.56.2 to 1.57.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.56.2...v1.57.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: George <me@georgemac.com>
…flipt-io#1961) Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.11.0 to 0.11.1. - [Release notes](https://github.com/golang/tools/releases) - [Commits](golang/tools@v0.11.0...v0.11.1) --- updated-dependencies: - dependency-name: golang.org/x/tools dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
) Bumps [eslint](https://github.com/eslint/eslint) from 8.45.0 to 8.46.0. - [Release notes](https://github.com/eslint/eslint/releases) - [Changelog](https://github.com/eslint/eslint/blob/main/CHANGELOG.md) - [Commits](eslint/eslint@v8.45.0...v8.46.0) --- updated-dependencies: - dependency-name: eslint dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.18.30 to 1.18.32. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/CHANGELOG.md) - [Commits](aws/aws-sdk-go-v2@config/v1.18.30...config/v1.18.32) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2/config dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: George <me@georgemac.com> Co-authored-by: Mark Phelps <209477+markphelps@users.noreply.github.com>
Unsure if you have access to the GH action output but I am seeing this error currently in CI @ahobson :
|
@GeorgeMac 🤦 I'm sorry, I missed that. Thank you for the direct ping. I've updated the tests and they now pass for me locally. Hopefully I haven't missed anything else. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome. Thanks @ahobson
Thank you @ahobson !! Very awesome indeed |
This PR allows for using part of an s3 bucket for feature flags, using the S3 native
Prefix
filter.@GeorgeMac You wondered about prefix listing and I realized it was much easier to do than I thought and also a feature that I think I might be able to use sooner rather than later.
I tried to update the tests to reflect this new feature, but I didn't test all combinations. Let me know if you think it needs more coverage.