-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resolver: Isolate auth token cache per session #3592
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this makes much sense as the LLB vertexes from different requests are merged anyway and even if it was not then all requests have access to build cache without any session auth check. So this fixes a small window (possibly by making things slower) while in many other order of requests the access to the resources would remain as it was before.
If you want to serve multiple untrusted clients then we would need a "namespace" construct for the whole builder so all aspects of the build are isolated. Even sharing content addressable blobs is actually tricky because although they can be shared safely, how would you prove that the second client actually had access to the full data(not just digest) without actually pulling it for a second time.
I see three distinct behaviors in the way the auth token cache is implemented now given a two-client scenario, one of which is not really of concern and two that are concerning. Of the two concerning behaviors, one is a functional (non-deterministic race condition) concern and the other is a security concern. Scenario: Client A and Client B request a solve that share a vertex (same cache key) for a remote resource that requires remote auth and push the output to a remote registry that requires auth. Client A has valid credentials to pull the remote resource of the shared vertex and to push to the remote registry. Client B has no credentials. Behaviors:
IMHO, it is bad practice in general to ever cache auth credentials or results across sessions of unrelated clients, because even if the clients are considered trusted in the primary domain, there may be operations with remote systems for which these same clients would not be considered trusted. |
I'm not sure this is what actually happens. The behavior should be to try all the sessions associated with the pull an look for one that works. But maybe the error codes have some effect.
Ok, but in that case you only need it if auth is asked for the push scope? |
You're right. I thought I had observed this behavior a while back, but I'm unable to reproduce it now. Perhaps our registry was returning something unexpected due to our JWT auth setup. In any case, this isn't a problem.
That's true. I'm happy to amend the change and limit this to push scopes if you're open to that. |
SGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^
My sincere apologies for disappearing. I will refactor this today. |
62570b2
to
b49c685
Compare
@tonistiigi not sure if you saw my request for re-review. I've (finally) made the changes you requested. |
@tonistiigi is this still an acceptable change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I've missed notifications about this. Yes, nothing against the concept.
// party registries from leaking between client sessions. The key will end | ||
// up looking something like: | ||
// 'wujskoey891qc5cv1edd3yj3p::repository:foo/bar::pull,push' | ||
key = fmt.Sprintf("%s::%s::%s", strings.Join(session.AllSessionIDs(g), ":"), name, scope) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if this is just random, or something unique that is passed in from each call path that does push? session.AllSessionIDs(g)
looks weird but I guess it has the correct behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I know enough about the codebase to say whether a random value would have the same exact effect. I would think use of the session ID would at least benefit sessions where multiple pushes are made for the same repositories. (e.g. cache exports?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs rebase as well
Prior to this change, entries in the resolver's auth token cache was keyed only by remote name and action (push/pull). This resulted in remote authenticated sessions being leaked between distinct client sessions with the potential for one client's token to be used in authentication for another client's registry access. While this may not have had a substantial impact for pull requests as the solver vertices are shared for all clients as is the local cache namespace—i.e. one client can always receive a cache entry that was the result of another client's cached pull request—the shared auth cache also allowed one client to push to a remote registry for a ref it may not have otherwise had rights to. This behavior is particularly problematic in shared CI environments where auth token scopes are used to limit push access by registry namespace. Key each entry in the auth token cache by `<session id>:<ref>:<action>` to avoid cross use between distinct client sessions. Signed-off-by: Dan Duvall <dduvall@wikimedia.org>
Signed-off-by: Dan Duvall <dduvall@wikimedia.org>
b49c685
to
1a5cf52
Compare
Rebased, @tonistiigi |
❤️ Thanks, @tonistiigi |
Prior to this change, entries in the resolver's auth token cache was keyed only by remote name and action (push/pull). This resulted in remote authenticated sessions being leaked between distinct client sessions with the potential for one client's token to be used in authentication for another client's registry access.
While this may not have had a substantial impact for pull requests as the solver vertices are shared for all clients as is the local cache namespace—i.e. one client can always receive a cache entry that was the result of another client's cached pull request—the shared auth cache also allowed one client to push to a remote registry for a ref it may not have otherwise had rights to. This behavior is particularly problematic in shared CI environments where auth token scopes are used to limit push access by registry namespace.
Key each entry in the auth token cache by
<session id>:<ref>:<action>
to avoid cross use between distinct client sessions.Signed-off-by: Dan Duvall dduvall@wikimedia.org