resolver: Isolate auth token cache per session #3592

marxarelli · 2023-02-06T17:43:25Z

Prior to this change, entries in the resolver's auth token cache were keyed only by remote name and action (push/pull). As a result, remote authenticated sessions leaked between distinct client sessions, and one client's token could be used to authenticate another client's registry access.

While this may not have had a substantial impact for pull requests, since the solver vertices and the local cache namespace are shared by all clients (i.e. one client can always receive a cache entry that resulted from another client's cached pull request), the shared auth cache also allowed one client to push to a remote registry for a ref it might not otherwise have had rights to. This behavior is particularly problematic in shared CI environments where auth token scopes are used to limit push access by registry namespace.

Key each entry in the auth token cache by `<session id>:<ref>:<action>` to avoid cross use between distinct client sessions.

Signed-off-by: Dan Duvall <dduvall@wikimedia.org>

tonistiigi

I don't think this makes much sense, as the LLB vertexes from different requests are merged anyway, and even if they were not, all requests have access to the build cache without any session auth check. So this fixes a small window (possibly by making things slower), while for many other orderings of requests, access to the resources would remain as it was before.

If you want to serve multiple untrusted clients, then we would need a "namespace" construct for the whole builder so that all aspects of the build are isolated. Even sharing content-addressable blobs is actually tricky: although they can be shared safely, how would you prove that the second client actually had access to the full data (not just the digest) without pulling it a second time?

marxarelli · 2023-02-08T18:50:03Z

I see three distinct behaviors in the way the auth token cache is implemented now given a two-client scenario, one of which is not really of concern and two that are concerning. Of the two concerning behaviors, one is a functional (non-deterministic race condition) concern and the other is a security concern.

Scenario: Client A and Client B each request a solve. The two solves share a vertex (same cache key) for a remote resource that requires remote auth, and each pushes its output to a remote registry that also requires auth. Client A has valid credentials to pull the remote resource of the shared vertex and to push to the remote registry. Client B has no credentials.

Behaviors:

1. Given Client A completes its solve first, Client B has access to the cached vertex, either by direct retrieval via the content client or indirectly by integrating the vertex into its result and exporting the result. For the reasons you describe, this is not of concern. All clients share the same namespace and cache space.
2. Given Client B requests its solve first, the request for the remote resource of the shared vertex fails, because Client B has no credentials with which to retrieve it. The failed auth result is cached. Client A then requests its solve, and the solve fails because Client B's failed auth result is reused for Client A. This is a functional concern: a race condition with non-deterministic behavior that depends on the timing of two unrelated clients.
3. Given Client A completes its solve and exports its result first, Client B can piggyback on Client A's cached auth to write its result to a remote registry for which it has no credentials. This is a security exploit, and not one affecting the clients or buildkitd itself but one affecting the remote registry, quite likely a third-party system.

IMHO, it is bad practice in general to ever cache auth credentials or results across sessions of unrelated clients, because even if the clients are considered trusted in the primary domain, there may be operations with remote systems for which these same clients would not be considered trusted.
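To make the third behavior concrete, here is a minimal, self-contained sketch. It is not BuildKit's code: `client`, `fetchToken`, `authorizer`, `sharedKey` and `sessionKey` are all hypothetical names, and the registry is reduced to a function that only issues a token when a credential is presented. It contrasts a cache keyed by ref and action alone with one that also includes the session ID.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical build client; Credential is empty when it has none.
type client struct {
	SessionID  string
	Credential string
}

// fetchToken stands in for a registry token endpoint (an assumption for this
// sketch): it only issues a token to clients that present a credential.
func fetchToken(c client, ref, action string) (string, error) {
	if c.Credential == "" {
		return "", errors.New("unauthorized")
	}
	return "token:" + ref + ":" + action, nil
}

// authorizer caches tokens under whatever key keyFn produces.
type authorizer struct {
	cache map[string]string
	keyFn func(c client, ref, action string) string
}

func newAuthorizer(keyFn func(client, string, string) string) *authorizer {
	return &authorizer{cache: map[string]string{}, keyFn: keyFn}
}

// token returns a cached token on a key match and otherwise asks the registry.
func (a *authorizer) token(c client, ref, action string) (string, error) {
	key := a.keyFn(c, ref, action)
	if tok, ok := a.cache[key]; ok {
		return tok, nil
	}
	tok, err := fetchToken(c, ref, action)
	if err != nil {
		return "", err
	}
	a.cache[key] = tok
	return tok, nil
}

// sharedKey ignores the session, so all clients share one entry per ref/action.
func sharedKey(_ client, ref, action string) string {
	return ref + ":" + action
}

// sessionKey prefixes the session ID so that sessions never share an entry.
func sessionKey(c client, ref, action string) string {
	return c.SessionID + ":" + ref + ":" + action
}

func main() {
	a := client{SessionID: "session-a", Credential: "secret"}
	b := client{SessionID: "session-b"} // no credentials

	shared := newAuthorizer(sharedKey)
	_, _ = shared.token(a, "foo/bar", "push") // Client A populates the cache
	_, err := shared.token(b, "foo/bar", "push")
	fmt.Println("shared key, client B push error:", err)

	isolated := newAuthorizer(sessionKey)
	_, _ = isolated.token(a, "foo/bar", "push")
	_, err = isolated.token(b, "foo/bar", "push")
	fmt.Println("session key, client B push error:", err)
}
```

With the shared key, Client B's push lookup hits the entry that Client A created and no error is returned. With the session-prefixed key, the lookup misses and Client B's request reaches the stand-in registry, which rejects it.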

tonistiigi · 2023-02-08T21:18:47Z

> Given Client B requests its solve first, the request for the remote resource of the shared vertex fails, because Client B has no credentials with which to retrieve it. The failed auth result is cached. Client A then requests its solve and the solve fails because Client B's failed auth request is used with Client A. This is a functional concern, a race condition and non-deterministic behavior depending on timing of two unrelated clients.

I'm not sure this is what actually happens. The behavior should be to try all the sessions associated with the pull and look for one that works. But maybe the error codes have some effect.

> Given Client A completes its solve and exports its result first, Client B can piggyback on Client A's cached auth to write its result to a remote registry for which it has no credential. This is a security exploit and not one affecting the clients or buildkitd itself but affecting the remote registry, quite likely a third party system.

Ok, but in that case you only need it if auth is asked for the push scope?

marxarelli · 2023-02-10T18:16:33Z

> I'm not sure this is what actually happens. The behavior should be to try all the sessions associated with the pull and look for one that works. But maybe the error codes have some effect.

You're right. I thought I had observed this behavior a while back, but I'm unable to reproduce it now. Perhaps our registry was returning something unexpected due to our JWT auth setup. In any case, this isn't a problem.

> Given Client A completes its solve and exports its result first, Client B can piggyback on Client A's cached auth to write its result to a remote registry for which it has no credential. This is a security exploit and not one affecting the clients or buildkitd itself but affecting the remote registry, quite likely a third party system.

> Ok, but in that case you only need it if auth is asked for the push scope?

That's true. I'm happy to amend the change and limit this to push scopes if you're open to that.

tonistiigi · 2023-02-10T19:27:47Z

> That's true. I'm happy to amend the change and limit this to push scopes if you're open to that.

SGTM

tonistiigi

^

marxarelli · 2023-06-21T18:43:06Z

My sincere apologies for disappearing. I will refactor this today.

marxarelli · 2023-07-19T16:22:30Z

@tonistiigi not sure if you saw my request for re-review. I've (finally) made the changes you requested.

marxarelli · 2024-02-01T19:13:29Z

@tonistiigi is this still an acceptable change?

tonistiigi

Sorry, I've missed notifications about this. Yes, nothing against the concept.

tonistiigi · 2024-02-01T20:33:28Z

util/resolver/pool.go

+		// party registries from leaking between client sessions. The key will end
+		// up looking something like:
+		// 'wujskoey891qc5cv1edd3yj3p::repository:foo/bar::pull,push'
+		key = fmt.Sprintf("%s::%s::%s", strings.Join(session.AllSessionIDs(g), ":"), name, scope)


What if this is just random, or something unique that is passed in from each call path that does push? session.AllSessionIDs(g) looks weird but I guess it has the correct behavior.

marxarelli · 2024-02-02

I'm not sure I know enough about the codebase to say whether a random value would have exactly the same effect. I would think using the session ID would at least benefit sessions where multiple pushes are made to the same repositories (e.g. cache exports?).
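As a rough illustration of the trade-off discussed in this thread, the sketch below builds a key in the same `%s::%s::%s` shape as the diff, but only adds the session IDs when the scope includes push. `handlerKey` is a hypothetical helper, `sessionIDs` stands in for the result of `session.AllSessionIDs(g)`, and both the `strings.Contains` check and the two-part format for non-push scopes are assumptions rather than the merged code.

```go
package main

import (
	"fmt"
	"strings"
)

// handlerKey is a hypothetical helper. sessionIDs stands in for the value
// returned by session.AllSessionIDs(g) in the diff above.
func handlerKey(sessionIDs []string, name, scope string) string {
	if strings.Contains(scope, "push") {
		// Push scopes: include the session IDs so that tokens with write
		// access are not shared between client sessions.
		return fmt.Sprintf("%s::%s::%s", strings.Join(sessionIDs, ":"), name, scope)
	}
	// Non-push scopes: assumed here to stay shared across sessions.
	return fmt.Sprintf("%s::%s", name, scope)
}

func main() {
	a := []string{"wujskoey891qc5cv1edd3yj3p"}
	b := []string{"session-b"} // hypothetical second session ID

	fmt.Println(handlerKey(a, "repository:foo/bar", "pull,push"))
	fmt.Println(handlerKey(b, "repository:foo/bar", "pull,push"))
	fmt.Println(handlerKey(a, "repository:foo/bar", "pull"))
	fmt.Println(handlerKey(b, "repository:foo/bar", "pull"))
}
```

Because the key is derived from the session rather than from a random value, two pushes to the same repository within one session map to the same entry, which is the reuse benefit suggested above. A random per-call value would isolate sessions as well, but in this sketch it would also give up that reuse.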

tonistiigi

Needs rebase as well

marxarelli · 2024-02-08T21:48:42Z

Rebased, @tonistiigi

marxarelli · 2024-02-09T23:39:51Z

❤️ Thanks, @tonistiigi

tonistiigi reviewed Feb 7, 2023

tonistiigi requested changes May 22, 2023

marxarelli force-pushed the review/isolate-token-cache branch from 62570b2 to b49c685 Compare June 21, 2023 19:09

marxarelli requested a review from tonistiigi June 30, 2023 18:55

tonistiigi reviewed Feb 1, 2024

tonistiigi reviewed Feb 2, 2024

marxarelli added 2 commits February 8, 2024 13:33

resolver: Limit auth handler isolation to push scopes

1a5cf52

Signed-off-by: Dan Duvall <dduvall@wikimedia.org>

marxarelli force-pushed the review/isolate-token-cache branch from b49c685 to 1a5cf52 Compare February 8, 2024 21:34

tonistiigi approved these changes Feb 8, 2024

tonistiigi merged commit 092bec8 into moby:master Feb 9, 2024
63 checks passed

marxarelli deleted the review/isolate-token-cache branch February 21, 2024 18:15
