-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aws: IAM pod identity #260
aws: IAM pod identity #260
Conversation
6d29f3f
to
4e8da06
Compare
/cc @marun |
/assign @derekwaynecarr |
/cc @smarterclayton |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to work out tomorrow if we should require a unique provider per cluster and if the user could provide the provider as a day-1 option rather than we create it.
enhancements/aws-pod-identity.md
Outdated
@@ -0,0 +1,442 @@ | |||
--- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also this needs to pick a domain ??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean an subdirectory under enhancements? i'm open to suggestions 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working on a CCO enhancement as well, my best guess was security.
enhancements/aws-pod-identity.md
Outdated
|
||
### Goals | ||
|
||
1. A pod can assume an IAM role created with a `CredentialRequest` CR using a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this means that a pod can have only ONE identity attached right?
cloud-credential-operator currently supports dual identity.
- an admin entity that is used to create new roles/users etc.
- an reader entity that is only allowed to read roles/users to make sure things stay as-is
a lot of the operators today require create/update/delete permissions do that they can create dynamic objects. But some of the operators can be modified to have dual identities.
- ADMIN_WRITE identity that can be used to create the objects.
- RESTRICTED_WRITE identity that is restricted to only the objects it created.
that could allow users to remove/disable the ADMIN_WRITE permissions as day-2 operation.. making the security stance even stronger.
if using the POD identity based on webhook and environment disallows us to ever move in that direction, I think we should maybe not use it and stick to secrets as multiple secrets can be mounted compared to single AWS env variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this would currently not be suitable for use by the CCO itself. since the pod can only run with one serviceaccount, this mechanism is one-to-one IAM role to pod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would/could this be solved by having the operator have the ADMIN_WRITE and the operand have RESTRICTED_WRITE (assuming a clean separation between operator and operand)?
i suppose this might also imply that the operator is also creating a new CredentialsRequest so that the perms for the operand only grant perms to the actually created objects...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should punt on the multi-cred use case for now since serviceaccount-to-pod is 1:1 vs secret-to-pod is N:1
4e8da06
to
480e900
Compare
enhancements/aws-pod-identity.md
Outdated
|
||
### Non-Goals | ||
|
||
1. Convert existing IAM user-based components (e.g. ingress, image-registry, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we plan to update the existing CredRequests to use this new mechanism on AWS? And if so, how can we upgrade without triggering something bad here. These CredRequests are delivered via the release manifest and I believe that means the CVO will update them to their new state. Is there a way we can make this manual and opt-in?
In doing so, the operator permissions would then become fixed and need manual intervention in the event an upgraded operator needed new permissions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we are not planning to update existing CredentialRequests
at this time.
For this stage, I recommended we disallow CR changes where the spec changes from a secretRef
to a serviceAccountRef
or visa versa.
There doesn't seem to be a way to mark CRD fields as immutable atm kubernetes/kubernetes#65973
We'll probably just overwrite the change in the controller is someone tries to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll probably just overwrite the change in the controller is someone tries to do that.
Do we have an admission webhook for this? What happens if someone tries to push a change while the cred minter is, for some reason, out of action? The new cred minter comes up and... searches for existing Secrets to decide if it needs to overwrite a serviceAccountRef
?
@deads2k @derekwaynecarr @joelddiaz @dgoodwin @abhinavdahiya Ok so I've encountered an issue implementing the API change to the CredentialsRequest CRD Currently, this is the spec
The issue is two fold depending on what solution are you trying to implement.
I've thought of 3 solutions and I'm looking for feedback:
I would apreciate any feedback on these methods or there might be a far superior way that I'm overlooking. |
3 feels the best to me, going from a mandatory field to optional shouldn't cause any harm or break API compat, and I'm not particularly worried about converting that field to a pointer, though you would need to update all usage of it to add in the other implementation. We do not typically use discriminator fields for this kind of thing but it sounds like good practice. I would suggest some kind of name with "Type", maybe "CredentialsType". I don't think we have webhook validation today so this will probably need to be validated at runtime and reflected in a condition? |
There is going to need to be some intermediate work either way. The core operator and the actuators currently operate directly on the secret. I'm going to need to create an abstraction so we aren't if-else'ing throughout the whole codebase. |
What is the advantage of the StorageDestination approach? Is it that the StorageType tells you which of the pointers contains the actual data vs searching through the object for a non-nil value? Is converting the SecretRef to a pointer and adding the ServiceAccountRef as a pointer not an option, or is the StorageDestination basically the solution you offered covering that? FWIW, I'm partial to #3 (or basically any option which can take us to a place where the SecretRef field becomes optional). |
Yes. UnionDiscriminators allow older clients to know they don't understand a newer format. Imagine adding, "nextCoolThingSethWants". Old clients won't even see the new fields, but they will see the discriminator.
You cannot change a field to a pointer because old clients won't be able to deserialize |
e389f60
to
2f9b54a
Compare
521e99a
to
c6e57aa
Compare
Latest push contains significant revisions:
We also resolved to writing a new controller that will create and reconcile an OIDC discovery endpoint in a public S3 bucket. This is a significant amount of work entering the scope (configuration for enable/disable this functionality, new IAM perms required, etc). As such, the work on openshift/cloud-credential-operator#187 will be put on hold, and work will begin on the OIDC discovery controller. |
enhancements/aws-pod-identity.md
Outdated
|
||
Pros: | ||
- Works even with the apiserver is not accessible by AWS STS | ||
(disconnected/private cluster) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the door open to seamlessly transition to ServiceAccountDiscovery later once it's out of alpha?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. We just have to detect if the URL in the Authentication CR is empty or our default and switch it to the ServiceAccountDiscovery URL. If it is not our default (the customer set it purposefully), we would do nothing. However, it was decided to do the s3 bucket because ServiceAccountDiscovery won't work for private clusters.
enhancements/aws-pod-identity.md
Outdated
to use this method if we work out the ordering i.e. the webhook needs to be | ||
running and the IAM roles need to exist before any of the pods for these | ||
components are created and would require change and coordination with the | ||
operators that manage those components. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CCO does run pretty early on the bootstrap node which may help here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But having in-cluster logic that switches on "am I a fresh install after $INSTALL_APPROACH_PIVOT
?" seems pretty complicated.
enhancements/aws-pod-identity.md
Outdated
|
||
This allows users to more cleanly express and manage pod identities in IAM when | ||
running on AWS. The use of bound tokens to assume roles (vs user id/key) is | ||
preferred as these tokens can expire and be rotated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we expire and rotate user IDs and keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can, but the expiration is not a direct property of the userID/key, where it is for the token i.e. the token user can discover the token's expiration.
enhancements/aws-pod-identity.md
Outdated
eks.amazonaws.com/role-arn: "arn:aws:iam::111122223333:role/sjenning-abcde-openshift-image-registry" | ||
``` | ||
|
||
Any pod started with this `serviceAccountName` will have the token injected via |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when a Pod is defined that references the not-yet-created Service Account (CCO hasn't finished creating the role/ServiceAccount)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the webhook does not inject the projected volume. the annotated SA must exist before the pod is admitted for this to work. This is a known limitation of the upstream implementation i.e. that a pod can be admitted before the service account that runs it exists.
enhancements/aws-pod-identity.md
Outdated
``` | ||
Ideally, new 4.5 clusters deployed on platform `aws` would roll out with this set day-1. | ||
|
||
For day-2 deployments, changing this results in a redeployment of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the first release of CCO (and all future ones) with support for pod-identiy be responsible for performing this modification of the Authentication object (so that all 4.5+ clusters are in pod-identity-capable-mode)?
Getting requests for STS to be in 4.8. What is the status of this RFE? |
c6e57aa
to
bf8ab49
Compare
bf8ab49
to
4a9f07e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can make the requested tweaks in follow-up.
/approve
/lgtm
|
||
# Support Native (IAM-based) Pod Identity on AWS | ||
|
||
## Release Signoff Checklist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you update which of these items got completed since we are retro merging?
Phase 2 will make the Role creation and `ServiceAccount` annotation | ||
automatic, just as the User creation and `Secret` injection are today. This will | ||
involve changing the API in the CredentialRequest CRD to allow references to | ||
`ServiceAccounts` in addition to `Secerts` as an injection mechanism for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Secrets
integratedOAuthMetadata: | ||
name: oauth-openshift | ||
``` | ||
Ideally, new 4.5 clusters deployed on platform `aws` would roll out with this set day-1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarify that this is how we are rolling out now (i see issuer specified)
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: derekwaynecarr, sjenning The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
1. An admin can create a IAM Role via `CredentialRequest` CR and inject (via | ||
annotation) to a `ServiceAccount`, just as they can create an IAM user and | ||
inject to a `Secret` currently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjenning we've talked about only having done phase 1 but this Phase 2 is actually complete isn't it?
There's really a phase 3 of transitioning a full OCP install and all components to use this functionality, is that right?
Access to the new functionality is gated by two things: the platform being | ||
`aws` and, once phase 2 is complete, the user specifying a `spec.storage.serviceAccount` in the `CredentialsRequests` CR. | ||
|
||
## Design Details |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like everything after this is template boilerplate. Maybe a followup PR where you drop the sections? Or explicitly say "We do not need a test plan for this enhancement" and so forth for each section?
@derekwaynecarr