-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve AWS IAM unique IDs #2814
Conversation
This adds a (now-default) option on roles to resolve the bound_iam_principal_arn (when using AWS IAM auth) to AWS's internal unique ID. The primary reason for this is to prevent a particular role or user from being deleted and recreated with the same ARN and thus taking over Vault permissions that were intended to be bound to the previous ARN, which more closely mimics AWS's behavior. By preferentially resolving via the internal unqiue ID rather than the ARN, this also fixes the issue in hashicorp#2729
Similar to hashicorp#2729, the existing implementation for the aws ec2 auth method can't handle an instance profile with a path in the name, when using a bound IAM role. This improves the parsing of the instance profile ARN to eliminate that error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joelthompson This is looking great to me. Nice work! I've left some minor comments.
builtin/credential/aws/backend.go
Outdated
// accounts using their IAM instance profile to get their credentials. | ||
defaultAWSAccountID string | ||
|
||
resolveArnToUniqueId func(logical.Storage, string) (string, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename this to resolveArnToUniqueIDFunc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
builtin/credential/aws/client.go
Outdated
if err != nil { | ||
return nil, fmt.Errorf("unable to fetch current caller: %v", err) | ||
} | ||
b.defaultAWSAccountID = *identity.Account |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a nil
check for identity here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
builtin/credential/aws/path_login.go
Outdated
stsRole := "" | ||
if sts != nil { | ||
stsRole = sts.StsRole | ||
return nil, fmt.Errorf("failed to extract out IAM instance profile name from IAM instance profile ARN; error: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we update the error message to indicate failure to extract out entity instead of profile name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
website/source/docs/auth/aws.html.md
Outdated
role creation (or upgrading to use this) succeed, then Vault has | ||
already been able to resolve internal IDs, and it doesn't need any | ||
further IAM permissions to authenticate users. If a role has been | ||
deleted and recreated, and Vault has cached the old unique ID, the you |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/the you/you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
builtin/credential/aws/backend.go
Outdated
if err != nil { | ||
return "", err | ||
} | ||
return *userInfo.User.UserId, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When err == nil
, can we trust the responses from GetUser
, GetRole
and GetInstanceProfile
to be non-nil. If not, can we add nil
checks appropriately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the current source, it's not possible. But, I don't see that as a guarantee documented anywhere, so better safe than sorry. I've added the additional checks to be safe.
builtin/credential/aws/path_login.go
Outdated
"canonical_arn": canonicalArn, | ||
"client_arn": callerID.Arn, | ||
"canonical_arn": entity.canonicalArn(), | ||
"client_user_id": callerID.UserId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be the split version of the UserId in callerID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, great catch, thanks! Done (and added a test for it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think I've addressed everything.
builtin/credential/aws/backend.go
Outdated
// accounts using their IAM instance profile to get their credentials. | ||
defaultAWSAccountID string | ||
|
||
resolveArnToUniqueId func(logical.Storage, string) (string, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
builtin/credential/aws/backend.go
Outdated
if err != nil { | ||
return "", err | ||
} | ||
return *userInfo.User.UserId, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the current source, it's not possible. But, I don't see that as a guarantee documented anywhere, so better safe than sorry. I've added the additional checks to be safe.
builtin/credential/aws/client.go
Outdated
if err != nil { | ||
return nil, fmt.Errorf("unable to fetch current caller: %v", err) | ||
} | ||
b.defaultAWSAccountID = *identity.Account |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
builtin/credential/aws/path_login.go
Outdated
stsRole := "" | ||
if sts != nil { | ||
stsRole = sts.StsRole | ||
return nil, fmt.Errorf("failed to extract out IAM instance profile name from IAM instance profile ARN; error: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
website/source/docs/auth/aws.html.md
Outdated
role creation (or upgrading to use this) succeed, then Vault has | ||
already been able to resolve internal IDs, and it doesn't need any | ||
further IAM permissions to authenticate users. If a role has been | ||
deleted and recreated, and Vault has cached the old unique ID, the you |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
builtin/credential/aws/path_login.go
Outdated
"canonical_arn": canonicalArn, | ||
"client_arn": callerID.Arn, | ||
"canonical_arn": entity.canonicalArn(), | ||
"client_user_id": callerID.UserId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, great catch, thanks! Done (and added a test for it).
@@ -80,20 +81,39 @@ func (b *backend) getClientConfig(s logical.Storage, region, stsRole, clientType | |||
return nil, fmt.Errorf("could not compile valid credentials through the default provider chain") | |||
} | |||
|
|||
stsConfig, err := b.getRawClientConfig(s, region, "sts") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this change? Will it always be a valid assumption to assume that you're getting STS credentials?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question (and I probably should have commented it).
Previously, Vault just assumed that if there wasn't an stsRole
configured for the given account, to just use the default credentials provided. That worked because you were looking up two things:
- Instance IDs, which were basically guaranteed to be globally unique, so if you found the given instance ID, you knew it was the instance ID you were looking for.
- IAM instance profiles by name. These weren't globally unique across accounts, but it was OK because the only thing you looked for was the associated IAM role ARN, which does include the account number, so you were OK if you looked up the profile in the wrong account because it wouldn't have the associated IAM ARN you need.
In this PR, I need to resolve unique IDs, which requires looking up a particular IAM entity (user or role) in a given account with a given name. If I have an stsRole
configured for the given account, then great, I'll just try to assume it. But, what if I don't have an stsRole
defined for that account ID? I'm not a huge fan of the assumption that is being made that it should just look up using the default credentials -- I'm looking up the entity by name, but entities with the same name can exist in multiple accounts. I could check the returned ARN to ensure it has the expected account ID embedded in it, but this feels more correct to me. getClientConfig
is being asked for an AWS API client in account 123, we should check to ensure that the client we're returning is actually in account 123. And that's the goal here.
Thinking through this a little more, there's a small issue here in that an operator could call auth/aws/config/sts
with an account_id
of 123456789012, but pass in an stsRole
belonging to a different account, as the config/sts
endpoint doesn't actually validate that the passed-in stsRole
ARN belongs to account_id
but I think that's probably a separate discussion (and I'll open a separate issue for it).
Will it always be a valid assumption to assume that you're getting STS credentials?
I'd put it a little differently. I believe it is a valid assumption that, if you can make any AWS API call, you are able to call sts:GetCallerIdentity
. I've even tried an IAM policy to explicitly deny sts:GetCallerIdentity, and it still succeeds.
Does this all make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question was less about sts:GetCallerIdentity and more about whether it's safe to assume that you always want to use STS creds to make the client calls but from your explanation it sounds correct -- I assume that in any given account we should always be able to get STS creds? Or will this not be the case if they're not using IAM roles, in which case there needs to be a fallback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how to answer your question because the phrase "STS creds" isn't well defined, so I'm not sure what you're asking and thus not sure how to answer. Let me try to explain a little more, and if I'm still not answering your question, it'd probably be best to discuss offline. Apologies if this response is a bit basic, but I'm just trying to understand where we're not connecting.
STS is both a service that requires credentials to authenticate to it and a service that can act as a source of credentials. The code in question does both. I'm pretty sure that any type of IAM entity can authenticate to STS.
it's safe to assume that you always want to use STS creds to make the client calls
Which client calls?
If an stsRole
has been configured for a given target account ID, then we use the existing creds Vault has been configured with to authenticate to STS to ask for a new set of credentials corresponding to the stsRole
, and it uses those credentials for future requests. I'm not really changing this code path at all -- I'm just pulling the existing code up a level so the results can be used, but no material changes are being made. And we're depending on customers to configure the IAM roles such that Vault is able to retrieve credentials via sts:AssumeRole
, so the customer has said he/she wants to get credentials from STS.
If no stsRole
has been configured for an account (the big change I'm making here), then we just use the existing creds Vault has been configured with to authenticate to sts:GetCallerIdentity
to learn the identity of the creds being given; we're not asking for any new credentials from STS. And we continue to use the creds Vault has been given (i.e., not any temporary credentials retrieved via STS) to make future client calls.
it's safe to assume that you always want to use STS creds to make the client calls
Which client calls? The only times we ask STS for credentials is when clients have explicitly told Vault to go to STS for credentials for those calls, and I'm not changing that.
I assume that in any given account we should always be able to get STS creds?
In any given account, we should always be able to call sts:GetCallerIdentity
using any valid creds, whether those are IAM user creds or assumed role creds. Beyond that, it depends. If users have set up the appropriate STS configs in Vault and have set up the appropriate IAM permissions in AWS, then yes. But if's really a user configuration issue.
Or will this not be the case if they're not using IAM roles
The stsRole
must be an IAM role, yes. Otherwise, even if Vault is using, say, IAM User credentials, all this will still work. IAM users can call sts:GetCallerIdentity
and IAM users can call sts:AssumeRole
(as long as the appropriate IAM permissions have been set up).
Does this answer what you're asking? If not, what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to go with "I'm out of my AWS depth here" and just trust you on this. Don't worry, it's not that you haven't answered my question, it's more that I'm not sure what my question is anymore, but your answer has convinced me that you know exactly what you're doing and I should leave it be :-D
Thanks a bunch @joelthompson ! |
This adds a (now-default) option on roles to resolve the
bound_iam_principal_arn (when using AWS IAM auth) to AWS's internal
unique ID. The primary reason for this is to prevent a particular role
or user from being deleted and recreated with the same ARN and thus
taking over Vault permissions that were intended to be bound to the
previous ARN, which more closely mimics AWS's behavior.
By preferentially resolving via the internal unqiue ID rather than the
ARN, this also fixes the issue in #2729 when resolution is turned on.
This also includes a commit that supersedes the changes in #2802. The fix
in #2802 is correct, but by using the new ARN parser, we're able to better
test the resolution to ensure that we're at least parsing the ARN properly.