fix: adjust credential chain #2958

Closed
ekristen wants to merge 4 commits

Contributor

@ekristen ekristen commented Sep 22, 2023

PR #2889 was supposed to fix this, but as far as I can tell it doesn't. That's possibly because I also have IMDS enabled and not blocked, but you can't expect IMDS to be blocked just to make things work. (I read about this in one of the issues.)

Simply reordering the credential chain resolves the issue without the need for if statements, which is what this PR does.

I've built the image and am hosting it at ghcr.io/ekristen/tempo:2.2.3-ek1 if you'd like to test it out.

What this PR does:

Fixes AWS authentication for IRSA.

Which issue(s) this PR fixes:
Fixes #2888

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Configuring IRSA Testing with GitHub Actions

  1. Identify a test AWS Account (ideally a dedicated account for this with nothing else)
  2. Add GitHub Actions Tokens as an Identity Provider
  3. Create an AWS Role named tempo-tests
  4. Set the role's trust policy to the policy in the Trust Policy section below
       • Replace 123456123456 with the real account id
       • Edit line 162 and replace 123456123456 with the real account id
  5. Add NO permissions to the role. None. This allows the role to be assumed, but it has no permissions to do anything in the account.
  6. Complete

Trust Policy

Note: replace 123456123456 with your account id.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123456123456:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": "repo:grafana/tempo:*"
                },
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}
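
For illustration, the two conditions in this trust policy can be modeled in a few lines of Go. This is a simplified sketch, not how AWS evaluates policies: `wildcardMatch` and `trustPolicyAllows` are hypothetical helpers, only the `*` wildcard of `StringLike` is handled, and the sample `sub` values assume the `repo:OWNER/REPO:...` shape that GitHub Actions tokens use.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcardMatch reports whether s matches pattern, where '*' matches any run
// of characters (including '/'). It is a simplified model of StringLike and
// does not handle '?'.
func wildcardMatch(pattern, s string) bool {
	parts := strings.Split(pattern, "*")
	if len(parts) == 1 {
		return pattern == s // no wildcard: exact match only
	}
	// The text before the first '*' must be a prefix.
	if !strings.HasPrefix(s, parts[0]) {
		return false
	}
	s = s[len(parts[0]):]
	// Each middle part must appear, in order, in what remains.
	for _, p := range parts[1 : len(parts)-1] {
		i := strings.Index(s, p)
		if i < 0 {
			return false
		}
		s = s[i+len(p):]
	}
	// The text after the last '*' must be a suffix of what remains.
	return strings.HasSuffix(s, parts[len(parts)-1])
}

// trustPolicyAllows mirrors the two conditions in the policy above:
// StringLike on the sub claim and StringEquals on the aud claim.
func trustPolicyAllows(sub, aud string) bool {
	return wildcardMatch("repo:grafana/tempo:*", sub) && aud == "sts.amazonaws.com"
}

func main() {
	subs := []string{
		"repo:grafana/tempo:ref:refs/heads/main",
		"repo:grafana/tempo:pull_request",
		"repo:someone-else/tempo:ref:refs/heads/main",
	}
	for _, sub := range subs {
		fmt.Printf("%-45s allowed=%v\n", sub, trustPolicyAllows(sub, "sts.amazonaws.com"))
	}
}
```

Under this model, any workflow in grafana/tempo could assume the role, while a token issued for a different repository would be rejected.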

@CLAassistant

CLAassistant commented Sep 22, 2023

CLA assistant check
All committers have signed the CLA.

},
}),
}
chain := []credentials.Provider{
Member

This chain seems to exclude the AWS SDK auth that was introduced in #2760. Is it not needed to fix the problem described in #2743?

Contributor Author

It doesn't. The chain itself covers that; the minio package has supported it since v6.0.53. See https://github.com/grafana/tempo/pull/2958/files#diff-c0f785bb00d262f48124144ac01da108c80587ee0cbadb13f4451942be17dab3R443-R447

The other code doesn't seem to work right. I didn't troubleshoot it in depth beyond confirming that IRSA doesn't work. The minio client SDK can handle all the various methods on its own.

Contributor

@coufalja coufalja Oct 5, 2023

If that is removed, IMDSv1 will break again. In parallel, I engaged with Minio and found that this is actually a minio-go library issue: minio/minio-go#1866, fixed by minio/minio-go#1877 and released in v7.0.63. If we bump the minio-go version, we could revert #2760.

Contributor

@knylander-grafana knylander-grafana left a comment

Thank you for updating the documentation.

@hobbsh

hobbsh commented Sep 27, 2023

Tested and it seems to solve our IRSA issue in EKS but I did not test any other cases.

@ekristen
Contributor Author

All the other methods are pretty easy to test. I've tested this chain in another tool, and I'm happy for anyone else to test it so we can get this merged.

@leandro-alt

Any update on this PR? Would be great to have this fix released

@mapno
Member

mapno commented Oct 4, 2023

All the other methods are pretty easy to test.

How would you test them? We're looking for a reliable way to test all the different auth modes. We manually tested the current implementation and it worked fine before.

It feels like we're going in circles with these S3 auth changes, as this PR actually configures the credential provider chain the way it was in the first v2.2 release, before issue #2743 was reported and fixed.

@ekristen
Contributor Author

ekristen commented Oct 4, 2023

My suspicion as to why it appeared not to work in the first place is that one of the handlers in the chain probably returned keys before the IRSA handler was reached.
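
That suspicion can be illustrated with a simplified model of a credential chain, assuming the chain returns the first set of keys that any provider yields. The `Provider` interface, the `fixed` type, and `resolve` below are hypothetical stand-ins for illustration, not the minio-go API.

```go
package main

import "fmt"

// Creds is a simplified stand-in for a set of credentials.
type Creds struct {
	AccessKey string
	Source    string
}

// Provider is a hypothetical interface: ok is false when the provider has
// nothing to offer in the current environment.
type Provider interface {
	Retrieve() (c Creds, ok bool)
}

// fixed is a fake provider that either returns a canned key or nothing.
type fixed struct {
	name string
	key  string // empty means "no credentials available here"
}

func (f fixed) Retrieve() (Creds, bool) {
	if f.key == "" {
		return Creds{}, false
	}
	return Creds{AccessKey: f.key, Source: f.name}, true
}

// resolve walks the chain in order and returns the first credentials found.
func resolve(chain []Provider) (Creds, bool) {
	for _, p := range chain {
		if c, ok := p.Retrieve(); ok {
			return c, true
		}
	}
	return Creds{}, false
}

func main() {
	env := fixed{name: "env"}                        // no env vars set
	imds := fixed{name: "imds", key: "NODE-ROLE-KEY"} // node instance role
	irsa := fixed{name: "irsa", key: "POD-ROLE-KEY"}  // pod's web identity role

	a, _ := resolve([]Provider{env, imds, irsa})
	b, _ := resolve([]Provider{env, irsa, imds})
	fmt.Println(a.Source, b.Source) // prints "imds irsa"
}
```

In this model, a pod on a node whose IMDS is reachable picks up the node's keys whenever the IMDS provider sits ahead of the IRSA one, so the IRSA role is never tried.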

Most of the handlers are env var or config file parsers. Testing them would take sitting down and writing tests that, at a minimum, supply valid or test keys and check that they are returned.

The IMDS ones would be the most difficult, because you'd either have to test on an AWS node with a role or intercept the call to 169.x.

IRSA can be tested with GitHub Actions, as it provides an IdP token endpoint for authenticating to AWS.
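
As a rough sketch of that last point: a workflow job with the `id-token: write` permission gets the `ACTIONS_ID_TOKEN_REQUEST_URL` and `ACTIONS_ID_TOKEN_REQUEST_TOKEN` environment variables and can request an OIDC token with a bearer-authenticated GET. The Go below is an illustration under those assumptions, not code from this PR; `fetchIDToken` and `newFakeTokenServer` are hypothetical helpers, and the fake server exists only so the sketch runs outside GitHub Actions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"os"
)

// fetchIDToken asks the token endpoint for an ID token scoped to audience.
func fetchIDToken(client *http.Client, requestURL, requestToken, audience string) (string, error) {
	u, err := url.Parse(requestURL)
	if err != nil {
		return "", err
	}
	q := u.Query() // keep any query parameters already present on the URL
	q.Set("audience", audience)
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+requestToken)

	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("token endpoint returned %s", resp.Status)
	}
	var body struct {
		Value string `json:"value"` // the signed JWT
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return "", err
	}
	if body.Value == "" {
		return "", fmt.Errorf("token endpoint returned an empty token")
	}
	return body.Value, nil
}

// newFakeTokenServer is a hypothetical local stand-in for the endpoint.
func newFakeTokenServer(wantBearer, idToken string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+wantBearer {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		json.NewEncoder(w).Encode(map[string]string{"value": idToken})
	}))
}

func main() {
	reqURL := os.Getenv("ACTIONS_ID_TOKEN_REQUEST_URL")
	reqToken := os.Getenv("ACTIONS_ID_TOKEN_REQUEST_TOKEN")
	if reqURL == "" || reqToken == "" {
		// Outside GitHub Actions: fall back to the local fake.
		srv := newFakeTokenServer("local-secret", "fake.jwt.token")
		defer srv.Close()
		reqURL, reqToken = srv.URL+"/token?example=1", "local-secret"
	}
	token, err := fetchIDToken(http.DefaultClient, reqURL, reqToken, "sts.amazonaws.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("received an ID token of length", len(token))
}
```

In a test job, the token would typically be written to a file referenced by `AWS_WEB_IDENTITY_TOKEN_FILE`, with `AWS_ROLE_ARN` pointing at the test role, so that the web identity provider in the chain can pick it up.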

My changes work. The Restic project uses the exact same chain as well, minus the weird v2 signature override function this project has.

I'll see if I can find some time to see how difficult it would be to write some tests.

@ekristen
Contributor Author

ekristen commented Oct 4, 2023

@mapno The latest commit adds tests for the entire credential chain, with the calls to the metadata service for the instance profile role and for IRSA being mocked. I excluded calls to ECS as I don't think that's a valid deployment target at this time.
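
For readers unfamiliar with this kind of mocking, the following is a minimal sketch of a fake instance metadata service built on `net/http/httptest`. It is not the test code from this PR: `newFakeIMDS`, `fetchRoleCreds`, and the role name `example-node-role` are hypothetical, only the IMDSv1-style GET requests are modeled (the IMDSv2 session token handshake is omitted), and how the client under test is pointed at the server's URL is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

const credsPath = "/latest/meta-data/iam/security-credentials/"

// roleCreds mirrors the JSON document the metadata service returns for a role.
type roleCreds struct {
	Code            string    `json:"Code"`
	AccessKeyID     string    `json:"AccessKeyId"`
	SecretAccessKey string    `json:"SecretAccessKey"`
	Token           string    `json:"Token"`
	Expiration      time.Time `json:"Expiration"`
}

// newFakeIMDS serves the role listing and one role's fake credentials.
func newFakeIMDS() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc(credsPath, func(w http.ResponseWriter, r *http.Request) {
		switch strings.TrimPrefix(r.URL.Path, credsPath) {
		case "": // listing: the attached role's name as plain text
			fmt.Fprint(w, "example-node-role")
		case "example-node-role":
			json.NewEncoder(w).Encode(roleCreds{
				Code:            "Success",
				AccessKeyID:     "FAKEACCESSKEY",
				SecretAccessKey: "fake-secret",
				Token:           "fake-session-token",
				Expiration:      time.Now().Add(time.Hour).UTC(),
			})
		default:
			http.NotFound(w, r)
		}
	})
	return httptest.NewServer(mux)
}

// fetchRoleCreds performs the two GETs a client makes: list the role, then
// read that role's credentials.
func fetchRoleCreds(base string) (roleCreds, error) {
	var rc roleCreds
	resp, err := http.Get(base + credsPath)
	if err != nil {
		return rc, err
	}
	name, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil {
		return rc, err
	}
	resp, err = http.Get(base + credsPath + string(name))
	if err != nil {
		return rc, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return rc, fmt.Errorf("metadata service returned %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&rc)
	return rc, err
}

func main() {
	srv := newFakeIMDS()
	defer srv.Close()
	rc, err := fetchRoleCreds(srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rc.Code, rc.AccessKeyID) // prints "Success FAKEACCESSKEY"
}
```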

Tests will start to fail because GitHub Actions OIDC access to an AWS account has not been set up. The setup instructions are also at the top of the PR so they can be followed from there.

@mapno
Member

mapno commented Oct 5, 2023

Wow, thanks for the tests. I'll need to take a detailed look at them.

My suspicion as to why it appeared not to work in the first place is that one of the handlers in the chain probably returned keys before the IRSA handler was reached.

But the chain barely changes from its current state to your changes, no? It's the same chain, except for credentials.Static, which is put at the top of the chain. I ran your tests with the chain split like that and they passed as well. I don't understand what this PR is changing in that sense.

@ekristen
Contributor Author

ekristen commented Oct 5, 2023

I think a lot of things got mixed together.

It looks like there were some issues with IMDS and then with IRSA (assume role with web identity).

If the minio library is updated to the latest version, as suggested in the other commenter's comment, and either this PR is merged or the other PRs are reverted, then things should be OK.

I updated the order of the chain to make more sense and added tests.

@mapno
Member

mapno commented Oct 5, 2023

Just reviewed the tests, they're a pretty nice addition! Do you think it'd be possible to mock the metadata service for IRSA with a tool like https://github.com/aws/amazon-ec2-metadata-mock, instead of using an actual AWS account? Unfortunately, I don't think we'll be able to set up an AWS account.

I still don't understand why the new chain works and the current doesn't, as it doesn't introduce changes from what we have now (or had in the past). Sorry if I'm being very obtuse here.

If the minio library is updated to the latest version, as suggested in the other commenter's comment, and either this PR is merged or the other PRs are reverted, then things should be OK.

Would you like to take care of that change? I can do it if not.

Thanks for the patience! Love to see these contributions in Tempo ❤️

@mapno
Member

mapno commented Oct 11, 2023

I've opened a new PR based on this one, but with minio-go updated. There are images published for amd64 and arm64 architectures. If all of you can test them, we can get that PR merged. cc/ @ekristen @coufalja @sberz @z0rc @finda-yeongjo @hobbsh

@ekristen
Contributor Author

Updated the dependency. I still think this PR is the better way to go, given the tests.

@ekristen
Contributor Author

I will rebase.

@ekristen
Contributor Author

Closing in favor of #3006

@ekristen ekristen closed this Oct 12, 2023
Successfully merging this pull request may close these issues.

2.2.2 regression: S3 storage does not work with AWS IRSA authentication anymore
7 participants