Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add test coverage for missing permissions #1123

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nan-yu
Copy link
Contributor

@nan-yu nan-yu commented Feb 3, 2024

There are multiple issues with missing permission, though they all share the same KNV2004 error code.
This commit adds test coverage for the following scenarios:

  1. A GSA doesn't exist when using gcpserviceaccount
  2. A GSA exists, but it is not impersonated with a KSA.
  3. A GSA exists and is impersonated, but it doesn't have reader permission.
  4. A WI+KSA doesn't have reader permission.

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from nan-yu. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

There are multiple issues with missing permission, though they all share
the same KNV2004 error code.
This commit adds test coverage for the following scenarios:
1. A GSA doesn't exist when using gcpserviceaccount
2. A GSA exists, but it is not impersonated with a KSA.
3. A GSA exists and is impersonated, but it doesn't have reader
   permission.
4. A WI+KSA doesn't have reader permission.
Namespace: configsync.ControllerNamespace,
Name: core.RootReconcilerName(rs.Name),
}
require.NoError(nt.T,
Copy link
Contributor

@karlkfi karlkfi Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the e2e tests use nt.Must(...) instead of require.NoError(nt.T, ...).

gsaEmail: tc.gsaEmail,
}
mustConfigureRootSync(nt, rs, tenant, spec)
nt.WaitForRootSyncSourceError(rs.Name, status.SourceErrorCode, tc.expectedErrMsg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these auth errors makes me wonder if we shouldnt add a new error code. I would normally expect the problem with a source error to be in the source, but these are all auth problems fetching the source.


// The exact error message returned by AR for the Helm chart is:
// unexpected error rendering chart, will retry","Err":"rendering helm chart: invoking helm: Error: failed to authorize: failed to fetch oauth token: unexpected status from GET request to https://us-docker.pkg.dev/v2/token?scope=repository%3Astolos-dev%2Fconfig-sync-e2e-test--nanyu-cluster-1%2Fcoredns-test-helm-argke-workload-identit%3Apull\u0026service=us-docker.pkg.dev: 403 Forbidden
missingHelmARPermissionError = "403 Forbidden"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to change the test validation to allow regex so we could test for more of the error message without needing to know/encode the url? Like maybe rendering helm chart: invoking helm: Error: failed to authorize: failed to fetch oauth token: unexpected status from GET request to (.*): 403 Forbidden

// The exact error message returned for GKE WI is:
// credential refresh failed: auth URL returned status 500, body: \"error retrieveing TokenSource.Token: compute: Received 403 `Unable to generate access token; IAM returned 403 Forbidden: Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\thttps://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to
// The exact error message returned for Fleet WI is:
// credential refresh failed: auth URL returned status 500, body: \"error retrieveing TokenSource.Token: oauth2/google: status code 403: {\\n \\\"error\\\": {\\n \\\"code\\\": 403,\\n \\\"message\\\": \\\"Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist).\\\",\\n \\\"status\\\": \\\"PERMISSION_DENIED\\\",\\n \\\"details\\\": [\\n {\\n \\\"@type\\\": \\\"type.googleapis.com/google.rpc.ErrorInfo\\\",\\n \\\"reason\\\": \\\"IAM_PERMISSION_DENIED\\\",\\n \\\"domain\\\": \\\"iam.googleapis.com\\\",\\n \\\"metadata\\\": {\\n \\\"permission\\\": \\\"iam.serviceAccounts.getAccessToken\\\"\\n }\\n }\\n ]\\n }\\n}\\n\\n\"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error is really messy with all the multi-line json escaping. Is there any way we can clean up the error a bit for the user to make it more readable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • error retrieving TokenSource.Token is from our askpass sidecar code.
  • oauth2/google: status code %d: %s is from one of 3 places in the oauth2/google internal library.

I wonder if we could add some code to the askpass sidecar to detect and parse a JSON payload in the error string and do some formatting to pretty-print it so it's more readable like the GKE WI error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, maybe we can modify the oauth2 library to parse it for us...

@nan-yu
Copy link
Contributor Author

nan-yu commented Feb 22, 2024

I accidentally pushed from the wrong fork branch. I'll push a new patch after addressing these comments.

Copy link

@nan-yu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
kpt-config-sync-presubmit-e2e-multi-repo-test-group1 c900b86 link true /test kpt-config-sync-presubmit-e2e-multi-repo-test-group1
kpt-config-sync-presubmit-e2e-multi-repo-test-group2 c900b86 link true /test kpt-config-sync-presubmit-e2e-multi-repo-test-group2
kpt-config-sync-presubmit-e2e-multi-repo-test-group3 c900b86 link true /test kpt-config-sync-presubmit-e2e-multi-repo-test-group3
kpt-config-sync-presubmit c900b86 link true /test kpt-config-sync-presubmit
kpt-config-sync-terraform-presubmit c900b86 link true /test kpt-config-sync-terraform-presubmit

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants