Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use v1 credential provider API in 1.27+ #1269

Merged
merged 6 commits into from
Apr 19, 2023

Conversation

cartermckinnon
Copy link
Member

@cartermckinnon cartermckinnon commented Apr 18, 2023

Issue #, if available:

Closes #1268 .

Description of changes:

This moves to the v1 API for CredentialProviderRequest/CredentialProviderResponse. These API's have graduated to v1.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Testing Done

You can observe the issue by placing a shim in between kubelet and the ecr-credential-provider.

  1. Create cred-provider-shim:
#!/usr/bin/env bash
REQUEST=$(cat -)
LOG_FILE=/tmp/cred-provider-shim.log
function log() {
  echo "$(date)" "$@" >> $LOG_FILE
}
log "request: $REQUEST"
RESPONSE=$(mktemp)
echo "$REQUEST" | /etc/eks/ecr-credential-provider/ecr-credential-provider > $RESPONSE 2>> $LOG_FILE
log "response: $(cat $RESPONSE)"
cat "$RESPONSE"
  1. Modify the ecr-credential-provider-config to use this executable.
  2. Modify kubelet's --image-credential-provider-bin-dir flag to the directory containing the shim executable.
  3. Create a test pod on the node using an image that matches the ECR pattern:
kubectl \
  run test-pod \
  --image 123456789123.dkr.ecr.us-west-2.amazonaws.com/foo \
  --overrides='{"apiVersion": "v1", "spec": {"nodeSelector": { "kubernetes.io/hostname": "$HOSTNAME" }}}' 
  1. Look at /tmp/cred-provider-shim.log:
Tue Apr 18 06:12:14 UTC 2023 request: {"kind":"CredentialProviderRequest","apiVersion":"credentialprovider.kubelet.k8s.io/v1beta1","image":"123456789123.dkr.ecr.us-west-2.amazonaws.com/foo"}
E0418 06:12:14.490885   14903 main.go:161] Error running credential provider plugin: group version credentialprovider.kubelet.k8s.io/v1beta1 is not supported
Tue Apr 18 06:12:14 UTC 2023 response:

You can also change the apiVersion in ecr-credential-provider-config to v1alpha1 and see a token generated successfully.

@cartermckinnon
Copy link
Member Author

cartermckinnon commented Apr 18, 2023

Something is off with our cred provider config.

The ecr-credential-provider only supported the v1alpha1 API prior to this change: kubernetes/cloud-provider-aws#597, and we changed the configuration file to specify v1beta1 back when we launched 1.24: a521047.

We haven't disabled the in-tree credential providers by way of kubelet's DisableKubeletCloudCredentialProviders feature gate. This means that the in-tree ECR provider was always registered. We'd always fall back to the in-tree amazon-ecr credential provider.

If you invoke ecr-credential-provider with v1beta1, it fails:

> echo '{"kind":"CredentialProviderRequest","apiVersion":"credentialprovider.kubelet.k8s.io/v1beta1","image":"test.registry.io/foobar"}' | /etc/eks/ecr-credential-provider/ecr-credential-provider

E0418 02:14:23.151461    4380 main.go:161] Error running credential provider plugin: group version credentialprovider.kubelet.k8s.io/v1beta1 is not supported

@cartermckinnon
Copy link
Member Author

The in-tree ECR credential provider is gone in 1.27.

@cartermckinnon
Copy link
Member Author

cartermckinnon commented Apr 18, 2023

This likely went unnoticed because the stderr from failed cred provider plugin invocations is discarded: https://github.com/kubernetes/kubernetes/blob/master/pkg/credentialprovider/plugin/plugin.go#L413

@prasad0896
Copy link
Contributor

Can you elaborate on the implications of not using v1beta1 for k8sversions 1.24-1.26 ? I understand that we were not using this anyway for those versions. But I would like to understand if this would break any other changes where the credential provider was different.

Copy link
Member

@mmerkes mmerkes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of validation did you do to confirm that your fix works? Probably need to validation a version 1.24-1.26 and 1.27 to ensure it works correctly.

@cartermckinnon
Copy link
Member Author

@mmerkes We'll need to turn on the DisableKubeletCloudCredentialProviders feature gate to force usage of the ecr-credential-provider. If the standard pods (kube-proxy, aws-node, etc.) stabilize, then we're good. I'll post a cluster config file here and my results once that's been run.

@cartermckinnon
Copy link
Member Author

I tested this with 1.26 by:

  1. Create config.yaml:
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: "126-credprovider"
  region: us-west-2
  version: "1.26"
managedNodeGroups:
  - name: nodes
    ami: ami-022441ec63297a0c9
    amiFamily: AmazonLinux2
    minSize: 1
    maxSize: 1
    desiredCapacity: 1
    overrideBootstrapCommand: |
      #!/bin/bash
      echo "$(jq '.featureGates.DisableKubeletCloudCredentialProviders = true' /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
      /etc/eks/bootstrap.sh 126-credprovider --kubelet-extra-args "--node-labels=eks.amazonaws.com/nodegroup=nodes,eks.amazonaws.com/nodegroup-image=ami-022441ec63297a0c9"
  1. eksctl create cluster --config-file config.yaml
  2. Nodes joined the cluster.
  3. Created a test Pod with an ECR image (that is not cached in the AMI, to be certain).
  4. Image pulls succeeded.

@cartermckinnon
Copy link
Member Author

@prasad0896 has tested this with 1.27.

@dims
Copy link
Member

dims commented Apr 18, 2023

@cartermckinnon if a feature flag in an old release is not already on, then i don't think we should go back and switch it on... 2 cents.

@cartermckinnon
Copy link
Member Author

cartermckinnon commented Apr 18, 2023

@dims agree! I only turned this on for testing, otherwise the kubelet falls back to the in-tree amazon-ecr credential provider and it's not obvious if something has gone awry with the plugin.

Also opened this: kubernetes/kubernetes#117448 because debugging this issue was roundabout.

@dims
Copy link
Member

dims commented Apr 18, 2023

whew! thanks Carter!


---

## Image credential provider plugins
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tx for the addition here.

@cartermckinnon cartermckinnon merged commit 42c3f52 into master Apr 19, 2023
@cartermckinnon cartermckinnon deleted the cred-provider-api-migration branch April 19, 2023 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle credentialprovider.kubelet.k8s.io/v1alpha1 -> v1 migration in 1.27+
4 participants