Incompatible with grpc health probes #165

Open
mikberg opened this issue Dec 23, 2022 · 11 comments

mikberg commented Dec 23, 2022

What happened:

gRPC probes are enabled by default as of Kubernetes 1.24 and add a new grpc field to Probe (used in readinessProbe and livenessProbe). The pod identity webhook appears to be incompatible with this. Pods whose service account carries the eks.amazonaws.com/role-arn annotation can't be created:

The Pod "test-grpc" is invalid: spec.containers[0].readinessProbe: Required value: must specify a handler type

What you expected to happen:

gRPC probes and the pod identity webhook should both work on the same pod.

How to reproduce it (as minimally and precisely as possible):

Observe that this example pod can be deployed and works as expected:

apiVersion: v1
kind: Pod
metadata:
  name: test-grpc
spec:
  containers:
  - name: agnhost
    image: k8s.gcr.io/e2e-test-images/agnhost:2.35
    command: ["/agnhost", "grpc-health-checking"]
    ports:
    - containerPort: 5000
    - containerPort: 8080
    readinessProbe:
      grpc:
        port: 5000

Create a Kubernetes service account my-sa with the eks.amazonaws.com/role-arn annotation set, and try to use it in a new pod:

apiVersion: v1
kind: Pod
metadata:
  name: test-grpc
spec:
  serviceAccountName: my-sa
  containers:
  - name: agnhost
    image: k8s.gcr.io/e2e-test-images/agnhost:2.35
    command: ["/agnhost", "grpc-health-checking"]
    ports:
    - containerPort: 5000
    - containerPort: 8080
    readinessProbe:
      grpc:
        port: 5000

This error message is returned:

The Pod "test-grpc" is invalid: spec.containers[0].readinessProbe: Required value: must specify a handler type
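For context, the error above is what you would expect if the probe reaches validation with no handler set at all. The following is only a minimal sketch of that kind of check, not the actual Kubernetes validation code: the `Probe`, `GRPCAction`, and `HTTPGetAction` types and the `validateProbe` function are simplified, hypothetical stand-ins, and the second error string is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// GRPCAction and HTTPGetAction are hypothetical, simplified handler types.
type GRPCAction struct{ Port int }
type HTTPGetAction struct{ Port int }

// Probe is a simplified stand-in for the Kubernetes Probe type.
// Handlers are pointers so "not set" can be told apart from "set".
type Probe struct {
	HTTPGet *HTTPGetAction
	GRPC    *GRPCAction
}

// validateProbe rejects probes that do not have exactly one handler.
// It only mimics the shape of the real check (assumption).
func validateProbe(p Probe) error {
	handlers := 0
	if p.HTTPGet != nil {
		handlers++
	}
	if p.GRPC != nil {
		handlers++
	}
	if handlers == 0 {
		return errors.New("Required value: must specify a handler type")
	}
	if handlers > 1 {
		return errors.New("Forbidden: may not specify more than 1 handler type")
	}
	return nil
}

func main() {
	// The probe as written in the manifest.
	fmt.Println(validateProbe(Probe{GRPC: &GRPCAction{Port: 5000}}))
	// The same probe after something upstream has dropped the grpc handler.
	fmt.Println(validateProbe(Probe{}))
}
```

If this reading is right, the manifest itself is fine and the handler is lost somewhere between submission and validation.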

Anything else we need to know?:

Environment:

  • AWS Region: eu-north-1
  • EKS Platform version: eks.3
  • Kubernetes version: 1.24
  • Webhook Version: ?

lareeth commented Feb 1, 2023

I have also just encountered this issue. Is there a fix or a workaround available yet?

lareeth commented Feb 1, 2023

To add to this: the audit log shows the contents of the webhook's patch, which has stripped the grpc: {} element from the parent readinessProbe:

{
	"configuration": "pod-identity-webhook",
	"webhook": "iam-for-pods.amazonaws.com",
	"patch": [
		{
			"op": "add",
			"path": "/spec/volumes/0",
			"value": {
				"name": "aws-iam-token",
				"projected": {
					"sources": [
						{
							"serviceAccountToken": {
								"audience": "sts.amazonaws.com",
								"expirationSeconds": 86400,
								"path": "token"
							}
						}
					]
				}
			}
		},
		{
			"op": "add",
			"path": "/spec/containers",
			"value": [
				{
					"name": "<removed>",
					"image": "<removed>",
					"ports": [
						{
							"name": "http",
							"containerPort": 80,
							"protocol": "TCP"
						}
					],
					"env": [
						{
							"name": "AWS_STS_REGIONAL_ENDPOINTS",
							"value": "regional"
						},
						{
							"name": "AWS_DEFAULT_REGION",
							"value": "eu-west-1"
						},
						{
							"name": "AWS_REGION",
							"value": "eu-west-1"
						},
						{
							"name": "AWS_ROLE_ARN",
							"value": "arn:aws:iam::<removed>:role/<removed>"
						},
						{
							"name": "AWS_WEB_IDENTITY_TOKEN_FILE",
							"value": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
						}
					],
					"resources": {},
					"volumeMounts": [
						{
							"name": "kube-api-access-dgm7x",
							"readOnly": true,
							"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
						},
						{
							"name": "aws-iam-token",
							"readOnly": true,
							"mountPath": "/var/run/secrets/eks.amazonaws.com/serviceaccount"
						}
					],
					"readinessProbe": {
						"timeoutSeconds": 1,
						"periodSeconds": 10,
						"successThreshold": 1,
						"failureThreshold": 3
					},
					"terminationMessagePath": "/dev/termination-log",
					"terminationMessagePolicy": "File",
					"imagePullPolicy": "IfNotPresent",
					"securityContext": {}
				}
			]
		}
	],
	"patchType": "JSONPatch"
}
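One plausible explanation for a patch like the one above (an assumption, not something confirmed from the webhook source) is that the pod is decoded into Go types that predate the grpc handler and then re-encoded, with the whole /spec/containers list replaced by the result. Go's encoding/json silently ignores unknown fields when decoding, so the field would disappear without any error. A minimal sketch, where `oldProbe`, `oldContainer`, and `roundTrip` are hypothetical names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// httpGet is a hypothetical handler type that the old API version knows about.
type httpGet struct {
	Port int `json:"port"`
}

// oldProbe is a hypothetical stand-in for a Probe type compiled against an
// API version that has no grpc handler.
type oldProbe struct {
	HTTPGet        *httpGet `json:"httpGet,omitempty"`
	TimeoutSeconds int      `json:"timeoutSeconds,omitempty"`
	PeriodSeconds  int      `json:"periodSeconds,omitempty"`
}

// oldContainer is a heavily trimmed stand-in for a container spec.
type oldContainer struct {
	Name           string    `json:"name"`
	ReadinessProbe *oldProbe `json:"readinessProbe,omitempty"`
}

// roundTrip decodes a container into the old types and encodes it again.
// Fields unknown to the old types are dropped silently on decode.
func roundTrip(in []byte) ([]byte, error) {
	var c oldContainer
	if err := json.Unmarshal(in, &c); err != nil {
		return nil, err
	}
	return json.Marshal(c)
}

func main() {
	in := []byte(`{"name":"agnhost","readinessProbe":{"grpc":{"port":5000},"timeoutSeconds":1,"periodSeconds":10}}`)
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	// The grpc key is missing from the re-encoded container.
	fmt.Println(string(out))
}
```

The re-encoded readinessProbe keeps only the timing fields, which matches the shape seen in the audit log.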

lareeth commented Feb 1, 2023

I have pulled down the code and added a test for gRPC. I can replicate this on v0.3.0, but it seems to work with v0.4.0. Is there an easy way to verify which version EKS is running? And is there a way to update it?

# make test
go test -coverprofile=coverage.out ./...
?       github.com/aws/amazon-eks-pod-identity-webhook  [no test files]
?       github.com/aws/amazon-eks-pod-identity-webhook/hack/self-hosted [no test files]
?       github.com/aws/amazon-eks-pod-identity-webhook/pkg      [no test files]
ok      github.com/aws/amazon-eks-pod-identity-webhook/pkg/cache        0.097s  coverage: 41.7% of statements
ok      github.com/aws/amazon-eks-pod-identity-webhook/pkg/cache/debug  0.578s  coverage: 50.0% of statements
ok      github.com/aws/amazon-eks-pod-identity-webhook/pkg/cert 0.007s  coverage: 69.2% of statements
--- FAIL: TestUpdatePodSpec (0.01s)
    --- FAIL: TestUpdatePodSpec/Pod_balajilovesoreos_in_file_testdata/rawPodWithGrpc.pod.yaml (0.00s)
        handler_pod_test.go:162: Expected patch didn't match:
            Got
                [{"op":"add","path":"/spec/volumes","value":[{"name":"aws-iam-token","projected":{"sources":[{"serviceAccountToken":{"audience":"sts.amazonaws.com","expirationSeconds":86400,"path":"token"}}]}}]},{"op":"add","path":"/spec/containers","value":[{"name":"balajilovesoreos","image":"amazonlinux","env":[{"name":"AWS_ROLE_ARN","value":"arn:aws:iam::111122223333:role/s3-reader"},{"name":"AWS_WEB_IDENTITY_TOKEN_FILE","value":"/var/run/secrets/eks.amazonaws.com/serviceaccount/token"}],"resources":{},"volumeMounts":[{"name":"aws-iam-token","readOnly":true,"mountPath":"/var/run/secrets/eks.amazonaws.com/serviceaccount"}],"readinessProbe":{}}]}]
            Wanted:
                [{"op":"add","path":"/spec/volumes","value":[{"name":"aws-iam-token","projected":{"sources":[{"serviceAccountToken":{"audience":"sts.amazonaws.com","expirationSeconds":86400,"path":"token"}}]}}]},{"op":"add","path":"/spec/containers","value":[{"name":"balajilovesoreos","image":"amazonlinux","env":[{"name":"AWS_ROLE_ARN","value":"arn:aws:iam::111122223333:role/s3-reader"},{"name":"AWS_WEB_IDENTITY_TOKEN_FILE","value":"/var/run/secrets/eks.amazonaws.com/serviceaccount/token"}],"resources":{},"volumeMounts":[{"name":"aws-iam-token","readOnly":true,"mountPath":"/var/run/secrets/eks.amazonaws.com/serviceaccount"}],"readinessProbe":{"grpc":{"port":80,"service":""}}}]}]
E0201 17:47:01.970551   51933 handler.go:453] Content-Type=application/xml, expected application/json
E0201 17:47:01.970974   51933 handler.go:461] Can't decode body: couldn't get version/kind; json parse error: unexpected end of JSON input
E0201 17:47:01.971337   51933 handler.go:374] Could not unmarshal raw object: json: cannot unmarshal string into Go value of type v1.Pod
E0201 17:47:01.971366   51933 handler.go:375] Object: "\"metadata\":{\"name\":\"fake\""
FAIL

chickenbeef (Contributor) commented

The amazon-eks-pod-identity-webhook runs on the EKS control plane and is managed by EKS, so you won't be able to update it manually on your end.

lareeth commented Feb 4, 2023

Yeah, I'm aware. As I asked above, I was hoping there might be a workaround or a different fix. As it currently stands, gRPC health checks are broken on EKS 1.24, which is a major issue.

dims (Member) commented Feb 4, 2023

@lareeth Do you mind filing it here - https://github.com/aws/containers-roadmap/issues (if you haven't already raised it with EKS folks, do you have a ticket?)

chickenbeef (Contributor) commented Feb 4, 2023

@lareeth A workaround I tested is to install the webhook manually into the cluster. This creates a pod-identity-webhook pod running in the data plane, outside of EKS management, so you will be responsible for monitoring it.

This is of course not ideal but should unblock you for further testing. Once the new version of the webhook is released onto EKS, you can revert to the EKS-managed pod-identity-webhook.

lareeth commented Feb 4, 2023

> @lareeth Do you mind filing it here - https://github.com/aws/containers-roadmap/issues (if you haven't already raised it with EKS folks, do you have a ticket?)

I'll raise a ticket there and see what they say.

> @lareeth A workaround I tested is to install the webhook manually into the cluster. This creates a pod-identity-webhook pod running in the data plane, outside of EKS management, so you will be responsible for monitoring it.
>
> This is of course not ideal but should unblock you for further testing. Once the new version of the webhook is released onto EKS, you can revert to the EKS-managed pod-identity-webhook.

I'll give this a try. We are using Flux, so it should be easy to revert once it's fixed. Thanks for the suggestion.

thiduzz commented Feb 16, 2023

Another workaround, a bit more radical than the one proposed above: instead of using EKS Pod Identity (eks.amazonaws.com/role-arn) with a service account, enable kube2iam at the node level for the namespace of your deployment, remove the service account, and add the role ('iam.amazonaws.com/role': yourPodRoleArn) to the annotations of the pod template. It's not fancy, but it circumvents the issue entirely.

soasurs commented Sep 5, 2023

The same issue is still present in EKS v1.27.1-eks-2f008fe.

dims (Member) commented Sep 5, 2023

> same issue still in EKS v1.27.1-eks-2f008fe

@soasurs please open a service ticket and ask them to investigate
