Failing to mount secrets when a new node is scaled up #759

Closed
2 tasks done
sjdweb opened this issue Jan 18, 2022 · 14 comments
Labels: bug (Something isn't working), stale

Comments

sjdweb commented Jan 18, 2022

What steps did you take and what happened:

Every time we deploy this Helm chart, we hit this problem.

Because the release comprises around 200 pods spread across a number of Deployments, each deploy triggers a node scale-up on AKS.

Once a pod is assigned to one of the new nodes, it is unable to mount the secret volume.
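
For reference, the volume in question is an inline CSI volume along these lines (a minimal sketch; the pod, image, and SecretProviderClass names are placeholders rather than our actual manifests):

```yaml
# Illustrative only - all names are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: my-app:latest              # placeholder image
      volumeMounts:
        - name: secrets-store-inline
          mountPath: /mnt/secrets-store
          readOnly: true
  volumes:
    - name: secrets-store-inline
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: my-secret-provider-class   # placeholder SPC name
```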

Errors:

On pod:

driver name secrets-store.csi.k8s.io not found in the list of registered CSI drivers
message: 'Unable to attach or mount volumes: unmounted volumes=[secrets-store-inline], unattached volumes=[secrets-store-inline my-app-7srx2]: timed out waiting for the condition'

In MIC:

2022-01-18T21:18:11.340568206Z stderr F E0118 21:18:11.340460       1 server.go:145] GRPC error: failed to mount secrets store objects for pod my-ns/my-app-5b85748794-cjt52, err: rpc error: code = Canceled desc = context canceled

2022-01-18T21:18:11.338694492Z stderr F I0118 21:18:11.338460       1 nodeserver.go:72] "unmounting target path as node publish volume failed" targetPath="/var/lib/kubelet/pods/70c6d07b-370e-4e0b-bc81-d251e030c3ae/volumes/kubernetes.io~csi/secrets-store-inline/mount" pod="my-ns/my-app-5b85748794-cjt52"

What did you expect to happen:

The pod should run as expected on the new node.

Anything else you would like to add:

If I babysit the deployment and kill the pods after I know the nodes are healthy, the newly created pods are fine.

The problem is that the pods that fail to mount (because the CSI driver is not found yet, or the mount times out) get stuck in a ContainerCreating state.
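
When a pod is stuck like this, the driver registration state on the new node can be checked directly; the node name below is just an example, and the grep pattern may need adjusting for other chart versions:

```sh
# Which CSI drivers has the kubelet registered on the new node?
kubectl get csinode aks-default-38331632-vmss00000l -o yaml

# Are the driver and provider daemonset pods Running on that node yet?
kubectl get pods -n kube-system -o wide | grep -E 'secrets-store|provider-azure'
```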

Which access mode did you use to access the Azure Key Vault instance:

Pod Identity

Environment:

  • Secrets Store CSI Driver version (image tags):
mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.3.0
mcr.microsoft.com/oss/kubernetes-csi/csi-node-driver-registrar:v2.3.0
mcr.microsoft.com/oss/kubernetes-csi/livenessprobe:v2.4.0
  • Azure Key Vault provider version (image tag):
mcr.microsoft.com/oss/azure/secrets-store/provider-azure:v0.2.0
  • AAD Pod Identity version (image tags):
mcr.microsoft.com/oss/azure/aad-pod-identity/nmi:v1.8.6
mcr.microsoft.com/oss/azure/aad-pod-identity/mic:v1.8.6
  • Kubernetes version (kubectl version):
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:52:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.9", GitCommit:"a5e4de7e277a707bd28d448bd75de58b4f1cdc22", GitTreeState:"clean", BuildDate:"2021-11-16T01:09:55Z", GoVersion:"go1.15.14", Compiler:"gc", Platform:"linux/amd64"}
Nodes:
v1.20.9
Ubuntu 18.04.6 LTS
containerd://1.4.9+azure
  • Cluster type: Azure AKS

sjdweb added the bug label on Jan 18, 2022
sjdweb (Author) commented Jan 18, 2022

Here's some more detail as I watch another deployment:

```
Events:
  Type     Reason            Age    From                Message
  ----     ------            ----   ----                -------
  Warning  FailedScheduling  2m52s  default-scheduler   0/8 nodes are available: 8 Insufficient cpu.
  Warning  FailedScheduling  2m52s  default-scheduler   0/8 nodes are available: 8 Insufficient cpu.
  Normal   Scheduled         2m33s  default-scheduler   Successfully assigned my-rest/my-rest-api-54df7d596b-b6qpq to aks-default-38331632-vmss00000l
  Normal   TriggeredScaleUp  2m33s  cluster-autoscaler  pod triggered scale-up: [{aks-default-38331632-vmss 9->10 (max: 50)}]
  Warning  FailedMount       34s    kubelet             MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  FailedMount       31s    kubelet             Unable to attach or mount volumes: unmounted volumes=[secrets-store-inline], unattached volumes=[my-rest-token-7srx2 secrets-store-inline]: timed out waiting for the condition

Events:
  Type     Reason            Age                From                Message
  ----     ------            ----               ----                -------
  Warning  FailedScheduling  4m16s              default-scheduler   0/8 nodes are available: 8 Insufficient cpu.
  Warning  FailedScheduling  4m16s              default-scheduler   0/8 nodes are available: 8 Insufficient cpu.
  Normal   Scheduled         83s                default-scheduler   Successfully assigned my-rest/my-rest-job-workers-processed-consumer-worker-5c459c86c-8rtt8 to aks-default-38331632-vmss00000r
  Normal   TriggeredScaleUp  4m3s               cluster-autoscaler  pod triggered scale-up: [{aks-default-38331632-vmss 8->9 (max: 50)}]
  Warning  FailedMount       15s (x8 over 84s)  kubelet             MountVolume.SetUp failed for volume "secrets-store-inline" : kubernetes.io/csi: mounter.SetUpAt failed to get CSI client: driver name secrets-store.csi.k8s.io not found in the list of registered CSI drivers
```

I see that some pods are in ContainerCreating (over 120s!)

```
kube-system   csi-secrets-azure-csi-secrets-store-provider-azure-kgjtc   ●   0/1   0   ContainerCreating

Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m25s  default-scheduler  Successfully assigned kube-system/secrets-store-csi-driver-jkl6n to aks-default-38331632-vmss00000s
  Normal  Pulled     2m22s  kubelet            Container image "mcr.microsoft.com/oss/kubernetes-csi/csi-node-driver-registrar:v2.3.0" already present on machine
  Normal  Created    2m21s  kubelet            Created container node-driver-registrar
  Normal  Started    2m21s  kubelet            Started container node-driver-registrar
  Normal  Pulling    2m21s  kubelet            Pulling image "mcr.microsoft.com/oss/kubernetes-csi/secrets-store/driver:v0.3.0"
```

aramase (Member) commented Jan 18, 2022

Thanks for reporting the issue!

Unfortunately, this behavior isn't specific to the secrets-store-csi-driver implementation; it's a consequence of how workloads are scheduled in Kubernetes. The CSI driver needs to be running on the node before a volume mount request can be processed, but during a scale-up event there is no way to guarantee that all system pods (the CSI driver, kube-proxy, other pods in the kube-system namespace) are running before the workload pods start.

There was an enhancement proposal centered around this, kubernetes/enhancements#1003, but it was closed. Once the driver and provider pods are running on the new node, the pods waiting on the volume mount will eventually reach Running because kubelet keeps retrying the mount.

aramase (Member) commented Jan 18, 2022

Typically the images for some of these components are baked into the VHD image. If they're not present in the VHD, the image needs to be pulled, which is probably what you're seeing in the describe output.

sjdweb (Author) commented Jan 19, 2022

@aramase thank you for the quick response! I assume there's no known workaround for this?

The behaviour we've observed over ~10 rollouts is that the pods never reach Running without manual intervention (deleting the pod; the replacement then starts fine). Our helm upgrade has a timeout of 25 minutes.

nilekhc (Contributor) commented Jan 19, 2022

@sjdweb, do we know if the driver and provider pods were running at the time of the manual intervention?

sjdweb (Author) commented Jan 19, 2022

@nilekhc yes, all of the other pods relating to the secret provider, identity, etc. were running before the manual intervention.

aramase (Member) commented Jan 19, 2022

> @aramase thank you for the quick response! I assume there's no known workaround for this?

Currently there is no workaround for this. Eventually the pod volume mounts should succeed once the drivers are running, thanks to kubelet's retries. Long term, I think it would be great if node readiness, as described in the KEP mentioned above, becomes a thing, so a node can be marked ready only after the CSI drivers and other system-critical pods are running.

sjdweb (Author) commented Jan 19, 2022

@aramase unfortunately the pod volume mounts never succeed in our case. Today, for example, the containers were stuck in ContainerCreating for 35 minutes without intervention. Killing the pods fixed the issue, but we cannot do this on every deployment.
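
For anyone else hitting this, the manual intervention amounts to something like the following (namespace and label selector are placeholders for our workload, not literal values):

```sh
# Delete the stuck pods (pods in ContainerCreating report phase Pending);
# the ReplicaSet recreates them and, once the driver is registered on the node,
# the new pods mount the volume fine.
kubectl delete pod -n my-ns -l app=my-app --field-selector=status.phase=Pending
```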

aramase (Member) commented Jan 19, 2022

Yeah, I agree that's not a great experience. A few things that would help us debug (a rough sketch of the relevant commands follows the list):

  1. Could you use the v1.0.0 version for the driver and provider?
  2. Share the kubectl events during the timeframe.
  3. If you have access to kubelet logs on the node, that'll be great so we can see how often kubelet is retrying the mount.
  4. Output for kubectl get pods to show when the driver, provider pods started running.
  5. Logs from the driver and provider pods on the new node.
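
A rough sketch of commands that would cover items 2–5 (namespaces, label/pod names, and the driver container name are placeholders and may differ depending on how the chart was installed):

```sh
# 2. Events in the workload namespace during the timeframe
kubectl get events -n my-ns --sort-by=.lastTimestamp

# 4. When the driver/provider pods on the new node started running
kubectl get pods -n kube-system -o wide | grep -E 'secrets-store|provider-azure'

# 5. Logs from the driver and provider pods on the new node
#    (pod names and the container name are placeholders)
kubectl logs -n kube-system <driver-pod-on-new-node> -c secrets-store
kubectl logs -n kube-system <provider-azure-pod-on-new-node>

# 3. Kubelet logs, collected on the node itself (e.g. via SSH or a debug session)
journalctl -u kubelet | grep -i secrets-store
```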

github-actions bot commented Feb 3, 2022

This issue is stale because it has been open 14 days with no activity. Please comment or this will be closed in 7 days.

github-actions bot added the stale label on Feb 3, 2022
github-actions bot commented:
This issue was closed because it has been stalled for 21 days with no activity. Feel free to re-open if you are experiencing the issue again.

vinhnguyen500 commented:
@sjdweb we're running into this issue as well where the pod volume mounts never succeed. Did you manage to find a workaround?

It'd be great if the kubelet would actually retry and eventually succeed, but kicking the pods manually sucks, especially when scale-ups happen automatically with our cluster-autoscaler.

@pdefreitas
Copy link

pdefreitas commented Jun 27, 2022

We're experiencing the same issue, and the only workaround we've found is manual intervention: deleting the pod. When a new pod is scheduled, it is able to mount the secrets successfully. Does anyone have a workaround that avoids the need for manual intervention? This is important for workloads that are very dynamic (autoscaling).
Edit for reference: we're using v1.0.1 and aad-pod-identity v1.8.5.

pdefreitas commented:
I've managed to fix this. In our case it was because the default Helm chart values do not match the PriorityClass used when you install the components through the Azure Portal add-ons installer.

Examples:

@aramase shouldn't the documentation point out that if you install through the Helm chart you should set the PriorityClass accordingly? Out of the box it's quite hard to track down the missing PriorityClass.
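
What worked for us, roughly, was overriding the priority class in the Helm values to match the add-on's. A sketch of such an override is below; the value keys and the system-node-critical class are assumptions that may differ per chart and AKS version:

```yaml
# Sketch only: key names depend on the csi-secrets-store-provider-azure chart
# version, and system-node-critical is an assumed match for the AKS add-on.
linux:
  priorityClassName: system-node-critical
secrets-store-csi-driver:
  linux:
    priorityClassName: system-node-critical
```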

Labels: bug (Something isn't working), stale
Projects: None yet
Development: No branches or pull requests
5 participants