csi-provisioner is unable to create volume via ebs-plugin with no helpful error #214

frittentheke · 2019-02-14T17:52:36Z

/kind bug

What happened?

I installed a fresh Kubernetes 1.13.3 cluster using kubeadm on AWS.
I created the CSINodeInfo CRD via https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/storage-crds/csinodeinfo.yaml

And then applied controller.yaml + node.yaml.

After adjusting the volumes for the controller (see: #212) things started up nicely and I was able to query the CSINodeInfo objects from my nodes just fine.

I created the most basic storage class I could think of:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
reclaimPolicy: Retain
volumeBindingMode: Immediate

and then applied a corresponding PVC:

apiVersion: v1                                                                                                                                                                                                                                                                                                                 
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs 
  resources:
    requests:
      storage: 10Gi

The ebs-csi-controller, or rather the csi-provisioner picked up the request to CreateVolume, but then things fail miserably and without (at least to me) a helpful error message - even when running ebs-plugin with v=9:

+ ebs-csi-controller-0 › cluster-driver-registrar
+ ebs-csi-controller-0 › csi-attacher
+ ebs-csi-controller-0 › csi-provisioner
+ ebs-csi-controller-0 › ebs-plugin
ebs-csi-controller-0 csi-provisioner I0214 17:43:39.263549       1 csi-provisioner.go:82] Version: v1.0.1-0-g9a258a0a
ebs-csi-controller-0 csi-provisioner I0214 17:43:39.263600       1 csi-provisioner.go:96] Building kube configs for running in cluster...
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:37.821382       1 main.go:77] Version: v1.0.1-0-g0660c6c6
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:37.821512       1 main.go:80] Attempting to open a gRPC connection with: "/csi/csi.sock"
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:37.821535       1 connection.go:69] Connecting to /csi/csi.sock
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:37.821813       1 connection.go:96] Still trying, connection is CONNECTING
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:37.822145       1 connection.go:96] Still trying, connection is TRANSIENT_FAILURE
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:38.822216       1 connection.go:96] Still trying, connection is TRANSIENT_FAILURE
ebs-csi-controller-0 csi-attacher I0214 17:43:40.721128       1 main.go:76] Version: v1.0.1-0-gb7dadac
ebs-csi-controller-0 csi-attacher I0214 17:43:40.722553       1 connection.go:89] Connecting to /var/lib/csi/sockets/pluginproxy/csi.sock
ebs-csi-controller-0 ebs-csi-controller-0 csi-attacherebs-csi-controller-0 cluster-driver-registrar I0214 17:43:39.864132       1 connection.go:96] Still trying, connection is TRANSIENT_FAILURE
I0214 17:43:40.722806       1 connection.go:116] Still trying, connection is CONNECTING
 ebs-pluginebs-csi-controller-0 csi-attacher  I0214 17:43:36.330221       1 driver.go:49] Driver: ebs.csi.aws.com Version: 0.3.0-alpha
I0214 17:43:40.722991       1 connection.go:116] Still trying, connection is TRANSIENT_FAILURE
ebs-csi-controller-0ebs-csi-controller-0  csi-attacher cluster-driver-registrarI0214 17:43:41.723107       1 connection.go:116] Still trying, connection is CONNECTING
 I0214 17:43:41.040443       1 connection.go:96] Still trying, connection is CONNECTING
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:41.040493       1 connection.go:96] Still trying, connection is TRANSIENT_FAILURE
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:42.106398       1 connection.go:96] Still trying, connection is CONNECTING
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:42.106437       1 connection.go:96] Still trying, connection is TRANSIENT_FAILURE
ebs-csi-controller-0 ebs-plugin I0214 17:43:42.429605       1 mount_linux.go:174] Cannot run systemd-run, assuming non-systemd OS
ebs-csi-controller-0 ebs-csi-controller-0ebs-plugin I0214 17:43:42.429619       1 mount_linux.go:175] systemd-run failed with: exit status 1
 ebs-csi-controller-0 ebs-plugin I0214 17:43:42.429628       1 mount_linux.go:176] systemd-run output: Failed to create bus connection: No such file or directory
csi-provisionerebs-csi-controller-0  ebs-plugin I0214 17:43:42.375961       1 controller.go:197] Connecting to /var/lib/csi/sockets/pluginproxy/csi.sock
I0214 17:43:42.429859       1 driver.go:93] Listening for connections on address: &net.UnixAddr{Name:"/var/lib/csi/sockets/pluginproxy/csi.sock", Net:"unix"}
ebs-csi-controller-0 csi-provisioner I0214 17:43:42.376205       1 controller.go:224] Still trying, connection is CONNECTING
ebs-csi-controller-0 csi-provisioner I0214 17:43:42.378224       1 controller.go:224] Still trying, connection is TRANSIENT_FAILURE
ebs-csi-controller-0 ebs-csi-controller-0csi-attacher  I0214 17:43:42.765155       1 connection.go:113] Connected
ebs-csi-controller-0ebs-plugin  csi-attacher I0214 17:43:42.765176       1 connection.go:242] GRPC call: /csi.v1.Identity/Probe
I0214 17:43:42.771770       1 identity.go:61] Probe: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 ebs-csi-controller-0csi-attacher I0214 17:43:42.765181       1 connection.go:243] GRPC request: {}
 ebs-csi-controller-0 csi-attacher I0214 17:43:42.772187       1 connection.go:245] GRPC response: {}
ebs-pluginebs-csi-controller-0  csi-attacher I0214 17:43:42.772849       1 connection.go:246] GRPC error: <nil>
I0214 17:43:42.773643       1 identity.go:27] GetPluginInfo: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0ebs-csi-controller-0  csi-attacher I0214 17:43:42.772857       1 main.go:211] Probe succeeded
ebs-csi-controller-0 ebs-plugincsi-attacher  I0214 17:43:42.772880       1 connection.go:242] GRPC call: /csi.v1.Identity/GetPluginInfo
I0214 17:43:42.775224       1 identity.go:37] GetPluginCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-attacher I0214 17:43:42.772887       1 connection.go:243] GRPC request: {}
ebs-csi-controller-0 csi-attacher I0214 17:43:42.773941       1 connection.go:245] GRPC response: {"name":"ebs.csi.aws.com","vendor_version":"0.3.0-alpha"}
ebs-csi-controller-0 csi-attacher I0214 17:43:42.774549       1 connection.go:246] GRPC error: <nil>
ebs-csi-controller-0 csi-attacher I0214 17:43:42.774558       1 main.go:128] CSI driver name: "ebs.csi.aws.com"
ebs-csi-controller-0 csi-attacher I0214 17:43:42.774566       1 connection.go:242] GRPC call: /csi.v1.Identity/GetPluginCapabilities
ebs-csi-controller-0 csi-attacher I0214 17:43:42.774571       1 connection.go:243] GRPC request: {}
ebs-csi-controller-0 csi-attacher I0214 17:43:42.775700       1 connection.go:245] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
ebs-csi-controller-0 csi-attacher I0214 17:43:42.783221       1 connection.go:246] GRPC error: <nil>
ebs-csi-controller-0 csi-attacher I0214 17:43:42.783230       1 connection.go:242] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
ebs-csi-controller-0 csi-attacher I0214 17:43:42.783236       1 connection.go:243] GRPC request: {}
ebs-csi-controller-0ebs-csi-controller-0  csi-attacherebs-plugin I0214 17:43:42.784204       1 controller.go:220] ControllerGetCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
 I0214 17:43:42.785050       1 connection.go:245] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}}]}
ebs-csi-controller-0 csi-attacher I0214 17:43:42.788303       1 connection.go:246] GRPC error: <nil>
ebs-csi-controller-0 csi-attacher I0214 17:43:42.789537       1 main.go:152] CSI driver supports ControllerPublishUnpublish, using real CSI handler
ebs-csi-controller-0 csi-attacher I0214 17:43:42.790980       1 controller.go:111] Starting CSI attacher
ebs-csi-controller-0 ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.081709       1 connection.go:96] Still trying, connection is CONNECTING
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.081811       1 connection.go:93] Connected
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.081820       1 main.go:88] Calling CSI driver to discover driver name.
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.081829       1 connection.go:137] GRPC call: /csi.v1.Identity/GetPluginInfo
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.081834       1 connection.go:138] GRPC request: {}
ebs-csi-controller-0ebs-plugin cluster-driver-registrar I0214 17:43:43.083106       1 identity.go:27] GetPluginInfo: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
 I0214 17:43:43.083821       1 connection.go:140] GRPC response: {"name":"ebs.csi.aws.com","vendor_version":"0.3.0-alpha"}
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.084433       1 connection.go:141] GRPC error: <nil>
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.084441       1 main.go:96] CSI driver name: "ebs.csi.aws.com"
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.084447       1 main.go:100] Loading kubeconfig.
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.084737       1 k8s_register.go:54] AttachRequired: true
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.084765       1 k8s_register.go:55] PodInfoOnMountVersion: 
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.084771       1 k8s_register.go:58] Registering csidrivers
ebs-csi-controller-0 cluster-driver-registrar I0214 17:43:43.104433       1 k8s_register.go:69] CSIDriver CRD already had been registered
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.376479       1 controller.go:224] Still trying, connection is CONNECTING
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.376585       1 controller.go:221] Connected
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.380089       1 controller.go:571] Starting provisioner controller ebs.csi.aws.com_ebs-csi-controller-0_12606c0e-3080-11e9-b3df-b28d31712265!
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.483402       1 controller.go:620] Started provisioner controller ebs.csi.aws.com_ebs-csi-controller-0_12606c0e-3080-11e9-b3df-b28d31712265!
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.483520       1 controller.go:926] provision "default/ebs-claim" class "ebs": started
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.487847       1 controller.go:188] GRPC call: /csi.v1.Identity/GetPluginCapabilities
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.488450       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ebs-claim", UID:"e6d74eec-307b-11e9-a224-0274920412e2", APIVersion:"v1", ResourceVersion:"207179", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/ebs-claim"
ebs-csi-controller-0 ebs-plugin I0214 17:43:43.489633       1 identity.go:37] GetPluginCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.487861       1 controller.go:189] GRPC request: {}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.492706       1 controller.go:191] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.498658       1 controller.go:192] GRPC error: <nil>
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.498671       1 controller.go:188] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.498676       1 controller.go:189] GRPC request: {}
ebs-csi-controller-0 ebs-plugin I0214 17:43:43.502479       1 controller.go:220] ControllerGetCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.503051       1 controller.go:191] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}}]}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.506029       1 controller.go:192] GRPC error: <nil>
ebs-csi-controller-0 ebs-csi-controller-0 csi-provisioner I0214 17:43:43.506043       1 controller.go:188] GRPC call: /csi.v1.Identity/GetPluginInfo
ebs-plugin I0214 17:43:43.507153       1 identity.go:27] GetPluginInfo: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.506047       1 controller.go:189] GRPC request: {}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.507428       1 controller.go:191] GRPC response: {"name":"ebs.csi.aws.com","vendor_version":"0.3.0-alpha"}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.508821       1 controller.go:192] GRPC error: <nil>
ebs-csi-controller-0 ebs-plugin I0214 17:43:43.512210       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.508853       1 controller.go:544] CreateVolumeRequest {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.509013       1 controller.go:188] GRPC call: /csi.v1.Controller/CreateVolume
ebs-csi-controller-0 csi-provisioner I0214 17:43:43.509022       1 controller.go:189] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-e6d74eec-307b-11e9-a224-0274920412e2","parameters":{"type":"gp2"},"volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
ebs-csi-controller-0 ebs-pluginebs-csi-controller-0  E0214 17:43:53.509489       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
csi-provisioner I0214 17:43:53.509243       1 controller.go:191] GRPC response: {}
ebs-csi-controller-0 csi-provisioner I0214 17:43:53.509800       1 controller.go:192] GRPC error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
ebs-csi-controller-0 ebs-csi-controller-0ebs-plugin caused by: context canceled
 csi-provisioner W0214 17:43:53.509816       1 controller.go:592] CreateVolume timeout: 10s has expired, operation will be retried
ebs-csi-controller-0ebs-csi-controller-0 csi-provisioner  I0214 17:43:58.509996       1 controller.go:188] GRPC call: /csi.v1.Controller/CreateVolume
ebs-plugin ebs-csi-controller-0I0214 17:43:58.513424       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
 csi-provisioner I0214 17:43:58.510027       1 controller.go:189] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-e6d74eec-307b-11e9-a224-0274920412e2","parameters":{"type":"gp2"},"volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
ebs-csi-controller-0 ebs-plugin E0214 17:44:08.510374       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
ebs-csi-controller-0 csi-provisioner I0214 17:44:08.510168       1 controller.go:191] GRPC response: {}
ebs-csi-controller-0ebs-csi-controller-0  ebs-plugincsi-provisioner  caused by: context canceled
I0214 17:44:08.510648       1 controller.go:192] GRPC error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
ebs-csi-controller-0 csi-provisioner W0214 17:44:08.510662       1 controller.go:592] CreateVolume timeout: 10s has expired, operation will be retried
ebs-csi-controller-0ebs-csi-controller-0  ebs-plugincsi-provisioner I0214 17:44:14.510855       1 controller.go:188] GRPC call: /csi.v1.Controller/CreateVolume
 I0214 17:44:14.513327       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
ebs-csi-controller-0 csi-provisioner I0214 17:44:14.510911       1 controller.go:189] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-e6d74eec-307b-11e9-a224-0274920412e2","parameters":{"type":"gp2"},"volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
ebs-csi-controller-0 ebs-plugin E0214 17:44:24.511174       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
ebs-csi-controller-0ebs-csi-controller-0  csi-provisionerebs-plugin  caused by: context deadline exceeded
I0214 17:44:24.511042       1 controller.go:191] GRPC response: {}
ebs-csi-controller-0 csi-provisioner I0214 17:44:24.511578       1 controller.go:192] GRPC error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
ebs-csi-controller-0 csi-provisioner W0214 17:44:24.511592       1 controller.go:592] CreateVolume timeout: 10s has expired, operation will be retried
ebs-csi-controller-0 ebs-pluginebs-csi-controller-0  I0214 17:44:31.714097       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
csi-provisioner I0214 17:44:31.711803       1 controller.go:188] GRPC call: /csi.v1.Controller/CreateVolume
ebs-csi-controller-0 csi-provisioner I0214 17:44:31.711829       1 controller.go:189] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-e6d74eec-307b-11e9-a224-0274920412e2","parameters":{"type":"gp2"},"volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
ebs-csi-controller-0ebs-csi-controller-0  csi-provisioner I0214 17:44:41.711943       1 controller.go:191] GRPC response: {}
ebs-pluginebs-csi-controller-0 csi-provisioner  I0214 17:44:41.715233       1 controller.go:192] GRPC error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
E0214 17:44:41.712073       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
ebs-csi-controller-0 ebs-csi-controller-0csi-provisioner  W0214 17:44:41.715293       1 controller.go:592] CreateVolume timeout: 10s has expired, operation will be retried
ebs-plugin caused by: context deadline exceeded

What you expected to happen?

I expected either csi-provisioner to create a volume or ebs-plugin to return a helpful error message about i.e. the failing API call.

Environment

Kubernetes version (use kubectl version): 1.13.3
Driver version: latest (0.3.0alpha)

The text was updated successfully, but these errors were encountered:

leakingtapan · 2019-02-14T18:38:24Z

From the log:

ebs-csi-controller-0 csi-provisioner I0214 17:43:43.509022       1 controller.go:189] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-e6d74eec-307b-11e9-a224-0274920412e2","parameters":{"type":"gp2"},"volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
ebs-csi-controller-0 ebs-pluginebs-csi-controller-0  E0214 17:43:53.509489       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled

The CreateVolume is timed out after 10s which is the default value for external provisioner.

Could you paste the log for the ebs csi driver in the csi-controller pod?

frittentheke · 2019-02-14T20:47:41Z

Thanks for the quick reply @leakingtapan !

Certainly you may get the logs:

E0214 20:23:00.786708       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context canceled
I0214 20:39:40.797478       1 identity.go:37] GetPluginCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0214 20:39:40.799849       1 controller.go:220] ControllerGetCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0214 20:39:40.802181       1 identity.go:27] GetPluginInfo: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0214 20:39:40.805212       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:39:50.803210       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
I0214 20:39:55.808528       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:40:05.805163       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context canceled
I0214 20:40:11.807830       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:40:21.806068       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
I0214 20:40:29.008532       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:40:39.006983       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
I0214 20:40:47.649623       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:40:57.647926       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
I0214 20:41:08.020603       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:41:18.016828       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context canceled
I0214 20:41:30.461074       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:41:40.459776       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
I0214 20:41:55.393744       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:42:05.390616       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context canceled
I0214 20:42:23.309213       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:42:33.307443       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
I0214 20:42:54.810461       1 controller.go:49] CreateVolume: called with args {Name:pvc-e6d74eec-307b-11e9-a224-0274920412e2 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:gp2] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
E0214 20:43:04.807458       1 driver.go:80] GRPC error: rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context canceled

The initially posted logs contain output from all containers in the controller pod (fetched using stern) cluttered together.

leakingtapan · 2019-02-15T06:49:04Z

Tested v1.13.3 cluster created using kops. I was not able to repo the issue. Is this consistently happen to you? What are the feature gates turned on? Also, could you check on aws console (or using cli) to see what's the status of the EBS volume? Is there any EBS volume got created with tag CSIVolumeName that has the volume name being created eg pvc-e6d74eec-307b-11e9-a224-0274920412e2 ?

leakingtapan · 2019-02-15T06:53:06Z

NVM about the feature gate. Saw your reply from #213

frittentheke · 2019-02-15T07:55:54Z

@leakingtapan there is no volume created at all, not even without certain tags.
Thats why I was wondering if it's possible to turn on full AWS API request debugging somehow to see if the ebs-plugin is even trying to do anything.

Could you maybe look at the feature gates and other options you have for your kubelet and Kube-API server? Regarding #213 ... am I mistaken about the "wrong" placement of the csi.socket files somehow? But I suppose if the csi-provisioner is picking up the requested PVC and then is able to talk to ebs-plugin in the same pod, it should at least lead to a volume being created with AWS, right?

frittentheke · 2019-02-15T15:49:42Z

@leakingtapan
Luckly or sadly we were able to identify the root cause of this issue to be fully unrelated to the aws-ebs-csi-driver. The ebs-plugin was unable to resolve the IP of the AWS API via DNS. So the plugin initialized just fine (using the metadata service), but then was unable to send out any requests to the AWS API. If I may kindly ask to add more logging of the ebs-plugin when doing API requests? A failed DNS request should be distinguishable from timeout reaching the AWS API.

The only thing left to say is: Thank you.

leakingtapan · 2019-02-15T17:50:48Z

@frittentheke thx for root cause. We definite want to make improvements on such debugability issue. I added an issue to track this as a feature request.

shishirkh · 2021-11-28T19:45:30Z

I think I am facing the same issue - cannot be sure as I am not seeing any DNS error in the logs.

@frittentheke How did you solve it when you faced the DNS issue?

throrin19 · 2022-06-08T06:50:39Z

Same problem as @shishirkh here. Anybody to help ?

namgk · 2022-09-29T19:15:52Z

we've got the same issue running k8s 1.23

failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-c7610d24-aa0c-42ca-bf23-f07b0b06139f": could not create volume in EC2: RequestCanceled: request context canceled caused by: context deadline exceeded

sharif997 · 2022-10-02T17:39:41Z

we've got the same issue running k8s 1.23

failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-c7610d24-aa0c-42ca-bf23-f07b0b06139f": could not create volume in EC2: RequestCanceled: request context canceled caused by: context deadline exceeded

I am facing the same issue , any luck solving it ?

Kingshuk-Kubera · 2022-10-27T08:02:19Z

I followed below steps and it worked -
https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html

edrimon · 2023-03-15T20:09:26Z

@leakingtapan Luckly or sadly we were able to identify the root cause of this issue to be fully unrelated to the aws-ebs-csi-driver. The ebs-plugin was unable to resolve the IP of the AWS API via DNS. So the plugin initialized just fine (using the metadata service), but then was unable to send out any requests to the AWS API. If I may kindly ask to add more logging of the ebs-plugin when doing API requests? A failed DNS request should be distinguishable from timeout reaching the AWS API.

The only thing left to say is: Thank you.

Hi, I think I have similar issue. How you managed at the end to perform debug to identify the cause? For example how can I find exactly which FQDN it tries to resolve and it fails?

ricardorqr · 2023-03-24T01:40:34Z

I'm also having the same problem. No solution?

failed to provision volume with StorageClass "ebs-sc": rpc error: code = Internal desc = Could not create volume "pvc-8eb69f7e-fa16-45bb-803e-295cb924cf81": could not create volume in EC2: RequestCanceled: request context canceled caused by: context deadline exceeded

Kingshuk-Kubera · 2023-03-24T02:34:02Z

I was following this https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html, and I made two mistakes, which caused the issue -

Creating the Amazon EBS CSI driver IAM role for service accounts - (https://docs.aws.amazon.com/eks/latest/userguide/csi-iam-role.html), check role has been created properly. Check "Trust Relationships" of the role, click edit in AWS console and check for any validation errors. For, me, there were validation error.
Creating an IAM OIDC provider for your cluster - Again, this has been created properly. It has at least one thumbprint.

ricardorqr · 2023-03-25T02:54:09Z

I was following this https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html, and I made two mistakes, which caused the issue -

Creating the Amazon EBS CSI driver IAM role for service accounts - (https://docs.aws.amazon.com/eks/latest/userguide/csi-iam-role.html), check role has been created properly. Check "Trust Relationships" of the role, click edit in AWS console and check for any validation errors. For, me, there were validation error.

Creating an IAM OIDC provider for your cluster - Again, this has been created properly. It has at least one thumbprint.

Man, have I said that I love you???????? 🤣🤣🤣🤣🤣 You solved my problem.

Yeah, I made a few mistakes too. Let me show you how I fixed it. Mine was a bit similar to yours.

I was following this tutorial: https://repost.aws/knowledge-center/eks-persistent-storage However, I noticed that my cluster didn't have OIDC provider. So I was following the next
Then, I was following this one https://repost.aws/knowledge-center/eks-troubleshoot-oidc-and-irsa to add OIDC provider to my cluster
Lastly, I follow this one https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html

After all, everything was working nicely. Thanks again, brother. 🤜🏼🤛🏼

fl-max · 2023-07-25T17:03:57Z

My issue was an I/O Timeout error while making a request to STS. To debug this issue, you'll want to enable aws-sdk-debug-log ouput (see docs).

NOTE: If using Terraform, you can enable this option with the following configuration:

resource "aws_eks_addon" "example" {
    ....
    configuration_values = jsonencode({
    controller: {
        sdkDebugLog = true
    }
  })
}

Once enabled, and your EBS controller pods have restarted, check the ebs-plugin container logs in the ebs-csi-controller pod. After doing this, I noticed the following in my logs:

DEBUG: Send Request sts/AssumeRoleWithWebIdentity failed, attempt 1/8, error RequestError: send request failed
caused by: Post "https://sts.us-east-1.amazonaws.com/": dial tcp x.x.x.x:443: i/o timeout

To fix, I added the VPC Service Endpoint for com.amazonaws.REGION.sts to the VPC my EKS cluster was deployed in.

d3bt3ch · 2023-09-01T19:33:25Z

I am having the same issue. In our case we are not using EKS and provisioning our clusters with ClusterAPI. Any help?

bck01215 · 2023-09-15T23:17:47Z

@fl-max could you link a guide to creating the VPC service endpoint. I have one created now with the same subnets, but am still getting the same errors you were when debugging is turned on.

EDIT:

It's only failing in two of my subnets

edrimon · 2023-09-16T05:16:21Z

The issue was related to DNS resolution of the AWS API. In my case I just hardcoded the IP address, I added the following lines (hostaliases): kubectl edit deployments.apps -n kube-system ebs-csi-controller dnsPolicy: ClusterFirst hostAliases: - hostnames: - ec2.eu-west-1.amazonaws.com ip: 67.220.226.37 Also the following should be working, to get the metadata info needed to connect. http://169.254.169.254/latest/meta-data/

…

On Sat, Sep 16, 2023 at 2:17 AM Brandon Kauffman ***@***.***> wrote: @fl-max <https://github.com/fl-max> could you link a guide to creating the VPC service endpoint. I have one created now with the same subnets, but am still getting the same errors you were when debugging is turned on. — Reply to this email directly, view it on GitHub <#214 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAJAG4D322AUAHP3I7VPQATX2TO2NANCNFSM4GXQPZAQ> . You are receiving this because you commented.Message ID: ***@***.***>

bck01215 · 2023-09-16T11:59:10Z

@edrimon I ended up having the same issue. Opening port 53/udp in the ACL worked.

jtmarmon · 2023-12-17T06:02:17Z

I ran into this issue and the DNS comments above were a helpful tip. In our case, I had recently installed a CNI plugin (Cilium) and forgot to restart the addon pods. Rolling restart of the ebs-csi-controller deployment immediately fixed it for us

arathore47 · 2024-02-29T07:29:39Z

Rolling restart of the ebs-csi-controller as well as the ebs-csi-node refreshed the DNS cache and resolved the issue for us

CameronHudson8 · 2024-03-14T22:24:00Z

My issue was an I/O Timeout error while making a request to STS. To debug this issue, you'll want to enable aws-sdk-debug-log ouput (see docs).

Thank you so much @fl-max. After adding the flag --aws-sdk-debug-log to my ebs-csi-controller deployment, I found that DNS resolution for sts.ap-northeast-1.amazonaws.com was working, but the ebs-csi-controller was getting I/O timeouts when trying to connect to sts.ap-northeast-1.amazonaws.com:443.

Since I currently have no way to filter egress by hostname, I opened up egress with the following network policy. After that, traffic began flowing from the ebs-csi-controller, and the PV successfully provisioned.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ebs-csi-controller-egress
spec:
  egress:
    - ports:
        - port: 443
          protocol: TCP
  podSelector:
    matchLabels:
      app: ebs-csi-controller
  policyTypes:
    - Egress

AjayEdupuganti · 2024-05-28T20:20:25Z

This solved the issue for me.
first i did
nslookup ec2.ap-south-1.amazonaws.com
it gave

Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
Name:   ec2.ap-south-1.amazonaws.com
Address: 52.95.80.15

Then i edited ebs-csi-controller configuration

kubectl edit deployment -n kube-system ebs-csi-controller
added dnspolicy and hostaliases.

spec:
  template:
    spec:
      dnsPolicy: ClusterFirst
      hostAliases:
      - hostnames:
        - ec2.ap-south-1.amazonaws.com
        ip: 52.95.80.15

thats it. hope it helps someone

k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 14, 2019

frittentheke closed this as completed Feb 15, 2019

leakingtapan mentioned this issue Feb 15, 2019

Implement Deep health check in driver probe #216

Closed

hectorj2f mentioned this issue Jun 20, 2019

calico: use host-local and pod subnet cidr mesosphere-backup/kubeaddons-configs#61

Closed

frittentheke mentioned this issue Oct 1, 2020

Report volume usage #499

Closed

zjalicflw mentioned this issue Oct 17, 2023

Weird Rpc error: code = DeadlineExceeded desc = context deadline exceeded and error listing AWS instances: RequestCanceled: request context canceled #1783

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

csi-provisioner is unable to create volume via ebs-plugin with no helpful error #214

csi-provisioner is unable to create volume via ebs-plugin with no helpful error #214

frittentheke commented Feb 14, 2019 •

edited

Loading

leakingtapan commented Feb 14, 2019

frittentheke commented Feb 14, 2019 •

edited

Loading

leakingtapan commented Feb 15, 2019 •

edited

Loading

leakingtapan commented Feb 15, 2019

frittentheke commented Feb 15, 2019

frittentheke commented Feb 15, 2019

leakingtapan commented Feb 15, 2019 •

edited

Loading

shishirkh commented Nov 28, 2021

throrin19 commented Jun 8, 2022

namgk commented Sep 29, 2022

sharif997 commented Oct 2, 2022

Kingshuk-Kubera commented Oct 27, 2022 •

edited

Loading

edrimon commented Mar 15, 2023

ricardorqr commented Mar 24, 2023

Kingshuk-Kubera commented Mar 24, 2023

ricardorqr commented Mar 25, 2023 •

edited

Loading

fl-max commented Jul 25, 2023

d3bt3ch commented Sep 1, 2023

bck01215 commented Sep 15, 2023 •

edited

Loading

edrimon commented Sep 16, 2023 via email

bck01215 commented Sep 16, 2023

jtmarmon commented Dec 17, 2023

arathore47 commented Feb 29, 2024

CameronHudson8 commented Mar 14, 2024 •

edited

Loading

AjayEdupuganti commented May 28, 2024

csi-provisioner is unable to create volume via ebs-plugin with no helpful error #214

csi-provisioner is unable to create volume via ebs-plugin with no helpful error #214

Comments

frittentheke commented Feb 14, 2019 • edited Loading

leakingtapan commented Feb 14, 2019

frittentheke commented Feb 14, 2019 • edited Loading

leakingtapan commented Feb 15, 2019 • edited Loading

leakingtapan commented Feb 15, 2019

frittentheke commented Feb 15, 2019

frittentheke commented Feb 15, 2019

leakingtapan commented Feb 15, 2019 • edited Loading

shishirkh commented Nov 28, 2021

throrin19 commented Jun 8, 2022

namgk commented Sep 29, 2022

sharif997 commented Oct 2, 2022

Kingshuk-Kubera commented Oct 27, 2022 • edited Loading

edrimon commented Mar 15, 2023

ricardorqr commented Mar 24, 2023

Kingshuk-Kubera commented Mar 24, 2023

ricardorqr commented Mar 25, 2023 • edited Loading

fl-max commented Jul 25, 2023

d3bt3ch commented Sep 1, 2023

bck01215 commented Sep 15, 2023 • edited Loading

edrimon commented Sep 16, 2023 via email

bck01215 commented Sep 16, 2023

jtmarmon commented Dec 17, 2023

arathore47 commented Feb 29, 2024

CameronHudson8 commented Mar 14, 2024 • edited Loading

AjayEdupuganti commented May 28, 2024

frittentheke commented Feb 14, 2019 •

edited

Loading

frittentheke commented Feb 14, 2019 •

edited

Loading

leakingtapan commented Feb 15, 2019 •

edited

Loading

leakingtapan commented Feb 15, 2019 •

edited

Loading

Kingshuk-Kubera commented Oct 27, 2022 •

edited

Loading

ricardorqr commented Mar 25, 2023 •

edited

Loading

bck01215 commented Sep 15, 2023 •

edited

Loading

CameronHudson8 commented Mar 14, 2024 •

edited

Loading