csi-provisioner is unable to create volume via ebs-plugin with no helpful error #214
Comments
From the log:
The Could you paste the log for the ebs csi driver in the csi-controller pod? |
Thanks for the quick reply @leakingtapan ! Certainly you may get the logs:
The initially posted logs contain output from all containers in the controller pod (fetched using stern) cluttered together. |
I tested on a v1.13.3 cluster created using kops and was not able to reproduce the issue. Does this happen consistently for you? Which feature gates are turned on? Also, could you check in the AWS console (or using the CLI) what the status of the EBS volume is? Was any EBS volume created with tag |
NVM about the feature gate. Saw your reply from #213 |
@leakingtapan there is no volume created at all, not even without certain tags. Could you maybe look at the feature gates and other options you have for your kubelet and Kube-API server? Regarding #213 ... am I mistaken about the "wrong" placement of the csi.socket files somehow? But I suppose if the csi-provisioner is picking up the requested PVC and then is able to talk to ebs-plugin in the same pod, it should at least lead to a volume being created with AWS, right? |
@leakingtapan The only thing left to say is: Thank you. |
@frittentheke thanks for finding the root cause. We definitely want to improve debuggability for issues like this. I added an issue to track this as a feature request. |
I think I am facing the same issue - cannot be sure as I am not seeing any DNS error in the logs. @frittentheke How did you solve it when you faced the DNS issue? |
Same problem as @shishirkh here. Can anybody help? |
We've got the same issue running k8s 1.23.
|
I am facing the same issue. Any luck solving it? |
I followed the steps below and it worked - |
Hi, I think I have a similar issue. How did you manage to debug it in the end and identify the cause? For example, how can I find out exactly which FQDN it tries to resolve and fails on? |
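One rough way to narrow down which name fails to resolve is to try the likely AWS endpoints one by one from a pod on the same network as the controller. The sketch below is only an illustration, not something the driver provides: `resolve_endpoints` is a hypothetical helper, and the endpoint names you pass in (for example `ec2.<region>.amazonaws.com` and `sts.<region>.amazonaws.com`) are assumptions based on the hosts mentioned elsewhere in this thread.

```python
import socket


def resolve_endpoints(hostnames, resolver=socket.getaddrinfo):
    """Try to resolve each hostname and report the result per name.

    Returns a dict mapping each hostname to either a sorted list of IP
    addresses or a string starting with "ERROR:" if resolution failed.
    The resolver argument exists only so the lookup can be swapped out
    (hypothetical helper, not part of the EBS CSI driver).
    """
    results = {}
    for name in hostnames:
        try:
            infos = resolver(name, 443)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] holds the IP address.
            results[name] = sorted({info[4][0] for info in infos})
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[name] = f"ERROR: {exc}"
    return results
```

Calling `resolve_endpoints(["ec2.eu-west-1.amazonaws.com", "sts.eu-west-1.amazonaws.com"])` from a debug pod should show which of the names, if any, cannot be resolved there. The region is an assumption; use your own.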
I'm also having the same problem. No solution?
|
I was following this https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html, and I made two mistakes, which caused the issue -
|
Man, have I said that I love you???????? 🤣🤣🤣🤣🤣 You solved my problem. Yeah, I made a few mistakes too. Let me show you how I fixed it. Mine was a bit similar to yours.
After all, everything was working nicely. Thanks again, brother. 🤜🏼🤛🏼 |
My issue was an I/O Timeout error while making a request to STS. To debug this issue, you'll want to enable sdkDebugLog:
resource "aws_eks_addon" "example" {
  ....
  configuration_values = jsonencode({
    controller: {
      sdkDebugLog = true
    }
  })
}
Once enabled, and your EBS controller pods have restarted, check the logs:
DEBUG: Send Request sts/AssumeRoleWithWebIdentity failed, attempt 1/8, error RequestError: send request failed
caused by: Post "https://sts.us-east-1.amazonaws.com/": dial tcp x.x.x.x:443: i/o timeout
To fix, I added the VPC Service Endpoint for |
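If you want to reproduce the `i/o timeout` outside the driver, a plain TCP connection attempt to the same host and port is usually enough. This is a minimal sketch under assumptions: `can_connect` is a hypothetical helper, and it only tests whether a TCP connection to port 443 can be opened from wherever you run it. It does not perform the actual `AssumeRoleWithWebIdentity` call.

```python
import socket


def can_connect(host, port=443, timeout=5.0, connector=socket.create_connection):
    """Return (True, "connected") if a TCP connection can be opened,
    otherwise (False, <error text>).

    The connector argument is only there so the network call can be
    replaced (hypothetical helper, not part of the EBS CSI driver).
    """
    try:
        conn = connector((host, port), timeout=timeout)
    except OSError as exc:  # covers timeouts, refused connections, DNS errors
        return False, str(exc)
    conn.close()
    return True, "connected"
```

Running `can_connect("sts.us-east-1.amazonaws.com")` from a pod in each affected subnet should tell you whether the timeout is specific to some subnets, which would point at routing, security groups, or a missing endpoint rather than at the driver.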
I am having the same issue. In our case we are not using EKS and provisioning our clusters with |
@fl-max could you link a guide to creating the VPC service endpoint. I have one created now with the same subnets, but am still getting the same errors you were when debugging is turned on. EDIT: It's only failing in two of my subnets |
The issue was related to DNS resolution of the AWS API.
In my case I just hardcoded the IP address. I added the following lines (hostAliases):
kubectl edit deployments.apps -n kube-system ebs-csi-controller
      dnsPolicy: ClusterFirst
      hostAliases:
      - hostnames:
        - ec2.eu-west-1.amazonaws.com
        ip: 67.220.226.37
The following should also work, to get the metadata info needed to connect:
http://169.254.169.254/latest/meta-data/
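A caveat with pinning an IP through hostAliases is that AWS endpoint addresses can change, so the pinned value may go stale. The sketch below shows one way to check whether a pinned IP is still among the addresses DNS currently returns. `pinned_ip_is_current` is a hypothetical helper, and it has to run somewhere with working DNS and without the alias in place, since inside the patched pod the hosts-file entry would simply answer with the pinned IP.

```python
import socket


def pinned_ip_is_current(hostname, pinned_ip, resolver=socket.getaddrinfo):
    """Return (is_current, current_ips) for a hostname pinned via hostAliases.

    is_current is True when pinned_ip is among the addresses the resolver
    returns right now. Hypothetical helper for illustration only.
    """
    infos = resolver(hostname, 443)
    # sockaddr is the fifth element of each entry; its first item is the IP.
    current = {info[4][0] for info in infos}
    return pinned_ip in current, sorted(current)
```

For the values from the comment above, that would be `pinned_ip_is_current("ec2.eu-west-1.amazonaws.com", "67.220.226.37")`. If the first element of the result is False, the alias likely needs updating, or better, the underlying DNS problem needs fixing.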
|
@edrimon I ended up having the same issue. Opening port 53/udp in the ACL worked. |
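To check whether DNS over UDP port 53 is blocked, it can help to send a single hand-built DNS query straight to the resolver and see whether any reply comes back. This is a rough sketch with assumptions: `build_dns_query` and `udp_dns_reachable` are hypothetical helpers, and the resolver IP you pass in (for example your cluster DNS service IP or the VPC resolver) is something you need to look up for your own setup.

```python
import socket
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 (A record), QCLASS 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def udp_dns_reachable(server, hostname, timeout=3.0):
    """Return True if the server answers a UDP DNS query on port 53 in time."""
    query = build_dns_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts when packets are dropped
            return False
    # A genuine reply echoes the query ID in its first two bytes.
    return reply[:2] == query[:2]
```

If `udp_dns_reachable(<resolver-ip>, "ec2.eu-west-1.amazonaws.com")` returns False from the affected subnet but True elsewhere, a network ACL or security group dropping 53/udp is a plausible cause. Note that resolvers may also use TCP port 53 for large responses, so that port can matter too.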
I ran into this issue and the DNS comments above were a helpful tip. In our case, I had recently installed a CNI plugin (Cilium) and forgot to restart the addon pods. Rolling restart of the |
Rolling restart of the ebs-csi-controller as well as the ebs-csi-node refreshed the DNS cache and resolved the issue for us |
Thank you so much @fl-max. After adding the flag Since I currently have no way to filter egress by hostname, I opened up egress with the following network policy. After that, traffic began flowing from the
|
This solved the issue for me.
Then I edited the ebs-csi-controller configuration
That's it. Hope it helps someone. |
/kind bug
What happened?
I installed a fresh Kubernetes 1.13.3 cluster using kubeadm on AWS.
I created the CSINodeInfo CRD via https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/storage-crds/csinodeinfo.yaml
And then applied controller.yaml + node.yaml.
After adjusting the volumes for the controller (see: #212) things started up nicely and I was able to query the CSINodeInfo objects from my nodes just fine.
I created the most basic storage class I could think of:
and then applied a corresponding PVC:
The ebs-csi-controller, or rather the csi-provisioner, picked up the request to CreateVolume, but then things failed miserably and without (at least to me) a helpful error message, even when running ebs-plugin with v=9:
What you expected to happen?
I expected either csi-provisioner to create a volume or ebs-plugin to return a helpful error message about, e.g., the failing API call.
Environment
Kubernetes version (use kubectl version): 1.13.3