
[Bug] Logs report "failed to acquire semaphore" during deletion #7818

Open

artem-nefedov opened this issue Jun 4, 2024 · 6 comments

artem-nefedov commented Jun 4, 2024

What were you trying to accomplish?

Delete the cluster (it seems to work fine).

What happened?

Logs report this message during deletion:

[ℹ]  deleting EKS cluster "redacted"
[ℹ]  will drain 0 unmanaged nodegroup(s) in cluster "redacted"
[ℹ]  starting parallel draining, max in-flight of 1
[✖]  failed to acquire semaphore while waiting for all routines to finish: %!w(*errors.errorString=&{context canceled})

Deletion still finished without errors, so it does not look like this affects anything. But the log does look like there's a problem.

The behavior is reproduced on all attempts.
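
The mangled "%!w(...)" in the log line points at the logging half of the problem: the %w verb is only understood by fmt.Errorf, so passing it to any other fmt-style formatter (such as a Sprintf-based logger) produces fmt's bad-verb placeholder instead of the error message. A minimal sketch in plain Go, not eksctl's actual logging code:

package main

import (
	"context"
	"fmt"
)

func main() {
	err := context.Canceled // an *errors.errorString with message "context canceled"

	// %w is an unsupported verb outside fmt.Errorf, so fmt emits a
	// bad-verb placeholder, reproducing the log output above:
	fmt.Println(fmt.Sprintf("failed to acquire semaphore: %w", err))
	// failed to acquire semaphore: %!w(*errors.errorString=&{context canceled})

	// %v (or fmt.Errorf with %w) renders the error as expected:
	fmt.Printf("failed to acquire semaphore: %v\n", err)
	// failed to acquire semaphore: context canceled
}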

How to reproduce it?

Create a cluster with 1 managed nodegroup and no unmanaged nodegroups, then delete it.

Versions

eksctl version: 0.180.0
EKS version: 1.30

The message was not present on eksctl version 0.176.0 with EKS version 1.29 (there were no changes in the cluster config besides the EKS version).

cPu1 (Collaborator) commented Jun 6, 2024

@artem-nefedov, this is a bug in the logging and concurrency handling, but it should not affect normal operation of the command. That part of the codebase is a bit dated and could use some refactoring. We'll look into this soon.
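
For the concurrency half, a common Go pattern, and, assuming eksctl's drainer follows it, a plausible source of this message, is to bound parallel draining with a weighted semaphore from golang.org/x/sync/semaphore and then wait for all workers by acquiring the semaphore's full weight. In recent versions of that package, Weighted.Acquire fails fast with the context's error when the context is already canceled, even if the full weight is immediately available, so a drain loop whose context gets canceled as deletion wraps up will log exactly this spurious failure despite having drained nothing. A hedged sketch; the identifiers are illustrative, not eksctl's actual code:

package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/semaphore"
)

// drainAll bounds parallelism with a weighted semaphore, then waits for
// all in-flight workers by acquiring the semaphore's full weight.
func drainAll(ctx context.Context, nodeGroups []string, maxInFlight int64) error {
	sem := semaphore.NewWeighted(maxInFlight)
	for _, ng := range nodeGroups {
		if err := sem.Acquire(ctx, 1); err != nil {
			return err
		}
		go func(ng string) {
			defer sem.Release(1)
			// ... drain one nodegroup here ...
		}(ng)
	}
	// With recent golang.org/x/sync versions, this Acquire returns
	// context.Canceled as soon as ctx is done, even when no workers
	// were ever started ("will drain 0 unmanaged nodegroup(s)").
	if err := sem.Acquire(ctx, maxInFlight); err != nil {
		return fmt.Errorf("failed to acquire semaphore while waiting for all routines to finish: %w", err)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the parent context being canceled during deletion
	if err := drainAll(ctx, nil, 1); err != nil {
		fmt.Println(err)
		// failed to acquire semaphore while waiting for all routines to finish: context canceled
	}
}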

@lgb861213

We also encountered the same error in the logs when deleting an EKS 1.30 cluster, and our eksctl version is 0.183. The error message follows:
2024-07-06 16:56:08 [ℹ] deleting EKS cluster "test"
2024-07-06 16:56:11 [ℹ] will drain 0 unmanaged nodegroup(s) in cluster "aloda-test"
2024-07-06 16:56:11 [ℹ] starting parallel draining, max in-flight of 1
2024-07-06 16:56:11 [✖] failed to acquire semaphore while waiting for all routines to finish: %!w(*errors.errorString=&{context canceled})
2024-07-06 16:56:14 [ℹ] deleted 0 Fargate profile(s)
2024-07-06 16:56:16 [✔] kubeconfig has been updated
2024-07-06 16:56:16 [ℹ] cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress
2024-07-06 16:56:23 [ℹ]

@AmitBenAmi (Contributor)

Seeing the same issue with version 0.185.0

@acarey-haus

Seeing this issue with eksctl version 0.187.0 when deleting a nodegroup. The deletion succeeded.

% eksctl delete nodegroup --cluster redacted --name redacted-nodegroup
2024-07-18 13:31:37 [ℹ]  1 nodegroup (redacted-nodegroup) was included (based on the include/exclude rules)
2024-07-18 13:31:37 [ℹ]  will drain 1 nodegroup(s) in cluster "redacted"
2024-07-18 13:31:37 [ℹ]  starting parallel draining, max in-flight of 1
2024-07-18 13:31:37 [!]  no nodes found in nodegroup "redacted-nodegroup" (label selector: "alpha.eksctl.io/nodegroup-name=redacted-nodegroup")
2024-07-18 13:31:37 [✖]  failed to acquire semaphore while waiting for all routines to finish: context canceled
2024-07-18 13:31:37 [ℹ]  will delete 1 nodegroups from cluster "redacted"
2024-07-18 13:31:40 [ℹ]  1 task: { 1 task: { delete nodegroup "redacted-nodegroup" [async] } }
2024-07-18 13:31:40 [ℹ]  will delete stack "eksctl-redacted-nodegroup-redacted-nodegroup"
2024-07-18 13:31:40 [✔]  deleted 1 nodegroup(s) from cluster "redacted"

fnzwex commented Aug 6, 2024

Test results after finding this issue, in an attempt to help:

0.176.0 - good, until EKS 1.30: 1.30 is not supported and it refuses to work
0.177.0 - this issue
0.178.0 - this issue
0.179.0 - this issue
0.180.0 through 0.186.0 - untested by me, but presumed affected since the surrounding versions are
0.187.0 - this issue
0.188.0 - STILL this issue - 5 days old.

It'd be great if this could be addressed ASAP and released as 0.189.0 soon. Any chance of that?

Pretty bad that it got broken in the first place and even worse that it got left broken for such a long time.

Still a better way to manage clusters than Terraform/OpenTofu IMO, when it works properly (which NO version does for 1.30).

@jarvisbot01

eksctl version 0.190.0
EKS 1.30

2024-09-21 14:55:45 [✖] failed to acquire semaphore while waiting for all routines to finish: context canceled
