k8s_drain runs into a timeout when evicting a pod which is part of a stateful set #792
patchback bot pushed a commit that referenced this issue on Dec 10, 2024
SUMMARY
Fixes #792. The function wait_for_pod_deletion in k8s_drain never checks on which node a pod is actually running:

    try:
        response = self._api_instance.read_namespaced_pod(
            namespace=pod[0], name=pod[1]
        )
        if not response:
            pod = None
        time.sleep(wait_sleep)

This means that if a pod is successfully evicted and restarted with the same name on a new node, k8s_drain does not notice and thinks that the original pod is still running. This is the case for pods which are part of a stateful set.

ISSUE TYPE
Bugfix Pull Request

COMPONENT NAME
k8s_drain

Reviewed-by: Mike Graves <mgraves@redhat.com>
(cherry picked from commit fca0dc0)
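The idea behind the fix can be sketched as a node-aware wait: rather than looking only at the pod name, compare the node the pod is scheduled on with the node being drained. The snippet below is a minimal illustration, not the actual patch from PR #793; it assumes the official kubernetes Python client, and the function name, namespace, pod, and node names are hypothetical.

```python
# Minimal sketch of a node-aware wait, assuming the official `kubernetes`
# Python client. `node_name` (the node being drained) stands in for whatever
# the real module tracks internally; this is not the actual k8s_drain code.
import time

from kubernetes import client, config
from kubernetes.client.rest import ApiException


def wait_for_pod_gone_from_node(api, namespace, name, node_name,
                                timeout=120, wait_sleep=5):
    """Return True once the named pod no longer runs on `node_name`."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            pod = api.read_namespaced_pod(namespace=namespace, name=name)
        except ApiException as exc:
            if exc.status == 404:
                return True  # pod no longer exists at all
            raise
        # A StatefulSet recreates the pod under the same name, so a name-only
        # check never succeeds; comparing the node tells old and new apart.
        if pod.spec.node_name != node_name:
            return True
        time.sleep(wait_sleep)
    return False


if __name__ == "__main__":
    config.load_kube_config()
    # Namespace, pod, and node names here are placeholders for illustration.
    wait_for_pod_gone_from_node(client.CoreV1Api(), "default", "web-0", "worker-1")
```

Treating a 404 as deletion and comparing spec.node_name covers both the ordinary case, where the pod simply disappears, and the StatefulSet case, where a same-named replacement appears on another node.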
patchback bot pushed a commit that referenced this issue on Dec 10, 2024 (same commit message as above)
gravesm pushed a commit that referenced this issue on Dec 11, 2024 (same commit message as above)
gravesm pushed a commit that referenced this issue on Dec 11, 2024 (same commit message as above)
softwarefactory-project-zuul bot pushed a commit that referenced this issue on Dec 11, 2024
…807) This is a backport of PR #793 as merged into main (fca0dc0); the commit message otherwise matches the one above.
softwarefactory-project-zuul bot pushed a commit that referenced this issue on Dec 11, 2024
…808) This is a backport of PR #793 as merged into main (fca0dc0); the commit message otherwise matches the one above.
softwarefactory-project-zuul bot pushed a commit that referenced this issue on Jan 20, 2025
SUMMARY
Version 3.3.0 of the ansible collection kubernetes.core comes with several improvements and bugfixes.

ISSUE TYPE
New release pull request

Changelog

Minor Changes
- k8s_drain - Improve error message for pod disruption budget when draining a node (#797).

Bugfixes
- helm - Helm version checks did not support RC versions. They now accept any version tags (#745).
- helm_pull - Apply no_log=True to pass_credentials to silence a false positive warning (#796).
- k8s_drain - Fix k8s_drain does not wait for single pod (#769).
- k8s_drain - Fix k8s_drain runs into a timeout when evicting a pod which is part of a stateful set (#792).
- kubeconfig option should not appear in module invocation log (#782).
- kustomize - kustomize plugin fails with deprecation warnings (#639).
- waiter - Fix waiting for daemonset when desired number of pods is 0 (#756).

ADDITIONAL INFORMATION
Collection kubernetes.core version 3.3.0 is compatible with ansible-core>=2.14.0.

Reviewed-by: Alina Buzachis
Reviewed-by: Yuriy Novostavskiy
Reviewed-by: Mike Graves <mgraves@redhat.com>
softwarefactory-project-zuul bot pushed a commit that referenced this issue on Jan 20, 2025
SUMMARY
This release comes with a new module, helm_registry_auth, improvements to the error messages in the k8s_drain module, a new insecure_registry parameter for the helm_template module, and several bug fixes.

ISSUE TYPE
New release pull request

Changelog

Minor Changes
- Bump version of ansible-lint to minimum 24.7.0 (#765).
- Parameter insecure_registry added to helm_template as equivalent of insecure-skip-tls-verify (#805).
- connection/kubectl.py - Added an example of using the kubectl connection plugin to the documentation (#741).
- k8s_drain - Improve error message for pod disruption budget when draining a node (#797).

Bugfixes
- helm - Helm version checks did not support RC versions. They now accept any version tags (#745).
- helm_pull - Apply no_log=True to pass_credentials to silence a false positive warning (#796).
- k8s_drain - Fix k8s_drain does not wait for single pod (#769).
- k8s_drain - Fix k8s_drain runs into a timeout when evicting a pod which is part of a stateful set (#792).
- kubeconfig option should not appear in module invocation log (#782).
- kustomize - kustomize plugin fails with deprecation warnings (#639).
- waiter - Fix waiting for daemonset when desired number of pods is 0 (#756).

New Modules
- helm_registry_auth - Helm registry authentication module

ADDITIONAL INFORMATION
Collection kubernetes.core version 3.1.0 is compatible with ansible-core>=2.15.0.

Reviewed-by: Mike Graves <mgraves@redhat.com>
yurnov added a commit to yurnov/kubernetes.core that referenced this issue on Jan 20, 2025
(same release commit message as above)
SUMMARY
k8s_drain runs into a timeout when evicting a pod which is part of a stateful set.
The pod is recreated with the same name on a different node, and because k8s_drain checks only the pod name, not the node name, it thinks that the original pod is still running.
ISSUE TYPE
COMPONENT NAME
k8s_drain
ANSIBLE VERSION
COLLECTION VERSION
CONFIGURATION
OS / ENVIRONMENT
STEPS TO REPRODUCE
EXPECTED RESULTS
k8s_drain should return as soon as the pods are evicted.
ACTUAL RESULTS
k8s_drain keeps waiting until the timeout is reached, although the pods are long gone, and then returns a warning.
This happens because the function wait_for_pod_deletion in k8s_drain never checks on which node a pod is actually running. The condition `if not response` is never met, because the new pod has the same name as the old one.
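To make the failure mode concrete, the sketch below (again using the official kubernetes Python client, with hypothetical namespace, pod, and node names) shows that after eviction a read by the same name succeeds against the recreated StatefulSet replica, which has a different UID and node, so a name-only check never observes a missing pod.

```python
# Illustration only: namespace, pod, and node names are hypothetical.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Before the drain: "web-0" runs on the node about to be drained.
before = v1.read_namespaced_pod(namespace="default", name="web-0")
print(before.metadata.uid, before.spec.node_name)  # old UID, e.g. "worker-1"

# ... the node is drained and the StatefulSet controller recreates "web-0" ...

# After the drain: the read still succeeds, but against a different pod.
after = v1.read_namespaced_pod(namespace="default", name="web-0")
print(after.metadata.uid, after.spec.node_name)  # new UID, e.g. "worker-2"

# `after` is a populated V1Pod object, so `not after` is False and a check
# like `if not response:` never fires, even though the original pod is gone.
# Comparing spec.node_name (or metadata.uid) distinguishes the two pods.
```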