Do not GC exited containers in running pods #53167

Merged
merged 1 commit into from
Sep 29, 2017

@dashpole dashpole commented Sep 27, 2017

This fixes a regression that was introduced by #45896 and identified in #52462.
The bug causes the kubelet to garbage collect exited containers in a running pod.
This manifests as strange and confusing state when viewing the cluster. For example, a running pod can appear to have no init container (see #52462) if that container has exited and been removed.

This PR solves this problem by only removing containers and sandboxes from terminated pods.
The important line change replaces
if cgc.podDeletionProvider.IsPodDeleted(podUID) || evictNonDeletedPods {
with
if cgc.podStateProvider.IsPodDeleted(podUID) || (cgc.podStateProvider.IsPodTerminated(podUID) && evictTerminatedPods) {
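
A minimal, self-contained sketch of the new condition may help show how it treats running, terminated, and deleted pods. Only the two method names and the `evictTerminatedPods` flag come from the change itself; `fakeProvider`, `canRemoveAll`, and the use of plain strings for pod UIDs are assumptions made for illustration and are not the kubelet's actual code.

```go
package main

import "fmt"

// podStateProvider mirrors the interface named in this PR. The two methods
// are taken from the changed line; using string for the UID is a simplification.
type podStateProvider interface {
	IsPodDeleted(uid string) bool
	IsPodTerminated(uid string) bool
}

// fakeProvider is a hypothetical in-memory provider used only for this sketch.
type fakeProvider struct {
	deleted    map[string]bool
	terminated map[string]bool
}

func (f fakeProvider) IsPodDeleted(uid string) bool    { return f.deleted[uid] }
func (f fakeProvider) IsPodTerminated(uid string) bool { return f.terminated[uid] }

// canRemoveAll (hypothetical name) evaluates the condition from the PR:
// a deleted pod always qualifies, a terminated pod qualifies only when
// evictTerminatedPods is set, and a running pod never qualifies.
func canRemoveAll(p podStateProvider, uid string, evictTerminatedPods bool) bool {
	return p.IsPodDeleted(uid) || (p.IsPodTerminated(uid) && evictTerminatedPods)
}

func main() {
	p := fakeProvider{
		deleted:    map[string]bool{"deleted-pod": true},
		terminated: map[string]bool{"finished-pod": true},
	}
	for _, uid := range []string{"running-pod", "finished-pod", "deleted-pod"} {
		fmt.Printf("%s: evict=true -> %v, evict=false -> %v\n",
			uid, canRemoveAll(p, uid, true), canRemoveAll(p, uid, false))
	}
}
```

Under the old condition, setting the eviction flag alone was enough to satisfy the check, which is how exited containers of running pods could qualify for removal.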

cc @MrHohn @yujuhong @kubernetes/sig-node-bugs

BugFix: Exited containers are not Garbage Collected by the kubelet while the pod is running

@k8s-ci-robot k8s-ci-robot added the sig/node, size/L, kind/bug, and cncf-cla: yes labels Sep 27, 2017
@k8s-github-robot k8s-github-robot added the do-not-merge/release-note-label-needed and release-note labels and removed the do-not-merge/release-note-label-needed label Sep 27, 2017
@dashpole
Contributor Author

/release-note

@dashpole
Contributor Author

/test pull-kubernetes-unit

Contributor

@yujuhong yujuhong left a comment

Looks good with minor nits.

@@ -820,6 +820,15 @@ func (kl *Kubelet) podIsTerminated(pod *v1.Pod) bool {
return status.Phase == v1.PodFailed || status.Phase == v1.PodSucceeded || (pod.DeletionTimestamp != nil && notRunning(status.ContainerStatuses))
}

// IsPodTerminated returns true if the pod with the provided UID is in a terminated state ("Failed" or "Succeeded")
Contributor

nit: add or if the pod has been deleted.

Contributor Author

done

@@ -65,9 +65,10 @@ var (
ErrVersionNotSupported = errors.New("Runtime api version is not supported")
)

// podDeletionProvider can determine if a pod is deleted
type podDeletionProvider interface {
// podStateProvider can determine if a pod is deleted
Contributor

s/is deleted/is deleted or terminated

Contributor Author

done

@yujuhong yujuhong added this to the v1.8 milestone Sep 28, 2017
@yujuhong
Contributor

We should cherry-pick this to 1.8 and 1.7.

@dashpole
Contributor Author

/test pull-kubernetes-e2e-gce-bazel

@yujuhong
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Sep 28, 2017
@yujuhong
Contributor

Let's wait for 1.8.1.

@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, yujuhong

Associated issue: 45896

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved label Sep 28, 2017
@dashpole
Contributor Author

/test pull-kubernetes-e2e-gce-etcd3

@MrHohn
Member

MrHohn commented Sep 29, 2017

/retest

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit dcaf8e8 into kubernetes:master Sep 29, 2017
@yujuhong yujuhong modified the milestones: v1.8, v1.7 Oct 2, 2017
@yujuhong
Contributor

yujuhong commented Oct 2, 2017

@dashpole could you patch 1.7 as well? Thanks.

@dashpole
Contributor Author

dashpole commented Oct 2, 2017

@yujuhong can you add the cherrypick-candidate label to this? Thanks

@dashpole dashpole deleted the fix_init_container branch October 2, 2017 16:30
@wojtek-t
Member

@dashpole @yujuhong - I'm fine with cherrypicking it to 1.7. However, the automated cherrypick is generating a bunch of conflicts. So please do the cherrypick yourself and I will approve it.

@yujuhong
Contributor

@dashpole, ping!

@wojtek-t wojtek-t added the cherry-pick-approved label Oct 12, 2017
k8s-github-robot pushed a commit that referenced this pull request Oct 13, 2017
…67-upstream-release-1.7

Automatic merge from submit-queue.

Automated cherry pick of #53167 upstream release 1.7

Cherrypick of #53167 to the 1.7 branch

/assign @wojtek-t 


```release-note
BugFix: Exited containers are not Garbage Collected by the kubelet while the pod is running
```
@k8s-cherrypick-bot

Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

openshift-merge-robot added a commit to openshift/origin that referenced this pull request Oct 20, 2017
Automatic merge from submit-queue (batch tested with PRs 16896, 16908, 16935, 16898, 16090).

UPSTREAM: 53167: Do not GC exited containers in running pods

kubernetes/kubernetes#53167

xref https://bugzilla.redhat.com/show_bug.cgi?id=1486356

I think this might fix the build issues we are having with init container status corruption

Thanks to @aveshagarwal for spotting this getting picked to kube 1.7 👍

@frobware @derekwaynecarr @smarterclayton @vikaslaad
k8s-github-robot pushed a commit that referenced this pull request Oct 31, 2017
…67-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #53167 upstream release 1.8

cherrypick of #53167

```release-note
BugFix: Exited containers are not Garbage Collected by the kubelet while the pod is running
```