Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When --grace-period=0 is provided, wait for deletion #37263

Merged

Conversation

smarterclayton
Copy link
Contributor

@smarterclayton smarterclayton commented Nov 22, 2016

The grace-period is automatically set to 1 unless --force is provided, and the client waits until the object is deleted.

This preserves backwards compatibility with 1.4 and earlier. It does not handle scenarios where the object is deleted and a new object is created with the same name because we don't have the initial object loaded (and that's a larger change for 1.5).

Fixes #37117 by relaxing the guarantees provided.

When deleting an object with `--grace-period=0`, the client will begin a graceful deletion and wait until the resource is fully deleted.  To force deletion, use the `--force` flag.

This change is Reviewable

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 22, 2016
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Nov 22, 2016
@smarterclayton smarterclayton added this to the v1.5 milestone Nov 22, 2016
@smarterclayton
Copy link
Contributor Author

Proposed for 1.5 to avoid backwards compatibility.

@smarterclayton smarterclayton added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-blocker release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Nov 22, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins Bazel Build failed for commit 28a4d56. Full PR test history.

The magic incantation to run this job again is @k8s-bot bazel test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GKE smoke e2e failed for commit 32a7c417a81661f05a056334d0305e145d278375. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@smarterclayton
Copy link
Contributor Author

@k8s-bot bazel test this

@fabianofranz
Copy link
Contributor

@smarterclayton so both --grace-period=0 and --now will set gracePeriod to 1, but the difference is that the first waits?

// into --grace-period=1 and wait until the object is successfully deleted. Users may provide --force
// to bypass this wait.
wait = true
gracePeriod = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--grace-period=0 may now be mutually exclusive with --timeout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not, timeout is still respected.

@smarterclayton
Copy link
Contributor Author

so both --grace-period=0 and --now will set gracePeriod to 1, but the difference is that the first waits?

To avoid breaking old scripts, --grace-period=0 continues to not return (waits) until the object is gone. --now continues to behave as it did before.

@fabianofranz
Copy link
Contributor

LGTM

@@ -818,10 +818,12 @@ func (r *Request) request(fn func(*http.Request, *http.Response)) error {
// connection.
defer func() {
const maxBodySlurpSize = 2 << 10
if resp.ContentLength <= maxBodySlurpSize {
io.Copy(ioutil.Discard, &io.LimitedReader{R: resp.Body, N: maxBodySlurpSize})
if resp.Body != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should not be required.

        // The http Client and Transport guarantee that Body is always
        // non-nil, even on responses without a body or responses with
        // a zero-length body.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test cases were passing nil in, and we guard against it in other places. I'll remove both guards and hunt down the bug.

case 1, 2, 3:
return &http.Response{StatusCode: 200, Header: defaultHeader(), Body: objBody(codec, &pods.Items[0])}, nil
default:
return &http.Response{StatusCode: 404, Header: defaultHeader(), Body: nil}, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should provide a non-nil body, even if it is empty.

"k8s.io/kubernetes/pkg/kubectl/resource"
"time"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: import ordering.

@krousey
Copy link
Contributor

krousey commented Nov 22, 2016

cc @roberthbailey

The grace-period is automatically set to 1 unless --force is provided,
and the client waits until the object is deleted.

This preserves backwards compatibility with 1.4 and earlier. It does not
handle scenarios where the object is deleted and a new object is created
with the same name.
@smarterclayton
Copy link
Contributor Author

Updated with all comments.

@saad-ali
Copy link
Member

@k8s-bot test this

@dchen1107
Copy link
Member

This pr might need more attention, and I decided not manually merge it for holiday code freeze in favor to merge two large prs: #36673
and #36782

@smarterclayton
Copy link
Contributor Author

Yes, that's fine with me.

@eparis
Copy link
Contributor

eparis commented Nov 28, 2016

@k8s-bot gce node e2e test this

@eparis
Copy link
Contributor

eparis commented Nov 28, 2016

@k8s-bot node e2e test this

@saad-ali
Copy link
Member

@foxish can you review. Let's get this merged ASAP, so we can unblock upgrade testing

@foxish
Copy link
Contributor

foxish commented Nov 29, 2016

LGTM except for the couple of comments.

// By default use a reaper to delete all related resources.
if cmdutil.GetFlagBool(cmd, "cascade") {
glog.Warningf("\"cascade\" is set, kubectl will delete and re-create all resources managed by this resource (e.g. Pods created by a ReplicationController). Consider using \"kubectl rolling-update\" if you want to update a ReplicationController together with its Pods.")
err = ReapResult(r, f, out, cmdutil.GetFlagBool(cmd, "cascade"), ignoreNotFound, timeout, cmdutil.GetFlagInt(cmd, "grace-period"), shortOutput, mapper, false)
err = ReapResult(r, f, out, cmdutil.GetFlagBool(cmd, "cascade"), ignoreNotFound, timeout, gracePeriod, waitForDeletion, shortOutput, mapper, false)
} else {
err = DeleteResult(r, out, ignoreNotFound, shortOutput, mapper)
Copy link
Contributor

@foxish foxish Nov 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this should go through ReapResult as well? Currently, we would print out "pod is deleted" prior to its actual deletion, and then eventually hit the timeout condition a few lines below. This is in the case where --cascade=false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually even do graceful deletion in this code path (we use the server default) so it's not backwards incompatible. Definitely something we need to fix in the future, but not for 1.5

copied := *info
info = &copied
// TODO: refactor Reaper so that we can pass the "wait" option into it, and then check for UID change.
return wait.PollImmediate(objectDeletionWaitInterval, timeout, func() (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the timeout case, we should return a better error than "timed out waiting for the condition" because we surface
that error to the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error is going to be "error when stopping pod/foo: timed out waiting for the condition" which is pretty close. I think that's something we could improve in a follow up (there's basically a follow up here which is delete needs a better cleanup)

@foxish
Copy link
Contributor

foxish commented Nov 29, 2016

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 29, 2016
@smarterclayton
Copy link
Contributor Author

smarterclayton commented Nov 29, 2016 via email

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE Node e2e failed for commit 7cdb6b1. Full PR test history.

The magic incantation to run this job again is @k8s-bot node e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@saad-ali
Copy link
Member

@k8s-bot node e2e test this

@saad-ali
Copy link
Member

@k8s-bot kops aws e2e test this

@dims
Copy link
Member

dims commented Nov 30, 2016

@k8s-bot node e2e test this

@eparis eparis removed the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 30, 2016
@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 9ccc291 into kubernetes:master Nov 30, 2016
saad-ali added a commit that referenced this pull request Nov 30, 2016
…63-upstream-release-1.5

Automated cherry pick of #37263 upstream release 1.5
@saad-ali saad-ali added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. and removed cherrypick-candidate labels Dec 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-blocker release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.