document pvc finalizer issue during 1.10 -> 1.9 downgrade #7731

Closed
wants to merge 5 commits into from

@rootfs rootfs commented Mar 13, 2018

Signed-off-by: Huamin Chen <hchen@redhat.com>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 13, 2018
@k8sio-netlify-preview-bot
Collaborator

k8sio-netlify-preview-bot commented Mar 13, 2018

Deploy preview for kubernetes-io-vnext-staging ready!

Built with commit df9d49f

https://deploy-preview-7731--kubernetes-io-vnext-staging.netlify.com

@Bradamant3
Contributor

/assign

@Bradamant3
Contributor

To any other reviewers: please ignore the Travis build error. It's on me -- I told the committer not to worry about the TOC YAML file.

@Bradamant3 Bradamant3 added this to the 1.10 milestone Mar 14, 2018
Contributor

@Bradamant3 Bradamant3 left a comment

@rootfs congratulations you are the recipient of my most tech-writerly review to date.

I made the last comment a code block on purpose so you could just copy/paste if it works for you. gah markdown-within-markdown.

If any of my edits isn't clear, feel free to ping. Slack works better than GH notifications bc soooo too many and I lose track. But I'm checking 1.10 docs PRs several times a day, so whatever works for you.

{:toc}

---
title: Kubernetes Downgrade issue from 1.10 to 1.9 due to PV/PVC Protection
Contributor

Let's be more precise here:
"Issue downgrading Kubernetes from 1.10 to 1.9 if StorageObjectInUseProtection admission controller is enabled"

I suggest the change because of the name of the flag the user actually sets. I understand that the finalizer names are different, and the user needs to know what they are to troubleshoot.

The docs also explain about the finalizers, but wouldn't a user be more likely to think in terms of the plugin name?

If I have time, I'll submit a PR and flag y'all on it, to explain the feature more clearly in the docs, too. It took me a number of reads to sort out what was going on, and I know about admission controllers and finalizers. (I am also in high end-of-release-cycle mode, I admit.)

Contributor Author

@Bradamant3 yes, your rephrase sounds good to me. I feel both docs (the admission controller page in the admin reference and this one) need to be in sync so that end users know what the term is and what it is for.

Contributor

I would vote for a shorter title for this because when the article is online, long titles would be a problem for navigation.

The feature is called StorageObjectInUseProtection in K8s 1.10 and PVCProtection in K8s 1.9, which is why I would change the title so that it refers to the feature names.
Do we need to include exact numbers from 1.10 to 1.9 in the title?
What about title: StorageObjectInUse/PVC Protection Downgrade Issue?


## PV/PVC Protection in Kubernetes 1.10

When enabled, [PV/PVC Protection](docs/admin/admission-controllers.md#storage-object-in-use-protection-beta) prevents PV/PVC from being removed when the finalizers are removed.
Contributor

"If you enable the admission controller StorageObjectInUseProtection, your PersistentVolume and PersistentVolumeClaim objects are not removed if the related pv-protection and pvc-protection finalizers are removed."

Not sure I've got this right because there seems to be some contradiction between the text ^^ and text below.

Contributor Author

good catch, it should be: PersistentVolume and PersistentVolumeClaim objects are not removed if the related pv-protection and pvc-protection finalizers are still *present*

> good catch, it should be: PersistentVolume and PersistentVolumeClaim objects are not removed if the related pv-protection and pvc-protection finalizers are still present

Correct: when the finalizer is present in a PV or PVC object, the object is not removed.

title: Kubernetes Downgrade issue from 1.10 to 1.9 due to PV/PVC Protection
---

## PV/PVC Protection in Kubernetes 1.10
Contributor

@Bradamant3 Bradamant3 Mar 14, 2018

To work with previous edit:
"Storage Object in Use Protection in Kubernetes 1.10"


## Downgrading issue

After downgrading from Kubernetes 1.10 to 1.9, PV/PVCs that are created in Kubernetes 1.10 with PVC Protection cannot be removed.
Contributor

"After you downgrade from Kubernetes 1.10 to 1.9, PersistentVolume or PersistentVolumeClaim objects that were created with version 1.10 cannot be removed. This is because their finalizers are not recognized in version 1.9."

I'm extrapolating from the first section -- is this indeed correct? Clearer to be explicit even if it looks repetitive to someone who knows the feature.

Contributor Author

I think that is the case, let @pospispa confirm

Well, let me first explain the purpose of the StorageObjectInUseProtection feature.

When the StorageObjectInUseProtection feature is disabled, PVs and PVCs have empty finalizers, i.e. they are not protected. When a user deletes a PVC (no finalizer present) that is in active use by a pod, the PVC is removed immediately, so the user may lose data. Similarly, when an admin deletes a PV (no finalizer present) that is in active use (i.e. bound to a PVC that is in active use), data may be lost as well.

When the StorageObjectInUseProtection feature is enabled, finalizers are added to PVs and PVCs as soon as they are created (this is done by the StorageObjectInUseProtection admission plugin), so these PVs and PVCs are protected. Now, when a PV or PVC is deleted, it is not removed immediately; instead it transitions into the Terminating phase. A PV or PVC in the Terminating phase is removed after its finalizer is removed. In K8s 1.10 there are PV and PVC Protection controllers that remove the finalizers automatically when a PV or PVC in the Terminating phase is not in active use.

In K8s 1.9 finalizers work the same way as in K8s 1.10, i.e. when a finalizer is present in an object and the object is deleted, the object is not removed immediately but transitions into the Terminating phase. The object is removed after its finalizer is removed.

Note: there is a PVCProtection alpha feature in K8s 1.9. When this feature is enabled, the PVCProtection controller automatically removes finalizers from PVCs when they are both in the Terminating phase and not in active use by a pod. However, there is no controller that would remove finalizers from PVs.

I personally would describe the downgrading issue in the below way:

After downgrading from Kubernetes 1.10 to 1.9, PV/PVCs that contain a finalizer cannot be removed until their finalizer is removed.


## Workaround

Currently PV/PVC finalizers have to be manually removed so PV/PVC can be removed after downgrading to Kubernetes 1.9.
Contributor

"Currently the pv-protection and pvc-protection finalizers must be removed manually before you downgrade so that PVs and PVCs can be removed after you downgrade to version 1.9. Here's what to do:"

yeah, I added "before you downgrade" because that's what it looks like from the following instructions, but it was not completely clear. Also adding another step to make when you do what even clearer.

Contributor Author

good to me


Currently PV/PVC finalizers have to be manually removed so PV/PVC can be removed after downgrading to Kubernetes 1.9.

Before downgrading to Kubernetes 1.9, disable `StorageObjectInUseProtection` plugin and restart admission controller.
Contributor

"1. Before you downgrade, disable the StorageObjectInUseProtection plugin and restart the admission controller."

Can we also please provide the shell command here? (and if you indent four spaces it will keep the step numbering, with my great thanks)

Contributor Author

`hyperkube apiserver --disable-admission-plugins=StorageObjectInUseProtection ...` (rest of the options omitted)

Contributor

Or kube-apiserver --disableadmissionplugins=StorageObjectInUseProtection ...? (to match what's currently in the docs)

Also -- I see nothing in the admission controller docs about a restart. If it's necessary, what's the command? Do you have to restart the apiserver?

Contributor Author

`--disable-admission-plugins` is the command-line option; there is a `-` between the words.

The restart process is to stop the apiserver and start it again with the command-line option above:

  • If the apiserver is managed by systemd, stop it with `systemctl stop <apiserver service name>`, modify the apiserver unit file, reload the systemd configuration, and start the service again with `systemctl start`.
  • If the apiserver is started from the command line, kill the process and run the command again with the command-line option above.
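
For the systemd case, the restart could look roughly like the sketch below. It is only an illustration: the unit name `kube-apiserver.service` and the helper name `restart_apiserver` are assumptions, not something stated in this thread, and the unit file is assumed to have been edited by hand beforehand.

```bash
# Sketch only: "kube-apiserver.service" is a hypothetical default unit name.
restart_apiserver() {
  unit="${1:-kube-apiserver.service}"
  # Assumes the unit file already carries
  # --disable-admission-plugins=StorageObjectInUseProtection on the apiserver command line.
  # First make systemd re-read the edited unit file,
  # then stop the apiserver and start it again with the new flags.
  systemctl daemon-reload &&
    systemctl restart "$unit"
}
```

Calling `restart_apiserver` with no argument uses the assumed default; pass the real unit name for your installation.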

```bash
kubectl get pv pv1 -o yaml |grep finalizer
# (result should be empty)
```
Contributor

And for ll22 to end:

1. Patch the PV or PVC, as in the following command, where `pv1` is the name of the PV to patch:

    ```bash
    kubectl patch pv pv1 --type=json -p='[{"op": "remove", "path": "/metadata/finalizers"}]'
    ```

1. Verify the finalizers are removed:

    ```bash
    kubectl get pv pv1 -o yaml |grep finalizer
    ```

    The result should be empty.

1. You can now safely downgrade to version 1.9.

Contributor Author

sounds good. this process has to iterate over all pv/pvcs created in 1.10
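
To see which objects that iteration would have to cover, something like the following sketch could list every PV and PVC that still carries a protection finalizer. The helper name `list_protected` is made up for illustration, and the full finalizer names `kubernetes.io/pv-protection` and `kubernetes.io/pvc-protection` are assumed here (the thread only calls them pv-protection and pvc-protection), so check them against a real object first.

```bash
# Print kind, namespace, name, and finalizers (tab-separated) for every
# PV and PVC that still lists a protection finalizer.
# Assumption: the finalizers are named kubernetes.io/pv-protection and
# kubernetes.io/pvc-protection.
list_protected() {
  kubectl get pv,pvc --all-namespaces \
    -o jsonpath='{range .items[*]}{.kind}{"\t"}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}' |
    grep -E 'kubernetes\.io/pvc?-protection'
}
```

PVs are cluster-scoped, so their namespace column is empty. The function only reports; it does not change any object.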


## PV/PVC Protection in Kubernetes 1.10

When enabled, [PV/PVC Protection](docs/admin/admission-controllers.md#storage-object-in-use-protection-beta) prevents PV/PVC from being removed when the finalizers are removed.
Contributor

The link should be /docs/admin/admission-controllers.md#storage-object-in-use-protection-beta, otherwise it will be broken. NB the first slash.

Contributor Author

thanks!


Currently PV/PVC finalizers have to be manually removed so PV/PVC can be removed after downgrading to Kubernetes 1.9.

Before downgrading to Kubernetes 1.9, disable `StorageObjectInUseProtection` plugin and restart admission controller.
Contributor

How do you disable the StorageObjectInUseProtection plugin and restart the admission controller? I haven't read the details of the PV/PVC Protection doc referenced above. If the approach is given in that doc, it'd be good to mention it. If not, it'd be great to give some details or reference another doc here.

Contributor

Just noticed that @Bradamant3 already mentioned this in #7731 (comment) :)

Contributor

@tengqm tengqm left a comment

I'd suggest we put this kind of docs into a separate subtopic when we are adding this page to the TOC.

Ref: #7735

@Bradamant3
Contributor

@tengqm @rootfs @pospispa can y'all please take a look at my questions in #7742, which is closely related to this PR?

TL;DR: I'm asking about the relationship between the feature named pvc-protection in 1.9, which seems to be the alpha version of what is now named StorageObjectInUseProtection in 1.10. But see linked PR for more detailed questions.

## PV/PVC Protection in Kubernetes 1.10

When enabled, [PV/PVC Protection](docs/admin/admission-controllers.md#storage-object-in-use-protection-beta) prevents PV/PVC from being removed when the finalizers are removed.

Because the feature is called StorageObjectInUseProtection in K8s 1.10 I would:

s/[PV/PVC Protection]/[`StorageObjectInUseProtection`]


@Bradamant3
Contributor

Adding tracking issue here for reference: kubernetes/kubernetes#60764

@rootfs
Contributor Author

rootfs commented Mar 14, 2018

@Bradamant3 feedback addressed


## Downgrading issue

After downgrading from Kubernetes 1.10 to 1.9, PV/PVCs that are created in Kubernetes 1.10 with PVC Protection cannot be removed.
After downgrading from Kubernetes 1.10 to 1.9, PV/PVCs that contain finalizers cannot be removed until their finalizer are removed.

typo: s/their finalizer are/their finalizers are/

Contributor Author

+1

Signed-off-by: Huamin Chen <hchen@redhat.com>
@pospispa

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 16, 2018
…downgrade

Signed-off-by: Huamin Chen <hchen@redhat.com>
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 16, 2018
- Patch the PV or PVC, as in the following command, where `pv1` is the name of the PV to patch:

```bash
kubectl patch pv pv1 --type=json -p='[{"op": "remove", "path": "/metadata/finalizers"}]'
```
Member

do not instruct them to remove the entire finalizers block, just the specific pv or pvc finalizer
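
One way to follow this advice is to remove the finalizer by its index in the list instead of dropping the whole `/metadata/finalizers` field. The sketch below is illustrative only: `finalizer_patch` is a made-up helper, and the full finalizer name `kubernetes.io/pv-protection` is an assumption to verify on a real object. It builds a JSON patch whose `test` operation makes the patch fail if the list changed between reading and patching.

```bash
# Build a JSON patch that removes exactly one named finalizer.
# $1 is the finalizer to remove; the remaining arguments are the object's
# current finalizers in order. Prints nothing and returns 1 if it is absent.
finalizer_patch() {
  target="$1"; shift
  index=0
  for f in "$@"; do
    if [ "$f" = "$target" ]; then
      # "test" guards against the list having changed since it was read.
      printf '[{"op": "test", "path": "/metadata/finalizers/%d", "value": "%s"}, {"op": "remove", "path": "/metadata/finalizers/%d"}]\n' \
        "$index" "$target" "$index"
      return 0
    fi
    index=$((index + 1))
  done
  return 1
}

# Usage against a live cluster (not run here; pv1 is the example PV name):
#   finalizers=$(kubectl get pv pv1 -o jsonpath='{.metadata.finalizers[*]}')
#   patch=$(finalizer_patch kubernetes.io/pv-protection $finalizers) &&
#     kubectl patch pv pv1 --type=json -p="$patch"
```

Any other finalizers on the object are left in place, which is the point of the review comment.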

Signed-off-by: Huamin Chen <hchen@redhat.com>
@Bradamant3 Bradamant3 mentioned this pull request Mar 21, 2018
@Bradamant3
Contributor

Bradamant3 commented Mar 21, 2018

travis build break should be fixed with #7807
still needs strictly wordsmithing copyedits for clarity, but merging for now bc content comes first. Will submit a followon PR if time permits. Adding both merge commands bc it's clear that last tech comments are addressed in last commit

/lgtm

/approve

@liggitt
Member

liggitt commented Mar 21, 2018

Since the 1.9 fix was merged and released in 1.9.6, I don't think we need this doc any more :-/

The known issue and solution can be summarized as "if you need to downgrade from 1.10 to 1.9.x, downgrade to v1.9.6 to ensure PV and PVC objects can be deleted properly."

Copyedits only to bump build again
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bradamant3, pospispa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2018
@Bradamant3
Contributor

good thing the bot removed my labels when I pushed changes, lol. Will go revert related TOC change.

@Bradamant3
Contributor

Doc no longer needed, so closing unmerged. Will doc in release notes.

@Bradamant3 Bradamant3 closed this Mar 21, 2018
@Bradamant3 Bradamant3 mentioned this pull request Mar 21, 2018
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
9 participants