Initial draft of upgrade guide for kubeadm clusters. #3999
a6ab204 to 403f61b
Thanks! Needs a link from the kubeadm guide to the upgrade guide.
other 1.6.x releases. Upgrades are not supported before the 1.6.0 release (when
kubeadm became Beta).

It is a work-in-progress (not intended for merging yet) and targeted at other
I put the do-not-merge label on, so you can remove this; the WIP in the title also denotes it.
a. On Debian, this would be:

sudo apt-get update
sudo apt-get install kubelet kubeadm kubelet kubernetes-cni
This should not be needed if it's an upgrade.
Yeah, `sudo apt-get update && sudo apt-get upgrade` is sufficient.
b. On CentOS/Fedora, this would be:

sudo yum update
sudo yum install kubelet kubeadm kubelet kubernetes-cni
same comment as the last one.
kubeadm init --skip-preflight-checks --kubernetes-version v1.7.0

For pre-release testing, this would be:
Nix this from the doc. Honestly, I don't want this publicly visible.
4. Upgrade CNI provider.

Your CNI provider might have its own upgrade instructions to follow now.
Links to the major providers?
There is an addons page; we might be able to link to that one...
Your CNI provider might have its own upgrade instructions to follow now.

TODO: Rollback instructions in case anything goes wrong (using the backed up
cp -r backup /etc/kubernetes
yum downgrade x,y,z
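A minimal sketch of how the backup and restore halves of this suggestion could fit together, assuming the backup is a plain recursive copy of the config directory. The function names `backup_dir` and `restore_dir` are hypothetical, and the `yum downgrade` part is left out because the package list is not given here.

```shell
# Sketch only: copy a config directory aside before upgrading, and copy it
# back to roll back. Function names are hypothetical, not part of kubeadm.

backup_dir() {
  # $1 = directory to back up, $2 = backup destination (must not exist yet)
  src="$1"
  dest="$2"
  if [ ! -d "$src" ]; then
    echo "missing source directory: $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing backup: $dest" >&2
    return 1
  fi
  # -a preserves ownership, permissions and timestamps where possible.
  cp -a "$src" "$dest"
}

restore_dir() {
  # $1 = backup directory, $2 = directory to replace with the backup
  backup="$1"
  target="$2"
  if [ ! -d "$backup" ]; then
    echo "missing backup directory: $backup" >&2
    return 1
  fi
  if [ -z "$target" ] || [ "$target" = "/" ]; then
    echo "refusing to restore over an empty or root path" >&2
    return 1
  fi
  # Remove the current contents first so files created after the backup
  # do not survive the rollback.
  rm -rf "$target"
  cp -a "$backup" "$target"
}
```

On a master, `backup_dir /etc/kubernetes /root/kubernetes-backup` before the upgrade and `restore_dir /root/kubernetes-backup /etc/kubernetes` afterwards would be one way to use it. The backup path is only an example, and both calls would need root.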
---
assignees:
- pipejakob
title: Upgrading kubeadm clusters to 1.7
from v1.6 to v1.7?
This guide is for upgrading kubeadm clusters from version 1.6.x to 1.7.x, or
other 1.6.x releases. Upgrades are not supported before the 1.6.0 release (when
kubeadm became Beta).
@luxas going to need to add the warning re: kubernetes/kubernetes#47081 (comment)
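The supported range quoted from the guide (1.6.x to another 1.6.x or to 1.7.x) could be checked up front with something like the sketch below. The helper names `minor_of` and `upgrade_supported` are hypothetical, and the check is deliberately coarse: it compares only the major.minor part, so it does not reject downgrades within 1.6.x or pre-release builds of 1.6.0.

```shell
# Sketch only: coarse check of the upgrade range described in the guide.
# Helper names are hypothetical.

minor_of() {
  # Print the "major.minor" part of a version such as v1.6.4 or 1.7.0.
  echo "$1" | sed -e 's/^v//' | cut -d. -f1,2
}

upgrade_supported() {
  # $1 = current cluster version, $2 = target version
  from=$(minor_of "$1")
  to=$(minor_of "$2")
  # The guide only covers clusters that are already on 1.6.x.
  if [ "$from" != "1.6" ]; then
    return 1
  fi
  # Targets may be another 1.6.x release or a 1.7.x release.
  case "$to" in
    1.6|1.7) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `upgrade_supported v1.6.4 v1.7.0 && echo ok` would print `ok`, while a 1.5.x source version makes the function return a non-zero status.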
24a1ad5 to d60ea72
I've addressed most of the feedback given, so please give this another look. I still intend to add explicit rollback instructions, and I'll also host developer builds of debs for the most recent 1.7.0 beta release so others can actually test the instructions and file bugs for any problems found.
I've built debs for
There are subfolders for the different architectures, so to grab all of the debs for, say, amd64, you can run:
When following the instructions, instead of running
If anyone wants to test the instructions using rpms, please let me know and I'll try to build those as well.
I've gone ahead and built rpms for testing as well:
d60ea72 to e597b03
a. On Debian, this can be accomplished with:

sudo apt-get update
sudo apt-get upgrade
I recommend just the update step; it will keep the legacy files around in case there are any.
I'm confused about the suggestion here. Are you suggesting to delete "sudo apt-get upgrade" and only leave the "sudo apt-get update" step? On Debian, that would only fetch the listing of new packages without upgrading any packages.
My Debian is rusty, but on Fedora/*EL you don't need the extra step, so I'll comment below.
Ah, got it. Will remove.
sudo yum update
sudo yum upgrade
You're missing a
sudo systemctl restart kubelet
Done!
e597b03 to 28deb88
CC @lukemarsden
3dd47e9 to a07dfce
We discussed this PR in this week's SIG Cluster Lifecycle meeting, since it is a hot topic for 1.7 burndown, and we were hoping it could get merged once a few people had tested the steps. If I can get ACKs from others that they've independently verified the process, will I need anything else other than a normal LGTM? I'm not sure how the "Needs Docs Review" label works. Also, could someone please remove the
should we say something about how you may be able to restore things?
sudo kubeadm init --skip-preflight-checks --kubernetes-version v1.7.0

5. Upgrade CNI provider.
The admin has to delete the kube-proxy pods as well (so they get recreated)...
We aren't using rolling updates for that DS yet
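One way the manual deletion could look is sketched below. It assumes the kube-proxy pods live in the `kube-system` namespace and carry the label `k8s-app=kube-proxy`; neither detail is stated in this thread, so check them against the cluster first. The `run` and `delete_kube_proxy_pods` helpers are hypothetical, and by default the command is only printed.

```shell
# Sketch only: delete the kube-proxy pods so the DaemonSet recreates them.
# Assumptions: namespace kube-system and label k8s-app=kube-proxy.

run() {
  # Print the command unless DRY_RUN is explicitly set to 0.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

delete_kube_proxy_pods() {
  # $1 = namespace (optional), $2 = label selector (optional)
  namespace="${1:-kube-system}"
  selector="${2:-k8s-app=kube-proxy}"
  run kubectl --namespace "$namespace" delete pods --selector "$selector"
}
```

Calling `delete_kube_proxy_pods` prints the command for review; setting `DRY_RUN=0` in the environment first makes it actually invoke `kubectl`.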
Good call-out. I'll add it to the doc and retest the steps.
Done. However, the 1.7.0-beta.2 version doesn't seem to have the `kube-proxy` regression fix, so the pod ends up crash looping. I tested with kubeadm built from the HEAD of `release-1.7` and the steps worked fine.

The next 1.7 beta release is due out today, so I'll be able to build new debs/rpms for it and post them for testing. I'll also start building them for the HEAD of `release-1.7` so people can still test these instructions ASAP.
That was my original intention, but I was taking @lukemarsden's advice and trying to start minimal with the documentation so that we could keep adding to it once it was checked in. Now that it's been in review for so long, and we're so close to the release, I'll hedge a bit and open a second PR based on top of this commit to add the rollback instructions.
a07dfce to 8967643
Repeating this comment so it's not hidden in the collapsed, outdated diff: I've added the step @luxas suggested for manually deleting
The next 1.7 beta release is due out today, so I'll be able to build new debs/rpms for it and post them for testing. I'll also start building them for the HEAD of release-1.7 so people can still test these instructions ASAP.
b. On CentOS/Fedora, you would instead run:

sudo yum update
sudo yum upgrade
You don't need the extra upgrade step here; update works just fine and leaves breadcrumb config files in case you clobber something.
Done.
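For reference, the per-node package steps as they stand after the review comments in this thread (`apt-get update` plus `apt-get upgrade` on Debian, a plain `yum update` on CentOS/Fedora, then a kubelet restart) could be captured in a small helper. This is only a sketch: `upgrade_commands` is a hypothetical name, and it prints the commands rather than running them so they can be reviewed first.

```shell
# Sketch only: print the package-upgrade commands discussed in this review
# for a given package manager, followed by the kubelet restart.

upgrade_commands() {
  # $1 = package manager: "apt-get" (Debian) or "yum" (CentOS/Fedora)
  case "$1" in
    apt-get)
      # update refreshes the package lists; upgrade installs new versions.
      echo "sudo apt-get update"
      echo "sudo apt-get upgrade"
      ;;
    yum)
      # Per the review, update alone is enough on CentOS/Fedora.
      echo "sudo yum update"
      ;;
    *)
      echo "unsupported package manager: $1" >&2
      return 1
      ;;
  esac
  # The kubelet has to be restarted to pick up the new binary.
  echo "sudo systemctl restart kubelet"
}
```

Running `upgrade_commands yum` prints the two commands for a CentOS/Fedora node, one per line; they can then be run by hand once they look right.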
8967643 to c766a90
For anyone wanting to test the instructions using Debian on amd64, I've created new .debs here:
Ping me when there's a tech lgtm and this is ready to be merged. I'll make sure it gets into the 1.7 release.
c766a90 to c1a4f51
I've added steps to upgrade the OS packages on each node (and restart kubelet), along with headings to make it clear what steps to perform where. Any other comments, @luxas or @timothysc?
Tech LGTM
We can iterate again if we find anything more when the RC is cut.
### On the master

1. Back up `/etc/kubernetes`.
Why should we have this if we don't say how to restore?
Good point. Ultimately, I'm hoping we can have backup, upgrade, and rollback instructions before the 1.7 release. I've already moved out the rollback instructions to a separate PR, so it probably makes sense to move the backup instructions there as well instead of leaving them dangling.
In-place upgrades are supported between 1.6 and 1.7 releases. Rollback instructions to come in a separate commit. Fixes kubernetes/kubeadm#278
c1a4f51 to 5946f21
* Minor fixes in the Deployment doc Signed-off-by: Michail Kargakis <mkargaki@redhat.com> * add NodeRestriction to admission-controllers (#3842) * Admins Can Configure Zones in Storage Class The PR #38505 (kubernetes/kubernetes#38505) added zones optional parameter to Storage Class for AWS and GCE provisioners. That's why documentation needs to be updated accordingly. * document custom resource definitions * add host paths to psp (#3971) * add host paths to psp * add italics * Update ConfigMap doc to explain TTL-based cache updates (#3989) * Update ConfigMap doc to explain TTL-based cache updates * swap word order Change "When a ConfigMap being already consumed..." to "When a ConfigMap already being consumed..." * Update NetworkPolicy docs for v1 * StorageOS Volume plugin * Update GPU docs * docs: HPA autoscaling/v2alpha1 status conditions This commit documents the new status conditions feature for HPA autoscaling/v2alpha1. It demonstrates how to get the status conditions using `kubectl describe`, and how to interpret them. * Update description about NodeRestriction kubelet node can alse create mirror pods for their own static pods. * adding storage as a supported resource to node allocatable Signed-off-by: Vishnu kannan <vishnuk@google.com> * Add documentation for podpreset opt-out annotation This adds the annotation for having the podpreset admission controller to skip (opt-out) manipulating the pod spec. Also, the annotation format for what presets have acted on a pod has been modified to add a prefix of "podpreset-". The new naming makes it such that there is no chance of collision with the newly introduced opt-out annotation (or future ones yet to be added). 
Opt-out annotation PR: kubernetes/kubernetes#44965 * Update PDB documentation to explain new field (#3885) * update-docs-pdb * Addressed erictune@'s comments * Fix title and add a TOC to the logging concept page * Patch #4118 for typos * Describe setting coredns server in nameserver resolv chain * Address comments in PR #3997. Comment is in https://github.com/kubernetes/kubernetes.github.io/pull/3997/files/f6eb59c67e28efc298c87b1ef49a96bc6adacd1e#diff-7a14981f3dd8eb203f897ce6c11d9828 * Update task for DaemonSet history and rollback (#4098) * Update task for DaemonSet history and rollback Also remove mentions of templateGeneration field because it's deprecated * Address comments * removed lt and gt as operators (#4152) * removed lt and gt as operators * replace lt and gt for node-affinfity * updated based on bsalamat review * Initial draft of upgrade guide for kubeadm clusters. In-place upgrades are supported between 1.6 and 1.7 releases. Rollback instructions to come in a separate commit. 
Fixes kubernetes/kubeadm#278 * Add local volume documentation (#4050) * Add local volume documentation * Add PV local volume example * Patch PR #3999 * Add documentation for Stackdriver event exporter * Add documentation about controller metrics * Federation: Add task for setting up placement policies (#4075) * Add task for setting up placement policies * Update version of management sidecar in policy engine deployment * Address @nikhiljindal's comments - Lower case filenames - Comments in policy - Typo fixes - Removed type LoadBalancer from OPA Service * Add example that sets cluster selector Per-@nikhiljindal's suggestion * Fix wording and templating per @chenopis * PodDisruptionBudget documentation Improvements (#4140) * Changes from #3885 Title: Update PDB documentation to explain new field Author: foxish * Added Placeholder Disruptions Concept Guide New file: docs/concepts/workloads/pods/disruptions.md Intented contents: concept for Pod Disruption Budget, cross reference to Eviction and Preemption docs. Linked from: concepts > workloads > pods * Added placeholder Configuring PDB Task New file: docs/tasks/run-application/configure-pdb.md Intented contents: task for writing a Pod Disruption Budget. Linked from: tasks > configuring-applications > configure pdb. * Add refs to the "drain a node" task. * Refactor PDB docs. Move the "Requesting an eviction" section from: docs/tasks/administer-cluster/configure-pod-disruption-budget.md -- which is going away -- to: docs/tasks/administer-cluster/safely-drain-node.md The move is verbatim, except for an introductory sentence. Also added assignees. * Refactor of PDB docs Moved the section: Specifying a PodDisruptionBudget from: docs/tasks/administer-cluster/configure-pod-disruption-budget.md to: docs/tasks/run-application/configure-pdb.md because that former file is going away. Move is verbatim. 
* Explain how Eviction tools should handle failures * Refactor PDB docs Move text from: docs/tasks/administer-cluster/configure-pod-disruption-budget.md to: docs/concepts/workloads/pods/disruptions.md Delete the now empty: docs/tasks/administer-cluster/configure-pod-disruption-budget.md Added a redirects_from section to the new doc, containing the path of the now-deleted doc, plus all the redirects from the deleted doc. * Expand PDB Concept guide Building on a little content from the old task, greatly expanded the Disruptions concept guide, including an abstract example. * Update creating a pdb Task. * Address review comments. * Fixed for all cody-clark's review comments * Address review comments from mml * Address review comments from maisem * Fix missing backtick * Api and Kubectl reference docs updates for 1.7 (#4193) * Fix includes groups * Generated kubectl docs for 1.7 * Generated references docs for 1.7 api * Document node authorization mode * API Aggregator (#4173) * API Aggregator * Additional bullet points * incorporated feedback for apiserver-aggregation.md * split setup-api-aggregator.md into two docs and address feedback * fix link * addressed docs feedback * incorporate feedback * integrate feedback * Add documentation for DNS stub domains (#4063) * Add documentation for DNS stub domains * add additional prereq * fix image path * review feedback * minor grammar and style nits * documentation for using hostAliases to manage hosts file (#4080) * documentation for using hostAliases to manage hosts file * add to table of contents * review comments * update the right command to see hosts file * reformat doc based on suggestion and change some wording * Fix typo for #4080 * Patch PR #4063 * Fix wording in placement policy task introduction * Add update to statefulset concepts and basic tutorial (#4174) * Add update to statefulset concpets and basic tutorial * Address tech comments. 
* Update ESIPP docs for new added API fields * Custom resource docs * update audit document with advanced audit features added in 1.7 * kubeadm v1.7 documentation updates (#4018) * v1.7 updates for kubeadm * Address review comments * Address Luke's comments * Encrypting secrets at rest and cluster security guide * Edits for Custom DNS Documentation (#4207) * reorganize custom dns doc * format fixes * Update version numbers to 1.7 * Patch PR #4140 (#4215) * Patch PR #4140 * fix link and typos * Update PR template * Update TLS bootstrapping with 1.7 features This includes documenting the new CSR approver built into the controller manager and the kubelet alpha features for certificate rotation. Since the CSR approver changed over the 1.7 release cycle we need to call out the migration steps for those using the alpha feature. This document as a whole could probably use some updates, but the main focus of this PR is just to get these features minimally documented before the release. * Federated ClusterSelector formatting updates from review * complete PR #4181 (#4223) * complete PR #4181 * fix security link * Extensible admission controller (#4092) * extensible-admission-controllers * Update extensible-admission-controllers.md * more on initializers * fixes * Expand external admission webhooks documentation * wrap at 80 chars * more * add reference * Use correct apigroup for network policy * Docs changes to PR #4092 (#4224) * Docs changes to PR #4092 * address feedback * add doc for --as-group in cli Add doc for this pr: kubernetes/kubernetes#43696
This is a work-in-progress guide for upgrading kubeadm clusters from 1.6.x to 1.7.x.
Based on SIG feedback, I'm aiming for an initial MVP to get committed so that others can help iterate on it.
Fixes kubernetes/kubeadm#278
@kubernetes/sig-cluster-lifecycle-pr-reviews