Fix safe upgrade #2256
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Wrote an email with issues regarding CLA verification, but perhaps you can check the request while it is being fixed.
How is the initial installation working without a token?
@ant31 so before this commit the process is as follows:
With this commit, we can actually remove it. I will update that piece of code, test it on the cluster again, and come back.
Force-pushed from 8bfd885 to ea5d8de
@ant31 I updated the code and here is the full testing process. Deployment to clean VMs:
See the cluster update in progress, one node at a time:
I checked the code, and the uncordon code is executed only for worker nodes, so I fixed that as well, since it is part of the safe upgrade process. To double-check that it wasn't some unlucky error with cordoning, I ran just the pre-upgrade and post-upgrade tags (which perform the cordon and uncordon actions) on the master nodes:
Master nodes were still reporting fine. Please let me know if I can improve it further or whether it can be merged.
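The uncordon fix described above could be sketched as an Ansible task along these lines. This is a hedged illustration, not the PR's exact code: the task name, the `needs_cordoning` variable, and the use of `bin_dir` are assumptions.

```yaml
# Hypothetical sketch: uncordon every upgraded node, masters included,
# instead of limiting the task to members of the worker group.
- name: Uncordon node after upgrade
  command: "{{ bin_dir }}/kubectl uncordon {{ inventory_hostname }}"
  delegate_to: "{{ groups['kube-master'][0] }}"
  when: needs_cordoning | default(false)  # assumed flag set during cordon step
```

Delegating to the first master means `kubectl` only needs a working kubeconfig there, while `inventory_hostname` still iterates over every node in the play.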
@mlushpenko thanks for the detailed explanation and all the tests you ran!
ci check this
```yaml
run_once: true
register: temp_token
delegate_to: "{{ groups['kube-master'][0] }}"
```
30:1 error trailing spaces (trailing-spaces)
https://gitlab.com/kargo-ci/kubernetes-incubator__kubespray/-/jobs/51312023
Force-pushed from ea5d8de to 56b311c
@ant31 sorry, I wasn't using a proper editor :) please check now
Hi @ant31, any update on this? I know you may be busy, but it could also be that you just missed my previous notification among all the others.
Hi @mlushpenko, can you please update the example inventory kubeadm group_vars?
@chapsuk done, anything else?
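A hedged sketch of what that group_vars update might contain. The file path and the `kubeadm_enabled` toggle are assumptions for illustration; only `kubeadm_token_ttl` is taken from this PR's discussion.

```yaml
# Hypothetical example inventory fragment (path assumed for illustration):
# inventory/group_vars/k8s-cluster.yml
kubeadm_enabled: true   # assumption: toggle for kubeadm-based deployment
kubeadm_token_ttl: 0    # 0 means the join token should never expire
```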
Sorry @mlushpenko, it needs a rebase to solve the conflict, and I'll get to it right after.
Thanks!
thanks @ant31, looking forward :)
@mlushpenko the merge is blocked because inventory/group_vars/all.yml is in conflict
@ant31 yes, I get it, but I can't do it from my side and just need to wait until you do the rebase. No problem.
@mlushpenko the rebase has to be done on your branch.
Even though `kubeadm_token_ttl=0` is set, which means the kubeadm token should never expire, the token is not present in `kubeadm token list` after the cluster is provisioned (at least after it has been running for some time). There is an open issue about this, kubernetes/kubeadm#335, so we need to create a new temporary token during the cluster upgrade.
Tokens are generated automatically during the init process and on demand when nodes join.
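A minimal sketch of such a temporary-token task, assuming kubespray's usual `bin_dir` variable; the exact command, TTL value, and task name are assumptions, not the PR's literal code.

```yaml
# Hypothetical sketch: create a short-lived token on the first master
# and reuse the registered result when joining nodes during the upgrade.
- name: Create temporary token for joining nodes
  command: "{{ bin_dir }}/kubeadm token create --ttl 1h"
  register: temp_token
  run_once: true
  delegate_to: "{{ groups['kube-master'][0] }}"
```

`run_once` plus `delegate_to` ensures a single token is created on the first master and shared across all joining hosts, rather than one token per node.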
Force-pushed from 08fedce to a37c642
@ant31 thanks, my first PR as you may have guessed..
thanks :)
**Problem**

A kubespray cluster has been running for some time and you want to safely upgrade it to a newer version using `upgrade_cluster.yml`. It will fail during `[kubernetes/kubeadm : Join to cluster if needed]` with an error.

**Expected result**

`kubeadm join` should succeed, as `kubeadm_token_ttl` is set to 0, which means the token should never expire; however, the token is not present in `kubeadm token list` after the cluster is provisioned (at least after it has been running for some time).

**Related issues**

kubernetes/kubeadm#335
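As background, kubeadm bootstrap tokens have the documented form `[a-z0-9]{6}.[a-z0-9]{16}` (an ID part and a secret part separated by a dot). A quick local check of that format, using a placeholder value rather than a real token:

```shell
# Kubeadm bootstrap tokens have the documented form [a-z0-9]{6}.[a-z0-9]{16}.
token="abcdef.0123456789abcdef"  # placeholder value, not a real token
if printf '%s\n' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'; then
  echo "valid token format"
else
  echo "invalid token format"
fi
```

Any token listed by `kubeadm token list` should match this pattern, which can be handy when debugging join failures like the one above.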
**Solution**

Create a new temporary token before the `kubeadm join` command.

**Refactoring issues**

Not sure what to do with `kubeadm_token` and `kubeadm_token_ttl`, which are defined in `roles/kubespray-defaults/defaults/main.yml`. The code I added doesn't really break anything, as far as I tested, but it looks like `kubeadm_token_ttl` is not respected, so perhaps it can be removed. `kubeadm_token` is also used for the master config, so it can stay untouched, but it is a bit weird that this token is then not used during `kubeadm join`, because I override it with the newly generated one. Please suggest if you have ideas on how to optimize this.
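For reference, the defaults in question might look roughly like the fragment below. This is a hypothetical sketch, not the file's actual contents; the token value is a placeholder in the documented bootstrap-token format.

```yaml
# Hypothetical sketch of the relevant entries in
# roles/kubespray-defaults/defaults/main.yml (values are illustrative):
kubeadm_token: "abcdef.0123456789abcdef"  # placeholder, not a real token
kubeadm_token_ttl: 0                      # 0 = token should never expire
```

If `kubeadm_token_ttl` is indeed ignored in practice (per kubernetes/kubeadm#335), removing it here would at least stop it from suggesting behavior the cluster doesn't honor.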