Improve etcd upgrade/downgrade policy and tests #9306

Closed
gyuho opened this issue Feb 8, 2018 · 5 comments

Comments

@gyuho
Contributor

gyuho commented Feb 8, 2018

We don't have enough test coverage for upgrades, and none at all for downgrades. The only test case is an upgrade from the latest release to the master branch (https://github.com/coreos/etcd/blob/master/e2e/etcd_release_upgrade_test.go), where we stop etcd and restart it with the new version (master branch) in CI.

  • Clearly document compatibility between different versions
  • Terminate early (or warn) on unsafe upgrades/downgrades
  • Add more test cases (or document the behavior)
    • What if a newer version of etcd joins an older-versioned cluster, and vice versa?
    • What if a newer version of etcd reboots from a snapshot fetched from an older-versioned etcd cluster?
    • Downgrading cluster may corrupt data #6457
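
To make the "terminate early (or warn)" item concrete, here is a minimal sketch of a pre-start safety check. It is not etcd code: `parseVersion` and `checkTransition` are hypothetical names, and the policy (same major version, minor version unchanged or exactly one higher, no downgrades) is an assumption based on the one-minor-at-a-time upgrade recommendation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version holds the major and minor components of a release.
// The patch level is ignored because the assumed policy only
// looks at major and minor.
type version struct {
	major, minor int
}

// parseVersion reads "3.1.11" or "3.1" into a version.
func parseVersion(s string) (version, error) {
	parts := strings.Split(s, ".")
	if len(parts) < 2 {
		return version{}, fmt.Errorf("malformed version %q", s)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return version{}, fmt.Errorf("bad major in %q: %v", s, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return version{}, fmt.Errorf("bad minor in %q: %v", s, err)
	}
	return version{major: major, minor: minor}, nil
}

// checkTransition returns nil when moving from the cluster version
// `from` to the binary version `to` is allowed under the assumed
// policy, and a descriptive error otherwise.
func checkTransition(from, to string) error {
	f, err := parseVersion(from)
	if err != nil {
		return err
	}
	t, err := parseVersion(to)
	if err != nil {
		return err
	}
	switch {
	case f.major != t.major:
		return fmt.Errorf("unsafe: major version change %s -> %s", from, to)
	case t.minor < f.minor:
		return fmt.Errorf("unsafe: downgrade %s -> %s", from, to)
	case t.minor > f.minor+1:
		return fmt.Errorf("unsafe: upgrade %s -> %s skips a minor version", from, to)
	}
	return nil
}
```

Under this assumed policy, `checkTransition("3.0.17", "3.1.11")` succeeds, while `checkTransition("3.0.17", "3.2.0")` and `checkTransition("3.2.0", "3.1.0")` return errors that a server could log as a warning or treat as fatal at startup.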

ref. #7308

/cc @jpbetz @SaranBalaji90

@gyuho gyuho changed the title Improve etcd upgrade policy/tests Improve etcd upgrade/downgrade policy/tests Feb 8, 2018
@gyuho gyuho changed the title Improve etcd upgrade/downgrade policy/tests Improve etcd upgrade/downgrade policy and tests Feb 8, 2018
@SaranBalaji90

SaranBalaji90 commented Feb 9, 2018

I was able to upgrade my cluster by performing a rolling update. I initially had a 3-node cluster running version 3.0.17 and upgraded it to 3.1.11 by removing the old nodes one by one and adding a new node each time. It seems to be working, but I have yet to run e2e tests on the new cluster.

Do we have a list of the things etcd does when you drop in a new binary and restart etcd with it?
Also, I'm curious: when upgrading from 3.0 to 3.2, why do we recommend upgrading to 3.1 first and then to 3.2? Is it because etcd changes some underlying schema that restricts this upgrade, or is it just that we haven't tested it yet?

@wenjiaswe
Contributor

wenjiaswe commented Oct 16, 2018

I am working on the "etcd downgrade design" documentation. The basic idea is to add an "etcdctl downgrade --target-version" command that initiates the downgrade process, which enables the cluster to allow member replacement with the target version. More details can be found in the design doc. @gyuho, @xiang90, @jpbetz and @jingyih have reviewed the design doc. I also posted a topic in etcd-dev.

  • add “downgrade” API, to temporarily whitelist lower versions in cluster version check.
  • add basic tests.
  • add other “downgrade” APIs (such as status, cancel).
  • implement unknown WAL log entry handling code.
  • more tests.
  • add downgrade guides (including developer responsibilities).
  • commit final design doc to /docs.
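
The first item in the list above (temporarily whitelisting lower versions in the cluster version check) could look roughly like the sketch below. This is an illustration only, not the design doc's implementation: `minorVersion`, `versionGate`, `EnableDowngrade`, `CancelDowngrade`, and `Allows` are hypothetical names, and the rule that only the exact target version is whitelisted is an assumption.

```go
package main

import (
	"errors"
	"sync"
)

// minorVersion identifies a release by major and minor number.
type minorVersion struct {
	Major, Minor int
}

// less reports whether v is strictly lower than o.
func (v minorVersion) less(o minorVersion) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// versionGate models the cluster version check. Normally a member
// must run at least the cluster version; while a downgrade is
// enabled, members at the target version are accepted as well.
type versionGate struct {
	mu      sync.Mutex
	cluster minorVersion
	target  *minorVersion // nil when no downgrade is in progress
}

// EnableDowngrade whitelists target for joining members. It rejects
// targets that are not strictly lower than the cluster version.
func (g *versionGate) EnableDowngrade(target minorVersion) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !target.less(g.cluster) {
		return errors.New("downgrade target must be lower than the cluster version")
	}
	g.target = &target
	return nil
}

// CancelDowngrade removes the temporary whitelist entry.
func (g *versionGate) CancelDowngrade() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.target = nil
}

// Allows reports whether a member at the given version passes the check.
func (g *versionGate) Allows(member minorVersion) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !member.less(g.cluster) {
		return true // same or newer than the cluster version
	}
	return g.target != nil && member == *g.target
}
```

With a gate at cluster version 3.5, a 3.4 member is rejected until `EnableDowngrade(minorVersion{3, 4})` is called, and is rejected again after `CancelDowngrade()`. Keeping the whitelist as explicit, cancellable state is what would make the later "status" and "cancel" APIs straightforward to add.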

@knisbet Kevin, would you please kindly take a look at the design? We know that you have a lot of valuable experience in this area. Your input will be appreciated.

@wenjiaswe
Contributor

cc @YoyinZyc

@tangcong
Contributor

#11689 upgrading cluster may cause data corruption.

@stale

stale bot commented Jun 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 9, 2020
@stale stale bot closed this as completed Jul 1, 2020