support seamless rollback 1 minor revision #7308

Closed
mml opened this issue Feb 11, 2017 · 16 comments

mml commented Feb 11, 2017

Motivation

This is a feature request to make rolling out upgrades safer, specifically for kubernetes cluster admins, but this could easily apply to others. It's often the case that a user may upgrade k8s from N-1 to N. Let's say that we'd like N to also include a change of the bundled etcd from M-1 to M.

k8s vN may include bugs that aren't acceptable. The easiest resolution is usually to roll back to N-1, and to avoid having to test N-1 against two versions of etcd (M-1 and M), we'd prefer that etcd be rolled back to M-1. Another reason we may wish to go back to M-1 is that the source of the bug is etcd itself.

Request

Ideally, the storage format changes between adjacent revisions are such that whatever M is writing to disk is by-design readable by M-1. It may contain new features, but they will be harmlessly ignored in the event of a rollback. One example of an encoding that works like this is proto2, which ignores unknown fields. (This isn't to advocate proto2, just to give an example.)
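To make the property concrete, here is a minimal, hypothetical Go sketch. It uses encoding/json purely as a stand-in for an encoding that tolerates unknown fields (it is not etcd's actual on-disk format): data written by M with an extra field is still decodable by M-1, which simply drops the field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical record as version M writes it: it carries a new field
// that version M-1 has never heard of.
type recordV2 struct {
	Key      string `json:"key"`
	Value    string `json:"value"`
	NewField string `json:"new_field"` // introduced in M
}

// The same record as version M-1 understands it.
type recordV1 struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

func main() {
	// Version M writes the record, including the new field.
	written, _ := json.Marshal(recordV2{Key: "foo", Value: "bar", NewField: "dog"})

	// After a rollback, version M-1 reads the same bytes. The unknown
	// field is ignored rather than causing a decode error.
	var old recordV1
	if err := json.Unmarshal(written, &old); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", old) // {Key:foo Value:bar}
}
```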

That said, even if etcd stabilizes on an encoding with these properties, it might want to change that encoding scheme in the future. In that case it makes sense to treat both upgrade and downgrade equally, authoring and testing tools that go in both directions, and testing round-tripping in both directions.

@xiang90 @hongchaodeng @heyitsanthony @wojtek-t


mml commented Feb 11, 2017

@xiang90 asked some questions in another thread.

  1. What kind of rollback (online or offline) do you expect?

If I understand the question correctly, online is best. It's ideal if the changes made to the data on disk don't render that data unreadable by the previous version.

  2. What do you want to preserve after a rollback? Should data used by a new feature be discarded or preserved (so that after upgrading again, the new feature's data is available again)?

What's an example of a feature where this question would come up? I think the answer is to consider the safety and semantics users will expect around rollback of the specific feature, but maybe we can come up with a general principle.

  3. What happens if only part of the cluster is rolled back?

What happens today if only a part of the cluster is upgraded? One idea is that nothing happens unless a quorum can be found all sharing the same version.


xiang90 commented Feb 12, 2017

@mml

Ideally, the storage format changes between adjacent revisions are such that whatever M is writing to disk is by-design readable by M-1.

We already ensure this today. The on-disk entry format is protobuf. Reading an entry with the previous protobuf definitions will result in some unknown fields, but will not cause a panic.

but they will be harmlessly ignored in the event of a rollback.

What does this mean exactly? Say in version M+1 we introduce a new field called "Dog". Once we roll back to version M, the "Dog" field will magically disappear. This is probably OK for non-clustered systems. But for etcd, this means you can get the "Dog" field from a node running version M+1, but not from a node rolled back to M. What is even more magical is that, if you upgrade the node again, "Dog" will appear again. Clients will see inconsistent state depending on which node they talk to.

That said, even if etcd stabilizes on an encoding with these properties, it might want to change that encoding scheme in the future.

I am not super worried about this. Rewriting the WAL or even the DB is not hard.


xiang90 commented Feb 12, 2017

What happens today if only a part of the cluster is upgraded?

This is very different: the cluster cannot use ANY new feature until ALL members are upgraded.

Here is an example:

[3.0, 3.0, 3.0] -> only 3.0 entries are written and only 3.0 features are enabled
[3.1, 3.1, 3.0] -> one node is still on 3.0, so only 3.0 entries are written and only 3.0 features are enabled
[3.1, 3.1, 3.1] -> 3.1 is enabled!

If you only downgrade one member, it breaks this rule and clients will unavoidably see inconsistent state.

T1 [3.1, 3.1, 3.1] -> 3.1 is enabled!
T2 [3.1, 3.1, 3.0] -> 3.1 is disabled since we detect a 3.0 member back online.

However, T2 becomes a magic timestamp... Clients will get confused... Why is a field suddenly cleared?!
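A minimal sketch of that rule, using a hypothetical helper rather than etcd's actual implementation: the cluster-wide feature level is the minimum version across members, so a single downgraded (or not-yet-upgraded) member pins the whole cluster.

```go
package main

import "fmt"

// memberVersion is a simplified, hypothetical stand-in for a member's
// advertised server version (major.minor only).
type memberVersion struct {
	Major, Minor int
}

func (a memberVersion) lessThan(b memberVersion) bool {
	if a.Major != b.Major {
		return a.Major < b.Major
	}
	return a.Minor < b.Minor
}

// clusterVersion returns the lowest version among all members; the
// cluster only enables features of this version, per the rule above.
func clusterVersion(members []memberVersion) memberVersion {
	min := members[0]
	for _, m := range members[1:] {
		if m.lessThan(min) {
			min = m
		}
	}
	return min
}

func main() {
	fmt.Println(clusterVersion([]memberVersion{{3, 1}, {3, 1}, {3, 1}})) // {3 1}: 3.1 features enabled
	fmt.Println(clusterVersion([]memberVersion{{3, 1}, {3, 1}, {3, 0}})) // {3 0}: back to 3.0-only after one downgrade
}
```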

I would love to hear how similar systems handle the downgrade path. Any open-source clustered system or Google-internal examples would be super helpful.

@xiang90 xiang90 self-assigned this Feb 12, 2017

mml commented Feb 14, 2017

What are some concrete examples of user-visible fields that etcd has introduced or plans to introduce?


xiang90 commented Feb 14, 2017

What are some concrete examples of user-visible fields that etcd has introduced or plans to introduce?

For example, we introduced the PrevKV field to return the previous value of a key when modifying it, which can save one round trip. We introduced some fields in the range request for querying revision ranges. There are more fields we might introduce in the future. There might also be new APIs, like a native locking API.
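For reference, this is roughly how a client opts into PrevKV with the Go clientv3 package (a sketch: the import path varies by etcd release, and a reachable endpoint at localhost:2379 is assumed).

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3" // path differs on older releases
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx := context.Background()
	if _, err := cli.Put(ctx, "foo", "v1"); err != nil {
		panic(err)
	}

	// WithPrevKV asks the server to return the key-value pair as it was
	// before this modification, saving a separate read round trip.
	resp, err := cli.Put(ctx, "foo", "v2", clientv3.WithPrevKV())
	if err != nil {
		panic(err)
	}
	if resp.PrevKv != nil {
		fmt.Printf("previous value: %s\n", resp.PrevKv.Value)
	}
}
```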

@heyitsanthony heyitsanthony added this to the v3.2.0 milestone Feb 14, 2017

xiang90 commented Feb 27, 2017

@mml Any update on this issue?


xiang90 commented Mar 22, 2017

@mml

I do not think this can happen in the 3.2 timeframe, but I REALLY want to sort this out to make Kubernetes users happy. Please ping us when you have time.

I am looping some other people in since I feel they are interested in this as well.

/cc @justinsb @wojtek-t @timothysc


xiang90 commented Oct 4, 2017

Moving this to 3.4.


gyuho commented Sep 25, 2018

Updates: @wenjiaswe is working on this. We are reviewing the design doc.


philips commented Oct 23, 2018

Where is the doc?

@wenjiaswe

@philips
Here is the etcd downgrade design doc.
Here is the tracking issue on etcd: Improve etcd upgrade/downgrade policy and tests.
And here is the link to the etcd-dev topic: etcd downgrade design document ready for review.

Any suggestions or comments are welcome and appreciated!

@gyuho gyuho added this to the etcd-v3.5 milestone Aug 5, 2019

wenjiaswe commented Oct 10, 2019

assign @YoyinZyc


stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@stale stale bot closed this as completed Apr 28, 2020

zaisongz commented Sep 2, 2020

@wenjiaswe I could not open the downgrade document. Do you have a new link?

@wenjiaswe

@zaisongz I just checked the design doc linked here: https://docs.google.com/document/d/1mSihXRJz8ROhXf4r5WrBGc8aka-b8HKjo2VDllOc6ac/edit#heading=h.e4jdx621yd8s. It is shared with "Anyone on the internet with this link". Could you check again and let me know if you still can't open it?


zaisongz commented Sep 2, 2020

@wenjiaswe Just realized it was caused by my proxy settings; everything is good now. Thanks a lot for your quick response.
