
Revert "Switch to etcd snap to 3.2" #118

Merged (1 commit), Apr 6, 2018

Conversation

@Cynerva (Contributor) commented Apr 6, 2018

Reverts #115

Occasionally seeing etcd broken after a charm upgrade to edge:

etcd/0*                   active    idle   1        18.233.158.232  2379/tcp  Errored with 0 known peers
etcd/1                    active    idle   2        54.242.230.117  2379/tcp  Errored with 0 known peers
etcd/2                    active    idle   3        52.54.208.170   2379/tcp  Errored with 0 known peers

This is happening because etcd is automatically and incorrectly upgraded to v3.2.

etcd is in a crash loop, logging this:

panic: recovering backend from snapshot error: database snapshot file path error: snap: snapshot file doesn't exist

This doc indicates that you should upgrade to 3.0 first, and not upgrade to 3.2 until you have v3 data: https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md

NOTE: When migrating from v2 with no v3 data, etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 ETCD_DATA_DIR/member/snap/db file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. db file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
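The condition described in the note above can be checked before upgrading: etcd v3.2+ only restores safely from existing snapshots when the v3 bolt database file exists at `$ETCD_DATA_DIR/member/snap/db`. A minimal sketch of such a pre-upgrade check is below; the default data-dir path `/var/lib/etcd` is an assumption for illustration, not taken from the charm.

```shell
#!/bin/sh
# Hedged sketch: verify that v3 data exists before upgrading an etcd member
# past v3.0. The default path below is an assumed example, not the charm's.
ETCD_DATA_DIR="${ETCD_DATA_DIR:-/var/lib/etcd}"

if [ -f "$ETCD_DATA_DIR/member/snap/db" ]; then
    # v3 bolt db present: the v3.2+ snapshot-restore path has data to use.
    echo "v3 data present: safe to upgrade past v3.0"
else
    # No v3 db: v3.2+ would panic on restore from v2-era snapshots.
    echo "no v3 data: upgrading to v3.2+ from here will panic on restore"
fi
```

Running this on each member before a snap refresh would have flagged the clusters that were still on v2-only data.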

This issue indicates the same: etcd-io/etcd#9480

We need to revert to 2.3 as the default so that charm upgrades don't result in a broken cluster. We will have to come up with a new way to make 3.2 the default on new deployments without breaking upgrades.

@kwmonroe kwmonroe merged commit a54b6a2 into master Apr 6, 2018
@kwmonroe kwmonroe deleted the revert-115-feature/etcd32 branch April 6, 2018 19:26
@ktsakalozos (Member) commented
I am afraid it is not only the upgrade path to 3.2 that is unstable.

We have seen incidents where non-upgraded 3.2 releases misbehave: https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/541 (builds in our CI also show some flaky behaviour).

Should we also revert canonical/etcd-snaps@9e52ed5 @wwwtyro ?

@wwwtyro (Contributor) commented Apr 10, 2018

I'm not sure I see why it'd be causing a problem with new installations, but yeah, we should remove the upgrade from everything after 3.1, so we should revert these:

I'll make the reversion PRs.

@wwwtyro (Contributor) commented Apr 10, 2018

@Cynerva pointed out we should also revert the 3.1 change, so I added that one as well.
