Add validation for supported index version on node join, restore, upgrade & open index #21830
Conversation
Today we can easily join a cluster that holds an index we don't support since we currently allow rolling upgrades from 5.x to 6.x. This commit adds additional safety that fails cluster state validation if there is an open index with an incompatible index version created in the cluster. Relates to elastic#21670
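For readers skimming the diff, here is a minimal standalone sketch of the kind of check this PR describes, assuming the usual IndexMetaData/Version accessors; the class and method names (IndexCompatibilityCheck, ensureIndexCompatibility) are illustrative and not the PR's actual code:

```java
import org.elasticsearch.Version;
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.cluster.metadata.MetaData;

final class IndexCompatibilityCheck {
    // Fail validation if the cluster metadata contains an open index whose creation
    // version is older than the minimum index version the cluster can still support.
    static void ensureIndexCompatibility(MetaData metaData, Version minIndexCompatibilityVersion) {
        for (IndexMetaData indexMetaData : metaData) {
            boolean tooOld = indexMetaData.getCreationVersion().before(minIndexCompatibilityVersion);
            if (tooOld && indexMetaData.getState() == IndexMetaData.State.OPEN) {
                throw new IllegalStateException("index [" + indexMetaData.getIndex()
                    + "] was created with version [" + indexMetaData.getCreationVersion()
                    + "] but the minimum compatible index version is ["
                    + minIndexCompatibilityVersion + "]");
            }
        }
    }
}
```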
@bleskes can you take a look?
The code LGTM and does what was, I think, intended.
However I have some concerns about the approach:
- Why allow closed indices? Someone can just open them and then we're in trouble, no? I think it's better to prevent joining and give people the option to open the indices and snapshot them/reindex them/delete them.
- I think we also need to deal with dangling indices.
- Because of dangling indices, we need to do this validation on the cluster state update thread. We can maybe "preflight" it before the node join, but we can still end up in trouble (and the preflight can happen on the master; there is no need to go back to the node for this).
An alternative to all of this (but more complex) is to have allocation deciders prevent these indices from being allocated to 6.x nodes and auto-close them as soon as a 6.x master is elected. Although the question then becomes: what can people do once they are closed? We don't allow downgrading a data folder anyway (as we may have re-written the state).
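For illustration only, a tiny plain-Java model of the allocation rule in that alternative (this is not Elasticsearch's AllocationDecider API, and the "major N reads indices created in major N-1 or later" rule is a simplification for the example):

```java
// Plain-Java model of the alternative, not Elasticsearch code: keep too-old indices
// off nodes that cannot read them.
final class OldIndexAllocationRule {
    // Assumption for this example: a node of major version N reads indices created in major N-1 or later.
    static int minCompatibleIndexMajor(int nodeMajor) {
        return nodeMajor - 1;
    }

    static boolean canAllocate(int indexCreatedMajor, int nodeMajor) {
        return indexCreatedMajor >= minCompatibleIndexMajor(nodeMajor);
    }

    public static void main(String[] args) {
        System.out.println(canAllocate(5, 6)); // true: a 5.x index can live on a 6.x node
        System.out.println(canAllocate(2, 6)); // false: a 2.x index must stay off 6.x nodes
    }
}
```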
I completely agree, I will remove that. I think this is not a complete solution but rather a building block towards a more complete solution. The dangling indices case should be done as well, and I think we should just not import dangling indices if we have a node in the cluster that doesn't support them, but that is a different change/PR - OK with that?
I think we can't do that, since we then need to be able to understand mappings etc. for these indices even though we don't support the index.
Sure thing. Do note that we have to do this join validation on the cluster state thread for it to be effective.
I agree it's a rabbit hole. I tend to say "no mixed version clusters unless all metadata/data is supported by both versions".
@bleskes pushed a new commit
@bleskes are you OK with pushing this?
@s1monw sorry - did some other stuff. Did you see my comment about doing this on the cluster state thread? To be clear - on the master
yeah, but I thought we'd do this in another PR?
…tate thread to ensure we don't have a race
@bleskes I pushed new commits
Thx @s1monw. LGTM
@elasticmachine test this please
@bleskes I had to add some more stuff to ensure we are sane in a bunch of other places, can you take another look?
@elasticmachine test this please
Left some minor suggestions. The main one is how we treat coordinating nodes. Last, I think the PR title is out of date now? It's more than just join?
@@ -160,12 +161,14 @@ public ClusterState execute(ClusterState currentState) {
MetaData.Builder mdBuilder = MetaData.builder(currentState.metaData());
ClusterBlocks.Builder blocksBuilder = ClusterBlocks.builder()
.blocks(currentState.blocks());
Version minIndexCompatibilityVersion = currentState.getNodes().getSmallestNonClientNodeVersion()
hmm... I'm wondering if we should exclude the non-data non-master nodes here. They do all the coordination and might rely on some metadata they can't digest? We still have to keep the transport client around, but I don't think we should exclude ClientNode (now called coordinating nodes)?
This holds for other places in the code as well...
I don't think coordinating nodes do rely on any metadata they can't digest. They don't parse mappings afaik (I am convinced they should not), which is the main thing here, so I think we are good.
The actual question is why we use the smallest; we have to use the largest node version in the cluster, since that is the one relevant for which index versions can still be opened.
yikes. Of course.
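To spell out the reasoning with a plain-Java illustration (not the Elasticsearch API; the "major N reads indices created in major N-1 or later" rule is a simplification for the example): the newest node in the cluster has the strictest lower bound on which index-creation versions it can still read, so that node's bound is the one the validation must enforce.

```java
import java.util.List;

final class ClusterIndexCompatibility {
    // Example rule: a node of major version N reads indices created in major N-1 or later.
    static int minCompatibleIndexMajor(int nodeMajor) {
        return nodeMajor - 1;
    }

    static int clusterMinCompatibleIndexMajor(List<Integer> nodeMajors) {
        // The newest node decides: using the smallest node version here would wrongly
        // accept indices that the newest node cannot open.
        int newestNode = nodeMajors.stream().max(Integer::compare).orElseThrow();
        return minCompatibleIndexMajor(newestNode);
    }

    public static void main(String[] args) {
        // Mixed 5.x/6.x cluster: a 2.x-created index must be rejected even though
        // the 5.x node alone could still read it.
        System.out.println(clusterMinCompatibleIndexMajor(List.of(5, 6))); // 5
    }
}
```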
@@ -92,21 +92,22 @@ boolean isUpgraded(IndexMetaData indexMetaData) {
}

/**
* Elasticsearch 5.0 no longer supports indices with pre Lucene v5.0 (Elasticsearch v2.0.0.beta1) segments. All indices
* that were created before Elasticsearch v2.0.0.beta1 should be reindexed in Elasticsearch 2.x
* Elasticsearch 6.0 no longer supports indices with pre Lucene v5.0 (Elasticsearch v5.0.0.beta1) segments. All indices
nit: we lead with ES and make Lucene second? It's a bit weird now, as ES v5 uses Lucene 6.0. How about "ES 6.0 no longer supports indices created by an ES version earlier than 5.0" and leave it at that?
seriously?
private void checkSupportedVersion(IndexMetaData indexMetaData, Version minimumIndexCompatibilityVersion) {
if (indexMetaData.getState() == IndexMetaData.State.OPEN && isSupportedVersion(indexMetaData,
minimumIndexCompatibilityVersion) == false) {
throw new IllegalStateException("The index [" + indexMetaData.getIndex() + "] was created before v5.0.0.beta1."
Any way we could bake the minimumIndexCompatibilityVersion into the message? I'm afraid we'll forget in the future and the error message will be wrong.
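Roughly what that suggestion might look like (a sketch against the method shown in the diff above, assuming the surrounding class and its isSupportedVersion helper; this is not the merged code):

```java
private void checkSupportedVersion(IndexMetaData indexMetaData, Version minimumIndexCompatibilityVersion) {
    if (indexMetaData.getState() == IndexMetaData.State.OPEN
            && isSupportedVersion(indexMetaData, minimumIndexCompatibilityVersion) == false) {
        // Bake the actual minimum version into the message so it stays correct after future major bumps.
        throw new IllegalStateException("The index [" + indexMetaData.getIndex()
            + "] was created with version [" + indexMetaData.getCreationVersion()
            + "] but the minimum compatible version is [" + minimumIndexCompatibilityVersion
            + "]. It should be re-indexed in a compatible version before upgrading.");
    }
}
```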
@@ -98,7 +98,7 @@ public void testRestoreOldSnapshots() throws Exception {
for (Version v : VersionUtils.allReleasedVersions()) {
if (VersionUtils.isSnapshot(v)) continue; // snapshots are unreleased, so there is no backcompat yet
if (v.isRelease() == false) continue; // no guarantees for prereleases
if (v.onOrBefore(Version.V_2_0_0_beta1)) continue; // we can only test back one major lucene version
if (v.onOrBefore(Version.V_5_0_0_beta1)) continue; // we can only test back one major lucene version
Wondering - why didn't you go with Version.CURRENT.minimumIndexCompatibilityVersion() here?
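A sketch of that suggestion against the test loop shown above (not the committed test; the exact boundary check would need to match the intended cutoff):

```java
for (Version v : VersionUtils.allReleasedVersions()) {
    if (VersionUtils.isSnapshot(v)) continue;  // snapshots are unreleased, so there is no backcompat yet
    if (v.isRelease() == false) continue;      // no guarantees for prereleases
    // derive the cutoff from the current version instead of hard-coding a constant,
    // so the test keeps working after the next major version bump
    if (v.before(Version.CURRENT.minimumIndexCompatibilityVersion())) continue;
    // ... restore and verify the old snapshot created by version v ...
}
```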
@@ -96,7 +97,7 @@ public void testFailUpgrade() {
.put(IndexMetaData.SETTING_VERSION_CREATED, Version.fromString("2.1.0"))
nit: Should we bump these to realistic current versions? i.e., 5.0.0 for index created and Version.V_6_0_0.minimumIndexCompatibilityVersion
@bleskes I pushed new commits
Looks great to me. Thanks @s1monw
…rade & open index (elastic#21830) Today we can easily join a cluster that holds an index we don't support since we currently allow rolling upgrades from 5.x to 6.x. Along the same lines, we don't check if we can support an index based on the nodes in the cluster when we open, restore or metadata-upgrade an index. This commit adds additional safety that fails cluster state validation, open, restore and/or upgrade if there is an open index with an incompatible index version created in the cluster. Relates to elastic#21670
…rade & open index (#21830) (#21914) Today we can easily join a cluster that holds an index we don't support since we currently allow rolling upgrades from 5.x to 6.x. Along the same lines, we don't check if we can support an index based on the nodes in the cluster when we open, restore or metadata-upgrade an index. This commit adds additional safety that fails cluster state validation, open, restore and/or upgrade if there is an open index with an incompatible index version created in the cluster. Relates to #21670
…23311) When a node wants to join a cluster, it sends a join request to the master. The master then sends a join validation request to the node. This checks that the node can deserialize the current cluster state that exists on the master and that it can thus handle all the indices that are currently in the cluster (see #21830). The current code can trip an assertion as it does not take the cluster state as is but sets itself as the local node on the cluster state. This can result in an inconsistent DiscoveryNodes object as the local node is not yet part of the cluster state and a node with same id but different address can still exist in the cluster state. Also another node with the same address but different id can exist in the cluster state if multiple nodes are run on the same machine and ports have been swapped after node crashes/restarts.
Today we fail the node at startup if it contains an index that is too old to be compatible with the current version, unless that index is closed. If the index is closed then the node will start up and this puts us into a bad state: the index cannot be opened and must be reindexed using an earlier version, but we offer no way to get that index into a node running an earlier version so that it can be reindexed. Downgrading the node in-place is decidedly unsupported and cannot be expected to work since the node already started up and upgraded the rest of its metadata. Since #41731 we actively reject downgrades to versions ≥ v7.2.0 too. This commit prevents the node from starting in the presence of any too-old indices (closed or not). In particular, it does not write any upgraded metadata in this situation, increasing the chances an in-place downgrade might be successful. We still actively reject the downgrade using #41731, because we wrote the node metadata file before checking the index metadata, but at least there is a way to override this check. Relates #21830, #44230
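As an illustration of the startup-time guard described above (a hedged sketch, not the actual commit; the class name, method name and parameters are invented for the example):

```java
import org.elasticsearch.Version;
import org.elasticsearch.cluster.metadata.IndexMetaData;

final class StartupIndexCompatibilityCheck {
    // Refuse to start, before writing any upgraded metadata, if any index on disk,
    // open or closed, was created before the minimum compatible index version.
    static void checkIndexCompatibility(Iterable<IndexMetaData> indicesOnDisk,
                                        Version minIndexCompatibilityVersion) {
        for (IndexMetaData indexMetaData : indicesOnDisk) {
            if (indexMetaData.getCreationVersion().before(minIndexCompatibilityVersion)) {
                // No exemption for closed indices: failing before any metadata upgrade keeps an
                // in-place downgrade (to reindex the old index elsewhere) more likely to work.
                throw new IllegalStateException("cannot start node: index [" + indexMetaData.getIndex()
                    + "] was created with version [" + indexMetaData.getCreationVersion()
                    + "] which is older than the minimum compatible index version ["
                    + minIndexCompatibilityVersion + "]");
            }
        }
    }
}
```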
Today we can easily join a cluster that holds an index we don't support since we currently allow rolling upgrades from 5.x to 6.x. Along the same lines, we don't check if we can support an index based on the nodes in the cluster when we open, restore or metadata-upgrade an index. This commit adds additional safety that fails cluster state validation, open, restore and/or upgrade if there is an open index with an incompatible index version created in the cluster.
Relates to #21670