Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move IndexLifecycleMetadata installation to put-lifecycle-action #31346

Merged
merged 2 commits into from
Jun 15, 2018

Conversation

talevy
Copy link
Contributor

@talevy talevy commented Jun 14, 2018

There is a problematic scenario with x-pack-cluster master-nodes
attempting to install custom metadata into the cluster-state and
broadcasting that to non-x-pack-enabled nodes. Since those nodes
are not aware of this custom metadata, their cluster-state recovery
will be broken. This change ensures that newly-elected x-pack master
nodes bootstrap IndexLifecycleMetadata upon the first request to
leverage its features. This means that PutLifecycleAction is
now responsible for installing the metadata. Since this X-Pack API
can only be called once all nodes in the cluster have x-pack enabled,
it is safe to assume that the cluster will appropriately handle the
cluster-state recovery with the new set of index-lifecycle metadata.

in response to rolling upgrade test failure here: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+index-lifecycle+feature-branch/69/

There is a problematic scenario with x-pack-cluster master-nodes
attempting to install custom metadata into the cluster-state and
broadcasting that to non-x-pack-enabled nodes. Since those nodes
are not aware of this custom metadata, their cluster-state recovery
will be broken. This change ensures that newly-elected x-pack master
nodes bootstrap IndexLifecycleMetadata upon the first request to
leverage its features. This means that PutLifecycleAction is
now responsible for installing the metadata. Since this X-Pack API
can only be called once all nodes in the cluster have x-pack enabled,
it is safe to assume that the cluster will appropriately handle the
cluster-state recovery with the new set of index-lifecycle metadata.
@talevy talevy added review :Data Management/ILM+SLM Index and Snapshot lifecycle management labels Jun 14, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

private LongSupplier nowSupplier;
private SchedulerEngine.Job scheduledJob;
private IndexLifecycleRunner lifecycleRunner;

public IndexLifecycleService(Settings settings, Client client, ClusterService clusterService, Clock clock,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ThreadPool is no longer used since the service does not issue a cluster-state-update to install metadata anymore

@@ -59,6 +59,8 @@ public PolicyStepsRegistry() {
@SuppressWarnings({ "unchecked", "rawtypes" })
public void update(ClusterState currentState, Client client, LongSupplier nowSupplier) {
IndexLifecycleMetadata meta = currentState.metaData().custom(IndexLifecycleMetadata.TYPE);
assert meta != null : "IndexLifecycleMetadata cannot be null when updating the policy steps registry";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen this pattern of assertion elsewhere in the codebase. I am not sure how I feel about it, but I chose to try it out here to be self-documenting. Currently this method is only ever called once, and it is behind a null check

@@ -32,11 +32,12 @@
import java.util.SortedMap;
import java.util.TreeMap;

public class TransportDeleteLifcycleAction extends TransportMasterNodeAction<Request, Response> {
public class TransportDeleteLifecycleAction extends TransportMasterNodeAction<Request, Response> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed an existing typo

@@ -74,7 +75,8 @@ public ClusterState execute(ClusterState currentState) {
}
ClusterState.Builder newState = ClusterState.builder(currentState);
IndexLifecycleMetadata currentMetadata = currentState.metaData().custom(IndexLifecycleMetadata.TYPE);
if (currentMetadata.getPolicyMetadatas().containsKey(request.getPolicyName()) == false) {
if (currentMetadata == null
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is an anonymous class, we do not have good unit test coverage, so I've added a yaml test to walk this branch

@@ -64,6 +68,9 @@ protected Response newResponse(boolean acknowledged) {
public ClusterState execute(ClusterState currentState) throws Exception {
ClusterState.Builder newState = ClusterState.builder(currentState);
IndexLifecycleMetadata currentMetadata = currentState.metaData().custom(IndexLifecycleMetadata.TYPE);
if (currentMetadata == null) { // first time using index-lifecycle feature, bootstrap metadata
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

existing yaml tests should walk this branch

@@ -6,6 +6,16 @@ setup:

---
"Test Basic Policy CRUD":
- do:
catch: missing
xpack.index_lifecycle.get_lifecycle:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to walk the branch in TransportGetLifecycleAction that checks for null metadata

@talevy
Copy link
Contributor Author

talevy commented Jun 14, 2018

woohoo elasticsearch-ci went green!

Copy link
Contributor

@colings86 colings86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think TransportRetryAction and TransportSetPolicyForIndexAction will also need to handle the case where the metadata is null?

@talevy
Copy link
Contributor Author

talevy commented Jun 15, 2018

@colings86 good catch. I've updated set-policy-action, but I do not see it for RetryAction or MoveToStepAction because they deal with the steps registry directly. the rest of the actions like explain seem to deal with settings directly, as well.

Copy link
Contributor

@colings86 colings86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@talevy talevy merged commit 80ab773 into elastic:index-lifecycle Jun 15, 2018
@talevy talevy deleted the ilm-put-installmetadata branch June 15, 2018 15:23
jasontedor pushed a commit that referenced this pull request Aug 17, 2018
)

There is a problematic scenario with x-pack-cluster master-nodes
attempting to install custom metadata into the cluster-state and
broadcasting that to non-x-pack-enabled nodes. Since those nodes
are not aware of this custom metadata, their cluster-state recovery
will be broken. This change ensures that newly-elected x-pack master
nodes bootstrap IndexLifecycleMetadata upon the first request to
leverage its features. This means that PutLifecycleAction is
now responsible for installing the metadata. Since this X-Pack API
can only be called once all nodes in the cluster have x-pack enabled,
it is safe to assume that the cluster will appropriately handle the
cluster-state recovery with the new set of index-lifecycle metadata.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/ILM+SLM Index and Snapshot lifecycle management
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants