Adds Index lifecycle feature #35193

Merged
merged 484 commits into from
Nov 2, 2018

colings86
Contributor

This feature allows users to create policies that define the actions to take on an index at different points in its lifecycle, including rollover, forcemerge, shrink, shard allocation filtering, changing the number of replicas, and deleting the index.

colings86 and others added 30 commits June 14, 2018 11:33
)

There is a problematic scenario with x-pack-cluster master-nodes
attempting to install custom metadata into the cluster-state and
broadcasting that to non-x-pack-enabled nodes. Since those nodes
are not aware of this custom metadata, their cluster-state recovery
will be broken. This change ensures that newly-elected x-pack master
nodes bootstrap IndexLifecycleMetadata upon the first request to
leverage its features. This means that PutLifecycleAction is
now responsible for installing the metadata. Since this X-Pack API
can only be called once all nodes in the cluster have x-pack enabled,
it is safe to assume that the cluster will appropriately handle the
cluster-state recovery with the new set of index-lifecycle metadata.
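
The lazy installation described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Elasticsearch code: the cluster state is modelled as a plain dict, and `METADATA_KEY` and `put_policy` are hypothetical names chosen for the example.

```python
# Hypothetical key under which lifecycle metadata lives in the toy cluster state.
METADATA_KEY = "index_lifecycle"


def put_policy(cluster_state, name, policy):
    """Return a new cluster state with the policy stored.

    The lifecycle metadata is created only here, on the first put request,
    rather than when a master node is elected.
    """
    new_state = dict(cluster_state)
    existing = new_state.get(METADATA_KEY)
    # Install empty metadata on first use, otherwise copy what is there.
    policies = dict(existing["policies"]) if existing else {}
    policies[name] = policy
    new_state[METADATA_KEY] = {"policies": policies}
    return new_state
```

Until `put_policy` is called, the toy cluster state carries no lifecycle metadata at all, which is the property that keeps nodes unaware of the metadata from seeing it.
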
This PR introduces a concept of a maintenance mode for the
lifecycle service. During maintenance mode, no policies are
executed.

To be placed into maintenance mode, users must first issue a
request to be placed in maintenance mode. Once the service
is assured that no policies are in actions that are not to be
interrupted (like ShrinkAction), the service will place itself
in maintenance mode.

APIs to be introduced:

- POST _xpack/index_lifecycle/maintenance/_request
   - issues a request to be placed into maintenance mode.
     This is not immediate, since we must first verify that
     it is safe to go from REQUESTED -> IN maintenance mode.
- POST _xpack/index_lifecycle/maintenance/_stop
   - issues a request to be taken out of maintenance mode (this is immediate)
- GET _xpack/index_lifecycle/maintenance
   - gets back the current mode our lifecycle management is in
ILM was rendering exceptions using the exception helper, which
essentially skips rendering non-Elasticsearch exceptions when
no details are desired. This commit changes the method used so that the
exception still gets a simple rendering. The rendering lacks
all the causes, but does a sufficient job of rendering the top-level
message for one of our most expected exceptions: IllegalArgumentException.
* Adds an API to remove ILM from an index completely

The removal will only happen if the index is not in the shrink action

* Fixes compile issues
* Adds ability to update a policy as long as no indexes are in the shrink
action

* Address review comments
Indices that are rolled over preserve their original index.lifecycle.date,
which is the time the index was created. Although this works, it
makes more sense to start the timer for moving to the warm phase from the
time the index was rolled over, so that we measure the time elapsed since the
latest data rather than since the index was created.

This commit adds an extra step within the RolloverAction that
extracts the index.creation_date of the newly created index from the RolloverStep
and sets that as the index.lifecycle.date of the rolled-over index that is
waiting for its next phase.
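
A rough sketch of the effect of this change, in Python for brevity. The function names and the flat settings dicts are assumptions made for illustration; only the setting names `index.lifecycle.date` and `index.creation_date` come from the commit message.

```python
def lifecycle_date_after_rollover(old_index_settings, new_index_settings):
    """Copy the new index's creation date onto the rolled-over index."""
    updated = dict(old_index_settings)
    updated["index.lifecycle.date"] = new_index_settings["index.creation_date"]
    return updated


def phase_ready(settings, min_age_ms, now_ms):
    """True once the index is at least min_age_ms old, measured from
    index.lifecycle.date rather than from the index creation time."""
    return now_ms - settings["index.lifecycle.date"] >= min_age_ms
```

With timestamps in milliseconds, an index created at 1000 and rolled over at 5000 reaches a 3000 ms warm threshold at 8000 instead of 4000.
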
- POST _xpack/index_lifecycle/_stop
   - issues a request to be placed into STOPPED mode (maintenance mode).
     This is not immediate, since we must first verify that
     it is safe to go from STOPPING -> STOPPED.
- POST _xpack/index_lifecycle/_start
   - issues a request to be placed back into RUNNING mode (immediately)
- GET _xpack/index_lifecycle/_status
   - get back the current mode our lifecycle management is in

- update task was hardened to support uninstalled metadata
- if no metadata is installed, the start/stop actions will install metadata
  and proceed to try and change it (default start mode is RUNNING)
- rename MAINTENANCE -> STOPPED, MAINTENANCE_REQUESTED -> STOPPING, NORMAL -> RUNNING
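
The mode transitions described above can be sketched as a small state machine. This is an illustrative Python model only: `LifecycleService`, `tick`, and `UNINTERRUPTIBLE_ACTIONS` are hypothetical names, and the real service is driven by cluster state updates rather than direct method calls.

```python
from enum import Enum


class OperationMode(Enum):
    RUNNING = "RUNNING"
    STOPPING = "STOPPING"
    STOPPED = "STOPPED"


# Hypothetical set of actions that must not be interrupted; the commit
# message names ShrinkAction as one example.
UNINTERRUPTIBLE_ACTIONS = {"shrink"}


class LifecycleService:
    """Toy stand-in for the lifecycle service, not the real class."""

    def __init__(self):
        # The default mode is RUNNING.
        self.mode = OperationMode.RUNNING

    def request_stop(self):
        # Models POST .../_stop: records the request without stopping yet.
        if self.mode is OperationMode.RUNNING:
            self.mode = OperationMode.STOPPING

    def start(self):
        # Models POST .../_start: takes effect immediately.
        self.mode = OperationMode.RUNNING

    def tick(self, current_actions):
        """Periodic check, given the action each managed index is in.

        STOPPING becomes STOPPED only when no index is in an
        uninterruptible action.
        """
        busy = set(current_actions) & UNINTERRUPTIBLE_ACTIONS
        if self.mode is OperationMode.STOPPING and not busy:
            self.mode = OperationMode.STOPPED
        return self.mode
```

The asymmetry is deliberate: starting is always safe and therefore immediate, while stopping has to wait for in-flight work such as a shrink to finish.
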

follow-up to #31164.
IndexShard should not return null stats; returning empty stats, or throwing AlreadyClosedException if the shard is closed, is better.
TransportAction currently contains 2 doExecute methods, one which takes
a task, and one that does not. The latter is what some subclasses
implement, while the first one just calls the latter, dropping the given
task. This commit combines these methods, in favor of just always
assuming a task is present.
talevy and others added 16 commits October 29, 2018 14:03
Previously, if a ClusterStateActionStep or ClusterStateWaitStep threw an
exception while executing, the exception would only be caught and logged by
the generic ClusterStateUpdateTask machinery and the index would become
stuck on that step.

Now, exceptions thrown in these steps will be caught and the index will
be moved to the Error step.
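
The new behaviour can be sketched as a wrapper around step execution. This Python fragment is an assumption-laden illustration: the per-index state is a plain dict, and `run_cluster_state_step`, `ERROR_STEP`, and the `failed_step` and `step_info` keys are hypothetical names, not the real ILM data model.

```python
ERROR_STEP = "ERROR"  # hypothetical name for the error step


def run_cluster_state_step(index_state, step):
    """Run a step; on any exception, move the index to the error step.

    `step` is a callable that takes the index's lifecycle state (a dict)
    and returns the new state.
    """
    try:
        return step(index_state)
    except Exception as exc:
        failed = dict(index_state)
        # Remember which step failed and why, instead of only logging it.
        failed["failed_step"] = index_state.get("step")
        failed["step"] = ERROR_STEP
        failed["step_info"] = f"{type(exc).__name__}: {exc}"
        return failed
```

Catching the exception at this level keeps the failure visible in the index's own state, so the index is no longer silently stuck on the failing step.
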
This PR renames the CRUD APIs for ILM:

GET _ilm/<policy>, _ilm -> _ilm/policy/<policy>, _ilm/policy
PUT _ilm/<policy> -> _ilm/policy/<policy>
DELETE _ilm/<policy> -> _ilm/policy/<policy>

closes #34929.
… alias (#35065)

The ILM Rollover Step can execute on the incorrect index if the rollover alias
exists on another valid index, but not on the one the step is executing against. This
is a problem and is now guarded against.
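
A guard of this kind might look like the following Python sketch. The `can_rollover` helper and the `aliases_by_index` mapping are hypothetical, used only to illustrate the check; the real step works against index metadata in the cluster state.

```python
def can_rollover(index_name, rollover_alias, aliases_by_index):
    """Allow rollover only if the alias points at this specific index.

    `aliases_by_index` maps each index name to the set of aliases on it.
    The alias existing on some other index is not sufficient.
    """
    return rollover_alias in aliases_by_index.get(index_name, set())
```

For example, if the alias `logs` exists only on `logs-000002`, the step running against `logs-000001` must not issue a rollover.
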
…ef (#35070)

* [DOCS] Fixed edit links for ILM APIs and added the APIs to the REST API section.

* [DOCS] Fixed link to ILM APIs.
The settings variable was previously created by
the AbstractComponent class inherited by IndexLifecycleService.
This is no longer the case.
This commit does a few things:

- moves ILM-specific rest yaml tests into plugin/ilm/qa, and creates special
  :plugin:ilm:qa:rest module to test them
- removes the with-security tests of the yaml tests since they are covered in
  the rest tests now
- moves ChangePolicyforIndexIT into the qa/multi-node project since that test is
  not currently running in main ilm since integTest is disabled
ILM's Shrink Action was using a node's "_name" attribute to
allocate shards in preparation for the shrink step. Since the name is
configurable by the user, and the same name may be used for
multiple nodes on one machine, _id is safer since it is guaranteed
to be unique.

closes #35043.
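
To illustrate why `_id` is the safer choice, here is a small Python sketch. The node dicts and helper names are hypothetical, and the setting key is assumed to follow the standard `index.routing.allocation.require.*` allocation filtering form.

```python
def shrink_allocation_setting(node):
    """Build an allocation filter that pins an index to one node by id."""
    return {"index.routing.allocation.require._id": node["id"]}


def nodes_matching(nodes, attribute, value):
    """Return every node whose attribute equals the given value."""
    return [n for n in nodes if n[attribute] == value]


# Two nodes on one machine that were given the same configurable name.
nodes = [
    {"id": "a1B2c3", "name": "data-node"},
    {"id": "d4E5f6", "name": "data-node"},
]
```

Filtering these nodes by name matches both of them, while filtering by id matches exactly one, which is what the shrink step needs in order to collect a copy of every shard on a single node.
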
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@colings86 colings86 merged commit fc6e1f7 into master Nov 2, 2018
@colings86
Contributor Author

@codebrain The feature adds some new APIs to manage and retrieve information about ILM. I think the best place to see what those APIs are and how to support them would be from the REST API spec here: https://github.com/elastic/elasticsearch/tree/master/x-pack/plugin/src/test/resources/rest-api-spec/api (all the ilm apis we added have files starting with ilm.) and the API reference documentation here: https://www.elastic.co/guide/en/elasticsearch/reference/6.6/index-lifecycle-management-api.html
