Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide guidance of "Custom" allocation behavior in ILM #96111

Merged
merged 19 commits into from
May 1, 2021

Conversation

cjcenizal
Copy link
Contributor

@cjcenizal cjcenizal commented Apr 2, 2021

Release note

The Index Lifecycle Management UI makes allocation behavior clearer when you select the "Custom" allocation option, but don't select a node attribute or if no node attributes are available.

Summary

Fixes #91254

This PR intends to clarify allocation behavior to the user when the user selects the "Custom" allocation option, but doesn't select a node attribute or if no node attributes are available.

Refactoring

I significantly refactored the way these conditions are communicated by the code.

  • I found DefaultAllocationNotice and DefaultAllocationWarning hard to reason about, because the names are so similar and I kept forgetting what the difference was. So I renamed DefaultAllocationNotice -> WillUseFallbackTierNotice/WillUseFallbackTierUsingNodeAttributesNotice and DefaultAllocationWarning -> NoTiersAvailableNotice/NoTiersAvailableUsingNodeAttributesNotice to differentiate them and communicate their purpose.
  • I removed some dead code pertaining to the frozen tier since that tier no longer supports allocation, and updated the pertinent types.
  • I extracted out and grouped related condition-based variables like noTiersAvailable, willUseFallbackTier, isUsingDeprecatedDataRoleConfig, and hasNodeAttributes, so you can discern which conditions drive the behavior of DataTierAllocationField at a glance. This also enabled a nice mirror structure of the two logic branches for handling the node_roles and node_attrs selections.

Scenarios

There are four general scenarios requiring different types of guidance.

Condition Behavior Component
1. Data nodes are in use Indices will allocate to data nodes DefaultToDataNodesNotice (the option label also provides guidance)
2. Data tiers are available for the given phase's tier Indices will allocate to data tiers DefaultToDataTiersNotice
3. Data tiers are available but not for the given phase's tier Indices will allocate to a fallback data tier WillUseFallbackTierUsingNodeAttributesNotice
4. No data tiers are available for the given phases's tier or as a fallback Allocation won't occur NoTiersAvailableUsingNodeAttributesNotice

Cloud-specific scenario

I also updated the code to hide the "Allocate to available data nodes" option when the user is on Cloud and node roles are in use, because that means they switched their data nodes over to use node roles.

Steps to test each scenario

Sometimes you'll need to edit data_tier_allocation_field.tsx to indirectly simulate the conditions of the scenario, and sometimes you'll need to edit elasticsearch.yml to directly configure the scenario. If you need to edit elasticsearch.yml:

  • Configure .es/8.0.0/config/elasticsearch.yml with the indicated settings.
  • Every time you configure this file, you need to restart ES.
  • Remember to start ES by running .es/8.0.0/bin/elasticsearch so you don't reinstall the snapshot and lose your config changes.

1. Data nodes are in use, so indices will allocate to data nodes

If there are node attributes available to choose from, the user will receive this guidance from the default option.

/* Line 56 of data_tier_allocation_field.tsx */
// const { nodesByRoles, nodesByAttributes, isUsingDeprecatedDataRoleConfig } = data!;
const nodesByRoles = {};
const nodesByAttributes = { 'key:value' : 'node-id' };
const isUsingDeprecatedDataRoleConfig = true;

image

If there aren't any node attributes, then we'll show the user a callout using the DefaultToDataNodesNotice component.

/* Line 56 of data_tier_allocation_field.tsx */
// const { nodesByRoles, nodesByAttributes, isUsingDeprecatedDataRoleConfig } = data!;
const nodesByRoles = {};
const nodesByAttributes = {};
const isUsingDeprecatedDataRoleConfig = true;

image

2. Data tiers are available for the given phase's tier, so indices will allocate to data tiers

The callout dynamically changes to tell you which tier will be used, depending on the phase it's in, using the DefaultToDataTiersNotice component.

# elasticsearch.yml
node.roles: [master, remote_cluster_client, data_content, data_warm, data_cold]

Warm

image

Cold

image

3. Data tiers are available but not for the given phase's tier, so indices will allocate to a fallback data tier

The callout dynamically changes to tell the user which tier will be used, depending on the phase it's in. It also dynamically changes to tell the user which fallback tier will be used. This uses the WillUseFallbackTierUsingNodeAttributesNotice component.

# elasticsearch.yml
node.roles: [master, remote_cluster_client, data_content, data_hot]

Warm

image

Cold

image

4. No data tiers are available for the given phases's tier or as a fallback, so allocation won't occur

This uses the NoTiersAvailableUsingNodeAttributesNotice component.

# elasticsearch.yml
node.roles: [master, remote_cluster_client, data_content]

image

Cloud-specific scenario

When the user is on Cloud and data nodes are in use, the user can select "Allocate to available data nodes".

/* Line 56 of data_tier_allocation_field.tsx */
// const { nodesByRoles, nodesByAttributes, isUsingDeprecatedDataRoleConfig } = data!;
const nodesByRoles = {};
const nodesByAttributes = { 'key:value' : 'node-id' };
const isUsingDeprecatedDataRoleConfig = true;

const hasNodeAttributes = Boolean(Object.keys(nodesByAttributes ?? {}).length);
const isCloudEnabled = true;//cloud?.isCloudEnabled ?? false;

image

When the user is on Cloud and node roles are available (inferring data nodes are no longer in use), the user can't select "Allocate to available data nodes".

# elasticsearch.yml
node.attr.custom: my-attribute
node.roles: [master, remote_cluster_client, data_content, data_hot]
/* Line 56 of data_tier_allocation_field.tsx */
const { nodesByRoles, nodesByAttributes, isUsingDeprecatedDataRoleConfig } = data!;

const hasNodeAttributes = Boolean(Object.keys(nodesByAttributes ?? {}).length);
const isCloudEnabled = true;//cloud?.isCloudEnabled ?? false;

image

Checklist

@cjcenizal cjcenizal force-pushed the bug/duplicate-allocation-ux branch from b621b41 to 7e66be9 Compare April 2, 2021 16:50
* Provide allocation behavior guidance when the user has selected Custom, but no attributes are available and:
  ...data nodes are in use, so indices will allocate to data nodes.
  ...data tiers are available, so indices will allocate to data tiers.
  ...data tiers are available but not for the given phase's tier, so indices will be allocated to a fallback data tier.
  ...no data tiers are available, so allocation won't occur (TODO: Check if this is even possible).
* Change the label for the Custom > 'Do not modify allocation configuration' option to 'Allocate to available data nodes' to clarify behavior.
* Hide the Custom > 'Allocate to available data nodes' option when on Cloud and data nodes are in use.
@cjcenizal cjcenizal force-pushed the bug/duplicate-allocation-ux branch from 7e66be9 to 0ebf824 Compare April 5, 2021 19:28
* using node roles in our Node config yet. See {@link ListNodesRouteResponse} for information about how this is
* detected.
*/
const disableDataTierOption = Boolean(isCloudEnabled && isUsingDeprecatedDataRoleConfig);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally being defined in data_tier_allocation_field.tsx and passed in as a prop. I refactored the code to define this value locally, derived from the underlying conditions, to make it easier to reason about when reading this file. There's no change to the original behavior.

//
// On prem, users should have the freedom to choose this option, even if they're using node roles.
// So we always give them this option.
if (!isCloudEnabled || isUsingDeprecatedDataRoleConfig) {
Copy link
Contributor Author

@cjcenizal cjcenizal Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a behavioral change. It's the condition that will hide the "Allocate to available data nodes" custom allocation option on Cloud when node roles are available.

@cjcenizal cjcenizal changed the title Provide guidance of allocation behavior in ILM Provide guidance of "Custom" allocation behavior in ILM Apr 21, 2021
@jrodewig jrodewig requested a review from debadair April 21, 2021 14:32
Copy link
Contributor

@debadair debadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some suggestions--let me know if you have any questions.

title: {
warm: i18n.translate(
'xpack.indexLifecycleMgmt.warmPhase.dataTier.defaultAllocationNotAvailableTitle',
{ defaultMessage: 'Indices will be allocated to the warm tier' }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{ defaultMessage: 'Indices will be allocated to the warm tier' }
{ defaultMessage: 'Move data to warm nodes.' }

Copy link
Contributor Author

@cjcenizal cjcenizal Apr 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated callout content will be:

Move data to warm nodes
Define custom node attributes in elasticsearch.yml to allocate indices based on these attributes instead.

In this particular situation, the user has selected the "Custom" allocation option and probably intends to use custom node attributes to control allocation behavior. However, because they haven't defined any custom node attributes, allocation will behave unexpectedly. Its precise behavior varies depending on how the cluster is configured and which phase of the policy the user is configuring.

"Move data to warm nodes" sounds imperative to me, as if the user intentionally chooses this option in order to allocate to warm nodes. But the scenario is more like "You probably didn't intend this, so we just wanted to be clear about what's going to happen." So I wonder if this tone is appropriate for the uncertainty of the scenario -- I think we really want users to doubt their choice here.

Is this making any sense @debadair? Does it change the way you're seeing this copy?

),
cold: i18n.translate(
'xpack.indexLifecycleMgmt.coldPhase.dataTier.defaultAllocationNotAvailableTitle',
{ defaultMessage: 'Indices will be allocated to the cold tier' }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{ defaultMessage: 'Indices will be allocated to the cold tier' }
{ defaultMessage: 'Move data to cold nodes.' }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same concerns here as above.

},
body: i18n.translate('xpack.indexLifecycleMgmt.editPolicy.defaultToDataTiersDescription', {
defaultMessage:
'Define custom node attributes in elasticsearch.yml to allocate indices based on these attributes instead.',

This comment was marked as resolved.


const i18nTexts = {
title: i18n.translate('xpack.indexLifecycleMgmt.editPolicy.defaultToDataNodesLabel', {
defaultMessage: 'Indices will be allocated to available data nodes',

This comment was marked as resolved.

}),
body: i18n.translate('xpack.indexLifecycleMgmt.editPolicy.defaultToDataNodesDescription', {
defaultMessage:
'Define custom node attributes in elasticsearch.yml to allocate indices based on these attributes instead.',

This comment was marked as resolved.

}),
body: i18n.translate('xpack.indexLifecycleMgmt.frozenPhase.dataTier.noTiersAvailableBody', {
defaultMessage:
'Assign at least one node to the frozen, cold, warm, or hot tier to use role-based allocation. The policy will fail to complete allocation if there are no available nodes.',

This comment was marked as resolved.

{ defaultMessage: 'Do not modify allocation configuration' }
allocateToDataNodesOption: i18n.translate(
'xpack.indexLifecycleMgmt.editPolicy.nodeAllocation.allocateToDataNodesOption',
{ defaultMessage: 'Allocate to available data nodes' }

This comment was marked as resolved.

body: (nodeRole: DataTierRole) =>
i18n.translate('xpack.indexLifecycleMgmt.warmPhase.dataTier.willUseFallbackTierDescription', {
defaultMessage:
'This policy will move data in the warm phase to {tier} tier nodes instead.',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think move is a little misleading, as the net effect is likely to be that it isn't moved to a different tier, it stays where it is.

Suggested change
'This policy will move data in the warm phase to {tier} tier nodes instead.',
If no warm nodes are available, data is store in the {tier} tier.',

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point!

Copy link
Contributor Author

@cjcenizal cjcenizal Apr 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change the callout will state:

No nodes assigned to the warm tier
If no warm nodes are available, data is stored in the hot tier.

I wonder if this sounds a little unclear. The title is definitive about no warm nodes being available, and then the body seems to walk it back by cautioning "if" they're unavailable. But there's no "if" involved -- we know they're unavailable, so maybe we should more directly explain the consequence?

Very bluntly put:

No nodes assigned to the warm tier
Because of this, data is stored in the hot tier.

body: (nodeRole: DataTierRole) =>
i18n.translate('xpack.indexLifecycleMgmt.coldPhase.dataTier.willUseFallbackTierDescription', {
defaultMessage:
'This policy will move data in the cold phase to {tier} tier nodes instead.',

This comment was marked as resolved.

'xpack.indexLifecycleMgmt.editPolicy.customizeWithNodeAttributeDescription',
{
defaultMessage:
'Define custom node attributes in elasticsearch.yml to allocate indices based on these attributes.',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear why this is relevant in the context of the fallback tiers?

Copy link
Contributor Author

@cjcenizal cjcenizal Apr 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ties back to #96111 (comment). In the situation where this message is shown to the user, they've selected "Custom" allocation but there are no custom node attributes available. If they've chosen "Custom" I expect their intention would be to use custom node attributes to control allocation, otherwise they should use the "Recommended (data tiers)" option. I've updated this to subtly direct them to use built-in node roles per your earlier comment.

@cjcenizal cjcenizal added enhancement New value added to drive a business result Feature:ILM Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more v7.14.0 v8.0.0 labels Apr 25, 2021
@cjcenizal cjcenizal marked this pull request as ready for review April 25, 2021 01:45
@cjcenizal cjcenizal requested a review from a team as a code owner April 25, 2021 01:45
@elasticmachine
Copy link
Contributor

Pinging @elastic/es-ui (Team:Elasticsearch UI)

@cjcenizal
Copy link
Contributor Author

cjcenizal commented Apr 25, 2021

@yuliacech @jloleysens As you review this, could you please also take a look at how I restructured the CITs? I split up the node allocation tests into separate files to make each file easier to read and to also make the categories of functionality under test easier to discover. I also created a helper file per test file so we can narrowly scope each helper's interface to the tests' needs, which I think will improve maintainability because it's easier to track which actions are used where.

I'm mostly concerned with whether or not this direction is compatible with the direction you want to take with refactoring these tests. Is it close enough to what you have in mind for me to move forward with this? If so then I'll add additional tests to cover the changes in this PR. I'll also do an additional cleanup pass on the test files I touched to remove any redundant or unnecessary configuration (e.g. beforeEach calls which overlap and override one another).

@cjcenizal cjcenizal added release_note:enhancement and removed enhancement New value added to drive a business result labels Apr 25, 2021
@cjcenizal
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@yuliacech yuliacech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cjcenizal, this work looks great, thanks a lot for doing this! I'm happy with it being merged, just a couple of non blocking comments

  • When there is no data tiers to fallback to or when there is no data tier for a concrete phase, would that be useful to link to this guide? To help the user configure their node roles correctly. This might not be applicable on Cloud though I guess.
  • About the test helpers: Thanks a lot for clarifying your reasoning about using 'helpers' file for individual test files. I think that works but I still would like to separate helpers that create edit policy actions like 'savePolicy' etc into a logical group. As you suggested, maybe we could move them in a separate folder inside edit_policy folder (I don't think they are used by other tests then edit policy)

@cjcenizal
Copy link
Contributor Author

cjcenizal commented Apr 30, 2021

Thanks @yuliacech! I created #98960 to track your first point about the docs. I am open to any changes you'd like to make to the test structure. I'd like to limit this PRs scope to node allocation so let's handle that in subsequent PRs.

Copy link
Contributor

@debadair debadair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple minor suggestions. Also, there might be an instance where "Allocate to available data nodes" is still showing up in the drop down instead of "Any data node" when there are custom attributes configured? (It's showing up in the screen cap, but I'm not seeing it in the diff.)

{i18n.translate(
'xpack.indexLifecycleMgmt.dataTier.noTiersAvailableUsingNodeAttributesDescription',
{
defaultMessage: 'There are no available nodes for data allocation.',

This comment was marked as resolved.

{i18n.translate(
'xpack.indexLifecycleMgmt.dataTier.willUseFallbackTierUsingNodeAttributesDescription',
{
defaultMessage: 'Data will be allocated to the {tier} tier.',

This comment was marked as resolved.

@cjcenizal
Copy link
Contributor Author

cjcenizal commented May 1, 2021

@debadair, thanks! Copy updated, conditions checked, screenshots uploaded.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
indexLifecycleManagement 206 211 +5

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
indexLifecycleManagement 247.1KB 248.8KB +1.8KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
indexLifecycleManagement 50.1KB 50.3KB +189.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@cjcenizal cjcenizal merged commit 8ae49a4 into elastic:master May 1, 2021
@cjcenizal cjcenizal deleted the bug/duplicate-allocation-ux branch May 1, 2021 04:37
cjcenizal added a commit to cjcenizal/kibana that referenced this pull request May 1, 2021
* Provide allocation behavior guidance when the user has selected Custom, but no attributes are available and:
  ...data nodes are in use, so indices will allocate to data nodes.
  ...data tiers are available, so indices will allocate to data tiers.
  ...data tiers are available but not for the given phase's tier, so indices will be allocated to a fallback data tier.
  ...no data tiers are available, so allocation won't occur.
* Provide link to the migration docs.
* Adjust copy to consistently notify the user that node attributes are missing and guide them to use role-based allocation.
* Localize disableDataTierOption flag instead of passing it as a prop, to make it easier to reason about.
* Clarify the scenario in which the user has selected node_attrs allocation but no data tiers are available.
* Remove inapplicable frozen tier from NoTiersAvailableNotice.
* Refactor and update tests.
  * Split up node allocation tests to improve discoverability. Refactor test helpers to define suite-specific helper interfaces, to improve maintainability.
  * Refactor tests to make the [using node attributes, using node roles] branch and the various conditions within easier to contrast and compare.
  * Add coverage for NoTiersAvailableUsingNodeAttributesNotice scenario.
  * Add some helpers for improved maintainability and readability.
  * Create helpers/types file to store extracted Phase type.
cjcenizal added a commit that referenced this pull request May 1, 2021
)

* Provide allocation behavior guidance when the user has selected Custom, but no attributes are available and:
  ...data nodes are in use, so indices will allocate to data nodes.
  ...data tiers are available, so indices will allocate to data tiers.
  ...data tiers are available but not for the given phase's tier, so indices will be allocated to a fallback data tier.
  ...no data tiers are available, so allocation won't occur.
* Provide link to the migration docs.
* Adjust copy to consistently notify the user that node attributes are missing and guide them to use role-based allocation.
* Localize disableDataTierOption flag instead of passing it as a prop, to make it easier to reason about.
* Clarify the scenario in which the user has selected node_attrs allocation but no data tiers are available.
* Remove inapplicable frozen tier from NoTiersAvailableNotice.
* Refactor and update tests.
  * Split up node allocation tests to improve discoverability. Refactor test helpers to define suite-specific helper interfaces, to improve maintainability.
  * Refactor tests to make the [using node attributes, using node roles] branch and the various conditions within easier to contrast and compare.
  * Add coverage for NoTiersAvailableUsingNodeAttributesNotice scenario.
  * Add some helpers for improved maintainability and readability.
  * Create helpers/types file to store extracted Phase type.
spalger added a commit that referenced this pull request May 1, 2021
spalger added a commit that referenced this pull request May 1, 2021
@spalger
Copy link
Contributor

spalger commented May 1, 2021

Hey CJ, I had to revert this PR and the 7.x backport because for some reason as soon as it was merged master started to fail with:

22:26:23   info [  kibana  ] Checking Windows for paths > 200 characters
22:26:26     │ERROR failure 3 sec
22:26:26     │ERROR Error: Windows has a path limit of 260 characters so we limit the length of paths in Kibana to 200 characters  and the following files exceed this limit:
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/default_to_data_nodes_notice.d.ts.map
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/default_to_data_tiers_notice.d.ts.map
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/no_custom_attributes_messages.d.ts.map
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/no_tiers_available_using_node_attributes_notice.d.ts.map
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/node_role_to_fallback_tier_map.d.ts.map
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/will_use_fallback_tier_notice.d.ts.map
22:26:26     │       - x-pack/plugins/index_lifecycle_management/target/types/public/application/sections/edit_policy/components/phases/shared_fields/data_tier_allocation_field/components/will_use_fallback_tier_using_node_attributes_notice.d.ts.map

Seems like this should be failing in the PR too, filed #99005 to remind me to look into it on Monday. Would you mind resubmitting this PR and considering some shorter file names before merging? Hopefully we can track down what the problem is quickly, but your new PR might not be validating the new path lengths so please use caution. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:ILM release_note:enhancement reverted Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more v7.14.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disambiguate ILM's "Do not modify" custom allocation option from turning allocation off
8 participants