Set acking timeout to 0 on dynamic mapping update #31140

ywelsch · 2018-06-06T14:25:57Z

As acking can fail for any reason (unrelated node being too slow, node disconnecting), it should not be required for acking to succeed in order for index requests with dynamic mapping updates to successfully complete.

Relates to #30672 and Closes #30844

elasticmachine · 2018-06-06T14:25:59Z

Pinging @elastic/es-distributed

bleskes

I left a question. I also want to discuss the following - since we're ignoring the ack, should we also just not set it to the mapping timeout but rather always to 0?

bleskes · 2018-06-06T14:43:13Z

server/src/test/java/org/elasticsearch/indices/state/RareClusterStateIT.java

@@ -315,6 +309,7 @@ public void testDelayedMappingPropagationOnReplica() throws Exception {
            Settings.builder()
                .put(DiscoverySettings.COMMIT_TIMEOUT_SETTING.getKey(), "30s") // explicitly set so it won't default to publish timeout
                .put(DiscoverySettings.PUBLISH_TIMEOUT_SETTING.getKey(), "0s") // don't wait post commit as we are blocking things by design
+                .put(MappingUpdatedAction.INDICES_MAPPING_DYNAMIC_TIMEOUT_SETTING.getKey(), "2s")


can you elaborate on the rational of how you got to 2s?

it's high enough so that the cluster state update task will actually get to run. It's low enough so that the test still finishes in a reasonable amount of time.

ywelsch · 2018-06-06T14:59:04Z

If I read the code correctly, setting the timeout to 0 would result in the update mapping call returning prematurely, even before the publishing has started, which is certainly not what we want?

ywelsch · 2018-06-14T17:03:23Z

After merging #31303, I've updated this PR to set the acking timeout to 0.

ywelsch · 2018-06-15T09:50:00Z

This triggered a failure on the doc tests, which uncovered a larger issue. The reason is that with this change, we now always trigger a RetryOnPrimaryException in case of a dynamic mapping update. In return this can lead to the local checkpoint getting stuck on a replica. I've added a test that illustrates this in e8e6374

ywelsch · 2019-01-11T11:53:01Z

@bleskes I've updated this PR, which is now simpler due to the refactoring of TransportShardBulkAction.

bleskes

LGTM.

bleskes · 2019-01-11T14:36:57Z

server/src/test/java/org/elasticsearch/indices/state/RareClusterStateIT.java

@@ -397,6 +399,21 @@ public void onFailure(Exception e) {

        assertBusy(() -> assertTrue(client().prepareGet("index", "type", "1").get().isExists()));

+        ActionFuture<IndexResponse> dynamicMappingsFut = client().prepareIndex("index", "type", "2").setSource("field2", 42).execute();


I presume this just biffs up the test, or am I missing something specific?

I also wonder if we sometime should not have an extra field

the first document that is put on the test does not use dynamic mappings (there is an explicit putMapping call for that one). This one does an actual dynamic mapping update (the thing we changed in this PR).
Not having an extra field is just what the first document tests :)

I've added a comment to the test here to clarify

Ok, thanks. I thought that the first call was indeed dynamic (and also adds a type, which is why another call to only add a field is required). Thanks for clarifying.

ywelsch · 2019-01-11T19:32:18Z

The test failure here uncovered a problem with CCR I think. The failure is:

WARNING: Uncaught exception in thread: Thread[elasticsearch[followerd4][write][T#2],5,TGRP-IndexFollowingIT]
  2> java.lang.AssertionError: Only already-processed error should happen; op=[Index{id='1', type='doc', seqNo=0, primaryTerm=1, version=1, autoGeneratedIdTimestamp=-1}] error=[null]

I suspect that this is fails because the follower does not have the right mapping update for the batch of operations that it tries to index. CCR gets a batch of operations from the leader cluster and assumes that the mapping version that it uses from the ClusterService is at least as up-to-date as the operations that it gets from the index. This is not true, because we only expose the cluster state on ClusterService after it's been applied to the shard (and the index operation have possibly already occurred).

/cc: @martijnvg @jasontedor

dnhatn · 2019-01-14T02:45:43Z

The test failure here uncovered a problem with CCR I think.

@ywelsch Thanks for the ping. @jasontedor @martijnvg I will look into this issue.

Today we keep the mapping on the follower in sync with the leader's using the mapping version from changes requests. There are two rare cases where the mapping on the follower is not synced properly: 1. The returned mapping version (from ClusterService) is outdated than the actual mapping. This happens because we expose the latest cluster state in ClusterService after applying it to IndexService. 2. It's possible for the FollowTask to receive an outdated mapping than the min_required_mapping. In that case, it should fetch the mapping again; otherwise, the follower won't have the right mapping. Relates to #31140

…pings

As acking can fail for any reason (unrelated node being too slow, node disconnecting), it should not be required for acking to succeed in order for index requests with dynamic mapping updates to successfully complete. Relates to #30672 and Closes #30844

Follow-up to #31140

* elastic/master: Optimize warning header de-duplication (elastic#37725) Bubble exceptions up in ClusterApplierService (elastic#37729) SQL: Improve handling of invalid args for PERCENTILE/PERCENTILE_RANK (elastic#37803) Remove unused ThreadBarrier class (elastic#37666) Add built-in user and role for code plugin (elastic#37030) Consolidate testclusters tests into a single project (elastic#37362) Fix docs for MappingUpdatedAction SQL: Introduce SQL DATE data type (elastic#37693) disabling bwc test while backporting elastic#37639 Mute ClusterDisruptionIT testAckedIndexing Set acking timeout to 0 on dynamic mapping update (elastic#31140) Remove index audit output type (elastic#37707) Mute FollowerFailOverIT testReadRequestsReturnsLatestMappingVersion [ML] Increase close job timeout and lower the max number (elastic#37770) Remove Custom Listeners from SnapshotsService (elastic#37629) Use m_m_nodes from Zen1 master for Zen2 bootstrap (elastic#37701) Fix index filtering in follow info api. (elastic#37752) Use project dependency instead of substitutions for distributions (elastic#37730) Update authenticate to allow unknown fields (elastic#37713) Deprecate HLRC EmptyResponse used by security (elastic#37540)

This test is obsolete since #31140 where an index request with dynamic mapping update no longer requires acking. Closes #37816

Dynamic mapping update of an index request no longer requires acks from all nodes. We need busily to check the mapping update. Relates #31140. Closes #37817

This test is obsolete since #31140 where an index request with dynamic mapping update no longer requires acking. Closes #37816

We need to await busily for the mapping in this test because we no longer require acking on the dynamic mapping of an index request. Relates elastic#31140 Closes elastic#37928

If a replica does not have a right mapping yet, we will retry the index request on that replica; then the actual tasks is higher than the expected tasks. Since #31140 this happens more frequently for we no longer require acking on the dynamic mapping of index requests. Relates #31140 Closes #37893

If a replica does not have a right mapping yet, we will retry the index request on that replica; then the actual tasks is higher than the expected tasks. Since elastic#31140 this happens more frequently for we no longer require acking on the dynamic mapping of index requests. Relates elastic#31140 Closes elastic#37893

…8045) Since #31140 we no longer require acking on the dynamic mapping of index requests. Thus, a returned mapping from a get mapping request does not necessarily contain the dynamic updates from the index request. This commit replaces the dynamic mapping update with a manual put mapping. Relates #31140 Closes #37928

…astic#38045) Since elastic#31140 we no longer require acking on the dynamic mapping of index requests. Thus, a returned mapping from a get mapping request does not necessarily contain the dynamic updates from the index request. This commit replaces the dynamic mapping update with a manual put mapping. Relates elastic#31140 Closes elastic#37928

Since #31140 we no longer require acking on the dynamic mapping of index requests. Thus, a returned mapping from a get mapping request does not necessarily contain the dynamic updates from the index request. This commit replaces the dynamic mapping update with a manual put mapping. Relates #31140 Closes #37928

If a replica does not have a right mapping yet, we will retry the index request on that replica; then the actual tasks is higher than the expected tasks. Since #31140 this happens more frequently for we no longer require acking on the dynamic mapping of index requests. Relates #31140 Closes #37893

ywelsch added >enhancement :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v7.0.0 v6.4.0 labels Jun 6, 2018

ywelsch requested a review from bleskes June 6, 2018 14:25

bleskes reviewed Jun 6, 2018

View reviewed changes

ywelsch changed the title ~~Don't check ack status on dynamic mapping update~~ Set acking timeout to 0 on dynamic mapping update Jun 14, 2018

lcawl added v6.4.1 and removed v6.4.0 labels Aug 23, 2018

Set acking timeout to 0 on dynamic mapping update

779cb2c

ywelsch force-pushed the no-ack-dynamic-mappings branch from 6e09683 to 779cb2c Compare January 11, 2019 11:51

ywelsch added v6.7.0 and removed v6.4.1 labels Jan 11, 2019

ywelsch requested a review from bleskes January 11, 2019 11:52

bleskes approved these changes Jan 11, 2019

View reviewed changes

ywelsch added 2 commits January 11, 2019 16:07

checkstyle

d9819fb

add comment

7060f6f

dnhatn mentioned this pull request Jan 19, 2019

Ensure changes requests return the latest mapping version #37633

Merged

ywelsch added 2 commits January 23, 2019 22:11

Merge remote-tracking branch 'elastic/master' into no-ack-dynamic-map…

5e10ef8

…pings

Merge remote-tracking branch 'elastic/master' into no-ack-dynamic-map…

f9ce423

…pings

ywelsch merged commit 64adb5a into elastic:master Jan 24, 2019

ywelsch added a commit that referenced this pull request Jan 24, 2019

Fix docs for MappingUpdatedAction

2bf269e

Follow-up to #31140

ywelsch added a commit that referenced this pull request Jan 24, 2019

Fix docs for MappingUpdatedAction

9409706

Follow-up to #31140

dnhatn added a commit that referenced this pull request Jan 25, 2019

Remove testMappingsPropagatedToMasterNodeImmediately

3ccd488

This test is obsolete since #31140 where an index request with dynamic mapping update no longer requires acking. Closes #37816

dnhatn mentioned this pull request Jan 25, 2019

[CI] DynamicMappingIT testMappingsPropagatedToMasterNodeImmediately fails #37816

Closed

dnhatn added a commit that referenced this pull request Jan 25, 2019

AssertBusy to check mapping in LegacyDynamicMappingIT

f925690

Dynamic mapping update of an index request no longer requires acks from all nodes. We need busily to check the mapping update. Relates #31140. Closes #37817

dnhatn added a commit that referenced this pull request Jan 25, 2019

Remove testMappingsPropagatedToMasterNodeImmediately

036a67c

This test is obsolete since #31140 where an index request with dynamic mapping update no longer requires acking. Closes #37816

dnhatn mentioned this pull request Jan 25, 2019

[CI] LegacyDynamicMappingIT testMappingsPropagatedToMasterNodeImmediatelyMultiType fails in 6.x #37817

Closed

This was referenced Jan 30, 2019

Disable dynamic mapping in testSimpleGetFieldMappingsWithDefaults #38045

Merged

Disable dynamic mapping update in testTransportBulkTasks #38073

Merged

dnhatn mentioned this pull request Feb 1, 2019

Disable dynamic mapping update in testTransportBulkTasks #38145

Merged

dnhatn mentioned this pull request Feb 1, 2019

Disable dynamic mapping in testSimpleGetFieldMappingsWithDefaults #38146

Merged

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set acking timeout to 0 on dynamic mapping update #31140

Set acking timeout to 0 on dynamic mapping update #31140

ywelsch commented Jun 6, 2018

elasticmachine commented Jun 6, 2018

bleskes left a comment

bleskes Jun 6, 2018

ywelsch Jun 6, 2018

ywelsch commented Jun 6, 2018

ywelsch commented Jun 14, 2018

ywelsch commented Jun 15, 2018

ywelsch commented Jan 11, 2019

bleskes left a comment

bleskes Jan 11, 2019

bleskes Jan 11, 2019

ywelsch Jan 11, 2019

ywelsch Jan 11, 2019

bleskes Jan 14, 2019

ywelsch commented Jan 11, 2019

dnhatn commented Jan 14, 2019

		@@ -397,6 +399,21 @@ public void onFailure(Exception e) {

		assertBusy(() -> assertTrue(client().prepareGet("index", "type", "1").get().isExists()));

		ActionFuture<IndexResponse> dynamicMappingsFut = client().prepareIndex("index", "type", "2").setSource("field2", 42).execute();

Set acking timeout to 0 on dynamic mapping update #31140

Set acking timeout to 0 on dynamic mapping update #31140

Conversation

ywelsch commented Jun 6, 2018

elasticmachine commented Jun 6, 2018

bleskes left a comment

Choose a reason for hiding this comment

bleskes Jun 6, 2018

Choose a reason for hiding this comment

ywelsch Jun 6, 2018

Choose a reason for hiding this comment

ywelsch commented Jun 6, 2018

ywelsch commented Jun 14, 2018

ywelsch commented Jun 15, 2018

ywelsch commented Jan 11, 2019

bleskes left a comment

Choose a reason for hiding this comment

bleskes Jan 11, 2019

Choose a reason for hiding this comment

bleskes Jan 11, 2019

Choose a reason for hiding this comment

ywelsch Jan 11, 2019

Choose a reason for hiding this comment

ywelsch Jan 11, 2019

Choose a reason for hiding this comment

bleskes Jan 14, 2019

Choose a reason for hiding this comment

ywelsch commented Jan 11, 2019

dnhatn commented Jan 14, 2019