Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove optimisations to reuse objects when applying a new ClusterState #27317

Merged

Conversation

DaveCTurner
Copy link
Contributor

In order to avoid churn when applying a new ClusterState, there are some checks that compare parts of the old and new states and, if equal, the new object is discarded and the old one reused. Since ClusterState updates are now largely diff-based, this code is unnecessary: applying a diff also reuses any old objects if unchanged. Moreover, the code compares the parts of the ClusterState using their version() values which is not guaranteed to be correct, because of a lack of consensus.

This change removes this optimisation, and tests that objects are still reused as expected via the diff mechanism.

@DaveCTurner DaveCTurner added :Cluster :Distributed Coordination/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure labels Nov 8, 2017
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left an ask for more testing, not directly related to the code you changed but to make the test you add more complete.

@@ -154,4 +154,58 @@ public void testSnapshotDeletionsInProgressSerialization() throws Exception {
assertThat(stateAfterDiffs.custom(SnapshotDeletionsInProgress.TYPE), notNullValue());
}

public void testObjectReuseWhenApplyingClusterStateDiff() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while we're at it , can you check more than just the routing table and the index meta data? we have blocks, nodes and customs (out of the top of my head).

@DaveCTurner
Copy link
Contributor Author

I added assertions for Blocks, DiscoveryNodes, and IndexTemplateMetaDatas. I tried Customs too but found no simple way to include them in the test without implementing local versions of the relevant abstract classes, and this ended up being more code than the test itself. I feel kinda comfortable with this, since Customs use the same basic machinery as the rest of them, so any issues are going to be hidden in a particular implementation, which is definitely out of scope for this test.

@DaveCTurner
Copy link
Contributor Author

(FAO @bleskes)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DaveCTurner DaveCTurner merged commit 4abb5fa into elastic:master Nov 9, 2017
DaveCTurner added a commit that referenced this pull request Nov 9, 2017
…ate` (#27317)

In order to avoid churn when applying a new `ClusterState`, there are some checks that compare parts of the old and new states and, if equal, the new object is discarded and the old one reused. Since `ClusterState` updates are now largely diff-based, this code is unnecessary: applying a diff also reuses any old objects if unchanged. Moreover, the code compares the parts of the `ClusterState` using their `version()` values which is not guaranteed to be correct, because of a lack of consensus.

This change removes this optimisation, and tests that objects are still reused as expected via the diff mechanism.
@DaveCTurner
Copy link
Contributor Author

Thanks. Merged to master (4abb5fa) and backported to 6.x (11dc3a3).

@DaveCTurner DaveCTurner deleted the 2017-11-08-no-clusterstate-object-reuse branch November 9, 2017 08:14
jasontedor added a commit that referenced this pull request Nov 9, 2017
* master: (22 commits)
  Update Tika version to 1.15
  Aggregations: bucket_sort pipeline aggregation (#27152)
  Introduce templating support to timezone/locale in DateProcessor (#27089)
  Increase logging on qa:mixed-cluster tests
  Update to AWS SDK 1.11.223 (#27278)
  Improve error message for parse failures of completion fields (#27297)
  Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253)
  Remove optimisations to reuse objects when applying a new `ClusterState` (#27317)
  Decouple `ChannelFactory` from Tcp classes (#27286)
  Fix find remote when building BWC
  Remove colons from task and configuration names
  Add unreleased 5.6.5 version number
  testCreateSplitIndexToN: do not set `routing_partition_size` to >= `number_of_routing_shards`
  Snapshot/Restore: better handle incorrect chunk_size settings in FS repo (#26844)
  Add limits for ngram and shingle settings (#27211) (#27318)
  Correct comment in index shard test
  Roll translog generation on primary promotion
  ObjectParser: Replace IllegalStateException with ParsingException (#27302)
  scripted_metric _agg parameter disappears if params are provided (#27159)
  Update discovery-ec2.asciidoc
  ...
jasontedor added a commit that referenced this pull request Nov 9, 2017
* 6.x:
  Update Tika version to 1.15
  Introduce templating support to timezone/locale in DateProcessor (#27089)
  Increase logging on qa:mixed-cluster tests
  Update to AWS SDK 1.11.223 (#27278)
  Improve error message for parse failures of completion fields (#27297)
  Remove optimisations to reuse objects when applying a new `ClusterState` (#27317)
  Decouple `ChannelFactory` from Tcp classes (#27286)
  Use PlainListenableActionFuture for CloseFuture (#26242)
  Fix find remote when building BWC
  Remove colons from task and configuration names
  Fix snapshot getting stuck in INIT state (#27214)
  Snapshot/Restore: better handle incorrect chunk_size settings in FS repo (#26844)
  Add unreleased 5.6.5 version number
  testCreateSplitIndexToN: do not set `routing_partition_size` to >= `number_of_routing_shards`
  Correct comment in index shard test
  Roll translog generation on primary promotion
  ObjectParser: Replace IllegalStateException with ParsingException (#27302)
  scripted_metric _agg parameter disappears if params are provided (#27159)
  Update discovery-ec2.asciidoc
@DaveCTurner DaveCTurner restored the 2017-11-08-no-clusterstate-object-reuse branch November 22, 2017 15:43
@clintongormley clintongormley added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Cluster labels Feb 13, 2018
@jpountz jpountz removed the :Distributed Coordination/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure label Jan 29, 2019
@DaveCTurner DaveCTurner deleted the 2017-11-08-no-clusterstate-object-reuse branch July 23, 2022 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. resiliency v6.1.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants