Faster access to INITIALIZING/RELOCATING shards #47817

Merged
merged 9 commits into from
Oct 31, 2019

kkewwei
Contributor

@kkewwei kkewwei commented Oct 9, 2019

Today a couple of allocation deciders iterate through all the shards on a node
to find the INITIALIZING or RELOCATING ones, and this can slow down cluster
state updates in clusters with very high-density nodes holding many thousands
of shards even if those shards belong to closed or frozen indices. This commit
pre-computes the sets of INITIALIZING and RELOCATING shards to speed up
this search.

Closes #46941
Relates #48579

Co-authored-by: "hongju.xhj" <hongju.xhj@alibaba-inc.com>
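
To make the idea in the description concrete, here is a minimal sketch of keeping the two subsets up to date as shards are added and removed, so that lookups no longer scan every shard on the node. The names `NodeShards`, `Shard`, and `State` are hypothetical stand-ins, not the real `RoutingNode`, `ShardRouting`, or `ShardRoutingState` classes, and the merged change may differ in detail.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class CachedShardSetsDemo {
    public static void main(String[] args) {
        NodeShards node = new NodeShards();
        Shard s0 = new Shard("s0", State.INITIALIZING);
        node.add(s0);
        node.add(new Shard("s1", State.STARTED));
        node.add(new Shard("s2", State.RELOCATING));
        System.out.println(node.initializingCount() + " initializing, "
            + node.relocatingCount() + " relocating, " + node.totalCount() + " total");
        node.remove(s0);
        System.out.println(node.initializingCount() + " initializing after removal");
    }
}

// Hypothetical, simplified stand-ins for ShardRoutingState and ShardRouting.
enum State { INITIALIZING, STARTED, RELOCATING }

final class Shard {
    final String id;
    final State state;

    Shard(String id, State state) {
        this.id = id;
        this.state = state;
    }
}

// Hypothetical container: a sketch of the caching idea, not the real RoutingNode.
final class NodeShards {
    private final Map<String, Shard> shards = new LinkedHashMap<>();
    private final Set<Shard> initializing = new LinkedHashSet<>();
    private final Set<Shard> relocating = new LinkedHashSet<>();

    void add(Shard shard) {
        if (shards.putIfAbsent(shard.id, shard) != null) {
            throw new IllegalStateException("shard " + shard.id + " is already on this node");
        }
        // Keep the cached subsets in step with the main map.
        if (shard.state == State.INITIALIZING) {
            initializing.add(shard);
        } else if (shard.state == State.RELOCATING) {
            relocating.add(shard);
        }
    }

    void remove(Shard shard) {
        if (shards.remove(shard.id, shard) == false) {
            throw new IllegalStateException("shard " + shard.id + " is not on this node");
        }
        initializing.remove(shard);
        relocating.remove(shard);
    }

    // Constant-time lookups instead of a scan over every shard on the node.
    int initializingCount() { return initializing.size(); }
    int relocatingCount() { return relocating.size(); }
    int totalCount() { return shards.size(); }
}
```

The trade-off is a little extra bookkeeping on every mutation in exchange for cheap reads, which matters when the reads happen many times per cluster state update.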

@polyfractal polyfractal added the :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) label Oct 15, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Allocation)

Contributor

@DaveCTurner DaveCTurner left a comment

Thanks @kkewwei, I left a few minor requests inline. I also think we should add a method boolean invariant() and call assert invariant() at the end of the constructor and at the start and end of the add(), remove() and update() methods. The invariant() method should assert that various invariants hold, e.g. that initializingShards contains all the initialising shards in shards in the correct order, and similarly for relocatingShards. See org.elasticsearch.index.seqno.ReplicationTracker for a good example of this pattern.

We should also have some unit tests that exercise these methods to verify that the invariant holds.
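
The pattern requested here can be sketched roughly as follows. `TrackedShards` and `State` are hypothetical names used only for illustration. The review comment describes `invariant()` as containing the assertions itself; this sketch instead has it return the result of the check so that a unit test can also call it directly. Note that Java only evaluates `assert` statements when assertions are enabled (for example with `-ea`).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class InvariantPatternDemo {
    public static void main(String[] args) {
        Map<String, State> initial = new HashMap<>();
        initial.put("s0", State.INITIALIZING);
        initial.put("s1", State.STARTED);
        TrackedShards tracked = new TrackedShards(initial);
        tracked.update("s1", State.INITIALIZING);
        System.out.println(tracked.initializingCount() + " initializing, invariant holds: "
            + tracked.invariant());
    }
}

// Hypothetical stand-in for ShardRoutingState.
enum State { INITIALIZING, STARTED, RELOCATING }

// Hypothetical class illustrating where the invariant checks go.
final class TrackedShards {
    private final Map<String, State> shards = new HashMap<>();
    private final Set<String> initializingIds = new HashSet<>();

    TrackedShards(Map<String, State> initial) {
        for (Map.Entry<String, State> entry : initial.entrySet()) {
            shards.put(entry.getKey(), entry.getValue());
            if (entry.getValue() == State.INITIALIZING) {
                initializingIds.add(entry.getKey());
            }
        }
        // Checked even though it is "obviously" true today, to catch future mistakes.
        assert invariant() : "invariant violated after construction";
    }

    void update(String id, State newState) {
        assert invariant() : "invariant violated before update";
        if (shards.containsKey(id) == false) {
            throw new IllegalArgumentException("unknown shard " + id);
        }
        shards.put(id, newState);
        if (newState == State.INITIALIZING) {
            initializingIds.add(id);
        } else {
            initializingIds.remove(id);
        }
        assert invariant() : "invariant violated after update";
    }

    int initializingCount() {
        return initializingIds.size();
    }

    // Recomputes the expected cache from scratch and compares it with the real one.
    boolean invariant() {
        Set<String> expected = new HashSet<>();
        for (Map.Entry<String, State> entry : shards.entrySet()) {
            if (entry.getValue() == State.INITIALIZING) {
                expected.add(entry.getKey());
            }
        }
        return expected.equals(initializingIds);
    }
}
```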

@kkewwei
Contributor Author

kkewwei commented Oct 16, 2019

Your suggestion is very helpful; I will update the PR accordingly. I have one small doubt: is it necessary to call assert invariant() at the end of the constructor, given that initializingShards and relocatingShards are created there?

@DaveCTurner
Contributor

Yes, it's good practice to verify invariants at the end of the constructor. It may be obvious today that all assertions pass on the newly-constructed object, but the point of the exercise is to catch mistakes that may be made in future too.

@DaveCTurner
Contributor

Hi @kkewwei please don't force-push to PR branches, it loses history and makes it harder to keep track of older reviews. I see you've made a couple of changes. Let us know when it's ready for another review.

@kkewwei
Contributor Author

kkewwei commented Oct 25, 2019

OK, I will not rewrite previous commits anymore. It's ready for review. Should I open another PR for this, or continue the review here?

@DaveCTurner
Contributor

Here is good, thanks. You can push more changes onto a PR branch, just don't force-push since that loses older changes. I might not get to this for the next few days, but it's on my list.

Contributor

@DaveCTurner DaveCTurner left a comment

Thanks @kkewwei, I left a handful more comments.

Another contributor opened a similar PR, but I'd prefer to keep going with this one. They did however make a related change to ThrottlingAllocationDecider that I would like to do here too:

https://github.com/elastic/elasticsearch/pull/48579/files#diff-74018dd650ea4d9dc396af7800ef377eR126

This doesn't need a specific test, since it's already covered adequately by ThrottlingAllocationTests.


// shards must contains all the shards in initializingShards and relocatingShards
assert initializingShards.stream().allMatch(shardRouting -> shards.containsValue(shardRouting));
assert relocatingShards.stream().allMatch(shardRouting -> shards.containsValue(shardRouting));
Contributor

I think we need a stronger property, namely that the order of the shards in relocatingShards is consistent with that in shards:

Suggested change
- assert relocatingShards.stream().allMatch(shardRouting -> shards.containsValue(shardRouting));
+ assert new ArrayList<>(relocatingShards)
+     .equals(shards.values().stream().filter(ShardRouting::relocating).collect(Collectors.toList()));

(this means the implementation needs some adjustment to get the tests to pass again)
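
The difference between the two checks can be illustrated with a small sketch. The `Shard` class below is a hypothetical stand-in for `ShardRouting`, and both collections are assumed to be insertion-ordered. A cache holding the right shards in a different order passes the containment check but fails the list-equality check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class OrderInvariantDemo {
    // Weaker check: every cached shard is present in the main map.
    static boolean containedIn(Map<String, Shard> shards, Set<Shard> cached) {
        return cached.stream().allMatch(shards::containsValue);
    }

    // Stronger check: the cache equals the filtered view of the map, in the same order.
    static boolean sameOrder(Map<String, Shard> shards, Set<Shard> cached) {
        List<Shard> expected = shards.values().stream()
            .filter(Shard::relocating)
            .collect(Collectors.toList());
        return new ArrayList<>(cached).equals(expected);
    }

    public static void main(String[] args) {
        Shard a = new Shard("a", true);
        Shard b = new Shard("b", true);
        Map<String, Shard> shards = new LinkedHashMap<>();
        shards.put(a.id, a);
        shards.put(b.id, b);

        // Same elements as the map, but inserted in the opposite order.
        Set<Shard> reversed = new LinkedHashSet<>();
        reversed.add(b);
        reversed.add(a);

        System.out.println(containedIn(shards, reversed)); // true
        System.out.println(sameOrder(shards, reversed));   // false
    }
}

// Hypothetical stand-in for ShardRouting; equality is object identity.
final class Shard {
    final String id;
    private final boolean relocating;

    Shard(String id, boolean relocating) {
        this.id = id;
        this.relocating = relocating;
    }

    boolean relocating() {
        return relocating;
    }
}
```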


private boolean invariant() {
    // initializingShards only contains the initializing shards
    assert initializingShards.stream().allMatch(shardRouting -> shardRouting.initializing());
Contributor

This would be unnecessary with the suggested changes to check equality below.

assert initializingShards.stream().allMatch(shardRouting -> shardRouting.initializing());

// relocatingShards only contains the relocating shards
assert relocatingShards.stream().allMatch(shardRouting -> shardRouting.relocating());
Contributor

This would be unnecessary with the suggested changes to check equality below.

assertThat(routingNode.shardsWithState(ShardRoutingState.INITIALIZING).size(), equalTo(1));
}

public void testshardsWithState() {
Contributor

Suggested change
- public void testshardsWithState() {
+ public void testShardsWithStateInIndex() {

@kkewwei
Contributor Author

kkewwei commented Oct 29, 2019

TBR

Contributor

@DaveCTurner DaveCTurner left a comment

The invariant is still not checking that the cached collections have matching iteration orders.

@kkewwei
Contributor Author

kkewwei commented Oct 29, 2019

@DaveCTurner, I understand what you mean now. Looking forward to your review.

@DaveCTurner
Contributor

@elasticmachine test this please

@DaveCTurner DaveCTurner dismissed their stale review October 29, 2019 17:07

All comments addressed

@DaveCTurner
Contributor

DaveCTurner commented Oct 29, 2019

The failure (at least, the one of elasticsearch-ci/1) looks relevant. I think we might have to go back to a LinkedHashMap<ShardId, ShardRouting> to preserve the iteration order without hitting that exception.

@kkewwei
Contributor Author

kkewwei commented Oct 30, 2019

I changed the type of initializingShards and relocatingShards to LinkedHashMap, but it doesn't work. It seems that the iteration order of shards must not change when running testCloseWhileRelocatingShards. In RoutingNode.update(), if we don't change the iteration order of shards, the iteration order of relocatingShards will not be consistent with that of shards.

If RoutingNode.update() overwrites the entry directly instead of removing and re-adding it, it works well. Should we drop the stronger property? I am happy to follow your suggestion.
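
The behaviour behind this observation can be reproduced with a plain `java.util.LinkedHashMap` in its default insertion-order mode: overwriting an existing key keeps its position, while removing and re-adding it moves it to the end. The sketch below uses hypothetical string keys and values rather than `ShardId` and `ShardRouting`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IterationOrderDemo {
    private static Map<String, String> threeShards() {
        Map<String, String> shards = new LinkedHashMap<>();
        shards.put("s0", "RELOCATING");
        shards.put("s1", "STARTED");
        shards.put("s2", "STARTED");
        return shards;
    }

    // Overwriting an existing key does not change its place in the iteration order.
    static List<String> overwrite() {
        Map<String, String> shards = threeShards();
        shards.put("s0", "STARTED");
        return new ArrayList<>(shards.keySet());
    }

    // Removing and re-adding the key moves it to the end of the iteration order.
    static List<String> removeThenPut() {
        Map<String, String> shards = threeShards();
        shards.remove("s0");
        shards.put("s0", "STARTED");
        return new ArrayList<>(shards.keySet());
    }

    public static void main(String[] args) {
        System.out.println(overwrite());     // [s0, s1, s2]
        System.out.println(removeThenPut()); // [s1, s2, s0]
    }
}
```

This is why an order-sensitive invariant couples the update strategy of the main map to that of every cached subset: a shard that changes state leaves one subset and may join another at a position that no longer matches the main map.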

@DaveCTurner
Contributor

I see, yes, this is trickier than I thought. I will have another look at how important the iteration order is.

In the meantime, @elasticmachine test this please

@DaveCTurner
Contributor

@elasticmachine update branch

@DaveCTurner
Contributor

@elasticmachine test this please

@kkewwei
Contributor Author

kkewwei commented Oct 30, 2019

@DaveCTurner
Contributor

Failure looks unrelated to this, let's try again before digging deeper.

@elasticmachine please run elasticsearch-ci/packaging-sample-matrix

Contributor

@DaveCTurner DaveCTurner left a comment

Ok, it does look like the order of these filtered lists isn't so important so let's go with the weaker property. I did a (hopefully final) pass and left a few suggestions, otherwise I think this is good to go.


if (shard.initializing()) {
    initializingShards.add(shard);
} else if(shard.relocating()) {
Contributor

Whitespace nit:

Suggested change
- } else if(shard.relocating()) {
+ } else if (shard.relocating()) {

@@ -127,6 +178,14 @@ void remove(ShardRouting shard) {
 * @return number of shards
 */
public int numberOfShardsWithState(ShardRoutingState... states) {
    if (states.length == 1) {
        if(states[0] == ShardRoutingState.INITIALIZING) {
Contributor

Whitespace nit:

Suggested change
- if(states[0] == ShardRoutingState.INITIALIZING) {
+ if (states[0] == ShardRoutingState.INITIALIZING) {
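
For context, the fast path visible in this hunk can be sketched as below: when exactly one state is requested and it is one of the cached ones, the answer comes from the cached set, and otherwise the method falls back to scanning. `NodeShards` and `State` are hypothetical stand-ins, and the `RELOCATING` branch is an assumption inferred from the PR description rather than something shown in the hunk.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class FastPathDemo {
    public static void main(String[] args) {
        NodeShards node = new NodeShards();
        node.add("s0", State.INITIALIZING);
        node.add("s1", State.STARTED);
        node.add("s2", State.RELOCATING);
        System.out.println(node.numberOfShardsWithState(State.INITIALIZING));             // 1
        System.out.println(node.numberOfShardsWithState(State.STARTED, State.RELOCATING)); // 2
    }
}

// Hypothetical stand-in for ShardRoutingState.
enum State { INITIALIZING, STARTED, RELOCATING }

// Hypothetical container, not the real RoutingNode.
final class NodeShards {
    private final Map<String, State> shards = new LinkedHashMap<>();
    private final Set<String> initializingIds = new LinkedHashSet<>();
    private final Set<String> relocatingIds = new LinkedHashSet<>();

    void add(String id, State state) {
        if (shards.putIfAbsent(id, state) != null) {
            throw new IllegalStateException("shard " + id + " is already on this node");
        }
        if (state == State.INITIALIZING) {
            initializingIds.add(id);
        } else if (state == State.RELOCATING) {
            relocatingIds.add(id);
        }
    }

    int numberOfShardsWithState(State... states) {
        // Fast path: a single cached state is answered without iterating.
        if (states.length == 1) {
            if (states[0] == State.INITIALIZING) {
                return initializingIds.size();
            } else if (states[0] == State.RELOCATING) {
                return relocatingIds.size();
            }
        }
        // Slow path: scan every shard on the node.
        int count = 0;
        for (State current : shards.values()) {
            for (State wanted : states) {
                if (current == wanted) {
                    count++;
                    break;
                }
            }
        }
        return count;
    }
}
```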

@@ -144,6 +203,14 @@ public int numberOfShardsWithState(ShardRoutingState... states) {
 * @return List of shards
 */
public List<ShardRouting> shardsWithState(ShardRoutingState... states) {
    if (states.length == 1) {
        if(states[0] == ShardRoutingState.INITIALIZING) {
Contributor

Whitespace nit:

Suggested change
- if(states[0] == ShardRoutingState.INITIALIZING) {
+ if (states[0] == ShardRoutingState.INITIALIZING) {

@@ -164,6 +231,26 @@ public int numberOfShardsWithState(ShardRoutingState... states) {
public List<ShardRouting> shardsWithState(String index, ShardRoutingState... states) {
    List<ShardRouting> shards = new ArrayList<>();

    if (states.length == 1) {
        if(states[0] == ShardRoutingState.INITIALIZING) {
Contributor

Whitespace nit:

Suggested change
- if(states[0] == ShardRoutingState.INITIALIZING) {
+ if (states[0] == ShardRoutingState.INITIALIZING) {

}

void remove(ShardRouting shard) {
    ShardRouting previousValue = shards.remove(shard.shardId());
    assert previousValue == shard : "expected shard " + previousValue + " but was " + shard;
    if (shard.initializing()) {
        boolean exist = initializingShards.remove(shard);
        assert exist : "expected shard " + shard + "exists in initializingShards, but not ";
Contributor

Wording/whitespace suggestion:

Suggested change
- assert exist : "expected shard " + shard + "exists in initializingShards, but not ";
+ assert exist : "expected shard " + shard + " to exist in initializingShards";

    assert exist : "expected shard " + shard + "exists in initializingShards, but not ";
} else if (shard.relocating()) {
    boolean exist = relocatingShards.remove(shard);
    assert exist : "expected shard " + shard + "exists in relocatingShards, but not ";
Contributor

Wording/whitespace suggestion:

Suggested change
- assert exist : "expected shard " + shard + "exists in relocatingShards, but not ";
+ assert exist : "expected shard " + shard + " to exist in relocatingShards";

@DaveCTurner DaveCTurner changed the title reduce the time cost to update cluster state Faster access to INITIALIZING/RELOCATING shards Oct 30, 2019
@kkewwei
Contributor Author

kkewwei commented Oct 30, 2019

@DaveCTurner, thank you very much for the grammar check.

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM thanks @kkewwei

@DaveCTurner
Contributor

@elasticmachine test this please

@DaveCTurner DaveCTurner merged commit 30b0a4e into elastic:master Oct 31, 2019
DaveCTurner pushed a commit that referenced this pull request Oct 31, 2019
@kkewwei kkewwei deleted the fix_46941 branch October 31, 2019 11:53
@DaveCTurner DaveCTurner added v7.6.0 and removed v7.5.0 labels Oct 31, 2019
Labels
:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement v7.6.0 v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

A method to reduce the time cost to update cluster state
5 participants