[ML] prefer least allocated model when a new node is added to the cluster #77756
Conversation
Pinging @elastic/ml-core (Team:ML)
    .entrySet()
    .stream()
    .filter(entry -> entry.getValue().getAllocationState().equals(AllocationState.STOPPING) == false)
    .sorted(Comparator.comparing(e -> e.getValue().getNodeRoutingTable().size()))
This is obviously not optimal, but I am not 100% sure how else to do it.
One issue here (besides performance) is that we always use the counts based on the original cluster state. That means that as models are allocated, our iteration order isn't updated within the same pass. This is probably acceptable.
Another option is to iterate over nodes in the outer loop. That way all allocations attempt the same node before we move on. The only downside is that the order of the inner iteration may change all the time.
Yet another option is to attempt the same thing with a priority queue that updates on each iteration, popping off the least-allocated models and rebuilding each time (see the sketch below). But that seems pretty expensive to me, as the whole heap would be rebuilt on every pass for a new node...
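For illustration, a minimal self-contained sketch of the priority-queue alternative; all names here (ModelCount, LeastAllocatedQueue) are hypothetical stand-ins, not the Elasticsearch code. Re-inserting a model with its updated count keeps the ordering current within a pass, at the cost of heap operations per allocation, and a full rebuild whenever counts must be recomputed from scratch.

import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical stand-in for a model and its current allocation count.
record ModelCount(String modelId, int allocatedNodes) {}

class LeastAllocatedQueue {
    public static void main(String[] args) {
        // Least-allocated model comes out first.
        PriorityQueue<ModelCount> queue =
            new PriorityQueue<>(Comparator.comparingInt(ModelCount::allocatedNodes));
        queue.add(new ModelCount("model-a", 3));
        queue.add(new ModelCount("model-b", 0));
        queue.add(new ModelCount("model-c", 2));

        // Allocate the least-allocated model to the new node, then re-insert it
        // with its updated count so the next pick reflects this pass's changes.
        ModelCount least = queue.poll();
        queue.add(new ModelCount(least.modelId(), least.allocatedNodes() + 1));

        System.out.println(queue.peek()); // model-b again, now with 1 < 2 allocations
    }
}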
Low-hanging fruit here is to apply the StartTrainedModelDeploymentAction.TaskParams.mayAllocateToNode filter at the point where we collect currentNotShuttingDownNodes. That collection would then become currentEligibleNodes, for instance, and would contain only non-shutting-down ML nodes of an appropriate version. This would significantly reduce one of the parameters here for most clusters (see the sketch below).
I can't think of anything else on first look. But if we make this optimization, we can then run a few tests to see how long this takes with realistic parameter values. It might not be a problem at all.
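To make the suggestion concrete, a minimal self-contained sketch of collecting eligible nodes up front. The Node record and the version threshold are simplified stand-ins invented for illustration; the real predicate is the mayAllocateToNode check mentioned above.

import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified stand-in for DiscoveryNode; the real filter would be
// StartTrainedModelDeploymentAction.TaskParams.mayAllocateToNode.
record Node(String id, boolean isMlNode, int major) {}

class EligibleNodes {
    static final int MIN_SUPPORTED_MAJOR = 7; // assumed threshold, for illustration only

    // currentNotShuttingDownNodes effectively becomes currentEligibleNodes:
    // non-shutting-down ML nodes of an appropriate version.
    static List<Node> currentEligibleNodes(List<Node> allNodes, Set<String> shuttingDownIds) {
        return allNodes.stream()
            .filter(n -> shuttingDownIds.contains(n.id()) == false)
            .filter(Node::isMlNode)
            .filter(n -> n.major() >= MIN_SUPPORTED_MAJOR)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("node-1", true, 8),
            new Node("node-2", false, 8),  // not an ML node
            new Node("node-3", true, 6));  // version too old
        System.out.println(currentEligibleNodes(nodes, Set.of())); // only node-1 survives
    }
}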
@@ -289,16 +289,8 @@ static ClusterState updateModelRoutingTable(ClusterState currentState, UpdateTra
        logger.trace(
            () -> new ParameterizedMessage("[{}] [{}] current metadata before update {}", modelId, nodeId, Strings.toString(metadata))
        );
-       Set<String> shuttingDownNodes = nodesShuttingDown(currentState);
This is old, unused code; I decided to clean it up. It's unrelated to the change in this PR.
    StartTrainedModelDeploymentAction.TaskParams params,
    DiscoveryNode node
) {
    NodeLoad load = builder.isChanged()
This behaviour might be surprising or cause a subtle bug: if you pass the builder (this is an overloaded method), you expect it to be used. I can't think of a reason not to call builder.build(). Did you have something in mind?
> I can't think of a reason not to call builder.build(). Did you have something in mind?

It just seems unnecessary if nothing changed. The object could be pretty big, and constructing a brand-new object (deeply copying all the hashtables) could be expensive in the case where a node is simply leaving the cluster.
It works fine in this case because we know it is OK to fall back on the TrainedModelAllocationMetadata in the ClusterState, but given a few refactors and a little bit of time that assumption may no longer hold and could lead to unexpected behaviour.
Perhaps it is not as neat, but you could test builder.isChanged() outside this function and then call the correct overload, or at least add a comment.
> if you pass the builder (this is an overloaded method), you expect it to be used

I suppose it is a sneaky switch. I will update it to be clearer so other callers of the method aren't surprised (see the sketch below).
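A minimal, self-contained illustration of the suggested shape; every name here is a hypothetical stand-in, not the actual Elasticsearch API. Checking isChanged() at the call site makes the choice of metadata explicit, and still avoids the deep-copying build() when nothing changed.

import java.util.Optional;

class AllocationCheckSketch {
    // Hypothetical stand-in for TrainedModelAllocationMetadata.
    record Metadata(int totalAllocations) {}

    static class Builder {
        private int totalAllocations;
        private boolean changed;
        void addAllocation() { totalAllocations++; changed = true; }
        boolean isChanged() { return changed; }
        Metadata build() { return new Metadata(totalAllocations); } // imagine an expensive deep copy
    }

    // Overload used when nothing changed: falls back to the metadata already in cluster state.
    static Optional<String> nodeHasCapacity(Metadata fromClusterState) {
        return fromClusterState.totalAllocations() < 10 ? Optional.empty() : Optional.of("node is full");
    }

    // Overload used when the builder holds updated allocations; the extra argument is always used.
    static Optional<String> nodeHasCapacity(Metadata fromClusterState, Metadata updated) {
        return updated.totalAllocations() < 10 ? Optional.empty() : Optional.of("node is full");
    }

    static Optional<String> check(Metadata fromClusterState, Builder builder) {
        // The caller decides which overload applies, so neither overload silently ignores its argument.
        return builder.isChanged()
            ? nodeHasCapacity(fromClusterState, builder.build())
            : nodeHasCapacity(fromClusterState);
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        Metadata current = new Metadata(4);
        System.out.println(check(current, builder)); // unchanged: uses the cluster-state overload
        builder.addAllocation();
        System.out.println(check(current, builder)); // changed: builds and uses updated metadata
    }
}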
        && modelAllocationEntry.getValue().isRoutedToNode(node.getId()) == false) {
        Optional<String> failure = nodeHasCapacity(
            currentState,
            builder,
good catch using the updated allocations here
…tion-prefers-least-allocated-on-new-node
LGTM
run elasticsearch-ci/packaging-tests-windows-sample
LGTM2
* master: (185 commits)
  Implement get and containsKey in terms of the wrapped innerMap (elastic#77965)
  Adjust Lucene version and enable BWC tests (elastic#77933)
  Disable BWC to upgrade to Lucene-8.10-snapshot
  Reenable MlDistributedFailureIT
  [DOCS] Fix typo for `script.painless.regex.enabled` setting value (elastic#77853)
  Upgrade to Lucene-8.10.0-snapshot-bf2fcb53079 (elastic#77801)
  [DOCS] Fix ESS install lead-in (elastic#77887)
  Resolve thirdparty gradle plugin artifacts from mavencentral (elastic#77865)
  Reduce the number of times that `LifecycleExecutionState` is parsed when running a policy. (elastic#77863)
  Utility methods to add and remove backing indices from data streams (elastic#77778)
  Use Objects.equals() instead of == to compare strings (elastic#77840)
  [ML] prefer least allocated model when a new node is added to the cluster (elastic#77756)
  Deprecate ignore_throttled parameter (elastic#77479)
  Improve LifecycleExecutionState parsing. (elastic#77855)
  [DOCS] Removes deprecated word from HLRC title. (elastic#77851)
  Remove legacy geo code from AggregationResultUtils (elastic#77702)
  Adjust SearchableSnapshotsBlobStoreCacheIntegTests.testBlobStoreCache (elastic#77758)
  Laxify SecureSM to allow creation of the JDK's innocuous threads (elastic#77789)
  [Test] Reduce concurrency when testing creation of security index (elastic#75293)
  Refactor metric PipelineAggregation integration test (elastic#77548)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
When a new node is added to the cluster, we should first attempt to allocate the models that have the fewest allocated nodes. This way we prioritize getting under-allocated models up and running instead of picking models in an arbitrary order.
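A rough, self-contained sketch of the resulting ordering (types simplified; the real code sorts TrainedModelAllocation entries by routing-table size): models with fewer allocated nodes are visited first.

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class LeastAllocatedFirst {
    public static void main(String[] args) {
        // Hypothetical model -> number of nodes it is currently routed to.
        Map<String, Integer> routedNodeCounts = new LinkedHashMap<>();
        routedNodeCounts.put("model-a", 3);
        routedNodeCounts.put("model-b", 0);
        routedNodeCounts.put("model-c", 1);

        // Visit models with the fewest allocated nodes first, mirroring the
        // Comparator on getNodeRoutingTable().size() in the change above.
        List<String> visitOrder = routedNodeCounts.entrySet()
            .stream()
            .sorted(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());

        System.out.println(visitOrder); // [model-b, model-c, model-a]
    }
}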