Limit the number of concurrent requests per node #31206
Conversation
With `max_concurrent_shard_requests` we used to throttle, or limit, the number of concurrent shard requests a high-level search request can execute per node. This had several problems, since the limit was computed at a global level based on the number of nodes. This change now throttles the number of concurrent requests per node while still allowing concurrency across multiple nodes. Closes elastic#31192
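The per-node throttling described above can be sketched as follows. This is a minimal, hypothetical illustration, not the PR's actual code: `PerNodeThrottler`, `submit`, and `onCompletion` are invented names, and the real implementation has to deal with forking, failures, and thread safety beyond what is shown here.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of per-node throttling: each node gets its own
// in-flight counter, so a search can run up to `max` shard requests on
// every node at once instead of sharing one global budget.
class PerNodeThrottler {
    private final int max;
    private final Map<String, Integer> inFlight = new HashMap<>();
    private final Map<String, Queue<Runnable>> queued = new HashMap<>();

    PerNodeThrottler(int maxPerNode) {
        this.max = maxPerNode;
    }

    // Run the shard request now if the node has a free slot, otherwise queue it.
    synchronized void submit(String nodeId, Runnable shardRequest) {
        int current = inFlight.getOrDefault(nodeId, 0);
        if (current < max) {
            inFlight.put(nodeId, current + 1);
            shardRequest.run();
        } else {
            queued.computeIfAbsent(nodeId, n -> new ArrayDeque<>()).add(shardRequest);
        }
    }

    // Called when a shard request on the node finishes; hands the freed
    // slot directly to a queued request if one is waiting.
    synchronized void onCompletion(String nodeId) {
        Queue<Runnable> q = queued.get(nodeId);
        if (q != null && !q.isEmpty()) {
            q.poll().run(); // slot stays occupied by the next request
        } else {
            inFlight.put(nodeId, inFlight.get(nodeId) - 1);
        }
    }
}
```

Because each node has its own counter, saturating node `a` does not block requests targeted at node `b`, which is the concurrency-across-nodes property the description refers to.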
Pinging @elastic/es-search-aggs
It looks like you had a search/replace issue, but I like it in general. Given that it is an expert option, I don't think we need sophisticated backward compatibility; let's just add a note about it in the breaking changes for 7.0?
synchronized void add(Runnable runnable) {
    if (queue == null) { // create this lazily
        queue = new LinkedList<>();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not that it matters much but ArrayDeque should be better?
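To illustrate the reviewer's suggestion: `ArrayDeque` offers the same FIFO `Queue` operations as `LinkedList` but is backed by a resizable array, avoiding a node allocation per element. A standalone sketch (the `LazyQueueHolder` name is hypothetical, for illustration only):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// ArrayDeque is a drop-in FIFO replacement for LinkedList here:
// same Queue interface, array-backed, no per-element node objects.
class LazyQueueHolder {
    private Queue<Runnable> queue; // created lazily, as in the PR snippet

    synchronized void add(Runnable runnable) {
        if (queue == null) {
            queue = new ArrayDeque<>(); // instead of new LinkedList<>()
        }
        queue.add(runnable);
    }

    // Returns the oldest queued runnable, or null if none is queued.
    synchronized Runnable poll() {
        return queue == null ? null : queue.poll();
    }
}
```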
}

synchronized void add(Runnable runnable) {
    if (queue == null) { // create this lazily
why does it need to be lazy?
@@ -258,7 +254,7 @@ protected void doExecute(Task task, SearchRequest searchRequest, ActionListener<
     }
     String[] aliases = aliasFilter.getAliases();
     String[] finalIndices = aliases.length == 0 ? new String[] {shardId.getIndexName()} : aliases;
-    // here we have to map the filters to the UUID since from now on we use the uuid for the lookup
+    // here we have to pendingExecutionsPerNode the filters to the UUID since from now on we use the uuid for the lookup
search/replace issue?
try {
    executePhaseOnShard(shardIt, shard, new SearchActionListener<FirstResult>(new SearchShardTarget(shard
        .currentNodeId(),
        shardIt.shardId(), shardIt.getClusterAlias(), shardIt.getOriginalIndices()), shardIndex) {
The indentation of the two lines above doesn't help readability.
@@ -199,7 +199,7 @@ public void scrollId(String scrollId) {
     * If profiling was enabled, this returns an object containing the profile results from
     * each shard. If profiling was not enabled, this will return null
     *
-    * @return The profile results or an empty map
+    * @return The profile results or an empty pendingExecutionsPerNode
search/replace issue?
@@ -90,7 +90,7 @@ public final int getNumReducePhases() {

    /**
     * Returns the profile results for this search response (including all shards).
-    * An empty map is returned if profiling was not enabled
+    * An empty pendingExecutionsPerNode is returned if profiling was not enabled
search/replace issue?
@@ -203,8 +203,8 @@ public RemoteClusterService getRemoteClusterService() {
    }

    /**
-    * Return a map of nodeId to pending number of search requests.
+    * Return a pendingExecutionsPerNode of nodeId to pending number of search requests.
     * This is a snapshot of the current pending search and not a live map.
search/replace issue?
fork(() -> onShardFailure(shardIndex, shard, shard.currentNodeId(), shardIt, e));
    }
};
if (pendingExecutions == null || pendingExecutions.tryAcquire()) {
Maybe remove this tryAcquire and go through the else block directly? I think it's simpler if the management of the semaphore permits is contained in PendingExecutions.
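The shape the reviewer is asking for, with all permit handling contained inside `PendingExecutions`, might look roughly like this. This is a sketch under assumptions: the method names (`tryRun`, `finish`) and the internal structure are invented for illustration and are not the merged code.

```java
import java.util.ArrayDeque;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: callers only ever call tryRun()/finish(), and
// the semaphore is never touched outside this class.
class PendingExecutions {
    private final Semaphore semaphore;
    private final ArrayDeque<Runnable> queue = new ArrayDeque<>();

    PendingExecutions(int permits) {
        this.semaphore = new Semaphore(permits);
    }

    // Runs the task now if a permit is free, otherwise queues it.
    void tryRun(Runnable task) {
        Runnable toRun = null;
        synchronized (this) {
            if (semaphore.tryAcquire()) {
                toRun = task;
            } else {
                queue.add(task);
            }
        }
        if (toRun != null) { // run outside the monitor
            toRun.run();
        }
    }

    // Called when a task finishes: hand its permit to a queued task,
    // or release the permit if nothing is waiting.
    void finish() {
        Runnable next;
        synchronized (this) {
            next = queue.poll();
            if (next == null) {
                semaphore.release();
            }
        }
        if (next != null) {
            next.run();
        }
    }
}
```

With this shape, the call site quoted above no longer needs its own `pendingExecutions.tryAcquire()` branch; it simply submits the runnable and lets the class decide whether to run or queue it.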
@jpountz I addressed your comments
LGTM
* master:
  Remove RestGetAllAliasesAction (#31308)
  Temporary fix for broken build
  Reenable Checkstyle's unused import rule (#31270)
  Remove remaining unused imports before merging #31270
  Fix non-REST doc snippet
  [DOC] Extend SQL docs
  Immediately flush channel after writing to buffer (#31301)
  [DOCS] Shortens ML API intros
  Use quotes in the call invocation (#31249)
  move security ingest processors to a sub ingest directory (#31306)
  Add 5.6.11 version constant. Fix version detection.
  SQL: Whitelist SQL utility class for better scripting (#30681)
  [Docs] All Rollup docs experimental, agg limitations, clarify DeleteJob (#31299)
  CCS: don't proxy requests for already connected node (#31273)
  Mute ScriptedMetricAggregatorTests testSelfReferencingAggStateAfterMap
  [test] opensuse packaging turn up debug logging
  Add unreleased version 6.3.1
  Removes experimental tag from scripted_metric aggregation (#31298)
  [Rollup] Metric config parser must use builder so validation runs (#31159)
  [ML] Check licence when datafeeds use cross cluster search (#31247)
  Add notion of internal index settings (#31286)
  Test: Remove broken yml test feature (#31255)
  REST hl client: cluster health to default to cluster level (#31268)
  [ML] Update test thresholds to account for changes to memory control (#31289)
  Log warnings when cluster state publication failed to some nodes (#31233)
  Fix AntFixture waiting condition (#31272)
  Ignore numeric shard count if waiting for ALL (#31265)
  [ML] Implement new rules design (#31110)
  index_prefixes back-compat should test 6.3 (#30951)
  Core: Remove plain execute method on TransportAction (#30998)
  Update checkstyle to 8.10.1 (#31269)
  Set analyzer version in PreBuiltAnalyzerProviderFactory (#31202)
  Modify pipelining handlers to require full requests (#31280)
  Revert upgrade to Netty 4.1.25.Final (#31282)
  Use armored input stream for reading public key (#31229)
  Fix Netty 4 Server Transport tests. Again.
  REST hl client: adjust wait_for_active_shards param in cluster health (#31266)
  REST high-level Client: remove deprecated API methods (#31200)
  [DOCS] Mark SQL feature as experimental
  [DOCS] Updates machine learning custom URL screenshots (#31222)
  Fix naming conventions check for XPackTestCase
  Fix security Netty 4 transport tests
  Fix race in clear scroll (#31259)
  [DOCS] Clarify audit index settings when remote indexing (#30923)
  Delete typos in SAML docs (#31199)
  REST high-level client: add Cluster Health API (#29331)
  [ML][TEST] Mute tests using rules (#31204)
  Support RequestedAuthnContext (#31238)
  SyncedFlushResponse to implement ToXContentObject (#31155)
  Add Get Aliases API to the high-level REST client (#28799)
  Remove some line length supressions (#31209)
  Validate xContentType in PutWatchRequest. (#31088)
  [INGEST] Interrupt the current thread if evaluation grok expressions take too long (#31024)
  Suppress extras FS on caching directory tests
  Revert "[DOCS] Added 6.3 info & updated the upgrade table. (#30940)"
  Revert "Fix snippets in upgrade docs"
  Fix snippets in upgrade docs
  [DOCS] Added 6.3 info & updated the upgrade table. (#30940)
  LLClient: Support host selection (#30523)
  Upgrade to Netty 4.1.25.Final (#31232)
  Enable custom credentials for core REST tests (#31235)
  Move ESIndexLevelReplicationTestCase to test framework (#31243)
  Encapsulate Translog in Engine (#31220)
  HLRest: Add get index templates API (#31161)
  Remove all unused imports and fix CRLF (#31207)
  [Tests] Fix self-referencing tests
  [TEST] Fix testRecoveryAfterPrimaryPromotion
  [Docs] Remove mention pattern files in Grok processor (#31170)
  Use stronger write-once semantics for Azure repository (#30437)
  Don't swallow exceptions on replication (#31179)
  Limit the number of concurrent requests per node (#31206)
  Call ensureNoSelfReferences() on _agg state variable after scripted metric agg script executions (#31044)
  Move java version checker back to its own jar (#30708)
  [test] add fix for rare virtualbox error (#31212)
Some of the docs were outdated, as they did not mention that the limit is now per node. Also, the default value changed. Relates to elastic#31206