Do not create engine under IndexShard#mutex #45263
Pinging @elastic/es-distributed
I also prefer the second approach, since snapshot of metadata will soon need to ask for things like max seqno and global checkpoint, which could then not be initialized yet in the case we have an engine. I am thinking about this work (that is still to be done): https://github.com/elastic/elasticsearch/pull/42518/files#diff-49e1a1b834b522f4ae6997c5defe9eb0R1243.
@dnhatn why can't we create the engine under a different lock just before we acquire the mutex and do some state checks? I don't think we should make such a drastic change to the concurrency model just because a ctor can be heavy. You can protect against creating two engines with a separate lock and then, once the engine is created, do the state check and change the reference. Something like this:
diff --git a/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java b/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
index a98f501946b..4897792223b 100644
--- a/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
+++ b/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
@@ -1593,16 +1593,26 @@ public class IndexShard extends AbstractIndexShardComponent implements IndicesCl
         assert recoveryState.getRecoverySource().expectEmptyRetentionLeases() == false || getRetentionLeases().leases().isEmpty()
             : "expected empty set of retention leases with recovery source [" + recoveryState.getRecoverySource()
             + "] but got " + getRetentionLeases();
-        synchronized (mutex) {
-            verifyNotClosed();
-            assert currentEngineReference.get() == null : "engine is running";
-            // we must create a new engine under mutex (see IndexShard#snapshotStoreMetadata).
+        synchronized (this) {
             final Engine newEngine = engineFactory.newReadWriteEngine(config);
-            onNewEngine(newEngine);
-            currentEngineReference.set(newEngine);
-            // We set active because we are now writing operations to the engine; this way,
-            // if we go idle after some time and become inactive, we still give sync'd flush a chance to run.
-            active.set(true);
+            boolean success = false;
+            try {
+                synchronized (mutex) {
+                    verifyNotClosed();
+                    assert currentEngineReference.get() == null : "engine is running";
+                    // we must create a new engine under mutex (see IndexShard#snapshotStoreMetadata).
+                    onNewEngine(newEngine);
+                    currentEngineReference.set(newEngine);
+                    // We set active because we are now writing operations to the engine; this way,
+                    // if we go idle after some time and become inactive, we still give sync'd flush a chance to run.
+                    active.set(true);
+                    success = true;
+                }
+            } finally {
+                if (success == false) {
+                    newEngine.close();
+                }
+            }
         }
I started with your suggestion, but then I dropped it. I was worried about deadlock because we need to ensure a consistent lock ordering. I've applied your suggestion in bd56da8. Can you please take another look? Thank you.
I added a couple of comments to consider. I think this can work out, but I need to ponder it a little more.
// we need to refresh again to expose all operations that were index until now. Otherwise
// we may not expose operations that were indexed with a refresh listener that was immediately
// responded to in addRefreshListener.
getEngine().refresh("post_recovery");
I wonder if moving this outside the mutex requires extra care in addRefreshListener too, since otherwise we risk someone querying in between marking state POST_RECOVERY and the refresh completing, expecting to see data because of a previous call to addRefreshListener? I am not absolutely sure this is necessary, and the timing obviously has to be horrible for anything bad to happen, but I wanted your thoughts on this anyway.
Well spotted. Yes, we need to maintain this happens-before relation. We can reuse the engineMutex but I prefer a separate mutex for this. See f802515.
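To make the happens-before relation above concrete, here is a minimal sketch in plain Java. It is not the code from f802515: `RecoveringShard`, `postRecoveryMutex`, `readAllowed`, and the `Runnable` listeners are hypothetical stand-ins, and the real `IndexShard` does considerably more. The only point shown is that the refresh and the state switch happen under one dedicated lock, and that `addRefreshListener` re-checks the state under that same lock before answering a listener immediately.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a shard; none of these names are the real IndexShard API. */
class RecoveringShard {
    private final Object postRecoveryMutex = new Object();   // dedicated lock (assumption)
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private final AtomicInteger refreshCount = new AtomicInteger();
    private volatile boolean readAllowed = false;            // true once "POST_RECOVERY"

    /** Stands in for an engine refresh: notifies every queued listener. */
    void refresh() {
        refreshCount.incrementAndGet();
        Runnable listener;
        while ((listener = pending.poll()) != null) {
            listener.run();
        }
    }

    int refreshCount() {
        return refreshCount.get();
    }

    /** Refresh and state switch form one unit with respect to addRefreshListener. */
    void postRecovery() {
        synchronized (postRecoveryMutex) {
            refresh();            // expose everything indexed so far
            readAllowed = true;   // only now allow reads
        }
    }

    /** Returns true if the listener was queued, false if it was answered immediately. */
    boolean addRefreshListener(Runnable listener) {
        boolean allowed = readAllowed;
        if (allowed == false) {
            // Re-check under the lock: if postRecovery() is in flight we wait for it,
            // so an immediate answer is never given after the post-recovery refresh has started.
            synchronized (postRecoveryMutex) {
                allowed = readAllowed;
            }
        }
        if (allowed) {
            pending.add(listener);
            return true;
        }
        listener.run();
        return false;
    }
}
```

A listener added before `postRecovery()` is answered right away and relies on the post-recovery refresh for visibility; a listener added afterwards waits for the next `refresh()`.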
    }
} finally {
    if (success == false) {
        readOnlyEngine.close();
This slightly changes behaviour to close the current engine if closing the old engine fails. While this might not really make a difference, I think I would prefer not to do that, i.e. set success=true after verifyNotClosed.
I think we should close it if the old engine doesn't close. Otherwise it will just cause a dangling reference to an engine and a locked shard, won't it?
I pushed 67badc6 to close the new engine if we failed to install.
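For readers following along, the cleanup rule being discussed can be sketched in isolation. This is an illustration under stated assumptions, not the contents of 67badc6: `SketchEngine` and `EngineInstaller` are hypothetical names, and the state check is reduced to a `closed` flag plus an "engine is running" check.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical minimal engine abstraction used only for this sketch. */
interface SketchEngine {
    void close();
}

/** Hypothetical holder showing "create outside mutex, install under mutex, close on failure". */
class EngineInstaller {
    private final Object engineMutex = new Object();  // serializes engine creation
    private final Object mutex = new Object();        // guards shard state
    private final AtomicReference<SketchEngine> currentEngine = new AtomicReference<>();
    private boolean closed = false;                   // guarded by mutex

    void close() {
        synchronized (mutex) {
            closed = true;
        }
    }

    SketchEngine current() {
        return currentEngine.get();
    }

    void createAndInstall(Supplier<SketchEngine> factory) {
        synchronized (engineMutex) {
            // Potentially slow construction happens without holding mutex.
            final SketchEngine newEngine = factory.get();
            boolean success = false;
            try {
                synchronized (mutex) {
                    if (closed) {
                        throw new IllegalStateException("shard is closed");
                    }
                    if (currentEngine.get() != null) {
                        throw new IllegalStateException("engine is running");
                    }
                    currentEngine.set(newEngine);
                    success = true;
                }
            } finally {
                if (success == false) {
                    // The engine was never installed, so nobody else can close it.
                    newEngine.close();
                }
            }
        }
    }
}
```

Without the `finally` block, an engine that fails the state check would stay open with no reference left to close it, which is the dangling-engine concern raised in the review.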
@henningandersen @s1monw I've addressed your comments. Can you take another look? Thank you.
LGTM
LGTM.
Thanks @dnhatn
@henningandersen @s1monw Thank you for reviewing.
Today we create new engines under IndexShard#mutex. This is not ideal because it can block the cluster state updates which also execute under the same mutex. We can avoid this problem by creating new engines under a separate mutex. Closes #43699
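The effect described above can be demonstrated with a small, self-contained sketch. Everything here is hypothetical (`ShardSketch`, `applyClusterState`, the latch-based "slow constructor") and stands in for the real classes only loosely. It assumes the lock order is always `engineMutex` first, then `mutex`, never the reverse.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical model: engine creation under its own lock, state updates under mutex. */
class ShardSketch {
    private final Object engineMutex = new Object();
    private final Object mutex = new Object();
    private int appliedStates = 0;   // guarded by mutex
    private Object engine;           // guarded by mutex

    void applyClusterState() {
        synchronized (mutex) {
            appliedStates++;
        }
    }

    int appliedStates() {
        synchronized (mutex) {
            return appliedStates;
        }
    }

    boolean hasEngine() {
        synchronized (mutex) {
            return engine != null;
        }
    }

    void createEngine(Runnable slowConstruction) {
        synchronized (engineMutex) {        // always taken before mutex, never after
            slowConstruction.run();         // heavy work; mutex is not held here
            Object newEngine = new Object();
            synchronized (mutex) {          // short critical section: just install
                engine = newEngine;
            }
        }
    }

    /** Returns true if a state update was applied while an engine was still being constructed. */
    static boolean stateUpdateProceedsDuringCreation() {
        ShardSketch shard = new ShardSketch();
        CountDownLatch constructing = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);
        Thread creator = new Thread(() -> shard.createEngine(() -> {
            constructing.countDown();
            awaitQuietly(finish);
        }));
        creator.start();
        try {
            constructing.await();           // construction is now in progress
            shard.applyClusterState();      // would hang here if construction held mutex
            boolean applied = shard.appliedStates() == 1 && shard.hasEngine() == false;
            finish.countDown();
            creator.join();
            return applied && shard.hasEngine();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finish.countDown();
            return false;
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling `ShardSketch.stateUpdateProceedsDuringCreation()` returns `true` in this model because the state update only needs `mutex`, which the creating thread does not hold while it constructs the engine.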
* Put error message from inside the process into the exception that is thrown when the process doesn't start correctly. (#45846) * update bwcVersions * [DOCS] Reformat match query (#45152) * Fix update-by-query script examples (#43907) Two examples had swapped the order of lang and code when creating a script. Relates #43884 * Adjusting ML usage object serialization bwc version (#45874) * Fsync translog without writeLock before rolling (#45765) Today, when rolling a new translog generation, we block all write threads until a new generation is created. This choice is perfectly fine except in a highly concurrent environment with the translog async setting. We can reduce the blocking time by pre-sync the current generation without writeLock before rolling. The new step would fsync most of the data of the current generation without blocking write threads. Close #45371 * Add node.processors setting in favor of processors (#45855) This commit namespaces the existing processors setting under the "node" namespace. In doing so, we deprecate the existing processors setting in favor of node.processors. * Remove binary file accidentally committed 🤦♀️ * Fix TransportSnapshotsStatusAction ThreadPool Use (#45824) In case of an in-progress snapshot this endpoint was broken because it tried to execute repository operations in the callback on a transport thread which is not allowed (only generic or snapshot pool are allowed here). * Enable testing against JDK 14 (#45178) This commit enables testing against JDK 14. * [DOCS] Add anchor to version types list. (#45886) * Adding a warning to from-size.asciidoc Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers of this behavior while still highlighting alternative solutions. * Remove redundant Java check from Sys V init (#45793) In the Sys V init scripts, we check for Java. 
This is not needed, since the same check happens in elasticsearch-env when starting up. Having this duplicate check has bitten us in the past, where we made a change to the logic in elasticsearch-env, but missed updating it here. Since there is no need for this duplicate check, we remove it from the Sys V init scripts. * Update joda to 2.10.3 (#45495) * Allow partial request body reads in AWS S3 retries tests (#45847) This commit changes the tests added in #45383 so that the fixture that emulates the S3 service now sometimes consumes all the request body before sending an error, sometimes consumes only a part of the request body and sometimes consumes nothing. The idea here is to beef up a bit the tests that writes blob because the client's retry logic relies on marking and resetting the blob's input stream. This pull request also changes the testWriteBlobWithRetries() so that it (rarely) tests with a large blob (up to 1mb), which is more than the client's default read limit on input streams (131Kb). Finally, it optimizes the ZeroInputStream so that it is a bit more effective (now works using an internal buffer and System.arraycopy() primitives). * Move testRetentionLeasesClearedOnRestore (#45896) * [DOCS] Reformat put mapping API docs (#45709) * Fix RemoteClusterConnection close race (#45898) Closing a `RemoteClusterConnection` concurrently with trying to connect could result in double invoking the listener. This fixes RemoteClusterConnectionTest#testCloseWhileConcurrentlyConnecting Closes #45845 * [ML][Transforms] fix doSaveState check (#45882) * [ML][Transforms] fix doSaveState check * removing unnecessary log statement * [ML] Improve progress reportings for DF analytics (#45856) Previously, the stats API reports a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. 
Thus, one cannot distinguish between a task that never run to one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one). * Expose the ability to cancel async requests in REST high-level client (#45688) This commits makes all the async methods in the high level client return the `Cancellable` object that the low level client now exposes. Relates to #45379 Closes #44802 * Fix IngestService to respect original document content type (#45799) This PR modifies the logic in IngestService to preserve the original content type on the IndexRequest, such that when a document with a content type like SMILE is submitted to a pipeline, the resulting document that is persisted will remain in the original content type (SMILE in this case). * Change `{var}` convention to `<var>` (#45904) * Fix bugs in Painless SCatch node (#45880) This fixes two bugs: - A recently introduced bug where an NPE will be thrown if a catch block is empty. - A long-time bug where an NPE will be thrown if multiple catch blocks in a row are empty for the same try block. * Update translog checkpoint after marking ops as persisted (#45634) If two translog syncs happen concurrently, then one can return before its operations are marked as persisted. In general, this should not be an issue; however, peer recoveries currently rely on this assumption. 
Closes #29161 * [DOCS] Reformat get index API docs (#45758) * [DOCS] Reformat delete index API docs (#45755) * Handle multiple loopback addresses (#45901) AbstractSimpleTransportTestCase.testTransportProfilesWithPortAndHost expects a host to only have a single IPv4 loopback address, which isn't necessarily the case. Allow for >= 1 address. * [DOCS] Relocate Ingest API docs to REST API section (#45812) * [ML][Transforms] adjusting when and what to audit (#45876) * [ML][Transforms] adjusting when and what to audit * Update DataFrameTransformTask.java * removing unnecessary audit message * Remove processors setting (#45905) The processors setting was deprecated in version 7.4.0 of Elasticsearch for removal in Elasticsearch 8.0.0. This commit removes the processors setting. * Remove translating processors in Docker entrypoint (#45923) Now that processors is no longer a valid Elasticsearch setting, this commit removes translation for it in the Docker entrypoint. * Deprecate the pidfile setting (#45938) This commit deprecates the pidfile setting in favor of node.pidfile. * Adjust node.pidfile version in cluster formation Now that the deprecation of pidfile has been backported to 7.4.0, this commit adjusts the version-conditional logic in cluster formation tasks for setting pidfile versus node.pidfile. * Remove non task aware execute methods from TransportAction (#45821) The TransportAction class has several ways to execute the action, some of which will create a task. This commit removes those non task aware variants in favor of handling task creation inside NodeClient for local actions. * Remove the pidfile setting (#45940) The pidfile setting was deprecated in version 7.4.0 of Elasticsearch for removal in Elasticsearch 8.0.0. This commit removes the pidfile setting. 
* Allow Transport Actions to indicate authN realm (#45767) This commit allows the Transport Actions for the SSO realms to indicate the realm that should be used to authenticate the constructed AuthenticationToken. This is useful in the case that many authentication realms of the same type have been configured and where the caller of the API(Kibana or a custom web app) already know which realm should be used so there is no need to iterate all the realms of the same type. The realm parameter is added in the relevant REST APIs as optional so as not to introduce any breaking change. * re-enable BWC tests after merging #45767 (#45948) * Fix plaintext on TLS port logging (#45852) Today if non-TLS record is received on TLS port generic exception will be logged with the stack-trace. SSLExceptionHelper.isNotSslRecordException method does not work because it's assuming that NonSslRecordException would be top-level. This commit addresses the issue and the log would be more concise. * Add Test Logging for #45953 (#45957) Adding some logging to track down #45953 and making the failing assertion log more detail * [DOCS] Reformat create index API docs (#45749) * Fix SnapshotStatusApisIT (#45929) The snapshot status when blocking can still be INIT in rare cases when the new cluster state that has the snapshot in `STARTED` hasn't yet become visible. Fixes #45917 * Fix Broken HTTP Request Breaking Channel Closing (#45958) This is essentially the same issue fixed in #43362 but for http request version instead of the request method. We have to deal with the case of not being able to parse the request version, otherwise channel closing fails. Fixes #43850 * Refactor RepositoryCredentialsTests (#45919) This commit refactors the S3 credentials tests in RepositoryCredentialsTests so that it now uses a single node (ESSingleNodeTestCase) to test how secure/insecure credentials are overriding each other. 
Using a single node makes it much easier to understand what each test is actually testing and IMO better reflect how things are initialized. It also allows to fold into this class the test testInsecureRepositoryCredentials which was wrongly located in S3BlobStoreRepositoryTests. By moving this test away, the S3BlobStoreRepositoryTests class does not need the allow_insecure_settings option anymore and thus can be executed as part of the usual gradle test task. * [DOCS] Reformat get settings API docs (#45924) * Better logging for TLS message on non-secure transport channel (#45835) This commit enhances logging for 2 cases: 1. If non-TLS enabled node receives transport message from TLS enabled node on transport port. 2. If non-TLS enabled node receives HTTPs request on transport port. * Relax translog assertion in testRestoreLocalHistoryFromTranslog (#45943) Since #45473, we trim translog below the local checkpoint of the safe commit immediately if soft-deletes enabled. In testRestoreLocalHistoryFromTranslog, we should have a safe commit after recoverFromTranslog is called; then we will trim translog files which contain only operations that are at most the global checkpoint. With this change, we relax the assertion to ensure that we don't put operations to translog while recovering history from the local translog. * Consider artifact repositories backed by S3 secure (#45950) Since credentials are required to access such a repository, and these repositories are accessed over an encrypted protocol (https), this commit adds support to consider S3-backed artifact repositories as secure. Additionally, we add tests for this functionality. * Build: Support `console-result` language (#45937) This adds support for verifying that snippets with the `console-result` language are valid json. 
It also switches the response snippets on the `docs/get` page from `js` to `console-result` which will allow clients to provide "alternatives" for them like they can now do with `// CONSOLE` snippets. * [DOCS] Reformat indices exists API docs (#45918) * [DOCS] Reformat get field mapping API docs (#45700) * Add Cumulative Cardinality agg (and Data Science plugin) (#43661) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations. * [DOCS] Correct `IIF` conditional section title (#45979) * Fix typo in plugin name, add to allowed settings * PKI realm authentication delegation (#45906) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396 * [ML] fixing bug where analytics process starts with 0 rows (#45879) The native process requires that there be a non-zero number of rows to analyze. If the flag --rows 0 is passed to the executable, it throws and does not start. When building the configuration for the process we should not start the native process if there are no rows. 
Adding some logging to indicate what is occurring. * [ML] add supported types to no fields error message (#45926) * [ML] add supported types to no fields error message * adding supported types to logger debug * Range Field support for Histogram and Date Histogram aggregations(#45395) * Add support for a Range field ValuesSource, including decode logic for range doc values and exposing RangeType as a first class enum * Provide hooks in ValuesSourceConfig for aggregations to control ValuesSource class selection on missing & script values * Branch aggregator creation in Histogram and DateHistogram based on ValuesSource class, to enable specialization based on type. This is similar to how Terms aggregator works. * Prioritize field type when available for selecting the ValuesSource class type to use for an aggregation * [TEST] wait for search task to be cancelled in SearchRestCancellationIT (#45978) SearchRestCancellationIT aborts an http request, and then checks that the corresponding search task has been cancelled on the server-side. There are no guarantees that the task has already been marked cancelled after the `cancel` calls returns, and there is no easy wait for that. This commit introduces an assertBusy to try and wait for the search task to be marked cancelled. Closes #45911 * Remove node settings from blob store repositories (#45991) This commit starts from the simple premise that the use of node settings in blob store repositories is a mistake. Here we see that the node settings are used to get default settings for store and restore throttle rates. Yet, since there are not any node settings registered to this effect, there can never be a default setting to fall back to there, and so we always end up falling back to the default rate. Since this was the only use of node settings in blob store repository, we move them. 
From this, several places fall out where we were chaining settings through only to get them to the blob store repository, so we clean these up as well. That leaves us with the changeset in this commit. * [DOCS] Streamline GS search topic. (#45941) * Streamline GS search topic. * Added missing comma. * Update docs/reference/getting-started.asciidoc Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co> * Add test for CopyBytesSocketChannel (#45873) Currently we use a custom CopyBytesSocketChannel for interfacing with netty. We have integration tests that use this channel, however we never verify the read and write behavior in the face of potential partial writes. This commit adds a test for this behavior. * Do not create engine under IndexShard#mutex (#45263) Today we create new engines under IndexShard#mutex. This is not ideal because it can block the cluster state updates which also execute under the same mutex. We can avoid this problem by creating new engines under a separate mutex. Closes #43699 * Fix compilation in CumulativeCardinalityAggregatorTests (#46000) Some generics were specified at too fine-grained a level. * [DOCS] Streamlined GS aggs section. (#45951) * [DOCS] Streamlined GS aggs section. * Update docs/reference/getting-started.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Don't use assemble task on root project (#45999) The root project uses the base plugin to get a clean task, but does not actually need the assemble task. This commit changes the root project to use the lifecycle-base plugin, which while still creating the assemble task, won't add any dependencies to it. * [DOCS] Fix typo. (#46006) * [TEST] wait for http channels to be closed in ESIntegTestCase (#45977) We recently added a check to `ESIntegTestCase` in order to verify that no http channels are being tracked when we close clusters and the REST client. 
Close listeners though are invoked asynchronously, hence this check may fail if we assert before the close listener that removes the channel from the map is invoked. With this commit we add an `assertBusy` so we try and wait for the map to be empty. Closes #45914 Closes #45955 * Add `manage_own_api_key` cluster privilege (#45897) The existing privilege model for API keys with privileges like `manage_api_key`, `manage_security` etc. are too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously APIs created would also need these privileges to get its own information. This commit adds support for `manage_own_api_key` cluster privilege which only allows api key cluster actions on API keys owned by the currently authenticated user. Also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes, documentation for the API changes. Closes #40031 * Partly revert globalInfo.ready check (#45960) This check was introduced in #41392 but had the unwanted side-effect that the keystore settings in such blocks would note be added in the node's keystore. 
Given that we have a mid-term plan for FIPS testing that would made such checks unnecessary, and that the conditional in these two cases is not really that important, this change removes this conditional logic so that full-cluster-restart and rolling upgrade tests will run with PEM files for key/certificate material no matter if we're in a FIPS JVM or not. Resolves: #45475 * [ML] Add option to regression to randomize training set (#45969) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (ie. those that have a value for the dependent variable) we randomly choose whether to actually use for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable. * Disallow partial results when shard unavailable (#45739) Searching with `allowPartialSearchResults=false` could still return partial search results during recovery. If a shard copy fails with a "shard not available" exception, the failure would be ignored and a partial result returned. The one case where this is known to happen is when a shard copy is recovering when searching, since `IllegalIndexShardStateException` is considered a "shard not available" exception. 
Relates to #42612 * [DOCS] Reformat open index API docs (#45921) * Fix RegressionTests#fromXContent (#46029) * The `trainingPercent` must be between `1` and `100`, not `0` and `100` which is causing test failures * [DOCS] Separate and reformat close index API docs (#45922) * Remove already exist assertion while renew ccr lease (#46009) If a CCR lease is disappeared while we are renewing it, then we will issue asyncAddRetentionLease to add that lease. And if asyncAddRetentionLease takes longer than retentionLeaseRenewInterval, then we can issue another asyncAddRetentionLease request. One of asyncAddRetentionLease requests will fail with RetentionLeaseAlreadyExistsException, hence trip the assertion. Closes #45192 * Watcher max_iterations with foreach action execution (#45715) Prior to this commit the foreach action execution had a hard coded limit to 100 iterations. This commit allows the max number of iterations to be a configuration ('max_iterations') on the foreach action. The default remains 100. * [DOCS] Reformat update index settings API docs (#45931) * Always add Java-9 style file permissions (#46050) Java 9 removed pathname canonicalization, which means that we need to add permissions for the path and also the real path when adding file permissions. Since master requires a minimum runtime of JDK 11, we no longer need conditional logic here to apply this pathname canonicalization with our bares hands. This commit removes that conditional pathname canonicalization. * [ML][HLRC] Add data frame analytics regression analysis (#46024) * [ML] Support boolean fields for DF analytics (#46037) This commit adds support for `boolean` fields in data frame analytics (and currently both outlier detection and regression). The analytics process expects `boolean` fields to be encoded as integers with 0 or 1 value. * Add a few notes on Cancellable to the LLRC and HLRC docs. 
(#45912) Add a section to both the low level and high level client documentation on asynchronous usage and `Cancellable` added for #44802 Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com> * [DOCS] [8.0] Add upgrade matrix to docs (#46027) * [DOCS] Add index alias exists API docs (#46042) * Few clean ups in ESBlobStoreRepositoryIntegTestCase (#46068) * Add XContentType as parameter to HLRC ART#createServerTestInstance (#46036) Add XContentType as parameter to the AbstractResponseTestCase#createServerTestInstance method. In the case a server side response class serializes xcontent as bytes then the test needs to know what xcontent type was randomily selected. This change is needed in #45970 * Fix rollover alias in SLM history index template (#46001) This commit adds the `rollover_alias` setting required for ILM to work correctly to the SLM history index template and adds assertions to the SLM integration tests to ensure that it works correctly. * Handle no-op document level failures (#46083) Today we assume that document failures can not occur for no-ops. This assumption is bogus, as they can fail for a variety of reasons such as the Lucene index having reached the document limit. Because of this assumption, we were asserting that such a document-level failure would never happen. When this bogus assertion is violated, we fail the node, a catastrophe. Instead, we need to treat this as a fatal engine exception. * Remove plugins dir reference from docs (#46047) While the plugin installation directory used to be settable, it has not been so for several major versions. This commit removes a lingering reference to the plugins directory in upgrade docs. closes #45889 * Fix rest-api-spec dep for external plugins (#45949) This commit fixes the maven coordinates for the rest-api-spec jar. It was accidentally by #45107. closes #45891 * Use float instead of double for query vectors. 
(#46004) Currently, when using script_score functions like cosineSimilarity, the query vector is treated as an array of doubles. Since the stored document vectors use floats, it seems like the least surprising behavior for the query vectors to also be float arrays. In addition to improving consistency, this change may help with some optimizations we have been considering around vector dot product. * Add Circle Processor (#43851) add circle-processor that translates circles to polygons * [ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044) Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors and difficult to troubleshoot. This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern. * Muting org.elasticsearch.client.MachineLearningIT.testEstimateMemoryUsage (#46099) * [DOCS] Adds search-related query parameters to the common parameters. (#46057) @szabosteve Merging so I can make some additions. Will incorporate the comments from @jrodewig. * Move netty numDirectArenas to jvm.options (#46104) We currently configure io.netty.allocator.numDirectArenas to be 0 in the jvm erconomics class. This is a config that we always want to set, so it makes sense to move it to jvm.options. * Handle delete document level failures (#46100) Today we assume that document failures can not occur for deletes. This assumption is bogus, as they can fail for a variety of reasons such as the Lucene index having reached the document limit. Because of this assumption, we were asserting that such a document-level failure would never happen. When this bogus assertion is violated, we fail the node, a catastrophe. Instead, we need to treat this as a fatal engine exception. 
* [DOCS] Reformats delete by query API (#46051) * Reformats delete by query API * Update docs/reference/docs/delete-by-query.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Updated common parms includes. * Flush engine after big merge (#46066) Today we might carry on a big merge uncommitted and therefore occupy a significant amount of diskspace for quite a long time if for instance indexing load goes down and we are not quickly reaching the translog size threshold. This change will cause a flush if we hit a significant merge (512MB by default) which frees diskspace sooner. * Docs _cat/health verification fix (#46064) The _cat/health call in getting-started assumes that the master task max wait time is always 0 (-), however, the test could sometimes run into a short wait time (like some ms). Fixed to allow this. * Do not throw an exception if the process finished quickly but without any error. (#46073) * [DOCS] Reformats URI search request (#45844) * [DOCS] Reformats URI search request. Co-Authored-By: James Rodewig <james.rodewig@elastic.co> Co-Authored-By: debadair <debadair@elastic.co> * DOC: Update SQL docs for DbVis and Workbench/J (#45981) Refresh the setup for the new versions of DbVisualizer and SQL Workbench/J which have Elasticsearch JDBC support out of the box. * Upgrade to Azure SDK 8.4.0 (#46094) * Upgrading to 8.4.0 here which brings bulk deletes to be used in a follow up PR * Use better matchers in AbstractSimpleTransportTestCase (#45899) Convert most of the assertions to use Hamcrest matchers, as they give much more context if an assertion fails. * Refactor auditor-related classes (#45893) * Unmute the test now that the fix for the underlying cause is merged in. (#46117) * Replace MockAmazonS3 usage in S3BlobStoreRepositoryTests by a HTTP server (#46081) This commit removes the usage of MockAmazonS3 in S3BlobStoreRepositoryTests and replaces it by a HttpServer that emulates the S3 service. 
This allows the repository tests to use the real Amazon's S3 client under the hood in tests and will allow to test the behavior of the snapshot/restore feature for S3 repositories by simulating random server-side internal errors. The HTTP server used to emulate the S3 service is intentionally simple and minimal to keep things understandable and maintainable. Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the AmazonS3Fixture. * Avoid overshooting watermarks during relocation (#46079) Today the `DiskThresholdDecider` attempts to account for already-relocating shards when deciding how to allocate or relocate a shard. Its goal is to stop relocating shards onto a node before that node exceeds the low watermark, and to stop relocating shards away from a node as soon as the node drops below the high watermark. The decider handles multiple data paths by only accounting for relocating shards that affect the appropriate data path. However, this mechanism does not correctly account for _new_ relocating shards, which are unwittingly ignored. This means that we may evict far too many shards from a node above the high watermark, and may relocate far too many shards onto a node causing it to blow right past the low watermark and potentially other watermarks too. There are in fact two distinct issues that this PR fixes. New incoming shards have an unknown data path until the `ClusterInfoService` refreshes its statistics. New outgoing shards have a known data path, but we fail to account for the change of the corresponding `ShardRouting` from `STARTED` to `RELOCATING`, meaning that we fail to find the correct data path and treat the path as unknown here too. This PR also reworks the `MockDiskUsagesIT` test to avoid using fake data paths for all shards. With the changes here, the data paths are handled in tests as they are in production, except that their sizes are fake. 
Fixes #45177 * AwaitsFix for #46124 * Revert "Use better matchers in AbstractSimpleTransportTestCase (#45899)" This reverts commit 38cf581d360bdf50b1b1f1b21607887d8c91cf36. * Revert "AwaitsFix for #46124" This reverts commit 71ead7552df1fbdfab2c0e72015496f53b29ab20. * [DOCS] [PUT DFA] Documents inline the child params of source and dest (#45649) * [DOCS] [PUT DFA] Documents inline the child params of source and dest. * [DOCS] Fixes indentation issues and amends dfa definitions. * Only verify global checkpoint if translog sync occurred (#45980) We only sync translog if the given offset hasn't synced yet. We can't verify the global checkpoint from the latest translog checkpoint unless a sync has occurred. Closes #46065 Relates #45634 * Start testing against AdoptOpenJDK (#45666) This commit adds AdoptOpenJDK to the testing matrix. * [DOCS] Reformats analyze API (#45986) * [DOCS] Add get index alias API docs (#46046) * Validate SLM policy ids strictly (#45998) This uses strict validation for SLM policy ids, similar to what we use for index names. Resolves #45997 * More Efficient Ordering of Shard Upload Execution (#42791) * Change the upload order of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard * Inspired by #39657 * [DOCS] Correct custom analyzer callouts (#46030) * Rename `data-science` plugin to `analytics` (#46092) This renames the "data-science" plugin to "analytics". Also removes the enabled flag * [DOCS] Separate add index alias API docs (#46086) * [DOCS] Reformat update index aliases API docs (#46093) * [ML] Regression dependent variable must be numeric (#46072) * [ML] Regression dependent variable must be numeric This adds a validation that the dependent variable of a regression analysis must be numeric. * Address review comments and fix some problems In addition to addressing the review comments, this commit fixes a few issues I found during testing. 
In particular: - if there were mappings for required fields but they were not included we were not reporting the error - if explicitly included fields had unsupported types we were not reporting the error Unfortunately, I couldn't get those fixed without refactoring the code in `ExtractedFieldsDetector`. * Ensure top docs optimization is fully disabled for queries with unbounded max scores. (#46105) When a query contains a mandatory clause that doesn't track the max score per block, we disable the max score optimization. Previously, we were doing this by wrapping the collector with a FilterCollector that always returned ScoreMode.COMPLETE. However we weren't adjusting totalHitsThreshold, so the collector could still call Scorer#setMinCompetitiveScore. It is against the method contract to call setMinCompetitiveScore when the score mode is COMPLETE, and some scorers like ReqOptSumScorer throw an error in this case. This commit tries to disable the optimization by always setting totalHitsThreshold to max int, as opposed to wrapping the collector. * [DOCS] Add "index template exists" API docs (#46095) * [DOCS] Add "delete index template" API docs (#46101) * Remove classic similarity (#46078) This commit removes the `classic` similarity from code and docs in master (8.0). The `classic` similarity cannot be used on indices created after 7.0. Closes #46058 * Add package docs for bundled jdk location (#46153) This commit expands the documented directory layout of the rpm and deb packages to include the bundled jdk. closes #45150 * bump version (#46158) * Set netty system properties in BuildPlugin (#45881) Currently in production instances of Elasticsearch we set a couple of system properties by default. We currently do not apply all of these system properties in tests. This commit applies these properties in the tests. * Remove insecure settings (#46147) This commit removes the oxymoron of insecure secure settings from the code base. 
In particular, we remove the ability to set the access_key and secret_key for S3 repositories inside the repository definition (in the cluster state). Instead, these settings now must be in the keystore. Thus, it also removes some leniency where these settings could be placed in the elasticsearch.yml, would not be rejected there, but would not be consumed for any purpose. * Inject random errors in S3BlobStoreRepositoryTests (#46125) This commit modifies the HTTP server used in S3BlobStoreRepositoryTests so that it randomly returns server errors for any type of request executed by the SDK client. It is now possible to verify that the repository tests are successfully completed even if one or more errors were returned by the S3 service in response to a blob upload, a blob deletion or an object listing request etc. Because injecting errors forces the SDK client to retry requests, the test limits the maximum errors to send in response for each request at 3 retries. * Forbid settings without a namespace (#45947) This commit forbids settings that are not in any namespace, all setting names must now contain a dot. * Enhanced logging when transport is misconfigured to talk to HTTP port (#45964) If a node is misconfigured to talk to a remote node's HTTP port (instead of the transport port), eventually it will receive an HTTP response from the remote node on the transport port (this happens when a node accidentally sends a line-terminating byte in a transport request). If this happens today it results in a non-friendly log message and a long stack trace. This commit adds a check for whether a malformed response is an HTTP response. In this case, a concise log message would appear. * Fix wrong URL encoding in watcher HTTP client (#45894) The test assumption was calling the wrong method resulting in a URL encoding before returning the data. 
Closes #44970 * Fix translog stats in testPrepareIndexForPeerRecovery (#46137) When recovering a shard locally, we use a translog snapshot from newSnapshotFromGen which consists of all readers from a certain generation. In the test, we use newSnapshotFromMinSeqNo for the expectation. The snapshot of this method includes only readers containing operations in the requesting range. Closes #46022 * Make Snapshot Logic Write Metadata after Segments (#45689) * Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581 * Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization * Fixes #41581 * [TEST] Mute PinnedQueryBuilderIT.testPinnedPromotions (#46175) Relates #46174 * Move plugin.mandatory to installing plugins docs This commit moves the plugin.mandatory settings from the plugin directory page in the docs to the installing plugins page in the docs. * Move plugin.mandatory to its own page This commit takes the reworking of plugin.mandatory docs even farther by taking this setting to its own page. * Add test tasks for unpooled and direct buffer pooling to netty (#46049) Some netty behavior is controlled by system properties. While we want to test with the defaults for Elasticsearch for most tests, within netty we want to ensure these netty settings exhibit correct behavior. This commit adds variants of test and integTest tasks for netty which set the unpooled and direct buffer pooled allocators. relates #45881 * Stabilize SLM REST Tests (#46195) Unfortunately, #42791 destabilized SLM tests because those tests use rate limiting the snapshot write rate to a very low value globally. 
Now that the various files in a snapshot get uploaded in parallel this can lead to a few threads in parallel way overshooting the low value throughput value used by the rate limiter and then making it wait for minutes which times out the tests that then try to abort the snapshot (see #21759 for details, aborting a snapshot only happens when writing bytes to the repository). For now the old behavior of the test from before my changes can be restored by moving to a single threaded snapshot pool but we should find a better way of testing the SLM behaviour here in a follow-up. * Clarify default behavior of auto_create_index (#46134) Be specific about the default behaviour of `action.auto_create_index` when a list is given. * Mute SnapshotLifeCycleIT (#46207) Relates #46205 * Remove Unused Method from BlobStoreRepository (#46204) This method isn't used anymore and I forgot to delete it. * Allow ingest processors access to node client. (#46077) This is the first PR that merges changes made to server module from the enrich branch (see #32789) into the master branch. The plan is to merge changes made to the server module separately from the pr that will merge enrich into master, so that these changes can be reviewed in isolation. 
* SQL: Fix issue with DataType for CASE with NULL (#46173) Previously, if the DataType of all the WHEN conditions of a CASE statement is NULL, then it was set to NULL even if the ELSE clause has a non-NULL data type, e.g.: ``` CASE WHEN a = 1 THEN NULL WHEN a = 5 THEN NULL ELSE 'foo' ``` Fixes: #46032 * Mute 2 tests in S3BlobStoreRepositoryTests (#46221) Muted testSnapshotAndRestore and testMultipleSnapshotAndRollback Relates #46218 and #46219 * Cleanup BlobStoreRepository Abort and Failure Handling (#46208) Aborts and failures were handled in a somewhat unfortunate way in #42791: Since the tasks for all files are generated before uploading they are all executed when a snapshot is aborted and lead to a massive number of failures added to the original aborted exception. In the case of failures the situation was not very reasonable as well. If one blob fails uploading the snapshot logic would upload all the remaining files as well and then fail (when previously it would just fail all following files). I fixed both of the above issues, by just short-circuiting all remaining tasks for a shard in case of an exception in any one upload. * Test fix for PinnedQueryBuilderIT (#46187) Fix test issue to stabilise scoring through use of DFS search mode. Randomised index-then-delete docs introduced by the test framework likely caused an imbalance in IDF scores across shards. Also made number of shards used in test a random number for added test coverage. Closes #46174 * Wait for all Rec. to Stop on Node Close (#46178) * Wait for all Rec. to Stop on Node Close * This issue is in the `RecoverySourceHandler#acquireStore`. If we submit the store release to the generic threadpool while it is getting shut down we never complete the future we wait on (in the generic pool as well) and fail to ever release the store potentially. 
* Fixed by waiting for all recoveries to end on node close so that we always have a healthy thread pool here * Closes #45956 * Disable request throttling in S3BlobStoreRepositoryTests (#46226) When some high values are randomly picked up - for example the number of indices to snapshot or the number of snapshots to create - the tests in S3BlobStoreRepositoryTests can generate a high number of requests to the internal S3 server. In order to test the retry logic of the S3 client, the internal server is designed to randomly generate random server errors. When many requests are made, it is possible that the S3 client reaches its maximum number of successive retries capacity. Then the S3 client will stop retrying requests until enough retry attempts succeed, but it means that any request could fail before reaching the max retries count and make the test fail too. Closes #46217 Closes #46218 Closes #46219 * Sync translog without lock when trim unreferenced readers (#46203) With this change, we can avoid blocking writing threads when trimming unreferenced readers; hence improving the translog writing performance in async durability mode. Close #46201 * Add debug assertions for userhome not existing (#46206) The elasticsearch user should not have a homedir, yet we have seen this particular test fail rather frequently with a failed check that the userhome does not exist. This commit adds some additional assertions on the presumptive userhome to narrow down where it might be created. relates #45903 * Remove duplicate line in SearchAfterBuilder (#45994) * reset queryGeometry in ShapeQueryTests (#45974) * [ML-DataFrame] Fix off-by-one error in checkpoint operations_behind (#46235) Fixes a problem where operations_behind would be one less than expected per shard in a new index matched by the data frame transform source pattern. 
For example, if a data frame transform had a source of foo* and a new index foo-new was created with 2 shards and 7 documents indexed in it then operations_behind would be 5 prior to this change. The problem was that an empty index has a global checkpoint number of -1 and the sequence number of the first document that is indexed into an index is 0, not 1. This doesn't matter for indices included in both the last and next checkpoints, as the off-by-one errors cancelled, but for a new index it affected the observed result. * Fixed synchronizing REST API inflight breaker names with internal variable (#40878) The internal configuration settings were like that: network.breaker.inflight_requests But the exposed REST API had the value names with underscore like that: network.breaker.in_flight_requests This was now corrected to without underscores like that: network.breaker.inflight_requests * [DOCS] Add delete index alias API docs (#46080) * [ML][Transforms] fixing stop on changes check bug (#46162) * [ML][Transforms] fixing stop on changes check bug * Adding new method finishAndCheckState to cover race conditions in early terminations * changing stopping conditions in `onStart` * allow indexer to finish when exiting early * Fix testSyncFailsIfOperationIsInFlight (#46269) testSyncFailsIfOperationIsInFlight could fail due to the index request spawning a GCP sync (new since 7.4). Test now waits for it to finish before testing that flushed sync fails. * [ML] Unmute testStopOutlierDetectionWithEnoughDocumentsToScroll (#46271) The test seems to have been failing due to a race condition between stopping the task and refreshing the destination index. In particular, we were going forward with refreshing the destination index even though the task stopped in the meantime. This was fixed in request. 
Closes #43960 * [ML][Transforms] protecting doSaveState with optimistic concurrency (#46156) * [ML][Transforms] protecting doSaveState with optimistic concurrency * task code cleanup * Suppress warning from background sync on relocated primary (#46247) If a primary is being relocated, then the global checkpoint and retention lease background sync can emit unnecessary warning logs. This side effect was introduced in #42241. Relates #40800 Relates #42241 * Add CumulativeCard pipeline agg to pipeline index (#46279) The Cumulative Cardinality docs weren't linked from the pipeline index page * Add more assertions and cleanup to setup passwords tests (#46289) This commit is a followup to #46206 to continue debugging failures in an elasticsearch homedir being created. A couple more assertions are added as well as a final cleanup at the end of the previous test to the one that fails. * Multi-get requests should wait for search active (#46283) When a shard has fallen search idle, and a non-realtime multi-get request is executed, today such requests do not wait for the shard to become search active and therefore such requests do not wait for a refresh to see the latest changes to the index. This also prevents such requests from triggering the shard as non-search idle, influencing the behavior of scheduled refreshes. This commit addresses this by attaching a listener to the shard search active state for multi-get requests. In this way, when the next scheduled refresh is executed, the multi-get request will then proceed. * [ML][Transforms] fixing listener being called twice (#46284) * Mute testRecoveryFromFailureOnTrimming Tracked at #46267 * Move MockRepository into test framework (#46298) This moves the `MockRepository` class into `test/framework/src/main` so it can be used across all modules and plugins in tests. * First round of optimizations for vector functions. 
(#46294) This PR merges the `vectors-optimize-brute-force` feature branch, which makes the following changes to how vector functions are computed: * Precompute the L2 norm of each vector at indexing time. (#45390) * Switch to ByteBuffer for vector encoding. (#45936) * Decode vectors while computing the vector function. (#46103) * Use an array instead of a List for the query vector. (#46155) * Precompute the normalized query vector when using cosine similarity. (#46190) Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co> * Initialize document subset bit set cache used for DLS (#46211) This commit initializes DocumentSubsetBitsetCache even if DLS is disabled. Previously it would throw a null pointer exception when querying usage stats if we explicitly disabled DLS as there would be no instance of DocumentSubsetBitsetCache to query. It is okay to initialize DocumentSubsetBitsetCache which will be empty as the license enforcement would prevent usage of DLS feature and it will not fail when accessing usage stats. Closes #45147 * [ML-DataFrame] unmute tests for debugging purposes (#46121) unmute testGetCheckpointStats closes #45238 * SQL: Fix issue with IIF function when condition folds (#46290) Previously, when the condition (1st argument) of the IIF function could be evaluated (folded) to false, the `IfConditional` was eliminated which caused `IndexOutOfBoundsException` to be thrown when `info()` and `resolveType()` methods were called. Fixes: #46268 * [DOCS] Reformats multi search API (#46256) * [DOCS] Reformats multi search API. Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Remove stack trace logging in Security(Transport|Http)ExceptionHandler (#45966) As per #45852 comment we no longer need to log stack-traces in SecurityTransportExceptionHandler and SecurityHttpExceptionHandler even if trace logging is enabled. * [DOCS] Reformats request body search API (#46254) * [DOCS] Reformats request body search API. 
Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Reenable+Fix testMasterShutdownDuringFailedSnapshot (#46303) Reenable this test since it was fixed by #45689 in production code (specifically, the fact that we write the `snap-` blobs without overwrite checks now). Only required adding the assumed blocking on index file writes to test code to properly work again. * Closes #25281 * DOCS Link to kib reference from es reference on PKI authn (#46260) * Quote the task name in reproduction line printer (#46266) Some task names contain `#`, for instance, which doesn't play well with some shells (e.g. zsh) * [DOCS] Reformats search shards API (#46240) * [DOCS] Reformats search shards API Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Fix SearchService.createContext exception handling (#46258) An exception from the DefaultSearchContext constructor could leak a searcher, causing future issues like shard lock obtained exceptions. The underlying cause of the exception in the constructor has been fixed, but as a safety precaution we also fix the exception handling in createContext. Closes #45378 * Bwc testclusters all (#46265) Convert all bwc projects to testclusters * Adjacency_matrix aggregation optimisation. (#46257) Avoid pre-allocating ((N * N) - N) / 2 “BitsIntersector” objects given N filters. Most adjacency matrices will be sparse and we typically don’t need to allocate all of these objects - can save a lot of allocations when the number of filters is high. Closes #46212 * [DOCS] Reformats search template and multi search template APIs (#46236) * [DOCS] Reformats search template and multi search template APIs. Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Improve documentation for X-Opaque-ID (#46167) This field can be present in search slow logs and deprecation logs. The docs describe how to enable this functionality and what to expect in the logs. 
closes #44851 * [DOCS] Add "get index template" API docs (#46296) * Do not send recovery requests with CancellableThreads (#46287) Previously, we send recovery requests using CancellableThreads because we send requests and wait for responses in a blocking manner. With async recovery, we no longer need to do so. Moreover, if we fail to submit a request, then we can release the Store using an interruptible thread which can risk invalidating the node lock. This PR is the first step to avoid forking when releasing the Store. Relates #45409 Relates #46178 * Build: Enable testing without magic comments (#46180) Previously we only turned on tests if we saw either `// CONSOLE` or `// TEST`. These magic comments are difficult for the docs build to deal with so it has moved away from using them where possible. We should catch up. This adds another trigger to enable testing: marking a snippet with the `console` language. It looks like this: ``` [source,console] ---- GET / ---- ``` This saves a line which is nice, I guess. But it is more important to me that this is consistent with the way the docs build works now. Similarly this enables response testing when you mark a snippet with the language `console-result`. That looks like: ``` [source,console-result] ---- { "result": "0.1" } ---- ``` `// TESTRESPONSE` is still available for situations like `// TEST`: when the response isn't *in* the console-result language (like `_cat`) or when you want to perform substitutions on the generated test. Should unblock #46159. * Docs for translog, history retention and flushing (#46245) This commit updates the docs about translog retention and flushing to reflect recent changes in how peer recoveries work. It also adds some docs to describe how history is retained for replay using soft deletes and shard history retention leases. 
Relates #45473 * [DOCS] Reformat "put index template" API docs (#46297) * Add test that get triggers shard search active (#46317) This commit is a follow-up to a change that fixed that multi-get was not triggering a shard to become search active. In that change, we added a test that multi-get properly triggers a shard to become search active. This commit is a follow-up to that change which adds a test for the get case. While get is already handled correctly in production code, there was not a test for it. This commit adds one. Additionally, we factor all the search idle tests from IndexShardIT into a separate test class, as an effort to keep related tests together instead of a single large test class containing a jumble of tests, and also to keep test classes smaller for better parallelization. * Document support of OIDC Implicit flow in Kibana. (#45693) * [DOCS] Replace "// CONSOLE" comments with [source,console] (#46159) * [DOCS] Identify reloadable EC2 Discovery Plugin settings (#46102) * [ML] testFullClusterRestart waiting for stable cluster (#46280) * [ML] waiting for ml indices before waiting task assignment testFullClusterRestart * waiting for a stable cluster after fullrestart * removing unused imports * [ML][Transforms] fixing rolling upgrade continuous transform test (#45823) * [ML][Transforms] fixing rolling upgrade continuous transform test * adjusting wait assert logic * adjusting wait conditions * muting test (#46343) * Decouple shard allocation awareness from search and get requests (#45735) With this commit, Elasticsearch will no longer prefer using shards in the same location (with the same awareness attribute values) to process `_search` and `_get` requests. Instead, adaptive replica selection (the default since 7.0) should route requests more efficiently using the service time of prior inter-node communications. Clusters with big latencies between nodes should switch to cross cluster replication to isolate nodes within the same zone. 
Note that this change only targets 8.0 since it is considered as breaking. However a follow up pr should add an option to activate this behavior in 7.x in order to allow users to opt-in early. Closes #43453 * Revert "Sync translog without lock when trim unreferenced readers (#46203)" Unfortunately, with this change, we won't clean up all unreferenced generations when reopening. We assume that there's at most one unreferenced generation when reopening translog. The previous implementation guarantees this assumption by syncing translog every time after we remove a translog reader. This change, however, only syncs translog once after we have removed all unreferenced readers (can be more than one) and breaks the assumption. Closes #46267 This reverts commit fd8183ee51d7cf08d9def58a2ae027714beb60de. * [DOCS] Identify reloadable S3 repository plugin settings (#46349) * Unmute testRecoveryFromFailureOnTrimming Tracked at #46267 * [DOCS] Identify reloadable GCS repository plugin settings (#46352) * [DOCS] Synchs Watcher API titles with better HLRC titles (#46328) * Add repository integration tests for Azure (#46263) Similarly to what had been done for S3 (#46081) and GCS (#46255) this commit adds repository integration tests for Azure, based on an internal HTTP server instead of mocks. * Replace mocked client in GCSBlobStoreRepositoryTests by HTTP server (#46255) This commit removes the usage of MockGoogleCloudStoragePlugin in GoogleCloudStorageBlobStoreRepositoryTests and replaces it by a HttpServer that emulates the Storage service. This allows the repository tests to use the real Google's client under the hood in tests and will allow us to test the behavior of the snapshot/restore feature for GCS repositories by simulating random server-side internal errors. The HTTP server used to emulate the Storage service is intentionally simple and minimal to keep things understandable and maintainable. 
Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the GoogleCloudStorageFixture. * Mute failing SamlAuthenticationIT tests (#46369) see #44410 * Enable Debug Logging for Master and Coordination Packages (#46363) In order to track down #46091: * Enables debug logging in REST tests for `master` and `coordination` packages since we suspect that issues are caused by failed and then retried publications * Quiet down shard lock failures (#46368) These were actually never intended to be logged at the warning level but made visible by a refactoring in #19991, which introduced a new exception type but forgot to adapt some of the consumers of the exception. * [ML][Transforms] allow executor to call start on started task (#46347) * [DOCS] Reformat index segments API docs (#46345) * [DOCS] Re-add versioning to put template docs (#46384) Adds documentation for index template versioning accidentally removed with #46297. * [ML][Transforms] update supported aggs docs (#46388) * Support geotile_grid aggregation in composite agg sources (#45810) Adds support for `geotile_grid` as a source in composite aggs. Part of this change includes adding a new docFormat of `GEOTILE` that formats a hashed `long` value into a geotile formatting string `zoom/x/y`. * Refactor AllocatedPersistentTask#init(), move rollup logic out of ctor (#46288) This makes the AllocatedPersistentTask#init() method protected so that implementing classes can perform their initialization logic there, instead of the constructor. Rollup's task is adjusted to use this init method. It also slightly refactors the methods to use a static logger in the AllocatedTask instead of passing it in via an argument. 
This is simpler, logged messages come from the task instead of the service, and is easier for tests * [DOCS] Update snippets in security APIs (#46191) * [DOCS] Identify reloadable Azure repository plugin settings (#46358) * [DOCS] Reformats Watcher APIs using template (#46152) * Add docs on upgrading the keystore (#46331) This commit adds a note to the docs regarding upgrading the keystore. * [ML] Fixing instance serialization version for bwc (#46403) * [DOCS] Reformat index stats API docs (#46322) * Adjusting bwc serialization after backport (#46400) * Clarify error message on keystore write permissions (#46321) When the Elasticsearch process does not have write permissions to upgrade the Elasticsearch keystore, we bail with an error message that indicates there is a filesystem permissions problem. This commit clarifies that error message by pointing out the directory where write permissions are required, or that the user can also run the elasticsearch-keystore upgrade command manually before starting the Elasticsearch process. In this case, the upgrade would not be needed at runtime, so the permissions would not be needed then. * Revert "Refactor AllocatedPersistentTask#init(), move rollup logic out of ctor (#46288)" This reverts commit d999942c6dfd931266d01db24d3fb26b29cf8f64. * reuse mock client to avoid problems with thread context closed errors (#46398) * [DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295) * [ML-DataFrame] improve error message for timeout case in stop (#46131) improve error message if stopping of transform times out. related #45610 * Fix usage of randomIntBetween() in testWriteBlobWithRetries (#46380) This commit fixes the usage of randomIntBetween() in the test testWriteBlobWithRetries, when the test generates a random array of a single byte. 
* cleanup static member * Resolve the incorrect scroll_current when delete or close index (#45226) Resolve the incorrect current scroll for deleted or closed index * [ML] Extract DataFrameAnalyticsTask into its own class (#46402) This refactors `DataFrameAnalyticsTask` into its own class. The task has quite a lot of functionality now and I believe it would make code more readable to have it live as its own class rather than an inner class of the start action class. * Mute CcrRollingUpgradeIT.testUniDirectionalIndexFollowing and testUniDirectionalIndexFollowing (#46429) Relates #46416 * Mute SSLClientAuthTests.testThatHttpFailsWithoutSslClientAuth() Tracked in #46230 * Add yet more logging around index creation (#46431) Further investigation into #46091, expanding on #46363, to add even more detailed logging around the retry behaviour during index creation. * [Transform] simplify class structure of indexer (#46306) simplify transform task and indexer - remove redundant transform id - moving client data frame indexer (and builder) into a separate file * [ML] Tolerate total_search_time_ms not mapped in get datafeed stats (#46432) ML users who upgrade from versions prior to 7.4 to 7.4 or later will have ML results indices that do not have mappings for the total_search_time_ms field. Therefore, when searching these indices we must tolerate this field not having a mapping. Fixes #46437 * [DOCS] Adds progress parameter description to the GET stats data frame analytics API doc. 
(#46434) * [DOCS] Resort common-parms (#46419) * [DOCS] Change // CONSOLE comments to [source,console] (#46441) * [DOCS] Add index alias definition to glossary (#46339) * [Docs] Fix typo in field-names-field.asciidoc (#46430) * [DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) * [DOCS] Correct definition for `allow_no_indices` parameter (#46450) * Increase REST-Test Client Timeout to 60s (#46455) We are seeing requests take more than the default 30s which leads to requests being retried and returning unexpected failures like e.g. "index already exists" because the initial requests that timed out, worked out functionally anyway. => double the timeout to reduce the likelihood of the failures described in #46091 => As suggested in the issue, we should in a follow-up turn off retrying all-together probably * [DOCS] Remove cat request from Index Segments API requests (#46463) * SQL: fix scripting for grouped by datetime functions (#46421) * Fix issue with painless scripting not being correctly generated when datetime functions are used for GROUPing of an INTERVAL operation. * Use `null` schema response for `SYS TABLES` command. (#46386) * Ignore replication for noop updates (#46458) Previously, we ignore replication for noop updates because they do not have sequence numbers. Since #44603, we started assigning sequence numbers to noop updates leading them to be replicated to replicas. This bug occurs only on 8.0 for it requires #41065 and #44603. Closes #46366 * Strengthen testUpdate in rolling upgrade We hit a bug where we can't partially update documents created in a mixed cluster between 5.x and 6.x. Although this bug does not affect 7.0 or later, we should have a good test that catches this issue. Relates #46198
We should not open new engines if a shard is closed. We broke this assumption in #45263, where we stopped verifying the shard state before creating an engine and now verify it only before swapping the engine reference. We can fail to snapshot the store metadata or checkIndex a closed shard if there's some IndexWriter holding the index lock. Closes #47060
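To make the intent of this fix concrete, here is a minimal sketch of checking the shard state before the engine constructor runs. It is not the actual IndexShard code: `CountingEngine`, `GuardedShard`, `engineMutex`, and `hasEngine` are hypothetical names, and it assumes that `close()` takes the same engine lock so the shard cannot be closed between the check and the constructor call.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical engine that counts constructor calls; it stands in for
// opening an IndexWriter, which would take the index lock.
final class CountingEngine {
    static final AtomicInteger CREATED = new AtomicInteger();

    CountingEngine() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical, heavily simplified shard; not the real IndexShard.
final class GuardedShard {
    private final Object mutex = new Object();       // guards shard state
    private final Object engineMutex = new Object(); // serializes engine creation and close
    private boolean closed;                          // guarded by mutex
    private CountingEngine engine;                   // guarded by mutex

    private void verifyNotClosed() {
        synchronized (mutex) {
            if (closed) {
                throw new IllegalStateException("shard is closed");
            }
        }
    }

    void openEngine() {
        synchronized (engineMutex) {
            // Check the state first, so a closed shard never runs the constructor.
            verifyNotClosed();
            CountingEngine created = new CountingEngine();
            synchronized (mutex) {
                engine = created;
            }
        }
    }

    void close() {
        // Assumption: close also takes engineMutex, so it cannot interleave
        // with the check-then-create sequence in openEngine().
        synchronized (engineMutex) {
            synchronized (mutex) {
                closed = true;
                engine = null;
            }
        }
    }

    boolean hasEngine() {
        synchronized (mutex) {
            return engine != null;
        }
    }
}
```

In this sketch, calling `openEngine()` after `close()` throws before any `CountingEngine` is constructed, so nothing is left holding a lock on the closed shard's index.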
Today we create a new engine under `IndexShard#mutex`. This is not ideal because it can block the cluster state updates which require the same mutex. The constructor of InternalEngine can take some time as it implicitly builds the global ordinals if the `eager_global_ordinals` setting is true. To solve this problem, I explored two options:

1. Move the expensive stuff from the Engine's constructor to the `start` method, and call `start` outside IndexShard#mutex. See https://github.com/elastic/elasticsearch/compare/master...dnhatn:start-engine?expand=1
2. Create engines under a new mutex in IndexShard

I prefer the second approach for it's more contained.
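
The second approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `SketchEngine`, `SketchShard`, `engineMutex`, and `engine()` are hypothetical names. The slow constructor runs under a dedicated engine lock, and the shard mutex is held only for the state check and the reference swap.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an engine whose constructor is expensive.
final class SketchEngine implements AutoCloseable {
    private volatile boolean closed;

    SketchEngine() {
        // Imagine slow work here, e.g. eagerly building global ordinals.
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical, heavily simplified shard; not the real IndexShard.
final class SketchShard {
    private final Object mutex = new Object();       // also needed by cluster state updates
    private final Object engineMutex = new Object(); // only serializes engine creation
    private final AtomicReference<SketchEngine> currentEngine = new AtomicReference<>();
    private boolean closed;                          // guarded by mutex

    void openEngine() {
        synchronized (engineMutex) {
            if (currentEngine.get() != null) {
                throw new IllegalStateException("engine is already open");
            }
            // The slow constructor runs without holding `mutex`,
            // so cluster state updates are not blocked by it.
            SketchEngine newEngine = new SketchEngine();
            boolean installed = false;
            try {
                synchronized (mutex) {
                    // Short critical section: state check and reference swap only.
                    if (closed) {
                        throw new IllegalStateException("shard is closed");
                    }
                    currentEngine.set(newEngine);
                    installed = true;
                }
            } finally {
                if (!installed) {
                    newEngine.close(); // do not leak an engine that was never installed
                }
            }
        }
    }

    void close() {
        SketchEngine engine;
        synchronized (mutex) {
            closed = true;
            engine = currentEngine.getAndSet(null);
        }
        if (engine != null) {
            engine.close();
        }
    }

    SketchEngine engine() {
        return currentEngine.get();
    }
}
```

The lock order in this sketch is always `engineMutex` first and `mutex` second, so the two locks cannot deadlock against each other.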
Closes #43699