Provide explanation of dangling indices, fixes #26008 #26965

geekpete · 2017-10-11T11:42:30Z

Adding explanation of dangling indices back to the docs to fix #26008

Related to #25726, this resolves #25556 for the 5.x series by parsing "*" as a `MatchAllDocsQuery` instead of expanding it to a (potentially expensive) query on the `_field_names`.

…al part (#25835) This change makes it possible to index a value like "1.0" or "1.1" into whole number field types like byte and integer. Without this change, the above values would have resulted in an error, even with coerce set to true. Closes #25819
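The coercion described above can be illustrated with a small Python sketch. This is a model of the behaviour with coerce set to true, not the Elasticsearch mapper code: the function name `coerce_to_whole_number`, the `WHOLE_NUMBER_RANGES` table, and the choice to truncate toward zero are assumptions made for illustration.

```python
from decimal import Decimal, InvalidOperation, ROUND_DOWN

# Value ranges of the whole-number field types (assumed to match Java's types).
WHOLE_NUMBER_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "integer": (-2**31, 2**31 - 1),
    "long": (-2**63, 2**63 - 1),
}


def coerce_to_whole_number(value, field_type="integer"):
    """Model of coerce=true: accept "1.0" or "1.1" and drop the decimal part."""
    try:
        number = Decimal(str(value))
    except InvalidOperation:
        raise ValueError(f"not a number: {value!r}")
    if not number.is_finite():
        raise ValueError(f"not a finite number: {value!r}")
    # Assumption: the fractional part is truncated toward zero.
    whole = int(number.to_integral_value(rounding=ROUND_DOWN))
    low, high = WHOLE_NUMBER_RANGES[field_type]
    if not low <= whole <= high:
        raise ValueError(f"{value!r} is out of range for a {field_type}")
    return whole
```

Under these assumptions, `coerce_to_whole_number("1.0")` and `coerce_to_whole_number("1.1")` both yield `1` instead of raising an error.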

…5882) Since #25208 the Java Low-Level Rest Client has shaded dependencies. This commit updates the documentation to reflect that.

Stored fields were still being accessed for nested inner hits even if the _source was not requested. This was done to figure out the id of the root document. However this is already known higher up the stack. So instead this change adds the id to the nested search context, so that it is no longer required to be fetched via the stored fields.

In case the _source is large and no source is requested then hot threads like these ones would still appear:

```
100.3% (501.3ms out of 500ms) cpu usage by thread 'elasticsearch[AfXKKfq][search][T#6]'
  2/10 snapshots sharing following 22 elements
    org.apache.lucene.store.DataInput.skipBytes(DataInput.java:352)
    org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
    org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
    org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
    org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
    org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
    org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
    org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
    org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
    org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
    org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
    org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
    org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```

and:

```
8/10 snapshots sharing following 27 elements
  org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
  org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
  org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.fillBuffer(CompressingStoredFieldsReader.java:531)
  org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.readBytes(CompressingStoredFieldsReader.java:550)
  org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
  org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
  org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
  org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
  org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
  org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
  org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
  org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
  org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
  org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
  org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
  org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
  org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
  org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```

…er change (#25877) This predicate is used to deal with the intricacies of detecting when a master is re-elected or a node rejoins an existing master. The current implementation is based on nodeIds, which is fine if the master really changes. If the nodeId is equal, the code falls back to detecting an increment in the cluster state version, which happens when a node is re-elected or when the node rejoins. Sadly, this doesn't cover the case where the same node is elected after a full restart of all master nodes. In that case we recover the cluster state from disk but the version is reset back to 0. To fix this, the check should be done based on ephemeral IDs, which are reset on restart. Fixes #25471
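The difference between the two checks can be sketched in Python. This is a simplified model of the reasoning in the commit message, not the actual predicate from the Elasticsearch code base; the `Node` and `ClusterState` classes and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    node_id: str       # persistent across restarts
    ephemeral_id: str  # regenerated on every restart


@dataclass(frozen=True)
class ClusterState:
    master: Optional[Node]
    version: int


def master_changed_by_node_id(old: ClusterState, new: ClusterState) -> bool:
    """Old behaviour: compare node ids, then fall back to a version increment."""
    if new.master is None:
        return False
    if old.master is None or old.master.node_id != new.master.node_id:
        return True
    return new.version > old.version


def master_changed_by_ephemeral_id(old: ClusterState, new: ClusterState) -> bool:
    """Fixed behaviour: compare ephemeral ids, which are reset on restart."""
    if new.master is None:
        return False
    if old.master is None or old.master.ephemeral_id != new.master.ephemeral_id:
        return True
    return new.version > old.version


# Full restart of all master nodes: same node id, new ephemeral id,
# and the recovered cluster state version is reset back to 0.
before = ClusterState(Node("node-1", "ephemeral-a"), version=42)
after = ClusterState(Node("node-1", "ephemeral-b"), version=0)
```

With these two states, the node-id check misses the re-election because neither the node id nor the version signals a change, while the ephemeral-id check detects it.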

The configuration removed from the runtime configuration did not properly remove the deps jar from gradle versions > 3.3. The rest client now removes both the 3.3 and 3.3+ configurations so this works on both versions of gradle. Closes #25884 Relates #25208

Queries are supposed to be cacheable per segment, yet matches of this query also depend on how many documents exist on previous segments.
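A toy Python model shows why such a query is unsafe to cache per segment. It assumes the query matches documents whose global doc id (the segment's doc base plus the segment-local id) is at least `min_doc`; the function `min_doc_matches` and the class `PerSegmentCache` are hypothetical and do not mirror Lucene's or Elasticsearch's caching code.

```python
def min_doc_matches(doc_base, segment_size, min_doc):
    """Segment-local doc ids whose global id (doc_base + local id) is >= min_doc."""
    return [doc for doc in range(segment_size) if doc_base + doc >= min_doc]


class PerSegmentCache:
    """Caches matches per segment, ignoring what precedes the segment."""

    def __init__(self):
        self._entries = {}

    def get(self, segment_name, doc_base, segment_size, min_doc):
        # The key deliberately omits doc_base, as a per-segment cache would.
        key = (segment_name, min_doc)
        if key not in self._entries:
            self._entries[key] = min_doc_matches(doc_base, segment_size, min_doc)
        return self._entries[key]


cache = PerSegmentCache()
# The segment initially follows 10 documents held in earlier segments.
first = cache.get("seg_b", doc_base=10, segment_size=5, min_doc=12)
# Later the same segment follows 20 documents, so every doc should match.
stale = cache.get("seg_b", doc_base=20, segment_size=5, min_doc=12)
fresh = min_doc_matches(doc_base=20, segment_size=5, min_doc=12)
```

In this model the cached entry goes stale as soon as the number of documents in previous segments changes, which is the failure mode the commit message describes.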

At the shard level we use an operation permit to coordinate between regular shard operations and special operations that need exclusive access. In ES versions < 6, the operation requiring exclusive access was invoked during primary relocation, but in ES versions >= 6 this exclusive access is also used when a replica learns about a new primary or when a replica is promoted to primary. These special operations requiring exclusive access delay regular operations from running by adding them to a queue. After the exclusive access finishes, the delayed operations are released and need to be put back on the original thread pool they were running on.

In the presence of thread pool rejections, the current implementation had two issues:

- it would not properly release the operation permit when hitting a rejection (i.e. when calling ThreadedActionListener.onResponse from IndexShardOperationPermits.acquire).
- it would not invoke the onFailure method of the action listener when the shard was closed, and just log a warning instead (see ThreadedActionListener.onFailure), which would ultimately lead to the replication task never being cleaned up (see #25863).

This commit fixes both issues by introducing a custom threaded action listener that is permit-aware and properly deals with rejections. Closes #25863
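The first of these issues can be modelled with a single-threaded Python sketch. It is an illustration only, not the `IndexShardOperationPermits` implementation: the `OperationPermits` class, the `RejectedExecution` exception, and the listener methods `on_response` and `on_failure` are hypothetical names, and the closed-shard case is not modelled.

```python
class RejectedExecution(Exception):
    """Raised by the (hypothetical) thread pool when it cannot accept a task."""


class OperationPermits:
    def __init__(self, submit):
        self._submit = submit  # callable(task); may raise RejectedExecution
        self._blocked = False
        self._delayed = []
        self.held = 0          # number of permits currently held

    def _release(self):
        self.held -= 1

    def acquire(self, listener):
        if self._blocked:
            # Exclusive access in progress: queue the operation for later.
            self._delayed.append(listener)
            return
        self.held += 1
        listener.on_response(self._release)

    def block(self):
        self._blocked = True

    def unblock(self):
        self._blocked = False
        delayed, self._delayed = self._delayed, []
        for listener in delayed:
            self.held += 1
            try:
                # Hand the operation back to its thread pool.
                self._submit(lambda l=listener: l.on_response(self._release))
            except RejectedExecution as error:
                # Permit-aware handling: release the permit, then notify.
                self._release()
                listener.on_failure(error)
```

The point of the sketch is the `except` branch: a rejected operation gives its permit back and its listener is told about the failure, so nothing is left holding a permit indefinitely.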

With Gradle 4.1 and newer JDK versions, we can finally invoke Gradle directly using a JDK9 JAVA_HOME without requiring a JDK8 to "bootstrap" the build. As the thirdPartyAudit task runs within the JVM that Gradle runs in, it needs to be adapted now to be JDK9 aware. This commit also changes the `JavaCompile` tasks to only fork if necessary (i.e. when Gradle's JVM and JAVA_HOME's JVM differ).

The low level rest client does not need the shadow plugin applied, it only needs the plugin jar in the classpath, in order to create a ShadowJar task. Relates #25208

* A cycle was detected in eclipse, and was fixed in the same fashion as core and core-tests.
* The rest client deps jar was not properly exported in the generated eclipse classpath file for rest client.

Relates #25208

Closes #25935

The example output for node info and cluster stats was outdated w.r.t. the information that is shown for plugins. This commit updates the example output and the explanation of the respective fields.

It looks a bit ambiguous here. Elasticsearch no longer uses the 'hybrid mmapfs / niofs' store, which chose the filesystem type based on the file. It is one of mmapfs, niofs or simplefs, depending on the operating system, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/index-modules-store.html Thanks, Pulkit Agrawal

This commit adds a bootstrap check for the maximum file size, and ensures the limit is set correctly when Elasticsearch is installed as a service on systemd-based systems. Relates #25974

This commit removes some useless empty-line checks from the evil JNA tests. These checks are useless because if the lines are actually empty, the for loop will never be entered and we will hit the fail condition at the bottom as intended anyway.

The systemd service file that ships with Elasticsearch for systemd-based systems contains a suggestion for setting LimitMEMLOCK if the user wants to enable bootstrap.memory_lock. However, setting this in the installed service file goes against best practices for working with systemd, and against our existing documentation for how to set this. Therefore, we should not have this suggestion in the service file; otherwise users might be led to think they should edit it there. Relates #25979

We have a bootstrap check for the maximum size of the virtual memory address space for the Elasticsearch process. We can set this limit in the service file for Elasticsearch when it is installed as a service on systemd-based systems. This gives a better user experience than leaving users to fumble through setting it via /etc/security/limits.d (as a lot of pages on the Internet would tell them), not realizing that systemd completely ignores these for services, and then trying to figure out how to add a unit file for the Elasticsearch service. Relates #25975

We set some limits in the service file for Elasticsearch when installed as a service on systemd-based systems. This commit adds a packaging test that these limits are indeed set correctly. Relates #25976

This commit removes an outdated reference to http_address in the nodes info docs. This information is available in the http object for each node in the nodes info API response. Relates #25980

We have been publishing javadocs to artifacts.elastic.co (and snapshots.elastic.co) for a while. This commit adds links to them on the transport client, low level REST client, sniffer and high level REST client pages. Closes #23761

…ith ignore_malformed=true when inserting "NaN", "Infinity" or "-Infinity" values (#25967)

In the packaging tests we have an assertion that the maximum number of processes was set correctly by systemd when Elasticsearch is started as a service on systemd-based systems. However, when this assertion was backported, the backport did not account for the fact that the limit is different between 5.x and 6.x (and current master). Namely, in 6.x we made the decision to increase the limit to 4096 because we removed the artificial limit of 32 on the number of processors and so increased the limit in the bootstrap check for maximum number of processes as well. This commit fixes this assertion to be consistent with the bootstrap check and the setting in the service in the 5.x branch.

This commit adds a small note to the discovery docs recommending that the unicast hosts list be maintained as the list of master-eligible nodes in the cluster. Relates #25991

…26856)

…26721)

…nt" API Closes #26895

The document type is `question`, not `my_parent`

…oring (#26936) This commit adds a warning to deter contributors from creating PRs generated by tools to do large refactors just for the sake of refactoring.

List settings are not correctly filtered out in 5.6 and 6.0. This has been detected by a failing packaging test after #26878 has been merged.

… timestamps specified as a json number Closes #26890

This commit adds instructions for installing Elasticsearch via Homebrew to the Getting Started guide. Relates #26847

This commit fixes an issue with the handling of paths containing parentheses on Windows. When such a path is used as a component of Elasticsearch home, then a later echo statement that is guarded by an if will fail because the parentheses in the path will be confused with the parentheses defining the if block. This commit fixes the issue by protecting this echo statement by wrapping the possibly offending path in quotes. Relates #26916

This shows an example of how to install a plugin on Windows, which is not as obvious as I would have expected.

This commit clarifies how to apply an override to the systemd unit file for Elasticsearch. Relates #26950

While opening a connection to a node, a channel can subsequently close. If this happens, a future callback whose purpose is to close all other channels and disconnect from the node will fire. However, this future will not be ready to close all the channels because the connection will not be exposed to the future callback yet. Since this callback is run once, we will never try to disconnect from this node again and we will be left with a closed channel. This commit adds a check that all channels are open before exposing the channel and throws a general connection exception. In this case, the usual connection retry logic will take over. Relates #26932

When a node which contains the primary shard is unavailable, the primary stats (and the total stats) of an `IndexStats` will be empty for a short moment (while the primary shard is being relocated). However, we assume that these stats are always non-empty when handling `_cat/indices` in RestIndicesAction. This commit checks the content of these stats before accessing. Closes #26942

The simple transport test cases now need to implement AbstractSimpleTransportTestCase#closeConnectionChannel. This commit adds this for the simple local transport tests, and skips a test that does not apply to local transport.

* Add `bytes` to cat.shards API spec
* add bytes param to rest-api-spec of cat.segments

This change adds a dedicated thread group, configures threads with a corresponding thread name and starts all threads as daemon threads.

rjernst

I think there are a couple things to fix.

rjernst · 2017-10-12T21:19:58Z

docs/reference/modules/gateway.asciidoc

+
+When a node joins the cluster, any shards/indices stored in its local `data/` 
+directory which do not already exist in the cluster will be imported into the 
+cluster by default. This functionality is intended as a best effort to help 


I think "by default" should be removed, because I don't think this is configurable (as it was, I believe, in 1.x).

Thanks! Will fix.

rjernst · 2017-10-12T21:20:23Z

docs/reference/modules/gateway.asciidoc

@@ -48,3 +48,12 @@ as long as the following conditions are met:
    Recover as long as this many data nodes have joined the cluster.

 NOTE: These settings only take effect on a full cluster restart.
+
+==== Dangling indices


The nesting level seems off here. This is 4 levels, but the one before it is 2 levels.

This commit clarifies the interaction between settings specified in a create index request, and those that would come from any templates that apply to the create index request. Relates #26994
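As a rough illustration of that interaction, here is a Python sketch of how the effective settings could be resolved. The commit message itself does not spell out the precedence; the sketch assumes that templates are applied in ascending `order` and that settings given in the create index request override any template values. The function `effective_settings` and the dictionary layout are hypothetical, not an Elasticsearch API.

```python
def effective_settings(request_settings, matching_templates):
    """Merge template settings (lowest order first), then the request's own."""
    merged = {}
    for template in sorted(matching_templates, key=lambda t: t["order"]):
        merged.update(template["settings"])
    # Assumption: explicit request settings win over every template.
    merged.update(request_settings)
    return merged


templates = [
    {"order": 1, "settings": {"number_of_shards": 3, "refresh_interval": "30s"}},
    {"order": 0, "settings": {"number_of_shards": 1, "number_of_replicas": 2}},
]
resolved = effective_settings({"number_of_replicas": 0}, templates)
```

Under these assumptions, `resolved` takes `number_of_shards` and `refresh_interval` from the higher-order template and `number_of_replicas` from the request.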

This commit reformats a paragraph in the template docs to fit in 80 columns, as for the rest of the doc, and as is a standard that we loosely adhere to.

geekpete · 2017-10-13T05:47:51Z

Resubmitting as #26999

dakrone and others added 30 commits July 25, 2017 09:50

Add 5.5.1 version constant and BWC indices

b1f7d3c

Fix elvis operator documentation

086b942

Parse "*" in query_string_query as MatchAllDocsQuery

39f6bc5

Related to #25726, this resolves #25556 for the 5.x series by parsing "*" as a `MatchAllDocsQuery` instead of expanding it to a (potentially expensive) query on the `_field_names`.

[Docs] Add profile section to the Search API documentation (#25880)

ce6c353

[Docs] Update Java Low-Level documentation to reflect shaded deps (#2…

48d027b

…5882) Since #25208 the Java Low-Level Rest Client has shaded dependencies. This commit updates the documentation to reflect that.

[DOCS] update low level client artifact name

b9eb9d2

Caching a MinDocQuery can lead to wrong results. (#25909)

4517515

Queries are supposed to be cacheable per segment, yet matches of this query also depend on how many documents exist on previous segments.

Remove the shadow plugin apply in the rest client (#25921)

8a4b33e

The low level rest client does not need the shadow plugin applied, it only needs the plugin jar in the classpath, in order to create a ShadowJar task. Relates #25208

Fix eclipse issues related to rest client shading (#25874)

d7d8300

* A cycle was detected in eclipse, and was fixed in the same fashion as core and core-tests.
* The rest client deps jar was not properly exported in the generated eclipse classpath file for rest client.

Relates #25208

docs: Remove incorrect warning

dd8deb7

Closes #25935

Update plugin-related output in reference docs (#25897)

c465891

The example output for node info and cluster stats was outdated w.r.t. the information that is shown for plugins. This commit updates the example output and the explanation of the respective fields.

Add max file size bootstrap check

8c21797

This commit adds a bootstrap check for the maximum file size, and ensures the limit is set correctly when Elasticsearch is installed as a service on systemd-based systems. Relates #25974

Add test for limits on systemd

5721eb9

We set some limits in the service file for Elasticsearch when installed as a service on systemd-based systems. This commit adds a packaging test that these limits are indeed set correctly. Relates #25976

Remove mention of http_address in nodes info docs

58a6127

This commit removes an outdated reference to http_address in the nodes info docs. This information is available in the http object for each node in the nodes info API response. Relates #25980

[DOCS] add links to javadocs to clients docs (#25745)

fcb948f

We have been publishing javadocs to artifacts.elastic.co (and snapshots.elastic.co) for a while. This commit adds links to them on the transport client, low level REST client, sniffer and high level REST client pages. Closes #23761

Fix term(s) query for range field (#25918)

080123a

Fixed bug that mapper_parsing_exception is thrown for numeric field w…

643700a

…ith ignore_malformed=true when inserting "NaN", "Infinity" or "-Infinity" values (#25967)

[Docs] Add migration notes for the high-level rest client (#25911)

978ff1f

Add recommendation on unicast hosts to docs

c00a52c

This commit adds a small note to the discovery docs recommending that the unicast hosts list be maintained as the list of master-eligible nodes in the cluster. Relates #25991

thomas11 and others added 20 commits October 6, 2017 16:59

Fix IndexOutOfBoundsException in histograms for NaN doubles (#26787) (#…

1a2f265

…26856)

[Docs] Add not about maximum token length for whitespace tokenizer (#…

f9e5529

…26721)

[API] Added the terminate_after parameter to the REST spec for "Cou…

377a43c

…nt" API Closes #26895

fixing typo in datehistogram-aggregation.asciidoc (#26924)

4d6a5b1

Update 5.6.3 release notes

dc507fd

Fix join field docs (#26929)

cb4abe9

The document type is `question`, not `my_parent`

Docs: Add note to contributing docs warning against tool based refact…

62b228c

…oring (#26936) This commit adds a warning to deter contributors from creating PRs generated by tools to do large refactors just for the sake of refactoring.

Fix filtering for ListSetting (#26914)

897cac6

List settings are not correctly filtered out in 5.6 and 6.0. This has been detected by a failing packaging test after #26878 has been merged.

ingest: Fix bug that prevent date_index_name processor from accepting…

e2d8959

… timestamps specified as a json number Closes #26890

fix compile error

e688b4b

Add Homebrew instructions to getting started

08a0d4d

This commit adds instructions for installing Elasticsearch via Homebrew to the Getting Started guide. Relates #26847

[DOCS] Plugin Installation for Windows (#21671)

9ea3421

This shows an example of how to install a plugin on Windows, which is not as obvious as I would have expected.

[DOCS] Bumped version for 5.6.3

ceecb63

Clarify systemd overrides

9e36c16

This commit clarifies how to apply an override to the systemd unit file for Elasticsearch. Relates #26950

[DOCS] Fixed indentation of the definition list.

7de1dc7

Fix compilation for simple local transport tests

7e2873f

The simple transport test cases now need to implement AbstractSimpleTransportTestCase#closeConnectionChannel. This commit adds this for the simple local transport tests, and skips a test that does not apply to local transport.

Bumping ES Version to 5.6.4

10bf1ae

geekpete added the >docs General docs changes label Oct 11, 2017

GlenRSmith and others added 3 commits October 11, 2017 11:23

Cat shards bytes (#26952)

80b1e2c

* Add `bytes` to cat.shards API spec
* add bytes param to rest-api-spec of cat.segments

Use a dedicated ThreadGroup in rest sniffer (#26897)

bb92bb2

This change adds a dedicated thread group, configures threads with a corresponding thread name and starts all threads as daemon threads.

Fix a typo in the similarity docs (#26970)

b278349

rjernst reviewed Oct 12, 2017


ppf2 and others added 3 commits October 12, 2017 17:51

Clarify settings and template on create index

0d2aaca

This commit clarifies the interaction between settings specified in a create index request, and those that would come from any templates that apply to the create index request. Relates #26994

Reformat paragraph in template docs to 80 columns

fe2d4a0

This commit reformats a paragraph in the template docs to fit in 80 columns, as for the rest of the doc, and as is a standard that we loosely adhere to.

Provide explanation of dangling indices, fixes #26008

f52e9fa

geekpete closed this Oct 13, 2017
