Provide explanation of dangling indices, fixes #26008 #26965

Closed
wants to merge 3,230 commits into from

Conversation

geekpete
Member

Adding explanation of dangling indices back to the docs to fix #26008

dakrone and others added 30 commits July 25, 2017 09:50
Related to #25726, this resolves #25556 for the 5.x series by parsing "*" as a
`MatchAllDocsQuery` instead of expanding it to a (potentially expensive) query
on the `_field_names`.
…al part (#25835)

This change makes it possible to index a value like "1.0" or "1.1" into whole
number field types like byte and integer. Without this change, the above
values would have resulted in an error, even with coerce set to true.

Closes #25819
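
As a rough illustration of the coercion described above, here is a minimal Python sketch. It is not Elasticsearch's implementation: the helper name `coerce_whole_number` is hypothetical, and the sketch assumes that coercion truncates any fractional part and that a non-integer input is rejected when `coerce` is disabled.

```python
from decimal import Decimal, InvalidOperation


def coerce_whole_number(raw, coerce=True):
    """Turn a value such as "1.0" or 1.1 into an int for a whole-number field.

    Hypothetical helper for illustration only; not Elasticsearch code.
    """
    # Plain integers are always accepted (bool is excluded on purpose).
    if isinstance(raw, int) and not isinstance(raw, bool):
        return raw
    if not coerce:
        raise ValueError(f"{raw!r} is not an integer and coerce is disabled")
    try:
        number = Decimal(str(raw))
    except InvalidOperation:
        raise ValueError(f"{raw!r} is not a number") from None
    if not number.is_finite():
        raise ValueError(f"{raw!r} is not a finite number")
    # Assumption: coercion drops the fractional part (truncates toward zero).
    return int(number)
```

With this sketch, `coerce_whole_number("1.0")` and `coerce_whole_number("1.1")` both return an integer instead of raising an error.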
…5882)

Since #25208 the Java Low-Level Rest Client has shaded dependencies.
This commit updates the documentation to reflect that.
Stored fields were still being accessed for nested inner hits even if the _source was not requested.
This was done to figure out the id of the root document. However this is already known higher up the stack.
So instead this change adds the id to the nested search context, so that it is no longer required to be fetched via the stored fields.

In case the _source is large and no source is requested then hot threads like these ones would still appear:

```
100.3% (501.3ms out of 500ms) cpu usage by thread 'elasticsearch[AfXKKfq][search][T#6]'
     2/10 snapshots sharing following 22 elements
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:352)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```

and:

```
8/10 snapshots sharing following 27 elements
       org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
       org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.fillBuffer(CompressingStoredFieldsReader.java:531)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.readBytes(CompressingStoredFieldsReader.java:550)
       org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```
…er change (#25877)

This predicate is used to deal with the intricacies of detecting when a master is re-elected or a node rejoins an existing master. The current implementation is based on nodeIds, which is fine if the master really changes. If the nodeId is equal, the code falls back to detecting an increment in the cluster state version, which happens when a node is re-elected or when the node rejoins. Sadly, this doesn't cover the case where the same node is elected after a full restart of all master nodes. In that case we recover the cluster state from disk, but the version is reset back to 0. To fix this, the check should be done based on ephemeral IDs, which are reset on restart.

Fixes #25471
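
The difference between the two checks described above can be sketched in Python. This is a simplified model under stated assumptions, not the actual Elasticsearch predicate: the `Node` and `ClusterState` classes and both function names are hypothetical and only carry the fields the commit message mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str       # stays the same across restarts
    ephemeral_id: str  # reset every time the node restarts


@dataclass(frozen=True)
class ClusterState:
    master: Node
    version: int


def master_changed_by_node_id(old, new):
    """Previous check: compare node IDs, then fall back to the version."""
    if old.master.node_id != new.master.node_id:
        return True
    return new.version > old.version


def master_changed_by_ephemeral_id(old, new):
    """Fixed check: compare ephemeral IDs, then fall back to the version."""
    if old.master.ephemeral_id != new.master.ephemeral_id:
        return True
    return new.version > old.version


# Full restart of all master nodes: the same node is elected again, but the
# recovered cluster state starts from version 0.
before = ClusterState(Node("node-1", "ephemeral-a"), version=42)
after = ClusterState(Node("node-1", "ephemeral-b"), version=0)
```

In this scenario the node-ID check misses the re-election because the node ID is equal and the version went down, while the ephemeral-ID check detects it.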
The configuration removed from the runtime configuration did not
properly remove the deps jar from gradle versions > 3.3. The rest client
now removes both the 3.3 and 3.3+ configurations so this works on both
versions of gradle.

Closes #25884
Relates #25208
Queries are supposed to be cacheable per segment, yet matches of this query
also depend on how many documents exist on previous segments.
At the shard level we use an operation permit to coordinate between regular shard operations and special operations that need exclusive access. In ES versions < 6, the operation requiring exclusive access was invoked only during primary relocation, but in ES versions >= 6 this exclusive access is also used when a replica learns about a new primary or when a replica is promoted to primary.

These special operations requiring exclusive access delay regular operations from running, by adding them to a queue, and after finishing the exclusive access, release these operations which then need to be put back on the original thread-pool they were running on. In the presence of thread pool rejections, the current implementation had two issues:

- it would not properly release the operation permit when hitting a rejection (i.e. when calling ThreadedActionListener.onResponse from IndexShardOperationPermits.acquire).
- it would not invoke the onFailure method of the action listener when the shard was closed, and just log a warning instead (see ThreadedActionListener.onFailure), which would ultimately lead to the replication task never being cleaned up (see #25863).

This commit fixes both issues by introducing a custom threaded action listener that is permit-aware and properly deals with rejections.

Closes #25863
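
The idea of a permit-aware listener can be sketched as follows. This is a minimal Python model, not the Java code from the commit: `PermitAwareListener`, `RejectedExecution` and the two executor stand-ins are hypothetical names, and the permit is modelled as a plain release callback.

```python
class RejectedExecution(Exception):
    """Raised by an executor that cannot accept more work (hypothetical)."""


def inline_executor(task):
    """Executor stand-in that runs the task immediately."""
    task()


def rejecting_executor(task):
    """Executor stand-in whose queue is always full."""
    raise RejectedExecution("queue is full")


class PermitAwareListener:
    """Hands an acquired permit to the wrapped listener on an executor.

    If the executor rejects the task, the permit is released and the
    failure is reported instead of being silently dropped.
    """

    def __init__(self, executor, on_response, on_failure):
        self._executor = executor
        self._on_response = on_response
        self._on_failure = on_failure

    def on_response(self, release_permit):
        try:
            # Put the operation back on its thread pool, passing the permit on.
            self._executor(lambda: self._on_response(release_permit))
        except RejectedExecution as error:
            release_permit()          # never leak the permit on rejection
            self._on_failure(error)   # and tell the caller what happened

    def on_failure(self, error):
        # Always propagate the failure (for example, a closed shard)
        # rather than only logging a warning.
        self._on_failure(error)
```

The point of the design is that every path ends in either the wrapped listener owning the permit or the permit being released, so a rejection cannot leave the permit held.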
With Gradle 4.1 and newer JDK versions, we can finally invoke Gradle directly using a JDK9 JAVA_HOME without requiring a JDK8 to "bootstrap" the build. As the thirdPartyAudit task runs within the JVM that Gradle runs in, it needs to be adapted now to be JDK9 aware.

This commit also changes the `JavaCompile` tasks to only fork if necessary (i.e. when Gradle's JVM and JAVA_HOME's JVM differ).
The low level rest client does not need the shadow plugin applied, it
only needs the plugin jar in the classpath, in order to create a
ShadowJar task.

Relates #25208
* A cycle was detected in eclipse, and was fixed in the same fashion as
  core and core-tests.
* The rest client deps jar was not properly exported in the generated
  eclipse classpath file for rest client.

Relates #25208
The example output for node info and cluster stats was outdated w.r.t.
the information that is shown for plugins. With this commit we
update the example output and the explanation of the respective
fields.
It looks a bit ambiguous here.

Elasticsearch no longer uses the 'hybrid mmapfs / niofs' store, which chooses the filesystem based on the file. The store is now one of mmapfs, niofs or simplefs, depending on the operating system.
As quoted here https://www.elastic.co/guide/en/elasticsearch/reference/5.5/index-modules-store.html

Thanks,
Pulkit Agrawal
This commit adds a bootstrap check for the maximum file size, and
ensures the limit is set correctly when Elasticsearch is installed as a
service on systemd-based systems.

Relates #25974
This commit removes some useless empty lines checks from the evil JNA
tests. These empty lines checks are useless because if the lines are
actually empty, the for loop will never be entered and we will hit the
fail condition at the bottom as intended anyway.
The systemd service file that ships with Elasticsearch installs on
systemd-based systems contains a suggestion for setting LimitMEMLOCK if
the user wants to enable bootstrap.memory_lock. However, setting
this in the installed service file goes against best practices for
working with systemd, and goes against our existing documentation for
how to set this. Therefore, we should not have this suggestion in the
service file otherwise users might be led to think they should edit it
there.

Relates #25979
We have a bootstrap check for the maximum size of the virtual memory
address space for the Elasticsearch process. We can set this in the
service file for Elasticsearch when installed as a service on
systemd-based systems for a better user experience than them fumbling
through thinking they should set this via /etc/security/limits.d (as a
lot of pages on the Internet would tell them) not realizing that systemd
completely ignores these for services and then trying to figure out how
to add a unit file for the Elasticsearch service.

Relates #25975
We set some limits in the service file for Elasticsearch when installed
as a service on systemd-based systems. This commit adds a packaging test
that these limits are indeed set correctly.

Relates #25976
This commit removes an outdated reference to http_address in the nodes
info docs. This information is available in the http object for each
node in the nodes info API response.

Relates #25980
We have been publishing javadocs to artifacts.elastic.co (and snapshots.elastic.co) for a while. This commit adds links to them to the transport client, low level REST client, sniffer and high level REST client pages.

Closes #23761
…ith ignore_malformed=true when inserting "NaN", "Infinity" or "-Infinity" values (#25967)
In the packaging tests we have an assertion that the maximum number of
processes was set correctly by systemd when Elasticsearch is started as
a service on systemd-based systems. However, when this assertion was
backported, the backport did not account for the fact that the limit is
different between 5.x and 6.x (and current master). Namely, in 6.x we
made the decision to increase the limit to 4096 because we removed the
artificial limit of 32 on the number of processors and so increased the
limit in the bootstrap check for maximum number of processes as
well. This commit fixes this assertion to be consistent with the
bootstrap check and the setting in the service in the 5.x branch.
This commit adds a small note to the discovery docs recommending that
the unicast hosts list be maintained as the list of master-eligible
nodes in the cluster.

Relates #25991
thomas11 and others added 20 commits October 6, 2017 16:59
The document type is `question`, not `my_parent`
…oring (#26936)

This commit adds a warning to deter contributors from creating PRs
generated by tools to do large refactors just for the sake of
refactoring.
List settings are not correctly filtered out in 5.6 and 6.0.
This was detected by a failing packaging test after
#26878 was merged.
This commit adds instructions for installing Elasticsearch via Homebrew
to the Getting Started guide.

Relates #26847
This commit fixes an issue with the handling of paths containing
parentheses on Windows. When such a path is used as a component of
Elasticsearch home, then a later echo statement that is guarded by an if
will fail because the parentheses in the path will be confused with the
parentheses defining the if block. This commit fixes the issue by
protecting this echo statement by wrapping the possibly offending path
in quotes.

Relates #26916
This shows an example of how to install a plugin on Windows, which is not as obvious as I would have expected.
This commit clarifies how to apply an override to the systemd unit file
for Elasticsearch.

Relates #26950
While opening a connection to a node, a channel can subsequently
close. If this happens, a future callback whose purpose is to close all
other channels and disconnect from the node will fire. However, this
future will not be ready to close all the channels because the
connection will not be exposed to the future callback yet. Since this
callback is run once, we will never try to disconnect from this node
again and we will be left with a closed channel. This commit adds a
check that all channels are open before exposing the channel and throws
a general connection exception. In this case, the usual connection retry
logic will take over.

Relates #26932
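
The added check can be illustrated with a small Python sketch. It is a simplified model rather than the transport code itself: `Channel`, `ConnectionFailed` and `open_connection` are hypothetical names, and closing the remaining channels before raising is an assumption of this sketch.

```python
class ConnectionFailed(Exception):
    """General connection error; the caller's retry logic handles it."""


class Channel:
    """Minimal stand-in for a transport channel."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


def open_connection(node, channels, registry):
    """Expose the connection to `node` only if every channel is still open."""
    if not all(channel.is_open for channel in channels):
        # Assumption: clean up the channels that did open before failing.
        for channel in channels:
            channel.close()
        raise ConnectionFailed(f"a channel to {node} closed while connecting")
    # Only now does the connection become visible to close callbacks.
    registry[node] = channels
    return channels
```

Because the check runs before the connection is registered, a channel that closed early results in an ordinary connection failure instead of a registered connection with a dead channel.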
When a node which contains the primary shard is unavailable, the primary
stats (and the total stats) of an `IndexStats` will be empty for a short
moment (while the primary shard is being relocated). However, we assume
that these stats are always non-empty when handling `_cat/indices` in
RestIndicesAction. This commit checks the content of these stats before
accessing.

Closes #26942
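
A guard of the kind described above might look like this in Python. The nested dictionary layout and the function name `primary_docs_count` are hypothetical and chosen only for illustration; the real fix lives in the Java class `RestIndicesAction`.

```python
from typing import Optional


def primary_docs_count(index_stats: Optional[dict]) -> Optional[int]:
    """Return the primary docs count, or None while the stats are empty."""
    # Each level may be missing or None while the primary shard is relocating.
    primaries = (index_stats or {}).get("primaries") or {}
    docs = primaries.get("docs") or {}
    return docs.get("count")
```

Returning `None` lets the caller render an empty cell for the index instead of failing the whole request.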
The simple transport test cases now need to implement
AbstractSimpleTransportTestCase#closeConnectionChannel. This commit adds
this for the simple local transport tests, and skips a test that does
not apply to local transport.
@geekpete geekpete added the >docs General docs changes label Oct 11, 2017
GlenRSmith and others added 3 commits October 11, 2017 11:23
* Add `bytes` to cat.shards API spec

* add bytes param to rest-api-spec of cat.segments
This change adds a dedicated thread group, configures threads with a corresponding thread name and starts all threads as daemon threads.
Member

@rjernst rjernst left a comment

I think there are a couple things to fix.


When a node joins the cluster, any shards/indices stored in its local `data/`
directory which do not already exist in the cluster will be imported into the
cluster by default. This functionality is intended as a best effort to help
Member

I think "by default" should be removed, because I don't think this is configurable (as it was, I believe, in 1.x).

Member Author

Thanks! Will fix.

@@ -48,3 +48,12 @@ as long as the following conditions are met:
Recover as long as this many data nodes have joined the cluster.

NOTE: These settings only take effect on a full cluster restart.

==== Dangling indices
Member

The nesting level seems off here. This is 4 levels, but the one before it is 2 levels.

ppf2 and others added 3 commits October 12, 2017 17:51
This commit clarifies the interaction between settings specified in a
create index request, and those that would come from any templates that
apply to the create index request.

Relates #26994
This commit reformats a paragraph in the template docs to fit in 80
columns, as for the rest of the doc and as is a standard that we loosely
adhere to.
@geekpete
Member Author

geekpete commented Oct 13, 2017

resubmitting as 26999

@geekpete geekpete closed this Oct 13, 2017
Labels
>docs General docs changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DOCS] Provide an explanation of what dangling indices are and how they can occur