Improve docs for search preferences #32098

Merged
merged 3 commits into from
Jul 18, 2018
102 changes: 68 additions & 34 deletions docs/reference/search/request/preference.asciidoc
@@ -1,56 +1,81 @@
[[search-request-preference]]
=== Preference

Controls a `preference` of which shard copies on which to execute the
search. By default, the operation is randomized among the available shard
copies, unless allocation awareness is used.
Controls a `preference` of which shard copies on which to execute the search.
By default, Elasticsearch selects from the available shard copies in an
unspecified order, taking the <<allocation-awareness,allocation awareness>> and
<<search-adaptive-replica,adaptive replica selection>> configuration into
account. However, it may sometimes be desirable to try and route certain
searches to certain sets of shard copies, for instance to make better use of
per-copy caches.

Preferences do not _guarantee_ that any particular shard copies are used in a
Contributor:

We need to be more careful here. Some preferences are guaranteed (like `_only_local`). These are useful for debugging, not for production use.

Contributor Author:

Really it's just `_only_local` (undocumented prior to this change, and I wrote this paragraph before adding that 😁). I reworded this in cae2225 and moved it below the description of the options, where it flowed better.

search, and on a changing index this may mean that repeated searches may yield
different results if they are executed on different shard copies which are in
different refresh states.

The `preference` is a query string parameter which can be set to:

[horizontal]
`_primary`::
The operation will go and be executed only on the primary
shards. deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
`_primary`::
The operation will be executed only on primary shards.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or
Contributor:

I thought we were also going to explain why we removed it: in this case, because it puts an unreasonable load on the primary with no gains, as ES replication is synchronous.

Contributor:

Never mind, I see it now at the end. Maybe add something here like "see end of page for more information".

Contributor Author:

Ok, done. I also recommended `_only_nodes`, `_prefer_nodes` or a custom string value there instead.

`_prefer_nodes`]

`_primary_first`::
The operation will go and be executed on the primary
shard, and if not available (failover), will execute on other shards.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
`_primary_first`::
The operation will be executed on primary shards if possible, but will
fall back to other shards if not. deprecated[6.1.0, will be removed in
7.0, use `_only_nodes` or `_prefer_nodes`]

`_replica`::
The operation will go and be executed only on a replica shard.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
The operation will be executed only on replica shards. If there are
multiple replicas then the order of preference between them is
unspecified. deprecated[6.1.0, will be removed in 7.0, use
`_only_nodes` or `_prefer_nodes`]

`_replica_first`::
The operation will go and be executed only on a replica shard, and if
not available (failover), will execute on other shards.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
The operation will be executed on replica shards if possible, but will
fall back to other shards if not. If there are multiple replicas then
the order of preference between them is unspecified. deprecated[6.1.0,
will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]

`_only_local`::
The operation will be executed only on shards allocated to the local
node.

`_local`::
The operation will prefer to be executed on a local
allocated shard if possible.
`_local`::
The operation will be executed on shards allocated to the local node if
possible, and will fall back to other shards if not.

`_prefer_nodes:abc,xyz`::
Prefers execution on the nodes with the provided
node ids (`abc` or `xyz` in this case) if applicable.
The operation will be executed on nodes with one of the provided node
ids (`abc` or `xyz` in this case) if possible. If suitable shard copies
exist on more than one of the selected nodes then the order of
preference between these copies is unspecified.

`_shards:2,3`::
Restricts the operation to the specified shards. (`2`
and `3` in this case). This preference can be combined with other
preferences but it has to appear first: `_shards:2,3|_local`
`_shards:2,3`::
Restricts the operation to the specified shards. (`2` and `3` in this
case). This preference can be combined with other preferences but it
has to appear first: `_shards:2,3|_local`

`_only_nodes`::
Restricts the operation to nodes specified in <<cluster,node specification>>
`_only_nodes:abc*,x*yz,...`::
Restricts the operation to nodes specified according to the
<<cluster,node specification>>. If suitable shard copies exist on more
than one of the selected nodes then the order of preference between
these copies is unspecified.

Custom (string) value::
A custom value will be used to guarantee that
the same shards will be used for the same custom value. This can help
with "jumping values" when hitting different shards in different refresh
states. A sample value can be something like the web session id, or the
user name.
Custom (string) value::
Any value that does not start with `_`. If two searches both give the same
custom string value for their preference and the underlying cluster state
does not change then the same ordering of shards will be used for the
searches. This does not guarantee that the exact same shards will be used
Contributor:

Do we want to say something like "That said, we expect this selection to be stable for a long period of time. This allows users of a customized value to optimize cache usage"?

Contributor Author:

Ok, I did that in cae2225.

each time: the cluster state, and therefore the selected shards, may change
for a number of reasons including shard relocations and shard failures, and
nodes may sometimes reject searches causing fallbacks to alternative nodes.
A good candidate for a custom preference value is something like the web
session id or the user name.

For instance, use the user's session ID to ensure consistent ordering of results
for the user:
For instance, use the user's session ID `xyzabc123` as follows:

[source,js]
------------------------------------------------
@@ -65,3 +90,12 @@ GET /_search?preference=xyzabc123
------------------------------------------------
// CONSOLE

WARNING: The `_primary`, `_primary_first`, `_replica` and `_replica_first` are
not recommended, and will be removed in a future version. They do not help to
avoid inconsistent results that arise from the use of shards that have
different refresh states, and Elasticsearch uses synchronous replication so the
primary does not in general hold fresher data than its replicas. The
`_primary_first` and `_replica_first` preferences silently fall back to
non-preferred copies if it is not possible to search the preferred copies. The
`_primary` and `_replica` preferences will silently change their preferred
shards if a replica is promoted to primary, which can happen at any time.
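
For readers who want to try the preference values described in this diff, here is a minimal Python sketch of how the `preference` query string parameter could be attached to a `_search` URL. It is not part of this PR. The helper names `custom_preference` and `search_url` are hypothetical, and the base URL `http://localhost:9200` is an assumed local node address. The only rule taken from the docs above is that a custom string value must not start with `_`.

```python
from urllib.parse import urlencode


def custom_preference(value: str) -> str:
    """Return `value` if it is usable as a custom preference string.

    Per the docs in this diff, a custom value is any value that does not
    start with `_`; values starting with `_` are reserved for the built-in
    preferences such as `_local` or `_only_nodes`.
    """
    if not value or value.startswith("_"):
        raise ValueError("custom preference must be non-empty and not start with '_'")
    return value


def search_url(base: str, preference: str) -> str:
    """Build a _search URL carrying the `preference` query string parameter."""
    # urlencode percent-encodes characters such as ':', ',' and '|'.
    return f"{base}/_search?{urlencode({'preference': preference})}"


# Assumed local node address; replace with a real cluster endpoint.
BASE = "http://localhost:9200"

# Custom string value, e.g. a web session id.
print(search_url(BASE, custom_preference("xyzabc123")))

# `_shards` combined with another preference; `_shards` has to appear first.
print(search_url(BASE, "_shards:2,3|_local"))
```

The first URL ends in `preference=xyzabc123`. In the second, the `:`, `,` and `|` characters are percent-encoded, which a server is expected to decode back to `_shards:2,3|_local`.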