Merge branch 'main' into security-tokens-no-index-refresh
albertzaharovits committed Jul 24, 2023
2 parents c7f774f + 3e07b7c commit 947363f
Showing 220 changed files with 1,970 additions and 865 deletions.
Original file line number Diff line number Diff line change
@@ -116,8 +116,9 @@ private void checkModuleVersion(ModuleReference mref) {

private void checkModuleNamePrefix(ModuleReference mref) {
getLogger().info("{} checking module name prefix for {}", this, mref.descriptor().name());
if (mref.descriptor().name().startsWith("org.elasticsearch.") == false) {
throw new GradleException("Expected name starting with \"org.elasticsearch.\", in " + mref.descriptor());
if (mref.descriptor().name().startsWith("org.elasticsearch.") == false
&& mref.descriptor().name().startsWith("co.elastic.") == false) {
throw new GradleException("Expected name starting with \"org.elasticsearch.\" or \"co.elastic.\" in " + mref.descriptor());
}
}
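The new check accepts either prefix. A standalone sketch of the same logic (the class and method names here are illustrative, not the Gradle plugin's actual API):

```java
import java.util.List;

public class ModuleNameCheck {
    // Allowed module-name prefixes after this change.
    private static final List<String> ALLOWED_PREFIXES = List.of("org.elasticsearch.", "co.elastic.");

    // Returns true when the module name starts with an allowed prefix;
    // the real plugin throws a GradleException when this is false.
    static boolean hasAllowedPrefix(String moduleName) {
        return ALLOWED_PREFIXES.stream().anyMatch(moduleName::startsWith);
    }

    public static void main(String[] args) {
        System.out.println(hasAllowedPrefix("org.elasticsearch.server"));
        System.out.println(hasAllowedPrefix("co.elastic.logging"));
        System.out.println(hasAllowedPrefix("com.example.plugin"));
    }
}
```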

5 changes: 5 additions & 0 deletions docs/changelog/96515.yaml
@@ -0,0 +1,5 @@
pr: 96515
summary: Support boxplot aggregation in transform
area: Transform
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/97683.yaml
@@ -0,0 +1,5 @@
pr: 97683
summary: Refactor nested field handling in `FieldFetcher`
area: Search
type: enhancement
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/97840.yaml
@@ -0,0 +1,6 @@
pr: 97840
summary: Improve exception handling in Coordinator#publish
area: Cluster Coordination
type: bug
issues:
- 97798
32 changes: 17 additions & 15 deletions docs/reference/aggregations/metrics/geoline-aggregation.asciidoc
@@ -1,8 +1,8 @@
[role="xpack"]
[[search-aggregations-metrics-geo-line]]
=== Geo-Line Aggregation
=== Geo-line aggregation
++++
<titleabbrev>Geo-Line</titleabbrev>
<titleabbrev>Geo-line</titleabbrev>
++++

The `geo_line` aggregation aggregates all `geo_point` values within a bucket into a `LineString` ordered
@@ -77,13 +77,12 @@ Which returns:
The resulting https://tools.ietf.org/html/rfc7946#section-3.2[GeoJSON Feature] contains both a `LineString` geometry
for the path generated by the aggregation, as well as a map of `properties`.
The property `complete` informs of whether all documents matched were used to generate the geometry.
The `size` option described below can be used to limit the number of documents included in the aggregation,
The <<search-aggregations-metrics-geo-line-size,`size` option>> can be used to limit the number of documents included in the aggregation,
leading to results with `complete: false`.
Exactly which documents are dropped from results depends on whether the aggregation is based
on `time_series` or not, and this is discussed in
<<search-aggregations-metrics-geo-line-grouping-time-series-advantages,more detail below>>.
Exactly which documents are dropped from results <<search-aggregations-metrics-geo-line-grouping-time-series-advantages,depends on whether the aggregation is based
on `time_series` or not>>.

The above result could be displayed in a map user interface:
This result could be displayed in a map user interface:

image:images/spatial/geo_line.png[Kibana map with museum tour of Amsterdam]

@@ -132,18 +131,19 @@ feature properties.
The line is sorted in ascending order by the sort key when set to "ASC", and in descending order when set to "DESC".

[[search-aggregations-metrics-geo-line-size]]
`size`::
(Optional, integer, default: `10000`) The maximum length of the line represented in the aggregation.
Valid sizes are between one and 10000.
Within <<search-aggregations-metrics-geo-line-grouping-time-series,`time_series`>>
the aggregation uses line simplification to constrain the size, otherwise it uses truncation.
See <<search-aggregations-metrics-geo-line-grouping-time-series-advantages,below>>
Refer to <<search-aggregations-metrics-geo-line-grouping-time-series-advantages>>
for a discussion on the subtleties involved.
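The two behaviours can be contrasted with a toy sketch (the method names and the evenly-spaced "simplification" below are illustrative stand-ins; the real implementation scores points by geometric significance):

```java
import java.util.ArrayList;
import java.util.List;

public class SizeLimit {
    // Truncation: keep only the first `size` points, dropping the tail.
    static List<Integer> truncate(List<Integer> points, int size) {
        return new ArrayList<>(points.subList(0, Math.min(size, points.size())));
    }

    // Stand-in for line simplification: keep `size` evenly spaced points,
    // always retaining both endpoints, so the line's overall shape survives.
    static List<Integer> simplify(List<Integer> points, int size) {
        if (points.size() <= size) return new ArrayList<>(points);
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            kept.add(points.get(i * (points.size() - 1) / (size - 1)));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> pts = List.of(10, 20, 30, 40, 50);
        System.out.println(truncate(pts, 3)); // keeps the head of the line
        System.out.println(simplify(pts, 3)); // keeps first and last points
    }
}
```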

[[search-aggregations-metrics-geo-line-grouping]]
==== Grouping

The simple example above will produce a single track for all the data selected by the query. However, it is far more
This simple example produces a single track for all the data selected by the query. However, it is far more
common to need to group the data into multiple tracks. For example, grouping flight transponder measurements by
flight call-sign before sorting each flight by timestamp and producing a separate track for each.

@@ -210,7 +210,7 @@ POST /tour/_bulk?refresh
[[search-aggregations-metrics-geo-line-grouping-terms]]
==== Grouping with terms

Using the above data, for a non-time-series use case, the grouping can be done using a
Using this data, for a non-time-series use case, the grouping can be done using a
<<search-aggregations-bucket-terms-aggregation,terms aggregation>> based on city name.
This would work whether or not we had defined the `tour` index as a time series index.

@@ -294,17 +294,19 @@ Which returns:
----
// TESTRESPONSE

The above results contain an array of buckets, where each bucket is a JSON object with the `key` showing the name
These results contain an array of buckets, where each bucket is a JSON object with the `key` showing the name
of the `city` field, and an inner aggregation result called `museum_tour` containing a
https://tools.ietf.org/html/rfc7946#section-3.2[GeoJSON Feature] describing the
actual route between the various attractions in that city.
Each result also includes a `properties` object with a `complete` value which will be `false` if the geometry
was truncated to the limits specified in the `size` parameter.
Note that when we use `time_series` in the example below, we will get the same results structured a little differently.
Note that when we use `time_series` in the next example, we will get the same results structured a little differently.

[[search-aggregations-metrics-geo-line-grouping-time-series]]
==== Grouping with time-series

preview::[]

Using the same data as before, we can also perform the grouping with a
<<search-aggregations-bucket-time-series-aggregation,`time_series` aggregation>>.
This will group by TSID, which is defined as the combinations of all fields with `time_series_dimension: true`,
@@ -337,7 +339,7 @@ NOTE: The `geo_line` aggregation no longer requires the `sort` field when nested
This is because the sort field is set to `@timestamp`, which all time-series indexes are pre-sorted by.
If you do set this parameter and set it to something other than `@timestamp`, you will get an error.

The above query will result in:
This query will result in:

[source,js]
----
@@ -400,7 +402,7 @@ The above query will result in:
----
// TESTRESPONSE

The above results are essentially the same as with the previous `terms` aggregation example, but structured differently.
These results are essentially the same as with the previous `terms` aggregation example, but structured differently.
Here we see the buckets returned as a map, where the key is an internal description of the TSID.
This TSID is unique for each unique combination of fields with `time_series_dimension: true`.
Each bucket contains a `key` field which is also a map of all dimension values for the TSID, in this case only the city
@@ -414,7 +416,7 @@ was simplified to the limits specified in the `size` parameter.
[[search-aggregations-metrics-geo-line-grouping-time-series-advantages]]
==== Why group with time-series?

When reviewing the above examples, you might think that there is little difference between using
When reviewing these examples, you might think that there is little difference between using
<<search-aggregations-bucket-terms-aggregation,`terms`>> or
<<search-aggregations-bucket-time-series-aggregation,`time_series`>>
to group the geo-lines. However, there are some important differences in behaviour between the two cases.
29 changes: 15 additions & 14 deletions docs/reference/how-to/size-your-shards.asciidoc
@@ -140,20 +140,21 @@ Every new backing index is an opportunity to further tune your strategy.

[discrete]
[[shard-size-recommendation]]
==== Aim for shard sizes between 10GB and 50GB

Larger shards take longer to recover after a failure. When a node fails, {es}
rebalances the node's shards across the data tier's remaining nodes. This
recovery process typically involves copying the shard contents across the
network, so a 100GB shard will take twice as long to recover than a 50GB shard.
In contrast, small shards carry proportionally more overhead and are less
efficient to search. Searching fifty 1GB shards will take substantially more
resources than searching a single 50GB shard containing the same data.

There are no hard limits on shard size, but experience shows that shards
between 10GB and 50GB typically work well for logs and time series data. You
may be able to use larger shards depending on your network and use case.
Smaller shards may be appropriate for
==== Aim for shards of up to 200M documents, or with sizes between 10GB and 50GB

There is some overhead associated with each shard, both in terms of cluster
management and search performance. Searching a thousand 50MB shards will be
substantially more expensive than searching a single 50GB shard containing the
same data. However, very large shards can also cause slower searches and will
take longer to recover after a failure.

There is no hard limit on the physical size of a shard, and each shard can in
theory contain up to just over two billion documents. However, experience shows
that shards between 10GB and 50GB typically work well for many use cases, as
long as the per-shard document count is kept below 200 million.

You may be able to use larger shards depending on your network and use case,
and smaller shards may be appropriate for
{enterprise-search-ref}/index.html[Enterprise Search] and similar use cases.

If you use {ilm-init}, set the <<ilm-rollover,rollover action>>'s
2 changes: 1 addition & 1 deletion docs/reference/migration/apis/feature-migration.asciidoc
@@ -142,7 +142,7 @@ Example response:
"migration_status" : "NO_MIGRATION_NEEDED"
}
--------------------------------------------------
// TESTRESPONSE[s/"minimum_index_version" : "8100099"/"minimum_index_version" : $body.$_path/]
// TESTRESPONSE[skip:"AwaitsFix https://github.com/elastic/elasticsearch/issues/97780"]

When you submit a POST request to the `_migration/system_features` endpoint to
start the migration process, the response indicates what features will be
1 change: 1 addition & 0 deletions docs/reference/rest-api/common-parms.asciidoc
@@ -767,6 +767,7 @@ currently supported:
+
--
* <<search-aggregations-metrics-avg-aggregation,Average>>
* <<search-aggregations-metrics-boxplot-aggregation,Boxplot>>
* <<search-aggregations-pipeline-bucket-script-aggregation,Bucket script>>
* <<search-aggregations-pipeline-bucket-selector-aggregation,Bucket selector>>
* <<search-aggregations-metrics-cardinality-aggregation,Cardinality>>
2 changes: 1 addition & 1 deletion docs/reference/scripting/security.asciidoc
@@ -36,7 +36,7 @@ configured to run both types of scripts. To limit what type of scripts are run,
set `script.allowed_types` to `inline` or `stored`. To prevent any scripts from
running, set `script.allowed_types` to `none`.

IMPORTANT: If you use {kib}, set `script.allowed_types` to `both` or `inline`.
IMPORTANT: If you use {kib}, set `script.allowed_types` to both or just `inline`.
Some {kib} features rely on inline scripts and do not function as expected
if {es} does not allow inline scripts.
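For instance, a minimal sketch of the setting in `elasticsearch.yml` (values shown for illustration):

```yaml
# Allow only inline scripts; stored scripts are rejected.
script.allowed_types: inline

# To allow both types, list them explicitly instead:
# script.allowed_types: inline, stored
```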

@@ -131,7 +131,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
builder.field("cluster_uuid", clusterUuid);
builder.startObject("version")
.field("number", build.qualifiedVersion())
.field("build_flavor", "default")
.field("build_flavor", build.flavor())
.field("build_type", build.type().displayName())
.field("build_hash", build.hash())
.field("build_date", build.date())
@@ -48,6 +48,7 @@ protected Collection<Class<? extends Plugin>> nodePlugins() {
}

/** Check that the reset method cleans up a feature */
@AwaitsFix(bugUrl = "https://github.com/elastic/elasticsearch/issues/97780")
public void testResetSystemIndices() throws Exception {
String systemIndex1 = ".test-system-idx-1";
String systemIndex2 = ".second-test-system-idx-1";
@@ -541,15 +541,25 @@ static ReducedQueryPhase reducedQueryPhase(
);
}
int total = queryResults.size();
queryResults = queryResults.stream().filter(res -> res.queryResult().isNull() == false).toList();
String errorMsg = "must have at least one non-empty search result, got 0 out of " + total;
assert queryResults.isEmpty() == false : errorMsg;
if (queryResults.isEmpty()) {
throw new IllegalStateException(errorMsg);
final Collection<SearchPhaseResult> nonNullResults = new ArrayList<>();
boolean hasSuggest = false;
boolean hasProfileResults = false;
for (SearchPhaseResult queryResult : queryResults) {
var res = queryResult.queryResult();
if (res.isNull()) {
continue;
}
hasSuggest |= res.suggest() != null;
hasProfileResults |= res.hasProfileResults();
nonNullResults.add(queryResult);
}
queryResults = nonNullResults;
validateMergeSortValueFormats(queryResults);
final boolean hasSuggest = queryResults.stream().anyMatch(res -> res.queryResult().suggest() != null);
final boolean hasProfileResults = queryResults.stream().anyMatch(res -> res.queryResult().hasProfileResults());
if (queryResults.isEmpty()) {
var ex = new IllegalStateException("must have at least one non-empty search result, got 0 out of " + total);
assert false : ex;
throw ex;
}

// count the total (we use the query result provider here, since we might not get any hits (we scrolled past them))
final Map<String, List<Suggestion<?>>> groupedSuggestions = hasSuggest ? new HashMap<>() : Collections.emptyMap();
@@ -578,9 +588,7 @@ static ReducedQueryPhase reducedQueryPhase(
}
}
}
if (bufferedTopDocs.isEmpty() == false) {
assert result.hasConsumedTopDocs() : "firstResult has no aggs but we got non null buffered aggs?";
}
assert bufferedTopDocs.isEmpty() || result.hasConsumedTopDocs() : "firstResult has no aggs but we got non null buffered aggs?";
if (hasProfileResults) {
String key = result.getSearchShardTarget().toString();
profileShardResults.put(key, result.consumeProfileResult());
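The refactor above collapses one filtering pass and two `anyMatch` stream passes into a single loop that filters the results and computes both flags at once. A simplified sketch with a stand-in result type (not the actual `SearchPhaseResult` API):

```java
import java.util.ArrayList;
import java.util.List;

public class SinglePassReduce {
    // Stand-in for a per-shard query result.
    record Result(boolean isNull, boolean hasSuggest, boolean hasProfile) {}

    // Single pass: drop empty results and accumulate both flags.
    // flags[0] plays the role of hasSuggest, flags[1] of hasProfileResults.
    static List<Result> filterAndScan(List<Result> results, boolean[] flags) {
        List<Result> nonNull = new ArrayList<>();
        for (Result r : results) {
            if (r.isNull()) continue;       // skip empty shard results
            flags[0] |= r.hasSuggest();
            flags[1] |= r.hasProfile();
            nonNull.add(r);
        }
        if (nonNull.isEmpty()) {
            throw new IllegalStateException(
                "must have at least one non-empty search result, got 0 out of " + results.size());
        }
        return nonNull;
    }

    public static void main(String[] args) {
        boolean[] flags = new boolean[2];
        List<Result> out = filterAndScan(
            List.of(new Result(true, false, false), new Result(false, true, false)), flags);
        System.out.println(out.size() + " " + flags[0] + " " + flags[1]);
    }
}
```

The single loop avoids re-reading each `queryResult()` three times, which matters when the result set is large.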
