Optimize sort on long field #48804

mayya-sharipova · 2019-11-01T12:29:07Z

Rewrite sort on long field (number or date) to Lucene DistanceFeatureQuery.
This allows to skip non-competitive hits leading to speedups on sort.

Optimize sort on numeric long and date fields, when the system property `es.search.long_sort_optimized` is true.

…ation

Skip sort optimization if the index has 50% or more data with the same value. When index has a lot of docs with the same value, sort optimization doesn't make sense, as DistanceFeatureQuery will produce same scores for these docs, and Lucene will use the second sort to tie-break. This could be slower than usual sorting.

…4021) This change pre-sort the index reader leaves (segment) prior to search when the primary sort is a numeric field eligible to the distance feature optimization. It also adds a tie breaker on `_doc` to the rewritten sort in order to bypass the fact that leaves will be collected in a random order. I ran this patch on the http_logs benchmark and the results are very promising: ``` | 50th percentile latency | desc_sort_timestamp | 220.706 | 136544 | 136324 | ms | | 90th percentile latency | desc_sort_timestamp | 244.847 | 162084 | 161839 | ms | | 99th percentile latency | desc_sort_timestamp | 316.627 | 172005 | 171688 | ms | | 100th percentile latency | desc_sort_timestamp | 335.306 | 173325 | 172989 | ms | | 50th percentile service time | desc_sort_timestamp | 218.369 | 1968.11 | 1749.74 | ms | | 90th percentile service time | desc_sort_timestamp | 244.182 | 2447.2 | 2203.02 | ms | | 99th percentile service time | desc_sort_timestamp | 313.176 | 2950.85 | 2637.67 | ms | | 100th percentile service time | desc_sort_timestamp | 332.924 | 2959.38 | 2626.45 | ms | | error rate | desc_sort_timestamp | 0 | 0 | 0 | % | | Min Throughput | asc_sort_timestamp | 0.801824 | 0.800855 | -0.00097 | ops/s | | Median Throughput | asc_sort_timestamp | 0.802595 | 0.801104 | -0.00149 | ops/s | | Max Throughput | asc_sort_timestamp | 0.803282 | 0.801351 | -0.00193 | ops/s | | 50th percentile latency | asc_sort_timestamp | 220.761 | 824.098 | 603.336 | ms | | 90th percentile latency | asc_sort_timestamp | 251.741 | 853.984 | 602.243 | ms | | 99th percentile latency | asc_sort_timestamp | 368.761 | 893.943 | 525.182 | ms | | 100th percentile latency | asc_sort_timestamp | 431.042 | 908.85 | 477.808 | ms | | 50th percentile service time | asc_sort_timestamp | 218.547 | 820.757 | 602.211 | ms | | 90th percentile service time | asc_sort_timestamp | 249.578 | 849.886 | 600.308 | ms | | 99th percentile service time | asc_sort_timestamp | 366.317 | 888.894 | 522.577 | ms | | 100th percentile service time | asc_sort_timestamp | 430.952 | 908.401 | 477.45 | ms | | error rate | asc_sort_timestamp | 0 | 0 | 0 | % | ``` So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note that I indexed the http_logs with a single client in order to simulate real time-based indices where document are indexed in their timestamp order. Relates #37043

…ation

As we don't use cancellableCollector anymore, it should be removed from the expected docs response.

When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. Thus for such a case, we use collectorManager, where for every segment a dedicated collector will be created.

…ation

mayya-sharipova · 2019-11-01T12:32:58Z

Last results comparing performance of sort http_logs on master VS this branch:

Master without optimization

before force_merge

| 50th percentile service time | desc_sort_timestamp0 |    3076.96 |     ms |
| 90th percentile service time | desc_sort_timestamp0 |    3118.53 |     ms |
| 99th percentile service time | desc_sort_timestamp0 |    3211.05 |     ms |
|100th percentile service time | desc_sort_timestamp0 |    3219.41 |     ms |
| 50th percentile service time |  asc_sort_timestamp0 |    1776.03 |     ms |
| 90th percentile service time |  asc_sort_timestamp0 |    1822.78 |     ms |
| 99th percentile service time |  asc_sort_timestamp0 |    1855.65 |     ms |
|100th percentile service time |  asc_sort_timestamp0 |    1955.24 |     ms |

after force_merge (without number of segments)

| 50th percentile service time | desc_sort_timestamp1 |    3073.02 |     ms |
| 90th percentile service time | desc_sort_timestamp1 |     3119.1 |     ms |
| 99th percentile service time | desc_sort_timestamp1 |    3140.78 |     ms |
|100th percentile service time | desc_sort_timestamp1 |    3143.92 |     ms |
| 50th percentile service time |  asc_sort_timestamp1 |     1769.2 |     ms |
| 90th percentile service time |  asc_sort_timestamp1 |    1810.92 |     ms |
| 99th percentile service time |  asc_sort_timestamp1 |    1842.64 |     ms |
|100th percentile service time |  asc_sort_timestamp1 |     1847.4 |     ms |

after force_merge to 1 segment

| 50th percentile service time | desc_sort_timestamp2 |    3005.17 |     ms |
| 90th percentile service time | desc_sort_timestamp2 |    3133.18 |     ms |
| 99th percentile service time | desc_sort_timestamp2 |    3264.84 |     ms |
|100th percentile service time | desc_sort_timestamp2 |    3613.81 |     ms |
| 50th percentile service time |  asc_sort_timestamp2 |    1777.44 |     ms |
| 90th percentile service time |  asc_sort_timestamp2 |    1837.54 |     ms |
| 99th percentile service time |  asc_sort_timestamp2 |    1868.63 |     ms |
|100th percentile service time |  asc_sort_timestamp2 |    1888.11 |     ms |

Long sort optimization

before force_merge

| 50th percentile service time | desc_sort_timestamp0 |    1288.75 |     ms |
| 90th percentile service time | desc_sort_timestamp0 |    1330.79 |     ms |
| 99th percentile service time | desc_sort_timestamp0 |    1370.68 |     ms |
|100th percentile service time | desc_sort_timestamp0 |    1386.77 |     ms |
| 50th percentile service time |  asc_sort_timestamp0 |    174.195 |     ms |
| 90th percentile service time |  asc_sort_timestamp0 |    192.786 |     ms |
| 99th percentile service time |  asc_sort_timestamp0 |    220.151 |     ms |
|100th percentile service time |  asc_sort_timestamp0 |    222.951 |     ms |

after force_merge (without number of segments)

| 50th percentile service time | desc_sort_timestamp1 |    1287.39 |     ms |
| 90th percentile service time | desc_sort_timestamp1 |    1328.13 |     ms |
| 99th percentile service time | desc_sort_timestamp1 |    1359.59 |     ms |
|100th percentile service time | desc_sort_timestamp1 |    1371.03 |     ms |
| 50th percentile service time |  asc_sort_timestamp1 |    174.839 |     ms |
| 90th percentile service time |  asc_sort_timestamp1 |    190.754 |     ms |
| 99th percentile service time |  asc_sort_timestamp1 |    207.287 |     ms |
|100th percentile service time |  asc_sort_timestamp1 |     220.85 |     ms |

after force_merge to 1 segment

| 50th percentile service time | desc_sort_timestamp2 |    1485.37 |     ms |
| 90th percentile service time | desc_sort_timestamp2 |     1524.5 |     ms |
| 99th percentile service time | desc_sort_timestamp2 |     1558.5 |     ms |
|100th percentile service time | desc_sort_timestamp2 |    1568.62 |     ms |
| 50th percentile service time |  asc_sort_timestamp2 |    634.327 |     ms |
| 90th percentile service time |  asc_sort_timestamp2 |    675.618 |     ms |
| 99th percentile service time |  asc_sort_timestamp2 |    1499.46 |     ms |
|100th percentile service time |  asc_sort_timestamp2 |    2276.86 |     ms |

Seems that with merging the performance of ascending sort degrades, but still 90th percentile is 3 times better comparing with master. Performance of descending sort is 2 times faster than master

elasticmachine · 2019-11-01T12:46:59Z

Pinging @elastic/es-search (:Search/Search)

jpountz

This is wonderful.

I'd need to do another round of review to double check whether there are corner cases that we are not covering but this looks great to me. This code would likely be a bit hard to maintain but I believe that the benefits are worth it.

jpountz · 2019-11-09T14:33:23Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

            } else {
-               queryCollector = QueryCollectorContext.createQueryCollector(collectors);
+                shouldRescore = searchWithCollector(searchContext, searcher, query, collectors, hasFilterCollector, timeoutSet);


nit: indentation

Thanks. Looks like the code on master was not indented properly.

ok, sorry then I got it backwards

jpountz · 2019-11-09T14:39:53Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+
+    private static Query tryRewriteLongSort(SearchContext searchContext, IndexReader reader,
+                                            Query query, boolean hasFilterCollector) throws IOException {
+        if (searchContext.searchAfter() != null) return null;


maybe leave a TODO about this one, it'd be nice to be able to handle it in a followup

jpountz · 2019-11-09T14:42:43Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+                if (SortField.FIELD_DOC.equals(sField) == false) return null;
+            } else {
+                if (searchContext.mapperService().fullName(sFieldName) == null) return null; // could be _script field that uses _score
+            }


could we use sortField.needsScores to cover scripted fields?

Looks like using sortField.needsScores doesn't work for script fields, as needsScore checks for type == Type.SCORE, while for script fields type=Type.CUSTOM.

For example, in this case we want to avoid running optimization:

"sort": [ {"my_long": "desc"}, { "_script": { "script": { "source": "_score" } } } ]

The type for second sortfield will be CUSTOM. And to avoid running optimization in this case, we need to check that all sort fields are mapped.

In the last commit, I have added TODO item to think how we can cover _script sort that doesn't use _score.

jimczi

I left one comment that needs to be addressed but the change looks good to me.

jimczi · 2019-11-12T11:35:30Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+                    // Add a tiebreak on _doc in order to be able to search
+                    // the leaves in any order. This is needed since we reorder
+                    // the leaves based on the minimum/maxim value in each segment.
+                    newSortFields[newSortFields.length-1] = SortField.FIELD_DOC;


This is not needed anymore since we use the shared collector manager ?

Thanks Jim, indeed shared collector manager with tie on scores compares by docID . I will correct this.

@jimczi addressed in the last commit

jimczi · 2019-11-12T11:37:49Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+                if (SortField.FIELD_DOC.equals(sField) == false) return null;
+            } else {
+                if (searchContext.mapperService().fullName(sFieldName) == null) return null; // could be _script field that uses _score
+            }


mayya-sharipova · 2019-11-12T22:52:11Z

@jimczi @jpountz I have rewritten indexFieldHasDuplicateData function in [5870fd7] to correct bugs and also optimize it (to have less number of iterations). Can you please review it to make sure it has a correct logic. Thanks a lot.

mayya-sharipova · 2019-11-13T16:08:02Z

With the latest commit 5870fd7 I have just ran benchmarks on geonames dataset that has around 11M docs with force_merge (but not a single segment).

Master without optimization

| 50th percentile service time | desc_sort_population |     122.952 |     ms |
| 90th percentile service time | desc_sort_population |     128.613 |     ms |
| 99th percentile service time | desc_sort_population |     136.611 |     ms |
|100th percentile service time | desc_sort_population |     141.578 |     ms |
| 50th percentile service time |  asc_sort_population |     122.794 |     ms |
| 90th percentile service time |  asc_sort_population |     126.137 |     ms |
| 99th percentile service time |  asc_sort_population |     130.012 |     ms |
|100th percentile service time |  asc_sort_population |     133.064 |     ms |

| 50th percentile service time |  desc_sort_geonameid |     116.927 |     ms |
| 90th percentile service time |  desc_sort_geonameid |     122.493 |     ms |
| 99th percentile service time |  desc_sort_geonameid |     125.534 |     ms |
|100th percentile service time |  desc_sort_geonameid |     129.366 |     ms |
| 50th percentile service time |   asc_sort_geonameid |     114.988 |     ms |
| 90th percentile service time |   asc_sort_geonameid |     118.912 |     ms |
| 99th percentile service time |   asc_sort_geonameid |      121.49 |     ms |
|100th percentile service time |   asc_sort_geonameid |     122.745 |     ms |

Optimization branch

| 50th percentile service time | desc_sort_population |     126.824 |     ms |
| 90th percentile service time | desc_sort_population |     131.297 |     ms |
| 99th percentile service time | desc_sort_population |     136.004 |     ms |
|100th percentile service time | desc_sort_population |      145.49 |     ms |
| 50th percentile service time |  asc_sort_population |     126.788 |     ms |
| 90th percentile service time |  asc_sort_population |     132.104 |     ms |
| 99th percentile service time |  asc_sort_population |     139.983 |     ms |
|100th percentile service time |  asc_sort_population |     143.888 |     ms |

| 50th percentile service time |  desc_sort_geonameid |     11.3477 |     ms |
| 90th percentile service time |  desc_sort_geonameid |     12.8101 |     ms |
| 99th percentile service time |  desc_sort_geonameid |     15.0781 |     ms |
|100th percentile service time |  desc_sort_geonameid |     25.8151 |     ms |
| 50th percentile service time |   asc_sort_geonameid |     8.86233 |     ms |
| 90th percentile service time |   asc_sort_geonameid |     11.1957 |     ms |
| 99th percentile service time |   asc_sort_geonameid |     15.7971 |     ms |
|100th percentile service time |   asc_sort_geonameid |     21.6551 |     ms |

For the field population there are no improvements, as optimization was NOT run on this field, as this field has a lot of duplicate data.
For the field geonameid there are around 10X improvements both on asc and desc sorts.

mayya-sharipova · 2019-11-15T13:52:24Z

caught with @jimczi offline; reverting 5870fd7 as it has flaws in logic. Instead in d147e0f just corrected the bug in the calculation of the avg value to avoid overflow.

jimczi

I left some minor nits but the change looks good to me. Thanks @mayya-sharipova

jimczi · 2019-11-15T11:01:55Z

server/src/main/java/org/elasticsearch/search/internal/ContextIndexSearcher.java

+                try {
+                    intersectScorerAndBitSet(scorer, liveDocsBitSet, leafCollector,
+                        checkCancelled == null ? () -> {
+                        } : checkCancelled);


nit: correct indentation

jimczi · 2019-11-15T11:06:52Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

@@ -75,6 +94,8 @@
 */
 public class QueryPhase implements SearchPhase {
    private static final Logger LOGGER = LogManager.getLogger(QueryPhase.class);
+    public static final boolean SYS_PROP_LONG_SORT_OPTIMIZED =
+        Booleans.parseBoolean(System.getProperty("es.search.long_sort_optimized", "true"));


nit: maybe change the name to es.search.rewrite_sort ? Can you also add a comment/todo that we should remove this property in 8 ?

@jimczi Do we need to have any user-faced documentation for this? or at least documentation for the support team (e.g. in case sort is misbehaving, disable the optimization)?

jimczi · 2019-11-15T15:21:47Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+    // we use collectorManager during sort optimization
+    // for the sort optimization, we have already checked that there are no other collectors, no filters,
+    // no search after, no scroll, no collapse, no track scores
+    // this means we can use TopFieldCollector directly


This could be replaced with a multiline comment ?

mayya-sharipova · 2019-11-15T21:31:34Z

@jimczi Does this PR target 7.6? Looks like ES 7.6 still was not upgraded to the Lucene version where shared collector manager was introduced.

jpountz

I tried to do a more in-depth review. I found one thing that I'd like us to optimize a bit, but otherwise it looks good to me.

jpountz · 2019-11-21T09:08:48Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+        collectors.addFirst(topDocsFactory);
+
+        final Collector queryCollector;
+        if ( searchContext.getProfilers() != null) {


Suggested change

if ( searchContext.getProfilers() != null) {

if (searchContext.getProfilers() != null) {

jpountz · 2019-11-21T09:17:28Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+        long minValue = LongPoint.decodeDimension(pointValues.getMinPackedValue(), 0);
+        long maxValue = LongPoint.decodeDimension(pointValues.getMaxPackedValue(), 0);
+        while (minValue < maxValue) {
+            long avgValue = Math.floorDiv(minValue, 2) + Math.floorDiv(maxValue, 2); // to avoid overflow first divide each value by 2


I think avg is confusing here since it is the mid point, not an average

jpountz · 2019-11-21T09:32:22Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+        while (minValue < maxValue) {
+            long avgValue = Math.floorDiv(minValue, 2) + Math.floorDiv(maxValue, 2); // to avoid overflow first divide each value by 2
+            long countLeft = estimatePointCount(pointValues, minValue, avgValue);
+            long countRight = estimatePointCount(pointValues, avgValue + 1, maxValue);


I wonder that this method might be quite expensive by calling estimatePointCount in a loop, can we break the for loop as soon as countLeft + countRight <= globalDocCount/2 (which might require to rename this method to something like boolean hasValueWithCountGreaterThan(PointValues values, long threshold) or something along those lines).

Thanks. I was thinking the same, and implemented it in 5870fd7 , but @jimczi thinks that this is incorrect logic. Now thinking again, I think that may be it is correct, and I should have the commit 5870fd7 back.

After reading 5870fd7 again I think I misread your commit, sorry @mayya-sharipova ;). I thought of it as if min and max was never updated. The logic looks good to me now, sorry for the confusion.

jpountz · 2019-11-21T09:33:24Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+        return (globalMedianCount >= globalDocCount/2);
+    }
+
+    static long estimateMedianValue(PointValues pointValues) throws IOException {


for the record I think this method is doing the right thing, but we are not computing the median here since calls to estimatePointCount below use the updated values of minValue/maxValue instead of the global min/max.

I am not sure I completely understood your comment. Median item is such a value that half of values are smaller than it and half is greater than it. With every loop iteration we choosing a part with more counts, and by this converging to a median value. No? You see it differently?

jpountz · 2019-11-22T22:41:27Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+            PointValues pointValues = lrc.reader().getPointValues(field);
+            if (pointValues == null) continue;
+            int docCount = pointValues.getDocCount();
+            if (docCount <= 512) { // skipping small segments as estimateMedianCount doesn't work well on them


Suggested change

if (docCount <= 512) { // skipping small segments as estimateMedianCount doesn't work well on them

if (docCount <= 512) { // skipping small segments as estimateDocCount doesn't work well on them

jpountz · 2019-11-22T22:44:09Z

server/src/main/java/org/elasticsearch/search/query/QueryPhase.java

+                docsNoDupl += docCount;
+            }
+        }
+        return (docsDupl > docsNoDupl);


The per-segment logic looks good to me but I'm less sure about how information is merged across multiple segments. Maybe this is something we can look into improving later.

mayya-sharipova · 2019-11-25T13:52:50Z

@elasticmachine run elasticsearch-ci/bwc

mayya-sharipova · 2019-11-25T13:53:03Z

@elasticmachine run elasticsearch-ci/default-distro

Use shared TopFieldCollector manager for sort optimization. This collector manager is able to exchange minimum competitive score between collectors

…ation

mayya-sharipova · 2019-11-25T20:10:38Z

@elasticmachine run elasticsearch-ci/packaging-sample-matrix

This reverts commit 79d9b36.

This is a follow up of elastic#48804 where we rewrite numeric sort to use the DistanceFeatureQuery. This change adds another optimization if the query is a `match_all` that instead of using a distance feature query will simply extract the documents directly from the indexed point and early terminate as soon as enough docs have been collected. This optimization has a constant cost so it can be considerably faster than the other optimization since it only needs to visit the BKD-tree of a field and can early terminate as soon as it collected the number of requested hits. Note that this optimization can only work when the query is a match_all and the numeric sort order is not reversed. The pr is in WIP state, it needs more tests and some cleanup but I wanted to open it early in order to discuss whether we should pursue this path or not.

Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to elastic#37043, follow up for elastic#48804

This rewrites long sort as a `DistanceFeatureQuery`, which can efficiently skip non-competitive blocks and segments of documents. Depending on the dataset, the speedups can be 2 - 10 times. The optimization can be disabled with setting the system property `es.search.rewrite_sort` to `false`. Optimization is skipped when an index has 50% or more data with the same value. Optimization is done through: 1. Rewriting sort as `DistanceFeatureQuery` which can efficiently skip non-competitive blocks and segments of documents. 2. Sorting segments according to the primary numeric sort field(#44021) This allows to skip non-competitive segments. 3. Using collector manager. When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. We use collectorManager, where for every segment a dedicated collector will be created. 4. Using Lucene's shared TopFieldCollector manager This collector manager is able to exchange minimum competitive score between collectors, which allows us to efficiently skip the whole segments that don't contain competitive scores. 5. When index is force merged to a single segment, #48533 interleaving old and new segments allows for this optimization as well, as blocks with non-competitive docs can be skipped. Backport for #48804 Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>

Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to #37043, follow up for #48804

Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to elastic#37043, follow up for elastic#48804

mayya-sharipova and others added 12 commits June 10, 2019 12:04

Optimize sort on numeric long and date fields (#39770)

567a739

Optimize sort on numeric long and date fields, when the system property `es.search.long_sort_optimized` is true.

Merge branch 'master' into long_sort_optimization

410e28f

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

a17757f

…ation

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

1a9deae

…ation

Merge branch 'master' into long_sort_optimization

b5cad4d

Merge branch 'master' into long_sort_optimization

04e5e41

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

e167fb9

…ation

Remove nested collector in docs response

c4c3b66

As we don't use cancellableCollector anymore, it should be removed from the expected docs response.

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

a261e4f

…ation

mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Nov 1, 2019

mayya-sharipova added the >enhancement label Nov 1, 2019

jpountz reviewed Nov 9, 2019

View reviewed changes

jimczi requested changes Nov 12, 2019

View reviewed changes

jimczi added release highlight v7.6.0 v8.0.0 labels Nov 12, 2019

jimczi approved these changes Nov 15, 2019

View reviewed changes

jimczi reviewed Nov 15, 2019

View reviewed changes

jpountz requested changes Nov 21, 2019

View reviewed changes

jpountz approved these changes Nov 22, 2019

View reviewed changes

mayya-sharipova added 6 commits November 25, 2019 14:56

Use shared TopFieldCollector manager

37d44ad

Use shared TopFieldCollector manager for sort optimization. This collector manager is able to exchange minimum competitive score between collectors

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

5ce9e33

…ation

Correct calculation of avg value to avoid overflow

ddf165c

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

64c430c

…ation

Optimize calculating if index has duplicate data

6a0a284

Merge remote-tracking branch 'upstream/master' into long_sort_optimiz…

bf8e17a

…ation

mayya-sharipova force-pushed the long_sort_optimization branch from 5cef7ec to bf8e17a Compare November 25, 2019 20:06

mayya-sharipova merged commit 79d9b36 into master Nov 26, 2019

mayya-sharipova added a commit that referenced this pull request Nov 26, 2019

Revert "Optimize sort on long field (#48804)"

e9ba252

This reverts commit 79d9b36.

jimczi mentioned this pull request Nov 29, 2019

Optimize numeric sort on match_all queries #49717

Closed

This was referenced Nov 29, 2019

Disable sort optimization when index is sorted #49727

Merged

Optimize sort on numeric long and date fields. #49732

Merged

mayya-sharipova added a commit that referenced this pull request Dec 2, 2019

Disable sort optimization when index is sorted (#49727)

8f42719

Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to #37043, follow up for #48804

mayya-sharipova added a commit that referenced this pull request Dec 2, 2019

Disable sort optimization when index is sorted (#49727)

3bbaa01

Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to #37043, follow up for #48804

This was referenced Feb 3, 2020

[meta] 7.6 release elastic/elasticsearch-net#4340

Closed

[meta] 7.6 release elastic/elasticsearch-net#4341

Closed

mfussenegger mentioned this pull request Mar 24, 2020

ES Backports crate/crate#9796

Closed

37 tasks

mayya-sharipova deleted the long_sort_optimization branch May 14, 2020 10:31

jtibshirani mentioned this pull request May 18, 2020

Search can fail when size is 0 and long sort optimization enabled. #56923

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

	if ( searchContext.getProfilers() != null) {
	if (searchContext.getProfilers() != null) {

	if (docCount <= 512) { // skipping small segments as estimateMedianCount doesn't work well on them
	if (docCount <= 512) { // skipping small segments as estimateDocCount doesn't work well on them

Optimize sort on long field #48804

Optimize sort on long field #48804

Conversation

mayya-sharipova commented Nov 1, 2019

mayya-sharipova commented Nov 1, 2019 • edited Loading

Master without optimization

Long sort optimization

elasticmachine commented Nov 1, 2019

jpountz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mayya-sharipova Nov 12, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jimczi left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mayya-sharipova commented Nov 12, 2019

mayya-sharipova commented Nov 13, 2019

Master without optimization

Optimization branch

mayya-sharipova commented Nov 15, 2019

jimczi left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mayya-sharipova commented Nov 15, 2019

jpountz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jimczi Nov 21, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mayya-sharipova commented Nov 25, 2019

mayya-sharipova commented Nov 25, 2019

mayya-sharipova commented Nov 25, 2019

mayya-sharipova commented Nov 1, 2019 •

edited

Loading

mayya-sharipova Nov 12, 2019 •

edited

Loading

jimczi Nov 21, 2019 •

edited

Loading