Improve string terms aggregation performance using Collector#setWeight #11643

Merged: 30 commits, Mar 12, 2024

Conversation

@sandeshkr419 sandeshkr419 commented Dec 19, 2023

Description

Use Collector#setWeight to short-circuit certain aggregation paths. This applies when weight#count does not return -1:

  • when weight#count > 0 and weight#count == maxDoc for a segment, the aggregation can read term counts from the termsEnum instead of collecting documents

Cases handled, for which the optimization does not apply:

  1. The field is not indexed.
  2. A doc count is explicitly provided in the documents.
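
The decision described above can be sketched in a few lines. This is a simplified toy model, not the code in this PR: `Segment`, `can_short_circuit`, and `aggregate` are hypothetical names, a plain dict stands in for the termsEnum, and `match_count` stands in for the value of weight#count. The toy segments have no deleted documents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

UNKNOWN = -1  # models weight#count returning -1 (count not cheaply known)


@dataclass
class Segment:
    """Toy stand-in for one index segment; not a Lucene or OpenSearch class."""
    doc_values: List[str]                    # term held by each doc, indexed by doc id
    term_doc_freq: Optional[Dict[str, int]]  # per-term doc frequency; None if field not indexed
    has_explicit_doc_count: bool = False     # True if documents carry their own doc count

    @property
    def max_doc(self) -> int:
        return len(self.doc_values)


def can_short_circuit(seg: Segment, match_count: int) -> bool:
    """True when the query matches every doc and term statistics are usable."""
    if match_count <= 0 or match_count != seg.max_doc:
        return False  # unknown count, no matches, or only a partial match
    if seg.term_doc_freq is None:
        return False  # excluded case 1: field not indexed
    if seg.has_explicit_doc_count:
        return False  # excluded case 2: explicit doc counts in documents
    return True


def aggregate(work: List[Tuple[Segment, int, List[int]]]) -> Tuple[Dict[str, int], int]:
    """Return (bucket counts, number of segments that took the fast path)."""
    buckets: Dict[str, int] = {}
    fast_segments = 0
    for seg, match_count, matching_doc_ids in work:
        if can_short_circuit(seg, match_count):
            # Fast path: add each term's doc frequency without visiting docs.
            for term, freq in seg.term_doc_freq.items():
                buckets[term] = buckets.get(term, 0) + freq
            fast_segments += 1
        else:
            # Slow path: visit every matching doc and count its term.
            for doc_id in matching_doc_ids:
                term = seg.doc_values[doc_id]
                buckets[term] = buckets.get(term, 0) + 1
    return buckets, fast_segments


full = Segment(["US", "US", "IN"], {"US": 2, "IN": 1})
partial = Segment(["US", "DE", "DE", "IN"], {"US": 1, "DE": 2, "IN": 1})
print(aggregate([(full, 3, [0, 1, 2]), (partial, 2, [1, 3])]))
```

In this model the first segment is answered from term statistics alone, while the second falls back to per-document collection because the query matches only two of its four documents.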

Related Issues

Resolves #10954

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

❌ Gradle check result for 006f404: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Search:Performance labels Dec 19, 2023
github-actions bot commented Jan 9, 2024

❌ Gradle check result for 6667d18: FAILURE

github-actions bot commented Jan 9, 2024

Compatibility status:

Checks if related components are compatible with change cdc4204

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git]

❌ Gradle check result for ce1082c: FAILURE

❌ Gradle check result for a5b3baa: FAILURE

❌ Gradle check result for e4e0b3c: FAILURE

❌ Gradle check result for 851b759: FAILURE

❌ Gradle check result for e005a9c: FAILURE

sandeshkr419 commented Jan 29, 2024

To test the performance improvement, I edited one of the sub-aggregation search bodies into a simple terms aggregation.

In OSB (OpenSearch Benchmark), my search operation looks like this:

{
  "name": "country_agg_uncached",
  "operation-type": "search",
  "body": {
    "size": 0,
    "aggs": {
      "country_population": {
        "terms": {
          "field": "country_code.raw"
        }
      }
    }
  }
}

The changes were run on a 2.11 cluster:

Without Changes:

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |        2.99 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |        2.99 |  ops/s |
|                                              Median Throughput | country_agg_uncached |        2.99 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |        2.99 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     126.903 |     ms |
|                                        90th percentile latency | country_agg_uncached |     161.641 |     ms |
|                                        99th percentile latency | country_agg_uncached |     301.439 |     ms |
|                                       100th percentile latency | country_agg_uncached |     317.183 |     ms |
|                                   50th percentile service time | country_agg_uncached |     124.271 |     ms |
|                                   90th percentile service time | country_agg_uncached |     159.256 |     ms |
|                                   99th percentile service time | country_agg_uncached |     298.885 |     ms |
|                                  100th percentile service time | country_agg_uncached |     314.573 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                              Median Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     124.281 |     ms |
|                                        90th percentile latency | country_agg_uncached |     133.398 |     ms |
|                                        99th percentile latency | country_agg_uncached |     146.104 |     ms |
|                                       100th percentile latency | country_agg_uncached |     147.676 |     ms |
|                                   50th percentile service time | country_agg_uncached |     121.773 |     ms |
|                                   90th percentile service time | country_agg_uncached |     131.374 |     ms |
|                                   99th percentile service time | country_agg_uncached |     145.346 |     ms |
|                                  100th percentile service time | country_agg_uncached |     146.158 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |           3 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                              Median Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     121.595 |     ms |
|                                        90th percentile latency | country_agg_uncached |      129.72 |     ms |
|                                        99th percentile latency | country_agg_uncached |     139.045 |     ms |
|                                       100th percentile latency | country_agg_uncached |     143.113 |     ms |
|                                   50th percentile service time | country_agg_uncached |     119.435 |     ms |
|                                   90th percentile service time | country_agg_uncached |     127.876 |     ms |
|                                   99th percentile service time | country_agg_uncached |     136.874 |     ms |
|                                  100th percentile service time | country_agg_uncached |     140.488 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

With Current Changes:

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |           3 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |           3 |  ops/s |
|                                              Median Throughput | country_agg_uncached |           3 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |           3 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     22.5772 |     ms |
|                                        90th percentile latency | country_agg_uncached |     26.6315 |     ms |
|                                        99th percentile latency | country_agg_uncached |     37.5379 |     ms |
|                                       100th percentile latency | country_agg_uncached |     41.2208 |     ms |
|                                   50th percentile service time | country_agg_uncached |     19.9387 |     ms |
|                                   90th percentile service time | country_agg_uncached |     23.2274 |     ms |
|                                   99th percentile service time | country_agg_uncached |     34.8001 |     ms |
|                                  100th percentile service time | country_agg_uncached |     35.6524 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                              Median Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     21.9949 |     ms |
|                                        90th percentile latency | country_agg_uncached |      26.996 |     ms |
|                                        99th percentile latency | country_agg_uncached |     32.5468 |     ms |
|                                       100th percentile latency | country_agg_uncached |     42.8395 |     ms |
|                                   50th percentile service time | country_agg_uncached |     19.5599 |     ms |
|                                   90th percentile service time | country_agg_uncached |     24.0329 |     ms |
|                                   99th percentile service time | country_agg_uncached |     29.9984 |     ms |
|                                  100th percentile service time | country_agg_uncached |     39.9631 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                              Median Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     19.9977 |     ms |
|                                        90th percentile latency | country_agg_uncached |     25.1257 |     ms |
|                                        99th percentile latency | country_agg_uncached |     35.0793 |     ms |
|                                       100th percentile latency | country_agg_uncached |     44.3048 |     ms |
|                                   50th percentile service time | country_agg_uncached |     17.6903 |     ms |
|                                   90th percentile service time | country_agg_uncached |     23.3045 |     ms |
|                                   99th percentile service time | country_agg_uncached |     28.8288 |     ms |
|                                  100th percentile service time | country_agg_uncached |     43.4339 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

|                                                  Segment count |                      |         318 |        |
|                                                 Min Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                Mean Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                              Median Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                                 Max Throughput | country_agg_uncached |        3.01 |  ops/s |
|                                        50th percentile latency | country_agg_uncached |     21.8416 |     ms |
|                                        90th percentile latency | country_agg_uncached |     27.9763 |     ms |
|                                        99th percentile latency | country_agg_uncached |     34.1669 |     ms |
|                                       100th percentile latency | country_agg_uncached |     48.2031 |     ms |
|                                   50th percentile service time | country_agg_uncached |     19.2949 |     ms |
|                                   90th percentile service time | country_agg_uncached |     25.0944 |     ms |
|                                   99th percentile service time | country_agg_uncached |     30.8222 |     ms |
|                                  100th percentile service time | country_agg_uncached |     42.6661 |     ms |
|                                                     error rate | country_agg_uncached |           0 |      % |

Latency improves by roughly 4x (p100) to 6x (p90) across these runs.
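
As a rough check on that claim, the latency figures from the runs above can be averaged and compared. This is only a sketch: averaging across runs is an assumption made here, not necessarily how the figure was derived, and the names `baseline`, `patched`, and `speedup` are illustrative.

```python
from statistics import mean

# Latency in ms, copied from the run tables above (3 runs without the change).
baseline = {
    "p50": [126.903, 124.281, 121.595],
    "p90": [161.641, 133.398, 129.72],
    "p100": [317.183, 147.676, 143.113],
}

# Latency in ms, copied from the run tables above (4 runs with the change).
patched = {
    "p50": [22.5772, 21.9949, 19.9977, 21.8416],
    "p90": [26.6315, 26.996, 25.1257, 27.9763],
    "p100": [41.2208, 42.8395, 44.3048, 48.2031],
}


def speedup(percentile: str) -> float:
    """Ratio of mean baseline latency to mean patched latency."""
    return mean(baseline[percentile]) / mean(patched[percentile])


for percentile in baseline:
    print(f"{percentile}: {speedup(percentile):.1f}x faster")
```

With run averages, the ratios for all three percentiles fall in the same 4x to 6x range. The first baseline run has a noticeably higher tail latency than the other two, so the p90 and p100 ratios are sensitive to whether that run is included.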

@msfroh Next, I'm looking at whether I can tighten the implementation, refactor further, and cover more relevant cases, but please feel free to take an initial look and provide comments.

Also, I will open a separate issue with OSB to add plain terms aggregations to their workloads, since there is currently no terms aggregation workload like the one I tested.

@sandeshkr419 sandeshkr419 changed the title [Draft] Use Collector.setWeight to improve aggregation performance [Draft] Use Collector.setWeight to improve terms aggregation performance Jan 31, 2024
❌ Gradle check result for 6d05716: FAILURE

@harshavamsi harshavamsi added v2.13.0 Issues and PRs related to version 2.13.0 v2.12.0 Issues and PRs related to version 2.12.0 and removed v2.13.0 Issues and PRs related to version 2.13.0 labels Feb 1, 2024
@sandeshkr419 sandeshkr419 changed the title [Draft] Use Collector.setWeight to improve terms aggregation performance [Draft] Use Collector.setWeight to improve string terms aggregation performance Feb 3, 2024
@sandeshkr419 sandeshkr419 changed the title [Draft] Use Collector.setWeight to improve string terms aggregation performance Improve string terms aggregation performance using Collector#setWeight Feb 3, 2024
github-actions bot commented Feb 3, 2024

❌ Gradle check result for 2a9cce7: FAILURE

github-actions bot commented Feb 3, 2024

❌ Gradle check result for 82d1532: FAILURE

github-actions bot commented Feb 3, 2024

❌ Gradle check result for 58716d2: FAILURE

github-actions bot commented Feb 3, 2024

❌ Gradle check result for 8922b88: FAILURE

github-actions bot commented Feb 3, 2024

❌ Gradle check result for b555118: FAILURE

@sandeshkr419 sandeshkr419 marked this pull request as ready for review February 5, 2024 02:08
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

sandeshkr419 commented Mar 12, 2024

@msfroh Thanks for the comments. I made the rewording changes you pointed out; let me know if it looks more accurate now.
I have also rebased and merged.
Waiting for the CI to succeed.

✅ Gradle check result for cdc4204: SUCCESS

@sandeshkr419 sandeshkr419 added v2.13.0 Issues and PRs related to version 2.13.0 and removed v2.12.0 Issues and PRs related to version 2.12.0 labels Mar 12, 2024
@sandeshkr419 sandeshkr419 self-assigned this Mar 12, 2024
@msfroh msfroh merged commit 7dac98c into opensearch-project:main Mar 12, 2024
34 of 36 checks passed
@msfroh msfroh added the backport 2.x Backport to 2.x branch label Mar 12, 2024
@sandeshkr419 sandeshkr419 deleted the agg-per branch March 12, 2024 23:46
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 13, 2024
…onally match-all for a segment (#11643)

---------

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
(cherry picked from commit 7dac98c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
sandeshkr419 added a commit to sandeshkr419/OpenSearch that referenced this pull request Mar 13, 2024
…onally match-all for a segment (opensearch-project#11643)



---------

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
msfroh pushed a commit that referenced this pull request Mar 13, 2024
…query is functionally match-all for a segment (#12629)

Quickly compute terms aggregations when the top-level query is functionally match-all for a segment (#11643)


---------

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
…onally match-all for a segment (opensearch-project#11643)



---------

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…onally match-all for a segment (opensearch-project#11643)

---------

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Search:Performance v2.13.0 Issues and PRs related to version 2.13.0
Development

Successfully merging this pull request may close these issues.

Use Collector.setWeight to improve aggregation performance (for special cases)
6 participants