Build global ordinals terms bucket from matching ordinals #30166

Merged: 6 commits, Apr 27, 2018

Commits on Apr 26, 2018

  1. Build terms bucket from matching ordinals

    The global ordinals terms aggregator has an option to remap global ordinals to
    dense ordinals that match the request. This mode is picked automatically when the terms
    aggregator is a child of another bucket aggregator, or when it needs to defer buckets to an
    aggregation that is used in the ordering of the terms.
    However, when building the final buckets, this aggregator loops over all possible global ordinals
    rather than using the hash map that was built to remap the ordinals.
    For fields with high cardinality this is highly inefficient and can lead to slow responses even
    when the number of terms that match the query is low.
    This change fixes the performance issue by using the hash table of matching ordinals to
    prune the final buckets for the terms and significant_terms aggregations.
    I ran a simple benchmark with 1M documents, each containing 0 to 10 keywords randomly selected from 1M unique terms.
    This field is used in a multi-level terms aggregation, with Rally collecting the response times.
    The aggregation below is an example of the two-level terms aggregation used in the benchmark:
    
    ```
    "aggregations":{
       "1":{
          "terms":{
             "field":"keyword"
          },
          "aggregations":{
             "2":{
                "terms":{
                   "field":"keyword"
                }
             }
          }
       }
    }
    ```
    
    | Levels of aggregation | 50th percentile, master (ms) | 50th percentile, patch (ms) |
    | --- | --- | --- |
    | 2 | 640.41 | 577.499 |
    | 3 | 2239.66 | 600.154 |
    | 4 | 14141.2 | 703.512 |
    
    Closes elastic#30117
    jimczi committed Apr 26, 2018
    b46802c
  2. unused import

    jimczi committed Apr 26, 2018
    333e338
  3. typos

    jimczi committed Apr 26, 2018
    5b6ed93
  4. address review

    jimczi committed Apr 26, 2018
    69c8160
  5. fa905ab
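
The first commit message above contrasts two ways of building the final buckets: scanning every global ordinal, or iterating only the ordinals that actually matched. The snippet below is a minimal sketch of that difference, not the Elasticsearch implementation. The function names, the plain `dict` standing in for the hash table of matching ordinals, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Bucket = Tuple[int, int]  # (global ordinal, doc count)


def _top_buckets(candidates: List[Bucket], size: int) -> List[Bucket]:
    # Illustrative ordering only: doc count descending, then ordinal ascending.
    return sorted(candidates, key=lambda b: (-b[1], b[0]))[:size]


def build_buckets_full_scan(
    num_global_ords: int, doc_counts: Dict[int, int], size: int
) -> Tuple[List[Bucket], int]:
    """Sketch of the slow path: visit every global ordinal, matched or not."""
    visited = 0
    candidates: List[Bucket] = []
    for global_ord in range(num_global_ords):
        visited += 1
        count = doc_counts.get(global_ord, 0)
        if count > 0:
            candidates.append((global_ord, count))
    return _top_buckets(candidates, size), visited


def build_buckets_from_matches(
    doc_counts: Dict[int, int], size: int
) -> Tuple[List[Bucket], int]:
    """Sketch of the fixed path: visit only ordinals present in the hash table."""
    visited = 0
    candidates: List[Bucket] = []
    for global_ord, count in doc_counts.items():
        visited += 1
        if count > 0:
            candidates.append((global_ord, count))
    return _top_buckets(candidates, size), visited


# Hypothetical data: 1M possible ordinals, only three of which matched the query.
matching = {42: 7, 999_999: 3, 17: 7}
top_full, visited_full = build_buckets_full_scan(1_000_000, matching, size=2)
top_matched, visited_matched = build_buckets_from_matches(matching, size=2)

print(top_full == top_matched)        # both strategies select the same buckets
print(visited_full, visited_matched)  # work done by each strategy
```

Both functions return the same buckets; only the number of ordinals visited differs. In this sketch the full scan's cost grows with field cardinality, while the second grows with the number of matching terms.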

Commits on Apr 27, 2018

  1. 78034d1
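
The benchmark in the first commit message ran the same nested terms aggregation at 2, 3 and 4 levels, but only the two-level request body is shown. As a rough sketch, a body of any depth could be generated as below. The helper name `nested_terms_aggregation` is hypothetical, and the numbering of the deeper levels ("3", "4") is an assumption extrapolated from the two-level example.

```python
import json
from typing import Any, Dict, Optional


def nested_terms_aggregation(levels: int, field: str = "keyword") -> Dict[str, Any]:
    """Build a request fragment with `levels` nested terms aggregations."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    inner: Optional[Dict[str, Any]] = None
    # Build from the innermost level outwards so each level wraps the previous one.
    for level in range(levels, 0, -1):
        agg: Dict[str, Any] = {"terms": {"field": field}}
        if inner is not None:
            agg["aggregations"] = inner
        inner = {str(level): agg}
    return {"aggregations": inner}


two_levels = nested_terms_aggregation(2)
print(json.dumps(two_levels, indent=3))
```

With `levels=2` this reproduces the structure of the request shown in the commit message; passing 3 or 4 adds further `terms` aggregations on the same field inside the innermost one.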