Combine all postings enum impls of the default codec into a single class #14033

Merged
jpountz merged 23 commits into apache:main from merge_postings_enum_impls on Dec 4, 2024

Conversation

jpountz
Contributor

@jpountz jpountz commented Dec 3, 2024

Recent speedups by making call sites bimorphic made me want to play with combining all postings enums and impacts enums of the default codec into a single class, in order to reduce polymorphism. Unfortunately, it does not yield a speedup since the major polymorphic call sites we have that hurt performance (DefaultBulkScorer, ConjunctionDISI) are still 3-polymorphic or more.

Yet, reduced polymorphism at little performance impact is a good trade-off as it would help make call sites bimorphic for users who don't have as much query diversity as nightly benchmarks, or in the future when we remove other causes of polymorphism.
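
To illustrate the idea, here is a minimal sketch of what "combining several enum implementations into a single class" can look like. All names (`DocIterator`, `DocsOnlyIterator`, `DocsAndFreqsIterator`, `CombinedIterator`, `sumFreqs`) are hypothetical and are not the classes touched by this PR. The sketch assumes the usual HotSpot behavior, where a call site that only ever sees one or two receiver classes can be inlined, while a call site that sees three or more typically falls back to a virtual call.

```java
class PolymorphismSketch {
  // Hypothetical, simplified stand-in for a postings iterator (not Lucene's API).
  interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int nextDoc();

    int freq();
  }

  // Before: one class per feature set. Each class is a distinct receiver type
  // at any call site that consumes a DocIterator.
  static final class DocsOnlyIterator implements DocIterator {
    private final int[] docs;
    private int i = -1;

    DocsOnlyIterator(int[] docs) {
      this.docs = docs;
    }

    public int nextDoc() {
      return ++i < docs.length ? docs[i] : NO_MORE_DOCS;
    }

    public int freq() {
      return 1; // frequencies were not requested
    }
  }

  static final class DocsAndFreqsIterator implements DocIterator {
    private final int[] docs;
    private final int[] freqs;
    private int i = -1;

    DocsAndFreqsIterator(int[] docs, int[] freqs) {
      this.docs = docs;
      this.freqs = freqs;
    }

    public int nextDoc() {
      return ++i < docs.length ? docs[i] : NO_MORE_DOCS;
    }

    public int freq() {
      return freqs[i];
    }
  }

  // After: a single class whose behavior depends on a final field, so the
  // call site only sees one receiver type from this family.
  static final class CombinedIterator implements DocIterator {
    private final int[] docs;
    private final int[] freqs; // null when frequencies were not requested
    private int i = -1;

    CombinedIterator(int[] docs, int[] freqs) {
      this.docs = docs;
      this.freqs = freqs;
    }

    public int nextDoc() {
      return ++i < docs.length ? docs[i] : NO_MORE_DOCS;
    }

    public int freq() {
      return freqs == null ? 1 : freqs[i];
    }
  }

  // Shared call site: it.nextDoc() and it.freq() are the virtual calls whose
  // receiver-type diversity matters to the JIT.
  static long sumFreqs(DocIterator it) {
    long sum = 0;
    for (int doc = it.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
      sum += it.freq();
    }
    return sum;
  }

  public static void main(String[] args) {
    int[] docs = {1, 5, 9};
    int[] freqs = {2, 3, 4};
    System.out.println(sumFreqs(new CombinedIterator(docs, freqs)));
    System.out.println(sumFreqs(new CombinedIterator(docs, null)));
  }
}
```

The trade-off in this sketch is an extra branch on a field inside `freq()` in exchange for fewer receiver types at `sumFreqs`. That only pays off once the call site drops to two types or fewer, which matches the observation above that call sites remaining 3-polymorphic see no speedup.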

@jpountz jpountz added this to the 10.1.0 milestone Dec 3, 2024
Contributor Author

jpountz commented Dec 3, 2024

                            Task  QPS baseline      StdDev  QPS my_modified_version      StdDev                Pct diff p-value
                   TermTitleSort      160.25      (2.1%)      153.39      (2.2%)   -4.3% (  -8% -    0%) 0.000
                    SloppyPhrase        1.74      (5.3%)        1.68      (7.9%)   -3.8% ( -16% -   10%) 0.076
                         Prefix3      130.86      (5.8%)      126.45      (4.9%)   -3.4% ( -13% -    7%) 0.048
            FilteredAndStopWords       50.00      (1.4%)       48.40      (1.1%)   -3.2% (  -5% -    0%) 0.000
                     OrStopWords       35.34      (5.7%)       34.22      (4.8%)   -3.2% ( -12% -    7%) 0.054
                       OrHighMed      199.08      (4.9%)      193.04      (4.0%)   -3.0% ( -11% -    6%) 0.033
                 CountAndHighMed      167.66      (1.4%)      163.02      (2.0%)   -2.8% (  -6% -    0%) 0.000
             FilteredAndHighHigh       64.83      (1.6%)       63.05      (1.0%)   -2.8% (  -5% -    0%) 0.000
                        Or3Terms      178.04      (3.4%)      173.58      (2.6%)   -2.5% (  -8% -    3%) 0.009
             FilteredOrStopWords       44.97      (2.0%)       43.87      (1.9%)   -2.5% (  -6% -    1%) 0.000
     FilteredAnd2Terms2StopWords      203.45      (1.2%)      198.46      (1.0%)   -2.5% (  -4% -    0%) 0.000
                      OrHighHigh       54.08      (6.1%)       52.78      (5.0%)   -2.4% ( -12% -    9%) 0.172
              Or2Terms2StopWords      168.21      (3.4%)      164.33      (2.7%)   -2.3% (  -8% -    3%) 0.018
               TermDayOfYearSort      640.07      (3.2%)      626.12      (2.6%)   -2.2% (  -7% -    3%) 0.019
                    AndStopWords       33.19      (2.4%)       32.48      (2.0%)   -2.1% (  -6% -    2%) 0.003
                CountAndHighHigh       57.25      (1.1%)       56.07      (1.5%)   -2.1% (  -4% -    0%) 0.000
                  CountOrHighMed      142.11      (1.3%)      139.21      (1.8%)   -2.0% (  -5% -    1%) 0.000
              FilteredOrHighHigh       66.25      (1.7%)       64.92      (1.7%)   -2.0% (  -5% -    1%) 0.000
             And2Terms2StopWords      170.06      (1.6%)      166.65      (1.8%)   -2.0% (  -5% -    1%) 0.000
      FilteredOr2Terms2StopWords      151.21      (1.2%)      148.39      (1.0%)   -1.9% (  -4% -    0%) 0.000
                      TermDTSort      289.16      (8.1%)      284.04      (5.2%)   -1.8% ( -14% -   12%) 0.414
                AndMedOrHighHigh       60.26      (1.6%)       59.20      (2.4%)   -1.8% (  -5% -    2%) 0.007
                      OrHighRare      275.69      (7.7%)      271.12      (6.3%)   -1.7% ( -14% -   13%) 0.456
               FilteredOrHighMed      156.87      (1.2%)      154.29      (1.1%)   -1.6% (  -3% -    0%) 0.000
                FilteredOr3Terms      169.13      (0.9%)      166.54      (1.3%)   -1.5% (  -3% -    0%) 0.000
               FilteredAnd3Terms      198.47      (2.2%)      195.45      (1.9%)   -1.5% (  -5% -    2%) 0.018
                      DismaxTerm      604.57      (3.2%)      596.22      (3.0%)   -1.4% (  -7% -    4%) 0.159
                            Term      470.19      (4.7%)      463.83      (4.4%)   -1.4% ( -10% -    8%) 0.351
                       And3Terms      183.07      (1.6%)      180.77      (1.7%)   -1.3% (  -4% -    2%) 0.016
                          OrMany       20.25      (2.4%)       20.00      (1.9%)   -1.2% (  -5% -    3%) 0.071
                  FilteredPhrase       30.89      (1.3%)       30.58      (1.3%)   -1.0% (  -3% -    1%) 0.017
                          Phrase       14.95      (3.3%)       14.81      (3.9%)   -1.0% (  -7% -    6%) 0.396
              FilteredAndHighMed      132.58      (1.7%)      131.37      (1.7%)   -0.9% (  -4% -    2%) 0.095
                   TermMonthSort     3425.81      (2.8%)     3396.71      (2.5%)   -0.8% (  -5% -    4%) 0.313
                      AndHighMed      133.27      (1.6%)      132.15      (2.0%)   -0.8% (  -4% -    2%) 0.146
                        Wildcard       75.43      (3.9%)       74.82      (2.8%)   -0.8% (  -7% -    6%) 0.447
                  FilteredOrMany       16.94      (3.4%)       16.82      (5.0%)   -0.7% (  -8% -    7%) 0.602
                 CountOrHighHigh       75.42      (1.5%)       75.06      (1.4%)   -0.5% (  -3% -    2%) 0.294
                DismaxOrHighHigh       69.00      (3.7%)       68.68      (4.0%)   -0.5% (  -7% -    7%) 0.707
                     AndHighHigh       45.67      (1.9%)       45.47      (2.3%)   -0.4% (  -4% -    3%) 0.502
              CombinedAndHighMed       55.13      (2.4%)       54.92      (3.6%)   -0.4% (  -6% -    5%) 0.706
                    FilteredTerm      155.18      (2.5%)      154.79      (2.2%)   -0.2% (  -4% -    4%) 0.740
               CombinedOrHighMed       71.00      (2.7%)       70.89      (4.5%)   -0.2% (  -7% -    7%) 0.896
                        PKLookup      283.20      (2.1%)      283.11      (2.6%)   -0.0% (  -4% -    4%) 0.968
                 DismaxOrHighMed       85.11      (3.4%)       85.18      (3.0%)    0.1% (  -6% -    6%) 0.933
              CombinedOrHighHigh       18.64      (2.7%)       18.65      (5.0%)    0.1% (  -7% -    8%) 0.942
                        SpanNear        2.19     (11.7%)        2.20     (12.1%)    0.2% ( -21% -   27%) 0.959
                          Fuzzy1       81.65      (1.5%)       81.87      (1.8%)    0.3% (  -2% -    3%) 0.606
                          Fuzzy2       76.93      (1.6%)       77.16      (1.7%)    0.3% (  -2% -    3%) 0.564
                       CountTerm     8598.50      (5.3%)     8630.21      (5.0%)    0.4% (  -9% -   11%) 0.821
             CombinedAndHighHigh       15.07      (2.6%)       15.13      (3.9%)    0.4% (  -5% -    7%) 0.723
                    TermGroup100       23.57      (4.4%)       23.66      (4.2%)    0.4% (  -7% -    9%) 0.786
                         Respell       54.70      (2.5%)       55.00      (1.8%)    0.6% (  -3% -    4%) 0.420
                    CombinedTerm       29.19      (2.8%)       29.38      (4.0%)    0.7% (  -5% -    7%) 0.531
                    TermBGroup1M       23.94      (4.7%)       24.17      (4.2%)    1.0% (  -7% -   10%) 0.496
                    TermGroup10K       19.40      (4.0%)       19.59      (4.0%)    1.0% (  -6% -    9%) 0.428
                     TermGroup1M       19.07      (4.2%)       19.27      (3.7%)    1.1% (  -6% -    9%) 0.401
                  TermBGroup1M1P       36.55      (4.1%)       37.04      (4.1%)    1.3% (  -6% -    9%) 0.306
                          IntNRQ      112.93     (15.2%)      114.45     (17.0%)    1.3% ( -26% -   39%) 0.792
                IntervalsOrdered        2.35      (2.7%)        2.39      (3.0%)    1.5% (  -4% -    7%) 0.100
                     CountPhrase        4.20      (3.3%)        4.32      (1.4%)    2.9% (  -1% -    7%) 0.000
                 AndHighOrMedMed       43.41      (1.7%)       44.72      (3.1%)    3.0% (  -1% -    7%) 0.000
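
For readers checking the table, the "Pct diff" column is consistent with the relative change in mean QPS from baseline to the modified version. The helper below is a small sketch of that arithmetic only; `pctDiff` is a hypothetical name, and the sketch does not reproduce how the benchmark tool derives the parenthesized range or the p-value.

```java
class PctDiffSketch {
  // Relative change of the candidate's mean QPS versus the baseline, in percent.
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // Values copied from the first and last rows of the table above.
    System.out.printf("TermTitleSort: %.1f%%%n", pctDiff(160.25, 153.39));
    System.out.printf("AndHighOrMedMed: %.1f%%%n", pctDiff(43.41, 44.72));
  }
}
```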

Member

@rmuir rmuir left a comment

nice cleanup

@jpountz jpountz merged commit 6c48b40 into apache:main Dec 4, 2024
3 checks passed
@jpountz jpountz deleted the merge_postings_enum_impls branch December 4, 2024 14:19
jpountz added a commit that referenced this pull request Dec 4, 2024
…ass (#14033)
