Favor SmallVec over Box in pluralrules #285

gregtatum · 2020-10-01T20:06:24Z

This will help avoid heap allocations as most lists are only a few items
long. The only boxed slice that was retained was the SampleRanges, as
this can be of an indeterminate length.

Resolves #193
Depends on #140
Depends on #107

gregtatum · 2020-10-01T20:08:15Z

components/pluralrules/Cargo.toml

@@ -17,6 +17,7 @@ include = [
 [dependencies]
 icu-locale = { path = "../locale" }
 icu-data-provider = { path = "../data-provider" }
+smallvec = "1.4"


This is the same version as num-util is using.

coveralls · 2020-10-01T20:12:10Z

Pull Request Test Coverage Report for Build 073a029aff20c87865f0223c207d4a4e19125dac-PR-285

7 of 10 (70.0%) changed or added relevant lines in 2 files are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage increased (+11.2%) to 77.341%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
components/pluralrules/src/rules/ast.rs	0	3	0.0%

Files with Coverage Reduction	New Missed Lines	%
components/locale/src/serde/langid.rs	1	88.24%

Totals
Change from base Build c08d490c92c0187ee5504b8505d5fac7387d8675:	11.2%
Covered Lines:	2990
Relevant Lines:	3866

💛 - Coveralls

gregtatum · 2020-10-01T22:23:13Z

It seems that the linter is not liking what this does to the RulesSelector, as it's now 2480 bytes for the Conditions(PluralRuleList) entry. One mitigation would be to heap allocate the PluralRuleList in a Box<PluralRuleList>. It's a heap allocation, but it might have better memory locality, although... that's just a guess.

sffc · 2020-10-01T22:40:39Z

Thanks for the contribution!

Can you provide figures on how "cargo bench" differs before and after this PR?

Boxing the PluralRuleList seems like a good idea. It's one heap allocation, but that's better than the dozens that were there before.

zbraniecki · 2020-10-01T23:00:47Z

It does seem to have a nice impact on the microbenchmark:

plurals/parser/parse    time:   [1.1205 us 1.1219 us 1.1233 us]                                  
                        change: [-27.377% -26.828% -26.328%] (p = 0.00 < 0.05)
                        Performance has improved.

zbraniecki · 2020-10-01T23:06:44Z

In terms of long term maintenance, there's an effort now to introduce stack based vector - rust-lang/rfcs#2990 which, if successful, is a stepping stone to a hybrid vector like smallvec.

So I'm cautiously optimistic that we can use smallvec in performance sensitive cases where we commonly have small vectors but want to be ready for them to be larger (like this) and eventually we'll be able to remove the dependency on smallvec.

zbraniecki · 2020-10-02T17:17:25Z

I rebased @gregtatum patch on top of master and checked it against the total benchmarks:

operands/total          time:   [965.73 ns 971.94 ns 981.35 ns]                            
                        change: [-5.7538% -3.2698% -1.1855%] (p = 0.00 < 0.05)
                        Performance has improved.

parser/total            time:   [1.1470 us 1.1600 us 1.1776 us]                          
                        change: [-37.270% -36.208% -35.143%] (p = 0.00 < 0.05)
                        Performance has improved.

pluralrules/total       time:   [85.550 us 85.883 us 86.292 us]                              
                        change: [+4.7499% +6.1966% +7.6202%] (p = 0.00 < 0.05)
                        Performance has regressed.

I was able to reproduce that change (improvement on parsing by over 30% and regression on full selection by ~5-10%) in 5 test runs.
I do trust the results to be reliable.

I dove deeper:

zbraniecki@zibi-xps13:~/projects/icu4x/components/pluralrules$ cargo criterion pluralrules --features bench-pluralrules
   Compiling icu-num-util v0.0.1 (/home/zbraniecki/projects/icu4x/components/num-util)
   Compiling icu-pluralrules v0.0.1 (/home/zbraniecki/projects/icu4x/components/pluralrules)
    Finished bench [optimized] target(s) in 7.11s

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Gnuplot not found, using plotters backend
pluralrules/construct/fs                                                                            
                        time:   [143.34 us 143.71 us 144.19 us]
                        change: [+5.2929% +6.2346% +7.1295%] (p = 0.00 < 0.05)
                        Performance has regressed.

pluralrules/select/fs   time:   [1.7583 us 1.7621 us 1.7660 us]                                   
                        change: [-0.1753% +0.9524% +2.0192%] (p = 0.09 > 0.05)
                        No change in performance detected.

Gnuplot not found, using plotters backend
pluralrules/total       time:   [85.585 us 86.826 us 91.162 us]                              
                        change: [+11.223% +10.677% +14.487%] (p = 0.00 < 0.05)
                        Performance has regressed.

I'm not sure what's going on here really. My first hypothesis was that the regression is in selection because accessing AST nodes is slower, but the deep benchmarks show clearly that the regression is on parsing, not selection.

zbraniecki · 2020-10-02T17:17:35Z

can someone reproduce?

zbraniecki · 2020-10-02T17:23:33Z

One hypothesis that could explain that is that my parsing/total benchmark only tests pl rules and is not representative of what pluralrules/construct and pluralrules/total are testing since they all construct for 10 langs: https://github.com/unicode-org/icu4x/blob/master/components/pluralrules/benches/fixtures/plurals.json#L10

We may want to get rid of the pl rules in benchmark fixtures, and just parse rules for those 10 locales as a more representative case.

zbraniecki · 2020-10-02T17:24:33Z

If that hypothesis holds, then this PR helps Polish rules (by 30%) but slows down rules for 10 locales (by 5-10%). Both are syntentic microbenchmarks, but that would make me be against landing this change.

zbraniecki · 2020-10-02T18:36:29Z

Ok, I'm fairly certain that this is the case. The path forward is for #227 to be resolved and then all plural rules tests will use the generated data for the 10+/- locales we select for all benchmarks.

For this PR, I suggest @gregtatum that you test against the plural rules subset or, if you want to help clean up benchmarks, take on the PR to move parser/total to parse conditions for 10 locales that are used everywhere else. This will give us more consistent results.

zbraniecki · 2020-10-03T17:26:50Z

My PR in #295 is coalescing benchmarks to be more reliable by testing against 10 locales, instead of just one. Shane in #296 introduces "testdata" which will give us a single source of those locales and data to be used in all components tests/examples/benchmarks.

This should help us reason about this PR, so I'd suggest now to wait for those two to land.

gregtatum · 2020-10-05T13:27:23Z

It seems to reproduce on my machine (macOS).

plurals/parser/parse    time:   [1.5964 us 1.6061 us 1.6164 us]
                        change: [-59.684% -59.233% -58.768%] (p = 0.00 < 0.05)
                        Performance has improved.

plurals/convert+select/fs
                        time:   [1.4706 us 1.4817 us 1.4945 us]
                        change: [+7.0000% +8.2399% +9.8204%] (p = 0.00 < 0.05)
                        Performance has regressed.

plurals/select/fs       time:   [1.4470 us 1.4546 us 1.4636 us]
                        change: [+6.6509% +8.0535% +9.3887%] (p = 0.00 < 0.05)
                        Performance has regressed.

plurals/construct/fs/10 time:   [379.03 us 381.91 us 385.46 us]
                        change: [+1.8176% +2.8942% +4.1765%] (p = 0.00 < 0.05)
                        Performance has regressed.

gregtatum · 2020-10-05T13:51:54Z

Is PR #295 the one you were suggesting I look at, @zbraniecki ? I'd be happy to follow-up with some more work here, but my cadence will be a bit slower, as I'm doing this on the side. Or were you suggesting a separate issue #?

zbraniecki · 2020-10-05T17:09:24Z

Is PR #295 the one you were suggesting I look at, @zbraniecki ? I'd be happy to follow-up with some more work here, but my cadence will be a bit slower, as I'm doing this on the side. Or were you suggesting a separate issue #?

No worries. Yeah, #295 is what I had in mind. Let's wait for it to land and then get back to this PR and decide on it!
In the meantime, if you'll have spare cycles and want to grab any of those https://github.com/unicode-org/icu4x/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22 , let us know!

zbraniecki · 2020-10-05T22:09:55Z

Can you rebase this PR on top of master?

jira-pull-request-webhook · 2020-10-07T15:42:15Z

Notice: the branch changed across the force-push!

components/pluralrules/Cargo.toml is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

gregtatum · 2020-10-07T15:48:06Z

Can you rebase this PR on top of master?

Sure thing.

sffc · 2020-11-19T18:11:03Z

Depends on #140
Depends on #107

zbraniecki · 2021-01-07T18:23:55Z

This may also be a solution once it gets merged? rust-lang/rfcs#2990

This will help avoid heap allocations as most lists are only a few items long. The only boxed slice that was retained was the SampleRanges, as this can be of an indeterminate length.

jira-pull-request-webhook · 2021-09-01T15:54:38Z

Notice: the branch changed across the force-push!

Cargo.lock is now changed in the branch
components/pluralrules/Cargo.toml is no longer changed in the branch
components/pluralrules/src/rules/ast.rs is no longer changed in the branch
components/pluralrules/src/rules/mod.rs is no longer changed in the branch
components/pluralrules/src/rules/parser.rs is no longer changed in the branch
components/plurals/Cargo.toml is now changed in the branch
components/plurals/src/rules/ast.rs is now changed in the branch
components/plurals/src/rules/mod.rs is now changed in the branch
components/plurals/src/rules/parser.rs is now changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

sffc · 2021-11-22T20:12:48Z

Now that #1240 has landed, should we close this PR and the corresponding ticket as obsolete?

sffc

(setting review bit)

CLAassistant · 2022-09-01T06:18:21Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

sffc · 2022-11-10T18:27:12Z

This is obsolete because now we use VarZeroVec for runtime.

gregtatum requested review from sffc and zbraniecki as code owners October 1, 2020 20:06

gregtatum commented Oct 1, 2020

View reviewed changes

gregtatum force-pushed the pluralrules-smallvec branch from 0052021 to 770c3d0 Compare October 7, 2020 15:42

sffc marked this pull request as draft October 9, 2020 17:26

sffc added the waiting-on-dependency label Nov 19, 2020

Base automatically changed from master to main March 4, 2021 18:40

sffc added blocked A dependency must be resolved before this is actionable and removed waiting-on-dependency labels Apr 3, 2021

sffc added waiting-on-author PRs waiting for action from the author for >7 days and removed blocked A dependency must be resolved before this is actionable labels Jul 29, 2021

Favor SmallVec over Box in pluralrules

841c72d

This will help avoid heap allocations as most lists are only a few items long. The only boxed slice that was retained was the SampleRanges, as this can be of an indeterminate length.

gregtatum dismissed a stale review via 841c72d September 1, 2021 15:54

gregtatum force-pushed the pluralrules-smallvec branch from 770c3d0 to 841c72d Compare September 1, 2021 15:54

sffc reviewed Nov 30, 2021

View reviewed changes

sffc closed this Nov 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Favor SmallVec over Box in pluralrules #285

Favor SmallVec over Box in pluralrules #285

gregtatum commented Oct 1, 2020 •

edited by sffc

Loading

gregtatum Oct 1, 2020

coveralls commented Oct 1, 2020 •

edited

Loading

gregtatum commented Oct 1, 2020

sffc commented Oct 1, 2020 •

edited

Loading

zbraniecki commented Oct 1, 2020

zbraniecki commented Oct 1, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 3, 2020

gregtatum commented Oct 5, 2020

gregtatum commented Oct 5, 2020

zbraniecki commented Oct 5, 2020

zbraniecki commented Oct 5, 2020

jira-pull-request-webhook bot commented Oct 7, 2020

gregtatum commented Oct 7, 2020

sffc commented Nov 19, 2020

zbraniecki commented Jan 7, 2021

jira-pull-request-webhook bot commented Sep 1, 2021

sffc commented Nov 22, 2021

sffc left a comment

CLAassistant commented Sep 1, 2022

sffc commented Nov 10, 2022

Favor SmallVec over Box in pluralrules #285

Favor SmallVec over Box in pluralrules #285

Conversation

gregtatum commented Oct 1, 2020 • edited by sffc Loading

gregtatum Oct 1, 2020

Choose a reason for hiding this comment

coveralls commented Oct 1, 2020 • edited Loading

Pull Request Test Coverage Report for Build 073a029aff20c87865f0223c207d4a4e19125dac-PR-285

💛 - Coveralls

gregtatum commented Oct 1, 2020

sffc commented Oct 1, 2020 • edited Loading

zbraniecki commented Oct 1, 2020

zbraniecki commented Oct 1, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 2, 2020

zbraniecki commented Oct 3, 2020

gregtatum commented Oct 5, 2020

gregtatum commented Oct 5, 2020

zbraniecki commented Oct 5, 2020

zbraniecki commented Oct 5, 2020

jira-pull-request-webhook bot commented Oct 7, 2020

gregtatum commented Oct 7, 2020

sffc commented Nov 19, 2020

zbraniecki commented Jan 7, 2021

jira-pull-request-webhook bot commented Sep 1, 2021

sffc commented Nov 22, 2021

sffc left a comment

Choose a reason for hiding this comment

CLAassistant commented Sep 1, 2022

sffc commented Nov 10, 2022

gregtatum commented Oct 1, 2020 •

edited by sffc

Loading

coveralls commented Oct 1, 2020 •

edited

Loading

sffc commented Oct 1, 2020 •

edited

Loading