Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements From<FixedDecimal> for PluralOperands #278

Merged
merged 13 commits into from
Oct 1, 2020

Conversation

filmil
Copy link
Contributor

@filmil filmil commented Sep 28, 2020

This will allow us to convert FixedDecimal representation without loss
of precision into PluralOperands, for correct plural selection.

For example, a number 25 may sometimes be pluralized differently from
25.0.

Added the benchmarks for the conversion, though without a baseline it is
not just as useful yet. (Until we try to reimplement.)

See issue #190.

@coveralls
Copy link

coveralls commented Sep 28, 2020

Pull Request Test Coverage Report for Build c8e4619fedf99f4342f7b2c92ac2b050c4222b81-PR-278

  • 38 of 38 (100.0%) changed or added relevant lines in 1 file are covered.
  • 6 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+1.02%) to 78.555%

Files with Coverage Reduction New Missed Lines %
components/fs-data-provider/src/fs_data_provider.rs 1 54.05%
components/pluralrules/src/operands.rs 2 88.28%
components/data-provider/src/data_key.rs 3 84.81%
Totals Coverage Status
Change from base Build e6579f8c6342fe323641ffb665d3829c0f581d2a: 1.02%
Covered Lines: 2652
Relevant Lines: 3376

💛 - Coveralls

@filmil filmil force-pushed the fixed-decimal-to-plural-operands branch from 2046255 to 555a2cd Compare September 29, 2020 00:40
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/pluralrules/tests/operands.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

components/pluralrules/src/operands.rs Outdated Show resolved Hide resolved
components/pluralrules/src/operands.rs Outdated Show resolved Hide resolved
components/pluralrules/src/operands.rs Show resolved Hide resolved
@filmil filmil marked this pull request as ready for review September 29, 2020 01:12
@filmil filmil requested a review from sffc September 29, 2020 01:13
@filmil filmil self-assigned this Sep 29, 2020
@filmil
Copy link
Contributor Author

filmil commented Sep 29, 2020

@zbraniecki Should we also see a benchmark report on the pull request?

components/pluralrules/src/operands.rs Outdated Show resolved Hide resolved
components/pluralrules/tests/operands.rs Outdated Show resolved Hide resolved
components/pluralrules/tests/operands.rs Outdated Show resolved Hide resolved
@sffc
Copy link
Member

sffc commented Oct 1, 2020

Now that #270 is in, can you update your branch and benchmark FixedDecimal::from_str to power PluralOperands? I'd really like to get rid of the redundant string parsing code in PluralOperands.

This will allow us to convert FixedDecimal representation without loss
of precision into PluralOperands, for correct plural selection.

For example, a number `25` may sometimes be pluralized differently from
`25.0`.

Added the benchmarks for the conversion, though without a baseline it is
not just as useful yet.  (Until we try to reimplement.)

See issue unicode-org#190.
We needed to add a custom implementation for eq() to account
for loss of precision in PluralOperands.
It is possible to compute it based on the low end of the magnitude
range.  No performance change per benchmark.
Defines new tests for the data model for conversion, and places them
into the JSON files instead of inline with the tests.
This allows criterion to run the benchmark loop
with a specific time limit.
@filmil
Copy link
Contributor Author

filmil commented Oct 1, 2020

Now that #270 is in, can you update your branch and benchmark FixedDecimal::from_str to power PluralOperands? I'd really like to get rid of the redundant string parsing code in PluralOperands.

Yes, but having done that, the benchmark really isn't happy...

operands/create/string  time:   [2.0088 us 2.0092 us 2.0097 us]                                        
                        change: [+220.75% +221.44% +222.11%] (p = 0.00 < 0.05)
                        Performance has regressed.                                                       
Found 9 outliers among 100 measurements (9.00%)  
  2 (2.00%) low mild                                
  3 (3.00%) high mild                               
  4 (4.00%) high severe             

@filmil filmil force-pushed the fixed-decimal-to-plural-operands branch from 6f4f409 to d0ecb8f Compare October 1, 2020 07:42
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/pluralrules/src/operands.rs is different
  • components/pluralrules/tests/fixtures/mod.rs is different
  • components/pluralrules/tests/fixtures/operands.json is different
  • components/pluralrules/tests/operands.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Helps smoke out specific performance regressions.
@filmil
Copy link
Contributor Author

filmil commented Oct 1, 2020

Yes, but having done that, the benchmark really isn't happy...

I poked at the benchmarks a bit, and added a few tests that tease out individual components of the parsing process. And the results are showing a drastic (~+100%) time increase.

From what I can see, the main reason for the slowdown is using impl <&FixedDecimal for PluralOperands to convert between the two, adding 15 to 30 ns of overhead for type conversion on top of what we already spent parsing FixedDecimal from string, roughly doubling the time needed to parse. The two parsing approaches are a bit different because FixedDecimal could in principle represent numbers that are out of range of PluralOperands, so it is not surprising that adapting one to the other is costly.

We could play some code golf and factor out a parser that builds a shared parsing state that is useful for both FixedDecimal and PluralOperands. Then we will likely get the best of both worlds. That said, perhaps even better would be not to micro-optimize without a concrete bottleneck.

See benchmark results below to corroborate.

Parsing FixedDecimal from string "0012.3400" takes ~34ns.
(The change shown below is likely measurement noise likely due to one of hundreds of processes that run on my machine and/or hyperthreading).

     Running /usr/local/google/home/fmil/code/icu4x/target/release/deps/fixed_decimal-1bde2d62558b6f97   
from_string/0012.3400   time:   [34.022 ns 34.460 ns 35.094 ns]                                          
                        change: [-9.4233% -6.2430% -2.9040%] (p = 0.00 < 0.05)                           
                        Performance has improved.                                                        
Found 13 outliers among 100 measurements (13.00%)                                                        
  3 (3.00%) high mild                                                                                    
  10 (10.00%) high severe     

New parsing of PluralOperands from 0012.3400, if implemented via FixedDecimal, takes 65ns. It takes ~half that in the base case (own parsing).

perands/create/string/samples/0012.3400                                                                                                                                                                            
                        time:   [65.286 ns 65.319 ns 65.354 ns]                                                                                                                                                    
                        change: [+91.459% +91.925% +92.398%] (p = 0.00 < 0.05)                                                                                                                                     
                        Performance has regressed.                                                       
Found 3 outliers among 100 measurements (3.00%)                                                          
  1 (1.00%) low mild                                                                                                                                                                                               
  2 (2.00%) high mild               

Converting FixedDecimal to PluralOperands takes anywhere between 15 and 30 ns, depending on the number of digits.

operands/create/from_fixed_decimal/samples/FixedDecimal { digits: [1], magnitude: 0, upper_magnitude...                                                                             
                        time:   [14.426 ns 14.434 ns 14.444 ns]
                        change: [+1.7178% +1.7976% +1.8789%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
operands/create/from_fixed_decimal/samples/FixedDecimal { digits: [1, 2, 3, 4, 5], magnitude: 1, upp...                                                                             
                        time:   [30.027 ns 30.048 ns 30.069 ns]
                        change: [+3.8544% +4.0475% +4.2414%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  4 (4.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe
operands/create/from_fixed_decimal/samples/FixedDecimal { digits: [2, 5], magnitude: 1, upper_magnit...                                                                             
                        time:   [23.208 ns 23.227 ns 23.248 ns]
                        change: [-0.4926% -0.0794% +0.3147%] (p = 0.71 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) low mild
  3 (3.00%) high mild

@filmil
Copy link
Contributor Author

filmil commented Oct 1, 2020

After discussing on this weeks icu4x meeting, we decided to remove the change that unifies the parsing from string for PluralOperands and FixedDecimal to avoid the performance drop. We will keep the benchmarks around, so when/if we re-attempt the parsing code we have code that will directly measure the transition.

@filmil filmil force-pushed the fixed-decimal-to-plural-operands branch from d0ecb8f to 98d72b4 Compare October 1, 2020 17:26
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • components/pluralrules/benches/operands.rs is different
  • components/pluralrules/src/operands.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some ideas on how to speed up the conversion function, but let's check this in first so we can get the benchmarks and tests. I'll test some of my ideas then.

@filmil filmil merged commit bd4d2a5 into unicode-org:master Oct 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants