Switch PluralRules to use FixedDecimal #190

zbraniecki · 2020-07-31T20:00:29Z

At landing time, PluralRules creates operands from primitive types. We'd like to switch that to support FixedDecimal.

zbraniecki · 2020-08-12T19:12:04Z

@sffc how do you envision this working?

FixedDecimal accepts u*/i* and PluralRules accepts Into<PluralOperands>, should we have Into<PluralOperands> for FixedDecimal?

sffc · 2020-08-12T20:21:09Z

I envision an impl From<&FixedDecimal> for PluralOperands, because there's no reason that the FixedDecimal needs to be consumed.

To implement the conversion, you'd use digit_at(x) for x from 0 to the most significant digit to construct i; digit_at(x) for x from the least significant digit to -1 to construct f; and so forth.
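
A minimal sketch of that conversion, assuming a stand-in type rather than the real FixedDecimal. The names `Digits`, `upper`, `lower`, `Operands`, and `to_operands` are hypothetical and exist only for illustration; only the idea of a `digit_at` lookup by magnitude comes from the comment above. Overflow of the `u64` accumulators is not handled here.

```rust
/// Hypothetical stand-in for a decimal stored as digits by magnitude.
/// `digits[0]` is the digit at magnitude `upper`; the last one is at `lower`.
struct Digits {
    digits: Vec<u8>,
    upper: i16, // magnitude of the most significant digit
    lower: i16, // magnitude of the least significant digit
}

impl Digits {
    /// Digit at the given power of 10; zero outside the stored range.
    fn digit_at(&self, magnitude: i16) -> u8 {
        if magnitude > self.upper || magnitude < self.lower {
            0
        } else {
            self.digits[(self.upper - magnitude) as usize]
        }
    }
}

/// Hypothetical subset of the plural operands: integer digits (i),
/// visible fraction digits (f), and the count of fraction digits (v).
#[derive(Debug, PartialEq)]
struct Operands {
    i: u64,
    f: u64,
    v: usize,
}

fn to_operands(d: &Digits) -> Operands {
    // i: digits from the most significant magnitude down to 0.
    let mut i = 0u64;
    for m in (0..=d.upper.max(0)).rev() {
        i = i * 10 + u64::from(d.digit_at(m));
    }
    // f: digits from magnitude -1 down to the least significant magnitude.
    let mut f = 0u64;
    let mut v = 0usize;
    for m in (d.lower.min(0)..0).rev() {
        f = f * 10 + u64::from(d.digit_at(m));
        v += 1;
    }
    Operands { i, f, v }
}
```

With this sketch, 25 (digits 2, 5 at magnitudes 1 to 0) yields v = 0, while 25.0 (digits 2, 5, 0 at magnitudes 1 to -1) yields v = 1, so the two values stay distinguishable after conversion.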

filmil · 2020-09-28T19:41:59Z

Is there a reason to remove support for From<primitive_type>?

zbraniecki · 2020-09-28T19:42:55Z

I don't think so.

zbraniecki · 2020-09-28T19:43:40Z

We can compare performance, and if primitive -> FixedDecimal -> PluralOperands does not cost much more than primitive -> PluralOperands, then we can remove it. For now, I'd keep it.

sffc · 2020-09-28T19:45:56Z

It might be nice to remove FromStr for PluralOperands, since we are developing a more thoroughly tested code path for that feature in #270, but I think it's harmless to keep From<int-like>.

filmil · 2020-09-28T19:48:11Z

Thanks for explaining. Another question: is there a reason that FixedDecimal does not have a way to extract the integer part and the fractional part, and instead only allows digit-wise extraction?

sffc · 2020-09-28T19:50:37Z

This ticket is all about adding a way to convert the FixedDecimal to PluralOperands, which implies separating the integer and fraction part. FixedDecimal is a super simple representation that maps from magnitudes to digits. See #190 (comment):

> To implement the conversion, you'd use digit_at(x) for x from 0 to the most significant digit to construct i; digit_at(x) for x from the least significant digit to -1 to construct f; and so forth.

filmil · 2020-09-28T20:09:59Z

> This ticket is all about adding a way to convert the FixedDecimal to PluralOperands, which implies separating the integer and fraction part. FixedDecimal is a super simple representation that maps from magnitudes to digits. See #190 (comment):

Understood, but that is not quite the answer to my question. I was interested as to why FixedDecimal doesn't have functions, say, named integer_part and fractional_part. Is there something specific about the intended use of FixedDecimal that makes this undesirable?

This will allow us to convert FixedDecimal representation without loss of precision into PluralOperands, for correct plural selection. For example, a number `25` may sometimes be pluralized differently from `25.0`. Added the benchmarks for the conversion, though without a baseline it is not just as useful yet. (Until we try to reimplement.) See issue unicode-org#190.

sffc · 2020-09-29T00:04:35Z

> I was interested as to why FixedDecimal doesn't have functions, say, named integer_part and fractional_part. Is there something specific about the intended use of FixedDecimal that makes this undesirable?

What type do you envision that those functions would return?

Another FixedDecimal? What scale would you set on the FixedDecimal? Why would you not just use the first FixedDecimal?
An integer? What if the number is too big to fit in an integer?

filmil · 2020-09-29T00:36:07Z

> Another FixedDecimal? What scale would you set on the FixedDecimal? Why would you not just use the first FixedDecimal?

I would use a compatible scale. It is possible to set the internal representation of a FixedDecimal such that the fractional part is zero. Move other internal representation bits to match.

I wanted to use this modified FixedDecimal because I'd be able to use the internal structures of FixedDecimal to compute the integer and fractional parts, faster than relying on digit_at to iterate over specific digits.

> An integer? What if the number is too big to fit in an integer?

We could signal an overflow. That said, for purposes of plural rules matching, most of the time only significant digits matter, and the exponents matter less so. It would be a different story for spellout, for example.

Either way, overflow is a realistic possibility for FixedDecimal even today, given that digits are stored in an 8-byte array.
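
One way to signal such an overflow, sketched here under assumptions: the helper name `digits_to_u64` is hypothetical and is not part of FixedDecimal or PluralOperands. It accumulates digits with checked arithmetic and returns `None` when the value does not fit in a `u64`.

```rust
/// Accumulates decimal digits (most significant first) into a u64.
/// Returns None if the value does not fit, instead of wrapping or panicking.
fn digits_to_u64(digits: &[u8]) -> Option<u64> {
    let mut result = 0u64;
    for &d in digits {
        // Both steps can overflow, so both are checked.
        result = result.checked_mul(10)?.checked_add(u64::from(d))?;
    }
    Some(result)
}
```

A caller could then decide whether to report the overflow or fall back to a truncated value, depending on how much of the number the plural rules actually need.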

filmil · 2020-09-29T00:37:48Z

> I would use a compatible scale. It is possible to set the internal representation of a FixedDecimal such that the fractional part is zero. Move other internal representation bits to match.

That said, let me check in a benchmark first, then I can do an experiment to figure out if this even makes sense.

Top level info that I'd appreciate is whether FixedDecimal is required to have a specific API, or if you'd be supportive of adding more methods to it.

sffc · 2020-09-29T00:46:01Z

> I wanted to use this modified FixedDecimal because I'd be able to use the internal structures of FixedDecimal to compute the integer and fractional parts, faster than relying on digit_at to iterate over specific digits.

digit_at is intended to be very fast. It is the operation that FixedDecimal is designed to be very good at doing. I think the function is like 2 or 3 lines of code. If we were to write some other helper function, it would use digit_at under the hood.

> Top level info that I'd appreciate is whether FixedDecimal is required to have a specific API, or if you'd be supportive of adding more methods to it.

As usual, new APIs should solve problems. I don't see what problem integer_part and fractional_part are solving, other than being more convenient for the conversion to PluralOperands. I want to push back on efforts to make FixedDecimal more than what it is: a fast, low-level representation of decimal digits at a range of magnitudes (powers of 10).

filmil · 2020-09-29T00:51:57Z

> than being more convenient for the conversion to PluralOperands. I want to push back on efforts to make FixedDecimal more than what it is: a fast, low-level representation of decimal digits at a range of magnitudes (powers of 10).

If you don't want to widen the API on FixedDecimal, that is OK and that answers my question.

sffc · 2020-09-29T01:02:30Z

There's really only one algorithm you need, which is:

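let mut result: u64 = 0;
// Walk from the most significant magnitude (high) down to the least (low).
for i in (low..=high).rev() {
    // Shift the accumulated value one decimal place and append the next digit.
    result = result * 10 + u64::from(fixed_decimal.digit_at(i));
}
result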

My mental model has been that this pattern is short and simple enough to put at the call site. It doesn't need its own method in FixedDecimal.

* Implements From<FixedDecimal> for PluralOperands
  This will allow us to convert FixedDecimal representation without loss of precision into PluralOperands, for correct plural selection. For example, a number `25` may sometimes be pluralized differently from `25.0`. Added the benchmarks for the conversion, though without a baseline it is not just as useful yet. (Until we try to reimplement.) See issue #190.
* Rewrite eq() to branchless
* Adds a test for eq()
  We needed to add a custom implementation for eq() to account for loss of precision in PluralOperands.
* fixup:first
* Clean up the code for From
* Adds more illustrative naming in From
* Makes clippy happy
* Pulls num_fractional_digits out of the loop
  It is possible to compute it based on the low end of the magnitude range. No performance change per benchmark.
* fixup: benchmark was wrong
* fixup: moves new operand tests to json
  Defines new tests for the data model for conversion, and places them into the JSON files instead of inline with the tests.
* fixup: moves the benchmark loop in
  This allows criterion to run the benchmark loop with a specific time limit.
* fixup: formatting
* fixup: adds individual sample measurements
  Helps smoke out specific performance regressions.
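
The note about pulling num_fractional_digits out of the loop can be illustrated with a small sketch. This is an assumption about the idea, not the code from the commit: the function signature and the `lowest_magnitude` parameter are hypothetical. If the least significant stored digit sits at a negative magnitude, the count of fraction digits is just the absolute value of that magnitude, so no per-digit counting is needed.

```rust
/// Number of visible fraction digits, derived from the lowest magnitude alone.
/// A lowest magnitude of -2 means two digits after the decimal point.
fn num_fractional_digits(lowest_magnitude: i16) -> usize {
    if lowest_magnitude < 0 {
        usize::from(lowest_magnitude.unsigned_abs())
    } else {
        // No digits below magnitude 0, so there is no fraction part.
        0
    }
}
```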

zbraniecki · 2020-10-05T23:43:10Z

Fixed by #278.

sffc · 2020-10-20T23:54:15Z

I'm having trouble linking #278 to this issue. Can someone else try?

zbraniecki · 2020-10-21T01:01:57Z

I also cannot

zbraniecki added the T-core (Type: Required functionality) and C-pluralrules (Component: Plural rules) labels Jul 31, 2020

sffc self-assigned this Aug 14, 2020

sffc mentioned this issue Sep 11, 2020

ICU4X 0.1 #204

Closed

sffc added this to the ICU4X 0.1 milestone Sep 11, 2020

sffc assigned filmil and unassigned sffc Sep 28, 2020

filmil mentioned this issue Sep 28, 2020

Implements From<FixedDecimal> for PluralOperands #278

Merged

zbraniecki closed this as completed Oct 5, 2020
