Merge PluralRules into NumberFormat (formatSelect) #397

Open
sffc opened this issue Jan 10, 2020 · 16 comments
Labels
c: numbers Component: numbers, currency, units s: comment Status: more info is needed to move forward

Comments

@sffc
Contributor

sffc commented Jan 10, 2020

Time and time again, programmers are confused about how to use Intl.PluralRules, especially in ways that relate to rendered digits, like how to take the plural form of 1 versus 1.00 versus 1K.

In the ICU implementation, to solve this problem, we allow users to pass a FormattedNumber, the output of NumberFormatter, into PluralRules.

Here's a draft of how this could look in ECMAScript:

const fmt = new Intl.NumberFormat("fr-FR", {
    notation: "compact"
});
const { string, pluralForm } = fmt.formatSelect(2.5e6);
console.log(string, pluralForm);
// "2,5 M" many

Intl.PluralRules would still be useful for the case where you don't care about the rendered output, but the new API on Intl.NumberFormat would help clarify how to get the effective plural form for a formatted number.

The new APIs:

  • formatSelect returns { string, pluralForm }
  • formatToPartsSelect (or formatSelectToParts) returns { parts, pluralForm }
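The two return shapes listed above can be approximated in userland with the existing classes. The following is a minimal sketch, not the proposed implementation: `pluralRulesFor`, `formatSelect`, and `formatToPartsSelect` are hypothetical standalone helpers (not methods on Intl.NumberFormat), and the sketch assumes that passing `nf.resolvedOptions()` to Intl.PluralRules carries over the digit settings that matter for plural selection.

```javascript
// Hypothetical helper: build plural rules that share the locale and
// digit options of an existing NumberFormat instance.
function pluralRulesFor(nf) {
  const options = nf.resolvedOptions();
  return new Intl.PluralRules(options.locale, options);
}

// Hypothetical stand-in for the proposed formatSelect.
function formatSelect(nf, value) {
  return {
    string: nf.format(value),
    pluralForm: pluralRulesFor(nf).select(value),
  };
}

// Hypothetical stand-in for the proposed formatToPartsSelect.
function formatToPartsSelect(nf, value) {
  return {
    parts: nf.formatToParts(value),
    pluralForm: pluralRulesFor(nf).select(value),
  };
}

const nf = new Intl.NumberFormat("en-US", {
  minimumFractionDigits: 1,
  maximumFractionDigits: 1,
});
const result = formatSelect(nf, 1);
console.log(result.string, result.pluralForm);
```

Unlike a built-in method, this sketch processes the number twice, so it shows only the shape of the result and not the efficiency gain. It is also not verified here whether `notation: "compact"` is honored by Intl.PluralRules in a given engine.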

Thoughts?

@zbraniecki @echeran @longlho

@zbraniecki
Member

Could we instead offer an ability to pass a NumberFormat instance to PluralRules.select?

const nf = new Intl.NumberFormat("fr-FR", {
    notation: "compact"
});
const pr = new Intl.PluralRules("fr-FR");
pr.select(2.5e6, nf); // select using the number formatted with `nf` options?

@sffc
Contributor Author

sffc commented Jan 10, 2020

The current pattern for doing this is to pass the NumberFormat options into the PluralRules constructor:

const nf = new Intl.NumberFormat("fr-FR", {
    notation: "compact"
});
const pr = new Intl.PluralRules("fr-FR", nf.resolvedOptions());

I was thinking that putting sugar methods on NumberFormat might make it easier to use. It would also make it more efficient because implementation-wise, you only format the number once, and then you compute both the string and the plural form at the same time.

@echeran

echeran commented Jan 10, 2020

In ICU, we first get back a FormattedNumber from NumberFormat as an intermediate output, then we get its string representation and/or use it to select the plural rule. Do we not have that in ES (only have a string output), and thus want to consolidate the cognitive overhead of the APIs?

If so, I think the idea makes sense. We don't seem to really create custom plural rules -- we take whatever comes by default from CLDR, which means selecting a plural rule has the same input data as what it takes to format a number. And I assume that this proposal just solves the case where you want both; otherwise, you can reuse existing APIs.

On closer look at the current way to create plural rules, it does seem a little wonky when compared to the ICU way of doing things. But I think that matters right now to the extent that we have large use cases of plural rules selection only (w/o formatting) vs. formatting (w/o plural rules selection) or both.

@sffc sffc added c: numbers Component: numbers, currency, units s: discuss Status: TG2 must discuss to move forward labels Jan 10, 2020
@rxaviers
Member

Clarifying the issue for potential readers... There are two problems:

  • In some languages, the plural form may vary depending on whether the number is treated as an integer or a decimal. For example, in Macedonian (mk), "1" takes the one plural form, but "1.0" takes the other plural form. [1]
  • In all languages, the formatted/displayed number may differ from the actual number (because of rounding and notation), and therefore the plural form may change. For example, suppose the actual number is 1.0005 (other plural form in English) but the formatted number is "1" because of the fraction digits options in use (one plural form in English). An example about compact notation: suppose the actual number is 1000 and the displayed number is 1K. @sffc, here I can't think of an example where the plural form for both would be different (I am thinking of "1000 likes" vs "1K likes"). Do you have a handy example we can use to illustrate? Thanks

1:

new Intl.PluralRules("mk").select(1)
// > "one"
new Intl.PluralRules("mk", {minimumFractionDigits: 1}).select(1)
// > "other"
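
The second problem (rounding changes the plural form) can be shown the same way. This is a small sketch using 1.2 rather than 1.0005, an illustrative value chosen so that the result does not depend on how a borderline value rounds. The variable names are illustrative only.

```javascript
const nf = new Intl.NumberFormat("en-US", { maximumFractionDigits: 0 });
const unrounded = new Intl.PluralRules("en-US");
const rounded = new Intl.PluralRules("en-US", { maximumFractionDigits: 0 });

console.log(nf.format(1.2));        // displayed as "1"
console.log(unrounded.select(1.2)); // "other": selection sees 1.2
console.log(rounded.select(1.2));   // "one": selection sees the rounded "1"
```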

@sffc
Contributor Author

sffc commented Jan 10, 2020

2.3e6 in fr-FR: "2 300 000 vues" (plural form "other")

But when compact notation is used: "2,3 millions de vues" (plural form "many")
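
A short sketch for reproducing the two renderings above. The exact strings (spacing characters, wording of the compact form) come from the locale data shipped with the engine, so no output is asserted here; `compactDisplay: "long"` is an assumption made to match the "millions" wording in this comment.

```javascript
const views = 2.3e6;

const standard = new Intl.NumberFormat("fr-FR").format(views);
const compact = new Intl.NumberFormat("fr-FR", {
  notation: "compact",
  compactDisplay: "long",
}).format(views);

// Plural category of the number as rendered without compact notation.
const standardCategory = new Intl.PluralRules("fr-FR").select(views);

console.log(standard, standardCategory);
console.log(compact);
```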

@longlho
Collaborator

longlho commented Jan 11, 2020

I'm trying to figure out the use case for this and so far off the top of my head it'd be useful for debugging. What are your anticipated use cases?

I think right now the confusion, at least for me, primarily comes from implicit fraction/significant digits resolution within NumberFormat, e.g ILD currency digits info that changes the default fraction digits.

The other thing to consider is plural within ICU MessageFormat as well, e.g

{count, plural, one{# book} other{# books}}

With this API, it seems the signal is to use NumberFormat.formatSelect to be consistent with the rendered output in #. But then if we have

{count, plural, one{book} other{books}}

(no #, so no rendered number), then what should we do in that scenario?
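
To make the question concrete, here is a minimal sketch of a plural message handler built on the existing classes. It is not how any MessageFormat library works: `formatPlural` and its parameters are hypothetical, and the variants are passed as a plain object rather than parsed from the `{count, plural, ...}` syntax. The same number options feed both the `#` replacement and the plural selection.

```javascript
// Hypothetical helper: pick a variant by plural category, then replace
// every "#" with the number formatted under the same options.
function formatPlural(locale, count, variants, numberOptions = {}) {
  const nf = new Intl.NumberFormat(locale, numberOptions);
  const pr = new Intl.PluralRules(locale, numberOptions);
  const pattern = variants[pr.select(count)] ?? variants.other;
  // A replacer function avoids special handling of "$" in the formatted number.
  return pattern.replace(/#/g, () => nf.format(count));
}

console.log(formatPlural("en-US", 1, { one: "# book", other: "# books" }));
console.log(formatPlural("en-US", 2, { one: "# book", other: "# books" }));
// Without "#", the selection still depends on the number options passed in.
console.log(
  formatPlural("en-US", 1, { one: "book", other: "books" }, { minimumFractionDigits: 1 })
);
```

In this sketch the pattern without `#` still selects according to whatever options are supplied, which leaves open the question of which options a message without a rendered number should use.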

@sffc
Contributor Author

sffc commented Jan 13, 2020

formatSelect does not add any new functionality; it just makes the existing functionality more discoverable, understandable, and efficient. Use cases are not a consideration.

@sffc
Contributor Author

sffc commented Mar 21, 2020

I plan to address this as part of my new proposal Intl.NumberFormat V3.

https://github.com/sffc/proposal-intl-numberformat-v3

@sffc sffc added s: in progress Status: the issue has an active proposal and removed s: discuss Status: TG2 must discuss to move forward labels Mar 21, 2020
@zbraniecki
Member

@sffc I still don't see any references to libraries or software that would need this feature. It seems quite insufficiently justified so far. Can you provide sources of why and who would need that?

@sffc
Contributor Author

sffc commented Mar 21, 2020

This isn't a feature; it's a refactoring of an existing feature. You can refer back to the PluralRules proposal for the full list of use cases.

In message formatting, you generally want both the number and the plural form of the number. Right now you have to use two different Intl classes, which is unintuitive, clunky, and inefficient. (Do you need justification on those three adjectives?) This proposal means you can get both the formatted number and the plural form in one function call, which I claim is more ergonomic and efficient.

@zbraniecki
Member

Do you need justification on those three adjectives?

I would like to see an example of a library or software where this problem is exemplified.

I am a co-author of a localization system that uses both Intl.PluralRules and Intl.NumberFormat, and I have not observed that problem, nor do I see how it would apply to my system.

Therefore I'm curious what other cases exist which exemplify the problem you're addressing. Saying "very often engineers encounter..." or "time and time again users are confused..." is only valuable if you can point at examples of where they're confused or where they encountered it.

My issue is that I have not seen anything that would validate those claims.

@sffc
Contributor Author

sffc commented Mar 23, 2020

Unintuitive: Previous discussions regarding confusion over Intl.PluralRules behavior: #373, #365, tc39/proposal-unified-intl-numberformat#86. I have also seen users simply unaware that fraction digit settings need to be passed to Intl.PluralRules in order to get correct behavior (which led to issues such as ICU-20617). For example, the following code is incorrect, even in English, but to most non-i18n experts, it looks perfectly plausible:

function howManyStars(locale, count, strings) {
  const nf = new Intl.NumberFormat(locale, {
    minimumFractionDigits: 1,
    maximumFractionDigits: 1,
  });
  const pr = new Intl.PluralRules(locale);
  return `${nf.format(count)} ${strings[pr.select(count)]}`;
}

howManyStars("en-US", 2, { one: "star", other: "stars" });
// Correct: "2.0 stars"

howManyStars("en-US", 1, { one: "star", other: "stars" })
// Incorrect: "1.0 star"

Also, the following doesn't work either, since trailing zeros are lost when .select() converts the string to a number:

const pr = new Intl.PluralRules("en-US");
pr.select("1.0");  // "one", but should be "other"
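
A short sketch of why the string form does not help and what does with the existing API: the argument is coerced to a number before selection, so the visible fraction digit has to be requested through the constructor options instead. The variable names are illustrative.

```javascript
// "1.0" becomes the number 1, so the trailing zero never reaches the rules.
const plain = new Intl.PluralRules("en-US");
const fromString = plain.select("1.0");

// Asking for one fraction digit makes the rules see "1.0".
const oneDigit = new Intl.PluralRules("en-US", { minimumFractionDigits: 1 });
const withDigit = oneDigit.select(1);

console.log(Number("1.0") === 1); // true
console.log(fromString);          // "one"
console.log(withDigit);           // "other"
```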

Clunky and Inefficient: The above function could be re-implemented in a safer, more efficient way by using formatSelect, as follows:

function howManyStars(locale, count, strings) {
  const nf = new Intl.NumberFormat(locale, {
    minimumFractionDigits: 1,
    maximumFractionDigits: 1,
  });
  const result = nf.formatSelect(count);
  return `${result.string} ${strings[result.pluralForm]}`;
}
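
For comparison, the original function can also be corrected with the existing classes by giving both constructors the same digit options. This is a sketch of that workaround, not of formatSelect itself; `digitOptions` is an illustrative name.

```javascript
function howManyStars(locale, count, strings) {
  // Share one options object so formatting and selection agree on digits.
  const digitOptions = {
    minimumFractionDigits: 1,
    maximumFractionDigits: 1,
  };
  const nf = new Intl.NumberFormat(locale, digitOptions);
  const pr = new Intl.PluralRules(locale, digitOptions);
  return `${nf.format(count)} ${strings[pr.select(count)]}`;
}

console.log(howManyStars("en-US", 1, { one: "star", other: "stars" }));
// "1.0 stars"
```

This still requires the caller to know that the options must be kept in sync across two classes, which is the ergonomic problem described above.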

I see the plural form as being fundamentally tied to the formatted string. In my opinion, as an i18n engineer who has worked with clients trying to implement plural selection correctly, the model of plural selection having its own class that neither accepts nor produces a formatted string is simply wrong, and it leads to bugs such as the ones listed above.


All that said, I appreciate the criticism from other i18n experts in this thread. It could be that my mental model of plural selection isn't correct. I am fine pulling formatSelect from my NumberFormat v3 proposal if we don't have consensus on it.

@longlho
Collaborator

longlho commented Mar 24, 2020

I agree w/ @zbraniecki. I'm not sure this is needed as a top-level API; it may be enough to have PluralRules & NumberFormat share more underlying abstract operations.

@zbraniecki
Member

@sffc what would you say to a selectPluralCategory method on NumberFormat instead? This way the only surface increase is that NF may be used to get the plural category, just like PluralRules can be.

@sffc
Contributor Author

sffc commented Apr 2, 2020

Bikeshed:

  1. Intl.NumberFormat.prototype.formatSelect returning { string, pluralForm }
    • Pros: All features in one place; easy to use correctly; works nicely with formatRange
    • Cons: Doubles number of terminal methods, from 4 to 8 (including formatToParts and formatRange); return value should be a value type, but Records are still only Stage 1
  2. Intl.NumberFormat.prototype.selectPluralCategory returning a string pluralForm
    • Pros: Simple, straightforward addition
    • Cons: Two function calls, reducing potential performance benefit of a single call
  3. Intl.NumberFormat.prototype.getPluralRules returning an Intl.PluralRules
    • Pros: Clean separation of functionality; Intl.PluralRules remains a first-class construction
    • Cons: No performance benefit over the status quo
  4. Intl.PluralRules.from taking an Intl.NumberFormat as an argument
    • Pros/Cons: Same as above

@sffc
Contributor Author

sffc commented May 22, 2020

We decided in the 2020-04-23 meeting to table this issue, because none of the proposed options solve the problem completely. We will still require documentation, even if we add new methods. I filed a ticket to follow up on the documentation:

tc39/ecma402-mdn#13

5 participants