Detecting directionality from script type (or from a language associated with a script) #205

Open
brettz9 opened this issue Dec 12, 2017 · 16 comments
Labels: "c: text" (Component: case mapping, collation, properties), "s: in progress" (Status: the issue has an active proposal)

Comments

@brettz9

brettz9 commented Dec 12, 2017

In order to set default text directionality, I think it would be helpful to have a means of mapping scripts (by ISO 15924 code), or even more usefully, languages with a specific "Suppress Script" in the IANA registry (indicating a predominant association of that language with a given script), to the script's directionality (RTL, LTR, Inherited, T-to-B, or Varies). One could then have a standard means (using HTML dir for RTL languages or CSS writing modes for T-to-B ones) of programmatically setting directionality for locales, or for excerpts of content in another language when directionality information is not also present.
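
As a rough illustration of the mapping requested here, the sketch below hard-codes a few ISO 15924 script codes with their usual horizontal direction. The names `SCRIPT_DIRECTION` and `directionForScript` are hypothetical and not part of any Intl API, and the table is a small hand-picked sample rather than a complete data source. Vertical (T-to-B) scripts are left out.

```javascript
// Hypothetical, hand-written sample table (an assumption for illustration).
// A real implementation would need a complete script metadata source.
const SCRIPT_DIRECTION = new Map([
  ['Arab', 'rtl'], // Arabic
  ['Hebr', 'rtl'], // Hebrew
  ['Syrc', 'rtl'], // Syriac
  ['Thaa', 'rtl'], // Thaana
  ['Latn', 'ltr'], // Latin
  ['Cyrl', 'ltr'], // Cyrillic
  ['Grek', 'ltr'], // Greek
  ['Deva', 'ltr'], // Devanagari
]);

// Returns 'rtl' or 'ltr', or undefined when the script is not in the sample table.
function directionForScript(script) {
  return SCRIPT_DIRECTION.get(script);
}
```

A defined result could be assigned to an element's `dir` attribute. An undefined result would leave the inherited direction untouched.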

@zbraniecki
Member

That's issue #46. At Mozilla we landed mozIntl.getLocaleInfo API which has the following signature:

let data = mozIntl.getLocaleInfo('ar');

// data has the shape:
// {
//   locale: 'ar',
//   direction: 'rtl'
// }

and in the future we plan to add more bits to it.

@brettz9
Author

brettz9 commented Dec 12, 2017

Great, thanks! If one just wanted to know the script's directionality, one could just add the 4-letter script code to any language I presume, e.g., en-Arab would give rtl for Arabic's direction (if one didn't happen to know what language that script was associated with), right? Not critical, but I'm just curious if it'd allow detection of any ISO 15924 scripts besides the two-letter language (i.e., not writing system) codes.
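
For reference, the `Intl.Locale` API that was standardized after this comment can at least read the script subtag back out of a tag such as en-Arab. Whether a given direction lookup honors that subtag is a separate question, and this sketch makes no claim about it. The variable names are illustrative only.

```javascript
// Intl.Locale parses BCP 47 tags and exposes the script subtag when one is present.
const explicit = new Intl.Locale('en-Arab');
const bare = new Intl.Locale('ar');

const explicitScript = explicit.script; // 'Arab'
const bareScript = bare.script;         // undefined: the tag carries no script subtag
```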

@brettz9
Author

brettz9 commented Dec 13, 2017

Btw, I don't see mozIntl in Nightly (desktop)--is it behind a flag?

@littledan
Member

I see a bunch of things we can do here:

  1. Expose a mapping from scripts to direction (as @brettz9 is asking for)
  2. Expose a direct mapping from locales to directions (as @zbraniecki is suggesting)
  3. Expose likely subtags for locales, to get the script out, so it's possible to go from locale to script using 1.
  4. Expose the directionality properties of individual code points used by the BiDi algorithm (not sure if anyone wants this).

To get the direction, with an API based on Intl.Locale, with the data coming from the script as in 3, can I suggest something like this?

new Intl.Locale('ar').withLikelySubtags().direction

@brettz9
Author

brettz9 commented Dec 14, 2017

withLikelySubtags would presumably be an array then? The IANA registry's "Suppress Script" field (i.e., indicating that the script should generally not be added with the language code given that the language is generally written in that script as it is) only lists one script (which stands to reason in that context), but I suppose an array sorted by likelihood would be all the merrier if the data is available and that's what you mean.

1 with 3 sounds excellent to me (and if 2 were exposed as well, wfm). As for 4, I don't currently have a need for this myself, but FWIW, it was brought up at #90 (comment).

@littledan
Member

I was imagining that withLikelySubtags would give you a new Intl.Locale instance, based on CLDR's likely subtags data. Unfortunately, though, this just provides one script. Is there a good data source we should look into exposing to get this sort of array?

(To continue the story: That Locale would have a particular direction derived from its script, rather than script being undefined and direction throwing an error because of the missing script).

@brettz9
Author

brettz9 commented Dec 14, 2017

I see.

No, I was just thinking that since you mentioned the API as withLikelySubtags in the plural that you knew of a data source with multiple subtags for each language. I don't know of any source for that. But I guess if it ever turned up and there was desire for it, withPossibleSubtags could presumably be added.

Re: locale having a direction derived from its script, do you mean direction derived from its language when no script is explicit? If so, SGTM...

@littledan
Member

The subtags here are the region and script :/

@brettz9
Author

brettz9 commented Dec 14, 2017

Ahh, gotcha... Ok, no worries--I think that should work... While it's helpful to try to accommodate possible use cases, I think a single script association should cover the most critical cases...

@jungshik

jungshik commented May 7, 2018

> I think a single script association should cover the most critical cases

To use that, the script for a given locale should be obtained in one way or another, shouldn't it? I'm afraid average developers are less familiar with script than locale/language. So, I think what @littledan suggested may work better.

@littledan
Member

Note: we ended up naming the withLikelySubtags method as maximize in tc39/proposal-intl-locale#30
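
A minimal sketch of what the renamed method does, assuming an engine that ships `Intl.Locale` with likely-subtags data: `maximize()` returns a new `Intl.Locale` with the script and region filled in, so the script no longer has to be written explicitly. The variable names are illustrative.

```javascript
// A bare language tag carries no script subtag of its own.
const ar = new Intl.Locale('ar');

// maximize() returns a new Intl.Locale with likely subtags added.
const maximized = ar.maximize();

const scriptBefore = ar.script;       // undefined
const scriptAfter = maximized.script; // 'Arab', from likely-subtags data
```

As discussed above, only a single likely script is produced per locale. No list of alternative scripts is available from this method.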

sffc added the labels "c: locale" (Component: locale identifiers) and "s: in progress" (Status: the issue has an active proposal) on Mar 19, 2019
sffc added the labels "s: discuss" (Status: TG2 must discuss to move forward) and "User Preferences" (Related to user preferences), and removed "s: in progress", on Jun 5, 2020
sffc added this to the ES 2021 milestone on Jun 5, 2020
sffc added the label "c: text" (Component: case mapping, collation, properties) and removed "c: locale" on Jun 5, 2020
@FrankYFTang
Contributor

I intend to address this issue by proposing https://github.com/FrankYFTang/proposal-intl-locale-info/

sffc changed the milestone from ES 2021 to ES 2022 on Mar 22, 2021
@ryzokuken
Member

@sffc Now that the Locale info proposal is Stage 3, can we close this? Or perhaps we would want to close on Stage 4? At the very least we should drop the "user preferences" tag I feel.

@FrankYFTang
Contributor

Agree w/ @ryzokuken

sffc removed the label "User Preferences" (Related to user preferences) on Jun 4, 2021
@sffc
Contributor

sffc commented Jun 4, 2021

I dropped the label, but I prefer to keep these issues open until the proposal reaches Stage 4 and is actually merged into the standard.

sffc changed the milestone from ES 2022 to ES 2023 on Jun 1, 2022
sffc added the label "s: in progress" (Status: the issue has an active proposal) and removed "s: discuss" on Sep 18, 2023
@sffc
Contributor

sffc commented May 2, 2024

This is being addressed by the Intl Locale Info Proposal

https://github.com/tc39/proposal-intl-locale-info?tab=readme-ov-file#text-information
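
A hedged sketch of how the proposal's text information might be read today. It assumes an engine that implements either the proposal's `getTextInfo()` method or the earlier `textInfo` accessor that some engines shipped first. The helper name `textDirection` is hypothetical, and it returns undefined where neither form is available.

```javascript
// Hypothetical helper: reads the text direction for a locale tag, if the engine
// exposes the Intl Locale Info proposal in either of its shipped forms.
function textDirection(tag) {
  const locale = new Intl.Locale(tag);
  const info = typeof locale.getTextInfo === 'function'
    ? locale.getTextInfo() // method form from the current proposal text
    : locale.textInfo;     // accessor form from earlier implementations
  return info ? info.direction : undefined;
}

const arabicDirection = textDirection('ar');  // 'rtl' where supported
const englishDirection = textDirection('en'); // 'ltr' where supported
```

Because support still varies across engines, callers would likely want a fallback (for example, leaving `dir` unset) when the result is undefined.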
