Experiment with ICU4X #1180

Closed
jasonwilliams opened this issue Mar 18, 2021 · 5 comments
Labels
enhancement New feature or request Hacktoberfest Hacktoberfest 2021 - https://hacktoberfest.digitalocean.com help wanted Extra attention is needed

Comments

@jasonwilliams
Member

https://github.com/unicode-org/icu4x will be useful for implementing i18n-sensitive operations and future proposals like Temporal.

@jasonwilliams jasonwilliams added Hacktoberfest Hacktoberfest 2021 - https://hacktoberfest.digitalocean.com help wanted Extra attention is needed labels Sep 30, 2021
@jedel1043
Member

Ok, details I found while investigating ICU4X:

  • It requires a DataProvider to perform any operation, and obtaining one is not trivial.
  • A DataProvider can be obtained using the icu4x-datagen crate, but it is not published on crates.io, so we would need to import the repo as a submodule if we want to automate it.
  • We can use a StaticDataProvider to embed a DataProvider in the binary with include_bytes!.
  • We can also obtain the data from http://unicode.org/Public/cldr/ but we would need to code a parser into a BlobSchema for it to be easily embeddable as a StaticDataProvider.
  • We can use a build.rs script to avoid having to do these things by hand.
  • The collator we require is still a WIP; see "Create a Collator component" (unicode-org/icu4x#971).
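
The build.rs idea from the list above could be sketched roughly as follows. This is a minimal, std-only sketch and not ICU4X's or Boa's actual setup: the helper `stage_data_blob` and the file name `icu_data.blob` are hypothetical, and it assumes the data blob has already been generated by the datagen tool.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper for a `build.rs` script: copies an already generated
/// data blob into the build output directory so the crate can embed it.
/// The file name `icu_data.blob` is an assumption made for this sketch.
pub fn stage_data_blob(source: &Path, out_dir: &Path) -> io::Result<PathBuf> {
    let dest = out_dir.join("icu_data.blob");
    fs::copy(source, &dest)?;
    // Ask Cargo to re-run the build script when the source blob changes.
    println!("cargo:rerun-if-changed={}", source.display());
    Ok(dest)
}

// In `build.rs`, `out_dir` would come from the `OUT_DIR` environment variable
// that Cargo sets for build scripts. The crate could then embed the copy with:
//
//     static ICU_DATA: &[u8] =
//         include_bytes!(concat!(env!("OUT_DIR"), "/icu_data.blob"));
```

Staging the file under `OUT_DIR` keeps generated data out of the source tree; how the embedded bytes are then handed to a StaticDataProvider depends on the ICU4X release and is left out here.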

@jasonwilliams
Member Author

CC @sffc

bors bot pushed a commit that referenced this issue Oct 24, 2021

This pull request is related to #1180.

It changes the following:

- Creates the `Intl` global
- Adds the `Intl.getCanonicalLocales` method

At the moment it does not actually use ICU4X behind the scenes; `Intl.getCanonicalLocales` simply acts as if all the locales passed are canonical locales. This will not be the case in the final PR.
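
The placeholder behaviour described above can be illustrated with a small sketch. This is not the code from the pull request: `get_canonical_locales_stub` is a hypothetical name, locale tags are modelled as plain strings, and the duplicate removal mirrors the ECMA-402 CanonicalizeLocaleList step, which the thread does not say the PR implements.

```rust
/// Hypothetical stand-in for a stubbed `Intl.getCanonicalLocales`:
/// every tag is assumed to be canonical already, so nothing is rewritten.
pub fn get_canonical_locales_stub(locales: &[&str]) -> Vec<String> {
    let mut seen: Vec<String> = Vec::new();
    for tag in locales {
        // No canonicalization happens here; the tag is taken as-is.
        // Exact duplicates are dropped (an assumption of this sketch).
        if !seen.iter().any(|s| s.as_str() == *tag) {
            seen.push((*tag).to_string());
        }
    }
    seen
}
```

With such a stub, `"EN-us"` and `"en-US"` stay distinct entries, which is exactly the gap a real canonicalizer backed by ICU4X would close.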


Co-authored-by: RageKnify <RageKnify@gmail.com>
@sffc

sffc commented Feb 15, 2022

Hi there! I just saw this today.

Here are instructions on how to generate the ICU4X data file:

https://crates.io/crates/icu_datagen

Specific replies inline:

  • A DataProvider can be obtained using the icu4x-datagen crate, but it is not published on crates.io, so we would need to import the repo as a submodule if we want to automate it.

It is on crates.io; see link above.

  • We can use a StaticDataProvider to embed a DataProvider in the binary with include_bytes!.

Correct. This is the easiest way to include data.

  • We can also obtain the data from http://unicode.org/Public/cldr/ but we would need to code a parser into a BlobSchema for it to be easily embeddable as a StaticDataProvider.

You should use icu4x-datagen to generate the data. You need the CLDR data available at build time.

  • We can use a build.rs script to avoid having to do these things by hand.

We have an issue to track this: unicode-org/icu4x#1188

@hsivonen has been working on the collator and can share more about the timeline for this feature.

@hsivonen

There's now an ICU4X PR that shows the status of the collator.

@jedel1043
Member

There's now an ICU4X PR that shows the status of the collator.

Nice!
I also saw that you're about to merge a PR with a datagen API for build.rs scripts (unicode-org/icu4x#1819). I'll try to experiment with your branches in the meantime, and hopefully we'll be able to integrate ICU4X in our codebase on your next release!

Projects
Status: Done

4 participants