Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Baked data for icu_collator #3572

Merged
merged 1 commit into from
Jun 28, 2023
Merged

Conversation

robertbastian
Copy link
Member

@robertbastian robertbastian commented Jun 22, 2023

Manishearth
Manishearth previously approved these changes Jun 22, 2023
Comment on lines +76 to +81
DataPayload::from_static_ref(
icu_normalizer::provider::Baked::SINGLETON_NORMALIZER_NFD_V1,
),
DataPayload::from_static_ref(
icu_normalizer::provider::Baked::SINGLETON_NORMALIZER_NFDEX_V1,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: For nested constructors I think we said that we wanted to call the globaldata constructor from the other crate rather than linking the data from that crate into this crate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there's a public API we can go, at the moment this crate directly uses the DataPayloads.

I think what we decided was that if a crate calls a constructor from another crate, the data constructor should call other data constructors directly. This is not quite what's happening here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still a bit uneasy; should we refactor the code to hold a Normalizer instead of the payloads, or perhaps create a Normalizer than then use a doc(hidden) API to extract the payloads from it? Let's discuss

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion:

  • If they are in different codegen units, it's possible that the data may be anchored in two different places (duplicated)
  • But LTO or codegen-units=1 should fix it in most cases
  • Most of the call sites are const constructors which have the same problem, but it will show up in client code rather than ICU4X code

Conclusion:

  • Land this as-is and fix it later if we find it is an actual problem

@sffc sffc added the discuss Discuss at a future ICU4X-SC meeting label Jun 27, 2023
@sffc
Copy link
Member

sffc commented Jun 27, 2023

Discuss with:

Optional:

@robertbastian robertbastian merged commit 38391cb into unicode-org:main Jun 28, 2023
@robertbastian robertbastian deleted the collator branch June 29, 2023 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Discuss at a future ICU4X-SC meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants