ICU-22326 Integrate CLDR release-44-beta5 to ICU #2682

pedberg-icu · 2023-10-26T00:56:42Z

Checklist

Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22326
Required: The PR title must be prefixed with a JIRA Issue number.
Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
Required: Each commit message must be prefixed with a JIRA Issue number.
Issue accepted (done by Technical Committee after discussion)
Tests included, if applicable
API docs and/or User Guide docs changed or added, if applicable

Integrate CLDR release-44-beta5 to ICU. The only change on the CLDR side is to add English locale display names for 3 additional languages. However, in the integration process we also pick up changes for several additional CLDR locales (including 3 for which English display names were added):

Locale     Name                 Coverage level
blo        Anii                 Moderate
csw        Swampy Cree          Basic
kxv        Kuvi                 Basic
kxv_Deva   Kuvi (Devanagari)    Basic
kxv_Orya   Kuvi (Odia)          Basic
kxv_Telu   Kuvi (Telugu)        Basic
tok        Toki Pona            Basic
vec        Venetian             Moderate
xnr        Kangri               Basic

The commits are in 4 groups:

1. Binary data, and binary-like source data (No review needed)
2. Data generated or copied from CLDR (Can spot check)
3. Source and test code changes
4. Manually add kxv_Latn* stubs

ALLOW_MULTIPLE_COMMITS=true

richgillam

Spot-checked. LGTM.

markusicu · 2023-10-26T01:52:05Z

icu4c/source/data/curr/kxv.txt

Shouldn't there also be an empty kxv_Latn.txt like we have an empty zh_Hans.txt and an empty sr_Cyrl.txt (in each locale data tree)? There is a kxv_Latn.txt in the main "locales" tree but not in each of the trees.
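A quick way to confirm which trees lack the stub is to scan the data directories for a given locale file. The following is a minimal sketch, not part of the ICU tooling: the class name `StubCheck`, the method `treesMissing`, and the list of tree names are assumptions made for illustration, and the default root path assumes the program is run from the top of an ICU checkout.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class StubCheck {
    // Subdirectories of the ICU4C data directory that hold per-locale .txt files.
    // This list is an assumption for illustration; adjust it to match the checkout.
    static final List<String> TREES =
            List.of("locales", "curr", "lang", "region", "unit", "zone");

    /** Returns the trees under dataRoot that have no <locale>.txt file. */
    static List<String> treesMissing(Path dataRoot, String locale) {
        List<String> missing = new ArrayList<>();
        for (String tree : TREES) {
            Path candidate = dataRoot.resolve(tree).resolve(locale + ".txt");
            if (!Files.exists(candidate)) {
                missing.add(tree);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults are hypothetical; pass the data root and locale ID as arguments.
        Path dataRoot = Path.of(args.length > 0 ? args[0] : "icu4c/source/data");
        String locale = args.length > 1 ? args[1] : "kxv_Latn";
        System.out.println(locale + " missing in: " + treesMissing(dataRoot, locale));
    }
}
```

Running the same check for sr_Cyrl or zh_Hans gives a baseline to compare kxv_Latn against.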

Hmm, checking to see what if anything is different in how we handle kxv_Latn vs e.g. sr_Cyrl

@markusicu @srl295 OK, in the CLDR tool GenerateAliases.java I see this which seems relevant:

static final Set<String> HAS_MULTIPLE_SCRIPTS =
        org.unicode.cldr.util.Builder.with(new HashSet<String>())
                .addAll("ha", "ku", "zh", "sr", "uz", "sh")
                .freeze();

I can update that and re-integrate. But that will add another round of PR approvals etc. Can we approve and merge this PR and then handle that as a separate pass?

No, that is not it, because we have plenty of other multiscript locales for which a stub for the default script is generated in all the subtrees:

az_Latn bs_Latn ff_Latn ks_Arab mni_Beng pa_Guru sat_Olck sd_Arab shi_Tfng su_Latn vai_Vaii yue_Hant

Must be something else.
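The reasoning above can be checked mechanically: take the language part of each locale that already has a default-script stub and test it against the hard-coded set. This is only a sketch using plain JDK collections in place of the CLDR `Builder` utility; the class name `MultiScriptCheck` and its helper methods are hypothetical, and the two lists are copied from the comments above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class MultiScriptCheck {
    // Contents copied from the HAS_MULTIPLE_SCRIPTS set quoted in this thread.
    static final Set<String> HAS_MULTIPLE_SCRIPTS =
            Set.of("ha", "ku", "zh", "sr", "uz", "sh");

    // Locales for which a default-script stub is generated in all subtrees,
    // as listed in this thread.
    static final List<String> EXISTING_STUBS = List.of(
            "az_Latn", "bs_Latn", "ff_Latn", "ks_Arab", "mni_Beng", "pa_Guru",
            "sat_Olck", "sd_Arab", "shi_Tfng", "su_Latn", "vai_Vaii", "yue_Hant");

    /** Returns the language subtag, i.e. everything before the first underscore. */
    static String language(String localeId) {
        int i = localeId.indexOf('_');
        return i < 0 ? localeId : localeId.substring(0, i);
    }

    /** Stubs whose language is NOT in the hard-coded set, so the set cannot explain them. */
    static List<String> stubsNotExplainedBySet() {
        List<String> result = new ArrayList<>();
        for (String id : EXISTING_STUBS) {
            if (!HAS_MULTIPLE_SCRIPTS.contains(language(id))) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Not explained by the set: " + stubsNotExplainedBySet());
        System.out.println("kxv in set: " + HAS_MULTIPLE_SCRIPTS.contains(language("kxv_Latn")));
    }
}
```

None of the twelve listed locales has a language in the set, yet all of them get stubs, so adding "kxv" to the set would not by itself account for the missing kxv_Latn files.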

One option could be to add the stubs for kxv_Latn manually to this PR and then file a separate ticket to figure out why they are not getting generated automatically... (I note that kxv is not in CLDR coverage, nor in languageInfo.xml).

Or just to take this PR without the stubs for kxv_Latn in other trees, and mark that as a known issue...

> One option could be to add the stubs for kxv_Latn manually to this PR and then file a separate ticket to figure out why they are not getting generated automatically...

This seems ok, and better than omitting data files that the code might rely on.

> (I note that kxv is not in CLDR coverage, nor in languageInfo.xml).

Not in coverage? Wasn't the point that these locales had gotten enough coverage to be useful?

> Not in coverage?

Sorry, I meant the display name for the language is not in modern coverage for other languages.

adding stubs makes sense. Gotta get rid of those hard coded lists :(

@markusicu different overload of "coverage". kxv is in the generated coverageLevels.txt because it achieved its coverage level. It's not in the curated coverageLevels.xml because of a process hole.

OK, I added the stubs here manually as a part-4 commit (can squash into part 3 if desired, it has both binary and source data files). Filed https://unicode-org.atlassian.net/browse/ICU-22557 to fix the underlying problem.

srl295

Spot lgtm

markusicu

Thanks!
I am fine with the fourth commit. I added it to the list in the PR description.

pedberg-icu added 3 commits October 25, 2023 17:29

ICU-22326 CLDR release-44-beta5 to ICU main part 1 (binaries, binary-like source)

d6f4329

ICU-22326 CLDR release-44-beta5 to ICU main part 2 (data generated or copied from CLDR)

cc71ded

ICU-22326 CLDR release-44-beta5 to ICU main part 3 (ICU sources: lib, tools, tests)

151cce9

pedberg-icu requested review from srl295, macchiati, markusicu, richgillam and DraganBesevic October 26, 2023 00:56

pedberg-icu assigned markusicu Oct 26, 2023

richgillam previously approved these changes Oct 26, 2023

markusicu reviewed Oct 26, 2023

srl295 previously approved these changes Oct 26, 2023

DraganBesevic previously approved these changes Oct 26, 2023

ICU-22326 CLDR release-44-beta5 to ICU main part 4 (manually add kxv_Latn* stubs)

e0e08e2

pedberg-icu dismissed stale reviews from DraganBesevic, srl295, and richgillam via e0e08e2 October 26, 2023 16:22

markusicu approved these changes Oct 26, 2023

pedberg-icu merged commit da87459 into unicode-org:maint/maint-74 Oct 26, 2023

pedberg-icu deleted the ICU-22326-integrate-cldr-release-44-beta5-to-ICU branch October 26, 2023 17:59

pedberg-icu mentioned this pull request Dec 7, 2023

ICU-22583 CLDR release-44-1 to ICU maint/maint-74 branch #2727

Merged

7 tasks
