-
-
Notifications
You must be signed in to change notification settings - Fork 768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU-22326 Integrate CLDR release-44-beta5 to ICU #2682
ICU-22326 Integrate CLDR release-44-beta5 to ICU #2682
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spot-checked. LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there also be an empty kxv_Latn.txt like we have an empty zh_Hans.txt and an empty sr_Cyrl.txt (in each locale data tree)? There is a kxv_Latn.txt in the main "locales" tree but not in each of the trees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, checking to see what if anything is different in how we handle kxv_Latn vs e.g. sr_Cyrl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markusicu @srl295 OK, in the CLDR tool GenerateAliases.java I see this which seems relevant:
static final Set<String> HAS_MULTIPLE_SCRIPTS =
org.unicode.cldr.util.Builder.with(new HashSet<String>())
.addAll("ha", "ku", "zh", "sr", "uz", "sh")
.freeze();
I can update that and re-integrate. But that will add another round of PR approvals etc. Can we approve and merge this PR and then handle that as a separate pass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that is not it, because we have plenty of other multiscript locales for which a stub for the default script is generated in all the subtrees:
az_Latn
bs_Latn
ff_Latn
ks_Arab
mni_Beng
pa_Guru
sat_Olck
sd_Arab
shi_Tfng
su_Latn
vai_Vaii
yue_Hant
Must be something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option could be to add the stubs for kxv_Latn manually to this PR and then file a separate ticket to figure out why they are not getting generated automatically... (I note that kxv is not in CLDR coverage, nor in languageInfo.xml).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or just to take this PR without the stubs for kxv_Latn in other trees, and mark that as a known issue...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option could be to add the stubs for kxv_Latn manually to this PR and then file a separate ticket to figure out why they are not getting generated automatically...
This seems ok, and better than omitting data files that the code might rely on.
(I note that kxv is not in CLDR coverage, nor in languageInfo.xml).
Not in coverage? Wasn't the point that these locales had gotten enough coverage to be useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in coverage?
Sorry, I meant the display name for the language is not in modern coverage for other languages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- adding stubs makes sense. Gotta get rid of those hard coded lists :(
- @markusicu different overload of "coverage".
kxv
is in generated coverageLevels.txt because it achieved. It's not in the curated coverageLevels.xml because of a process hole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I added the stubs here manually as a part-4 commit (can squash into part 3 if desired, it has both binary and source data files). Filed https://unicode-org.atlassian.net/browse/ICU-22557 to fix the underlying problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spot lgtm
e0e08e2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I am fine with the fourth commit. I added it to the list in the PR description.
Checklist
Integrate CLDR release-44-beta5 to ICU. The only change on the CLDR side is to add English locale display names for 3 additional languages. However, in the integration process we also pick up changes for several additional CLDR locales (including 3 for which English display names were added):
The commits are in 4 groups:
ALLOW_MULTIPLE_COMMITS=true