Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-22325 Integrate CLDR release-44-m1 to ICU #2530

Conversation

DraganBesevic
Copy link
Contributor

@DraganBesevic DraganBesevic commented Jul 20, 2023

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22325
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

Integrates CLDR release-44-m1 plus the fix for ko_CN in unicode-org/cldr#3107 and the fix for personName test data in unicode-org/cldr#3122.
Some tests results are temporarily changed because CLDR reverted some data items to provisional at the beginning of the submission cycle (these will be fixed with the alpha0 integration).

Fixed pre-existing bug in RBBITestMonkey and logged a known issue for problem with testing limits on Hebrew calendar ICU-22441

Also needed to add personName test data from CLDR per ant copy-cldr-testdata in tools/cldr. Had to add a copyright header for those files in CLDR (currently a known issue with most of them saying they are for unit tests, will fix in CLDR). This also entailed changing the tools/cldr/build.xml code to add a copyright header for the generated catalog.txt file.

@pedberg-icu
Copy link
Contributor

/azp run CI-Exhaustive

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

richgillam
richgillam previously approved these changes Jul 21, 2023
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too big to review, and looks to be all mechanically generated anyway. Rubber-stamping.

@pedberg-icu pedberg-icu changed the title ICU-22325 Integrate CLDR 44.1 to ICU ICU-22325 Integrate CLDR release-44-m1 to ICU Jul 21, 2023
@markusicu
Copy link
Member

Too big to review, and looks to be all mechanically generated anyway. Rubber-stamping.

Suggestion for future integration pull requests:

  • When creating the pull request, split it into multiple commits so that we can review them separately. For example, split all of the .res files (or all of the binary files including .jar files) into a separate commit.
  • In the end, you will of course need to squash, but then we can just rubber-stamp confidently.

@markusicu
Copy link
Member

What is “unit of pressure: gasoline-equivalent”?
I know of mpge but that's a energy consumption unit, not a pressure unit.

markusicu
markusicu previously approved these changes Jul 21, 2023
Copy link
Member

@markusicu markusicu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rslgtm2 after spot check
but that new unit seems fishy

@pedberg-icu
Copy link
Contributor

What is “unit of pressure: gasoline-equivalent”?
I know of mpge but that's a energy consumption unit, not a pressure unit.

@macchiati is the best person to answer that. It may be a pressure dimensionally even though it is not used for pressure measurements.

@macchiati
Copy link
Member

macchiati commented Jul 22, 2023 via email

@DraganBesevic DraganBesevic dismissed stale reviews from markusicu and richgillam via 3c86935 July 26, 2023 20:04
@DraganBesevic DraganBesevic force-pushed the ICU-22325-brs74rc-integrate-cldr-44-m1 branch from 3c86935 to ab86739 Compare July 26, 2023 20:09
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@DraganBesevic
Copy link
Contributor Author

/azp run CI-Exhaustive

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 2530 in repo unicode-org/icu

@DraganBesevic
Copy link
Contributor Author

/azp run CI-Exhaustive

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@pedberg-icu
Copy link
Contributor

I think you found a problem, Markus! It is ~33.705 kilowatt-hours, so the base unit should be watt-seconds. Since watt is "kilogram-square-meter-per-cubic-second" it should be kilogram-square-meter-per-square-second which is joules, not pascals.

@macchiati Can you file a CLDR ticket to fix gasoline-equivalent on the CLDR side? That should be fixed with a future CLDR-ICU integration for the current integration we will use the existing category.

@macchiati
Copy link
Member

Will do.

@macchiati
Copy link
Member

macchiati commented Jul 26, 2023 via email

@macchiati
Copy link
Member

macchiati commented Jul 27, 2023 via email

@DraganBesevic DraganBesevic force-pushed the ICU-22325-brs74rc-integrate-cldr-44-m1 branch from dfb4033 to 123e0de Compare July 27, 2023 15:50
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@pedberg-icu pedberg-icu force-pushed the ICU-22325-brs74rc-integrate-cldr-44-m1 branch from 123e0de to 340ce7c Compare July 27, 2023 17:22
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@pedberg-icu
Copy link
Contributor

/azp run CI-Exhaustive

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@pedberg-icu pedberg-icu force-pushed the ICU-22325-brs74rc-integrate-cldr-44-m1 branch from 55bdec8 to d23d741 Compare July 27, 2023 22:22
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@pedberg-icu
Copy link
Contributor

/azp run CI-Exhaustive

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spot-checked the data files. Markus and Rich would be better suited to check the rest.

richgillam
richgillam previously approved these changes Jul 28, 2023
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either I forgot to actually rubber-stamp this the first time around, or my stamp got removed by activity in here in the meantime. Rubber-stamping again.

@pedberg-icu pedberg-icu force-pushed the ICU-22325-brs74rc-integrate-cldr-44-m1 branch from 7c4af1f to 9500fa4 Compare July 28, 2023 18:35
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@pedberg-icu
Copy link
Contributor

/azp run CI-Exhaustive

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@pedberg-icu
Copy link
Contributor

Either I forgot to actually rubber-stamp this the first time around, or my stamp got removed by activity in here in the meantime. Rubber-stamping again.

@richgillam Well I needed to update this again to add a copyright for the personName testdata files (currently most of them say they are for unit tests, known issue that I will fix in CLDR). But this will need another look, sorry.

Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks good. Huge thanks (and deep apologies) for the PersonName changes!

String[] FILENAMES_TO_SKIP = {"gaa.txt", "dsb.txt", "syr.txt", "hsb.txt", "lij.txt",
"yue_Hans.txt", "fa.txt", "ja.txt", "ka.txt", "zh_Hant_HK.txt", "zh_Hant.txt",
"bn.txt", "zh.txt", "nl.txt", "to.txt", "uk.txt", "my.txt", "yue.txt",
"bg.txt", "tk.txt", "ps.txt", "ko.txt", "kk.txt", "ms.txt"};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks; I'll try to fix this after I get the algorithms back in sync and Mark regenerates the test data files (or I figure out how to do it).

// { "nl_NL", "LONG", "MONOGRAM", "FORMAL", "DEFAULT", "", "WvP" }, Temporary change the test because data was set to provisional
// { "nl_NL", "LONG", "MONOGRAM", "INFORMAL", "DEFAULT", "", "WvP" },
{ "nl_NL", "LONG", "MONOGRAM", "FORMAL", "DEFAULT", "", "WV" },
{ "nl_NL", "LONG", "MONOGRAM", "INFORMAL", "DEFAULT", "", "WV" },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like I have some cleanup and sync to do here too...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NO, this is because we reset a lot of personName data to provisional before the beginning of data collection for CLDR 44; provisional data dopes not get picked up for ICU. Once we have the CLDR 44 Survey Tool data in xml and redo the integration, these problems should go away.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, okay.

@@ -556,7 +558,8 @@ public void TestLocaleDerivation() {
{"en", "MEDIUM", "REFERRING", "FORMAL", "DEFAULT", "", "陳港生Test"},
}),
new NameAndTestCases("given=港生,surname=陳,given2=Test,locale=zh_Hant", new String[][]{
{"en", "MEDIUM", "REFERRING", "FORMAL", "DEFAULT", "", "陳港生T."},
// {"en", "MEDIUM", "REFERRING", "FORMAL", "DEFAULT", "", "陳港生T."}, Temporary change the test because data was set to provisional
{"en", "MEDIUM", "REFERRING", "FORMAL", "DEFAULT", "", "陳港生Test"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here...

@pedberg-icu
Copy link
Contributor

Still looks good. Huge thanks (and deep apologies) for the PersonName changes!

@richgillam No problem. But can you re-approve? All of the updates to this PR cancelled the previous approvals.

Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving again.

@richgillam
Copy link
Contributor

Still looks good. Huge thanks (and deep apologies) for the PersonName changes!

@richgillam No problem. But can you re-approve? All of the updates to this PR cancelled the previous approvals.

Done. Thank you!

Was my response to Frank in https://unicode-org.atlassian.net/browse/ICU-22304 accurate? If not, would you mind commenting over there? If so, I'll close it again after this PR lands.

@pedberg-icu
Copy link
Contributor

@richgillam @markusicu @macchiati We have had some updates to this PR to get exhaustive tests to pass. Can one of you re-approve? Thanks!

@richgillam
Copy link
Contributor

@richgillam @markusicu @macchiati We have had some updates to this PR to get exhaustive tests to pass. Can one of you re-approve? Thanks!

I did re-approve. I think you're okay (at least it looks like it on my side...).

@pedberg-icu pedberg-icu merged commit 1f07d2b into unicode-org:main Jul 28, 2023
104 checks passed
@pedberg-icu pedberg-icu deleted the ICU-22325-brs74rc-integrate-cldr-44-m1 branch July 28, 2023 23:53
@pedberg-icu
Copy link
Contributor

Still looks good. Huge thanks (and deep apologies) for the PersonName changes!

@richgillam No problem. But can you re-approve? All of the updates to this PR cancelled the previous approvals.

Done. Thank you!

Was my response to Frank in https://unicode-org.atlassian.net/browse/ICU-22304 accurate? If not, would you mind commenting over there? If so, I'll close it again after this PR lands.

Yes, that is accurate...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants