Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-22668 Added unit tests to verify that CLDR-17892 addresses the or… #3184

Merged
merged 1 commit into from
Sep 20, 2024

Conversation

richgillam
Copy link
Contributor

@richgillam richgillam commented Sep 19, 2024

…iginal problem reported here.

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22668
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

} TestCase;

const TestCase testCases[] = {
{ "en_GB@calendar=iso8601;rg=uszzzz", u"EEEEyMMMMdjmm", u"y MMMM d, EEEE 'at' h:mm a" },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@macchiati Is it right that j gets us 12-hour time here? It seems like with ISO 8601, the default hour cycle should always be 24-hour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ISO-based, but not strictly ISO 8601 (which is a constant ASCII format). So decided that what most users want as the alternative is the ordering, but still allowing for certain localized features (names of months, separators, etc).

Comment on lines +852 to +853
{ "en_GB@calendar=iso8601;rg=uszzzz", u"EEEEyMMMMdjmm", u"y MMMM d, EEEE 'at' h:mm a" },
{ "en_GB@calendar=iso8601;rg=uszzzz", u"EEEEyMMMMdHmm", u"y MMMM d, EEEE 'at' HH:mm" },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@macchiati It doesn't seem like we should be getting the "at" here. I think the %atTime patterns from CLDR didn't make it over to ICU properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like some guidance on this issue. I sent a private email. This might actually be another instance of the problem described in https://unicode-org.atlassian.net/issues/ICU-22688 ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark and I discussed this offline; I think this is a bug in the CLDR-to-ICU tooling. I'm going to file a new ticket.

Comment on lines +864 to +865
{ "en_US", u"Edjmm", u"d EEE, h:mm a" },
{ "en_US", u"EdHmm", u"d EEE, HH:mm" },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@macchiati Regular US English wants d EEE? "12 Tuesday"? I don't think I've ever heard it expressed that way...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided on ordering y m d dow h m s z

We expect to be refining the patterns in later iterations. So this is just capturing what CLDR does now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW ISO 8601 doesn't allow for the dow feature

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one isn't ISO 8601, though. It's just generic US English (for comparison). So is your comment still operative?

{ "en_US", "EEEEyMMMMdHmm", "EEEE, MMMM d, y 'at' HH:mm" },
{ "en_US", "Edjmm", "d EEE, h:mm a" },
{ "en_US", "EdHmm", "d EEE, HH:mm" },
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is actually failing for me on the Java side. Was there some additional step I (or somebody else) needed to take to get the ISO 8601 changes into the ICU4J data JAR file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. I think the goal of this test should be to reflect what is in CLDR right now, and refine later. What are the symptoms of the failures now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I copied the same test from C++ to Java and it passes on C++ and fails on Java, which makes me think my copy of ICU4J didn't pick up the CLDR changes. That probably means I skipped a step somewhere, but I don't know what I forgot to do. Or, if this should have happened automatically with the CLDR integration, maybe something was missed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. It was my own fault. There was a hack in the actual code that I had forgotten I needed to take out. Sorry for the extra churn.

if (assertSuccess("Error creating dtpg", &err)) {
UChar actualPattern[200];

udatpg_getBestPatternWithOptions(dtpg, testCases[i].skeleton, -1, 0, actualPattern, 200, &err);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my review comments on the previous PR #3103: Please don't hardcode lengths, that becomes increasingly brittle when code changes over time, use UPRV_LENGTHOF(actualPattern) instead of 200.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree (though if Peter has limited time to fix this PR, it could be fixed later.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're both being overly picky, but I made the change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already made those changes in PR #3103 so even if you don't think it's particularly important I just don't understand why you'd prefer to throw those improvements away instead of simply copy-pasting that improved code into this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I changed it over there because somebody (probably you) complained there. I didn't take it out intentionally; I started over from scratch because I didn't remember I had that other PR still open. Regardless, the fix is back in now...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you went ahead and made this change on your own. I thought I had made it in my last commit; but maybe I forgot that file.

@richgillam
Copy link
Contributor Author

Can I please get somebody to approve this? Are there changes somebody is still waiting for me to make?

…iginal problem reported here and took out an

earlier hack in the Java code to work around the original problem.
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@richgillam richgillam merged commit 9adcf8b into unicode-org:main Sep 20, 2024
100 checks passed
@richgillam richgillam deleted the ICU-22668-iso-8601a branch September 20, 2024 23:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants