ICU-22404 Improve documentation of segmentation rules #2532

Merged
eggrobin merged 1 commit into unicode-org:main from breaking-rules-documentation on Aug 10, 2023

Conversation

eggrobin
Member

See the comments on #2492 and the email thread with Andy titled "genbrk".

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22404
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

macchiati
macchiati previously approved these changes Jul 28, 2023
@aheninger
Contributor

There is an out-of-date comment at the top of genbrk.cpp (lines 23-27), concerning BOMs and rules encoding, that should also be updated.

Historically, the rule files have been UTF-8. Code for genbrk to ignore a BOM if present dates to the early days of using SVN for ICU version control - on Windows, BOMs are needed on local rules files, and unless svn clients were configured just so, the BOMs had a tendency to migrate into the repository's files too. Meanwhile, on other platforms, some text editors silently strip BOMs if they see them. The net result was BOMs were hit or miss. Maybe they're there, maybe not, and genbrk just had to deal with it.
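
For illustration only, here is a minimal sketch of what "ignore a BOM if present" can look like for a UTF-8 rules file. It is not the actual genbrk code: the helper name `skipUtf8Bom` is hypothetical, and it assumes the whole file has already been read into memory as bytes.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not taken from genbrk.cpp): returns a view of `text`
// with a leading UTF-8 byte order mark (EF BB BF) removed if one is present,
// and returns `text` unchanged otherwise.
std::string_view skipUtf8Bom(std::string_view text) {
    constexpr std::string_view kBom = "\xEF\xBB\xBF";
    // substr clamps the length, so inputs shorter than the BOM are safe.
    if (text.substr(0, kBom.size()) == kBom) {
        text.remove_prefix(kBom.size());
    }
    return text;
}
```

With a check of this kind, a rules file parses the same way whether or not an editor or version-control client added or stripped the BOM.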

FrankYFTang
FrankYFTang previously approved these changes Jul 31, 2023
@eggrobin eggrobin dismissed stale reviews from FrankYFTang and macchiati via 778ab01 August 8, 2023 20:18
@eggrobin eggrobin requested a review from macchiati August 9, 2023 13:23
@eggrobin eggrobin force-pushed the breaking-rules-documentation branch from 778ab01 to 890fc62 Compare August 10, 2023 00:04
@jira-pull-request-webhook

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@eggrobin eggrobin merged commit 86193b1 into unicode-org:main Aug 10, 2023
101 checks passed
4 participants