Case-insensitive \p{} issue #432

Closed
michaeltang4829 opened this issue Jul 17, 2024 · 2 comments

Comments

@michaeltang4829

Hello,

We're using PCRE2 10.43, and we found what looks like a bug in how Unicode character case properties interact with case-insensitive matching. When the pattern has the case-insensitive modifier, a property that specifies only upper-case letters, such as \p{Lu}, should also match lower-case characters.

Text: a
Pattern: (?i:\p{Lu})
Result: No matches
Expected: a

Section 17 of the pcre2compat page also hints that this was the original behavior in Perl, and that Perl later corrected it.

Thanks!

@PhilipHazel
Collaborator

Yes, I think that's right. At the time Perl changed I thought they were wrong - I thought if you had said \p{Lu} you should get an upper case letter, even in /i mode. However, now that I think about it, I have changed my mind. After all, when you specify A (for example) you expect it to match the lower case in /i mode. So I now agree; PCRE2 should change. I will look into it.
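
The analogy with a literal character can be checked outside PCRE2. The following is a minimal sketch using Python's standard `re` module rather than PCRE2; `re` has no `\p{}` support, so it only illustrates the literal case, and `literal_matches` is a hypothetical helper name.

```python
import re

def literal_matches(pattern: str, text: str, caseless: bool) -> bool:
    # Turn on caseless matching only when requested, like /i or (?i:...).
    flags = re.IGNORECASE if caseless else 0
    return re.fullmatch(pattern, text, flags) is not None

# A literal upper-case "A" does not match "a" by default...
print(literal_matches("A", "a", caseless=False))  # False
# ...but it does once caseless matching is in force.
print(literal_matches("A", "a", caseless=True))   # True
```

The argument in the comment above is that \p{Lu} should follow the same rule as the literal A.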

@PhilipHazel
Collaborator

Commit 6d82f0c makes PCRE2 behave like Perl. Lu, Ll, and Lt now all behave as Lc when /i is in force.
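
As a rough model of the behavior described here, the sketch below uses Python's `unicodedata` module. It is not PCRE2's implementation and does not call PCRE2; `property_matches` and `CASED` are hypothetical names, and the model assumes Lc means "any of Lu, Ll, Lt".

```python
import unicodedata

# General categories covered by Lc (cased letters), as assumed in this model.
CASED = {"Lu", "Ll", "Lt"}

def property_matches(prop: str, ch: str, caseless: bool) -> bool:
    # Model of \p{prop} tested against a single character.
    category = unicodedata.category(ch)
    if prop == "Lc":
        return category in CASED
    if caseless and prop in CASED:
        # Under /i, Lu, Ll and Lt all act like Lc.
        return category in CASED
    return category == prop

# The case from the original report: (?i:\p{Lu}) against "a".
print(property_matches("Lu", "a", caseless=False))  # False
print(property_matches("Lu", "a", caseless=True))   # True
# Characters that are not cased letters still do not match under /i.
print(property_matches("Lu", "1", caseless=True))   # False
```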
