Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE warning #3433

Closed
KvanTTT opened this issue Dec 25, 2021 · 5 comments
Closed

Introduce MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE warning #3433

KvanTTT opened this issue Dec 25, 2021 · 5 comments

Comments

@KvanTTT
Copy link
Member

KvanTTT commented Dec 25, 2021

Consider the following set: [A-z]. Most likely it's an incorrect definition because the range

ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy

contains nonimplied characters:

[\]^_`

Correct and clear definition: [A-Za-z].

If these characters are implied, they can be added explicitly:

[A-Za-z[\\\]^_`]

I suggest adding at least a new warning MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE. It's especially actual if use the caseInsensitivity option. Or name it as RANGE_PROBABLY_CONTAINS_NOT_IMPLIED_CHARACTERS.

@parrt what do you think?

@KvanTTT KvanTTT changed the title Introduce MIX_OF_LOWER_AND_UPPER_BOUND_IN_RANGE warning Introduce MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE warning Dec 25, 2021
@parrt
Copy link
Member

parrt commented Dec 25, 2021

Hmm...yes, this does seem like it could cause strange errors. On the other hand, has this ever come up? Usually I try to avoid work unless it's solving a known problem haha.

@KvanTTT
Copy link
Member Author

KvanTTT commented Dec 25, 2021

I've encountered the error during fixing caseInsensitive option. Moreover, I think it is quite a real case, it's quite intuitive but wrong definition. That's why it should be fixed. BTW it's incorrect for Latin characters but is correct for Cyrillic characters. [А-я] does not contain intermediate characters.

@parrt
Copy link
Member

parrt commented Dec 25, 2021

Understood. Well I would welcome a PR :) I'd definitely love to get the case insensitive option merged so I can try to review some of your other work...I start work for my new galactic overlord on 4 January and it would be nice to get a stable set up...

@KvanTTT
Copy link
Member Author

KvanTTT commented Dec 25, 2021

It would be good to take a look at this PR while I'm fixing the rest case insensitive issues: #3349 It resolves a lot of issues including character ones (but it should be merged after case insensitive PR).

@parrt parrt closed this as completed in 7fa59a4 Dec 25, 2021
@KvanTTT
Copy link
Member Author

KvanTTT commented Dec 26, 2021

I started testing our grammars-v4 and encountered some subtle cases with the current warning that related to Unicode ranges. Probably I was too fast about the correct solution here, but I'll resolve it (restrict to ANSI characters) or revert it tomorrow. Sorry.

Yet another point for including grammars-v4 to the test infrastructure in some way (integration testing).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants