Consider syntax erroring if the same character is repeated in the character class #287

matklad · 2016-09-27T18:19:51Z

Hi! Today, I spent some time debugging a regex like [a:digit:], which I wrongly assumed would work like [a[:digit:]]. For me, it would be nice if this regex failed to compile because : is repeated twice :)
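To make the confusion concrete, here is a minimal std-only sketch of what the two classes contain. It does not call the regex crate. The helper names `literal_members` and `intended_members` are hypothetical, and the model assumes that a class body without nested brackets or escapes is just a set of literal characters.

```rust
use std::collections::BTreeSet;

/// Hypothetical model (not the regex crate's API): with no nested brackets
/// or escapes, every character in a class body is an ordinary literal.
fn literal_members(class_body: &str) -> BTreeSet<char> {
    class_body.chars().collect()
}

/// What `[a[:digit:]]` was meant to express: 'a' plus the ASCII digits.
fn intended_members() -> BTreeSet<char> {
    std::iter::once('a').chain('0'..='9').collect()
}
```

Under this model, `literal_members("a:digit:")` contains `:`, `a`, `d`, `g`, `i` and `t` but no digits, while `intended_members()` contains `a` and `0` through `9`.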


BurntSushi · 2016-09-27T20:49:24Z

I'm not totally sure we want to do this. It seems like a good idea at first, but what if you used two character classes that partially overlapped? e.g., [\p{Lu}\w] or something (although I admit that's a little strange).

Also, this would be a breaking change. While 1.0 is looming, I'm not sure it's worth it.

matklad · 2016-09-27T21:08:47Z

> but what if you used two character classes that partially overlapped?

I was thinking about a simple rule for literal characters only and maybe for exactly duplicated character classes as well. I can't imagine a realistic situation when it is intentional to have such obvious repetitions. But I think they can arise by accident.
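A rough sketch of what such a literal-only check might look like, written against plain strings rather than the regex crate's parser. The function name `repeated_literals` is hypothetical. The sketch is deliberately simplified: it skips escapes and nested bracketed items, ignores `-` because it usually denotes a range, and does not handle set operators such as `&&`.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: returns the literal characters that occur more than
/// once in a class body (the text between the outer `[` and `]`).
/// Escapes (`\w`, `\p{Lu}`) and nested items (`[:digit:]`) are skipped, so
/// overlapping classes are never reported.
fn repeated_literals(class_body: &str) -> Vec<char> {
    let mut seen = BTreeSet::new();
    let mut repeated = BTreeSet::new();
    let mut chars = class_body.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                // Skip the escaped character, plus a `{...}` group after \p or \P.
                let escaped = chars.next();
                if matches!(escaped, Some('p') | Some('P')) && chars.peek() == Some(&'{') {
                    for d in chars.by_ref() {
                        if d == '}' {
                            break;
                        }
                    }
                }
            }
            '[' => {
                // Skip a nested bracketed item up to its matching `]`.
                let mut depth = 1;
                for d in chars.by_ref() {
                    match d {
                        '[' => depth += 1,
                        ']' => {
                            depth -= 1;
                            if depth == 0 {
                                break;
                            }
                        }
                        _ => {}
                    }
                }
            }
            // Simplification: `-` usually marks a range, so it is never counted.
            '-' => {}
            _ => {
                // `insert` returns false when the literal was already seen.
                if !seen.insert(c) {
                    repeated.insert(c);
                }
            }
        }
    }
    repeated.into_iter().collect()
}
```

With this sketch, the body of `[a:digit:]` is flagged (both `:` and `i` repeat), while the bodies of `[a[:digit:]]` and `[\p{Lu}\w]` are not.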

I don't know what the best solution is here: I'd make it a syntax error, but I don't have much expertise :) Another option would be to add a lint for this, but it wouldn't be as effective (and wouldn't work for user-provided regular expressions).

> Also, this would be a breaking change. While 1.0 is looming, I'm not sure it's worth it.

Imo, it's better to fix rough edges before 1.0 rather than leave them as they are. But again, I can't say for sure whether this edge is rough enough.

BurntSushi · 2016-12-29T01:34:28Z

I thought about this. Here's why I think we shouldn't do it:

1. I don't think I know of any other regex engine that reports an error for this. While that alone isn't enough of a reason to forgo reporting an error, some folks might find it surprising. I also worry that there's a corner case we aren't considering. People like to do really funny things with their regexes.
2. I think the "only error if there's a repeated literal" rule is a bit strange, and the fact that it would have helped your initial problem seems incidental.

While I do kind of agree with your arguments, I just feel like there isn't enough to break rank with everyone else.

matklad mentioned this issue Sep 27, 2016

Clarify how to use ascii character classes in [..] #286

Closed

BurntSushi closed this as completed Dec 29, 2016
