-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Negation of full byte-range class causes panic #303
Comments
Ah yup, looks like a bug! Nice find. :-) |
To avoid confusion for future readers, I thought I should point out that part of my description above is technically incorrect: We're using a version of There is still technically a panic in the additional test I referenced, but only because the particular test macro expects an error—the regex itself isn't causing the panic. (By the way...really appreciate the quick response/labeling of the issue submission! Thanks!) |
Further note...in our project, we're using the |
Ah I see, yeah I think I did misunderstand the issue at first glance. When I get a chance I'll dig into this and see if I can straighten everything out. Thanks for bringing it to my attention! To help me prioritize: can you clarify whether you're blocked on this particular issue or not? If I read you correctly, I think you have a work-around for now, right? |
The particular issue of having a match-none regex is solved by I think we're still getting a panic on the negated byte range, though. If I'm doing it correctly, I think I'm getting a full-on panic when I insert mat!(negated_full_byte_range_bytes, r#"[^\x00-\xff]"#, "abcde", None); into https://github.com/rust-lang-nursery/regex/blob/master/tests/bytes.rs If I'm reading things correctly, I think it might be due to the code at the end of fn parse_class(&mut self) -> Result<Build> {
/* ... */
class = self.class_transform(negated, class).canonicalize();
if class.is_empty() {
return Err(self.err(ErrorKind::EmptyClass));
}
Ok(Build::Expr(if self.flags.unicode {
Expr::Class(class)
} else {
Expr::ClassBytes(class.to_byte_class())
}))
} When the character class is for all code points, it doesn't stop at In that case, my initially posted test is incorrect, and the test I added to Regarding whether it's a blocker: It's very inconvenient. We already have to be able to catch panics when processing user-supplied data (like regexes) so that we can still provide some kind of feedback to the user. This will go into the list of one or two known panic-causing conditions. Thanks again for looking into this. Hopefully I'm no longer mixing up my bytes-vs-no-bytes discrepancies with my version discrepancies. |
For a project I'm working on, we wanted to have a simple match-none regular expression. My original attempt was to use an empty capture group. The one I constructed,
r#"[^\x00-\xff]"#
, results in a panic when used withregex::bytes::Regex
(rather than an error as in the case of the more elegantr#"[^\d\D]"#
, now that #257 has been fixed).In https://github.com/scooter-dangle/regex/tree/explicitly-empty-byte-range, I add a test for this particular case to(Per notes below, this original construction was incorrect. See https://github.com/scooter-dangle/regex/blob/panic-on-empty-byte-range-for-bytes-regex/tests/bytes.rs#L57-L61 instead.)regex-syntax/src/parser.rs
. (https://github.com/scooter-dangle/regex/blob/explicitly-empty-byte-range/regex-syntax/src/parser.rs#L2750)Note: For the original goal of a match-none regex, we currently use
r#"\z.\A"#
.The text was updated successfully, but these errors were encountered: