False positive in invalid_regex with unicode class and bytes regex #6005

Closed
michaelsproul opened this issue Sep 4, 2020 · 0 comments · Fixed by #6132
Labels
C-bug Category: Clippy is not doing the correct thing

Comments

michaelsproul commented Sep 4, 2020

The invalid_regex lint incorrectly flags a valid regular expression as invalid when the pattern uses a Unicode general category (for example \p{C}) and is passed as a raw string to regex::bytes::Regex::new.

Minimal example:

extern crate regex;

use regex::bytes::Regex;

fn main() {
    let re = Regex::new(r"\p{C}").unwrap();
    let text = "hello world\0";
    let processed_text = String::from_utf8(re.replace_all(text.as_bytes(), &b""[..]).to_vec()).unwrap();
    println!("{:?}", processed_text);
}

Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b4cfb83fe8ffa625e5c5881b05a89dbc
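
For readers without the regex crate at hand, here is a minimal std-only sketch of what the example above is expected to print once the pattern is accepted. It is an approximation, not the reporter's code: the helper name `strip_control_chars` is hypothetical, and `char::is_control` matches only the Cc (control) subcategory, whereas `\p{C}` covers the whole "Other" category. For the input in this issue the two agree, since the only affected character is the trailing NUL.

```rust
/// Hypothetical helper: removes control characters (Unicode category Cc).
/// Assumption: this stands in for `\p{C}`, which also matches other
/// "Other" subcategories (format, unassigned, private use) that this
/// std-only sketch ignores.
fn strip_control_chars(text: &str) -> String {
    text.chars().filter(|c| !c.is_control()).collect()
}

fn main() {
    // Same input as the minimal example: the trailing NUL is a control character.
    let processed_text = strip_control_chars("hello world\0");
    // Debug formatting, as in the minimal example; prints "hello world".
    println!("{:?}", processed_text);
}
```

The point of the comparison is that the reported pattern is meaningful on byte input, so a lint error on it is a false positive rather than a real syntax problem.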

Clippy produces this error:

Checking playground v0.0.1 (/playground)
error: regex syntax error: Unicode not allowed here
 --> src/main.rs:6:27
  |
6 |     let re = Regex::new(r"\p{C}").unwrap();
  |                           ^^^^^
  |
  = note: `#[deny(clippy::invalid_regex)]` on by default
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#invalid_regex

error: aborting due to previous error

Clippy seems to be OK with:

Meta

  • regex crate v1.3.9
  • cargo clippy -V: clippy 0.0.212 (0d0f6b1 2020-09-03)
  • rustc -Vv:
    rustc 1.46.0 (04488afe3 2020-08-24)
    binary: rustc
    commit-hash: 04488afe34512aa4c33566eb16d8c912a3ae04f9
    commit-date: 2020-08-24
    host: x86_64-unknown-linux-gnu
    release: 1.46.0
    LLVM version: 10.0
    
michaelsproul added the C-bug (Category: Clippy is not doing the correct thing) label Sep 4, 2020
bors added a commit that referenced this issue Oct 8, 2020
Fix unicode regexen with bytes::Regex

fixes #6005

The rationale for this is that since we wrote that lint, `bytes::Regex` was extended to be able to use unicode character classes.

---

changelog: [`invalid_regex`]: allow unicode character classes in bytes regex.
bors closed this as completed in 1167257 Oct 8, 2020