MatchData::find's EMPTY may still be dangling #42

Closed
cuviper opened this issue Jul 30, 2024 · 5 comments · Fixed by #43

cuviper commented Jul 30, 2024

#11 added an EMPTY constant to avoid the dangling pointer of an empty Vec, but there's no guarantee that this is actually a dereferenceable pointer either, and since rust-lang/rust#123936 (Rust 1.79) it is not.
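
To illustrate the concern, here is a minimal sketch. Only the name `EMPTY` comes from this issue; its exact definition in the crate and the helper `empty_parts` are assumptions for illustration. The pointer of an empty slice is non-null and aligned, but nothing promises that it points at readable memory.

```rust
/// Mirrors the kind of constant described above; the crate's real
/// definition may differ (assumption).
const EMPTY: &[u8] = &[];

/// Hypothetical helper: the raw parts that would be handed to a C API
/// when the haystack is empty.
fn empty_parts() -> (*const u8, usize) {
    (EMPTY.as_ptr(), EMPTY.len())
}

fn main() {
    let (ptr, len) = empty_parts();
    // Non-null and zero-length hold; dereferenceability is not promised,
    // so this program only prints the pointer and never reads through it.
    println!("ptr = {:p}, len = {}", ptr, len);
}
```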

We're seeing this cause segfaults in ripgrep integration tests on EPEL 9 aarch64:
https://koji.fedoraproject.org/koji/taskinfo?taskID=120840654

aarch64 build.log excerpt
failures:
---- binary::after_match1_stdin stdout ----
thread 'binary::after_match1_stdin' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_stdin/15" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "Project Gutenberg EBook"
cwd: /tmp/ripgrep-tests/after_match1_stdin/15
dir list: ["/tmp/ripgrep-tests/after_match1_stdin/15"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- binary::after_match1_explicit_count stdout ----
thread 'binary::after_match1_explicit_count' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_explicit_count/14" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-c" "Project Gutenberg EBook" "hay"
cwd: /tmp/ripgrep-tests/after_match1_explicit_count/14
dir list: ["/tmp/ripgrep-tests/after_match1_explicit_count/14", "/tmp/ripgrep-tests/after_match1_explicit_count/14/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match1_explicit_text stdout ----
thread 'binary::after_match1_explicit_text' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_explicit_text/13" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--text" "Project Gutenberg EBook" "hay"
cwd: /tmp/ripgrep-tests/after_match1_explicit_text/13
dir list: ["/tmp/ripgrep-tests/after_match1_explicit_text/13", "/tmp/ripgrep-tests/after_match1_explicit_text/13/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match1_explicit stdout ----
thread 'binary::after_match1_explicit' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_explicit/12" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "Project Gutenberg EBook" "hay"
cwd: /tmp/ripgrep-tests/after_match1_explicit/12
dir list: ["/tmp/ripgrep-tests/after_match1_explicit/12", "/tmp/ripgrep-tests/after_match1_explicit/12/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match1_implicit_count_binary stdout ----
thread 'binary::after_match1_implicit_count_binary' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_implicit_count_binary/18" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-c" "--binary" "Project Gutenberg EBook" "-g" "hay"
cwd: /tmp/ripgrep-tests/after_match1_implicit_count_binary/18
dir list: ["/tmp/ripgrep-tests/after_match1_implicit_count_binary/18", "/tmp/ripgrep-tests/after_match1_implicit_count_binary/18/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match1_implicit_text stdout ----
thread 'binary::after_match1_implicit_text' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_implicit_text/21" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--text" "Project Gutenberg EBook" "-g" "hay"
cwd: /tmp/ripgrep-tests/after_match1_implicit_text/21
dir list: ["/tmp/ripgrep-tests/after_match1_implicit_text/21", "/tmp/ripgrep-tests/after_match1_implicit_text/21/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match2_implicit stdout ----
thread 'binary::after_match2_implicit' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match2_implicit/19" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "Project Gutenberg EBook|a medical student" "-g" "hay"
cwd: /tmp/ripgrep-tests/after_match2_implicit/19
dir list: ["/tmp/ripgrep-tests/after_match2_implicit/19", "/tmp/ripgrep-tests/after_match2_implicit/19/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match1_implicit_binary stdout ----
thread 'binary::after_match1_implicit_binary' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_implicit_binary/17" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--binary" "Project Gutenberg EBook" "-g" "hay"
cwd: /tmp/ripgrep-tests/after_match1_implicit_binary/17
dir list: ["/tmp/ripgrep-tests/after_match1_implicit_binary/17", "/tmp/ripgrep-tests/after_match1_implicit_binary/17/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match2_implicit_text stdout ----
thread 'binary::after_match2_implicit_text' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match2_implicit_text/25" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--text" "Project Gutenberg EBook|a medical student" "-g" "hay"
cwd: /tmp/ripgrep-tests/after_match2_implicit_text/25
dir list: ["/tmp/ripgrep-tests/after_match2_implicit_text/25", "/tmp/ripgrep-tests/after_match2_implicit_text/25/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::after_match1_implicit stdout ----
thread 'binary::after_match1_implicit' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/after_match1_implicit/23" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "Project Gutenberg EBook" "-g" "hay"
cwd: /tmp/ripgrep-tests/after_match1_implicit/23
dir list: ["/tmp/ripgrep-tests/after_match1_implicit/23", "/tmp/ripgrep-tests/after_match1_implicit/23/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::before_match1_explicit stdout ----
thread 'binary::before_match1_explicit' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/before_match1_explicit/27" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "Heaven" "hay"
cwd: /tmp/ripgrep-tests/before_match1_explicit/27
dir list: ["/tmp/ripgrep-tests/before_match1_explicit/27", "/tmp/ripgrep-tests/before_match1_explicit/27/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::before_match2_explicit stdout ----
thread 'binary::before_match2_explicit' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/before_match2_explicit/35" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "a medical student" "hay"
cwd: /tmp/ripgrep-tests/before_match2_explicit/35
dir list: ["/tmp/ripgrep-tests/before_match2_explicit/35", "/tmp/ripgrep-tests/before_match2_explicit/35/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::before_match1_implicit_text stdout ----
thread 'binary::before_match1_implicit_text' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/before_match1_implicit_text/34" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--text" "Heaven" "-g" "hay"
cwd: /tmp/ripgrep-tests/before_match1_implicit_text/34
dir list: ["/tmp/ripgrep-tests/before_match1_implicit_text/34", "/tmp/ripgrep-tests/before_match1_implicit_text/34/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::before_match1_implicit_binary stdout ----
thread 'binary::before_match1_implicit_binary' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/before_match1_implicit_binary/32" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--binary" "Heaven" "-g" "hay"
cwd: /tmp/ripgrep-tests/before_match1_implicit_binary/32
dir list: ["/tmp/ripgrep-tests/before_match1_implicit_binary/32", "/tmp/ripgrep-tests/before_match1_implicit_binary/32/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- binary::before_match2_implicit_text stdout ----
thread 'binary::before_match2_implicit_text' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/before_match2_implicit_text/39" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "--no-mmap" "-n" "--text" "a medical student" "-g" "hay"
cwd: /tmp/ripgrep-tests/before_match2_implicit_text/39
dir list: ["/tmp/ripgrep-tests/before_match2_implicit_text/39", "/tmp/ripgrep-tests/before_match2_implicit_text/39/hay"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- feature::f740_passthru stdout ----
thread 'feature::f740_passthru' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/f740_passthru/163" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "-n" "--passthru" "foo" "file"
cwd: /tmp/ripgrep-tests/f740_passthru/163
dir list: ["/tmp/ripgrep-tests/f740_passthru/163", "/tmp/ripgrep-tests/f740_passthru/163/file", "/tmp/ripgrep-tests/f740_passthru/163/patterns"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
---- regression::r1559 stdout ----
thread 'regression::r1559' panicked at tests/util.rs:416:13:
==========
command failed but expected success!
Did your search end up with no results?
command: cd "/tmp/ripgrep-tests/r1559/488" && env -u RIPGREP_CONFIG_PATH "/builddir/build/BUILD/ripgrep-14.1.0/target/rpm/deps/../rg" "--path-separator" "/" "--pcre2" "TaskID +int"
cwd: /tmp/ripgrep-tests/r1559/488
dir list: ["/tmp/ripgrep-tests/r1559/488", "/tmp/ripgrep-tests/r1559/488/foo"]
status: signal: 11 (SIGSEGV) (core dumped)
stdout: 
stderr: 
==========
failures:
    binary::after_match1_explicit
    binary::after_match1_explicit_count
    binary::after_match1_explicit_text
    binary::after_match1_implicit
    binary::after_match1_implicit_binary
    binary::after_match1_implicit_count_binary
    binary::after_match1_implicit_text
    binary::after_match1_stdin
    binary::after_match2_implicit
    binary::after_match2_implicit_text
    binary::before_match1_explicit
    binary::before_match1_implicit_binary
    binary::before_match1_implicit_text
    binary::before_match2_explicit
    binary::before_match2_implicit_text
    feature::f740_passthru
    regression::r1559
test result: FAILED. 282 passed; 17 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.29s

This was compiled against RHEL 9's libpcre2, version 10.40. When debugging, I found calls to pcre2_match_8 with a subject pointer of 1 and a length of 0. Since it only crashed on aarch64, I'm guessing there's something in the aarch64 JIT that wasn't handling this properly. However, when I copy this exact rg binary to a current Fedora system with a newer pcre2, it passes.
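
A subject pointer of 1 is consistent with a dangling but well-aligned pointer for `u8`. The sketch below prints such addresses. The helper name `dangling_addrs` is hypothetical, and the exact address values are an implementation detail rather than something the standard library documents as guaranteed, so the code only prints them.

```rust
use std::ptr::NonNull;

/// Hypothetical helper: returns the address of a dangling `u8` pointer,
/// the address of an empty Vec's buffer pointer, and the Vec's length.
fn dangling_addrs() -> (usize, usize, usize) {
    let dangling = NonNull::<u8>::dangling().as_ptr() as usize;
    let v: Vec<u8> = Vec::new();
    (dangling, v.as_ptr() as usize, v.len())
}

fn main() {
    let (dangling, vec_addr, len) = dangling_addrs();
    // Both addresses are non-null, yet neither has to be backed by
    // mapped memory, so a C routine that reads through them can fault.
    println!("dangling = {:#x}, empty Vec = {:#x}, len = {}", dangling, vec_addr, len);
}
```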

#10 noted that upstream pcre2 also made fixes (the VCS link there is now dead), but those would have long predated 10.40 in April 2022. So I'm not sure what further fix makes it work in Fedora.

@BurntSushi
Owner

I can't make sense of the failure modes you're seeing, but what's the fix for this? My read of rust-lang/rust#123936 suggests that switching from const to static wouldn't necessarily do it either? So I guess this reduces to, "how do I get a dereferenceable pointer to an always-empty slice?" Which... kinda seems like the wrong question to ask, because why would you dereference it?

I guess to some extent this is a bug in PCRE2. We hand it a pointer and a length, and even though the length is zero, PCRE2 still dereferences it. Since Fedora has the newer PCRE2, I would guess that's why you don't see the bad behavior on Fedora?

IDK, I'm happy to try and fix things on my end to the extent possible, I'm just not sure what. I suppose we could pass &[0].as_ptr() instead, right? The length given will still be 0, but we'd be passing a pointer that has to be dereferenceable.

@cuviper
Author

cuviper commented Jul 30, 2024

> I guess to some extent this is a bug in PCRE2. We hand it a pointer and a length, and even though the length is zero, PCRE2 still dereferences it. Since Fedora has the newer PCRE2, I would guess that's why you don't see the bad behavior on Fedora?

Yeah, that's what I figure too, but I don't know enough about that library to really track it down. I'm especially not keen to dig into its JIT details.

> IDK, I'm happy to try and fix things on my end to the extent possible, I'm just not sure what. I suppose we could pass &[0].as_ptr() instead right? The length given will still be 0, but we'd be passing a pointer that has to be dereferencable.

I think that should work, since provenance of that pointer will cover real memory while it's passed to FFI, length aside.

BurntSushi added a commit that referenced this issue Jul 30, 2024
To work around likely bugs in (older versions of) PCRE2. Namely, at one
point, PCRE2 would dereference the haystack pointer even when the length
was zero.

This was reported in #10 and we worked around this in #11 by passing a
pointer to a const `&[]`, with the (erroneous) presumption that this
would be a valid pointer to dereference. In retrospect though, this was
a little silly, because you should never be dereferencing a pointer to
an empty slice. It's not valid. Alas, at that time, Rust did actually
hand you a valid pointer that could be dereferenced. But [this
PR][rust-pull] changed that. And thus, we're back to where we started:
handing buggy versions of PCRE2 a zero length haystack with a dangling
pointer.

So we fix this once and for all by passing a slice of length 1, but with
a haystack length of 0, to the PCRE2 search routine when searching an
empty haystack. This will guarantee the provision of a dereferencable
pointer should PCRE2 decide to dereference it.

Fixes #42

[rust-pull]: rust-lang/rust#123936
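
A minimal sketch of the approach the commit message describes, not the crate's actual code: when the haystack is empty, hand the C side a pointer into a real one-byte buffer while still reporting a length of 0. The names `ONE_BYTE`, `haystack_parts`, and `peek_first_byte` are hypothetical, and `peek_first_byte` only stands in for a C routine that reads the subject even when the length is zero; it is not PCRE2's API.

```rust
/// One readable byte, so the pointer passed across FFI is always backed
/// by real memory (hypothetical name).
static ONE_BYTE: [u8; 1] = [0];

/// Picks the (pointer, length) pair to pass to a C search routine.
/// An empty haystack gets a dereferenceable pointer with length 0.
fn haystack_parts(haystack: &[u8]) -> (*const u8, usize) {
    if haystack.is_empty() {
        (ONE_BYTE.as_ptr(), 0)
    } else {
        (haystack.as_ptr(), haystack.len())
    }
}

/// Stand-in for a routine that reads the first byte regardless of `len`.
///
/// Safety: `ptr` must point at at least one readable byte.
unsafe fn peek_first_byte(ptr: *const u8, _len: usize) -> u8 {
    unsafe { *ptr }
}

fn main() {
    let (ptr, len) = haystack_parts(b"");
    // Sound here because `ptr` points at `ONE_BYTE`, not at a dangling address.
    let first = unsafe { peek_first_byte(ptr, len) };
    println!("len = {}, first byte = {}", len, first);
}
```

Using a `static` rather than a temporary array keeps the byte alive for the whole program, so the pointer cannot outlive the memory it refers to.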
@BurntSushi
Owner

OK, should be fixed in pcre2 0.2.9. Godspeed.

@cuviper
Author

cuviper commented Jul 30, 2024

Thanks! I confirmed in a local build that 0.2.9 does fix the ripgrep tests. I'll report back if anything else comes up in our full EPEL build environment.

@cuviper
Author

cuviper commented Jul 31, 2024

FYI, our PCRE2 maintainer found the upstream fix -- links for cross-reference:
