-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Index out of bounds panic when using shortest_match_at
, is_match_at
with end-anchored regex
#969
Labels
Comments
wabain
changed the title
Index out of bounds panic when using
Index out of bounds panic when using Mar 24, 2023
shortest_match_at
with end-anchored regexshortest_match_at
, is_match_at
with end-anchored regex
BurntSushi
added a commit
that referenced
this issue
Mar 24, 2023
In turns out that in *some* calls to Fsm::reverse, we were passing an incorrect start offset. Namely, the haystack we pass is sub-sliced at `&text[start..]`, but in some places, we were passing `text.len()` as the start offset of the reverse search. But of course, it should be `text.len() - start`. This was indeed the case in most places, but it looks like it needed to be corrected in two additional places. I've also added this test to regex-automata's set of regression tests and can confirm that it doesn't happen there. (regex-automata is far more principled about handling offsets like this.) Fixes #969
Nice find! I've got a PR up in #970. You were right that it was something about the reverse DFA. In some places where it is called, the offsets passed to it were incorrect. This only happened in some calls and only when the start of the search wasn't |
BurntSushi
added a commit
that referenced
this issue
Mar 25, 2023
In turns out that in *some* calls to Fsm::reverse, we were passing an incorrect start offset. Namely, the haystack we pass is sub-sliced at `&text[start..]`, but in some places, we were passing `text.len()` as the start offset of the reverse search. But of course, it should be `text.len() - start`. This was indeed the case in most places, but it looks like it needed to be corrected in two additional places. I've also added this test to regex-automata's set of regression tests and can confirm that it doesn't happen there. (regex-automata is far more principled about handling offsets like this.) Fixes #969
This bug is fixed in |
crapStone
pushed a commit
to Calciumdibromid/CaBr2
that referenced
this issue
Apr 11, 2023
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [regex](https://github.com/rust-lang/regex) | dependencies | patch | `1.7.1` -> `1.7.3` | --- ### Release Notes <details> <summary>rust-lang/regex</summary> ### [`v1.7.3`](https://github.com/rust-lang/regex/blob/HEAD/CHANGELOG.md#​173-2023-03-24) [Compare Source](rust-lang/regex@1.7.2...1.7.3) \================== This is a small release that fixes a bug in `Regex::shortest_match_at` that could cause it to panic, even when the offset given is valid. Bug fixes: - [BUG #​969](rust-lang/regex#969): Fix a bug in how the reverse DFA was called for `Regex::shortest_match_at`. ### [`v1.7.2`](https://github.com/rust-lang/regex/blob/HEAD/CHANGELOG.md#​172-2023-03-21) [Compare Source](rust-lang/regex@1.7.1...1.7.2) \================== This is a small release that fixes a failing test on FreeBSD. Bug fixes: - [BUG #​967](rust-lang/regex#967): Fix "no stack overflow" test which can fail due to the small stack size. </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS4xOS4wIiwidXBkYXRlZEluVmVyIjoiMzUuNDAuMiJ9--> Co-authored-by: cabr2-bot <cabr2.help@gmail.com> Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1827 Reviewed-by: crapStone <crapstone@noreply.codeberg.org> Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org> Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
What version of regex are you using?
1.7.2, 1.7.1
Describe the bug at a high level.
Regex::shortest_match_at
panics for certain regex patterns when given seemingly valid inputs. From the stack trace it looks like it might be specific to patterns that generate reversed FSMs.What are the steps to reproduce the behavior?
(playground)
The same backtrace occurs if
shortest_match_at
is replaced withis_match_at
.What is the actual behavior?
What is the expected behavior?
I believe the output should be
Some(6)
in this case. Since 4 is the index of a character boundary in the input text I think that it certainly shouldn't panic.The text was updated successfully, but these errors were encountered: