Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RegEx search_all for zero length matches/lookahead #85783

Merged
merged 1 commit into from
Jan 9, 2024

Conversation

TheSofox
Copy link
Contributor

@TheSofox TheSofox commented Dec 5, 2023

Fixes #85605, fixes #73920

'search_all' in RegEx function wasn't handling zero length matches well. When one was found, it would stop and not search for any more matches. My fix simply increments the position in the string after a zero length match and proceeds as normal otherwise.

This fixes the lookahead problem in the linked issue, as lookaheads are one way of having zero lenght matches that still return a value.

Of note, I've also attached a few extra RegEx tests that check for lookahead functions, include the specific example in the linked issue.

@TheSofox TheSofox requested a review from a team as a code owner December 5, 2023 13:25
@akien-mga akien-mga changed the title Fixed RegEx search_all for zero length matches/lookahead Fixed RegEx search_all for zero length matches/lookahead Dec 5, 2023
@akien-mga akien-mga added bug topic:core cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release labels Dec 5, 2023
@akien-mga akien-mga added this to the 4.3 milestone Dec 5, 2023
@TheSofox
Copy link
Contributor Author

TheSofox commented Dec 5, 2023

Damn, just experimented and noticed that this doesn't handle zero length lookbehinds very well. To be fair, the previous code also breaks at zero length lookbehinds, my code just breaks it in a different (and better?) way.

I'll see if I can amend this PR to take that into account, along with some more unit tests.

@TheSofox
Copy link
Contributor Author

TheSofox commented Dec 5, 2023

And pushed. Managed to find a solution that worked for zero length results, whether lookahead or lookbehind. Added lookbehind tests too.

While the original bug report didn't explicitly state lookbehind as an issue, it still falls under the same root issue that this PR is trying to fix (ie. Zero length results not being handled gracefully by 'search_all')

@TheSofox
Copy link
Contributor Author

TheSofox commented Dec 5, 2023

Came across another bug that this PR fixes already (this one: #73920 ). Gonna add another test case to this PR the covers the circumstance that the bug lists.

@TheSofox TheSofox force-pushed the regex-lookahead-fix branch 4 times, most recently from 8279da2 to 6534f7d Compare December 5, 2023 22:42
Copy link
Member

@adamscott adamscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested all the test cases with https://regex101.com/. It matches.

And I approve the change of logic of RegEx::search_all(), it makes more sense now and it shows how the previous logic was flawed. It could only return 1 result if it got any zero-length match.

@akien-mga
Copy link
Member

Should the documentation for RegEx be amended to clarify the change in behavior?

@adamscott
Copy link
Member

adamscott commented Dec 6, 2023

Should the documentation for RegEx be amended to clarify the change in behavior?

No? From what I read, the documentation is OK, it's just that it was bugged for some usecases, which this PR fixes.

I'll take a look if it's possible to cherry-pick this PR for 4.1.

@adamscott adamscott added the cherrypick:4.1 Considered for cherry-picking into a future 4.1.x release label Dec 6, 2023
@YuriSizov YuriSizov changed the title Fixed RegEx search_all for zero length matches/lookahead Fix RegEx search_all for zero length matches/lookahead Dec 8, 2023
@akien-mga akien-mga merged commit b7f7ca1 into godotengine:master Jan 9, 2024
15 checks passed
@akien-mga
Copy link
Member

Thanks!

@akien-mga akien-mga removed the cherrypick:4.1 Considered for cherry-picking into a future 4.1.x release label Jan 9, 2024
@thebluetropics
Copy link

thebluetropics commented Jan 15, 2024

hey thanks for the fix! 🎉

@YuriSizov YuriSizov removed the cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release label Jan 25, 2024
@YuriSizov
Copy link
Contributor

Cherry-picked for 4.2.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants