Core: Fixed bug with greedy matching #2632

RunDevelopment · 2020-11-13T12:54:09Z

I stumbled across this while trying to implement #2115. The problem is that greedy matching is disabled for the last element of the token stream. This is an optimization, but it is only correct if the pattern doesn't use lookbehinds (either native lookbehinds or assertions; Prism lookbehind groups are fine).

The fix is to simply remove the optimization. This shouldn't affect performance. Without the optimization in place, all the loops that find the position to insert the greedy token at will short-circuit because there are no tokens after the last token. I honestly don't understand why this optimization was there in the first place.

The test case illustrates the bug in action.

In the first round for the test string bab, all substrings matching /a/ will be tokenized. The resulting token stream will be:

[
	"b",
	["a", "a"],
	"b"
]
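The first round described above can be sketched roughly as follows. This is a simplified model, not Prism's actual implementation: the function name tokenizeRound and the [type, content] token shape are assumptions made only for illustration.

```javascript
// Simplified, hypothetical model of one non-greedy tokenization round.
// Plain strings are untokenized text; [type, content] pairs are tokens.
function tokenizeRound(stream, type, pattern) {
	const result = [];
	for (const item of stream) {
		if (typeof item !== "string") {
			result.push(item); // already a token, leave it alone
			continue;
		}
		let rest = item;
		let match;
		// Each remaining piece of text is matched on its own.
		while (rest && (match = pattern.exec(rest))) {
			if (match[0].length === 0) break; // guard against empty matches
			const before = rest.slice(0, match.index);
			if (before) result.push(before);
			result.push([type, match[0]]);
			rest = rest.slice(match.index + match[0].length);
		}
		if (rest) result.push(rest);
	}
	return result;
}

const firstRound = tokenizeRound(["bab"], "a", /a/);
console.log(JSON.stringify(firstRound)); // ["b",["a","a"],"b"]
```

Note that every untokenized string is matched in isolation here, which is exactly what a greedy pattern must not do.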

In the second round, all greedy matches of /^b/ are tokenized. The b at the start of the string obviously matches, so we will get the token stream:

[
	["b", "b"],
	["a", "a"],
	"b"
]

But matching isn't done yet. After the a token is skipped, we reach the last b. Since it's the last item in the token stream, currentNode != tokenList.tail.prev will be false, so we will match it as if the pattern weren't greedy. This is a problem because the regex will be executed like a non-greedy pattern, against the substring alone, with the following settings:

var pattern = /^b/g;
pattern.lastIndex = 0;
var match = pattern.exec("b");

A match will be found because the ^ assertion matches the start of the substring.

This is incorrect behavior. Greedy patterns always have to be matched against the whole string.
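To make the difference concrete, here is a small standalone sketch comparing the two ways of running /^b/ on the trailing b. It uses only the built-in RegExp API; the variable names and the offset of 2 are illustrative assumptions, and the "whole string" variant only approximates how a greedy pattern would be executed.

```javascript
// The whole source text and the offset of the trailing "b" within it.
const text = "bab";
const tailOffset = 2;

// Non-greedy style: run the pattern on the substring alone.
const onSubstring = /^b/g;
onSubstring.lastIndex = 0;
const substringMatch = onSubstring.exec(text.slice(tailOffset));

// Greedy style: run the pattern on the whole string, starting at the offset.
const onWholeString = /^b/g;
onWholeString.lastIndex = tailOffset;
const wholeStringMatch = onWholeString.exec(text);

console.log(substringMatch !== null); // true: ^ matches the start of the substring
console.log(wholeStringMatch !== null); // false: ^ only matches at index 0 of "bab"
```

Only the second result is the correct one for a greedy pattern, since the trailing b is not at the start of the input.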

Core: Fixed bug with greedy matching

deb50ef

RunDevelopment added bug needs review core labels Nov 13, 2020

RunDevelopment mentioned this pull request Nov 13, 2020

Markdown front matter #2634

Merged

RunDevelopment merged commit 8fa8dd2 into PrismJS:master Nov 25, 2020

RunDevelopment deleted the core-greedy-tail-string-bug branch November 25, 2020 21:59
