Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw at compile time for patterns that resemble infinite recursion #29

Open
slevithan opened this issue Feb 11, 2025 · 0 comments
Open

Comments

@slevithan
Copy link
Owner

slevithan commented Feb 11, 2025

Oniguruma throws error "never ending recursion" at compile time for patterns that contain what resembles infinite recursion. Oniguruma-To-ES should do the same.

From the readme:

Oniguruma's recursion depth limit is 20. Oniguruma-To-ES uses the same limit by default but allows customizing it via the rules.recursionLimit option. [...] Patterns that would trigger an infinite recursion error in Oniguruma might find a match in Oniguruma-To-ES (since recursion is bounded), but future versions will detect this and error at transpilation time.

More details:

  • The error occurs at compile time. It's not a runtime check on recursion depth.
    • Ex: (?!)\g<0> will throw even though it can never match.
  • This is based on resemblance to infinite recursion, since in fact Oniguruma always caps recursion depth at 20. It flags patterns that would recurse infinitely if the allowed recursion depth was infinite.
  • Although not detecting and throwing for things that look like infinite recursion creates an inconsistency with Oniguruma, there's no risk for performance/memory problems, and it doesn't throw at regex-search time. That's because:
    1. Recursion is in fact always capped (in both Oniguruma and Oniguruma-To-ES), and…
    2. Oniguruma-To-ES already doesn't allow multiple overlapping recursions (which would enable crafting valid patterns with exponential expansion).

Implementation:

  • The first step for this should be moving detection of which subroutines are recursive from the transformer to the parser (maybe adding a boolean recursive property or similar on Subroutine nodes).
    • See use of function createRecursion in the transformer for the current places where recursive subroutines are detected.
  • Then the parser should additionally check whether the recursion is infinite.
    • Needs to be done in the parser so it can throw even if just generating an OnigurumaAst (rather than transpiling to JS).

Things that can cause infinite recursion:

  • Recursion token not preceded (in the recursed [sub]pattern) by any required, consumptive tokens.
    • Always leads to infinite recursion, even if the recursion is zero-min-quantified, among multiple alternation options, or nested within groups.
    • Ex: (?i)\K(?=a)\g<0>? throws.
    • Ex: a?\g<0>? throws, but a\g<0>? is okay.
    • Ex: (a|\g<0>?) throws, but a(a|\g<0>?) is okay.
    • Ex: (a(\g<2>)?) throws, but (a(\g<1>)?) is okay.
    • Ex: a|\g<0>? throws, but a|a\g<0>? is okay.
    • Ex: a(|\g<1>?) and a(a|\g<1>?) throw, but a(a\g<1>?) is okay.
  • No way out from recursion.
    • Not zero-min-quantified or among other alternation options.
    • Needs to consider whether any containing group is zero-min quantified or among other alternation options.
    • Ex: a\g<0> and a(\g<0>) throw, but a\g<0>?, a(\g<0>)?, a(\g<0>+)?, a(|\g<0>), a(\g<0>|), and a((\g<0>)|) are okay.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant