Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise RewriteRules/Conds #40

Merged
merged 1 commit into from
May 10, 2021
Merged

Optimise RewriteRules/Conds #40

merged 1 commit into from
May 10, 2021

Conversation

Roy-Orbison
Copy link
Contributor

The base URL determination is substantially faster if the input starts with the expected common substring (the initial match of the RewriteRule), rather than repeatedly recalculating the backreferences until the lengths match.

A demonstration that the updated version can take take two orders of magnitude fewer steps:
original: https://regex101.com/r/Lvi7Pu/1
updated: https://regex101.com/r/RXjdVZ/1

Also simplify previous rules where the match itself is not used in a condition or rewrite.

Q A
Documentation no
Bugfix ?
BC Break no
New Feature no
RFC ?
QA no

Description

Regexps are a performance drain when unoptimised, especially when they rely of heavy backtracking and re-checking for matches. PCRE has built-in features to optimise for un/greedy matching which The BASE env var determination was not using. This patch optimises both the logic and the operators to perform much faster matching.

My only concern is that it appears as though the original did not always rewrite to the index.php file, but surely that can't be true: https://regex101.com/r/Lvi7Pu/2

I could modifiy this so that the rewritten path has only a relative prefix (except retain the / in the env var for backwards compat.), which makes more sense to me because rules are already matched against relative paths in .htaccess context.

The base URL determination is substantially faster if the input starts with the
expected common substring (the initial match of the RewriteRule), rather than
repeatedly recalculating the backreferences until the lengths match.

A demonstration that the updated version can take take two orders of magnitude
fewer steps:
original: https://regex101.com/r/Lvi7Pu/3
updated: https://regex101.com/r/RXjdVZ/3

Also simplify previous rules where the match itself is not used in a condition
or rewrite.

Signed-off-by: Roy-Orbison <Roy-Orbison@users.noreply.github.com>
@Roy-Orbison
Copy link
Contributor Author

I've corrected the regexp to retain the original behaviour which I had misunderstood. In the example of the updated version (https://regex101.com/r/RXjdVZ/3), it appears as though the third, empty path input would match where the original wouldn't (https://regex101.com/r/Lvi7Pu/3), however the associated RewriteRule has been adjusted to require at least 1 character in the path. This effectively places the same constraint on the entire match as the (.+) portion of the original RewriteCond.

@weierophinney weierophinney added the Enhancement New feature or request label May 10, 2021
@weierophinney weierophinney added this to the 3.8.0 milestone May 10, 2021
@weierophinney
Copy link
Contributor

I've tested this in a project, and works as a drop-in replacement for the existing rewrite rules - thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants