RegexRouteConstraint should use Compiled Regex #46154
Comments
Thanks for contacting us. We're moving this issue to the |
Reopening because I want to make sure we haven't missed anything. Copied from PR comment: I looked back in time, and we've never used
Maybe |
Agreed, thanks for re-opening this @JamesNK. Let's make sure we have performance coverage here. |
These constraints were copied and pasted from MVC 5. I mean |
Consider the opposite case though, if you are heavily using regex constraints in your routes, interpreting the regex for every request is going to slow your server down. Especially as the URL input length grows. |
I get the benefit. However, some customers build huge apps with many routes. Easily thousands. Some are tens of thousands. Hundreds of thousands? Probably. I'd like to understand the impact of compiling route regexes on startup. Inventing some numbers: A 10 ms improvement per request is great, but not if the start-up time increases by 30 seconds. There might be some things we can do to get the best of both worlds. For example, create |
I don’t know the code, so this is probably wacky, but could one giant Regex be made out of all these patterns or-ed together, each a named capturing group, then there’s only one pattern compiled and one run matching per request? |
Didn’t mean to close |
Cc @stephentoub |
Routing uses a DFA graph, similar to what Regex uses if URL paths look like The regex constraints are just run against a segment. For example, We could combine regexes if there are multiple constraints on one parameter, e.g. |
@stephentoub How much sharing happens between different
app.MapGet("api/customer/{name:regex([A-Z])}");
app.MapGet("api/product/{name:regex([A-Z])}"); |
With the regex ctors, zero. Every instance is its own thing. If the exact same inputs are fed into the regex ctor twice, it'll do the exact same work twice, including any ref emit work if Compiled is specified. (The regex static methods employ an mru cache for the whole regex instance, but the cache by default is small and isn't intended for this sort of workload.) |
So there's one DFA with potentially tens of thousands of edges, and potentially thousands of those may be expressed as patterns -- presumably some of those edges are far hotter than others, right? So some kind of MRU cache could work very well -- or a counter threshold for compiling, like tiering. I assume this is what you're suggesting. |
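To make the "counter threshold for compiling, like tiering" idea concrete, here is a minimal sketch. It is not taken from ASP.NET Core: `CreateTieredMatcher` and its `threshold` parameter are hypothetical names, and the regex options are an assumption borrowed from the escape-hatch example in this thread. The matcher starts with an interpreted `Regex` and swaps in a compiled one once it has been called `threshold` times.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading;

// Hypothetical helper (not an ASP.NET Core API): returns a matcher that is
// interpreted at first and is promoted to RegexOptions.Compiled once it has
// been invoked `threshold` times.
Func<string, bool> CreateTieredMatcher(string pattern, int threshold)
{
    // Assumed options; the real constraint may use different ones.
    var options = RegexOptions.CultureInvariant | RegexOptions.IgnoreCase;
    Regex current = new Regex(pattern, options); // cheap to construct
    int calls = 0;

    return input =>
    {
        // Exactly one caller observes the threshold value and pays for compilation.
        if (Interlocked.Increment(ref calls) == threshold)
        {
            current = new Regex(pattern, options | RegexOptions.Compiled);
        }
        // Both instances match identically, so a stale read is still correct.
        return current.IsMatch(input);
    };
}

var isVersion = CreateTieredMatcher("^v[0-9]", threshold: 3);
Console.WriteLine(isVersion("v1")); // True
Console.WriteLine(isVersion("V2")); // True (IgnoreCase)
Console.WriteLine(isVersion("x3")); // False (third call, now compiled)
```

Cold routes never pay the compilation cost under this scheme; the trade-off is that the call that crosses the threshold absorbs the compile time unless the promotion is moved to a background task.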
I believe a regex for a route is shared between all the edges generated for that route (should confirm). For example, A situation that would cause multiple instances is combining a route, such as from an attribute on a controller or a minimal API group. The example below duplicates the
[Route("{apiVersion:regex(^v[0-9])}")]
public class MyController
{
    [HttpGet("action1")] // resolved to "{apiVersion:regex(^v[0-9])}/action1"
    public object Action1(string apiVersion, ...) { ... }
    // Repeat for hundreds of other actions in the controller.
} |
Why not share it in that case, too? They seem like two different ways of expressing the same concept, I'm surprised they end up with different implementation approaches. Even with interpreted regexes, there's potentially a lot of work done as part of construction. |
ASP.NET Core has two routing concepts: convention-based routing and attribute routing.
There are big differences in how the endpoints and routes are calculated between the two approaches. It hasn't mattered before* that attribute routing generates more routes and constructs more
*It might have mattered, but no customers have reported it.
Absolutely. And because of how routing works, we can share everywhere in routing. Routes are known and calculated at startup. We know all the routes in the app and all their regexes when we start building the DFA. A regex cache in the DFA builder can eliminate duplicates. |
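As a rough illustration of the regex cache described above, the sketch below keys `Regex` instances by pattern so that duplicate route constraints share one compiled instance. The names `cache` and `GetOrAddRegex` are hypothetical and the options are assumed; this is not how the DFA builder is actually implemented.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

// Hypothetical cache keyed by the exact pattern text.
var cache = new ConcurrentDictionary<string, Regex>(StringComparer.Ordinal);

// Returns the shared instance for a pattern, constructing (and compiling) it
// only the first time the pattern is seen.
Regex GetOrAddRegex(string pattern) =>
    cache.GetOrAdd(pattern, p => new Regex(
        p,
        // Assumed options for illustration.
        RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase));

// Two routes with the same constraint pattern end up with one Regex instance.
var first = GetOrAddRegex("^v[0-9]");
var second = GetOrAddRegex("^v[0-9]");
Console.WriteLine(ReferenceEquals(first, second)); // True
Console.WriteLine(cache.Count);                    // 1
```

With a cache like this, the controller example earlier in the thread would construct one regex for the `apiVersion` constraint rather than one per action.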
Test app with 30,000 regex routes:
using System.Diagnostics;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Time each request, including routing, and print the elapsed seconds.
app.Use(async (HttpContext context, Func<Task> next) =>
{
    Console.WriteLine("Start time");
    Stopwatch stopwatch = Stopwatch.StartNew();
    await next();
    stopwatch.Stop();
    Console.WriteLine(stopwatch.Elapsed.TotalSeconds);
});

app.UseRouting();

Task Plaintext(HttpContext context) => context.Response.WriteAsync("Hello, World!");

// Register 30,000 routes that all use the same regex constraint.
for (int i = 0; i < 30_000; i++)
{
    // Doubled braces and brackets escape literal { } [ ] inside the route template.
    var url = "/plaintext/nested" + i + "/{riskId:regex(^\\d{{7}}|(SI[[PG]]|JPA|DEM)\\d{{4}})}";
    app.MapGet(url, Plaintext);
}

app.MapGet("/", (HttpContext context) =>
{
    return context.Response.WriteAsync("Hello world");
});

Console.WriteLine("Running app");
app.Run();
Time to first request:
App memory usage:
I think we should:
That will bring first request and memory usage down to "no regex" results for most apps. We get our cake and eat it too: improved startup performance and regex per-request performance. There still might be some edge-case apps out there that perform worse. They have lots of unique regex routes, and they all get visited, and memory usage is a problem. Routing is configurable so we can advise them to configure the regex constraint with an implementation that doesn't call |
Does that push regex coupling further into core of routing? Will that negatively impact the ability to trim regex when it's not being used? |
Good point. I'm sure we can come up with something. A couple of ideas:
Option 2 could create a small general perf boost on startup in apps with lots of constraints. Right now, these constraints are created by reflection. We could skip that for duplicates we know aren't stateful. For example, reusing the same regex constraint would save 29,999 |
@surayya-MS From the discussion in our call, here is an example of configuring the
using System.Text.RegularExpressions;
using Microsoft.AspNetCore.Routing.Constraints;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddRouting(options =>
{
    // Map the "regex" inline constraint to the non-compiled implementation below.
    options.ConstraintMap["regex"] = typeof(NonCompiledRegexInlineRouteConstraint);
});

// In a single file, type declarations must come after the top-level statements.
public class NonCompiledRegexInlineRouteConstraint : RegexRouteConstraint
{
    // Constructs the Regex without RegexOptions.Compiled.
    public NonCompiledRegexInlineRouteConstraint(string regexPattern)
        : base(new Regex(regexPattern, RegexOptions.CultureInvariant | RegexOptions.IgnoreCase))
    {
    }
}
This is the escape hatch we can recommend to customers if, even after improvements, their app has problems with compiled regexes. |
Today, when using a regex route constraint, we create a new Regex:
aspnetcore/src/Http/Routing/src/Constraints/RegexRouteConstraint.cs, lines 40 to 43 in e523876
This should use RegexOptions.Compiled as well. That way we aren't interpreting these regular expressions every time the route is inspected.
Our docs even say this should be using Compiled: