-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UNDERTOW-2303] Introduced a new, faster utility for routing path templates #1566
base: main
Are you sure you want to change the base?
[UNDERTOW-2303] Introduced a new, faster utility for routing path templates #1566
Conversation
@dirkroets its failing with: This means that spotbugs has a problem with it. Enable it locally with "-Dfindbugs". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look at rest of the code once I get some free time.
Objects.requireNonNull(uriTemplate); | ||
Objects.requireNonNull(handler); | ||
|
||
// Router builders are not thread-safe, so we need to synchronize. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick glance:
1.Why would the need to be?
2. in which cases this would be needed?
3. Im not sure if such design of builder is optimal. AFAIR in other builders are context one-offs
4. AFAIR, in general, if there is no need to sync, we dont do it, rather rely on local var/single failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The short answer is for backwards compatibility. The original PathTemplateHandler class (also PathTemplateRouter) provided synchronised mutation of the underlying routes.
The longer answer:
- The underlying router itself is immutable and therefore thread-safe. The builder that creates the router is not thread-safe as it is intended to be used from a single thread to instantiate a router. For most use cases that I can think of the routes (path templates) should be discovered/added during start-up of a service. Once discovered, a router can be built and then used to serve many many requests. Therefore, I believe that in most services instantiating a router is a once-off process / invocation vs probably millions of invocations of the route method. For this reason I believe that overall performance is served well by a slightly more expensive (computationally) instantiation process vs a very cheap routing process. Yet, this does not mean that calling the builder must be synchronised.... in my opinion most use cases must instantiate a router and pass it to an immutable handler such as the new PathTemplateRouterHandler class - which I provided as an example. We can add/offer builders for handlers - for convenience sake - that do the same thing as PathTemplateHandler and PathTemplateRouter, but that create immutable instances of those handlers instead of the current implementations that support mutating the underlying routers. This would however be a breaking change for developers who use and mutate the current implementations, so my recommendation would be to leave the current implementations as they are - for backwards compatibility - to deprecate them and to offer the new pattern as new, immutable handler classes that developers can switch to.
- I believe there are very very few use cases that require synchronised mutation of the routing handlers. For those use cases, developers should implement their own synchronised mutation of underlying handlers. As I have mentioned before, I just implemented the synchronised mutation to be consistent with the previous implementations in order to avoid introducing breaking changes.
- I agree that this is not optimal, but I am not sure if we are prepared to introduce this as a breaking change?
- Agreed. In this case there should be no need to sync for 99%+ of use cases.
As a last note - in case it is not that obvious from the code and in case we want to keep the backwards compatibility - the synchronisation only happens when the underlying routers are modified, so when new path templates are added. It only synchronises the mutations amongst multiple threads trying to mutate the routes concurrently. The actual routing of requests are never synchronised - not even when other threads are busy mutating the routers. For this reason I believe that it is okay to leave the synchronisation in order to maintain backwards compatibility. And then perhaps to deprecate and offer alternative classes in a next PR?
if (paths.size() == 1) { | ||
return "path-template( " + paths.toArray()[0] + " )"; | ||
final List<PathTemplateRouter.PatternEqualsAdapter<PathTemplateRouter.Template<Supplier<HttpHandler>>>> templates; | ||
synchronized (lock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toString impeding normal operation is not ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I have updated this. I believe that this toString method is probably only used for debugging purposes due to the size of the string produced (Consistent with the original implementation). This may now produce results that aren't entirely consistent when called concurrently with routes being mutated, but that is also consistent with how the original implementation worked.
@@ -0,0 +1,72 @@ | |||
/* | |||
* JBoss, Home of Professional Open Source. | |||
* Copyright 2014 Red Hat, Inc., and individual contributors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, 2014, looks like we are traveling back in time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I just copied and pasted the header from another class and missed the instruction to update the dates. Will keep this in mind in future. Will be fixed when I update the PR
* A handler that matches URI templates. | ||
* | ||
* @author Dirk Roets dirkroets@gmail.com | ||
* @since 2023-07-20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we are supposed to be so pedantic to require bump on this as well. @fl4via will know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dirkroets @baranowb because it will be merged in 2024, I think we should either bump it or just omit it, keeping in mind that we are going to have 2024 in the copyright header above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dirkroets Just FYI, to fully verify CI in your machine run: |
I stand corrected, to really fully verify CI in your machine you will also need to run the tests with IPv6 option, however, you need to do this only if your fix editted something related to ipv6, which usually is not the case: |
@fl4via @baranowb I have added a second commit to this PR that fixes the CI issues and with some documentation improvements. According to the documentation one can edit the previous commit - provided that the previous commit was not a big commit. I am not sure what qualifies as a big commit, so I played it safe and added a second commit. |
@dirkroets Cool. I will try to reserve some time to crunch through this change. |
Thanks, @baranowb . Let me know if you have any questions or if there is anything I can assist with. |
I slowly make progress with this. I will most likely have bigger window to go through bulk of it in ~2weeks. |
Hey. Sorry for delay, I wasnt able to login last week. Anyway. Pretty neat design. Im going to add general comment here and some inline with review. I did not copy/paste comments when it either was done above or is part of general review.
Maintaintability: 2a) Example values, concepts need to be explained, ie for PathTemplateRouter.Template + usage/expected results
|
* was specifically to provide something that is very fast even at the expense of maintainable code. The router | ||
* does a very simple thing and there should arguably be no need to constantly work on this code. | ||
*/ | ||
public class PathTemplateRouter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confusing declaration, there is inteface called Router, yet, PathTemplateRouter does not implement it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I have made several structural changes in the next update along with improvements to documentation.
//<editor-fold defaultstate="collapsed" desc="PatternElement inner class"> | ||
/** | ||
* Interface for elements that represent a URL path pattern. The objective of this interface is to provide | ||
* a contract for comparing path patterns. For example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this contract defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that this is confusing. I believe it is described better in the update.
* | ||
* @param <T> Type of pattern elements. | ||
*/ | ||
public static final class PatternEqualsAdapter<T extends PatternElement> implements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a need for parametrization?
Also, AFAIR, is used in most of classes here, classes that parameters ( I think) dont relate, this is tad confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are essentially two categories of "patterns".
- The one category is the template itself - that consists of segments.
- The second category is the different types of segments.
The top level builder/factory methods always expect the adapter to contain templates, but there is code in the RouterFactory inner class that builds the matcher trees that expects to only work with types of segments - extensions of AbstractTemplateSegment. One can remove the parameter, but would then have to add a few more "instance of" checks or depend on future developers knowing which types of patterns are supported in which areas of the code. Parametrization seems like the best of the three options?
* | ||
* Extensions of this class must be immutable. | ||
*/ | ||
private abstract static class TemplateSegment implements PatternElement { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- examples, relations.
- TemplateSegment vs PatternElement - most likely some explanation would suffice? naming scheme just does not fall into place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I have made several naming changes and added more descriptions in the next update for this PR.
/** | ||
* Index for the segment inside of the template. | ||
*/ | ||
protected final int segmentIdx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is it? why must be greater than 0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments in the code for the next update to the PR. It is an index into the array that contains the segments of a requested URL path, so it must be 0 or more.
|
||
/** | ||
* A simple router that routes paths containing static segments or parameters. Specifically this router has | ||
* an optimisation - based on the segment counts of requests - that do not support wild cards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to explain said optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional comments added in the update to the PR
|
||
@Override | ||
public String toString() { | ||
return "SimpleRouter{" + '}'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an IDE generated toString that I use(d) for debugging. I removed the concatenation :D
* | ||
* @param <T> Target type. | ||
*/ | ||
private static class MatcherLeaveArterfacts<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- 'Leaf' ?
- explanation/visualization? This, as few other defs are key innerowking classes, it has to be described in way that will allow maintenance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Would like to say it was a typo, but just my English that let me down :D
- Added more comments in the next update.
private Matcher[] matchers; | ||
private Matcher[] wildCardMatchers; | ||
|
||
private RouterFactory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here and most likely any other code below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more comments in the next update.
* | ||
* @return Reference to the builder. | ||
*/ | ||
public Builder<S, T> removeTemplate(final String pathTemplate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not ideal to parse again. Possibly add method to remove by reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add a map for the path template string -> adapter instance, then look up the key and remove it directly from the "templates" map by reference. However, there are two things to consider:
- This is part of the "setup phase" of the routers that typically happens only once when a service starts up. So the overhead of parsing again has a very small impact on the overall performance. I have personally never seen someone add a path template and then remove it again. I have added this method to support the "remove" method that already existed in PathTemplateHandler - so basically for backwards compatibility. In all of our projects we create new routers using new builders if we do want to mutate the routes, so I suspect that this will almost never be used and for that reason the overhead is probably okay if you consider the next point.
- Consider the following two template strings /books/{bookId}/chapters and /books/{id}/chapters These two strings are not equal, but the patterns represented by the templates are equal. Since one can't have "duplicate" patterns in a router, I would argue that the intention of removing the template is to remove the pattern associated with it so that you can potentially replace it with another template that uses a different parameter name - for example. So if we were to implement that map to remove by reference and we can't find the exact string in the map, then we would have to parse the string again anyway in order to be sure that it has been removed.
Hi @baranowb A general question from my side... what would be the best way to respond to requests for clarification? I.e.
Thanks for all your effort with this PR! |
Just a note. I have started work to rename some classes, use more package level classes and to have a lot more documentation in JDOC and as comments. Will probably push more commits by the weekend |
Cool. Generally at least good explanation in code should be enough. This code should be maintained by community members, so they have to be able to navigate/understand it relatively easy. As to other type of documentation, I will have to ask devs. |
Just a quick update. I'm still busy restructuring the classes and updating the documentation. I'm moving as much of the documentation as possible into JavaDoc. I'm quite busy during the week at the moment, but will try to finish these changes this coming weekend then. |
I'm pushing a few commits now that contains a lot of structural (+naming) improvements and improvements to documentation - i.e. JavaDoc and comments. I have also replied to the existing comments on the PR, but I'm not sure how useful those will be once the structural changes have been pushed. Therefore I believe that it is necessary to provide a summary of the changes here as well:
|
Im slowly looking at it again. Generally good work with explaining stuff, some minor cracks so far. |
Thanks, @baranowb . Let me know if there is anything I can update so long. |
@dirkroets hey, so far I did not find anything bad, aside things Ive mentioned. Im still waiting for second review as well. |
public interface PathTemplateRouter<T> { | ||
|
||
/** | ||
* @return The default target for requests that do no match any specific routes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*not?
* underlying URL path templates, then result will contain {@link #getDefaultTarget() } as the target and will contain an | ||
* empty Optional in {@link PathTemplateRouteResult#getPathTemplate() }. | ||
*/ | ||
PathTemplateRouteResult<T> route(String path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure how old code handled that, but method name and description hint nothing more should be done as "route" is very specific action. So Id wager it could be confusing, maybe add something about that?
* <li>It is assumed that most services that use routing based on path templates will setup a router once (when the service | ||
* starts) using a single thread and will then use that router to route millions of inbound requests using multiple concurrent | ||
* threads. Perhaps the setup will not happen exactly once, but the mutation of routes happen very few times when compared to | ||
* the number of times that requests are routed. For this reason this factory is heavily biased towards optimising the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"this factory" - this is interface?
Generally explanations are quite good, though sometimes off topic. Which not necessarily bad, if done properly. Here its tad confusing? Especially that its repetition of PathTemplateRouter last paragraph?
Also, PathTemplateRouter actually describes overall design, so there is no need to duplicate what is done there, unless its needed for extension in each class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: I might have had this class open twice( will have to check) so "repetition" comment might not be correct.
done... Im going to bring this PR up on call today. |
Thanks, @baranowb . I'm travelling for work until 30 September, so time is very limited on my side. I will make a few changes to the documentation again from the 30th of September. Any more feedback after last week's call? |
@dirkroets hey, sorry I missed notification. I did bring it up and its on devs TODO list, which got delayed due to conference and some "issues on a deadline". |
@baranowb , thanks for the feedback. I've added a commit with the javadoc improvements that you suggested on the 4th of September. Please let me know if there is anything else that I can assist with. |
We'll be moving forward with this PR. I don't want to disrupt your work but we have been seeing some issues with CI that have been fixed. |
…plates. Updated existing path template based routing handlers to use the new utility.
…eaders to 2024. Removed synchronisation for toString(). Fixed Spotbugs issues.
…asses to their own files, improved JavaDoc and comments, other minor improvemets.
6381471
to
c591a97
Compare
Thanks, @fl4via . I've rebased onto upstream/main and ran the tests locally. All seems okay on my side. Let me know if there is anything else I can assist with :) |
Objective
Improve the performance of routing path templates to target handlers.
I started working on this code after I had seen this comment by Stuart Douglas on
io.undertow.util.PathTemplateMatcher
:The new path template router does incorporate a tree, but also numerous other performance enhancements. There are rudimentary performance benchmarks in
io.undertow.util.PathTemplateRouterTest
. The@Test
annotation must be uncommented forpublic void comparePerformance()
in order to run the comparison - in which case the results will be written to/tmp/path-template-router-performance.txt
in CSV format. The attached chartis based on numerous runs of
comparePerformance
and even though the benchmarks are admittedly rudimentary, it does show a very significant improvement in performance. The new utility routes approximately 3x more requests in the same amount of time as the previous utility, therefore a 200% increase in performance. Complexity should be O(log n) for number of path templates added to the router:Jira
UNDERTOW-2303
Tests
io.undertow.util.PathTemplateRouterTest
contains tests to verify that requests are routed correctly based on different combinations of path templates, including wildcards. Some of these tests were specifically written during the development ofio.undertow.util.PathTemplateRouter
, but many of the assertions have been copied fromio.undertow.util.PathTemplateTestCase
to verify consistency against the previous implementation.io.undertow.server.handlers.PathTemplateHandlerTestCase
was left untouched and verifies thatio.undertow.server.handlers.PathTemplateHandler
(which now uses the new router under the hood) routes requests in a way that is consistent with the previous implementation.io.undertow.server.handlers.RoutingHandlerTestCase
was left untouched and verifies thatio.undertow.server.RoutingHandler
(which now uses the new router under the hood) routes requests in a way that is consistent with the previous implementation.