Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement routing in Router #240

Closed
SabrinaJewson opened this issue Aug 22, 2021 · 16 comments · Fixed by #363
Closed

Implement routing in Router #240

SabrinaJewson opened this issue Aug 22, 2021 · 16 comments · Fixed by #363
Labels
C-musings Category: musings about a better world

Comments

@SabrinaJewson
Copy link
Contributor

SabrinaJewson commented Aug 22, 2021

Feature Request

Motivation

  • Axum does not warn when routes are ambiguous.
  • Routing is O(n).
  • 404 handling is currently a bit hacky.

Proposal

Move all route matching into the Router type itself. I have implemented a sketch of this proposal on the playground, based on an imaginary Node type that does all the actual routing, and can be optimized to O(log n) because it knows all the routes up front. It is similar to matchit::Node but with a larger API; we might be able to upstream the required features there and use it. The sketch uses un-newtyped future types, which would obviously be changed in the actual library. We also might want to make the implementation details (Destinations, RouteDestination, etc) inaccessible (public in a private module) to avoid polluting the public API.

Now that routing is all done in one place, 404 handling can be implemented simply as another generic field of the Router type, which would be an Empty404 service by default, but can be changed to any service.

Drawbacks

The current design makes route("/", get(a)).route("/", post(b)) not work anymore; it will panic at server startup. Similarly route("/", get(a)).or(route("/", post(b))) panics as well. However this might not be necessary now that we can panic instead of silently fail on that code.

The implementation uses an hlist to store the list of services the routes point to, meaning routing is technically O(n). It should be really easy for the optimizer to turn it into O(1), but I haven't checked so we're kind of at the mercy of LLVM there. That said, even if it is O(n) it's just a sequence of integer comparisons which should be pretty cheap.

Alternatives

  • Have a single non-generic Router type that boxes all internal services, but otherwise operates in a similar fashion to above. This would make it a lot easier to pass around and return routers, even when .boxed() exists. Not sure how it would impact performance.
@davidpdrsn
Copy link
Member

I've read through your implementation and my initial impression is that it feels like a lot of changes, and new API, for not much gain.

Axum does not warn when routes are ambiguous.

By "ambiguous" do you something like route("/foo", _).route("/:key", _)? If so, is that actually a problem in practice?

Routing is O(n)

I don't think this is an issue. The actual work performed by hyper or your app is likely taking up way more time than routing.

404 handling is currently a bit hacky

I disagree. I actually like that 404 handling is able to be built directly using existing middleware.

@davidpdrsn davidpdrsn added the C-musings Category: musings about a better world label Aug 22, 2021
@SabrinaJewson
Copy link
Contributor Author

By "ambiguous" do you something like 'route("/foo", _).route("/:key", _)`? If so, is that actually a problem in practice?

I can't speak for whether it has been a problem someone has actually encountered, but I imagine it would be quite easy to accidentally leave some old code in when refactoring that would create a situation like that.

Routing is O(n)

I don't think this is an issue. The actual work performed by hyper or your app is likely taking up way more time than routing.

That's a fair point. I sometimes do enjoy optimization a little too much 😛.

I disagree. I actually like that 404 handling is able to be built directly using existing middleware.

The reason I consider the current 404 handling to be hacky is that I see Responses as being "final" objects; if a service returns a response, that response should be sent to the client, perhaps with decorations applied (CORS, compression, etc), but still fundamentally the same response. The 404 system breaks this model by adding new meaningful data to the response at a stage when it has already been constructed; because 404 is returned by services as a complete response and not some typed data it should be treated as a complete response rather than as typed data.

Another reason I devised this approach is that I see it as providing a sort of consistency between services and routers. There are two approaches that a library could take:

  1. A router is merely a service that forwards to one of two services depending on whether a path matches.
  2. A router is a map of routes to services that knows every single route, and given a path chooses the correct service to forward it to.

Axum currently is sort of halfway between the two; on the one hand, it has a Router type through which all routing is done, but on the other hand this Router doesn't actually do any routing itself and is only necessary for convenience. One the one hand, the Route type is the exact system in the first option, but on the other hand you are not able to use it directly from user code like a regular service. This suggestion, if applied, would solidify Axum as using the second type of routing which would IMO make it more consistent.

@jplatte
Copy link
Member

jplatte commented Aug 23, 2021

I agree with pretty much all of @KaiJewson's points. I'd also be worried about possible route ambiguity when writing larger services with axum's current routing, feel slightly uneasy about O(n) route matching and am not a fan of the current solution to creating a custom not-found page.

As a potentially relevant data point, in Rocket IIRC you can't even start your app if the routes are ambiguous. This is not as painful as it might sound, since it disambiguates many common cases that would otherwise be ambiguous through a simple rule that decides the default "rank" of a route (lower rank will be preferred if otherwise ambiguous), in combination with allowing users to explicitly specify a route's rank. Only routes that overlap and have the same rank will result in an error. IME this results in really intuitive and easy-to-use routing.

Disclaimer: I haven't actually had a look at the implementation, only read the discussion here.

@davidpdrsn
Copy link
Member

davidpdrsn commented Aug 23, 2021

Firstly I wanna apologies for my first response being a bit dismissive. I really appreciate this kind of insightful feedback ❤️

The reason I consider the current 404 handling to be hacky is that I see Responses as being "final" objects;

I understand where you're coming from but I personally don't have this opinion. I think part of tower's power comes from Service being as flexible as it is. So if a middleware wants to totally rewrite a response, thats fair game in my book.

[..] The 404 system breaks this model by adding new meaningful data to the response at a stage when it has already been constructed; because 404 is returned by services as a complete response and not some typed data it should be treated as a complete response rather than as typed data.

In a way app.or(handle_404.into_service()) (how its done here) kinda works like that. or does rely on some internals to know that no route could handle the request, and not just some handler that happened to return a 404.

  1. A router is merely a service that forwards to one of two services depending on whether a path matches.
  2. A router is a map of routes to services that knows every single route, and given a path chooses the correct service to forward it to.

Thats an interesting way of looking at it. I think I see your point. You are right at the Router type is just convenience. It was created to replace the RoutingDsl trait. Maybe App would have been a better name 🤷

I wonder if you'd be up for building a minimal working version of your prototype. Doesn't have to copy all of axum's API of course, just enough to route a request to one of a few services. Doing it in a fresh repo is fine. I think having some working code to look at would help.

@jplatte
Copy link
Member

jplatte commented Aug 23, 2021

In a way app.or(handle_404.into_service()) (how its done here) kinda works like that. or does rely on some internals to know that no route could handle the request, and not just some handler that happened to return a 404.

This is certainly interesting, I had the previous 404 example in mind when I said I wasn't a fan of the current solution. This is a whole lot better IMHO.

@dbofmmbt
Copy link
Contributor

Axum does not warn when routes are ambiguous.

I agree with the vision that the application shouldn't compile or should panic before serving requests if any routes are ambiguous. It would feel more rusty not having to debug why you can't reach the endpoint you thought you set up correctly when your server is running. I cannot tell whether the proposed solution is the way forward to solve this topic, though.

Routing is O(n).

No idea about the influence of it on real apps.

404 handling is currently a bit hacky.

The 404 example was updated recently and it looks great. It fits very well with the rest of the framework.

@SabrinaJewson
Copy link
Contributor Author

In a way app.or(handle_404.into_service()) (how its done here) kinda works like that.

Oh, you're right - I didn't realize the 404 handling had been changed. It's a lot cleaner now :D

In that case, it might be the best thing to do might be to add Router::from_service FromEmptyRouter to the public API, making Axum what I described as "type 1", where all routing is done on a per-type basis and Router is just there to prevent certain footguns. My approach still does have the small benefit of preventing ambiguous routes - I'll try to implement it in a fork of Axum when I have time next week.

@SabrinaJewson
Copy link
Contributor Author

Ok, I have finished my implementation on this branch.

Some notes:

  • I haven't updated documentation, tests or the changelog. All the tests pass, but I had to remove some obsolete ones and I haven't replaced them yet.
  • Surprisingly, there is a net gain of just 72 lines. This is admittedly mostly caused by the removal of buffer.rs and several tests.
  • I implemented method routing using the Destinations infrastructure as well. This allows it to panic on unreachable routes too. I renamed OnMethod to MethodRouter because it's a more accurate name. I am considering renaming Router to PathRouter to make it explicit that it only routes paths.
  • To avoid duplicating the entire destinations infrastructure for handlers as well as services, I implemented handler::MethodRouter in terms of the service equivalent, just wrapping everything in IntoService.
  • This resolves handler::any only matches on known HTTP methods #289 as a side-effect; any now matches all HTTP methods due to the way I implemented method routing.
  • This resolves Reverse route ordering #259 by making it obsolete, since .route("/foo", x).route("/:capture", y) and .route("/:capture", y).route("/foo", x) now behave identically.
  • This resolves some of the concerns raised in Correctly handle HEAD requests #129, because both get(x).head(y) and head(x).get(y) now work the same.
  • This reverts Fallback to calling next route if no methods match #224, because now the code will cause a panic at startup.
  • This has some effect on route has small footgun when trying to create nested routes #174, because now asterisk routes are supported. This was a necessary addition, because .nest and .or now only accepts other Routers. There are a few differences between .route("/foo/*", other_router) and .nest("/foo", other_router):
    • In the former, any services inside other_router not see the /foo part of the request path. In the latter, the router itself will act as if the request path starts after /foo but any interior services will see the full request path.
    • The former does not allow for failed routing to back up and try other routes in the outer router; if the path starts with /foo/, it will go to the inner router and never come back.
    • The former does not detect ambiguity, for example this does not panic: .route("/foo/*", Router::new().route("/x", unreachable_route)).route("/foo/x", reachable_route) whereas the equivalent code using the second method does.
    • The latter is more efficient because it stores all the routes in one data structure.
  • 404 handling can be done with .route("/*", handle_404) anywhere when constructing the Router.
  • This does not affect Allow route fallback #257. I think the best solution there would be to either allow .route("::<{capture}>", service) (which would be trivial to support in my branch) [cc Ability to do more complex route matching #109] or just have jplatte implement their own custom Service and use .route("/*", the_service).
  • The * route is now supported: .route("*", some_service). It panics when used inside a nested router.

@davidpdrsn
Copy link
Member

@KaiJewson thats fantastic! I'll give it a read within a week or two.

@dbofmmbt
Copy link
Contributor

dbofmmbt commented Sep 2, 2021

@KaiJewson it looks great!

I think it may be possible to close #174 by constraining route only for Services and make Router not be a service itself, but convertible to a type that does it. I tried to experiment it myself, but my trait abilities are not good enough for that 😢. It would stop people from putting Routers inside of route, and nest would be the only way to nest stuff, which looks fair to me.

The new routing works just by denying ambiguity + routing to the less generic ones first?

@davidpdrsn
Copy link
Member

I have considered that as well but it doesn't prevent people from doing .route("/foo", Router::new().route("/bar", _).into_service()) which still wouldn't work. So the issue is still there, just moved a bit.

@SabrinaJewson
Copy link
Contributor Author

If we document that asterisk routes should not be used with another Router I think it would be fine. There is something to the idea of having route always go to arbitrary service leaves and nest/or always combine other routers.

The new routing works just by denying ambiguity + routing to the less generic ones first?

Yes, exact values take precedence over captures which take precedence over asterisk routes.

@malobre
Copy link

malobre commented Sep 27, 2021

@KaiJewson Your implementation also removes the need for the regex crate which is quite heavy to compile. (I see its still in the Cargo.toml though).

@davidpdrsn davidpdrsn mentioned this issue Oct 3, 2021
7 tasks
@davidpdrsn
Copy link
Member

I've been working on a different implementation that replaces the router with matchit #363

Some points of comparison to @KaiJewson implementation from #240 (comment)

  • Relying on a third party crate for something as important as routing is a bit of a risk but for now its a risk I'm willing to take.
  • or and nest still accepts any tower::Service which is the way I want to keep it
  • Has significantly less new internal API. This is kind of a given since all the routing is done by a different crate rather than implemented in tree.
  • Still just uses tower::Service everywhere. I'm not a big fan of the Destinations since its so similar to Service. I'm also not a fan of public traits with methods that cannot be called from outside axum. It reminds me of warp which have a lot of these.
  • My version doesn't touch method routing.
  • I don't rename OnMethod. I think the current name is fine and its not something users have to name directly. Although I'm not strongly opposed to renaming it to MethodRouter.
  • I want to keep the name Router. I think renaming it to PathRouter will only do more harm than good. Router is a much more common word and something users should understand. PathRouter is less clear imo.

@houqp
Copy link
Contributor

houqp commented Oct 24, 2021

Thanks @KaiJewson for the initial PoC and @davidpdrsn for the final implementation. The common prefix based route matching is great 👍

@davidpdrsn
Copy link
Member

Yes! Huge thanks to @KaiJewson for their work ❤️ 0.3 is going to great 🎊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-musings Category: musings about a better world
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants