Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How should servers utilise root_path #229

Closed
pgjones opened this issue Jan 20, 2021 · 12 comments
Closed

How should servers utilise root_path #229

pgjones opened this issue Jan 20, 2021 · 12 comments

Comments

@pgjones
Copy link
Contributor

pgjones commented Jan 20, 2021

root_path is often considered equivalent to WSGI's SCRIPT_NAME, yet usage of root_path by servers differs in that WSGI servers seem to split out the SCRIPT_NAME from the PATH_INFO whereas ASGI servers (Daphne, Hypercorn, Uvicorn) do not,

I find (for all 3 ASGI servers, with daphne shown),

$ SCRIPT_NAME="/bob" gunicorn t110:app
>> environ = {'RAW_URI': '/bob/', 'PATH_INFO': '/', 'SCRIPT_NAME': '/bob', ...}

$ daphne --root-path "/bob" t110:app
>> scope = {'path': '/bob/', 'raw_path': b'/bob/', 'root_path': '/bob', ...}

whereas I think the scope should be,

>> scope = {'path': '/', 'raw_path': b'/bob/', 'root_path': '/bob', ...}

I.e. the server should 404 if the request target does not start with the root_path, and if it does remove it from the path, but keep it in the raw_path. Are the current ASGI servers correct, or does the ASGI spec need to be clearer about this?

@davidism
Copy link

davidism commented Jan 20, 2021

ASGI documents root_path as "same as SCRIPT_NAME in WSGI" https://asgi.readthedocs.io/en/latest/specs/www.html#http-connection-scope. That means it should be stripping root_path from path, otherwise route matching won't work as expected when mounted under a path.

We ran into this while working on sans-IO / ASGI support in Werkzeug pallets/werkzeug#2005 (comment).

@davidism
Copy link

This seems to be the relevant bug in Daphne: django/daphne#125. I couldn't find one for Uvicorn.

@euri10
Copy link
Contributor

euri10 commented Jan 20, 2021 via email

@davidism
Copy link

root_path is also used in reverse proxy situations, although it also needs to be supported by the A/WSGI server.

There are various ways to manage it, but ultimately the WSGI server and/or app needs to know where it's mounted and adjust SCRIPT_NAME accordingly. (Same for ASGI and root_path.) So for example, ASGI apps could work around this right now by using some middleware to fix root_path themselves before handling the request, although that's not ideal, as shown in the workaround in the issue @euri10 linked.

@pgjones
Copy link
Contributor Author

pgjones commented Jan 21, 2021

I've altered Hypercorn to act as WSGI does, see this commit

@pgjones
Copy link
Contributor Author

pgjones commented Jan 21, 2021

See also #230.

@andrewgodwin
Copy link
Member

andrewgodwin commented Jan 22, 2021

The intention of the spec is thus:

  • root_path does, in fact, match SCRIPT_NAME and is the prefix of where the app is stored
  • path includes root_path and then anything else after it, unlike PATH_INFO, which has SCRIPT_NAME removed from the beginning

(this is made slightly clearer at the bottom in https://asgi.readthedocs.io/en/latest/specs/www.html#wsgi-compatibility)

It was my intention that frameworks doing URL routing on just the "local" section remove root_path from path themselves.

@davidism
Copy link

davidism commented Jan 22, 2021

I see. Then the spec needs to clarify that root_path has the same value as SCRIPT_NAME, but doesn't have the same behavior with path. If I understand, in ASGI, root_path is intended to be a hint for how the app should treat path, it's just passed through by the server without doing anything?

I sort of understand this, since it makes building URLs work the same as matching URLs. But this is going to make extracting behavior in Werkzeug to work with both WSGI and ASGI trickier, not sure exactly what the answer is yet for us. It also seems to be causing misunderstandings based on the linked issues.

@davidism
Copy link

The compatibility section of the spec currently says:

PATH_INFO can be derived from path and root_path

Now that you've clarified your intention, I see that it means "can be derived by stripping root_path from path". But reading over it coming from WSGI, I misread that it was the same separation as WSGI.

@davidism
Copy link

I guess my next question is what the point of root_path is then, if it's just passed on to the app. Why not just configure that in the app and not have it passed by the server at all, since the app would be the thing stripping it while matching or appending it while building.

@andrewgodwin
Copy link
Member

andrewgodwin commented Jan 22, 2021

Well, the point of having it is twofold:

  • I wanted to be able to preserve compatibility with WSGI through a translation layer, so that information was needed somehow
  • You might want to proxy the same app with different root paths in e.g. a multi-tenant hosting situation

I agree the spec could be a lot more clear - happy to improve the language there.

@pgjones
Copy link
Contributor Author

pgjones commented Jan 27, 2021

This now makes sense to me, I'll close.

@pgjones pgjones closed this as completed Jan 27, 2021
abersheeran added a commit to abersheeran/a2wsgi that referenced this issue Dec 6, 2023
abersheeran added a commit to abersheeran/a2wsgi that referenced this issue Dec 6, 2023
DaGenix added a commit to DaGenix/hypercorn that referenced this issue Sep 29, 2024
…spec

Before this change, the dispatcher middleware didn't do anything with
'root_path'. It did, however, modify 'path'.

The problem with this behavior, is that the child ASGI app has no way to
determine what its prefix it. And if it can't determine its prefix, it
doesn't know how to construct URLs.

The ASGI spec isn't super clear on the expected behavior. However, some
resources to review are:

* https://asgi.readthedocs.io/en/latest/specs/www.html#wsgi-compatibility
* encode/starlette#2400
* django/asgiref#229

Based on the above, I believe that the correct behavior is that
"root_path" should be updated by the dispatcher middleware but that
"path" should not be modified.

In addition to the above change, I also updated the tests. And I also
added a new test case where the dispatcher middleware is nested inside
of itself.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants