Bare and percent-encoded pluses in path variables become spaces in match_info #1816
Comments
I am not sure if unsafe="+" is the right solution for URL encoding, but '%2B' should be fixed.
@kxepal ^^^
I think the current behavior is right. We can add some settings, but this would add to the number of parameters we already have.
Nikolay, thank you for responding. The issue is that with the current behavior there is no way to distinguish between pluses and spaces in any of the matches. The path matching behavior has been provided as a helpful tool, but this would limit its use to only those cases where pluses may not occur. I believe keeping both '+' and '%2B' as '+' in the match would be ideal, because this behavior would be stable and least surprising, and it would make the behavior identical regardless of whether or not the aiohttp server is sitting behind a proxy (which may map '%2B' to '+' before forwarding the request). If you insist on unescaping '+' to space, at the very least I would then expect '%2B' to be unescaped to '+'. However, this would be harder to maintain, as it requires that escape/unescape (or quote/unquote, as it's called in the code) be called only a fixed number of times. I agree that adding new settings is rarely the right solution. Sincerely, Ilya
You are right.
Added "unsafe='+'".
@fafhrd91: Thank you! 👍
This problem occurs again in >=3.5.4.
Perhaps you are using the slow pure-Python yarl build; it has a bug that makes it behave differently from the Cython-optimized version (the Cython version is correct, the pure-Python one is incorrect).
Long story short
When match info is constructed for variable resource routes, '+' characters are converted to spaces. I do not believe this is the expected behavior.
Furthermore, even '%2B' codes are converted to spaces, limiting workarounds.
Note that similar behavior in yarl's path parsing, reported in yarl/issues/59, has been corrected in 88799aec. A similar fix is needed in match_info parsing.
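As a rough illustration of why even '%2B' can end up as a space, the sketch below uses only the standard library's urllib.parse, not aiohttp's or yarl's own code. It assumes (this is not confirmed by the report) that the value is percent-decoded once and then passed through a second, plus-to-space decoding step. The helper names decode_once and decode_twice are hypothetical.

```python
from urllib.parse import unquote, unquote_plus


def decode_once(raw: str) -> str:
    """Form-style decoding: '+' becomes a space, then %XX escapes are decoded."""
    return unquote_plus(raw)


def decode_twice(raw: str) -> str:
    """Plain percent-decoding followed by a form-style pass.

    Assumed for illustration only: the first pass turns '%2B' into '+',
    and the second pass then turns that '+' into a space.
    """
    return unquote_plus(unquote(raw))


if __name__ == "__main__":
    for raw in ("1.0.0+build", "1.0.0%2Bbuild"):
        print(repr(raw), "once:", repr(decode_once(raw)),
              "twice:", repr(decode_twice(raw)))
```

With a single form-style pass the percent-encoded plus survives, but with two passes neither spelling does, which matches the symptom described here. Plain unquote, by contrast, never rewrites a bare '+'.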
Example
In our use case, the version string matched by the route is a semantic version that may include a plus sign followed by the build name, which runs afoul of the eager plus-to-space conversion.
Suppose the handler is reached via a URL('http://server/resources/1.0.0+build'):
curl http://127.0.0.1:8088/resource/1.0.0+build
# Note: this workaround fails too:
curl http://127.0.0.1:8088/resource/1.0.0%2Bbuild
Expected behavior
Inside the handler, I expect the plus sign to be preserved in the portion of the URL path that has been captured by match info:
Actual behavior
Plus sign is preserved in the parsed path, but replaced by space in the match info:
Your environment
Xubuntu 16.04, python3.5, yarl==0.10.0, aiohttp==1.2.0 (but confirmed with aiohttp==2.0.7)
Code inspection / suggested fix
The implementation of DynamicResource._match in aiohttp/web_urldispatcher.py is causing the problem.
aiohttp/aiohttp/web_urldispatcher.py
Line 349 in bcbceb5
Similarly to 88799aec in yarl, if the unquote(value) is replaced with unquote(value, unsafe='+'), then the issue I demonstrated above is fixed. I am unclear on whether '+:' should be used here as well. Nor am I confident that the same fix is needed in StaticResource (I believe it is) or other resources.
If someone more familiar with the code validates the approach and helps determine which resources need this, I am more than happy to compose and file a PR.
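To make the effect of the decoding choice concrete, here is a minimal standard-library sketch of a route matcher. It is not aiohttp's DynamicResource._match. The ROUTE pattern and the match_info helper are hypothetical stand-ins, and urllib.parse.unquote is used only as an analogue of a decoder that leaves '+' alone; it is not claimed to behave exactly like yarl's unquote with unsafe='+'.

```python
import re
from typing import Callable, Dict, Optional
from urllib.parse import unquote, unquote_plus

# Hypothetical pattern standing in for a route such as /resource/{version}.
ROUTE = re.compile(r"^/resource/(?P<version>[^/]+)$")


def match_info(raw_path: str,
               decode: Callable[[str], str]) -> Optional[Dict[str, str]]:
    """Match the raw (still encoded) path and decode each captured value."""
    m = ROUTE.match(raw_path)
    if m is None:
        return None
    return {name: decode(value) for name, value in m.groupdict().items()}


if __name__ == "__main__":
    for path in ("/resource/1.0.0+build", "/resource/1.0.0%2Bbuild"):
        # Plus-to-space decoder versus a decoder that preserves '+'.
        print(path, match_info(path, unquote_plus), match_info(path, unquote))
```

In this sketch, swapping the decoder passed to match_info is the only change needed to preserve the plus sign, which is the same shape as the one-line change proposed above.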