-
-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow horizontal tabs \t
in response header values
#2345
Conversation
The main question I have is whether I should be updating the CHANGELOG.md now or if that will be on the person who will be preparing the release to do that. |
I'll do it. |
@@ -26,7 +26,7 @@ | |||
from uvicorn.server import ServerState | |||
|
|||
HEADER_RE = re.compile(b'[\x00-\x1f\x7f()<>@,;:[]={} \t\\"]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the name allow it as well, or is this good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that, in the case of header names, the above regular expression does not match anything that is actually valid…
I will take another look at the new HTTP RFCs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late comment.
In short:
RFC 9110 has the following relevant bits:
field-name = token
And in Section 5.6.1 Tokens:
token = 1*tchar
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA
; any VCHAR, except delimiters
- A check of the
tchar
whitelist from RFC 9110 shows that it does not overlap at all with the blacklist of theHEADER_RE
regular expression. Thus, there is no reason for there to be an issue where someone's valid header name would be blocked by this check; - Only 2 ASCII characters are not covered by either the whitelist from RFC 9110, or the
HEADER_RE
blacklist:/
and?
. I can add them to theHEADER_RE
regex if you would like to; - In principle, any byte above 0x80 would be forbidden by the standard too. Should they also be added to
HEADER_RE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, I think that the HEADER_RE
regular expression for header names is fine, in that it does not block any standard-valid input.
Another question @Kludex @tomchristie : would you be opposed to adding, in the exception messages, a repr of the invalid value? (note: h11 does that) |
async def test_response_header_splitting(http_protocol_cls: HTTPProtocol): | ||
app = Response(b"", headers={"key": "value\r\nCookie: smuggled=value"}) | ||
|
||
protocol = get_connected_protocol(app, http_protocol_cls) | ||
protocol.data_received(SIMPLE_GET_REQUEST) | ||
await protocol.loop.run_one() | ||
assert b"HTTP/1.1 500 Internal Server Error" not in protocol.transport.buffer | ||
assert b"\r\nCookie: smuggled=value\r\n" not in protocol.transport.buffer | ||
assert protocol.transport.is_closing() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this test about? This was already passing before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While working on this PR, I did not find a test that checked that invalid response header values were blocked by uvicorn (preventing, among other issues, header response splitting).
Thus, this test fulfills two purposes for me:
- It blocks me from implementing "allow horizontal tabs" by removing the regular expression check, either wilfully or inadvertently;
- It confirms that there is no need to go and change the
h11
implementation for header splitting issues;
In other words, I believed that there was no other test that was covering that bit of functionality, so I decided to add one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always great to review your work @jhominal :)
If you want, we can make a release right away. Please create a bump PR, if you want. Also, feel free to merge this. |
Summary
This PR allows for
\t
to be returned in response header values by uvicorn whenhttptools
are in use.As explained in #2344, both the letter of the current spec (RFC 9210), and the
h11
implementation, allow tab characters to be present in HTTP header values.Checklist