Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow horizontal tabs \t in response header values #2345

Merged
merged 3 commits into from
May 28, 2024

Conversation

jhominal
Copy link
Member

Summary

This PR allows for \t to be returned in response header values by uvicorn when httptools are in use.
As explained in #2344, both the letter of the current spec (RFC 9210), and the h11 implementation, allow tab characters to be present in HTTP header values.

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

@jhominal jhominal requested a review from a team May 22, 2024 09:02
@jhominal
Copy link
Member Author

The main question I have is whether I should be updating the CHANGELOG.md now or if that will be on the person who will be preparing the release to do that.

@Kludex
Copy link
Member

Kludex commented May 24, 2024

The main question I have is whether I should be updating the CHANGELOG.md now or if that will be on the person who will be preparing the release to do that.

I'll do it.

@@ -26,7 +26,7 @@
from uvicorn.server import ServerState

HEADER_RE = re.compile(b'[\x00-\x1f\x7f()<>@,;:[]={} \t\\"]')
Copy link
Member

@Kludex Kludex May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the name allow it as well, or is this good?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that, in the case of header names, the above regular expression does not match anything that is actually valid…

I will take another look at the new HTTP RFCs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late comment.

In short:

RFC 9110 has the following relevant bits:

In Section 5.1 Field names:

  field-name     = token

And in Section 5.6.1 Tokens:

  token          = 1*tchar

  tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                 / DIGIT / ALPHA
                 ; any VCHAR, except delimiters
  • A check of the tchar whitelist from RFC 9110 shows that it does not overlap at all with the blacklist of the HEADER_RE regular expression. Thus, there is no reason for there to be an issue where someone's valid header name would be blocked by this check;
  • Only 2 ASCII characters are not covered by either the whitelist from RFC 9110, or the HEADER_RE blacklist: / and ?. I can add them to the HEADER_RE regex if you would like to;
  • In principle, any byte above 0x80 would be forbidden by the standard too. Should they also be added to HEADER_RE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, I think that the HEADER_RE regular expression for header names is fine, in that it does not block any standard-valid input.

@jhominal
Copy link
Member Author

Another question @Kludex @tomchristie : would you be opposed to adding, in the exception messages, a repr of the invalid value? (note: h11 does that)

Comment on lines +514 to +522
async def test_response_header_splitting(http_protocol_cls: HTTPProtocol):
app = Response(b"", headers={"key": "value\r\nCookie: smuggled=value"})

protocol = get_connected_protocol(app, http_protocol_cls)
protocol.data_received(SIMPLE_GET_REQUEST)
await protocol.loop.run_one()
assert b"HTTP/1.1 500 Internal Server Error" not in protocol.transport.buffer
assert b"\r\nCookie: smuggled=value\r\n" not in protocol.transport.buffer
assert protocol.transport.is_closing()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this test about? This was already passing before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While working on this PR, I did not find a test that checked that invalid response header values were blocked by uvicorn (preventing, among other issues, header response splitting).

Thus, this test fulfills two purposes for me:

  1. It blocks me from implementing "allow horizontal tabs" by removing the regular expression check, either wilfully or inadvertently;
  2. It confirms that there is no need to go and change the h11 implementation for header splitting issues;

In other words, I believed that there was no other test that was covering that bit of functionality, so I decided to add one.

Copy link
Member

@Kludex Kludex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always great to review your work @jhominal :)

@Kludex
Copy link
Member

Kludex commented May 28, 2024

If you want, we can make a release right away. Please create a bump PR, if you want.

Also, feel free to merge this.

@jhominal jhominal merged commit 6d666d9 into encode:master May 28, 2024
15 checks passed
@jhominal jhominal deleted the tab-header-values-httptools branch June 5, 2024 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants