Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Many tests in /cookies/http-state consistently time out on Taskcluster but not on BuildBot #13204

Closed
Hexcles opened this issue Sep 25, 2018 · 20 comments

Comments

@Hexcles
Copy link
Member Author

Hexcles commented Sep 25, 2018

Looks like the root cause is a Unicode error from wptserve when there are non-ASCII characters in headers (cookies in this case).

To reproduce:

  1. Start wptserve and open localhost:8000
  2. Set a cookie with non-ASCII characters
document.cookie="foo=你好"
  1. Refresh the page

Traceback:

ERROR:web-platform-tests:Traceback (most recent call last):
  File "/usr/local/google/home/robertma/github/wpt/tools/wptserve/wptserve/server.py", line 544, in handle_one_request
    self.finish_handling_h1(request_line_is_valid)
  File "/usr/local/google/home/robertma/github/wpt/tools/wptserve/wptserve/server.py", line 229, in finish_handling_h1
    request = Request(self)
  File "/usr/local/google/home/robertma/github/wpt/tools/wptserve/wptserve/request.py", line 280, in __init__
    int(self.headers.get("Content-Length", 0)))
  File "/usr/local/google/home/robertma/github/wpt/tools/wptserve/wptserve/request.py", line 332, in headers
    self._headers = RequestHeaders(self.raw_headers)
  File "/usr/local/google/home/robertma/github/wpt/tools/wptserve/wptserve/request.py", line 385, in __init__
    headers = [maybedecode(items[header])]
  File "/usr/local/google/home/robertma/github/wpt/tools/wptserve/wptserve/request.py", line 363, in maybedecode
    return s.decode("ascii")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 4: ordinal not in range(128)

cc @jgraham @gsnedders

@Hexcles
Copy link
Member Author

Hexcles commented Sep 25, 2018

Is there a spec/RFC that says what encoding HTTP headers should use? UTF-8?

@gsnedders
Copy link
Member

A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII [USASCII].

and

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.

So, uh, not really.

@Hexcles
Copy link
Member Author

Hexcles commented Sep 26, 2018

Okay... so, what do we do? I mean, hard crashing the server and indefinitely hanging the test are probably not we want in any case...

Why does wptserve need to decode all header strings?

@gsnedders
Copy link
Member

Okay... so, what do we do? I mean, hard crashing the server and indefinitely hanging the test are probably not we want in any case...

I agree. 🙂

Why does wptserve need to decode all header strings?

That would be my thought. Is there any reason not to just give a bunch of bytes to the handler?

@gsnedders
Copy link
Member

@annevk @mnot thoughts?

@annevk annevk added the cookies label Sep 26, 2018
@annevk
Copy link
Member

annevk commented Sep 26, 2018

Looking at blame this is @FHorschig and @mikewest.

@gsnedders
Copy link
Member

@annevk this isn't about that specific test, it's about the wptserve behaviour

@gsnedders
Copy link
Member

@Hexcles actually is this just a regression from #11769? (cc/ @Ms2ger)

@Hexcles
Copy link
Member Author

Hexcles commented Sep 26, 2018

@gsnedders yes. I can confirm reverting that change fixes the problem.

@annevk
Copy link
Member

annevk commented Sep 26, 2018

Ah sorry, yes, either passing on bytes or applying https://infra.spec.whatwg.org/#isomorphic-encode.

@jgraham
Copy link
Contributor

jgraham commented Sep 26, 2018

Not decoding stuff seems reasonable but it's going to need a full test run to find all the cases it breaks. Does someone have time to work on that?

@annevk
Copy link
Member

annevk commented Sep 26, 2018

It's probably more reasonable to do isomorphic decoding since consumers likely want strings anyway as they're easier to operate on.

@gsnedders
Copy link
Member

@jgraham pretty sure this is just a regression from #11769 and we didn't use to decode at all?

@annevk we should probably expose the raw bytes somewhere, given there are cases of other encoding being used, no?

@annevk
Copy link
Member

annevk commented Sep 26, 2018

@gsnedders we could, but I don't think it's needed (there's no information loss with isomorphic). If you really wanted to reinterpret you could, but for tests that's typically not what you ought to be doing (you want exact comparisons on the return values).

@Hexcles
Copy link
Member Author

Hexcles commented Sep 26, 2018

Given that this is a regression and we don't run wptserve with Python3 in any critical places, shall we revert #11769 for now?

Ms2ger added a commit that referenced this issue Sep 27, 2018
Ms2ger added a commit that referenced this issue Sep 27, 2018
Hexcles added a commit that referenced this issue Sep 27, 2018
The test sends a request to wptserve with non-ASCII characters in a
header and sets up a simple handler to return the value of that header.
The server shouldn't crash in either Python 2 or 3, and the response
should not be garbled. The server crashes in Python 2 (#13204).
@gsnedders
Copy link
Member

@Hexcles yeah, that was my kinda implicit suggestion before

any objection @Ms2ger?

gsnedders added a commit to gsnedders/web-platform-tests that referenced this issue Sep 27, 2018
gsnedders added a commit to gsnedders/web-platform-tests that referenced this issue Sep 28, 2018
@foolip
Copy link
Member

foolip commented Sep 28, 2018

This regression is blocking https://chromium-review.googlesource.com/c/chromium/src/+/1249141. Gecko automatically imported the change in mozilla/gecko-dev@0ad26e7 although I don't know if also broke the tests.

Per https://web-platform-tests.org/appendix/reverting.html I think reverting this back in #13204 (comment) would have been reasonable. Is #13246 the alternative fix? That's not a tiny change, can we revert first and combine the two commits for relanding?

@gsnedders
Copy link
Member

#13248 is a PR to revert it.

@foolip
Copy link
Member

foolip commented Sep 28, 2018

OK, I've reopened and approved. @Ms2ger, can you take a look?

Hexcles pushed a commit that referenced this issue Sep 28, 2018
Hexcles added a commit that referenced this issue Sep 28, 2018
The test sends a request to wptserve with non-ASCII characters in a
header and sets up a simple handler to return the value of that header.
The server shouldn't crash in either Python 2 or 3, and the response
should not be garbled. The server crashes in Python 2 (#13204).
@Hexcles Hexcles reopened this Sep 28, 2018
Ms2ger pushed a commit that referenced this issue Oct 4, 2018
The test sends a request to wptserve with non-ASCII characters in a
header and sets up a simple handler to return the value of that header.
The server shouldn't crash in either Python 2 or 3, and the response
should not be garbled. The server crashes in Python 2 (#13204).
Ms2ger pushed a commit that referenced this issue Oct 9, 2018
The test sends a request to wptserve with non-ASCII characters in a
header and sets up a simple handler to return the value of that header.
The server shouldn't crash in either Python 2 or 3, and the response
should not be garbled. The server crashes in Python 2 (#13204).
Hexcles added a commit that referenced this issue Oct 9, 2018
The test sends a request to wptserve with non-ASCII characters in a
header and sets up a simple handler to return the value of that header.
The server shouldn't crash in either Python 2 or 3, and the response
should not be garbled. The server crashes in Python 2 (#13204).
Hexcles added a commit that referenced this issue Oct 9, 2018
The test sends a request to wptserve with non-ASCII characters in a
header and sets up a simple handler to return the value of that header.
The server shouldn't crash in either Python 2 or 3, and the response
should not be garbled. The server crashes in Python 2 (#13204).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants