
SSL support and a Stream API #107

Merged
merged 78 commits into from
Jun 12, 2017
Conversation

@njsmith njsmith commented Mar 20, 2017

This is currently a first attempt with no tests, I'm not sure I understand SSLObject's semantics, and even if I got that right then the implementation is still a bit tricky for my taste (and I'm not 100% sure it's correct). But AFAICT it does work!

My current understanding of SSLObject is:

  • It provides 4 I/O operations: do_handshake, read, write, and unwrap
  • I/O operations never raise SSLWantWriteError
  • They may or may not alter the buffers (in particular, writing something to the outgoing buffer), and then they either succeed or raise SSLWantReadError.
  • If they succeed, then we need to send on any outgoing data, and then we've succeeded
  • If they raise SSLWantReadError then we need to send on any outgoing data, refill the incoming buffer, and then retry the operation.

In particular, we assume that we can synchronously send any outgoing data before refilling the incoming buffer. I think (hope?) this is safe according to the SSL state machine and won't cause deadlocks, but am not 100% sure.

If this is wrong then the code is definitely broken. If it's correct then the code might be correct :-).
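The retry model described above can be sketched synchronously. Everything here (`FakeSSLObject`, `FakeTransport`, `retry`) is a hypothetical stand-in for `ssl.SSLObject` and the real async machinery, just to show the "flush outgoing, then refill incoming, then retry" ordering:

```python
import ssl

class FakeSSLObject:
    """Hypothetical stand-in for ssl.SSLObject: each call queues outgoing
    data, and the operation needs two incoming refills before it succeeds."""
    def __init__(self):
        self.outgoing = bytearray()   # data waiting to be flushed to the transport
        self.incoming = bytearray()   # data fed in from the transport
        self._refills_needed = 2

    def do_handshake(self):
        self.outgoing += b"<hs>"            # ops may write to the outgoing buffer...
        if self._refills_needed > 0:        # ...then succeed or raise SSLWantReadError
            self._refills_needed -= 1
            raise ssl.SSLWantReadError()

class FakeTransport:
    """Hypothetical transport that records what was sent."""
    def __init__(self):
        self.sent = b""
    def send(self, data):
        self.sent += data
    def recv(self):
        return b"<peer-data>"

def retry(op, sslobj, transport):
    while True:
        want_read = False
        try:
            result = op()
        except ssl.SSLWantReadError:
            want_read = True
        # Either way: first flush whatever the op queued in the outgoing buffer
        if sslobj.outgoing:
            transport.send(bytes(sslobj.outgoing))
            sslobj.outgoing.clear()
        if not want_read:
            return result                   # the operation succeeded
        # Refill the incoming buffer, then retry the same operation
        sslobj.incoming += transport.recv()

sslobj = FakeSSLObject()
transport = FakeTransport()
retry(sslobj.do_handshake, sslobj, transport)
print(transport.sent)   # b'<hs><hs><hs>'
```

The key property being assumed is that flushing outgoing data before refilling never deadlocks; the sketch makes that ordering explicit in one place.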

Well, and unwrap is definitely broken right now, because it doesn't seem to follow the "keep trying until you succeed" model. Need to investigate further.

Implementing some kind of suppress_ragged_eofs functionality would probably be good, given how pervasive this is.
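For context, `suppress_ragged_eofs` in the stdlib's `ssl` module just means treating an abrupt transport close as a clean end-of-stream. A minimal sketch of that behavior (function names are illustrative, not a real API):

```python
import ssl

def recv_suppressing_ragged_eofs(do_recv):
    """Call a recv-like operation, but translate an abrupt transport
    close (ssl.SSLEOFError) into a clean EOF, like the stdlib's
    suppress_ragged_eofs=True default on wrapped sockets."""
    try:
        return do_recv()
    except ssl.SSLEOFError:
        return b""   # pretend the peer shut down cleanly

def abrupt_close():
    raise ssl.SSLEOFError("EOF occurred in violation of protocol")

print(recv_suppressing_ragged_eofs(abrupt_close))  # b''
```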

This currently allows multiple tasks to be in recv or sendall at the same time (as a side-effect of allowing recv and sendall to coordinate with each other) – this probably encourages bad habits. I guess a simple boolean "lock" on each method would be enough to stop it.
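The boolean "lock" could look something like this sketch (the names, including `ResourceBusyError`, are illustrative rather than trio's actual API): instead of waiting, a second concurrent entry fails immediately, flagging the caller's bug:

```python
class ResourceBusyError(Exception):
    """Raised when two tasks use the same method concurrently."""

class ConflictDetector:
    """Not a real lock: a second concurrent entry raises instead of
    blocking, since overlapping recv/sendall calls indicate a bug."""
    def __init__(self, what):
        self._what = what
        self._held = False

    def __enter__(self):
        if self._held:
            raise ResourceBusyError(self._what)
        self._held = True

    def __exit__(self, *exc):
        self._held = False

detector = ConflictDetector("another task is already using recv")
with detector:
    pass                    # normal single-task use is fine
with detector:
    try:
        with detector:      # simulated second concurrent entry
            pass
    except ResourceBusyError as exc:
        print("conflict:", exc)
```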

njsmith commented Mar 20, 2017

cc @dabeaz: I suspect you won't find this useful at all, but hey :-)

codecov-io commented Mar 20, 2017

Codecov Report

Merging #107 into master will increase coverage by 0.56%.
The diff coverage is 99.59%.


@@            Coverage Diff             @@
##           master     #107      +/-   ##
==========================================
+ Coverage   98.44%   99.01%   +0.56%     
==========================================
  Files          53       59       +6     
  Lines        6116     7996    +1880     
  Branches      476      568      +92     
==========================================
+ Hits         6021     7917    +1896     
+ Misses         80       62      -18     
- Partials       15       17       +2
Impacted Files Coverage Δ
trio/tests/test_util.py 100% <100%> (ø) ⬆️
trio/tests/test_socket.py 100% <100%> (ø) ⬆️
trio/_streams.py 100% <100%> (+37.28%) ⬆️
trio/tests/test_sync.py 100% <100%> (ø) ⬆️
trio/tests/test_testing.py 100% <100%> (ø) ⬆️
trio/_network.py 100% <100%> (ø)
trio/_core/_exceptions.py 100% <100%> (ø) ⬆️
trio/__init__.py 100% <100%> (ø) ⬆️
trio/_core/_run.py 100% <100%> (ø) ⬆️
trio/_util.py 86.07% <100%> (+4.72%) ⬆️
... and 24 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f651bf...b61535f. Read the comment docs.

njsmith added 25 commits April 30, 2017 20:49
Still should probably rename wait_writable
Should (hopefully) fix travis builds
Forgetting to call do_handshake gives you a SSL connection that works
perfectly except that it has no MITM protection:
  https://bugs.python.org/issue30141
a java echo server! a pyopenssl equivalent that somehow seems to be
working now? the body of _retry is duplicated 3 times so I can get
more detailed debugging and coverage information depending on who
called it!
Maybe a bit fragile though... checkpointing before attempting a new idea
And use it in MockClock to get better behavior when using
wait_all_tasks_blocked and autojump_threshold=0 together. Potentially
useful in general as well.
This is starting to look like something real!

And ssl.py is approaching a 2:1 ratio of comments to code, which is
probably about right.
(it's an AbstractAsyncContextManager, but that doesn't exist, even in
py36)
njsmith commented May 24, 2017

Two potential minor additions that occurred to me:

  • Adding support for trailing data on initialization, for weird STARTTLS where the client initiates the upgrade without waiting. Can't think of any cases where this is actually needed, but it's easy to implement and I guess theoretically useful for completeness...

  • If send_all is called and it implicitly triggers do_handshake, then "cork" the final send_all call in the handshake (if any) so that the end of the handshake can be bundled up in the same packet as the first bit of data. Basically this would mean passing an argument to _retry that says "if you're finished except for having data to send, then just leave the data in the outgoing data buffer", and then send_all would immediately call _retry again and flush that data out.

njsmith commented May 28, 2017

Another thing to think about in the future, but probably not right now: automagically optimizing TLS record size. (SSLObject doesn't expose this directly, but basically it has to create a separate record each time you call write, so if someone passes a large chunk of data to send_all, we can split it up manually.) Possibly this is something to implement downstream first and then think about merging back into trio proper.
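The manual splitting could be a simple chunking loop over the payload. A sketch, assuming the standard 16 KiB TLS plaintext record limit (the `write` callable stands in for `SSLObject.write`):

```python
MAX_RECORD = 16384  # TLS plaintext record size limit (2**14 bytes)

def write_in_records(write, data, max_record=MAX_RECORD):
    """Feed data to an SSLObject-style write() one record-sized chunk
    at a time, instead of letting one huge write become one huge record."""
    view = memoryview(data)
    for start in range(0, len(view), max_record):
        write(view[start:start + max_record])

# Demo with a stand-in write() that just records chunk sizes:
chunks = []
write_in_records(lambda b: chunks.append(bytes(b)), b"x" * 40000)
print([len(c) for c in chunks])  # [16384, 16384, 7232]
```

Tuning the chunk size (e.g. smaller records early in a connection to reduce latency) is exactly the kind of policy that could live downstream first.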

njsmith commented Jun 5, 2017

Staring at the urllib3 code is making me wonder if we should jump through the necessary hoops so that when SSLStream.send_all is cancelled, at least the receive half of the stream survives.

Rationale: they want the following semantics: initially, we start concurrently sending a request and receiving the response headers. Normally, we send the whole request, and then the response headers arrive. But sometimes, when we're only partly through sending the request, the response headers arrive preemptively. Either way, as soon as response headers arrive, we need to stop sending and return (!) the response headers. Then, our caller will make separate calls (!!) to ask us to read chunks of the response body (like an iterator).

So the problem here is that we need to start up a task to send the request, and then after we get the response headers we need to either stop this task or else let it keep running in the background. Either is OK and would be correct, but the only reliable way we have to stop it is to cancel it (we can't assume it will notice any gentler method, like setting a flag, because it might be blocked on I/O). And letting it continue after we return from the method is quite awkward, because it means that someone higher up the stack has to take responsibility for it.

I'm not sure we should rule out the "run in background" solution, because I know Cory wants to add HTTP/2 support to requests as well, and that will mean finding some way to have a task running in the background for as long as each connection is alive, because someone has to respond promptly to pings, and solving that might provide a solution here as well. But for this comment let's consider the version where we make cancellation work.

To do this: first, if self.transport_stream.send_all is cancelled, we need some way to know how many bytes were actually sent. This is easy enough for SocketStream.send_all to report, by attaching some attribute to the Cancelled error. Maybe the report should also have the stream's id() attached to it, so that the receiver can verify that it didn't leak through from some other level. Then if we get this report, we need to save off the data that we haven't sent yet, somewhere where it can be found by the next task to enter the _inner_send_lock critical section. And of course whenever we enter _inner_send_lock, we need to check for this special saved data and prepend it to whatever we were going to send.

The trickiest thing is how to report the resulting state. Technically at this point the SSLStream is in full working order. But... we don't want to report that we only sent some of the data, because then our caller will think that it needs to resend some. OTOH if we report that we sent all of the data, then our caller may think that the other side has received it and be waiting for them to respond, when in fact it's still sitting in our internal buffer. Some options:

  • Extend the attribute-attached-to-Cancelled-error-from-send_all protocol so that it has some way to say "all data was accepted for sending, but it won't actually be sent until you call send_all again". Maybe separate data_accepted and data_transmitted fields? And declare that send_all(b"") needs to send all the buffered stuff? (this probably would need special casing to implement)

  • Intentionally break the send half of the SSLStream, i.e. set some flag so that future calls to send_all will error out with BrokenStreamError. This isn't technically necessary; we could make them succeed, but it avoids the problem of trying to explain what happened and provide mechanisms to fully recover... and in the meantime, all that machinery above is still useful, because it means that receive calls are still safe to continue. Which, if you're an HTTP client trying to deal with a preemptive response, is enough.

I'm leaning towards the latter. When someone cancels a send, they're saying they don't really want to send it, and in fact it's urgent that they stop sending it immediately, to the point that it's worth risking losing sync over. In general I don't think it's a requirement that send_all implementations must be able to recover from cancellation. (Though this should probably be thought through carefully in light of #147.) So this motivates the thing where other calls to _retry won't notice and try to re-send the residual data unless they really have to. And given all this it makes sense not to get super fancy.

Though, hmm, note that for TLS-over-TLS to survive cancellation, we actually do need the inner stream to fully survive. So that's a pretty good argument for the former. (Though this doesn't need an explicit flush command; it's enough that everything will be flushed the next time it does a non-trivial send_all call.)
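A sketch of the "break the send half" option (all names here are hypothetical; a real version would hook into trio's cancellation machinery rather than a `simulate_cancel_after` flag):

```python
class Cancelled(Exception):
    """Hypothetical stand-in for trio.Cancelled."""

class BrokenStreamError(Exception):
    """Raised by later sends once the send half is broken."""

class SendHalf:
    """Sketch: a send cancelled mid-operation marks the send half broken,
    so future sends fail fast instead of risking a desynced TLS stream.
    The receive half would be unaffected and remain safe to use."""
    def __init__(self):
        self._broken = False
        self.sent = b""

    def send_all(self, data, *, simulate_cancel_after=None):
        if self._broken:
            raise BrokenStreamError("send half broken by an earlier cancelled send")
        if simulate_cancel_after is not None:
            # Pretend we were cancelled partway through transmitting:
            self.sent += data[:simulate_cancel_after]
            self._broken = True
            raise Cancelled()
        self.sent += data

half = SendHalf()
half.send_all(b"hello ")
try:
    half.send_all(b"world", simulate_cancel_after=2)
except Cancelled:
    pass
print(half.sent)   # b'hello wo' -- only part of the data went out
```

Note how the partial byte count survives on the object, which is the same information the attribute-on-Cancelled protocol would carry.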

njsmith commented Jun 7, 2017

<lukasa> njs: Yeah, so my TL;DR is that it looks reasonable
<lukasa> As with any chunk of code that large I can only review it in the abstract, and so with that in mind there are no obvious howlers
<lukasa> But the proof will be actually running that code for a while
<njs> lukasa: sure, and the test suite is reasonably thorough. Mostly I just wanted someone else to review that my understanding of openssl's API was correct, because there are some crucial things that I kind of had to guess based on what made sense rather than actually found written in any documentation :-)
<lukasa> Yeah, it seems right to me
<lukasa> Woo OpenSSL
<lukasa> Handily using the Python APIs removes some of the real major footguns that are available
<njs> oh for sure
<njs> while adding a few of its own :-)
<njs> lukasa: thanks a lot!

njsmith commented Jun 7, 2017

I think I'll merge this tomorrow, when I'm more awake.

Todo:

  • skim over to remind myself what the state of everything is
  • remove some defunct notes-to-self from the bottom of test_ssl.py
  • take any remaining notes-to-self, and the notes-to-self that are in comments up above, and file them as proper issues

@@ -29,6 +29,18 @@ if [ "$USE_PYPY_NIGHTLY" = "1" ]; then
source testenv/bin/activate
fi

if [ "$USE_PYPY_RELEASE" = "1" ]; then
curl -Lo pypy.tar.bz2 https://bitbucket.org/squeaky/portable-pypy/downloads/pypy3.5-5.7.1-beta-linux_x86_64-portable.tar.bz2
Contributor:

I believe this is not beta anymore.

Member Author:

That's the URL where he posted it; what do you want me to do here :-). Note that "beta" here doesn't mean "this is a beta of 5.7.1", it means "this is 5.7.1, and watch out b/c the py3 support is still considered beta quality".

Contributor:

Ah, my bad. Ignore comment then.

ssl_context.check_hostname = False
s = SSLStream(StapledStream(process.stdin, process.stdout), ssl_context)

[Note: subprocess support is not implemented yet, but that's the
Contributor:

.. note ::

Member Author:

I think I'll leave this as is for now, it's not really a call-out-box kind of note, more like a whispered parenthetical.

If that makes any sense at all, which it probably doesn't.

self._held = False


class UnLock:
Contributor:

The naming of UnLock (if I understand correctly) seems weird and counterintuitive. I'm going to guess many people will read it as unlocking something. Maybe naming it "Restrictive" or "Exclusionary"? Or RaisingLock?

Member Author:

Yeah, this is definitely a bit over-cutesy. (Note that it isn't a public API, though.) Also it should probably be refactored a bit since currently it's written as a generic object that can raise any exception, but in fact every single caller uses it to raise ResourceBusy.

@@ -705,3 +741,5 @@ def _make_simple_sock_method_wrapper(methname, wait_fn, maybe_avail=False):
# else:
# raise OSError("getaddrinfo returned an empty list")
# __all__.append("create_connection")


Contributor:

Read until here so far. I'm not going to assume I can wrap all that in my head, but I did not see anything particularly wrong. For now I'm going to switch tasks a bit.

@buhman buhman mentioned this pull request Jun 10, 2017
:class:`~trio.ssl.SSLStream`::

# Get a raw SocketStream connection to the proxy:
s0 = await open_tcp_stream("proxy", 443)
Member:

It doesn't look like open_tcp_stream exists.

Member Author:

Yeah, that's correct, but #145 will eventually add it I think :-).

In general this PR has gotten kind of out of hand, so I want to try and get it merged and turn most of these comments into follow-up issues. And the docs in particular are in a bit of a mess because there isn't really a good way to document a library that has half of a high-level networking API implemented... best thing is to finish it and then figure out how to make the docs good :-). This is a useful reminder to add some sort of "please excuse our dust" note to the docs in the interim though...

njsmith commented Jun 11, 2017

I added some "excuse our dust" notes to the docs in d1144b8, filed followups for the other issues discussed above: #195, #196, #197, #198

They're either todone or filed as separate issues now.
njsmith commented Jun 11, 2017

See also #199, #227

...I feel like I just started this PR, how did we get from #107 to the 200s during it

@njsmith njsmith changed the title [wip] SSL support SSL support and a Stream API Jun 12, 2017
njsmith commented Jun 12, 2017

Okay, let's do this! Wooo!

@njsmith njsmith merged commit ec4cdee into python-trio:master Jun 12, 2017
@njsmith njsmith deleted the ssl branch June 12, 2017 07:26
buhman commented Jun 12, 2017

👍

imrn commented Mar 31, 2018

Is outgoing MemoryBIO flushed out when unwrap is called?

njsmith commented Mar 31, 2018 via email
