Add `readable` method to Socket #238

guyskk · 2017-12-17T15:02:58Z

I need this method for HTTP keep-alive check, and I found there is a writeable method but no readable method, here is the demo code for my use case:

try:
    await timeout_after(self.keep_alive_timeout, cli_sock.readable())
except TaskTimeout as ex:
    await cli_sock.close()

guyskk · 2017-12-18T05:39:46Z

The test test_worker_timeout failed, but I had no idea how to fix it.
And should I replace all await _read_wait(self._fileno) and await _write_wait(self._fileno) to
await self.readable() and await self.writeable() ?

dabeaz · 2017-12-19T11:42:22Z

I'm not really sure I fully understand the utility of a separate readable() method. Wouldn't a call to readable() almost always be followed by a call to actually read the data? If so, wouldn't it make more sense to put the timeout on the read operation instead?

For what it's worth, the writeable() method is there primarily to enable a non-greedy write. Normally, Curio, tries to write first and only blocks if the write fails. writeable() is there to flip the order around if you want--you can check to make sure a socket is writeable first and then only write if it would succeed. To be honest, I'm somewhat inclined to remove the writeable() method rather than give it a readable() sibling.

imrn · 2017-12-19T11:52:32Z

On 12/19/17, David Beazley ***@***.***> wrote: If so, wouldn't it make more sense to put the timeout on the read operation instead?

May be on the socket/stream itself.

dabeaz · 2017-12-19T11:59:23Z

No. Any blocking operation whatsoever can have a timeout applied to it. I'm not going to special case timeouts at the socket/stream level. What I mean, is why wouldn't you just do this:

 async with timeout_after(timeout):
          data = await sock.recv(maxsize)       # Or whatever

I don't see what having a separate blocking readable() method is really buying you.

guyskk · 2017-12-19T15:02:12Z

@dabeaz
Yes, a call to readable() is followed by a call to actually read the data, but the caller of readable() is something like connection manager, which should't actually read the data,
and the caller of read() is something like http parser, which will consume the data.
I don't like to implement the connection manager logic (eg: keep_alive_timeout) in http parser.

imrn · 2017-12-19T19:07:01Z

On 12/19/17, David Beazley ***@***.***> wrote: async with timeout_after(timeout): data = await sock.recv(maxsize) # Or whatever

Of course we can have this. But a timeout in socket/stream saves us from lots of context manager code lines similar to this. I admit it is an old habit from past sync era and I don't mind if you want to stick to async purism. Never the less, I already have such custom wrappers. But note that, embedded timeouts makes curio code more parallel to sync code, which is also a "feature" of curio.

I don't see what having a separate blocking `readable()` method is really buying you.

I guess it is also an old sync habit which I'll strongly oppose. Because -unlike the above syntactic sugar suggestion- this has the potential of disturbing async logic of the developer. It will encourage bad async style. The core idea of asyncs it to eliminate sync checks and loops. This seemingly innocent readable() func punches it badly.

On 12/19/17, kk ***@***.***> wrote: Yes, a call to readable() is followed by a call to actually read data, but the caller of readable() is something like connection manager, which should't actually read the data, and the caller of read() is something like http parser, which will consume the data. I don't like to implement the connection manager logic (eg: keep_alive_timeout) in http parser.

Let me guess: You are trying to implement http1.1 pipelining or http2.0 server or a middleware like proxy... Right? I can assure you it is doable elegantly without readable(). But you'll need connection manager to recycle or close the connections anyway since no single http request/response can decide on the fate of a connection. (Other than close requests or someother thing which I couldn't remember right now.) Someone has to have a wholistic view. No shortcuts there.

imrn · 2017-12-19T19:12:52Z

To be honest, I'm somewhat inclined to remove the writeable() method rather than give it a readable() sibling.

Agreed..

dabeaz · 2017-12-19T19:34:15Z

One of the reasons why there is no timeout parameter on calls is that literally every single blocking call in the whole library potentially allows a timeout. Early on, the API for that was getting to be insane. It made much more sense (to me anyway) to have some kind of wrapper that could apply a timeout to whatever you were doing if you really wanted it.

imrn · 2017-12-19T19:41:20Z

We think Curio as an async library. Right? How about this: Curio: Syncs.. As it should be..

CreatCodeBuild · 2017-12-19T19:43:32Z

In my own uses, writable() and readable() were never needed.

njsmith · 2017-12-20T01:05:00Z

The use case for writable is when you have a latency-sensitive protocol, and you want to delay making a decision about which bits to send until the kernel is ready to accept them - see #83. For example, imagine a screen-sharing protocol that loops on taking a screenshot and sending it -- if the network isn't ready to accept data, then you want to delay taking the screenshot until it is, rather than queue up stale data.

The use case for readable is very specific: if you have an HTTP client with a connection pool, then after using a connection you put it back in the pool and ignore it for a while. Later, when you decide to make another request to the same server, you pull the connection back out again to re-use. However, in the mean time, the server might have gotten bored and closed the connection (possibly after sending some 408 Request Timeout response). So when retrieving a connection from the pool, you want to do something like:

check if the socket is readable. If so, the server has given up. This isn't an error; it just means we need to close this connection and make a new one.
Once we have a connection that seems to be valid, send our request, while simultaneously listening for a response. At this point, an early response or socket close indicates an error → this request failed.

So you really do want a non-blocking "is there data there to read?" query here. In my efforts to port urllib3 to Trio, my plan so far is to just drop down to the lower level socket API here and call select or whatever, since this is such a weird and specialized thing that I don't want to put it into my abstract stream API.

Note in particular that for a TLS-encrypted connection, there isn't in general any way to check if the logical, application-level data stream is readable without actually trying to read data, but in this case that's not what we want anyway – here we really just want to check if there's any data on the underlying socket level.

guyskk · 2017-12-20T04:01:09Z

Let me guess: You are trying to implement http1.1
pipelining or http2.0 server or a middleware like
proxy... Right?

I'm tring to implement http1.1 persistent connection in server side, I try to show the wholistic view
in code below:

async def _worker(self, cli_sock, cli_addr):
    # browsers may preconnect but didn't send request immediately
    # keep-alive connection has similar behaviors
    # just close the connection if keep_alive_timeout exceed
    try:
        await timeout_after(self.keep_alive_timeout, cli_sock.readable())
    except TaskTimeout as ex:
        await cli_sock.close()
        return
    # parse request, will send `408 Request Timeout` response if request_timeout exceed
    parser = RequestParser(cli_sock, cli_addr)
    try:
        request = await timeout_after(self.request_timeout, parser.parse())
    except TaskTimeout as ex:
        response = Response(408)  # Request Timeout
    else:
        response = await self.app(request)
    # send response
    async for chunk in response:
        await cli_sock.sendall(chunk)
    # close the connection or wait for next request(keep-alive)
    if not response.keep_alive:
        await cli_sock.close()
    else:
        await spawn(self._worker(cli_sock, cli_addr), daemon=True)

The use case for readable is very specific: if you have an HTTP client with a connection pool, then after using a connection you put it back in the pool and ignore it for a while. Later, when you decide to make another request to the same server, you pull the connection back out again to re-use. However, in the mean time, the server might have gotten bored and closed the connection (possibly after sending some 408 Request Timeout response)

The ConnectionPool will check if the socket is readable,
but this can't be implemented by await sock.readable() because
await sock.readable() will block until socket is readable.
It needs a sync and non-blocking is_readable() method actually.

I implemented this check very ugly, in curequests(https://github.com/guyskk/curequests/blob/master/curequests/connection_pool.py#L75):

def _is_peer_closed(self):
    """check if socket in close-wait state"""
    # the socket is non-blocking mode, read 1 bytes will return EOF
    # which means peer closed, or raise exception means alive
    try:
        r = self.sock._socket_recv(1)  # FIXME: I use a private method, bad!
    except WantRead:
        return False
    except WantWrite:
        return False
    assert r == b'', "is_peer_closed shouldn't be called at this time!"
    return True

njsmith · 2017-12-20T06:49:00Z

Oh, I see, right, I got is_readable and wait_readable mixed up...

For the server, I don't see why you want a wait_readable. I can see how t might be useful if you want to provide a timeout for time-to-first-request-byte, but (a) I'm not sure what that specific timeout semantics are especially useful -- e.g. in this example curio web server I just imposed a timeout on time-to-end-of-request-headers. And (b) it doesn't seem like it'd be that hard to pass some argument into parser.parse saying "please wrap your first call in conn.recv (or whatever curio calls it) in a timeout_after with this timeout".

njsmith · 2017-12-20T06:51:04Z

Oh, and speaking of removing writable, I'm actually considering one approach for doing that in trio here: python-trio/trio#371

But I doubt Dave will like it :-)

imrn · 2017-12-20T07:08:57Z

This thread is an example of how that "evil" writeable() misled kk, spoiling his style and time. I'd suggest removing writable() and any other possible sync checks from curio. Trio may have them, as it explores such concepts.

dabeaz · 2017-12-20T10:34:18Z

The only reason I wouldn't like the Trio approach is that it's trying to make Trio do something--I don't want Curio to do anything. Doing things is the whole problem.

That said, I'm kind of inclined to remove writeable() and readable(). Mainly in the interest of having a smaller API, but also because they're kind of misleading (i.e., are people confused into thinking that such calls are necessary?). If one really needs to test is a socket is readable, I concur that one could run it through a zero-timeout select() call and find out. I'd consider it to be a special case. I'd also consider all uses of writeable() to be a special case that Curio doesn't really need to address directly.

njsmith · 2017-12-20T11:29:32Z

@dabeaz I was thinking you wouldn't like it because it would force every call to sendall to yield to the kernel, which was a point we disagreed on before :-). Probably also relevant to note though that that trio issue is talking about Trio's higher-level stream interface, the equivalent of curio.SocketStream rather than curio.io.Socket. (However, trio.SocketStream.wait_send_all_might_not_block is of course implemented using trio.socket.SocketType.wait_writable.)

guyskk · 2017-12-20T12:32:27Z

I agree with "having a smaller API" since my needs can be implemented without readable() method.
It's easy to add features but not easy to remove features.

dabeaz · 2017-12-23T01:45:47Z

This is more of a meta-question, but has anyone ever implemented a "screen-sharing application" or something similar to it in Python? If so, what library was used?

njsmith · 2017-12-23T02:19:50Z

@dabeaz Part of the reason that use case comes to mind so quickly for me is that long ago I wrote a little app called Xpra, which is still in reasonably wide use under different management. For networking I originally used glib (since I was using gtk for the UI anyway), and then later switched to threads (sigh) for better Windows compatibility. I'm not sure what they use these days.

Screen sharing is also the example that Apple used in their WWDC talk on TCP_NOTSENT_LOWAT and related things, I guess because it's so dramatic.

HTTP/2 is another protocol that acts like screen sharing in this regard.

Add readable method to Socket

1474022

guyskk closed this Dec 29, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `readable` method to Socket #238

Add `readable` method to Socket #238

guyskk commented Dec 17, 2017

guyskk commented Dec 18, 2017

dabeaz commented Dec 19, 2017

imrn commented Dec 19, 2017 via email

dabeaz commented Dec 19, 2017

guyskk commented Dec 19, 2017

imrn commented Dec 19, 2017 via email •

edited

Loading

imrn commented Dec 19, 2017 via email •

edited

Loading

dabeaz commented Dec 19, 2017

imrn commented Dec 19, 2017 via email •

edited

Loading

CreatCodeBuild commented Dec 19, 2017

njsmith commented Dec 20, 2017

guyskk commented Dec 20, 2017

njsmith commented Dec 20, 2017

njsmith commented Dec 20, 2017

imrn commented Dec 20, 2017 via email •

edited

Loading

dabeaz commented Dec 20, 2017

njsmith commented Dec 20, 2017

guyskk commented Dec 20, 2017

dabeaz commented Dec 23, 2017

njsmith commented Dec 23, 2017

Add readable method to Socket #238

Add readable method to Socket #238

Conversation

guyskk commented Dec 17, 2017

guyskk commented Dec 18, 2017

dabeaz commented Dec 19, 2017

imrn commented Dec 19, 2017 via email

dabeaz commented Dec 19, 2017

guyskk commented Dec 19, 2017

imrn commented Dec 19, 2017 via email • edited Loading

imrn commented Dec 19, 2017 via email • edited Loading

dabeaz commented Dec 19, 2017

imrn commented Dec 19, 2017 via email • edited Loading

CreatCodeBuild commented Dec 19, 2017

njsmith commented Dec 20, 2017

guyskk commented Dec 20, 2017

njsmith commented Dec 20, 2017

njsmith commented Dec 20, 2017

imrn commented Dec 20, 2017 via email • edited Loading

dabeaz commented Dec 20, 2017

njsmith commented Dec 20, 2017

guyskk commented Dec 20, 2017

dabeaz commented Dec 23, 2017

njsmith commented Dec 23, 2017

Add `readable` method to Socket #238

Add `readable` method to Socket #238

imrn commented Dec 19, 2017 via email •

edited

Loading

imrn commented Dec 19, 2017 via email •

edited

Loading

imrn commented Dec 19, 2017 via email •

edited

Loading

imrn commented Dec 20, 2017 via email •

edited

Loading