
Please use TCP_NODELAY at least for websockets. #664

Closed
socketpair opened this issue Dec 7, 2015 · 29 comments

@socketpair
Contributor

sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

Tornado sets TCP_NODELAY at the end of every message: when Tornado knows it has written the last chunk, it sets TCP_NODELAY to 1. After the write buffer is flushed, it sets TCP_NODELAY back to 0.

This helps a lot. For example, see https://github.com/benjamin-hodgson/asynqp/pull/41/files and grep the Tornado sources for TCP_NODELAY.
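The toggling pattern can be sketched with plain sockets (a minimal illustration of the idea, not Tornado's or aiohttp's actual code; `send_response` is a hypothetical helper):

```python
import socket

def send_response(sock: socket.socket, chunks: list) -> None:
    """Hypothetical helper illustrating the Tornado-style toggle:
    write everything, then flip TCP_NODELAY on so the kernel flushes
    the tail immediately instead of Nagle-delaying it."""
    for chunk in chunks:
        sock.sendall(chunk)
    # Enabling NODELAY pushes out any bytes Nagle's algorithm queued.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Restore the default so later small writes coalesce again.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0)
```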

@socketpair
Contributor Author

But be careful; see tornadoweb/tornado@91fd578

@asvetlov
Member

asvetlov commented Dec 7, 2015

As I see it, Tornado toggles NODELAY at the end of every HTTP request handling. That may make sense in some situations, but I'd like to have at least a flag for configuring the behavior and a benchmark showing the effect.

For websockets there is a method for switching NODELAY on (disabled by default). It may be useful, but users can already do it right now via request.transport.get_extra_info('socket').setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1).
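Until such a flag exists, a thin wrapper over that public API might look like this (a sketch; `enable_nodelay` is a made-up name, and it works with any asyncio transport that exposes its raw socket via get_extra_info):

```python
import socket

def enable_nodelay(transport) -> None:
    """Enable TCP_NODELAY on the socket underlying an asyncio transport.

    Works with any transport exposing its socket via
    get_extra_info('socket'), e.g. request.transport in aiohttp.
    """
    sock = transport.get_extra_info('socket')
    # TCP_NODELAY only makes sense for TCP (IPv4/IPv6) sockets.
    if sock is not None and sock.family in (socket.AF_INET, socket.AF_INET6):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```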

@socketpair
Contributor Author

@asvetlov

  1. For non-websockets, this will speed up delivery of the last chunk(s) of the HTTP response body. This is especially useful for short answers like 404 or '{success: true}'.
  2. For websockets, TCP_NODELAY should be set by default. In any case, since we write a websocket message as one big chunk, this will not hurt performance in any way. See the conversation Not using TCP_NODELAY reduces performance considerably benjamin-hodgson/asynqp#40

@asvetlov
Member

asvetlov commented Dec 8, 2015

  1. Disabling Nagle's algorithm may hurt performance in many cases, while enabling it may lead to degradation in other situations. Thus we should have a reasonable default and a way for the user to toggle the behavior.
  2. I insist on keeping the default OS configuration: TCP_NODELAY is off if not specified.
  3. There is a way to enable NODELAY using the current public API (while I'm open to considering a Pull Request adding nodelay functionality to the StreamResponse object).

@kxepal
Member

kxepal commented Dec 8, 2015

+1 to have this as an option. While in most situations it's better to set this flag on, it is not a general rule. However, having an API nicer than

request.transport.get_extra_info('socket').setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

won't hurt a lot.

@socketpair
Contributor Author

  1. Since the stream API buffers write requests, Python will virtually never issue many send syscalls with small portions of data. The reasonable default here is not what kernel developers (and standards writers) assumed, just as the default O_CLOEXEC behaviour was changed in Python 3.5. Yes, from the kernel's point of view it is reasonable to enable Nagle's algorithm, but we have many caching middlewares, so we don't need the kernel to compensate for bad buffering in userspace. Also, can you provide any case where it will hurt performance? I mean some cases of HTTP or websocket usage.
  2. The default OS behaviour is suitable neither for the stream API nor for transactional-like TCP usage.
  3. OK, I will create a PR for asyncio about setting nodelay.

@asvetlov
Member

asvetlov commented Dec 9, 2015

I don't want to change aiohttp defaults without benchmarking.
Also, I suspect that if aiohttp is deployed behind nginx, TCP_NODELAY doesn't change performance at all.

@socketpair
Contributor Author

@asvetlov

  1. Benchmarking has been done by me in asynqp: Not using TCP_NODELAY reduces performance considerably benjamin-hodgson/asynqp#40 (comment) (performance increased tenfold).
  2. It seems you misunderstand what TCP_NODELAY does. When this option is not set (the default), the kernel will not send a TCP packet immediately after the send() syscall completes. Instead, it waits for about 0.01 seconds before sending. So when an aiohttp application sends a response to nginx, the problem is still there. If the asyncio app and nginx are on the same machine, the problem may go unnoticed, since TCP over the local loopback works differently in the kernel (with shortcuts).

This problem does not show up in bulk data transfer, like file downloading/uploading or large HTTP bodies. It appears when TCP is used in a short-message-passing pattern (i.e. websockets, small HTTP responses, long polling), since the kernel adds a delay before actually sending each message.
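The latency effect can be probed with a tiny echo benchmark (a sketch; `measure_rtt` is a made-up name, and per the loopback caveat above, the Nagle delay may be invisible on localhost, so absolute numbers will vary):

```python
import socket
import threading
import time

def echo_server(srv: socket.socket) -> None:
    """Accept one connection and echo everything back until EOF."""
    conn, _ = srv.accept()
    with conn:
        while (data := conn.recv(64)):
            conn.sendall(data)

def measure_rtt(nodelay: bool, rounds: int = 50) -> float:
    """Average round-trip time of tiny request/response messages,
    with TCP_NODELAY either on or off on the client side."""
    srv = socket.socket()
    srv.bind(('127.0.0.1', 0))
    srv.listen(1)
    threading.Thread(target=echo_server, args=(srv,), daemon=True).start()
    cli = socket.create_connection(srv.getsockname())
    cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, int(nodelay))
    start = time.perf_counter()
    for _ in range(rounds):
        cli.sendall(b'ping')   # small write: a candidate for Nagle delay
        cli.recv(64)           # wait for the echo before the next write
    cli.close()
    srv.close()
    return (time.perf_counter() - start) / rounds
```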

@fafhrd91
Member

fafhrd91 commented Dec 9, 2015

I think TCP_NODELAY is a reasonable default. We just need an API, and I'd add TCP_CORK support as well.

@fafhrd91
Member

fafhrd91 commented Dec 9, 2015

Btw, asyncio does not buffer writes, and the same goes for aiohttp, so even small responses generate multiple send calls. It's worse for chunked responses: aiohttp sends the delimiters as separate writes.
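For illustration, chunked transfer encoding frames each chunk with a hex size line and a trailing delimiter; if each part reaches the kernel in its own send() call, NODELAY turns each into its own packet. A sketch with a pluggable send callable (`write_chunk` is a hypothetical helper, not aiohttp's code):

```python
def write_chunk(send, payload: bytes) -> None:
    """Emit one HTTP/1.1 chunked-encoding chunk as three separate
    writes: hex size line, payload, trailing CRLF delimiter."""
    send(f"{len(payload):x}\r\n".encode("ascii"))  # size line
    send(payload)                                  # chunk data
    send(b"\r\n")                                  # delimiter
```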

@fafhrd91
Member

fafhrd91 commented Dec 9, 2015

Asyncio buffers writes only under I/O pressure.

@socketpair
Contributor Author

OKAY, let me clear up a few points.

  1. Yes, @fafhrd91 is right: multiple small send() syscalls may be issued during normal operation.
  2. Tornado sets TCP_NODELAY after the last byte of the HTTP response is written. In other words, Tornado works as follows:

     .send()
     .send()
     .send()
     .send('last_chunk')
     set_nodelay(True)
     try:
         await .drain()
     finally:
         set_nodelay(False)

     In that scheme everything works perfectly. But for websockets, nodelay is set unconditionally (?). Also, the Tornado docs state that the delay added by the kernel may be up to 500 ms (!).

  3. When we talk about websockets, it is desirable to set TCP_NODELAY once the socket is negotiated and never switch it off, since the API for sending websocket messages implies sending one big chunk of data at once.

@asvetlov
Member

asvetlov commented Dec 9, 2015

I looked into the effect of setting TCP_NODELAY.
Yes, it improves latency very well. See #671

NGINX uses a different approach than Tornado. It has a tcp_nodelay configuration parameter (on by default) which is applied only to keepalive connections.
That sounds very reasonable.

It also has a tcp_nopush parameter, which forces TCP_CORK only when sendfile is used. That may be good for NGINX. The effect of pushing CORK before sending headers needs additional investigation.

I'm OK now with setting TCP_NODELAY on websocket negotiation.

@fafhrd91
Member

fafhrd91 commented Dec 9, 2015

tcp_cork should be configurable.


@socketpair
Contributor Author

Okay, good news about websockets.

It seems that 0ac669f sets TCP_NODELAY for any HTTP response... doesn't it? I think it should do so only for websockets, and use the Tornado approach for non-websockets.

@asvetlov
Member

It's not for any response, but only for keepalive connections.
That's how nginx works.
If you believe that Tornado handles HTTP connections better than nginx, please make a benchmark.

@fafhrd91
Member

what about chunked responses?

@fafhrd91
Member

Basically, any response that has a notion of streaming needs tcp_nodelay.

@asvetlov
Member

Agreed. But aiohttp has no knowledge of whether a response is streamed or not.
The user should call resp.set_tcp_nodelay(True) manually.

@fafhrd91
Member

I'd make it 'on' by default and 'off' manual. But I am fine with your approach.

@socketpair
Contributor Author

Well, @asvetlov: if nginx never sends small portions of data, or resends all the data it receives on another socket, it is good to set nodelay unconditionally in order to, surprise, minimize delay, but also to minimize packets per second.

As @fafhrd91 said, aiohttp does send small pieces to the kernel in separate send syscalls. So, to make TCP perform best, we should either turn nodelay on unconditionally and optimize our code, or set and remove nodelay as Tornado does.

I think this is clear; if not, I will try to describe it in more detail.

In the current situation (where you enabled nodelay for all keepalive connections), performance will be slightly lower than before, since each delimiter of a chunked response will be sent in a separate TCP packet.

It is not hard to verify with tcpdump. Why not use the Tornado approach? No one has said why it is worse.

@asvetlov
Member

Because I agree with @fafhrd91 -- nodelay should be on by default, both for websockets and stream responses.
TCP_CORK or TCP_NOPUSH may be used for further optimizations (sending headers, non-streaming response handling and file transfers).

I feel this way is a bit better than the Tornado approach.

Also, asyncio has no such callback as Tornado has.
.drain() guarantees that the write buffer is smaller than the low-watermark, but not strictly empty.
It was invented to solve the flooding problem, but it cannot help if a signal for buffer exhaustion is needed.
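That watermark behavior can be observed directly (a sketch; `probe_drain` is a made-up name, and the leftover byte count depends on how quickly the loopback drains, so it may well be zero):

```python
import asyncio

async def probe_drain() -> int:
    """Show that StreamWriter.drain() waits for the write buffer to
    fall below a watermark, not necessarily to empty entirely."""
    async def sink(reader, writer):
        await reader.read()          # consume everything until EOF
        writer.close()

    server = await asyncio.start_server(sink, '127.0.0.1', 0)
    host, port = server.sockets[0].getsockname()[:2]
    reader, writer = await asyncio.open_connection(host, port)
    # Shrink the watermarks so a modest write trips flow control.
    writer.transport.set_write_buffer_limits(high=64)
    writer.write(b'x' * 65536)
    await writer.drain()             # may return with bytes still queued
    leftover = writer.transport.get_write_buffer_size()
    writer.close()
    server.close()
    await server.wait_closed()
    return leftover
```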

@asvetlov
Member

UPD.
The response headers are sent as a single chunk: https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/protocol.py#L660
A non-streaming response is also optimized to send everything in one data portion: https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/web_reqrep.py#L817

Thus the TCP_CORK/TCP_NOPUSH optimization is not bad and makes sense at least for file transfers, but it is not extremely important.

@socketpair
Contributor Author

@asvetlov
Member

Good catch!
While chunked encoding is not as widespread as the content-length case, it is worth optimizing.
Done by 5d5017e

@socketpair
Contributor Author

But anyway, the chunk header and chunk data will be written in separate syscalls, so TCP_CORK or temporarily switching nodelay off would improve performance.

The same applies to other places, like the HTTP headers and body.
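The corking idea can be sketched like this (a hypothetical helper, not aiohttp code; TCP_CORK is Linux-only, so the sketch degrades to plain writes elsewhere):

```python
import socket

def send_corked(sock: socket.socket, parts) -> None:
    """Coalesce several small writes (e.g. chunk header + chunk body
    + delimiter) into as few packets as possible using TCP_CORK."""
    cork = getattr(socket, 'TCP_CORK', None)  # Linux-only option
    if cork is not None:
        sock.setsockopt(socket.IPPROTO_TCP, cork, 1)
    try:
        for part in parts:
            sock.sendall(part)
    finally:
        if cork is not None:
            # Uncorking flushes whatever the kernel accumulated.
            sock.setsockopt(socket.IPPROTO_TCP, cork, 0)
```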

@asvetlov
Member

The actual behavior is a bit more complicated; see my message in #680 (comment)

@asvetlov asvetlov added this to the 0.20 milestone Dec 18, 2015
@asvetlov
Member

Fixed by #680 and #685

@lock

lock bot commented Oct 29, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there are important points made in this discussion,
please include those excerpts in the new issue.

@lock lock bot added the outdated label Oct 29, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 29, 2019