
Default nodelay value needs to be set for all sockets. #9235

Closed
mhdawson opened this issue Feb 17, 2015 · 10 comments

@mhdawson
Member

The API indicates that the nodelay option is enabled by default: http://nodejs.org/api/net.html#net_socket_setnodelay_nodelay

However, when debugging failures in test/simple/test-http-request-end.js for 0.12, we discovered that the failures occurred because nodelay was not set for the sockets on AIX.

On Linux, the default at the OS level seems to be that TCP_NODELAY is enabled, so all sockets have TCP_NODELAY set. On AIX, the default is for TCP_NODELAY to be off.

Since the Node API specifies that the default for nodelay is true, Node should make the required calls to ensure that nodelay is set to true, regardless of the OS default.

I looked at the libuv code and I don't see anything in the OS-specific files that would adjust this default on any platform.

This patch to net.js adjusts the creation of TCP objects so that nodelay is true by default for all sockets.

diff --git a/lib/net.js b/lib/net.js
index d353ff7..d655dd3 100644
--- a/lib/net.js
+++ b/lib/net.js
@@ -48,7 +48,12 @@ function createPipe() {
 // constructor for lazy loading
 function createTCP() {
   var TCP = process.binding('tcp_wrap').TCP;
-  return new TCP();
+  var handle = new TCP();
+  // The api indicates the default for noDelay is true. This is not
+  // true by default at the OS level on all platforms (ex AIX)
+  // so set it to ensure it is set at the OS level.
+  handle.setNoDelay(true);
+  return handle;
 }

If this is the right fix I can create a pull request.
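
For context, a minimal sketch (illustrative only, not part of the proposed patch) showing how an application can force TCP_NODELAY per socket through the public net API, regardless of the OS default:

// Illustrative only: explicitly disable Nagle's algorithm per socket,
// independent of the OS-level default.
var net = require('net');

var server = net.createServer(function(socket) {
  socket.setNoDelay(true);   // force TCP_NODELAY on the server-side socket
  socket.end('hello\n');
});

server.listen(0, function() {
  var client = net.connect(server.address().port, function() {
    client.setNoDelay(true); // same call on the client-side socket
  });
  client.on('data', function(chunk) { process.stdout.write(chunk); });
  client.on('end', function() { server.close(); });
});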

@reqshark

Disabling Nagle’s algorithm improves latency at the expense of throughput.

@trevnorris

@reqshark That's acceptable for Node's use case. Generally speaking, users will be writing out a complete message. So delaying the final packet isn't what we want.

@trevnorris added the net label Feb 20, 2015
@trevnorris

/cc @tjfontaine @misterdjules @cjihrig

@orangemocha
Contributor

@mdawsonibm

The API indicates that the nodelay option is enabled by default

my interpretation of the docs leads me to the opposite conclusion:

By default TCP connections use the Nagle algorithm

... that means that nodelay is off by default (according to the docs). What defaults to true is the nodelay argument if no arguments are passed to setNoDelay(). Do you agree with this interpretation?

I don't have an opinion on what the best default would be perf-wise, but it would seem reasonable to stick to what the docs say (which also appears to be the default on most OSes, afaict) for the sake of compatibility. But whatever default we pick, I think we should strive to be consistent across platforms.

Windows certainly has Nagle on / nodelay off by default.
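
To make that reading concrete, a small illustrative sketch (against the current net API, not quoted from the docs):

var net = require('net');
var socket = net.connect(80, 'example.org');

// A freshly created socket uses the Nagle algorithm (nodelay is off),
// per "By default TCP connections use the Nagle algorithm".
socket.setNoDelay();      // the argument defaults to true, so this call disables Nagle
socket.setNoDelay(false); // explicitly re-enables Nagle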

@misterdjules

@mdawsonibm I'm not sure I understand how the use or absence of TCP_NODELAY would affect test/simple/test-http-request-end.js. Could you please give us more details on how the test fails?

On another topic, do you have one or more AIX setups that we can use to investigate/reproduce such issues in the future?

@mhdawson
Member Author

mhdawson commented Mar 9, 2015

The test fails because it depends on the data arriving together. Internally, the Node functions use two writes to send the data, and without TCP_NODELAY the data arrives such that it is received on two different iterations of the event loop, with the end result that the test fails. Changing the TCP_NODELAY behaviour does not actually fix the test, but it makes the behaviour consistent with other platforms so that the test passes. We see the same thing with test-http-default-encoding.js.
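
As an illustration of the race (hypothetical code, not taken from the actual test files): an assertion tied to a single 'data' event breaks when the payload arrives in two chunks, whereas accumulating until 'end' is independent of TCP_NODELAY and chunking:

var http = require('http');
var assert = require('assert');

var server = http.createServer(function(req, res) {
  var body = '';
  req.on('data', function(chunk) { body += chunk; }); // may fire once or twice
  req.on('end', function() {
    assert.equal(body, 'hello world'); // robust regardless of how the data was chunked
    res.end();
    server.close();
  });
});

server.listen(0, function() {
  var req = http.request({ port: server.address().port, method: 'POST' });
  req.write('hello ');  // two separate writes: without TCP_NODELAY these may
  req.write('world');   // be received on different iterations of the event loop
  req.end();
});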

In terms of getting an externally available AIX setup I'll have to look into that.

@mhdawson
Member Author

mhdawson commented Mar 9, 2015

With respect to the comment by @orangemocha:

I see what you mean: it could just mean that the default is true if you don't specify it in the call. Given that context, I've looked for why we thought it was on by default for Linux, other than what we saw with the tests and thinking it was consistent with the API doc. Since I can't find anything to support that, and I can see that I likely misread the doc, I agree we should leave it as is. We'll go back to looking at whether we can make the tests tolerate the different timing we see on AIX.

@misterdjules

@mdawsonibm Ok, thank you for the clarification.

I have the same interpretation of the documentation as @orangemocha. So if I understand correctly we have two distinct issues here:

  1. test/simple/test-http-request-end.js and test/simple/test-http-default-encoding.js are racy and should be fixed, independently of whether TCP_NODELAY ends up being the default or not. We should create separate issues and/or pull requests for these tests.
  2. The behavior of sockets is not in line with the documentation regarding TCP_NODELAY on at least some Linux platforms.

So I guess the question is whether we consider 2) as a documentation bug or as inconsistent behavior that needs to be fixed. It seems that for v0.12, fixing the documentation would be safer, and for v0.13 having a more consistent behavior would be better.

What do you think?

@misterdjules

@mdawsonibm I hadn't seen your latest comment before posting mine, but it sounds good to me.

@mhdawson
Member Author

OK, I'm going to close this issue. Once we have investigated the tests further, we will open new issues/pull requests as appropriate.
