Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uncaught TypeError using http.Agent in keep-alive mode #8650

Closed
cdhowie opened this issue Sep 18, 2016 · 22 comments
Closed

Uncaught TypeError using http.Agent in keep-alive mode #8650

cdhowie opened this issue Sep 18, 2016 · 22 comments
Labels
http Issues or PRs related to the http subsystem.

Comments

@cdhowie
Copy link

cdhowie commented Sep 18, 2016

  • Version: v6.6.0 (also happens on v4.5.0)
  • Platform: Debian Jessie
  • Subsystem: http

This is the same issue as this one I filed against aws-sdk-js, but I'm getting more suspicious that this is actually a Node issue.

Configuration example:

AWS.config.update({
    httpOptions: {
        agent: new (require('http').Agent)({
            keepAlive: true
        })
    }
});

After some number of HTTP requests (a few dozen) using an http.Agent constructed with keepAlive: true, the following uncaught error occurs:

     Uncaught TypeError: object is not a function
      at afterWrite (_stream_writable.js:361:3)
      at onwrite (_stream_writable.js:352:7)
      at WritableState.onwrite (_stream_writable.js:89:5)
      at Socket._writeGeneric (net.js:714:5)
      at Socket._writev (net.js:719:8)
      at doWrite (_stream_writable.js:305:12)
      at clearBuffer (_stream_writable.js:394:5)
      at Socket.Writable.uncork (_stream_writable.js:243:7)
      at ClientRequest._flushOutput (_http_outgoing.js:678:10)
      at ClientRequest.OutgoingMessage._flush (_http_outgoing.js:652:16)
      at _http_client.js:195:10
      at callSocketMethod (_http_client.js:591:7)
      at ClientRequest.onSocket (_http_client.js:596:7)
      at tickOnSocket (_http_client.js:562:7)
      at onSocketNT (_http_client.js:574:5)
      at _combinedTickCallback (internal/process/next_tick.js:74:11)
      at process._tickDomainCallback (internal/process/next_tick.js:122:9)

Digging into the Node code, this hints that the callback supplied to a net.Socket.write() call was omitted, but the documentation states this callback is optional. In either case, changing keepAlive: true to keepAlive: false when constructing the http.Agent eliminates the error.

I do not make any HTTP calls myself, only via the AWS SDK. I can't find any obvious bug in my code; I'm just making AWS DynamoDB calls using their SDK. Since the call stack for the error is all Node internals, I do not have any opportunity to properly catch the error; the process simply dies.

The error consistently happens on one particular test. If I comment out a particular request in that test, then a future test (which does something entirely different) will fail instead. The fact that the test passes when not using keep-alive suggests to me that there is some kind of timing issue somewhere else.

@imyller imyller added http Issues or PRs related to the http subsystem. stream Issues and PRs related to the stream subsystem. labels Sep 18, 2016
@mscdex mscdex removed the stream Issues and PRs related to the stream subsystem. label Sep 18, 2016
@imyller
Copy link
Member

imyller commented Sep 19, 2016

@mscdex would you reconsider stream label if:

at clearBuffer (_stream_writable.js:394:5)

is the one actually making the failing doWrite call:

  doWrite(stream, state, true, state.length, buffer, '', holder.finish);

and by looks of it the holder.finish is the callback object resolving to non-function and causing TypeError.

holder.finish should be internally defined callback function in _stream_writable. Which means that net.Socket.write() callback (or non-callback) might not even be involved.

@cdhowie
Copy link
Author

cdhowie commented Sep 19, 2016

If there's any way I can grab extra information about the error, I will be happy to do so. I'm not experienced debugging Node internals, so I might need guidance.

I added a delay before the call that consistently fails, and this doesn't seem to prevent it from failing, which I guess means that it's not a timing issue as much as an order-of-operations issue.

@bnoordhuis
Copy link
Member

Try https://www.npmjs.com/package/longjohn, it will give you long stack traces.

@imyller
Copy link
Member

imyller commented Sep 19, 2016

NODE_DEBUG=http,net could also be helpful if you are able to reproduce the error as it happens.

@imyller
Copy link
Member

imyller commented Sep 19, 2016

Also, I noticed that aws-sdk-js automatically selects between http and https protocol client depending on internal AWS service endpoint. As far as I can see, it still uses same httpOptions to provide options for both.

I noted that you've manually provided custom agent object for the client with http.Agent.

Just to be sure, I'd avoid using http.Agent for connections where https.Agent might be more appropriate; or let each client default to it's own globalAgent. Problem maybe amplified even more, if requests by AWS lib do mixed use of http and https connections using the shared agent object.

This could be nothing, but maybe worth a check?

@cdhowie
Copy link
Author

cdhowie commented Sep 19, 2016

@imyller I'm using dynalite to mock DynamoDB for the service-level tests, and I've already verified with tcpdump that no HTTPS connections are even being attempted, so I don't think HTTP/HTTPS confusion in the AWS SDK is the issue.

I've run a test using longjohn and the given NODE_DEBUG environment variable. The output is immense; should I pastebin the whole thing or post it here, and should I try to pare it down a bit to (what I think are) just the relevant bits?

@imyller
Copy link
Member

imyller commented Sep 19, 2016

I don't think HTTP/HTTPS confusion in the AWS SDK is the issue.

Excellent that you've verified that. In theory, that had potential to cause unexpected behaviour.

should I pastebin the whole thing or post it here

Pastebin or GitHub Gist are both ok for me. Sometimes the things that first seem irrelevant are indeed the most valuable. We can pick the relevant shorter snippets from there to this discussion if needed.

@cdhowie
Copy link
Author

cdhowie commented Sep 20, 2016

Here's the exception trace. I'll get the NODE_DEBUG output later today.

https://gist.github.com/cdhowie/515b97824b49660a60f9d535077070f0

@cdhowie
Copy link
Author

cdhowie commented Sep 27, 2016

I updated the gist with the debug info. (Sorry for the delay, been busy with this project!)

This test is the one that fails. I assume that whatever went wrong did so right before this line. This prior test succeeds, so I guess we're looking somewhere between these two lines to start with?

The test that fails makes several successful HTTP requests before making the one that consistently fails with this "object is not a function" error.

@chimurai
Copy link

Hi @imyller @bnoordhuis ,

@EladBezalel reported an issue really similar to this one.

He created a isolated example: chimurai/http-proxy-middleware#39 (comment)

Hope this will be helpful in pinpointing the problem.

@bnoordhuis
Copy link
Member

@chimurai You wouldn't happen to have a test case that only uses built-in modules?

@bac42x
Copy link

bac42x commented Jan 11, 2017

Any developments with this issue?

I'm using webpack-dev-server with a proxy to a WEBAPI that uses NTLM. I used the suggestion from @chimurai in http-proy-middleware #39, but I also get this error: "TypeError: cb is not a function"

@EladBezalel
Copy link

@bac42x until i can find a solution for that i dropped the use of webpack-dev-server and i work directly to my selfhost with no hot reload :\

@Trott
Copy link
Member

Trott commented Jul 15, 2017

Has anyone managed to replicate this using just Node.js and not external modules? If so, a gist or something would be useful.

@EladBezalel
Copy link

@Trott do you mean without any http modules? I also tried without http-proxy-middleware and did my own express server to proxy the requests, still not working

@Trott
Copy link
Member

Trott commented Jul 16, 2017

@Trott do you mean without any http modules?

I mean using only Node.js and internal modules. So no http-proxy-middleware, no express...

@cdhowie
Copy link
Author

cdhowie commented Aug 24, 2017

The test that usually failed due to this issue is not failing as of v6.11.2. I'm still hesitant to turn this on in production due to the unpredictable nature of the issue.

@EladBezalel What Node version are you using?

@EladBezalel
Copy link

@cdhowie that was long time ago, so i assume some version of node 6

@cdhowie
Copy link
Author

cdhowie commented Sep 8, 2017

Do you still experience this issue? I've tried numerous things and for the life of me I can't get it to happen as of v6.11.2.