Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thrown stream error while uncorking #1033

Open
jfhbrook opened this issue Apr 11, 2016 · 6 comments
Open

Thrown stream error while uncorking #1033

jfhbrook opened this issue Apr 11, 2016 · 6 comments

Comments

@jfhbrook
Copy link

  • Version: node_redis 2.4.2 (in the process of upgrading to 2.5.3 just in case), redis 2.8.24 on elasticache
  • Platform: node 4.4.1 on centos 7 (in the process of upgrading to 4.4.2)
  • Description: Description of your issue, stack traces from errors and code that reproduces the issue

This is me blatantly cross-posting [https://github.com/nodejs/node/issues/6154](this node core issue) because I don't actually know that it's their fault.

I have some production services that are throwing a pretty crazy error:

TypeError: Cannot set property 'entry' of null
    at clearBuffer (_stream_writable.js:379:18)
    at Socket.Writable.uncork (_stream_writable.js:238:7)
    at RedisClient.uncork (/cn/runtime/epi-services/nodejs/releases/current/node_modules/@condenast/tsugu-service/node_modules/redis/index.js:361:25)
    at Multi.exec_transaction (/cn/runtime/epi-services/nodejs/releases/current/node_modules/@condenast/tsugu-service/node_modules/redis/index.js:1171:18)

For some background: We're using redis as a caching layer, and this code is getting triggered while writing to redis with its 'multi' helper. We're using v2.4.2 of the redis module but I'm gonna upgrade to 2.5.3 just in case it helps.

As far as I can tell, what's happening here is that the redis client creates a net stream and at various points in time might call uncork on it, including inside the multi exec handler. For some reason, only sometimes, when this happens, trying to clear the buffer throws because state.corkedRequestsFree is null.

I did look at net and Duplex code paths, as well as the redis code, and didn't see anything touching internal stream state, but maybe I missed something?

Unfortunately, I don't have a reproducing case beyond the production code. I tried triggering this with a naive fuzzer on a writable stream, no dice.

Thanks!

@BridgeAR
Copy link
Contributor

Cork / uncork is called on each .exec call of a multi or batch command.

This is very certain an issue with Node.js and not with node_redis. You should definitely not run into a TypeError by calling uncork.

@mcollina
Copy link

In order for this to happen, the client is issuing synchronously multiple cork cycles, which means multiple exec and batch for this library. @jfhbrook can you confirm if this is the case?

@BridgeAR yes it is. However you might want to check if the way used to uncork is the most efficient way, as in my tests (on other libs and in core) it is usually better to uncork inside a process.nextTick, like so: https://github.com/nodejs/node/blob/be68b68d4863f0d389cc46fdf6f1cbcd1b241d0a/lib/_http_outgoing.js#L467-L470. The actual reason why this might be better is that you are traversing the c++ barrier less.
This is also the main reason we missed this regression in the first place.

(A fix is on the way).

@BridgeAR
Copy link
Contributor

@mcollina for my use case using nextTick or not did not show any difference in my benchmarks, so I decided to use the sync version. I'll switch to nextTick though. Thanks for pointing out that this circumvents that bug 👍
I'd actually prefer to not need it in the first place but writing buffers is really slow otherwise (for me it's more than six times slower to write lots of small buffers without using cork with tcp).
I guess it might be worth to check how performance could be improved for those in general (see also nodejs/node#5095).

@jfhbrook thanks for bringing this up!

@mcollina
Copy link

@BridgeAR not using nextTick will also add more tasks to the nextTick queue, which will be slightly more work, and it might cause more issues with the gc.

Yeah, cork is necessary when doing a lot of small writes. That's why it has been introduced and it has been added in core. BTW, you might want to redo your benchmarks on v4.4 or v5.7+, you might get a boost thanks to the very same patch it is causing this problem.

@jfhbrook
Copy link
Author

In order for this to happen, the client is issuing synchronously multiple cork cycles, which means multiple exec and batch for this library. @jfhbrook can you confirm if this is the case?

Yes, we use exec/batch heavily.

@dhendo
Copy link

dhendo commented Apr 13, 2016

Note - Possibly related: #1029

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants