Pipelining using multiple context handles #28

Closed
hikeonpast opened this issue Mar 25, 2011 · 5 comments

@hikeonpast
Hi,

I'm using Hiredis and pipelining to improve throughput (a lot!) over traditional blocking calls, without the complexity of going fully async. With the delay of Cluster, I'm looking at coarse-level sharding by "service" to alleviate potential scaling pitfalls in the near term, with the goal of leveraging Cluster when it is available.

I have a series of paired functions, one pair per "service", as follows:

PipelineServiceA(handle1);
PipelineServiceB(handle1);
PipelineServiceC(handle1);
ParseReplyServiceA(handle1);
ParseReplyServiceB(handle1);
ParseReplyServiceC(handle1);

The Pipeline functions use redisAppendCommand() and the ParseReply functions call redisGetReply(). It works great so far when using a common context (i.e. single server handle), as I can fill up the send buffer with all of my queries, then fire it off and sort out the responses in one round trip.
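For reference, a minimal sketch of that single-context pattern, assuming the standard blocking hiredis API; the commands, keys, and structure are illustrative rather than taken from the actual service code:

    /* Minimal sketch of single-context pipelining with the blocking hiredis API.
     * Commands and keys are illustrative only. */
    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *handle1 = redisConnect("127.0.0.1", 6379);
        if (handle1 == NULL || handle1->err) return 1;

        /* Queue queries for all "services"; nothing is written to the socket yet. */
        redisAppendCommand(handle1, "GET service:a:key");  /* PipelineServiceA */
        redisAppendCommand(handle1, "GET service:b:key");  /* PipelineServiceB */
        redisAppendCommand(handle1, "GET service:c:key");  /* PipelineServiceC */

        /* The first redisGetReply() flushes the whole output buffer, then the
         * replies are consumed in order (ParseReplyServiceA/B/C). */
        redisReply *reply;
        for (int i = 0; i < 3; i++) {
            if (redisGetReply(handle1, (void **)&reply) != REDIS_OK) break;
            freeReplyObject(reply);
        }

        redisFree(handle1);
        return 0;
    }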

If I wanted to allocate a dedicated Redis instance for each "service" in an attempt to add some coarse sharding, I can do this relatively painlessly by simply passing the dedicated handle to each function.

PipelineServiceA(handle1);
PipelineServiceB(handle2);
PipelineServiceC(handle3);
ParseReplyServiceA(handle1);
ParseReplyServiceB(handle2);
ParseReplyServiceC(handle3);

The problem is that Hiredis doesn't send the output buffer until redisGetReply() is called, so instead of the 3 sets of grouped commands being sent out at nearly the same time, followed by the responses being parsed quickly in series, the effective execution order becomes roughly:

PipelineServiceA(handle1);
ParseReplyServiceA(handle1);
PipelineServiceB(handle2);
ParseReplyServiceB(handle2);
PipelineServiceC(handle3);
ParseReplyServiceC(handle3);

The send for each handle doesn't happen until the first redisGetReply() is called in a ParseReply function, so they become 3 serial, blocking requests instead of being close to the original (single context) implementation.

It seems like Hiredis would benefit from a non-blocking SendOutputBuffer type of command in order to support "sharded pipelining".

@pietern
Contributor

pietern commented Mar 25, 2011

The reason for only sending the output buffer when redisGetReply is called is to collect as many queries as possible before doing the call to write. Indeed, this causes hiredis not to flush in parallel when that would be possible. However, these are blocking contexts, so there are a couple of problems when you want to flush them in parallel. For instance: when the output buffer is large, flushing can unnecessarily stall execution of the following statements. It is not possible to create a non-blocking write function for blocking contexts, since hiredis is not aware of individual commands in the output buffer and therefore cannot be sure that a full command was sent after writing less than the full output buffer. The only thing that could be added is a call instructing hiredis to flush the output buffer and do nothing else. I'm not a fan of this, since the use of multiple connections to do parallel queries can be better implemented with an async approach. There, you rely on an event loop library (such as libev; see example-*.c in the repo) to handle the I/O. This makes hiredis write to the respective sockets when they are writable, thus causing minimal blocking.
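For comparison, a hedged sketch of the async route described above, using the libev adapter shipped with hiredis; the hosts, ports, commands, and callback are assumptions for illustration, not code from this thread:

    /* Sketch only: parallel queries over multiple connections with the async API.
     * The event loop writes to each socket when it is writable, so the commands
     * go out in parallel rather than serially. */
    #include <stdio.h>
    #include <hiredis/async.h>
    #include <hiredis/adapters/libev.h>

    static void getCallback(redisAsyncContext *ac, void *r, void *privdata) {
        redisReply *reply = r;
        if (reply != NULL)
            printf("%s -> %lld\n", (const char *)privdata, reply->integer);
        redisAsyncDisconnect(ac);  /* done with this connection */
    }

    int main(void) {
        /* One async context per shard (addresses are illustrative). */
        redisAsyncContext *a = redisAsyncConnect("127.0.0.1", 6379);
        redisAsyncContext *b = redisAsyncConnect("127.0.0.1", 6380);
        if (a == NULL || a->err || b == NULL || b->err) return 1;

        redisLibevAttach(EV_DEFAULT_ a);
        redisLibevAttach(EV_DEFAULT_ b);

        redisAsyncCommand(a, getCallback, "shard-a", "INCR counter");
        redisAsyncCommand(b, getCallback, "shard-b", "INCR counter");

        ev_loop(EV_DEFAULT_ 0);
        return 0;
    }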

@hikeonpast
Author

I understand that what I'm suggesting does blur the lines a bit between the blocking and async approaches, and I appreciate your pragmatism in wanting to keep things pure. From my (selfish, but perhaps not unique) perspective, I have code developed that I believe would cleanly support Redis Cluster if it were available, and I am looking for ways to manually shard with minimal redesign. I don't really need full async in this application; blocking pipelining is perfectly adequate except that it forces serial requests when used with more than one context. I'd really like to avoid the complexity of a major rewrite to use an event loop if I can possibly avoid it.

A full output buffer flush command is exactly what I need: basically a way to retain the benefits of pipelining using the blocking API while supporting manual sharding.

I understand that an event loop would be the ideal way to implement this, and if I were starting from scratch I would probably take that approach. What I'm proposing as an added option to Hiredis is simply another way for developers to use the library. It would add flexibility to support a broader range of uses.

I hope you warm to the idea of adding an output buffer flush capability. :) In the interim, I'll look at how painful it will be to refactor to full async support vs. maintaining my own Hiredis fork. Cheers.


@pietern
Contributor

pietern commented Mar 29, 2011

I just realized this is already possible :-). The hiredis API exports the function redisBufferWrite(redisContext *c, int *done) which does exactly what you want. Internally, hiredis calls this function in a loop until done == 1 and then starts reading from the socket. You can manually call this for every separate connection and then start reading from every connection using the regular redisGetReply (since it skips writing when the output buffer is empty).
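To illustrate, a sketch of the sharded-pipelining flow built on that: each blocking context is flushed up front with redisBufferWrite() and then read with redisGetReply(). The shard handles and commands are assumptions for illustration:

    /* Sketch: flush every blocking context's output buffer before reading, so
     * the shards process their pipelines in parallel. Commands are illustrative. */
    #include <hiredis/hiredis.h>

    static void flushOutputBuffer(redisContext *c) {
        int done = 0;
        /* redisBufferWrite() writes as much of the output buffer as it can per
         * call; loop until the buffer is empty (done == 1). */
        while (!done && redisBufferWrite(c, &done) == REDIS_OK)
            ;
    }

    void shardedPipeline(redisContext *handle1, redisContext *handle2) {
        /* Queue commands per shard (the PipelineService* step). */
        redisAppendCommand(handle1, "GET service:a:key");
        redisAppendCommand(handle2, "GET service:b:key");

        /* Send both output buffers before reading anything. */
        flushOutputBuffer(handle1);
        flushOutputBuffer(handle2);

        /* redisGetReply() skips its write phase when the buffer is already empty
         * (the ParseReplyService* step). */
        redisReply *reply;
        if (redisGetReply(handle1, (void **)&reply) == REDIS_OK) freeReplyObject(reply);
        if (redisGetReply(handle2, (void **)&reply) == REDIS_OK) freeReplyObject(reply);
    }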

pietern closed this as completed Mar 29, 2011

@hikeonpast
Author

Awesome. Thanks, Pieter!


@dtjm

dtjm commented Dec 9, 2011

A slightly-related note from a situation I was wrestling with today. I had a separate thread looping on redisGetReply, and I was making redisAppendCommand calls from my main thread.

It appears that, because the output buffer is only flushed when redisGetReply is called, a command queued with redisAppendCommand after the reader thread has already entered redisGetReply is never sent, so its reply is never received. The solution is to call redisBufferWrite after redisAppendCommand.
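A minimal sketch of the writer-thread side of that fix, assuming the reader thread is already blocked in redisGetReply() on the same context; the command text is illustrative, and note that sharing one redisContext between threads still requires external locking:

    /* Writer side: flush explicitly after queueing so the command is actually
     * sent even though another thread is already blocked in redisGetReply(). */
    #include <hiredis/hiredis.h>

    void queueAndFlush(redisContext *c) {
        redisAppendCommand(c, "LPUSH work:queue item");
        int done = 0;
        while (!done && redisBufferWrite(c, &done) == REDIS_OK)
            ;  /* keep writing until the output buffer is drained */
    }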
