Pipelining using multiple context handles #28

Closed
hikeonpast opened this issue Mar 25, 2011 · 5 comments

@hikeonpast
Hi,

I'm using Hiredis and pipelining to improve throughput (a lot!) over traditional blocking calls, without the complexity of going fully async. With the delay of Cluster, I'm looking at coarse-level sharding by "service" to alleviate potential scaling pitfalls in the near term, with the goal of leveraging Cluster when it is available.

I have a series of paired functions, one pair per "service", as follows:

PipelineServiceA(handle1);
PipelineServiceB(handle1);
PipelineServiceC(handle1);
ParseReplyServiceA(handle1);
ParseReplyServiceB(handle1);
ParseReplyServiceC(handle1);

The Pipeline functions use redisAppendCommand() and the ParseReply functions call redisGetReply(). It works great so far when using a common context (i.e. single server handle), as I can fill up the send buffer with all of my queries, then fire it off and sort out the responses in one round trip.
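For reference, a minimal sketch of that single-context pattern, assuming the standard blocking hiredis API; the commands, keys, and structure are illustrative rather than taken from the actual service code:

    /* Minimal sketch of single-context pipelining with the blocking hiredis API.
     * Commands and keys are illustrative only. */
    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *handle1 = redisConnect("127.0.0.1", 6379);
        if (handle1 == NULL || handle1->err) return 1;

        /* Queue queries for all "services"; nothing is written to the socket yet. */
        redisAppendCommand(handle1, "GET service:a:key");  /* PipelineServiceA */
        redisAppendCommand(handle1, "GET service:b:key");  /* PipelineServiceB */
        redisAppendCommand(handle1, "GET service:c:key");  /* PipelineServiceC */

        /* The first redisGetReply() flushes the whole output buffer, then the
         * replies are consumed in order (ParseReplyServiceA/B/C). */
        redisReply *reply;
        for (int i = 0; i < 3; i++) {
            if (redisGetReply(handle1, (void **)&reply) != REDIS_OK) break;
            freeReplyObject(reply);
        }

        redisFree(handle1);
        return 0;
    }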

If I wanted to allocate a dedicated Redis instance for each "service" in an attempt to add some coarse sharding, I can do this relatively painlessly by simply passing the dedicated handle to each function.

PipelineServiceA(handle1);
PipelineServiceB(handle2);
PipelineServiceC(handle3);
ParseReplyServiceA(handle1);
ParseReplyServiceB(handle2);
ParseReplyServiceC(handle3);

The problem is that Hiredis doesn't send the output buffer until redisGetReply() is called, so instead of the 3 sets of grouped commands being sent out at nearly the same time, followed by the responses being parsed quickly in series, the effective execution order becomes roughly:

PipelineServiceA(handle1);
ParseReplyServiceA(handle1);
PipelineServiceB(handle2);
ParseReplyServiceB(handle2);
PipelineServiceC(handle3);
ParseReplyServiceC(handle3);

The send for each handle doesn't happen until the first redisGetReply() is called in a ParseReply function, so they become 3 serial, blocking requests instead of being close to the original (single context) implementation.

It seems like Hiredis would benefit from a non-blocking SendOutputBuffer type of command in order to support "sharded pipelining".

@pietern
Contributor

pietern commented Mar 25, 2011

The reason for only sending the output buffer when redisGetReply is called is to collect as many queries as possible before doing the call to write. Indeed, this causes hiredis not to flush in parallel when that would be possible. However, these are blocking contexts, so there are a couple of problems when you want to flush them in parallel. For instance: when the output buffer is large, flushing can unnecessarily stall execution of the following statements. It is not possible to create a non-blocking write function for blocking contexts, since hiredis is not aware of individual commands in the output buffer and therefore cannot be sure that a full command was sent after writing less than the full output buffer. The only thing that could be added is a call instructing hiredis to flush the output buffer and do nothing else. I'm not a fan of this, since the use of multiple connections to do parallel queries can be better implemented with an async approach. There, you rely on an event loop library (such as libev; see example-*.c in the repo) to handle the I/O. This makes hiredis write to the respective sockets when they are writable, thus causing minimal blocking.
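For comparison, a hedged sketch of the async route described above, using the libev adapter shipped with hiredis; the hosts, ports, commands, and callback are assumptions for illustration, not code from this thread:

    /* Sketch only: parallel queries over multiple connections with the async API.
     * The event loop writes to each socket when it is writable, so the commands
     * go out in parallel rather than serially. */
    #include <stdio.h>
    #include <hiredis/async.h>
    #include <hiredis/adapters/libev.h>

    static void getCallback(redisAsyncContext *ac, void *r, void *privdata) {
        redisReply *reply = r;
        if (reply != NULL)
            printf("%s -> %lld\n", (const char *)privdata, reply->integer);
        redisAsyncDisconnect(ac);  /* done with this connection */
    }

    int main(void) {
        /* One async context per shard (addresses are illustrative). */
        redisAsyncContext *a = redisAsyncConnect("127.0.0.1", 6379);
        redisAsyncContext *b = redisAsyncConnect("127.0.0.1", 6380);
        if (a == NULL || a->err || b == NULL || b->err) return 1;

        redisLibevAttach(EV_DEFAULT_ a);
        redisLibevAttach(EV_DEFAULT_ b);

        redisAsyncCommand(a, getCallback, "shard-a", "INCR counter");
        redisAsyncCommand(b, getCallback, "shard-b", "INCR counter");

        ev_loop(EV_DEFAULT_ 0);
        return 0;
    }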

@hikeonpast
Author

I understand that what I'm suggesting does blur the lines a bit between the blocking and async approaches, and I appreciate your pragmatism in wanting to keep things pure. From my (selfish, but perhaps not unique) perspective, I have code developed that I believe would cleanly support Redis Cluster if it were available, and I am looking for ways to manually shard with minimal redesign. I don't really need full async in this application; blocking pipelining is perfectly adequate except that it forces serial requests when used with more than one context. I'd really like to avoid the complexity of a major rewrite to use an event loop if I can possibly avoid it.

A full output buffer flush command is exactly what I need: basically a way to retain the benefits of pipelining using the blocking API while supporting manual sharding.

I understand that an event loop would be the ideal way to implement this, and if I were starting from scratch I would probably take that approach. What I'm proposing as an added option to Hiredis is simply another way for developers to use the library. It would add flexibility to support a broader range of uses.

I hope you warm to the idea of adding an output buffer flush capability. :) In the interim, I'll look at how painful it will be to refactor to full async support vs. maintaining my own Hiredis fork. Cheers.


@pietern
Contributor

pietern commented Mar 29, 2011

I just realized this is already possible :-). The hiredis API exports the function redisBufferWrite(redisContext *c, int *done) which does exactly what you want. Internally, hiredis calls this function in a loop until done == 1 and then starts reading from the socket. You can manually call this for every separate connection and then start reading from every connection using the regular redisGetReply (since it skips writing when the output buffer is empty).
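To illustrate, a sketch of the sharded-pipelining flow built on that: each blocking context is flushed up front with redisBufferWrite() and then read with redisGetReply(). The shard handles and commands are assumptions for illustration:

    /* Sketch: flush every blocking context's output buffer before reading, so
     * the shards process their pipelines in parallel. Commands are illustrative. */
    #include <hiredis/hiredis.h>

    static void flushOutputBuffer(redisContext *c) {
        int done = 0;
        /* redisBufferWrite() writes as much of the output buffer as it can per
         * call; loop until the buffer is empty (done == 1). */
        while (!done && redisBufferWrite(c, &done) == REDIS_OK)
            ;
    }

    void shardedPipeline(redisContext *handle1, redisContext *handle2) {
        /* Queue commands per shard (the PipelineService* step). */
        redisAppendCommand(handle1, "GET service:a:key");
        redisAppendCommand(handle2, "GET service:b:key");

        /* Send both output buffers before reading anything. */
        flushOutputBuffer(handle1);
        flushOutputBuffer(handle2);

        /* redisGetReply() skips its write phase when the buffer is already empty
         * (the ParseReplyService* step). */
        redisReply *reply;
        if (redisGetReply(handle1, (void **)&reply) == REDIS_OK) freeReplyObject(reply);
        if (redisGetReply(handle2, (void **)&reply) == REDIS_OK) freeReplyObject(reply);
    }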

pietern closed this as completed Mar 29, 2011

@hikeonpast
Author

Awesome. Thanks, Pieter!


@dtjm

dtjm commented Dec 9, 2011

A slightly-related note from a situation I was wrestling with today. I had a separate thread looping on redisGetReply, and I was making redisAppendCommand calls from my main thread.

It appears that, because the output buffer is only flushed when redisGetReply is called, a command queued with redisAppendCommand after the reader thread has already entered redisGetReply is never sent, so its reply is never received. The solution is to call redisBufferWrite after redisAppendCommand.
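A minimal sketch of the writer-thread side of that fix, assuming the reader thread is already blocked in redisGetReply() on the same context; the command text is illustrative, and note that sharing one redisContext between threads still requires external locking:

    /* Writer side: flush explicitly after queueing so the command is actually
     * sent even though another thread is already blocked in redisGetReply(). */
    #include <hiredis/hiredis.h>

    void queueAndFlush(redisContext *c) {
        redisAppendCommand(c, "LPUSH work:queue item");
        int done = 0;
        while (!done && redisBufferWrite(c, &done) == REDIS_OK)
            ;  /* keep writing until the output buffer is drained */
    }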
