RedisClusterClient gives much higher load to first healthy node #404
So there are 2 ways the Redis client may be used:

1. wrap the `Redis` client directly (e.g. in `RedisAPI`), so that each command obtains a new connection object from the pool;
2. obtain a `RedisConnection` explicitly and issue multiple commands on it.
The first approach has an obvious disadvantage: for each command, it obtains a new connection. Not a TCP connection, those are pooled, but a connection object that may hold important state. In case of clustered Redis, that state held in the connection object includes the hash slot distribution. So using the first approach, every command indeed requires first sending `CLUSTER SLOTS`. (Looking at the code, I can see that the clustered client handles ASK redirections, but it seems that a MOVED redirection just ends up as an error, and it is a responsibility of the caller to close the connection and reconnect. I might be reading the code wrong, though; I'd have to explore more to be confident.) I don't know if it is the Quarkus Redis client that doesn't bother maintaining a long-lived connection, or whether the high-level API even allows that.
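For illustration, a minimal sketch of the two ways in plain Vert.x (the class name, keys, and connection string here are made up for the example):

```java
import io.vertx.core.Vertx;
import io.vertx.redis.client.Redis;
import io.vertx.redis.client.RedisAPI;

public class TwoWays {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();
    // connection string is made up for the example
    Redis redis = Redis.createClient(vertx, "redis://127.0.0.1:7000");

    // 1st way: wrap the client itself; every command internally
    // obtains a fresh connection object (TCP is pooled underneath)
    RedisAPI perCommand = RedisAPI.api(redis);
    perCommand.get("foo").onSuccess(System.out::println);

    // 2nd way: obtain one connection explicitly and reuse it, so
    // state held by the connection object survives across commands
    redis.connect().onSuccess(conn -> {
      RedisAPI held = RedisAPI.api(conn);
      held.get("foo")
          .onSuccess(System.out::println)
          .onComplete(ignored -> conn.close());
    });
  }
}
```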
The code is hard to extract, so let me describe the call chain: each command goes through `RedisClusterClient.connect()`, which always calls `getSlots()` and thus sends `CLUSTER SLOTS` before the actual command. So each command from the high-level API pays that cost.
I use Quarkus-3.3.0.
Yeah, I'm just looking at the Quarkus implementation too. I have never seen that code before, so I might be wrong, but it seems to me that if you use the high-level API, you have no way to hold on to a single connection. So in the short term, you can use either `withConnection()` or `batch()`. Longer-term, I don't really know what the Vert.x Redis client philosophy is. Are the connection objects meant to be obtained per command, or held and reused? Looking at the various client implementations, keeping cluster state only on the connection object seems to be the root of the problem here. A sketch of the `batch()` route follows below.
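For example, a minimal sketch of the `batch()` route in Vert.x (the keys and values are made up; `withConnection()` would be the analogous option on the Quarkus side). A batch sends several commands over a single connection acquisition instead of one acquisition per command:

```java
import java.util.List;

import io.vertx.redis.client.Command;
import io.vertx.redis.client.Redis;
import io.vertx.redis.client.Request;

public class BatchExample {
  static void run(Redis redis) {
    // one batch = one connection obtained from the pool,
    // instead of one connection per command
    redis.batch(List.of(
            Request.cmd(Command.SET).arg("foo").arg("1"),
            Request.cmd(Command.GET).arg("foo")))
        .onSuccess(responses -> responses.forEach(System.out::println))
        .onFailure(Throwable::printStackTrace);
  }
}
```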
The high-level ReactiveXxxCommands API has no chance to use `withConnection()` or `batch()`. I think it's OK to obtain a pooled connection for each command; the problem is that the slots data should belong to `RedisClusterClient`, as you described above, and should be cached, so that `CLUSTER SLOTS` is never sent first for each Redis command.
You said you're using the high-level API; do you use the low-level Vert.x API directly anywhere? I overall agree that the hash slot assignment should probably be stored in the `RedisClusterClient`, not in the connection object.
I don't directly use the low-level API.
When I'm referring to low-level APIs, I'm talking about the Vert.x APIs (`Redis`, `RedisConnection`, `RedisAPI`).
My bad, the Quarkus guide does say so.
Checked the source code, it is as you described.
Yeah, I suspected as much. I'll see what I can do.
I am wondering whether such a Redis connection should be pooled as well by the Vert.x Redis client. WDYT @pmlopes?
TCP connections to the Redis server(s) are pooled. The `RedisConnection` objects that wrap them are not.
I'm trying to reproduce this locally, but I'm hitting a connection pooling issue (basically connection pooling doesn't seem to work at all) on the 4.x branch.
OK, so the connection pooling not working at all is #365 (and the fix is #374). This is currently scheduled for 5.0, which in my opinion is wrong -- we totally need to fix this in 4.x, otherwise Redis cluster is basically unusable. I can't see any possible downside to backporting the fix.
OK, so once I actually make the cluster client behave, this issue is fairly easy to reproduce with this simple Vert.x application:

```java
import java.util.List;

import io.vertx.core.AbstractVerticle;
import io.vertx.core.Promise;
import io.vertx.core.Vertx;
import io.vertx.redis.client.Redis;
import io.vertx.redis.client.RedisAPI;
import io.vertx.redis.client.RedisClientType;
import io.vertx.redis.client.RedisOptions;
import io.vertx.redis.client.RedisReplicas;

public class MainVerticle extends AbstractVerticle {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();
    vertx.deployVerticle(new MainVerticle());
  }

  @Override
  public void start(Promise<Void> startPromise) {
    Redis redis = Redis.createClient(vertx, new RedisOptions()
        .setType(RedisClientType.CLUSTER)
        .setUseReplicas(RedisReplicas.SHARE)
        .setConnectionString("redis://192.168.1.171"));
    RedisAPI client = RedisAPI.api(redis);

    call(client, "foo", 0);
    call(client, "bar", 0);
    call(client, "baz", 0);
    call(client, "qux", 0);
    call(client, "quux", 0);
    call(client, "corge", 0);
    call(client, "grault", 0);
    call(client, "garply", 0);
    call(client, "waldo", 0);
    call(client, "fred", 0);
    call(client, "plugh", 0);
    call(client, "xyzzy", 0);
    call(client, "thud", 0);

    startPromise.complete();
  }

  private void call(RedisAPI client, String prefix, int last) {
    if (last == 1_000_000) {
      return;
    }
    int x = last + 1;
    client.set(List.of(prefix + x, "" + x))
        .flatMap(response -> {
          return client.get(prefix + x);
        })
        .flatMap(response -> {
          System.out.println(prefix + " --> " + response);
          return client.set(List.of("__last__" + prefix, "" + x));
        })
        .onSuccess(response -> {
          vertx.runOnContext(ignored -> call(client, prefix, x));
        })
        .onFailure(error -> {
          System.out.println(error);
        });
  }
}
```

I'm using a simple 3-node cluster with 3 masters and no replicas. I run an actual virtual machine with 1 CPU and 1024 MB of RAM for each cluster node, so I can actually measure load average and process CPU consumption. I made an interesting observation when I tried to even the load on the cluster nodes by issuing the `CLUSTER SLOTS` command against a random node instead of always the first healthy one. Next, I'll look into storing the hash slot assignment in the `RedisClusterClient`.
OK, storing the hash slot assignment in the `RedisClusterClient` seems to work. I'll also send 2 PRs to 4.x: a backport of #374 and a backport of the fix for this issue.
vertx-redis-client/src/main/java/io/vertx/redis/client/impl/RedisClusterClient.java, line 140 in 9e88503:

For each command, `RedisClusterClient.connect()` always calls `getSlots()`, which sends the `CLUSTER SLOTS` command to the first healthy node, so the first healthy node has very high CPU usage. In my benchmark test with Quarkus, the first healthy node consumes 80% CPU while the other nodes consume 30% CPU; this leads to extra latency and uneven load on the Redis cluster. The `getSlots()` result should be cached and automatically updated, either periodically or from `MOVED` responses sent by the Redis cluster.
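To illustrate that proposal (a hypothetical sketch, not the actual Vert.x implementation; `SlotsCache`, its TTL, and the blocking `Supplier` loader are all assumptions made for brevity), a cached slot assignment with periodic refresh and `MOVED`-triggered invalidation could look roughly like this:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: cache the parsed result of CLUSTER SLOTS so it
// is fetched at most once per TTL instead of once per command.
final class SlotsCache<S> {

  private static final long TTL_NANOS = 1_000_000_000L; // refresh at most once per second

  private static final class Entry<T> {
    final T slots;
    final long loadedAt;
    Entry(T slots, long loadedAt) {
      this.slots = slots;
      this.loadedAt = loadedAt;
    }
  }

  private final Supplier<S> loader; // performs the actual CLUSTER SLOTS round trip
  private final AtomicReference<Entry<S>> cached = new AtomicReference<>();

  SlotsCache(Supplier<S> loader) {
    this.loader = loader;
  }

  S get() {
    Entry<S> entry = cached.get();
    if (entry == null || System.nanoTime() - entry.loadedAt > TTL_NANOS) {
      // stale or missing: load once and publish for subsequent callers
      entry = new Entry<>(loader.get(), System.nanoTime());
      cached.set(entry);
    }
    return entry.slots;
  }

  // to be called when a command fails with a MOVED redirection:
  // the topology changed, so force a refresh on the next lookup
  void invalidate() {
    cached.set(null);
  }
}
```

A real implementation inside `RedisClusterClient` would of course be asynchronous (the loader would return a `Future` and concurrent refreshes would be deduplicated), but the caching structure would be the same.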