
Extreme amount of ioredisClusterSubscriber clients #1325

Closed
GaryWilber opened this issue Apr 5, 2021 · 2 comments · Fixed by #1329

@GaryWilber

I have a service using ioredis 4.24.6 and recently ran a stress test against it, which ended up maxing out the Redis cluster's CPU.
While the Redis cluster's CPU was at 100%, the ioredis cluster clients started disconnecting and reconnecting.
In the Redis server telemetry, the client count continued to climb until it hit the maximum number of clients allowed.

Running CLIENT LIST on the Redis shard returns over one thousand ioredisClusterSubscriber clients, far more than I would expect:

```
id=16460 addr={redacted}:58325 fd=6501 name=ioredisClusterSubscriber age=16429 idle=8916 flags=N db=0 sub=24 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=7
id=16462 addr={redacted}:52934 fd=6503 name=ioredisClusterSubscriber age=16429 idle=8645 flags=N db=0 sub=8 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=7
id=16466 addr={redacted}:59636 fd=6507 name=ioredisClusterSubscriber age=16429 idle=8915 flags=N db=0 sub=16 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16470 addr={redacted}:52910 fd=6511 name=ioredisClusterSubscriber age=16429 idle=8450 flags=N db=0 sub=207 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16471 addr={redacted}:53023 fd=6512 name=ioredisClusterSubscriber age=16429 idle=8447 flags=N db=0 sub=183 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=info numops=5
id=16472 addr={redacted}:53027 fd=6513 name=ioredisClusterSubscriber age=16429 idle=9048 flags=N db=0 sub=2 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=info numops=5
id=16473 addr={redacted}:59736 fd=6514 name=ioredisClusterSubscriber age=16429 idle=8915 flags=N db=0 sub=16 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=7
id=16477 addr={redacted}:59661 fd=6518 name=ioredisClusterSubscriber age=16429 idle=9182 flags=N db=0 sub=7 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=info numops=5
id=16492 addr={redacted}:52987 fd=6533 name=ioredisClusterSubscriber age=16429 idle=9182 flags=N db=0 sub=5 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=info numops=5
id=16498 addr={redacted}:62283 fd=6539 name=ioredisClusterSubscriber age=16429 idle=9180 flags=N db=0 sub=90 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16501 addr={redacted}:62305 fd=6542 name=ioredisClusterSubscriber age=16429 idle=9180 flags=N db=0 sub=105 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=info numops=5
id=16507 addr={redacted}:52922 fd=6548 name=ioredisClusterSubscriber age=16429 idle=9180 flags=N db=0 sub=95 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16508 addr={redacted}:52923 fd=6549 name=ioredisClusterSubscriber age=16429 idle=9185 flags=N db=0 sub=5 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16522 addr={redacted}:52908 fd=6563 name=ioredisClusterSubscriber age=16429 idle=9180 flags=N db=0 sub=88 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16523 addr={redacted}:52912 fd=6564 name=ioredisClusterSubscriber age=16429 idle=9184 flags=N db=0 sub=2 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
id=16524 addr={redacted}:52921 fd=6565 name=ioredisClusterSubscriber age=16429 idle=8450 flags=N db=0 sub=201 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 ow=0 owmem=0 events=r cmd=subscribe numops=5
```
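
A quick way to count these per shard (a sketch only; it assumes ioredis's auto-generated command methods expose CLIENT LIST as node.client("list"), and countSubscriberClients is just an illustrative helper):

```ts
import Redis = require("ioredis");

// Illustrative helper: count ioredisClusterSubscriber entries on each master.
async function countSubscriberClients(cluster: Redis.Cluster): Promise<void> {
  for (const node of cluster.nodes("master")) {
    // ioredis generates a method per Redis command; cast in case the typings
    // don't cover CLIENT.
    const list = (await (node as any).client("list")) as string;
    const count = list
      .split("\n")
      .filter((line) => line.includes("name=ioredisClusterSubscriber")).length;
    console.log(`${node.options.host}:${node.options.port}: ${count} subscriber clients`);
  }
}
```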

Based on this, I suspect there is a bug in the ClusterSubscriber code that causes client/connection leaks when a Redis cluster is under high CPU load and commands time out.

Please correct me if I'm wrong, but based on a preliminary read-through of that code, I'm suspicious of the this.subscriber / this.lastActiveSubscriber properties.
In the event that selectSubscriber is called and fails to resubscribe to the previous channels, this.lastActiveSubscriber = this.subscriber never runs. This means that if selectSubscriber is called again, it will create another subscriber while leaving the previous one alive forever.
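
To illustrate, here is a minimal self-contained sketch of the flow I think is happening (the names mirror ClusterSubscriber, but this is a simplification of my reading, not the actual ioredis source):

```ts
import Redis = require("ioredis");

class ClusterSubscriberSketch {
  private subscriber: Redis.Redis | null = null;
  private lastActiveSubscriber: Redis.Redis | null = null;

  selectSubscriber(node: { host: string; port: number }): void {
    // Only the *last active* subscriber is ever cleaned up here.
    if (this.lastActiveSubscriber) {
      this.lastActiveSubscriber.disconnect();
    }

    this.subscriber = new Redis({
      ...node,
      connectionName: "ioredisClusterSubscriber",
    });

    // lastActiveSubscriber is only updated once resubscribing succeeds. If
    // SUBSCRIBE times out (e.g. the server is at 100% CPU), this.subscriber
    // is orphaned: the next selectSubscriber() call overwrites the reference
    // without disconnecting it, leaking one server-side connection per retry.
    const subscriber = this.subscriber;
    this.resubscribePreviousChannels(subscriber).then(() => {
      this.lastActiveSubscriber = subscriber;
    });
  }

  private async resubscribePreviousChannels(client: Redis.Redis): Promise<void> {
    // SUBSCRIBE / PSUBSCRIBE on `client` to the previously tracked channels;
    // this can reject or hang while the cluster is overloaded.
  }
}
```

If that reading is right, disconnecting this.subscriber (not just this.lastActiveSubscriber) before creating a new connection would plug the leak.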

Can someone knowledgeable check and confirm that? I think the follow-up items are:

  1. Fix ClusterSubscriber leaking connections
  2. It would be great if the client name were not just ioredisClusterSubscriber. In my case, I have a connectionName set on the ioredis cluster client. Could the ClusterSubscriber append ioredisClusterSubscriber to the main client's connectionName (see the sketch below)? That would make it easier to debug via CLIENT LIST.
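
For item 2, something like this hypothetical naming scheme is what I have in mind (purely illustrative; the parameter mirrors the existing connectionName option):

```ts
// Hypothetical: derive the subscriber's name from the user's connectionName
// rather than always using the fixed "ioredisClusterSubscriber" string.
function getSubscriberConnectionName(userConnectionName?: string): string {
  return userConnectionName
    ? `${userConnectionName}:ioredisClusterSubscriber`
    : "ioredisClusterSubscriber";
}
```

With a cluster created using connectionName "orderService", the subscriber would then show up in CLIENT LIST as name=orderService:ioredisClusterSubscriber, which is immediately attributable to a specific client.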

luin commented Apr 8, 2021

Hey @GaryWilber, thanks for reporting this issue!

I can confirm this is a bug and I'm working on a fix.

luin closed this as completed in #1329 on Apr 8, 2021
ioredis-robot pushed a commit that referenced this issue Apr 8, 2021
# [4.26.0](v4.25.0...v4.26.0) (2021-04-08)

### Bug Fixes

* **cluster:** subscriber connection leaks ([81b9be0](81b9be0)), closes [#1325](#1325)

### Features

* **cluster:** apply provided connection name to internal connections ([2e388db](2e388db))
@ioredis-robot

🎉 This issue has been resolved in version 4.26.0 🎉

The release is available on npm and as a GitHub release.

Your semantic-release bot 📦🚀
