Endless loop of subscriptionManager #692
Comments
Thanks for the detailed report, I believe there are multiple related issues at play here. The tight loop is fairly obvious given the SVG you provided so I will have a fix for that shortly. What I am still trying to figure out is how it got into that state in the first place. I have a few hypotheses but if you could provide the sarama logs from the incident that would be very helpful.
Here are the logs around the time of the incident.
Hmm... in some time prior to the incident were there any sarama logs that contained the phrase "abandoned subscription"? If so then I'm pretty sure I understand what happened. Either way, #693 should fix it.
Yes, I periodically have this message in the logs. Before and after the problem.
Good to know, thanks. If those logs also say "because consuming was taking too long" then you may want to adjust your
I'm not sure it could work in our case; we basically take data from Kafka and push it to other systems that we don't control. Pseudo code is: Should we worry about these "abandoned subscription because consuming was taking too long" messages in our case? I didn't see any data loss in our application.
It won't drop any data; it's just an efficiency thing. If you knew that your processing takes, e.g., 10 seconds per message, it would just be more efficient not to constantly be dropping and re-establishing the subscription. In your case it sounds like the slow cases are not predictable enough to matter.
Versions
Sarama Version: 1.8
Kafka Version: 0.9.2
Go Version: 1.6.2
Configuration
What configuration values are you using for Sarama and Kafka?
I'm running a 3 node kafka & zookeeper cluster.
Sarama is used with github.com/wvanbergen/kafka for the consumer-group management.
I realize we are not using the latest sarama lib and the wvanbergen consumer-group library is probably deprecated as well. If you think this issue isn't relevant, I can understand.
Logs
Problem Description
After the machine running sarama lost its connection to our kafka/zookeeper cluster for a few minutes, sarama takes all the available CPUs.
Running pprof on the instance shows most of the time is spent in the Go runtime doing channel management (selectgoImpl, atomic.Xchg), with the only sarama function appearing in the profile being brokerConsumer.subscriptionManager. It looks like subscriptionManager is busy-looping forever.
The svg profile from pprof: pprof001.zip