Partition offset rewinds during rebalances #209
Comments
An update on this: we seem not to have gotten any new offset resets during deployments by executing this in the revoked-partitions event:
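A minimal sketch of what such a revoke handler can look like (not the exact code from this thread; the consumer variable `kc` and the error handling are assumptions):

```go
import (
	"log"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// handleRevoke synchronously flushes stored offsets before the consumer
// gives up its partitions, then resets the stored offsets so the
// background auto-committer has nothing stale left to commit.
func handleRevoke(kc *kafka.Consumer) {
	if _, err := kc.Commit(); err != nil {
		// ErrNoOffset only means there was nothing pending to commit.
		if ke, ok := err.(kafka.Error); !ok || ke.Code() != kafka.ErrNoOffset {
			log.Printf("commit on revoke failed: %v", err)
		}
	}
	// Store -1001 (kafka.OffsetInvalid) for every assigned partition,
	// mirroring the experiment described in the Hypothesis section.
	if assigned, err := kc.Assignment(); err == nil {
		for i := range assigned {
			assigned[i].Offset = kafka.OffsetInvalid
		}
		_, _ = kc.StoreOffsets(assigned) // sketch: per-partition errors ignored
	}
	if err := kc.Unassign(); err != nil {
		log.Printf("unassign failed: %v", err)
	}
}
```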
So it does seem like my initial hypothesis holds.
@AlexJF I'm super curious: did you start noticing the negative lag due to https://github.com/DataDog/integrations-core/blob/181bfec3f7fdffc750f8ec82a9b36ae1c8734e56/kafka_consumer/datadog_checks/kafka_consumer/kafka_consumer.py#L336-L343 ? Or do you use a different check internally to monitor for negative consumer lag?
We never actually observed negative lag at any point here. We noticed this problem due to our tracking of the committed offsets themselves (we alert when their derivative goes negative).
Oops, my bad, I misread your description as the lag going the other way. I understand what you're saying now.
Description
We have a large pool (100+) of consumers using the high-level functional API consumer, all with the same consumer group. These consumers use the auto-commit functionality, but we call `StoreOffsets` ourselves at the end of our processing pipeline. We also explicitly handle rebalance events.

We've noticed that whenever we do a deployment (we do a rolling deployment, 20 nodes at a time, so there are often a few partition assign+revoke events in succession) we sometimes witness offset rewinds in some partitions (we monitor offsets using Datadog integrations and alert when the derivative of committed offsets goes negative).
It's not trivial to reproduce, and it's not really feasible for us to activate the extra debug logging as it quickly overflows our disks given our traffic pattern. However, I attempted to capture the situation using a Datadog graph of assigned and committed offsets for an affected partition at the time its offset was rewound by approx 6.7k messages (10 minutes of data):
x-axis is time, y-axis is offset. Solid lines represent offsets as reported by a `CommittedOffsets` event. Dashed lines represent offsets as reported by `consumer.Committed`, which is called on the `AssignedPartitions` event; these are the offsets we pass to `consumer.Assign`.

The ideal representation for this would actually be points, not lines, but that's unfortunately not a display option in Datadog at the moment. I put some black dots around the "points" of interest to try and minimize the noise caused by all the line interpolation.
Hypothesis
Barring some misuse of the library on our end (I don't think that's the case, since our usage matches the examples I see everywhere else), the situation described by the previous image suggests that there might be 2 separate issues here:

1. The fact that `i-01d9b` goes back to an offset very close to the one it had been assigned 10 minutes before suggests that the auto-commit mechanism committed an old stored offset for this partition.
2. The offset returned by `consumer.Committed` for `i-0cd0a` 10 minutes before was actually bigger than the one returned by `consumer.Committed()` for `i-01d9b` during the offset rewind. This suggests that the auto-commit that committed the old stored offset successfully updated the broker's committed offset before `i-01d9b` even got the response to `consumer.Committed`, much less `consumer.Assign`.

We'll be experimenting with this hypothesis by explicitly calling `consumer.Commit()` and storing -1001 (`OffsetInvalid`) as the offset for all assigned partitions before a call to `Unassign()`.
Code
Config
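A sketch of a configuration consistent with the description above (auto-commit enabled, offsets stored manually, application-handled rebalances); the broker addresses, group name, and channel-based event mode are assumptions:

```go
import "github.com/confluentinc/confluent-kafka-go/kafka"

func newConsumer() (*kafka.Consumer, error) {
	return kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers":               "broker1:9092,broker2:9092", // placeholder
		"group.id":                        "my-consumer-group",         // placeholder
		"enable.auto.commit":              true,  // auto-commit stays on...
		"enable.auto.offset.store":        false, // ...but offsets are stored explicitly via StoreOffsets
		"go.application.rebalance.enable": true,  // deliver Assigned/RevokedPartitions to the app
		"go.events.channel.enable":        true,  // assumption: channel-based event loop (see below)
	})
}
```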
Event handling
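A sketch of the general shape of the event loop, using the `kc`/`fr` names explained just below; the `FlightRecorder` methods and `process` are hypothetical:

```go
// runEventLoop drains the consumer's event channel and reacts to
// rebalance events, messages, EOFs and errors.
func runEventLoop(kc *kafka.Consumer, fr *FlightRecorder) {
	for ev := range kc.Events() {
		switch e := ev.(type) {
		case kafka.AssignedPartitions:
			// Fetch the broker-committed offsets for the new assignment
			// and start consuming from exactly those positions.
			committed, err := kc.Committed(e.Partitions, 5000 /* ms */)
			if err != nil {
				log.Fatalf("Committed() failed: %v", err)
			}
			fr.TrackAssignment(committed) // hypothetical FlightRecorder hook
			kc.Assign(committed)
		case kafka.RevokedPartitions:
			fr.DropAssignment() // hypothetical
			kc.Unassign()
		case *kafka.Message:
			fr.TrackInFlight(e) // hypothetical
			process(e)          // the processing pipeline; StoreOffsets happens at its end
		case kafka.PartitionEOF:
			fr.TrackEOF(e) // hypothetical
		case kafka.Error:
			log.Printf("kafka error: %v", e)
		}
	}
}
```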
`kc` = KafkaConsumer
`fr` = FlightRecorder, which basically keeps track of in-flight messages and EOFs. It also has some safeguards against processing messages with offsets prior to those that were reported as committed during the `AssignedPartitions` events (we thought this might have been the cause of the offset rewinds before we had enough telemetry, but we have actually seen no instance where we received a message with an offset prior to the one we passed on via `consumer.Assign`).

Committing
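A sketch of what storing offsets at the end of the pipeline can look like; the helper name is illustrative:

```go
// markProcessed stores offset+1 for a fully processed message; the
// background auto-committer then commits whatever has been stored.
func markProcessed(kc *kafka.Consumer, msg *kafka.Message) error {
	tp := msg.TopicPartition
	tp.Offset++ // committed offsets point at the next message to consume
	_, err := kc.StoreOffsets([]kafka.TopicPartition{tp})
	return err
}
```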
How to reproduce
Not clear. We haven't been able to reproduce in a controlled environment yet.
Checklist
Please provide the following information:
- confluent-kafka-go and librdkafka version (`LibraryVersion()`):
- Client configuration: `ConfigMap{...}`
- Provide client logs (with `"debug": ".."` as necessary)