This repository has been archived by the owner on Mar 17, 2024. It is now read-only.
Sometimes the best metric for alerting purposes is the percent of the currently available offset range that the lag represents, i.e. `OffsetLagMetric / (LatestOffsetMetric - EarliestOffsetMetric)`. In order to have this available, capture of the earliest offsets for a topic partition needs to be added. In the future, this could also be used to report the estimated retention period for a topic, so that if you've specified `retention.bytes` you could get a reasonable estimate of the number of seconds it takes to hit the specified byte size of the log.
I'm working on a PR for capturing and reporting EarliestOffsetMetric now, though I'm not tackling the seconds estimation piece yet.
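To make the proposed ratio concrete, here is a minimal sketch of the calculation in Python. The function name and the plain integer inputs are hypothetical and stand in for the three metric values; this is not the exporter's actual code, and returning `None` for an empty offset range is an assumption about how that edge case might be handled.

```python
from typing import Optional


def lag_fraction_of_range(
    offset_lag: int, latest_offset: int, earliest_offset: int
) -> Optional[float]:
    """Return the lag as a fraction of the currently available offset range.

    Mirrors OffsetLagMetric / (LatestOffsetMetric - EarliestOffsetMetric).
    Returns None when the range is empty (assumed handling), since the
    ratio is undefined in that case.
    """
    available_range = latest_offset - earliest_offset
    if available_range <= 0:
        return None
    return offset_lag / available_range


# Example: 250 messages of lag against 1000 retained offsets.
print(lag_fraction_of_range(250, 1500, 500))  # 0.25
```

A value approaching 1.0 would mean the consumer group is close to the earliest retained offset. A value above 1.0 is possible when the group's committed offset is already older than the earliest offset.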
@graphex That's an interesting derivative metric. If your lag is close to your available offset range, that would have obvious consequences for message delivery guarantees. It would be interesting to see how it could be used to tune the retention period for a topic automatically. WDYT of reporting the derivative metric (OffsetLagMetric / (LatestOffsetMetric - EarliestOffsetMetric)) versus just the beginning offset? Is the beginning offset metric useful on its own?
I think the earliest offset is useful on its own for graph display and such. It seems like #71 is the most critical next step, but after getting that, I think adding a few derivative metrics makes sense. I'll outline those in another issue.