This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Inbound federation lag metrics should not report if no messages sent/received. #8716

Closed
michaelkaye opened this issue Nov 4, 2020 · 2 comments · Fixed by #9540
Labels
z-p2 (Deprecated Label)

Comments

@michaelkaye
Contributor

The metrics introduced in #8430 are hard to interpret when no inbound federation traffic is being received from a server.

Zooming in on a graph shows how hard this is to use in practice:

[image: lag]

However, this doesn't show up as clearly at a wider scale, and it isn't obvious that there is an issue: it just looks like the server has a very small amount of lag and the problem has gone away.

[image: lag2]

Not emitting a metric while there aren't enough messages to provide any data seems like a better option here. At the moment we emit only the out-of-date value, so any averages, minimums, etc. will not work correctly.
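To illustrate the "don't emit when there is no data" idea, here is a minimal sketch in plain Python. It is not Synapse's code: `WindowedLag`, `observe` and `collect` are hypothetical names, and the assumption is that whatever exports the metric would skip the sample whenever `collect()` returns `None`.

```python
from typing import List, Optional


class WindowedLag:
    """Hypothetical helper: gathers lag samples between two scrapes."""

    def __init__(self) -> None:
        self._samples: List[float] = []

    def observe(self, lag_seconds: float) -> None:
        # Called once per inbound message with its measured lag.
        self._samples.append(lag_seconds)

    def collect(self) -> Optional[float]:
        """Return the mean lag since the previous collect.

        Returns None when nothing arrived in the window, so the caller
        can omit the metric instead of re-reporting a stale value.
        """
        if not self._samples:
            return None
        mean = sum(self._samples) / len(self._samples)
        self._samples.clear()
        return mean
```

With this shape a quiet server produces a gap in the graph rather than a flat line at the last observed value, so averages and minimums are computed only over windows that actually had traffic.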

@michaelkaye
Contributor Author

The metrics are frozen from around 18:00 to 04:00; the top graph above is the zoomed-in small peak at 18:00.

@erikjohnston erikjohnston added the z-p2 (Deprecated Label) label Nov 5, 2020
@richvdh
Member

richvdh commented Jan 12, 2021

I guess the solution here is probably to export the metric as a timestamp rather than an age, and convert it back to an age on the Prometheus side.
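A rough sketch of the timestamp-instead-of-age idea, in plain Python with no metrics library. The names `LastEventTimestamps`, `record`, `exported_value` and `age_at` are hypothetical and are not taken from Synapse or from the eventual fix. The assumption is that the server exports the timestamp of the newest event it has processed per origin, and the age is derived at query time (on the Prometheus side this would be something along the lines of `time()` minus the exported value).

```python
from typing import Dict, Optional


class LastEventTimestamps:
    """Hypothetical tracker: newest processed event timestamp per origin server."""

    def __init__(self) -> None:
        self._last_ts: Dict[str, float] = {}

    def record(self, server: str, event_ts: float) -> None:
        # Keep only the newest timestamp seen for this server.
        prev = self._last_ts.get(server)
        if prev is None or event_ts > prev:
            self._last_ts[server] = event_ts

    def exported_value(self, server: str) -> Optional[float]:
        # What would be exported as the metric: a timestamp, not an age.
        return self._last_ts.get(server)


def age_at(exported_ts: Optional[float], now: float) -> Optional[float]:
    """Derive the age at query time; None if the server was never seen."""
    if exported_ts is None:
        return None
    return now - exported_ts
```

Because the exported value is a fixed point in time, the derived age keeps growing while no traffic arrives, instead of freezing at the last computed lag.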
