TTL for pushed metrics? #117
Dupe of #19. For the rare times you need to delete a group, you can do so by hand.
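For readers landing here: a minimal sketch of what deleting a group "by hand" amounts to, assuming the documented `DELETE /metrics/job/<job>/<label>/<value>` endpoint. The helper name `group_delete_request` and the host in the usage note are hypothetical, and the sketch rejects label values that are empty or contain a slash, since those need the Pushgateway's special encoding.

```python
from urllib.parse import quote
from urllib.request import Request


def group_delete_request(base_url, job, labels=None):
    """Build (but do not send) a DELETE request for one Pushgateway group.

    Hypothetical helper: the group is identified by its job name plus the
    grouping labels it was pushed with.
    """
    parts = ["metrics", "job", job]
    for name, value in (labels or {}).items():
        parts += [name, value]
    for part in parts:
        # Assumption: keep the sketch simple by refusing values that would
        # need special encoding in the URL path.
        if part == "" or "/" in part:
            raise ValueError(f"unsupported path segment: {part!r}")
    path = "/".join(quote(part, safe="") for part in parts)
    return Request(f"{base_url.rstrip('/')}/{path}", method="DELETE")
```

Passing the returned object to `urllib.request.urlopen` would send it, for example `urlopen(group_delete_request("http://pushgateway.example.org:9091", "some_job", {"instance": "host-1"}))`.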
I'm wondering why this is an anti-pattern. Say I have a batch job 'A' that I run periodically. A month later this job is either (a) no longer required or (b) converted to a daemon with its own exporter. In case 'b' I have a duplicate metric available (from the daemon's exporter and from the Pushgateway). In case 'a' I have a stale metric (information which is no longer valid).
#19 documents the conclusion back then. If you want to bring forward new evidence that justifies re-opening the discussion, please do so on the prometheus-developers mailing list.
Sorry to keep dredging this up, but I would like the devteam to explain the best practice for this situation:
This works well but the queries get slower as the number of unique hostnames increases. This is because every unique instance name is permanently remembered by the pushgateway. The way the pushgateway is designed, it seems like we have these choices:
I think this is a common use case, and that the official documentation should describe what the best practice for this sort of 'ephemeral producer' use case is.
I'm sure the Prometheus community is happy to discuss your use case. But an already closed GitHub issue is not the right place. Could you post to the prometheus-users mailing list where the discussion is accessible for everybody so that more people can benefit from it?
I would like to see a TTL feature too.
I'd also like to see a TTL feature. Having to manually remove stale groups is painful and for me it's not a 'rare time'.
@yumpy maybe you want to look at this fork
I would like to see a TTL feature too, in a stable version.
I would also like this feature. Prometheus is frequently deployed in container infrastructures, where all jobs are ephemeral. This is doubly the case with the Pushgateway, which is designed for ephemeral jobs. Furthermore, mailing lists are where discussions go to die. Email chains fork, they're often only visible to a small group, and they're frequently lost. GitHub persists context and conversation across years, as this issue shows. Garbage collecting pushed metrics is nearly impossible in the Prometheus model, because it's hard to know when a metric is no longer relevant. Fortunately, most of us don't need perfect: we need good enough. And stale metric deletion is good enough for most use cases.
I have the same problem. I run hundreds of jobs every day and need to monitor their status. Each job finishes after a few hours, but its metric is still there. New jobs arrive continuously, so the Pushgateway keeps accumulating them. Meanwhile, given that those are ephemeral values, I guess I can delete all entries before adding new ones.
Could you please take #117 (comment) into account? Really, folks, you are using the wrong forum to express your concerns.
Hi, it appears that the Pushgateway doesn't support any form of TTL for pushed metrics. Yes, I've seen this link: https://prometheus.io/docs/practices/pushing/. However, a cache should be invalidated under the right circumstances, and I think that introducing a TTL (e.g. with a "meta" label like 'push_gateway_ttl_seconds') could help in removing stale cached metrics.
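To make the proposal concrete, here is a rough sketch of the expiry check such a TTL could imply. It is only an illustration, not an existing Pushgateway feature: the `push_gateway_ttl_seconds` label is the one suggested above, while the group dictionary shape (`labels`, `last_push`) and the function name `expired_groups` are hypothetical. The last push time is assumed to be a Unix timestamp, such as the value the Pushgateway reports per group in `push_time_seconds`.

```python
import time

# Label name proposed in this issue; the Pushgateway does not interpret it.
TTL_LABEL = "push_gateway_ttl_seconds"


def expired_groups(groups, now=None):
    """Return the groups whose TTL label has elapsed since their last push.

    Each group is assumed to look like:
        {"labels": {"job": "...", TTL_LABEL: "60"}, "last_push": 1700000000.0}
    Groups without the TTL label are never treated as expired.
    """
    now = time.time() if now is None else now
    stale = []
    for group in groups:
        ttl = group["labels"].get(TTL_LABEL)
        if ttl is None:
            continue  # no TTL requested: keep the current keep-forever behaviour
        if now - group["last_push"] > float(ttl):
            stale.append(group)
    return stale
```

A periodic external sweeper could feed the result into per-group delete requests, which keeps the expiry policy outside the Pushgateway itself.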