Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multiple samples within same metric 'NewMetricWithTimestamp' #1137

Closed
machadovilaca opened this issue Sep 20, 2022 · 12 comments · Fixed by #1181
Closed

Support for multiple samples within same metric 'NewMetricWithTimestamp' #1137

machadovilaca opened this issue Sep 20, 2022 · 12 comments · Fixed by #1181

Comments

@machadovilaca
Copy link
Contributor

This might not even be considered a bug, but I think this is a very annoying behavior.

When using NewMetricWithTimestamp in a Collector as in:

// {
//     Name: "metric_test_1", Description: "metric_test_1 description",
//     Values: [ { value: 1,  timestamp: 1663713926158 }, { value: 2,  timestamp: 1663713937491 } ] 
// } 

desc := prometheus.NewDesc(m.Name, m.Description, nil, nil)

for _, value := range m.Values {
  ch <- prometheus.NewMetricWithTimestamp(
    value.timestamp,
    prometheus.MustNewConstMetric(desc, prometheus.GaugeValue, value.value),
  )
}

The client is running checkMetricConsistency for each of the pushed metrics which is throwing the following error

...
* collected metric "metric_test_1" { gauge:<value:2 > timestamp_ms:1663713937491 } was collected before with the same name and label values
...

I think the validation should be aware of the metrics timestamps and not throw an error in this case because the metrics belong to different point in times, and there are use cases where it makes sense to allow multiple ones to be pushed in the same scrape call.

@bwplotka
Copy link
Member

bwplotka commented Nov 3, 2022

Hey, I think his is not desired functionality, I am afraid. ):

Prometheus clients are optimized to present the current state (or last known state) of data to a potential collector/scraper/Prometheus. Even if we somewhat hack it to put multiple samples of the same series, Prometheus and other scrapers will most likely fail. In other words, you would need to propose a such a change to the https://github.com/OpenObservability/OpenMetrics standard. Only if it's there, will we implement it such extension (which I doubt will be accepted).

Perhaps you might need to look in Otel Go SDK and push the metric paradigm to have functionality like this, but you need to understand the huge impact of buffering multiple samples in your clients.

Hope that makes sense. I will close this issue to avoid confusion but feel free to discuss - maybe I missed something. We can always reopen. Thanks!

@bwplotka bwplotka closed this as completed Nov 3, 2022
@machadovilaca
Copy link
Contributor Author

@bwplotka I was just checking the OpenMetrics standard specification. In the
Gauge MetricPoint section, isn't what I proposed on of the use cases they show?

An example with a Metric with no labels and two MetricPoints with timestamps:

# TYPE foo gauge
foo 17.0 123
foo 18.0 456

@machadovilaca
Copy link
Contributor Author

By the way, just tested this with the Python client, and it supports what I suggested

import time
import random
import datetime
from prometheus_client.core import GaugeMetricFamily, REGISTRY
from prometheus_client import start_http_server

class RandomNumberCollector(object):
    def __init__(self):
        pass
    def collect(self):
        gauge = GaugeMetricFamily("random_number", "A random number generator, I have no better idea", labels=["randomNum"])
        gauge.add_metric(['random_num'], random.randint(1, 20), int(datetime.datetime.utcnow().timestamp()))
        time.sleep(1)
        gauge.add_metric(['random_num'], random.randint(1, 20), int(datetime.datetime.utcnow().timestamp()))
        yield gauge

if __name__ == "__main__":
    port = 9000
    frequency = 1
    start_http_server(port)
    REGISTRY.register(RandomNumberCollector())
    while True:
        time.sleep(frequency)
...
# HELP random_number A random number generator, I have no better idea
# TYPE random_number gauge
random_number{randomNum="random_num"} 14.0 1668885744000
random_number{randomNum="random_num"} 18.0 1668885745000

@bwplotka
Copy link
Member

Client supports, but how do you want to consume it? Do you plan to use Prometheus or Prometheus-like scraper?

@bwplotka
Copy link
Member

bwplotka commented Nov 21, 2022

You got me with OM statement: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#metric-types-1

..and https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#metricpoint-1

I don't understand yet, why this is there, but you are right, OM supports it. Trying to get more information about it.

BTW Again, what's your use case? Why do you want this done? (:

@bwplotka bwplotka changed the title was collected before with the same name and label values Error when using 'NewMetricWithTimestamp' Support for multiple samples within same metric 'NewMetricWithTimestamp' Nov 21, 2022
@bwplotka bwplotka reopened this Nov 21, 2022
@machadovilaca
Copy link
Contributor Author

I was planning to use Prometheus.
My use case is just a tool I am developing to help in creating and testing metrics/alerts. What I was looking for was the ability to quickly create multiple metric values in different timestamps in the last hour/day or so.

@bwplotka
Copy link
Member

Please check how this is consumed by Prometheus. It's likely Prometheus just drops other samples, but let's see. Maybe there is some work to be done there. For 2-4 samples it's ok, but more than that on scale, the scrape will be simply too large.

Given OM supports it I think I might change my mind towards implementation here (:

@machadovilaca
Copy link
Contributor Author

They are

func (s *httpServer) handleMetricReq(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)

	now := time.Now()

	w.Write([]byte("# HELP random_number A random number generator, I have no better idea\n"))
	w.Write([]byte("# TYPE random_number gauge\n"))

	w.Write([]byte("random_number{randomNum=\"random_num\"} 100.0 " + strconv.FormatInt(now.Add(-60*time.Minute).Unix(), 10) + "000\n"))
	w.Write([]byte("random_number{randomNum=\"random_num\"} 180.0 " + strconv.FormatInt(now.Add(-20*time.Minute).Unix(), 10) + "000\n"))
	w.Write([]byte("random_number{randomNum=\"random_num\"} 120.0 " + strconv.FormatInt(now.Add(-10*time.Minute).Unix(), 10) + "000\n"))
	w.Write([]byte("random_number{randomNum=\"random_num\"} 140.0 " + strconv.FormatInt(now.Add(-5*time.Minute).Unix(), 10) + "000\n"))
}

Screenshot from 2022-11-26 15-41-02

@bwplotka
Copy link
Member

Cool, thanks for checking. I think you convinced me (: We might want to go through other APIs too with @kakkoyun , but we can def reopen your PR if you still want to help make that work! #1142 - Can we please add some unit test too?

@machadovilaca
Copy link
Contributor Author

I don't know if @Elecon-rou wants to proceed with it, if not I can open a new PR

@Elecon-rou
Copy link

Elecon-rou commented Nov 28, 2022

@bwplotka @machadovilaca I'll try to add the test, sure

@machadovilaca
Copy link
Contributor Author

@bwplotka I created a new PR with the fix and a unit test for the metric consistency check

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants