Improve Prometheus receiver observability #4973
Comments
Hi, the collector already exposes internal metrics on its own Prometheus endpoint. Could the prometheus library's metrics be exposed there as well?
That may be possible. The promhttp.HandlerFor() method provides a single handler for a single gatherer, but we might be able to implement a composite handler to add both to the same endpoint. I think we would want to only add the prometheus server metrics if a prometheus receiver is being used. Otherwise, it's a lot of extra metrics without benefit.
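For illustration, a minimal sketch of such a composite handler using client_golang's prometheus.Gatherers helper; collectorRegistry is a hypothetical stand-in for whatever gatherer already backs the collector's telemetry endpoint:

```go
package telemetry

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// newCompositeHandler serves two gatherers from a single /metrics endpoint:
// the collector's own registry plus the library's DefaultGatherer, which is
// where the prometheus server (library) code registers its internal metrics.
func newCompositeHandler(collectorRegistry prometheus.Gatherer) http.Handler {
	combined := prometheus.Gatherers{
		collectorRegistry,
		prometheus.DefaultGatherer,
	}
	return promhttp.HandlerFor(combined, promhttp.HandlerOpts{})
}
```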
We had a few discussions in the past about handling our "own" telemetry reporting. I have a task to document what component owners should do, but I'm blocked by the current state of the otel-go SDK. The main idea is that we want component owners to use the otel-api for instrumentation, and to allow operators to specify a special telemetry pipeline to send the collector's own telemetry through.
We probably wouldn't get what we are looking for by directly using the otel-go API, since we want to know what is going on inside the prometheus server (library), which uses the prometheus client. The original proposal above of writing a bridge from prometheus.Gatherer to the OTel-go API seems likely to be the best option.
I believe we can already do that: https://github.com/open-telemetry/opentelemetry-go/blob/main/exporters/prometheus/prometheus.go#L89-L123 I believe that by doing the following:
This will provide us with a MeterProvider which we can use in the OTel pipeline. I'll need to double-check how to use this meter provider though.
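For reference, roughly what that looks like with the current go.opentelemetry.io/otel/exporters/prometheus API (which has changed since the code linked above, so treat this as a sketch rather than the exact snippet intended here). As the next comment points out, this wires things up in the OTel -> Prometheus direction:

```go
package main

import (
	"go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func newMeterProvider() (*sdkmetric.MeterProvider, error) {
	// The exporter acts as a metric Reader that publishes OTel-instrumented
	// metrics to a prometheus.Registerer (the default registerer unless
	// another one is configured).
	exporter, err := prometheus.New()
	if err != nil {
		return nil, err
	}
	return sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter)), nil
}
```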
I think that lets you go OTel -> prometheus, not the other way around.
🤦 You are right! I always thought that by passing in a registry it would read from it in the other direction. I'll have to see how to get a prometheus Gatherer's metrics into the OTel pipeline.
What is the status of this issue? I have a related need as well.
Can we build an extension to expose the metrics in prometheus.DefaultGatherer?
That's not a bad idea short-term, but long-term we probably want to unify the self-observability of components.
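As a rough sketch of that short-term idea, an extension could simply serve prometheus.DefaultGatherer over HTTP; the address, path, and function names below are illustrative only, and a real extension would plug into the collector's Start/Shutdown lifecycle:

```go
package promselfmetrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// serveDefaultGatherer exposes everything registered with
// prometheus.DefaultGatherer, including the prometheus server library's
// internal metrics, on its own endpoint.
func serveDefaultGatherer(addr string) *http.Server {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.HandlerFor(prometheus.DefaultGatherer, promhttp.HandlerOpts{}))

	srv := &http.Server{Addr: addr, Handler: mux}
	go func() {
		// Errors are dropped in this sketch; a real extension would surface them.
		_ = srv.ListenAndServe()
	}()
	return srv
}
```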
I would like to know if a metric was scraped by the prometheus receiver, and from which scrape target. Most receivers set the instrumentation scope name to the receiver name (e.g. …). I'm not sure if this is worthy of its own issue and discussion, or if it should be part of this one.
I would recommend opening a separate issue. I think that would be a good idea. It is also related to open-telemetry/opentelemetry-specification#2703
Hi, any progress on this? @dashpole, could we use an approach like open-telemetry/opentelemetry-collector#6297?
Yes. I think it might actually be really easy to add these metrics to our current prometheus endpoint. The harder part would be to make those metrics work if we aren't using a Prometheus exporter for the metrics. That would require a Prometheus bridge of some sort.
Prometheus bridge: open-telemetry/opentelemetry-go#4351
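For context, a sketch of how that bridge can be plugged into the metrics SDK, assuming the metric-producer API it exposes (the go.opentelemetry.io/contrib/bridges/prometheus package path and the stdout exporter are assumptions used to keep the example self-contained):

```go
package main

import (
	"go.opentelemetry.io/contrib/bridges/prometheus"
	"go.opentelemetry.io/otel/exporters/stdout/stdoutmetric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func newBridgedMeterProvider() (*sdkmetric.MeterProvider, error) {
	// Any metrics exporter works here; stdout keeps the sketch self-contained.
	exporter, err := stdoutmetric.New()
	if err != nil {
		return nil, err
	}

	// The bridge gathers from prometheus.DefaultGatherer by default and feeds
	// the result to the reader alongside SDK-instrumented metrics, i.e. the
	// Prometheus -> OTel direction discussed above.
	bridge := prometheus.NewMetricProducer()
	reader := sdkmetric.NewPeriodicReader(exporter, sdkmetric.WithProducer(bridge))

	return sdkmetric.NewMeterProvider(sdkmetric.WithReader(reader)), nil
}
```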
Is your feature request related to a problem? Please describe.
The current set of otelcol receiver observability metrics seems to be otelcol_receiver_accepted_metric_points and otelcol_receiver_refused_metric_points. Compared with running a prometheus server, this is missing a lot of metrics which are useful for debugging. For example, the prometheus_sd_discovered_targets metric can tell me which targets my config has discovered, and the prometheus_target_metadata_cache_bytes metric tells me how large the receiver's metadata cache is.
Describe the solution you'd like
Allow ingesting metrics from the prometheus.DefaultGatherer into the metrics pipeline. To accomplish this, implement a "bridge" from the prometheus gatherer to the OpenTelemetry APIs. Add configuration to the prometheus receiver to enable the collection of these metrics. Disable collection by default.
Describe alternatives you've considered
1. Add these metrics to the collector's own telemetry endpoint and self-scrape it with the prometheus receiver.
   a. pros: Simple
   b. cons: Self-scraping requires additional configuration; self-scraping is inefficient compared to in-process alternatives.
2. Do nothing.
   b. cons: Difficult to debug prometheus service discovery, target health, caching, and other problems.
Additional context
While it is not in the stated design goals of the collector, it would be useful to be able to insert a collector into an existing prometheus pipeline:
Before:
prometheus application -> prometheus server
After:
prometheus application -> opentelemetry collector -> prometheus server
This would allow users to make use of opentelemetry features (additional receivers/processors), or help facilitate migrations to/from the prometheus server.
These operational metrics are one of the things a user would currently "lose" if they inserted an opentelemetry collector into their prometheus pipeline.