Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus receiver does not scrape some Summary type metrics #1638

Closed
djjayeeta opened this issue Aug 25, 2020 · 12 comments
Closed

Prometheus receiver does not scrape some Summary type metrics #1638

djjayeeta opened this issue Aug 25, 2020 · 12 comments
Assignees
Labels
bug Something isn't working help wanted Good issue for contributors to OpenTelemetry Service to pick up priority:p2 Medium
Milestone

Comments

@djjayeeta
Copy link

djjayeeta commented Aug 25, 2020

Describe the bug
A clear and concise description of what the bug is.

Steps to reproduce
If possible, provide a recipe for reproducing the error.

What did you expect to see?
I have a Prometheus receiver + file exporter + logging exporter config. I expect to see all gauge, histogram and summary metric in the json file from file exporter.

What did you see instead?
I can only see Gauge metrics, some of histogram and summary in the json file from fileexporter and logging exporter.

Specifically _sum, _count for summary metrics are not available

The Prometheus receiver translates counter, histogram, and summary metrics into OTLP metrics with non-cumulative temporality.

Is this the expected behavior with Promethues receiver? If so do I need modify/add a new fileexporter or the receiver to handle this scenario?

What version did you use?

otel/opentelemetry-collector-dev:latest@sha256:ceebae71600ece5b280f7b4f74e7dfde8670761cf84b5c2a5f17089e8db31be0

What config did you use?

receivers:
        prometheus:
          use_start_time_metric: false
          config:
            global:
              scrape_interval: 1m
            scrape_configs:
            - bearer_token_file:TOKEN
              job_name: kube-apiserver
              kubernetes_sd_configs:
              - role: pod
                selectors:
                - role: pod
                  field: "spec.nodeName=$NODE_NAME"
              params:
                component:
                - kube-apiserver
              metric_relabel_configs:
              - action: keep
                regex: (apiserver_admission_step_admission_duration_seconds_summary_sum)
                source_labels:
                - __name__
              relabel_configs:
              - action: labelmap
                regex: __meta_kubernetes_pod_label_(.+)
              - action: replace
                regex: ([^:]+)(?::\d+)?
                replacement: "${ARG1}:9443"
                source_labels:
                - __address__
                target_label: __address__
              - action: keep
                regex: kube-apiserver
                source_labels:
                - __meta_kubernetes_pod_label_component
              - action: replace
                source_labels:
                - __meta_kubernetes_namespace
                target_label: namespace
              - action: replace
                source_labels:
                - __meta_kubernetes_pod_name
                target_label: pod
              scheme: https
              tls_config:
                cert_file: CERT_FILE
                insecure_skip_verify: true
                key_file: KEY_FILE
      exporters:
        file:
          path: /data/exporter/metrics-data.json
        logging:
          loglevel: debug
          sampling_initial: 5
          sampling_thereafter: 5
      service:
        pipelines:
          metrics/prom:
            receivers:
              - prometheus
            exporters:
              - logging
              - file

Environment
NA

Additional context
My metrics are exposed in prometheus format, so I am left with the choice of using prometheus receiver. The above config works for a prometheus server. So the metric is available.

cc @huyan0 @alolita @jmacd @bogdandrutu

@djjayeeta djjayeeta added the bug Something isn't working label Aug 25, 2020
@bogdandrutu
Copy link
Member

This is a bug in the prometheus receiver, see #1255

@bogdandrutu
Copy link
Member

I don't think the file exporter does not work, it exports the entire data as json, are you sure you receive the data?

@djjayeeta
Copy link
Author

Yes, the logging exporter shows the histogram and summary metrics but I see them as DoubleDataPoints

b.logEntry("DoubleDataPoints #%d", i)

@bogdandrutu
Copy link
Member

That is not a histogram or summary it is a double data point.

@bogdandrutu
Copy link
Member

Please add the logs.

@djjayeeta djjayeeta changed the title File exporter does not export Histogram, Summary metrics File exporter does not export some Summary type metrics Aug 26, 2020
@djjayeeta
Copy link
Author

djjayeeta commented Aug 26, 2020

@bogdandrutu Sorry for the late response, it took me some time to debug. Both the logging exporter and file exporter is not able to export some summary type metrics, specially the _sum type

My config file

receivers:
        prometheus:
          use_start_time_metric: false
          config:
            global:
              scrape_interval: 1m
            scrape_configs:
            - bearer_token_file:TOKEN
              job_name: kube-apiserver
              kubernetes_sd_configs:
              - role: pod
                selectors:
                - role: pod
                  field: "spec.nodeName=$NODE_NAME"
              params:
                component:
                - kube-apiserver
              metric_relabel_configs:
              - action: keep
                regex: (apiserver_admission_step_admission_duration_seconds_summary_sum)
                source_labels:
                - __name__
              relabel_configs:
              - action: labelmap
                regex: __meta_kubernetes_pod_label_(.+)
              - action: replace
                regex: ([^:]+)(?::\d+)?
                replacement: "${ARG1}:9443"
                source_labels:
                - __address__
                target_label: __address__
              - action: keep
                regex: kube-apiserver
                source_labels:
                - __meta_kubernetes_pod_label_component
              - action: replace
                source_labels:
                - __meta_kubernetes_namespace
                target_label: namespace
              - action: replace
                source_labels:
                - __meta_kubernetes_pod_name
                target_label: pod
              scheme: https
              tls_config:
                cert_file: CERT_FILE
                insecure_skip_verify: true
                key_file: KEY_FILE
      exporters:
        file:
          path: /data/exporter/metrics-data.json
        logging:
          loglevel: debug
          sampling_initial: 5
          sampling_thereafter: 5
      service:
        pipelines:
          metrics/prom:
            receivers:
              - prometheus
            exporters:
              - logging
              - file

With the same config I launched a prometheus server which is able to scrape the metric. I think this is issue of promethues receiver, which is not able to scrape apiserver_admission_step_admission_duration_seconds_summary_sum metric. I am not sure of the reason why

This is specially interesting as apiserver_admission_step_admission_duration_seconds_summary is scrapped by the receiver and exported properly. Am I missing something?

@djjayeeta djjayeeta changed the title File exporter does not export some Summary type metrics Prometheus receiver does not scrape some Summary type metrics Aug 26, 2020
@djjayeeta
Copy link
Author

I think I get it now, the _sum and _count metrics are summarized in the same metric https://github.com/open-telemetry/opentelemetry-collector/blob/master/receiver/prometheusreceiver/internal/metricfamily.go#L157.

Out of curiosity Is there a specific reason to do that rather than leaving the metric as it is. ex we get apiserver_admission_step_admission_duration_seconds_summary_sum as a separate metric.

@tigrannajaryan tigrannajaryan added this to the Backlog milestone Sep 2, 2020
@tigrannajaryan tigrannajaryan added the help wanted Good issue for contributors to OpenTelemetry Service to pick up label Sep 2, 2020
@djjayeeta
Copy link
Author

There is also a similar issue. I have a metric named X(gauge) and X_count(counter). Both metric have different set of labels. Due to the logic https://github.com/open-telemetry/opentelemetry-collector/blob/master/receiver/prometheusreceiver/internal/metricsbuilder.go#L199 they get merged in a single metric of gauge type(union of labels from both X and X_count), though the data points are not lost but it would require to custom logic to extract individual ones. Can we make this trimming optional? A boolean flag which will be default true and we can set it false for not trimming suffix.

I would also like to know the reasoning behind trimming suffixes. Is there some major drawback if we keep the suffix

@rmfitzpatrick
Copy link
Contributor

rmfitzpatrick commented Oct 1, 2020

I've run into this and think the issue is that summary types aren't supported by the internal pdata types: https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/internaldata/oc_to_metrics.go#L227. @bogdandrutu is there an inherent limitation to this or could support be added relatively easily? I'd be happy to take a crack at it if immediately feasible.

edit: this comment may apply primarily to #1255

@jmacd
Copy link
Contributor

jmacd commented Oct 6, 2020

One proposal for Summary can be found here:
open-telemetry/opentelemetry-specification#982
(search for "A minor note about about Summary data points" in the description).

tigrannajaryan pushed a commit that referenced this issue Nov 6, 2020
_Note:_ This PR is based off the Summary datapoint proto (that has not been merged upstream) in this [PR](https://github.com/open-telemetry/opentelemetry-proto/pull/227/files).

**Description:** 

This PR has two main parts: 

* Implement the [Summary datapoint proto](https://github.com/open-telemetry/opentelemetry-proto/pull/227/files) in the OTel Collector. Changes were made to _metrics_struct.go_ to create the generated Summary structs (using the `make genproto` and `make genpdata` commands).
* Add in the conversion between OpenCensus summary metrics and OTel Summary datapoints. This conversion directly adds compatibility for the Prometheus Summary metric in the PrometheusReceiver. 

**Link to tracking Issue:** [opentelemetry-specification/issues/1146](open-telemetry/opentelemetry-specification#1146) and [opentelemetry-collector/issues/1638](#1638)

A lot of the code in this PR is auto-generated from creating the structs and protos. The following files are were auto-generated:
* generated_metrics.go
* generated_metrics_test.go
* trace_config.pb.go 
* metrics.pg.go

**Testing:**

* Updated unit tests in _oc_to_metrics.go_ for conversion between OC summary metrics to OTel summary metrics
* Created end-to-end tests for the Summary metric in PrometheusReceiver
@JasonXZLiu
Copy link
Member

@tigrannajaryan Would you be able to assign this to me? Thank you!
I'll add the summary metric to the logging exporter so we can close this issue.

@JasonXZLiu
Copy link
Member

This issue is resolved (through PRs #2048 and #2084) and can be closed.

MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this issue Nov 11, 2021
* Unexport the simple and batch SpanProcessors

* Update changes in changelog
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this issue Apr 27, 2023
…can break metrics for Kubernetes pods and containers, pinning this receiver's version to v0.51.0 (open-telemetry#1638)
Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this issue Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Good issue for contributors to OpenTelemetry Service to pick up priority:p2 Medium
Projects
None yet
Development

No branches or pull requests

7 participants