
[exporter/datadog] Trace stats are severely underreporting hit counts #31713

Closed
sirianni opened this issue Mar 12, 2024 · 9 comments


sirianni commented Mar 12, 2024

Component(s)

exporter/datadog

What happened?

Description

The trace.* stats calculated within the datadogexporter are severely underreporting hit counts in many cases.

We confirmed this by comparing the span counts.

[Screenshot comparing span counts omitted.]

Strangely, this discrepancy is not observed for a few HTTP routes (e.g. /metrics/logical_clusters in the above screenshot).

We do not have sampling enabled anywhere in our OTel Collector configuration.

Collector version

v0.93.0

Environment information

Environment

Kubernetes

OpenTelemetry Collector configuration

exporters:
  datadog:
    metrics:
      resource_attributes_as_tags: true
      instrumentation_scope_metadata_as_tags: true
      summaries:
        mode: noquantiles
    traces:
      compute_stats_by_span_kind: false
      peer_tags_aggregation: false
      trace_buffer: 1000
    host_metadata:
      enabled: false
    sending_queue:
      queue_size: 200

Log output

No response

Additional context

No response

@sirianni added the bug and needs triage labels Mar 12, 2024
sirianni (Contributor, Author) commented:

@mx-psi @dineshg13 @mackjmr FYI

github-actions bot added the exporter/datadog label Mar 12, 2024
github-actions bot commented:

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@mx-psi added the priority:p1 and data:traces labels and removed the needs triage label Mar 12, 2024

backjo commented Mar 18, 2024

I think this is probably related to #31219 (and a bug). If you're not using the connector at all and just the exporter, the Datadog exporter sends Datadog-Client-Computed-Stats: true as a header, which tells Datadog that the APM stats have already been computed; that is only true if you are using the connector.
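
For anyone hitting this, a minimal sketch of computing APM stats via the datadog connector (so the Datadog-Client-Computed-Stats header is accurate) might look like the following. The otlp receiver, batch processor, and API key placeholder are illustrative assumptions, not taken from this issue:

connectors:
  datadog/connector:

exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog/connector, datadog]
    metrics:
      receivers: [datadog/connector]
      processors: [batch]
      exporters: [datadog]

The key point is that the traces pipeline fans out to both the connector and the exporter, and the connector feeds the stats it computes into a metrics pipeline that exports to Datadog.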

github-actions bot commented:

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label May 20, 2024
dineshg13 (Member) commented:

@sirianni are you still facing the issue? Can you please let us know if you were able to try the latest version of the connector?

github-actions bot commented:

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Jul 22, 2024
lucassantoss1701 commented:

I’m facing a similar issue in an application, where there’s a discrepancy between the counters of metrics generated from span tags and the metrics automatically generated by the otelhttp library.

For example, the API receives 1,000 hits within a certain period, and during execution, it consistently makes calls to another API.

When making these calls, we generate a client-type span, which leads to Datadog creating a client APM metric. This allows us to filter and monitor the application's dependencies. However, the value of this metric significantly differs from the metrics generated by otelhttp, causing inconsistencies in the data we're monitoring.

I’m not sure if this is related to the warning in the Datadog documentation that metrics generated from traces can have inconsistencies.
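
(Not a confirmed fix, just a hedged note: in the reporter's config above, both compute_stats_by_span_kind and peer_tags_aggregation are disabled. If the discrepancy shows up specifically on client spans, a sketch of enabling those exporter options would be:)

exporters:
  datadog:
    traces:
      compute_stats_by_span_kind: true
      peer_tags_aggregation: true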

github-actions bot removed the Stale label Aug 24, 2024
github-actions bot commented:

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label Oct 24, 2024
github-actions bot commented:

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned Dec 23, 2024