
[exporter/otlphttp] Getting errors "VALUE_INVALID", "DIMENSION_KEY_OR_VALUE_EMPTY", "METRIC_UNIT_INVALID_CHARACTERS", "OTLP_MESSAGE_MAX_SIZE_EXCEEDED" #33650

Closed
vaibhhavv opened this issue Jun 19, 2024 · 7 comments
Labels: bug, needs triage

Comments


vaibhhavv commented Jun 19, 2024

Component(s)

No response

What happened?

Description

We are using the opentelemetry-collector Helm chart v0.72.0.
Our use case: we receive data from our consumers in our OpenTelemetry Collector and want to export that data to Dynatrace. After configuring the exporter and pipeline, we see a few error/warning logs in the collector, and we want to eliminate them.

Steps to Reproduce

Expected Result

There should be no warnings/errors in the collector logs.

Actual Result

2024-06-18T05:19:22.476Z	warn	{"kind": "exporter", "data_type": "metrics", "name": "otlphttp/export-to-dynatrace", "message": "The following issues were encountered while ingesting OTLP metrics:\nErrors:\nMetric value dropped. key: 'traffic_duration_p99' details: 'Value was NaN' - Reason: VALUE_INVALID\nMetric value dropped. key: 'traffic_duration_p99' details: 'Value was NaN' - Reason: VALUE_INVALID\nMetric value dropped. key: 'routing_duration_p95' details: 'Value was NaN' - Reason: VALUE_INVALID\nMetric value dropped. key: 'routing_duration_p95' details: 'Value was NaN' - Reason: VALUE_INVALID\nMetric value dropped. key: 'routing_duration_p95' details: 'Value was NaN' - Reason: VALUE_INVALID\n", "dropped_data_points": 45}

2024-06-18T05:19:22.476Z	warn	otlphttpexporter@v0.97.0/otlp.go:355	Partial success response	{"kind": "exporter", "data_type": "metrics", "name": "otlphttp/export-to-dynatrace", "message": "The following issues were encountered while ingesting OTLP metrics:\nWarnings:\nDimension dropped. key: 'connection_type' value: '' - Reason: DIMENSION_KEY_OR_VALUE_EMPTY\nDimension dropped. key: 'connection_type' value: '' - Reason: DIMENSION_KEY_OR_VALUE_EMPTY\nDimension dropped. key: 'connection_type' value: '' - Reason: DIMENSION_KEY_OR_VALUE_EMPTY\n", "dropped_data_points": 0}

2024-06-18T05:19:22.506Z	warn	otlphttpexporter@v0.97.0/otlp.go:355	Partial success response	{"kind": "exporter", "data_type": "metrics", "name": "otlphttp/export-to-dynatrace", "message": "The following issues were encountered while ingesting OTLP metrics:\nWarnings:\nUnit 'Total requests to get a data by its ID.' for metric 'get_data_by_id_total' dropped - Reason: METRIC_UNIT_INVALID_CHARACTERS\\n, "dropped_data_points": 0}

2024-06-18T05:19:22.747Z	error	exporterhelper/queue_sender.go:101	Exporting failed. Dropping data.	{"kind": "exporter", "data_type": "metrics", "name": "otlphttp/export-to-dynatrace", "error": "not retryable error: Permanent error: error exporting items, request to https://<some-endpoint> responded with HTTP Status Code 413, Message=All metric data points were rejected. The following issues were encountered while ingesting OTLP metrics:\nErrors:\nThe OTLP metrics message exceeded the size limit - Reason: OTLP_MESSAGE_MAX_SIZE_EXCEEDED\n, Details=[]", "dropped_items": 17181}
go.opentelemetry.io/collector/exporter/exporterhelper.newQueueSender.func1

Collector version

v0.87.0

Environment information

Environment

We have a Kubernetes cluster and the collector is deployed as a Deployment.

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      http:
        endpoint: ${env:MY_POD_IP}:4318
exporters:
  otlphttp/export-to-dynatrace:
    endpoint: "DYNATRACE_ENDPOINT"
    headers:
      Authorization: "Api-Token ${env:DYNATRACE_TOKEN}"
    retry_on_failure:
      enabled: true
      max_elapsed_time: 600s
    sending_queue:
      enabled: true
      queue_size: 1000
service:
  pipelines:
    metrics/export-to-dynatrace:
      receivers:
        - otlp
      processors: []
      exporters:
        - otlphttp/export-to-dynatrace

Log output

No response

Additional context

No response

vaibhhavv added the bug and needs triage labels on Jun 19, 2024
vaibhhavv changed the title to [exporter/otlphttp] Getting errors "VALUE_INVALID", "DIMENSION_KEY_OR_VALUE_EMPTY", "METRIC_UNIT_INVALID_CHARACTERS", "OTLP_MESSAGE_MAX_SIZE_EXCEEDED" on Jun 19, 2024
codeboten (Contributor)

@vaibhhavv it looks like the payload is too big? Pinging the Dynatrace contributors here: @evan-bradley @dyladan @arminru


vaibhhavv commented Jun 20, 2024

@codeboten yes, the payload can be big in our scenario, but the other warnings/errors are also a concern for us.
Also, if there are approaches to handling big payloads in production while eliminating the errors above, it would be great if the OTel/Dynatrace experts could share them.

vaibhhavv (Author)

Hi @evan-bradley @dyladan, can you please share your expertise here?

joaopgrassi (Member)

Hi @vaibhhavv, I also work for Dynatrace and can help you with this.

First, our OTLP metrics API has limits on both the size of the OTLP request and the number of metric data points per request. You can find these on our limits page.

In your case, the message is being dropped because it exceeds the maximum size (4 MB); that's why you get the OTLP_MESSAGE_MAX_SIZE_EXCEEDED error. It would also exceed the maximum number of metric data points per request (15k).

To solve this, we recommend using the collector's batch processor. You can find an example of setting up the batch processor in our documentation: Batch OTLP requests. That is an initial example which you can tweak to meet our API limits and make sure your requests don't get throttled (read more here).
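
For illustration, here is a minimal sketch of what that could look like with the configuration above. This is a hedged example rather than official guidance: the processor name and the send_batch_size / send_batch_max_size / timeout values are assumptions you would tune against the limits mentioned:

processors:
  batch/export-to-dynatrace:
    # Assumed values: keep each batch well below the 15k data point and 4 MB request limits.
    send_batch_size: 1000
    send_batch_max_size: 1000
    timeout: 10s
service:
  pipelines:
    metrics/export-to-dynatrace:
      receivers:
        - otlp
      processors:
        - batch/export-to-dynatrace
      exporters:
        - otlphttp/export-to-dynatrace

send_batch_max_size caps how large a single batch can get, so even bursts of incoming data are split into requests that stay within the API limits.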

Now, the warnings you get are due to the data not conforming to our guidelines. For example, the dimension connection_type has an empty value, which we don't allow, so the dimension is dropped. The same goes for the METRIC_UNIT_INVALID_CHARACTERS error: we only allow letters in units, and yours contain spaces.
The VALUE_INVALID one is caused by the metric value being NaN, which we also don't support, so the value is dropped.

These validations and rules can also be found in our docs

You may want to look at the source of such metrics and adapt them so they comply with our guidelines. Then you should no longer see warnings/partial-success responses.
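
If adapting the metric sources isn't immediately possible, some of this cleanup can also be done in the collector. Below is a hedged sketch using the contrib transform processor with OTTL; the statements, the processor name, and the replacement unit "1" are assumptions based on the attribute and metric names in the logs above:

processors:
  transform/dynatrace-cleanup:
    metric_statements:
      - context: metric
        statements:
          # Assumed fix: replace the sentence-style unit with a valid one.
          - set(unit, "1") where name == "get_data_by_id_total"
      - context: datapoint
        statements:
          # Drop the dimension when its value is empty instead of having it rejected.
          - delete_key(attributes, "connection_type") where attributes["connection_type"] == ""

NaN values are usually best fixed in the instrumentation that produces them. If used, this processor would be listed in the metrics pipeline's processors section alongside the batch processor.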

vaibhhavv (Author)

Hi @joaopgrassi, first of all, many thanks for your inputs.

  1. We have now enabled the batch processor with the values mentioned in the initial example and can see that the OTLP_MESSAGE_MAX_SIZE_EXCEEDED errors are gone from the logs.
  2. Correct, as you suggested, we will try to adapt the metrics and attributes to use proper values.

One quick question: let's say we are exporting metrics to Dynatrace and one of the metric attributes does not fulfil the criteria, causing warnings in our logs.
In the backend, will that metric still be exported to Dynatrace without that attribute, or will the whole metric get dropped?


joaopgrassi commented Jun 26, 2024

Great! I'm glad that things are working now :)

One quick question: let's say we are exporting metrics to Dynatrace and one of the metric attributes does not fulfil the criteria, causing warnings in our logs.
In the backend, will that metric still be exported to Dynatrace without that attribute, or will the whole metric get dropped?

When there are issues with the metric attributes, the metric itself is still ingested in Dynatrace. The only situation where attributes can cause a metric to be dropped is if you go over the limit of 50 attributes per metric. Another case is when the metric name/key itself is entirely invalid.

All of these cases are also explained on our limits/limitations page (see the table there).

What usually happens in such cases, though, is that we "normalize" the data to fit our standards. For example, invalid characters are replaced with _ and uppercase letters are converted to lowercase, so most dimensions, even when invalid, are still ingested after this process.

vaibhhavv (Author)

Thanks for the brief explanation @joaopgrassi, I got what I was looking for!
Your shared insights helped me with my use case. ❤️
