
Error "Permanent error: server returned HTTP status 400 Bad Request" #396

Closed
nabulsi opened this issue Mar 12, 2021 · 4 comments

@nabulsi

nabulsi commented Mar 12, 2021

Hi,
I am getting the following error:
2021-03-12T01:08:16.432Z ERROR exporterhelper/queued_retry.go:239 Exporting failed. The error is not retryable. Dropping data. {"component_kind": "exporter", "component_type": "awsprometheusremotewrite", "component_name": "awsprometheusremotewrite", "error": "Permanent error: server returned HTTP status 400 Bad Request: user=123456789012_ws-abcdefgh-aaaa-bbbb-cccc-dddddddddddd: series={__name__=\"container_cpu_usage_kernelmode_total\", cluster=\"observability-demo-cluster\"}, timestamp=2021-03-12T01:08:16.386Z: duplicate sample for timestamp ", "dropped_items": 26}

My setup includes 2 running ECS tasks, i.e. each task has 1 demo application container and 1 AWS OpenTelemetry Collector container. I've attached a file with more details (it's a CSV file; please change the extension from txt to csv, as GitHub doesn't support attaching csv files).

Thanks!

log-events-viewer-result_new.txt

@pingleig
Member

The error comes from Cortex (which backs the AWS Prometheus service): opstrace/opstrace#214, cortexproject/cortex#3411

I guess the problem is that there is no extra label attached to the time series to distinguish metrics coming from the two different ECS tasks (e.g. a task id). Labels like job and instance are dropped by the prometheus receiver (open-telemetry/opentelemetry-collector#575). Can you share your prometheus receiver config? Based on your description you are running the collector as a sidecar (i.e. the collector and the application are in the same ECS task definition).
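For illustration, a rough sketch of one way to add a per-task label with the resource processor (untested; aws.ecs.task.id is an assumption, so check which resource attributes your receiver version actually emits, and note that the exporter still has to turn resource attributes into remote write labels):

processors:
  resource:
    attributes:
      # Copy the ECS task id resource attribute into a TaskId attribute
      # so each task's series carries a distinguishing label.
      - key: TaskId
        from_attribute: aws.ecs.task.id
        action: insert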

@nabulsi
Author

nabulsi commented Mar 12, 2021

Receiver config:

receivers:
  awsecscontainermetrics:

Yes, we are running the collector as a sidecar (same task definition for the application and the collector).

Thanks

@mxiamxia
Member

Have you enabled the following processors in your pipeline: processors: [filter, metricstransform, resource]? Are you using the sample configuration file we provide (linked below) for collecting ECS infra metrics?

https://github.com/aws-observability/aws-otel-collector/blob/main/config/ecs/container-insights/otel-task-metrics-config.yaml#L213
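For reference, a rough sketch of how those processors are wired into a metrics pipeline (using the receiver and exporter names from this thread; the actual filter, metricstransform and resource rules are defined in the linked sample config and may differ):

service:
  pipelines:
    metrics:
      # Each processor listed here must also be defined under the
      # top-level processors: section of the config.
      receivers: [awsecscontainermetrics]
      processors: [filter, metricstransform, resource]
      exporters: [awsprometheusremotewrite]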

@mxiamxia
Member

OTel upstream has a related issue: open-telemetry/prometheus-interoperability-spec#10
