Is your feature request related to a problem? Please describe.
The router allows configuring value extraction to add labels, and therefore cardinality, to the exported metrics (https://www.apollographql.com/docs/graphos/reference/router/telemetry/metrics-exporters/otlp).
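To illustrate how quickly extracted labels add up, here is a small back-of-the-envelope calculation. The label names and value counts are hypothetical and not taken from any real router configuration; the only figure from this issue is the limit of 2000.

```python
from math import prod

# Hypothetical number of distinct values per extracted label (assumed, for illustration only).
label_values = {
    "operation_name": 50,
    "http_status": 5,
    "client_name": 10,
}

LIMIT = 2000  # hard cardinality limit per metric stream mentioned in this issue

# Worst case: every combination of label values occurs at least once.
worst_case_series = prod(label_values.values())

print(f"worst-case series: {worst_case_series}, exceeds limit: {worst_case_series > LIMIT}")
```

Three moderately sized labels are already enough to pass the limit in the worst case.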
The opentelemetry-sdk in use received a hard limit of 2000 on label cardinality with
[release 0.20.0](https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/CHANGELOG.md#v0200)
and with PR open-telemetry/opentelemetry-rust#1066.
Metrics / streams with a cardinality exceeding 2000 will only be emitted via the overflow-tagged metric:
apollo_router_http_requests_total{job="router", otel_metric_overflow="true", otel_scope_name="apollo/router"} 1
which causes all of the cardinality to be lost / dropped. A warning is also logged (see #5287):
OpenTelemetry metric error occurred: Metrics error: Warning: Maximum data points for metric stream exceeded/ Entry added to overflow
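For readers unfamiliar with the mechanism, the following is a minimal sketch of a cardinality cap with an overflow series, loosely modelled on the behaviour the OTel spec describes. It is not the opentelemetry-rust implementation; the class `CappedCounter` and its methods are hypothetical, and it assumes the overflow series itself counts toward the limit.

```python
from collections import Counter

# Attribute set used for the overflow series (the spec names it otel.metric.overflow).
OVERFLOW_KEY = (("otel.metric.overflow", "true"),)


class CappedCounter:
    """Hypothetical counter that folds new attribute sets into one overflow series."""

    def __init__(self, limit=2000):
        self.limit = limit
        self.points = Counter()

    def _regular_series(self):
        # Number of tracked series, not counting the overflow series itself.
        return len(self.points) - (1 if OVERFLOW_KEY in self.points else 0)

    def add(self, attributes, value=1):
        key = tuple(sorted(attributes.items()))
        # Assumption: one slot is reserved for the overflow series, so only
        # limit - 1 distinct attribute sets keep their own labels.
        if key not in self.points and self._regular_series() >= self.limit - 1:
            key = OVERFLOW_KEY
        self.points[key] += value


# Small limit to make the effect visible.
counter = CappedCounter(limit=3)
for path in ["/a", "/b", "/c", "/d", "/a"]:
    counter.add({"path": path})

for key, value in counter.points.items():
    print(dict(key), value)
```

In this model, attribute sets seen before the cap is reached keep their labels, while every later new combination is merged into the single overflow series and its labels are lost.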
While this protection feature is very much appreciated and in line with the
[OTel spec](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.39.0/specification/metrics/sdk.md#cardinality-limits),
it's not configurable yet, so 2000 is a hard limit. Configurability of this value is planned upstream, though.
Describe the solution you'd like
Please track the upstream feature (open-telemetry/opentelemetry-rust#1951) and expose a configuration variable that allows increasing / adjusting the cardinality limit. Maybe this could be part of the umbrella issue #3226?
Describe alternatives you've considered
Moving away from adding ever higher cardinality to the metrics and switching to access logs with all the fields certainly makes sense at some point, provided one has capable log shipping and aggregation in place.
Additional context