Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make http.status_code metric attribute an int #2943

Merged
merged 3 commits into from
Nov 17, 2022

Conversation

alanwest
Copy link
Member

Fixes #2855

The trace semantic conventions have a nice Type column. The metric conventions does not. It'd be nice if the structure of these tables matched. I'd be happy follow up with with this, but for now I wanted to make the smallest possible change to make sure we agree http.status_code should be an integer attribute.

@yurishkuro
Copy link
Member

because metrics are sensitive to cardinality, I've seen instrumentations using strings like 4xx, 5xx for status code.

@svrnm
Copy link
Member

svrnm commented Nov 14, 2022

because metrics are sensitive to cardinality, I've seen instrumentations using strings like 4xx, 5xx for status code.

I am wondering if there should be 2 attributes to cover that, i.e. http.status_code and maybe http.status_code_group, or is there a standardised concept for such a kind of metric aggregation?

@alanwest
Copy link
Member Author

alanwest commented Nov 14, 2022

because metrics are sensitive to cardinality, I've seen instrumentations using strings like 4xx, 5xx for status code.

Interesting. I think it would be ideal if the http.status_code had the same semantics between traces and metrics. Maybe a separate attribute http.status_code_class could be proposed for capturing 2xx, 4xx, etc?

edit: Ignore me 😄 I had not refreshed and seen same comment from @svrnm.

@svrnm
Copy link
Member

svrnm commented Nov 14, 2022

Interesting. I think it would be ideal if the http.status_code had the same semantics between traces and metrics. Maybe a separate attribute http.status_code_class could be proposed for capturing 2xx, 4xx, etc?

status_code_class is better, since it's the official name (see https://datatracker.ietf.org/doc/html/rfc9110#section-15), I am wondering if it should be "1xx", "2xx", etc or "Informational", "Successful", etc?

@alanwest
Copy link
Member Author

status_code_class is better, since it's the official name (see https://datatracker.ietf.org/doc/html/rfc9110#section-15), I am wondering if it should be "1xx", "2xx", etc or "Informational", "Successful", etc?

Good question. Opened open-telemetry/semantic-conventions#1056 for follow up.

@joaopgrassi
Copy link
Member

@jsuereth I saw this only today, and couldn't join the semconv wg meeting, but shouldn't this be seen as a breaking change?

  • Existing queries using the string "200" for ex, or any string-based operation like startsWith, contains would break if this is changed to int. For ex, this prometheus query
# To select all HTTP status codes except 4xx ones, you could run:
http_requests_total{status!~"4.."}
  • Existing instrumentation that already implemented this metric today export strings. How should this change be handled by them?

I'm not sure if it was discussed the change of attribute types in the last meeting being a breaking change.

Looking at status_code in particular, I wonder: What operations can one do with it being an int that you can't do with a string? I can still do things like: Show me all requests that failed (meaning startsWith "4" or startsWith "5").

Another point to consider is, a status_code is not a "real" numeric value, but rather a notion of a category of values - a status code could be expressed as bad_request for ex. It just happens to be a number in my view.

It seems to me that the original issue was brought up due to the inconsistency between traces and metrics semconvs, but I'm failing to see a real use-case why the dimension should be an int. What value would this change bring? Maybe we should focus more on that instead of just fixing the inconsistency. I my view it's OK they are different types, since metric systems might not handle typed dimension values anyway and have to stringify it.

@reyang
Copy link
Member

reyang commented Nov 15, 2022

@jsuereth I saw this only today, and couldn't join the semconv wg meeting, but shouldn't this be seen as a breaking change?

  • Existing queries using the string "200" for ex, or any string-based operation like startsWith, contains would break if this is changed to int. For ex, this prometheus query

No, for Prometheus Exporter, the integer 200 label should be converted to "200", which means from consumption side it'll be the same.

@joaopgrassi
Copy link
Member

joaopgrassi commented Nov 16, 2022

No, for Prometheus Exporter, the integer 200 label should be converted to "200", which means from consumption side it'll be the same.

@reyang right, Prometheus was just one example, but did we (or should we) consider metric back-ends that don't support non-string attribute values? If users of such back-ends are already using instrumentation libraries that sends this as strings and rely on the attribute being a string for their dashboards/alerts, by moving forward with this, we are essentially breaking them or am I missing something?

@reyang
Copy link
Member

reyang commented Nov 16, 2022

@reyang right, Prometheus was just one example, but did we (or should we) consider metric back-ends that don't support non-string attribute values? If users of such back-ends are already using instrumentation libraries that sends this as strings and rely on the attribute being a string for their dashboards/alerts, by moving forward with this, we are essentially breaking them or am I missing something?

Changing type from string to int is a breaking change IMHO (@jsuereth FYI). We shouldn't encourage such change once the semantic convention is Stable. For this particular case, I would highly encourage/support aligning the types at Experimental stage.

@reyang reyang closed this Nov 17, 2022
@reyang reyang reopened this Nov 17, 2022
@reyang
Copy link
Member

reyang commented Nov 17, 2022

🤷‍♂️

image

@reyang reyang closed this Nov 17, 2022
@reyang reyang reopened this Nov 17, 2022
@reyang reyang closed this Nov 17, 2022
@reyang reyang reopened this Nov 17, 2022
@reyang reyang merged commit b6c6176 into open-telemetry:main Nov 17, 2022
@jsuereth
Copy link
Contributor

Just to chime in, agree with @reyang here. For us to get HTTP semantic conventions stable we need to go through the differences between Trace/Metrics and align them.

This is a breaking change, but hopefully one of the few we need to make prior to HTTP stability.

@alanwest alanwest deleted the patch-1 branch February 24, 2023 04:27
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantic convention type mismatch for http.status_code