Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add http metrics for content length #2567

Closed
jack-berg opened this issue May 20, 2022 · 10 comments · Fixed by #2588
Closed

Add http metrics for content length #2567

jack-berg opened this issue May 20, 2022 · 10 comments · Fixed by #2588
Assignees
Labels
area:semantic-conventions Related to semantic conventions [label deprecated] triaged-accepted [label deprecated] Issue triaged and accepted by OTel community, can proceed with creating a PR semconv:HTTP spec:metrics Related to the specification/metrics directory

Comments

@jack-berg
Copy link
Member

It's very useful to have metrics about http request and response content length, for both http clients and servers.

I propose the addition of the following metrics:

Name Instrument Type Unit Description
http.server.request_content_length Histogram By Measures the size of the HTTP request payload
http.server.response_content_length Histogram By Measures the size of the HTTP request payload
http.client.request_content_length Histogram By Measures the size of the HTTP request payload
http.client.response_content_length Histogram By Measures the size of the HTTP request payload

The dimensions that are useful for http.*.duration are useful in analyzing distributions of request / response size as well, so I would suggest extending the same set of attributes to these new metrics.

Arguably, this type of information fits into the io naming conventions, in which case we'd have metrics of the form http.*.io, with an additional attribute indicating whether the direction is request or response. However, I don't see any precedent for histogram io metrics, and in the case of http content lengths, the distribution is important.

If accepted, it would be natural to extend these same metrics to rpc.* as well.

@jack-berg jack-berg added area:semantic-conventions Related to semantic conventions spec:metrics Related to the specification/metrics directory labels May 20, 2022
@tigrannajaryan
Copy link
Member

It's very useful to have metrics about http request and response content length, for both http clients and servers.

Any reason to name http.server.request_content_* and not http.request_content_* if they are both for clients and servers?

@jack-berg
Copy link
Member Author

Any reason to name http.server.request_content_* and not http.request_content_* if they are both for clients and servers?

If you're running an http server that also calls downstream http servers, you need some way to differentiate between requests the server is handling, versus requests the server is producing as a client. The scope names of the client and server metrics will be different, but requiring users to be aware of the scope name is bad ergonomics IMO.

@arminru
Copy link
Member

arminru commented May 23, 2022

An alternative would be to encode client vs server in an attribute/dimension. For spans we have SpanKind as a dedicated property on the span. For metrics we could consider adding a generic attribute semantic convention (which might be set directly using a Convenience API) for specifying a "party" or "counterpart" kind.
The same idea was also brought up in the scope of RPC: #2419

@jack-berg
Copy link
Member Author

Yeah I think these metrics should follow whatever conventions exist for http server / client metrics, and rpc server / client metrics. Right now, the difference between server and client is encoded in the metric name (i.e. http.server.duration and http.client.duration vs. rpc.server.duration and rpc.client.duration). If that pattern changes generally, these metrics should follow suit.

@tigrannajaryan
Copy link
Member

If you're running an http server that also calls downstream http servers, you need some way to differentiate between requests the server is handling, versus requests the server is producing as a client.

This is an interesting problem. For spans we solve this by requiring a server span and a client span for each of these operations. For metrics we don't have the exact equivalent notion (span kind). Perhaps this should be recorded as a dimension (e.g. direction=in/out)?

@jack-berg
Copy link
Member Author

I think separate metric names are preferable to attributes because:

  1. It would be strange to want to analyze the aggregate of request / response bytes the service processed in its capacity as an http server with request / response bytes the service processed in its capacity as an http client.
  2. The instruments for server http metrics and client http metrics will almost certainly fall under different instrumentation scopes, since rarely does a library / framework work as both a client and server. So even if the metrics for server and client shared the same name, they're under different scopes and probably shouldn't be aggregated.
  3. After closer inspection, rpc already has metrics for this concept under names rpc.server.*.size and rpc.client.*.size. I've followed this convention in PR Add metrics for http request and response size #2588.

@tigrannajaryan
Copy link
Member

  1. It would be strange to want to analyze the aggregate of request / response bytes the service processed in its capacity as an http server with request / response bytes the service processed in its capacity as an http client.

You mean aggregation over "direction" dimension would not be meaningful, right? I think I agree (in some cases perhaps the average between "in" and "out" could be considered somewhat meaningful as a "throughput" indication? I don't think it is very useful though).

We have a bunch of other metrics that have a direction dimension (e.g. read/write for disk metrics). I wonder if this is a mistake to have a dimension there as well. I submitted an issue to look at it #2589

@tigrannajaryan
Copy link
Member

The instruments for server http metrics and client http metrics will almost certainly fall under different instrumentation scopes, since rarely does a library / framework work as both a client and server. So even if the metrics for server and client shared the same name, they're under different scopes and probably shouldn't be aggregated.

This is another interesting point. If we wanted to allow this then different instrumentations would need to create instruments with the same name but different Scope (different library name - server vs client)), and each library would create data points only for the particular value of the attribute. I am not sure if I understand what our API says about what happens in this case. It seems like we can't have the same Metric with different Scopes, so there is really no way to emit such data using our API.

@jack-berg
Copy link
Member Author

You mean aggregation over "direction" dimension would not be meaningful, right? I think I agree (in some cases perhaps the average between "in" and "out" could be considered somewhat meaningful as a "throughput" indication? I don't think it is very useful though).

In this case there are 4 directions: An http server has request and response sizes to track, and and http client has request and response sizes to track. I think its unlikely (yet still possible) that you'd want to analyze http server request and response size together. Same applies to http client request and response size. However, I can't think of a scenario in which you'd want to analyze server AND client request and response size together.

If we wanted to allow this then different instrumentations would need to create instruments with the same name but different Scope (different library name - server vs client)), and each library would create data points only for the particular value of the attribute.

This is allowed. Instruments with the same name under different scopes are distinct and won't produce any conflict or error scenario. Relevant language from the api. Relevant language from the datamodel.

For the purpose of this conversation, this means that there is no benefit to giving server and client instruments the same name, since they'll be collected using different scopes, and will therefore be distinct instruments regardless of whether they share the same name.

@tigrannajaryan
Copy link
Member

For the purpose of this conversation, this means that there is no benefit to giving server and client instruments the same name, since they'll be collected using different scopes, and will therefore be distinct instruments regardless of whether they share the same name.

Thanks for confirming, that's what I wasn't sure about our metric API.

@tigrannajaryan tigrannajaryan added the [label deprecated] triaged-accepted [label deprecated] Issue triaged and accepted by OTel community, can proceed with creating a PR label Jun 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions [label deprecated] triaged-accepted [label deprecated] Issue triaged and accepted by OTel community, can proceed with creating a PR semconv:HTTP spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants