Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Establishing HTTP connection spec #1226

Open
mateuszrzeszutek opened this issue Jan 30, 2023 · 5 comments
Open

Establishing HTTP connection spec #1226

mateuszrzeszutek opened this issue Jan 30, 2023 · 5 comments
Labels

Comments

@mateuszrzeszutek
Copy link
Member

What are you trying to achieve?

I started working on implementing the http.resend_count spec in the Java instrumentations, and I ran into an interesting issue with the current spec.

Currently (that is, without the resends), in pretty much all Java HTTP client instrumentations, we are creating HTTP client spans for the outermost operation, usually the top-level interface exposed by a particular HTTP client library. This means that there is only one HTTP span created for the entire operation; if any resends occur during the call they'll go unnoticed, and the child SERVER spans will all receive the same parent span id (which is why we need to modify/rewrite all HTTP client instrumentations we have, they're all too "shallow"). The HTTP span also includes the "establishing the connection" phase (in most cases; I'll mention the exceptions to that later); in case a connection error occurs (closed port, server not responding at all, etc) the HTTP span will end with an error, and the telemetry about the thrown exception will be recorded.

This changes with the resend counter spec. As I understand it, we must now create a span for every attempt to actually send an HTTP request over the wire (and bump the counter if there's a retry). Most HTTP clients that I've read through separate the "establishing the connection" phase from the "sending the request" phase pretty cleanly; which means that whenever a DNS resolution/connection/TLS handshake error occurs, the actual HTTP instrumentation won't catch that - because it runs later than that.

I find this to be somewhat problematic, since usually you'd want to know if this kind of situations occur with your application.

There are several low-level HTTP clients that completely separate these two phases - where any details about the HTTP request are simply not accessible at the connection phase (mostly Netty and Netty-based HTTP clients). For these, we already had to introduce something that'll capture the connection level details, which we've called a CONNECT span. Note that this already exists in some of our instrumentations, and predates the resend spec.

Now, to solve the problem with the missing connection-level telemetry we could do one of the following things:

  • In the top-level HTTP client interface: if there was an exception thrown, and no attempts to send an HTTP request have been made, create a "fake" HTTP span just for the connection error. (Note that this is impossible to implement in the Netty-based clients that I've briefly mentioned above)
  • Introduce the CONNECT span to the HTTP client spec: create a CONNECT span whenever a connection is established (actual connection; taking a connection from a pool should not emit a span), regardless of whether it was successful or not. (We've had customers ask us to implement spans that capture low-level connection details; this kind of information would be generally useful, at least for some people).

Additional context.

@mateuszrzeszutek
Copy link
Member Author

mateuszrzeszutek commented Jan 30, 2023

CC @trask this is a part of the HTTP semconv effort

@trask trask assigned trask and unassigned SergeyKanzhelev Jan 30, 2023
@lmolkova
Copy link
Contributor

lmolkova commented Feb 1, 2023

@mateuszrzeszutek great find!

I'd like to entertain the idea of having top-level logical HTTP span to capture connect time, DNS time, circuit-breaking, the duration for tries with backoff, and overall operation success.

We might need to give instrumentations some freedom to instrument only client call, HTTP attempts, or both:

  • I heard from @tedsuo that AWS includes full HTTP request with tracing headers when creating request signature and it can't change between tries. (on Azure, we only include x-ms-* headers or a closed set of headers into the signature)
  • Client library instrumentations instrument public API surface anyway and don't necessarily need top-level span to account for other things

@mateuszrzeszutek
Copy link
Member Author

We might need to give instrumentations some freedom to instrument only client call, HTTP attempts, or both:

  • I heard from @tedsuo that AWS includes full HTTP request with tracing headers when creating request signature and it can't change between tries. (on Azure, we only include x-ms-* headers or a closed set of headers into the signature)
  • Client library instrumentations instrument public API surface anyway and don't necessarily need top-level span to account for other things

💯

Had the same thoughts exactly -- I'm pretty sure there are HTTP clients that may do retries but it's impossible to instrument each send attempt because of how code is structured (I did not have a good example of that before; the AWS SDK is a great one), and the spec should also somehow fit these cases too.

carlosalberto referenced this issue in open-telemetry/opentelemetry-specification Apr 14, 2023
Related to
https://github.com/open-telemetry/opentelemetry-specification/issues/3155
and
#3234

## Changes

This PR contains the less controversial parts of
#3234;
it describes how the `http.resend_count` attribute should be used, and
proposes two ways of instrumenting HTTP clients.
@trask
Copy link
Member

trask commented Apr 14, 2023

@mateuszrzeszutek do you think we still need this for Java http instrumentation stability? (or now with open-telemetry/opentelemetry-specification#3290, is the Java http instrumentation default state addressed, and we just need this for opt-in behaviors)

jsuereth referenced this issue in jsuereth/otel-semconv-test Apr 19, 2023
Related to
https://github.com/open-telemetry/opentelemetry-specification/issues/3155
and
open-telemetry/opentelemetry-specification#3234

## Changes

This PR contains the less controversial parts of
open-telemetry/opentelemetry-specification#3234;
it describes how the `http.resend_count` attribute should be used, and
proposes two ways of instrumenting HTTP clients.
@mateuszrzeszutek
Copy link
Member Author

@mateuszrzeszutek do you think we still need this for Java http instrumentation stability? (or now with open-telemetry/opentelemetry-specification#3290, is the Java http instrumentation default state addressed, and we just need this for opt-in behaviors)

OOF, I totally missed this issue 🙈
We've talked about this offline, I don't think it's required; at this point this is an addition that might be introduced later.

jsuereth referenced this issue May 11, 2023
Related to
https://github.com/open-telemetry/opentelemetry-specification/issues/3155
and
open-telemetry/opentelemetry-specification#3234

## Changes

This PR contains the less controversial parts of
open-telemetry/opentelemetry-specification#3234;
it describes how the `http.resend_count` attribute should be used, and
proposes two ways of instrumenting HTTP clients.
@lmolkova lmolkova transferred this issue from open-telemetry/opentelemetry-specification Jul 9, 2024
@lmolkova lmolkova added enhancement New feature or request area:new area:http labels Jul 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Can be addressed after stability
5 participants