Add client spans around calls to upstream #17
Comments
Nginx has a set of variables used to measure time: $request_time, $session_time, $connection_time, $upstream_connect_time, $upstream_first_byte_time, $upstream_header_time, $upstream_response_time, $upstream_queue_time, $upstream_session_time. One can use these variables in spans to get visibility into timings. The $request_time variable is non-cacheable, which means nginx computes it every time you evaluate it, and this can be used in a very powerful way. If you evaluate $request_time at the right moment during request processing (and you can do so at different moments), you can get additional measurements. For example, consider the following configuration:
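(The original configuration snippet was not captured in this thread. A hypothetical reconstruction, based on the explanation in the following comment, might look roughly like this; the names $proxy_delay, $stub, and x-stub are taken from that explanation.)

```nginx
# $proxy_delay freezes the value of the non-cacheable $request_time:
# map results are cached after the first evaluation.
map $request_time $proxy_delay {
    default $request_time;
}

# $stub always maps to the empty string, so proxy_set_header below
# does not actually add an "x-stub" header to the proxied request;
# evaluating it merely forces $proxy_delay to be computed and cached.
map $proxy_delay $stub {
    default '';
}

server {
    listen 8080;

    location / {
        # Evaluated while the upstream request is being formed, so
        # $proxy_delay captures the time spent before proxying.
        proxy_set_header x-stub $stub;
        proxy_pass http://backend;

        # Later (e.g. at the log phase) the cached $proxy_delay could
        # be attached to the span, e.g. with nginx-otel:
        # otel_span_attr proxy.delay $proxy_delay;
    }
}
```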
In the nginx configuration provided, we combine the non-cacheable variable $request_time with the cacheable variables $proxy_delay and $stub from the map statements. $request_time is evaluated only once, during the calculation of the $stub variable for the proxy_set_header directive, at the moment a request to the upstream is formed. Since we need to cache that $request_time value to reuse it during the log phase (where the nginx-otel span configuration refers to it), we wrap it in a map that produces $proxy_delay. Map variables are cacheable, so the value of $proxy_delay is cached the first time it is evaluated (in our example, at the point where the request to be proxied to the upstream is prepared). To avoid sending an unnecessary "x-stub" header, we use another map wrapper that always maps to the empty value (''); the proxy_set_header directive does not add an empty header to a proxied request.

Using this pattern, one can get a rich set of time measurements. There are some internals worth knowing, though: nginx computes the time once per event loop pass, so the precision is not exact, but in 99.99% of cases it should be good enough. Will this solve the use cases you shared?
It sounds like a possible workaround, but creating client spans around your upstream calls is the standard way of doing these things, and like I mentioned, a ton of observability tools have built useful visualizations around that. Is there context/a reason creating extra spans at the appropriate moments (especially client spans) isn't a good fit for nginx?
This is a reasonable request, but nginx doesn't provide clear "hook" points around upstream calls in all cases. The only option I can think of so far is to introduce a new …
I agree that the ability to create client spans would be a very useful feature. Many experiences around OpenTelemetry traces are built on the assumption that one service calling another service synchronously is modeled by a parent client span and a child server span: HTTP client instrumentations create a client span, and HTTP server framework instrumentations create a server span. If you now have nginx between the instrumented HTTP client and the instrumented HTTP server framework, you end up with a span structure that breaks this assumption.
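To sketch the difference (an illustration based on the discussion, not actual tracer output):

```
Expected (with a CLIENT span in nginx):
  client  (calling service)
  └─ server  (nginx)
     └─ client  (nginx → upstream)
        └─ server  (upstream service)

Today (no CLIENT span):
  client  (calling service)
  └─ server  (nginx)
     └─ server  (upstream service)
```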
It would be great to have correct timings; however, given the status quo, having client and server spans with the same timings is the lesser evil (as opposed to the …)
I found this while searching for information about adding spans to subrequests. In my situation: …
@bmerigan, subrequest tracing might deserve its own issue. Also, can you share a config example so it's clearer what those subrequests are used for (e.g. authentication)?
@p-pautov yes it is indeed for authentication (oauth2) and authorization (opa). I'd like those 2 sub_requests to get recorded as well. |
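(For context, a typical setup along these lines, sketched hypothetically since the actual configuration was not shared, uses auth_request, which issues exactly the kind of internal subrequest that currently gets no span of its own. Upstream names and paths below are illustrative.)

```nginx
location / {
    # auth_request issues an internal subrequest before proxying;
    # this subrequest currently does not get its own span.
    auth_request /oauth2/auth;
    proxy_pass http://backend;
}

location = /oauth2/auth {
    internal;
    proxy_pass http://oauth2-proxy;
    # The auth endpoint only needs headers, not the request body.
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
}
```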
@dkrizic has a very good illustration of why multiple spans are needed for traditional service graph metrics. I think we can assess that the current behavior of the extension is not spec-conformant, and that this is not implementation-specific. The minimum this extension should do is adhere to the spec, so downstream consumers can rely on it. Deviating from the spec is currently not an option for Grafana Labs, and I would wonder whether the service graph connector in the OTel collector, the canonical service graph generator in OTel, would consider deviating — probably not. We [Grafana Labs] see a couple of use cases in our user base that are directly related to this ticket and to nginx:
Any help with these would be greatly appreciated. There are proprietary extensions to nginx that can do all of the above, and it would change the game for open-source-rooted observability with nginx if that were possible by default.

Why is this issue so important? Let me spend a few characters on why we think this should be a priority. Nginx is a strategic and central component in many infrastructure settings, because the technology is versatile, reliable, and efficient.
- However, with how the OTel implementation is set up for now, it is not observable, as it will break any trace and service graph; our recommendation can only be to not use the extension as long as it does not conform to the spec.
- "Fixing" this would make the extension the go-to recommendation for any OTel user and a viable option, and would probably increase adoption.
Is your feature request related to a problem? Please describe
Services usually have a server span created upon request entry and client spans (span.kind = client) around calls to upstreams.
This helps with a few things:
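The server-plus-client modeling described above can be sketched minimally as follows. This is a dependency-free illustration of the span structure being requested, not the nginx-otel or OpenTelemetry SDK API; all names are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Span:
    name: str
    kind: str  # "SERVER" or "CLIENT"
    children: List["Span"] = field(default_factory=list)

def handle_request() -> Span:
    # A SERVER span covers the whole request inside the proxy.
    server = Span("nginx: GET /", "SERVER")
    # A CLIENT child span wraps just the call to the upstream, so
    # service-graph tooling can pair it with the upstream's SERVER span.
    client = Span("proxy_pass upstream", "CLIENT")
    server.children.append(client)
    return server

root = handle_request()
print(root.kind, root.children[0].kind)  # SERVER CLIENT
```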
Describe the solution you'd like
Describe alternatives you've considered
Putting a proxy between nginx and its upstreams.