-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding support for end-to-end activity tracing #470
Comments
StreamJsonRpc already raises TraceSource and EventSource events including with Start/Stop semantics so that PerfView can show a request->response as an activity. Does what you're proposing go beyond that? |
How can that be used to correlate across invocations across processes that can be happening simultaneously? I mean precise correlation, not just heuristic based on timing... |
I don't think that's possible today. Are you proposing adding fields to the JSON-RPC messages to carry the correlation id? |
I know it's not possible, which is why I created this issue 😉. The System.Diagnostics.Activity.Id basically implements the scenarios in the W3C Trace Context recommendation so that activities (or operations) can properly be tracked across microservices invocations. They have an ActivityUserGuide that explains (in an HTTP-based scenario) how it works and how it's used, but I think the same concept (if not the implementation? or part of it?) applies to non HTTP-based distributed services, such as StreamJsonRpc. So, we could borrow the entire implementation as-is from the |
So, bottom-line: my proposal is to just incorporate the hierarchical request ID propagation and reuse the existing request ID property to do so. Then TraceSource/EventSource/DiagnosticSource(Activity) events could be automatically emitted to light up activity tracing for all three APIs. |
With the assumption that we want this to be transparent to users of proxies, is there a list of activity ids we need to set/unset so that we cover TraceSource, EventSource and other loggers. The ones I found were:
|
I think we should also cover the new-ish DiagnosticSource and its Activity, since it seems it's the one being most up-to-date with regards to standards (W3C) and community efforts (such as OpenTelemetry). This could be useful as a guideline for the implementation: aspnetcore diagnostics+activity. I think it would be more important to support DiagnosticSource/Activity than ILogger... |
This work is starting in my |
It's great to see this happening. .NET5 is a big jump forward in terms of standards-based tracing and it's great that a key component in the VS ecosystem will also be onboard with that 👍 |
I have That leaves one more concern: how to reconcile |
Since there's no currently existing tooling that allows you to make sense of the current ActivityId, I'd say drop it and see who complains ;). I doubt there's much usage (that can be checked via telemetry, btw). If usage is low/non-existent, there's no dilemma: go Activity all the way. There's going to be much more tooling around that from first and third parties, so I'd say it's worth sacrificing a bit of optimizations (which I'm sure will eventually come to Activity too, since it's a key part of ASP.NET core E2E tracing too). Just my 2cents :) |
Thanks for your thoughts, @kzu. I'm leaning toward |
Taking the decision out of the library seems like a good way to avoid having to mandate one over the other 👍 |
When StreamJsonRpc is used to power microservices-based architectures, the need for rich (end-to-end) tracing and activity propagation, to enable better troubleshooting and monitoring becomes quite important. This has been an explicit focus in .NET/ASP.NET Core 3.0 and with the brand-new W3C Trace Context recommendation there's an opportunity to introduce this in StreamJsonRpc in a standards-compliant way.
The goal of a built-in activity tracing support would be to enable rich tracing that can correlate calls across multiple remote invocations. This could then be used, for example, in System.Diagnostics.TraceSource for TraceTransfer/Start/Stop which model activities, or EventSource.SetCurrentThreadActivityId to do activity-based logging in ETW traces, or the newer Activity in .NET Core that has built-in support for the W3C standard.
Currently, there doesn't seem to be any specific support for propagating activity identifiers across RPC calls (just a request identifier). One option would be to leverage the fact that the request id does not need to have a specific format and therefore could accomodate a proper W3C trace context.
Attempting to add this support from outside of StreamJsonRpc (in a spike) turned out to be very tricky/hacky and not very satisfactory, which is why I think it's best if this was supported built-in.
This support would allow tools that can properly correlate and visualize traces to provide improved diagnostics like the Service Trace Viewer in this sample screenshot from a spike:
The text was updated successfully, but these errors were encountered: