-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(llmobs): llmobs context provider #10138
Conversation
|
BenchmarksBenchmark execution time: 2024-08-09 20:55:39 Comparing candidate commit 79303da in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 214 metrics, 2 unstable metrics. |
3af4255
to
86dc15e
Compare
@@ -3,8 +3,7 @@ | |||
METADATA = "_ml_obs.meta.metadata" | |||
METRICS = "_ml_obs.metrics" | |||
ML_APP = "_ml_obs.meta.ml_app" | |||
PROPAGATED_PARENT_ID_KEY = "_dd.p.llmobs_parent_id" | |||
PARENT_ID_KEY = "_ml_obs.llmobs_parent_id" | |||
PARENT_ID_KEY = "_dd.p.llmobs_parent_id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure if we need to rename this to a different key name. This format _dd.p.*
was required because we were injecting the key-value into the context._meta
, but if we are directly injecting/extracting from the request headers this might not be needed.
Note 2: are there any requirements for key-naming for injecting into request headers?
Closing this due to annoying merge conflicts from upstream changes. Replacing this with #10767. |
tl;dr: This PR adds an additional context provider to the
LLMObs
service to keep track of the activeLLMObs
-type span.This PR also adds a convenience
LLMObs.current_span()
method to get the current LLMObs type span, and uses this in our existing public methods that involve checking for the current span if not provided by the user as an argument (ex:LLMObs.export_span(span=None, ...)
,LLMObs.annotate(span=None, ...)
.Workflow
When an LLMObs span gets started, (i.e.
LLMObs._start_span(...)
,BaseLLMIntegration.trace(submit_to_llmobs=True, ...)
,BedrockIntegration.set_llmobs_tags(...)
), we look at the current active LLMObs span, use that to set the parent_id field of the started span, and then activate the started span as the new current active LLMObs span.The LLMObsContextProvider updates the current active LLMObs span by iterating through the span's ancestor tree and stopping at the closest LLMObs-type span which is not finished.
For distributed tracing scenarios, we inject the request headers with an LLMObs parent ID. We then create a Context object using the extracted LLMObs parent ID header and activate that in the new process.
Context
Spans submitted to LLM Observability are technically APM spans (but with enhanced metadata/metrics and in a different span structure) marked with an
llm
span_type. This means APM span/parent/trace IDs are all reused in LLMObs spans.Problem: if APM-specific spans (via integration/manual) are intermixed with LLMObs spans in a trace, this causes problems with traces submitted to LLM Observability, where spans may have
parent_id
values referencing spans that do not exist in LLM Observability storage, because those APM-specific spans are not ingested to LLM Observability. (example trace structure below)Initial Solution: Use Span Context
In an effort to reuse as much tracer-internals as possible, internal LLMObs trace processing involved traversing a span's parent tree to find the nearest LLMObs type ancestor span, and then using that span's span_id as the parent_id. For distributed cases, the LLMObs parent ID was injected into request headers, and then extracted into the underlying span
Context
object'scontext._meta
dictionary field.This works fine for the most part, except for multithreading applications as the
Context._meta
field is a shallow copy and thus shared between threads, making it susceptible to race conditions where the propagated LLMObs parent ID would get overwritten by other threads starting.Current Approach: Manage our own LLMObs contexts
This approach is similar to the first, but instead of attempting to use the
Context._meta
field in a use case it was not designed for, we just create our own context provider for tracking active LLMObs span/contexts. Since we're relying onContextVar
which is thread-safe, this does not have the same problems as previously regarding multithreaded applications.Since we are relying on the new
LLMObsContextProvider
to manage and update the active LLMObs span (which reuses the same logic of traversing the parent tree for the nearest LLMObs span), we can get rid of all of the special casing required for dealing withContext
-propagated tags.The only changes we need to make to shared library code is adding distributed injection code into
HTTPPropagator.inject()
, activating the propagated request headers intotrace_utils.activate_distributed_headers()
, and context propagation/activation in the threading patch code.Checklist
Reviewer Checklist