[MLOB-1561] LLM Observability SDK API #4773
…trace-js into sabrenner/llmobs-sdk-sdk
Overall package size

Self size: 7.7 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.1.1 | 18.67 MB | 18.68 MB |
| @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB |
| @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.59 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| lru-cache | 7.14.0 | 74.95 kB | 74.95 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Benchmarks

Benchmark execution time: 2024-10-17 23:55:17

Comparing candidate commit 7c7a328 in PR branch

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.
Hi, we're interested in trying out monitoring for the LLMs used in our Node application. Do you have an idea of when the JS SDK is planned to ship?
…trace-js into sabrenner/llmobs-sdk-sdk
Hi @sergey-mamulchenko! We're looking to release this SDK by the end of the month. Feel free to follow along with #4742, as that will be the PR that lands containing the SDK 😄
No blocking comments, ty!
Other than a small comment about exposing `active()` and some clarification questions, no major blocking comments from the MLObs team perspective! Great work @sabrenner 🎉
if (result && typeof result.then === 'function') {
  return result.then(value => {
    if (value && kind !== 'retrieval' && !LLMObsTagger.tagMap.get(span)?.[OUTPUT_VALUE]) {
will this still attempt to auto annotate the output for an llm-kind span? Asking because LLM span outputs are in message format so we should just avoid auto-annotating here entirely.
ahhh yeah, i actually missed that our `_model_decorator`s do not auto annotate at all. i'll add a check here for `llm`, and then input annotation check for `llm` and `embedding`, and just write a couple small regression tests!
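For readers following the thread, the check being discussed might look roughly like the sketch below. It is only an illustration under assumptions: `tagMap`, the `OUTPUT_VALUE` key string, the `SKIP_OUTPUT_KINDS` set, and both helper names are hypothetical stand-ins, not the SDK's actual internals.

```javascript
// Hypothetical stand-ins for the internals referenced in the diff above.
const OUTPUT_VALUE = 'output.value' // placeholder key, not the real tag name
const tagMap = new WeakMap() // span -> annotations stored outside the span itself

// Kinds whose output is never auto-annotated (assumption drawn from this thread:
// retrieval was already excluded, and llm outputs use the message format).
const SKIP_OUTPUT_KINDS = new Set(['retrieval', 'llm'])

function shouldAutoAnnotateOutput (span, kind, value) {
  if (!value) return false
  if (SKIP_OUTPUT_KINDS.has(kind)) return false
  // Never overwrite an output the user already annotated manually.
  return !tagMap.get(span)?.[OUTPUT_VALUE]
}

// Wraps the result of a traced function; only promise results are handled here.
function annotateOnResolve (span, kind, result) {
  if (result && typeof result.then === 'function') {
    return result.then(value => {
      if (shouldAutoAnnotateOutput(span, kind, value)) {
        tagMap.set(span, { ...tagMap.get(span), [OUTPUT_VALUE]: value })
      }
      return value
    })
  }
  return result
}
```

Keeping the kind check in one small predicate makes the regression tests mentioned above straightforward: each excluded kind becomes a single assertion.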
}

// extracts the argument names from a function string
function parseArgumentNames (str) {
This is parsing the argument and function signature from the entire serialized function? 💀
yeah 💀 but i memoized it with a weak reference to the function, so if the same function is invoked a million times, it'll use the parsed argument names from the first time and map over the arguments accordingly. tbh this might need some iteration, because i tried to assume user cases in writing tests but i'm sure i didn't think of everything. `try/catch`ed it so we will never crash, but will probably iterate on it accordingly from user reports.
would be nice if JS had some built-in parsers/helpers like Python, but these are all pretty basic string operations so it shouldn't be too time consuming, it's all just linear
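As a rough illustration of the approach described in this thread (parse the serialized function once, memoize per function with a `WeakMap`, and never throw), here is a minimal sketch. The parser is deliberately naive and is not the PR's implementation; `argNamesCache` is a hypothetical name, and the bodies of `parseArgumentNames` and `getFunctionArguments` are assumptions. Destructured parameters and default values that contain commas are not handled.

```javascript
// Cache keyed by the function object, so entries can be garbage collected with it.
const argNamesCache = new WeakMap()

// Naive, linear scan of the serialized function for its parameter names.
function parseArgumentNames (str) {
  const arrow = str.indexOf('=>')
  const open = str.indexOf('(')
  // Single-parameter arrow function written without parentheses: `x => ...`
  if (arrow !== -1 && (open === -1 || arrow < open)) {
    return [str.slice(0, arrow).replace(/^async\s+/, '').trim()]
  }
  if (open === -1) throw new Error('cannot locate parameter list')
  // Walk forward to the parenthesis that closes the parameter list.
  let depth = 0
  let close = -1
  for (let i = open; i < str.length; i++) {
    if (str[i] === '(') depth++
    else if (str[i] === ')' && --depth === 0) { close = i; break }
  }
  if (close === -1) throw new Error('unbalanced parameter list')
  return str.slice(open + 1, close)
    .split(',')
    .map(param => param.split('=')[0].replace(/^\s*\.\.\./, '').trim()) // drop defaults and rest dots
    .filter(Boolean)
}

// Maps positional arguments to their parameter names, falling back to the raw list.
function getFunctionArguments (fn, args) {
  try {
    let names = argNamesCache.get(fn)
    if (!names) {
      names = parseArgumentNames(fn.toString())
      argNamesCache.set(fn, names)
    }
    const mapped = {}
    names.forEach((name, i) => { mapped[name] = args[i] })
    return mapped
  } catch {
    return args // soft fail: the caller still gets something to annotate
  }
}
```

With this shape, the string parsing runs once per function; later calls only pay for a `WeakMap` lookup and a loop over the arguments.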
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
can't speak too much to the llm-specific stuff since i'm still ramping up to everything. Main concern is the soft fails instead of hard. Feel free to merge and address in the bigger PR.
…e-js into sabrenner/llmobs-sdk-sdk
Will merge this PR into the feature branch. It looks like a serverless benchmark is failing, but I will resolve that in the final PR if needed.
Merged 7e8e0f7 into sabrenner/llmobs-sdk-release
* [MLOB-1540] add llmobs configuration to global tracer config (#4696)
  add llmobs config
* [MLOB-1555] LLM Observability writers (#4699)
  LLM Observability writers
* [MLOB-1556] LLM Observability tagger (#4718)
  LLM Observability tagger
* [MLOB-1560] LLMObs Span Processor (#4738)
  * span processor
  * tests
  * remove agent exporter log and do not stringify tags
  * remove llmobs from exporter tests
  * add in default unserializable value
  * review comments
  * warning log for metric
  * todo-ify
  * remove some duplicate logic
  * decouple llmobs span processing with a channel
  * use a static weakmap to store llmobs tags/annotations instead of span tags
  * do not register span in map if it does not have an llmobs span kind
  * span is passed on an object from sp publisher
  * re-clarify TODOs
  * only send span in publish
  * log multiple warnings and return conditional undefined
  * update error logic
* [MLOB-1561] LLM Observability SDK API (#4773)
  * wip
  * type definitions
  * active + try/catch eval metric writer append
  * test ts
  * use tagger map and processor as a channel subscriber
  * change decorate and add in dev changes
  * try some api changes
  * add decorate to noop
  * fix breaking proxy tests
  * experimental decorators for TS docs
  * api changes, fix unit + e2e tests
  * try removing global log mocks
  * add some util tests
  * remove logger mocks
  * add module tests + do not enable when not specified
  * fix eval metric integration test
  * wip
  * memoize getFunctionArguments
  * move any subscriber and global writer to the module enablement level instead of sdk
  * should fix TS tests
  * add ts integration test and fix decorator
  * devex for ts versions
  * add noop typescript test
  * remove startSpan
  * remove unneeded change
  * dedup decorator code
  * Update index.d.ts
    Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
  * map metrics names
  * change validKind to validateKind and throw
  * tagger for metrics follow-up
  * review feedback
  * add some tests for not auto-annotating in certain cases
  ---------
  Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
* hard fail instead of soft fail, except for `wrap` span name
* add ml-observability codeowners
* resolve ts test
* update auto-annotation check
* tagger can soft fail
* using custom ASL instance and scope activation
* fix test comments and remove
* address review comments
* remove llmobs.apiKey config, only rely on global
* fix evaulations test
* make llmobs storage accessible
---------
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
What does this PR do?
Adds an LLM Observability SDK API onto the tracer.
Important Call-Outs
ML Observability reviewers:
- `index.d.ts` contains the TypeScript type definitions which will constitute our API. I think this is most relevant to what we have been solidifying across our SDKs.
- `sdk.js` is the main file for the SDK, which includes handling the different values passed into the API functions and verifying that the required data is present.

APM Reviewers:
- `packages/dd-trace/src/llmobs/index.js` houses the enablement of the LLMObs module. However, since the SDK is always initialized even when LLMObs isn't enabled (it still acts in a no-op state although it is not the no-op SDK), any writers and channel subscribers are set up in the module, which is only enabled when a) `enable(config)` is called from the tracer proxy because LLMObs was enabled during `init`, or b) `llmobs.enable(llmobsConfig)` is called after `init` is called. This way, no writers/periodic flushing and span processing/injection subscribers are registered if LLMObs isn't explicitly enabled.

Motivation
Part of the LLM Observability SDK release. This is the last PR for the main SDK. A follow-up PR will be for an OpenAI LLM Obs plugin.
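The enablement split described in the APM call-out above (an SDK object that always exists, with subscribers and periodic flushing registered only when the module is explicitly enabled) can be sketched roughly as follows. This is a simplified illustration, not the code in `packages/dd-trace/src/llmobs/index.js`: the channel name, the class names, the `flushInterval` option, and the in-memory buffer are all hypothetical. It uses Node's built-in `diagnostics_channel` module (the name-based `subscribe`/`unsubscribe` functions need Node 16.17+ or 18.7+).

```javascript
const dc = require('node:diagnostics_channel')

// Hypothetical channel name; the real one is not given in this PR description.
const CHANNEL_NAME = 'example:llmobs:span:finish'
const spanFinishCh = dc.channel(CHANNEL_NAME)

// Owns everything with a runtime cost: the subscriber and the flush timer.
class ExampleLLMObsModule {
  constructor () {
    this.enabled = false
    this.buffer = []
    this._onSpan = ({ span }) => this.buffer.push(span)
    this._timer = null
  }

  enable (config = {}) {
    if (this.enabled) return
    this.enabled = true
    dc.subscribe(CHANNEL_NAME, this._onSpan)
    this._timer = setInterval(() => this.flush(), config.flushInterval ?? 1000)
    this._timer.unref() // do not keep the process alive just for flushing
  }

  disable () {
    if (!this.enabled) return
    this.enabled = false
    dc.unsubscribe(CHANNEL_NAME, this._onSpan)
    clearInterval(this._timer)
    this._timer = null
  }

  flush () {
    this.buffer.length = 0 // a real writer would send the buffered spans here
  }
}

// Always constructed; it only publishes, so it costs nothing while nobody subscribes.
class ExampleLLMObsSDK {
  constructor (mod) { this._module = mod }
  get enabled () { return this._module.enabled }
  enable (config) { this._module.enable(config) }
  finishSpan (span) {
    if (spanFinishCh.hasSubscribers) spanFinishCh.publish({ span })
  }
}
```

Because the SDK only publishes to a channel, a tracer initialized without LLMObs never registers a processor or a timer, which is the property the call-out asks APM reviewers to verify.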