[MLOB-1561] LLM Observability SDK API #4773

sabrenner · 2024-10-11T15:53:42Z

What does this PR do?

Adds an LLM Observability SDK API onto the tracer.

Important Call-Outs

ML Observability reviewers:

index.d.ts contains the TypeScript type definitions that will constitute our API. I think this is the most relevant to what we have been solidifying across our SDKs.
sdk.js is the main file for the SDK. It handles the different values that can be passed into the API functions and verifies that the required data is present.

APM Reviewers:

packages/dd-trace/src/llmobs/index.js houses the enablement of the LLMObs module. The SDK is always initialized, even when LLMObs isn't enabled (in that case it behaves as a no-op, although it is not the no-op SDK), so all writers and channel subscribers are registered in the module instead. The module is enabled only when a) enable(config) is called from the tracer proxy because LLMObs was enabled during init, or b) llmobs.enable(llmobsConfig) is called after init. This way, no writers, periodic flushing, or span processing/injection subscribers are registered unless LLMObs is explicitly enabled.
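The split described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class names (ExampleSdk, ExampleModule), the channel name, and the flushInterval option are all hypothetical. Only the node:diagnostics_channel calls are real APIs.

```javascript
'use strict'

const dc = require('node:diagnostics_channel')

// Hypothetical channel name, for illustration only.
const CHANNEL_NAME = 'example:llmobs:span:process'
const spanProcessCh = dc.channel(CHANNEL_NAME)

// Hypothetical SDK: always constructed, but inert until the module enables it.
class ExampleSdk {
  constructor () {
    this.enabled = false
  }

  annotate (span, tags) {
    // Behaves as a no-op while LLMObs is not enabled.
    if (!this.enabled) return false
    spanProcessCh.publish({ span, tags })
    return true
  }
}

// Hypothetical module: owns the subscribers and the periodic flush.
class ExampleModule {
  constructor (sdk) {
    this.sdk = sdk
    this.buffer = []
    this._onSpan = message => this.buffer.push(message)
    this._flushTimer = null
  }

  enable (config = {}) {
    if (this.sdk.enabled) return
    this.sdk.enabled = true
    // Subscribers and the flush timer are registered only at this point.
    dc.subscribe(CHANNEL_NAME, this._onSpan)
    this._flushTimer = setInterval(() => this.flush(), config.flushInterval ?? 1000)
    this._flushTimer.unref() // do not keep the process alive just to flush
  }

  disable () {
    if (!this.sdk.enabled) return
    this.sdk.enabled = false
    dc.unsubscribe(CHANNEL_NAME, this._onSpan)
    clearInterval(this._flushTimer)
    this._flushTimer = null
  }

  flush () {
    // A real writer would send the buffered events; this sketch just drains them.
    return this.buffer.splice(0)
  }
}
```

With this shape, constructing the SDK has no side effects: the channel has no subscribers and no timer is running until enable() is called.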

Motivation

Part of the LLM Observability SDK release. This is the last PR for the main SDK. A follow-up PR will be for an OpenAI LLM Obs plugin.

github-actions · 2024-10-11T15:54:21Z

Overall package size

Self size: 7.7 MB
Deduped: 62.44 MB
No deduping: 62.72 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.1.1 | 18.67 MB | 18.68 MB |
| @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB |
| @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.59 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| lru-cache | 7.14.0 | 74.95 kB | 74.95 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

pr-commenter · 2024-10-11T16:40:26Z

Benchmarks

Benchmark execution time: 2024-10-17 23:55:17

Comparing candidate commit 7c7a328 in PR branch sabrenner/llmobs-sdk-sdk with baseline commit b6452ad in branch sabrenner/llmobs-sdk-release.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.

sergey-mamulchenko · 2024-10-14T12:22:27Z

Hi, we're interested in trying out monitoring for the LLMs used in our Node application. Do you have an idea of when the JS SDK is planned to ship?

index.d.ts

sabrenner · 2024-10-16T17:08:20Z

Hi @sergey-mamulchenko! We're looking to release this SDK by the end of the month. Feel free to follow along with #4742, as that will be the PR that lands containing the SDK 😄

lievan

No blocking comments, ty!

index.d.ts

packages/dd-trace/src/llmobs/sdk.js

Yun-Kim

Other than a small comment about exposing active() and some clarification questions, no major blocking comments from the MLObs team perspective! Great work @sabrenner 🎉

docs/test.ts

index.d.ts

packages/dd-trace/src/llmobs/sdk.js

Yun-Kim · 2024-10-23T22:48:33Z

packages/dd-trace/src/llmobs/sdk.js

+
+        if (result && typeof result.then === 'function') {
+          return result.then(value => {
+            if (value && kind !== 'retrieval' && !LLMObsTagger.tagMap.get(span)?.[OUTPUT_VALUE]) {


will this still attempt to auto annotate the output for an llm-kind span? Asking because LLM span outputs are in message format so we should just avoid auto-annotating here entirely.

ahhh yeah, i actually missed that our _model_decorators do not auto annotate at all. i'll add a check here for llm, and then input annotation check for llm and embedding, and just write a couple small regression tests!
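The checks discussed in this thread could look roughly like the sketch below. It is an illustration under assumptions, not the merged code: tagMap stands in for LLMObsTagger.tagMap, the 'output.value' key is a placeholder for OUTPUT_VALUE, and the helper names are hypothetical. The kind lists follow the reply above: no output auto-annotation for llm (in addition to retrieval), and no input auto-annotation for llm or embedding.

```javascript
'use strict'

// Hypothetical stand-in for the tagger's per-span storage.
const tagMap = new WeakMap()
const OUTPUT_VALUE = 'output.value' // assumed key name, for illustration

function annotateOutput (span, value) {
  const tags = tagMap.get(span) ?? {}
  tags[OUTPUT_VALUE] = value
  tagMap.set(span, tags)
}

// Kinds that are never auto-annotated in this sketch.
const SKIP_OUTPUT_KINDS = new Set(['llm', 'retrieval'])
const SKIP_INPUT_KINDS = new Set(['llm', 'embedding'])

function shouldAutoAnnotateInput (kind) {
  return !SKIP_INPUT_KINDS.has(kind)
}

function shouldAutoAnnotateOutput (span, kind, value) {
  if (!value) return false
  if (SKIP_OUTPUT_KINDS.has(kind)) return false
  // Respect an output the user has already annotated manually.
  return !tagMap.get(span)?.[OUTPUT_VALUE]
}

// Applies the check to a plain return value or to a promise's resolved value.
function finish (span, kind, result) {
  const maybeAnnotate = value => {
    if (shouldAutoAnnotateOutput(span, kind, value)) annotateOutput(span, value)
    return value
  }
  if (result && typeof result.then === 'function') {
    return result.then(maybeAnnotate)
  }
  return maybeAnnotate(result)
}
```

Keeping the kind lists in sets makes the regression tests mentioned above straightforward: one case per kind, asserting whether the tag map was touched.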

packages/dd-trace/src/llmobs/sdk.js

Yun-Kim · 2024-10-23T22:56:39Z

packages/dd-trace/src/llmobs/util.js

+}
+
+// extracts the argument names from a function string
+function parseArgumentNames (str) {


This is parsing the argument and function signature from the entire serialized function? 💀

yeah 💀 but i memoized it with a weak reference to the function, so if the same function is invoked a million times, it'll use the parsed argument names from the first time and map over the arguments accordingly. tbh this might need some iteration, because i tried to assume user cases in writing tests but i'm sure i didn't think of everything. try/catched it so we will never crash, but will probably iterate on it accordingly from user reports.

would be nice if JS had some built-in parsers/helpers like Python, but it's all pretty basic string operations here so it shouldn't be too time consuming, it's all just linear
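A rough sketch of the approach described in this thread: parse the parameter names out of the function's source once, cache them in a WeakMap keyed by the function, and never throw. The parsing logic here is a simplified stand-in written for illustration, not the implementation in util.js. It does not handle comments or string defaults containing brackets or commas, and destructured parameters come back as their pattern text. The name argumentNamesCache is hypothetical.

```javascript
'use strict'

// Extracts parameter names from a serialized function in one linear pass.
function parseArgumentNames (source) {
  // Single-parameter arrow function without parentheses, e.g. `x => x * 2`.
  const bare = source.match(/^(?:async\s+)?([A-Za-z_$][\w$]*)\s*=>/)
  if (bare) return [bare[1]]

  const start = source.indexOf('(')
  if (start === -1) return []

  const params = []
  let depth = 0
  let current = ''
  for (let i = start; i < source.length; i++) {
    const ch = source[i]
    if ('([{'.includes(ch)) {
      depth++
      if (depth === 1) continue // skip the opening parenthesis itself
    } else if (')]}'.includes(ch)) {
      depth--
      if (depth === 0) break // end of the parameter list
    }
    if (ch === ',' && depth === 1) {
      params.push(current)
      current = ''
    } else {
      current += ch
    }
  }
  params.push(current)

  return params
    .map(param => param.split('=')[0].trim().replace(/^\.\.\./, ''))
    .filter(Boolean)
}

// Weak keys: cached names are released once the function is garbage collected.
const argumentNamesCache = new WeakMap()

// Maps positional arguments to parameter names.
// A rest parameter only receives the first extra argument in this sketch.
function getFunctionArguments (fn, args) {
  let names = argumentNamesCache.get(fn)
  if (!names) {
    try {
      names = parseArgumentNames(fn.toString())
    } catch {
      names = [] // never crash the caller because of a parsing problem
    }
    argumentNamesCache.set(fn, names)
  }
  const mapped = {}
  names.forEach((name, i) => {
    if (i < args.length) mapped[name] = args[i]
  })
  return mapped
}
```

Because the cache is keyed by the function object, a function invoked many times is serialized and parsed only on its first call.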

Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>

Kyle-Verhoog

can't speak too much to the llm-specific stuff since i'm still ramping up to everything. Main concern is the soft fails instead of hard. Feel free to merge and address in the bigger PR.
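For context on the soft-fail versus hard-fail concern, the difference can be sketched as below. The commit list mentions changing validKind to validateKind and throwing; the bodies here are a guess at the general shape only, and the list of span kinds is an assumption made for illustration.

```javascript
'use strict'

// Assumed list of span kinds, for illustration only.
const SPAN_KINDS = new Set(['llm', 'agent', 'workflow', 'task', 'tool', 'embedding', 'retrieval'])

// Soft fail: report the problem and let the caller carry on.
function validKind (kind, warn = console.warn) {
  if (SPAN_KINDS.has(kind)) return true
  warn(`invalid span kind: ${kind}`)
  return false
}

// Hard fail: throw so the mistake surfaces at the call site.
function validateKind (kind) {
  if (!SPAN_KINDS.has(kind)) {
    throw new Error(`invalid span kind: ${kind}`)
  }
  return kind
}
```

A soft fail keeps the host application running but can silently drop data, while a hard fail makes misuse visible during development.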

docs/test.ts

index.d.ts

packages/dd-trace/src/llmobs/sdk.js

sabrenner · 2024-10-24T15:34:22Z

Will merge this PR into the feature branch. It looks like a serverless benchmark is failing, but I will resolve that in the final PR if needed.

* [MLOB-1540] add llmobs configuration to global tracer config (#4696)
add llmobs config
* [MLOB-1555] LLM Observability writers (#4699)
LLM Observability writers
* [MLOB-1556] LLM Observability tagger (#4718)
LLM Observability tagger
* [MLOB-1560] LLMObs Span Processor (#4738)
* span processor
* tests
* remove agent exporter log and do not stringify tags
* remove llmobs from exporter tests
* add in default unserializable value
* review comments
* warning log for metric
* todo-ify
* remove some duplicate logic
* decouple llmobs span processing with a channel
* use a static weakmap to store llmobs tags/annotations instead of span tags
* do not register span in map if it does not have an llmobs span kind
* span is passed on an object from sp publisher
* re-clarify TODOs
* only send span in publish
* log multiple warnings and return conditional undefined
* update error logic
* [MLOB-1561] LLM Observability SDK API (#4773)
* wip
* type definitions
* active + try/catch eval metric writer append
* test ts
* use tagger map and processor as a channel subscriber
* change decorate and add in dev changes
* try some api changes
* add decorate to noop
* fix breaking proxy tests
* experimental decorators for TS docs
* api changes, fix unit + e2e tests
* try removing global log mocks
* add some util tests
* remove logger mocks
* add module tests + do not enable when not specified
* fix eval metric integration test
* wip
* memoize getFunctionArguments
* move any subscriber and global writer to the module enablement level instead of sdk
* should fix TS tests
* add ts integration test and fix decorator
* devex for ts versions
* add noop typescript test
* remove startSpan
* remove unneeded change
* dedup decorator code
* Update index.d.ts
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
* map metrics names
* change validKind to validateKind and throw
* tagger for metrics follow-up
* review feedback
* add some tests for not auto-annotating in certain cases
---------
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
* hard fail instead of soft fail, except for `wrap` span name
* add ml-observability codeowners
* resolve ts test
* update auto-annotation check
* tagger can soft fail
* using custom ASL instance and scope activation
* fix test comments and remove
* address review comments
* remove llmobs.apiKey config, only rely on global
* fix evaulations test
* make llmobs storage accessible
---------
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>

sabrenner added 9 commits October 3, 2024 13:08

wip

a451f5d

type definitions

c57f22a

active + try/catch eval metric writer append

8507034

test ts

e58f30e

use tagger map and processor as a channel subscriber

3a50ca4

change decorate and add in dev changes

e11d5ae

Merge branch 'sabrenner/llmobs-sdk-release' of github.com:DataDog/dd-trace-js into sabrenner/llmobs-sdk-sdk

031559b

try some api changes

db8e2eb

add decorate to noop

a4001d5

fix breaking proxy tests

ca22dff

experimental decorators for TS docs

e02c993

sabrenner added 11 commits October 15, 2024 12:55

api changes, fix unit + e2e tests

bcebeac

try removing global log mocks

b2d4216

add some util tests

622f54f

remove logger mocks

dbe664d

add module tests + do not enable when not specified

00d4572

fix eval metric integration test

ad28001

wip

9d68a0c

memoize getFunctionArguments

20a5ab1

move any subscriber and global writer to the module enablement level instead of sdk

b400b1c

Merge branch 'sabrenner/llmobs-sdk-release' of github.com:DataDog/dd-trace-js into sabrenner/llmobs-sdk-sdk

d902720

should fix TS tests

4c167ed

sabrenner commented Oct 16, 2024

index.d.ts (outdated)

sabrenner marked this pull request as ready for review October 16, 2024 17:04

sabrenner requested a review from a team as a code owner October 16, 2024 17:04

add ts integration test and fix decorator

3c8b1d7

sabrenner and others added 6 commits October 16, 2024 15:42

Merge branch 'sabrenner/llmobs-sdk-release' into sabrenner/llmobs-sdk-sdk

9ed2f5e

devex for ts versions

8a0374a

add noop typescript test

040d04a

remove startSpan

745d4ae

remove unneeded change

79f0d07

dedup decorator code

7c7a328

rochdev approved these changes Oct 21, 2024

lievan approved these changes Oct 23, 2024

index.d.ts

packages/dd-trace/src/llmobs/sdk.js (outdated)

packages/dd-trace/src/llmobs/sdk.js (outdated)

Yun-Kim approved these changes Oct 23, 2024

Update index.d.ts

49c20e8

Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>

Kyle-Verhoog reviewed Oct 24, 2024

sabrenner added 6 commits October 24, 2024 09:34

map metrics names

24396eb

change validKind to validateKind and throw

af823ba

tagger for metrics follow-up

854afef

review feedback

379fbf7

Merge branch 'sabrenner/llmobs-sdk-sdk' of github.com:DataDog/dd-trace-js into sabrenner/llmobs-sdk-sdk

c65b8b0

add some tests for not auto-annotating in certain cases

e6562f0

sabrenner merged commit 7e8e0f7 into sabrenner/llmobs-sdk-release Oct 24, 2024
200 of 202 checks passed

sabrenner deleted the sabrenner/llmobs-sdk-sdk branch October 24, 2024 15:34

sabrenner mentioned this pull request Oct 24, 2024

[MLOB-1524] feat(llmobs): Introduce LLM Observability SDK #4742

Merged
