Add `stream` param for inference APIs #198646
Conversation
/ci
/ci
/ci
/ci
Pinging @elastic/appex-ai-infra (Team:AI Infra)
kibana-presentation changes LGTM
code review only
Thanks for all the docs, this looks really good. I left a few questions, and some minor suggestions.
LGTM!
💚 Build Succeeded
Metrics [docs]
- Public APIs missing comments
- Async chunks
- Public APIs missing exports
- Page load bundle
@elasticmachine merge upstream
Starting backport for target branches: 8.x https://github.com/elastic/kibana/actions/runs/11686697979
💔 All backports failed
Manual backport
To create the backport manually run:
Questions? Please refer to the Backport tool documentation
# Backport

This will backport the following commits from `main` to `8.x`:
- Add `stream` param for inference APIs (#198646) (fe16822)

### Questions?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)
## Summary

Fix #198644

Add a `stream` parameter to the `chatComplete` and `output` APIs, defaulting to `false`, to switch between "full content response as promise" and "event observable" responses.

Note: at the moment, in non-stream mode, the implementation is simply constructing the response from the observable. It should be possible later to improve this by having the LLM adapters handle the stream/no-stream logic, but this is out of scope of the current PR.

### Normal mode

```ts
const response = await chatComplete({
  connectorId: 'my-connector',
  system: "You are a helpful assistant",
  messages: [
    { role: MessageRole.User, content: "Some question?" },
  ],
});

const { content, toolCalls } = response;
// do something
```

### Stream mode

```ts
const events$ = chatComplete({
  stream: true,
  connectorId: 'my-connector',
  system: "You are a helpful assistant",
  messages: [
    { role: MessageRole.User, content: "Some question?" },
  ],
});

events$.subscribe((event) => {
  // do something
});
```
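As a rough illustration of the split described above, here is a hypothetical, simplified signature sketch showing how a single `stream` flag can select between the two return types. The names and shapes below are placeholders for illustration, not the plugin's actual types.

```ts
import { Observable } from 'rxjs';

// Hypothetical, simplified shapes for illustration only.
interface ChatCompleteOptions {
  connectorId: string;
  system?: string;
  messages: Array<{ role: string; content: string }>;
}

interface ChatCompleteResponse {
  content: string;
  toolCalls: unknown[];
}

type ChatCompleteEvent = { type: string };

// `stream: true` selects the event observable; omitting the flag
// (it defaults to false) selects the promise of the full response.
interface ChatCompleteAPI {
  (options: ChatCompleteOptions & { stream: true }): Observable<ChatCompleteEvent>;
  (options: ChatCompleteOptions & { stream?: false }): Promise<ChatCompleteResponse>;
}
```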
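The note above mentions that non-stream mode currently just builds its response by collecting the event observable. A minimal sketch of that collapse, using RxJS and hypothetical event/response shapes (the plugin's real event types will differ):

```ts
import { lastValueFrom, Observable } from 'rxjs';
import { reduce } from 'rxjs/operators';

// Hypothetical event and response shapes, for illustration only.
type ChatCompleteEvent =
  | { type: 'chunk'; content: string }
  | { type: 'complete'; toolCalls: unknown[] };

interface ChatCompleteResponse {
  content: string;
  toolCalls: unknown[];
}

// Collapse the streaming events into the single response that the
// non-stream (`stream: false`) call resolves with.
function eventsToResponse(
  events$: Observable<ChatCompleteEvent>
): Promise<ChatCompleteResponse> {
  return lastValueFrom(
    events$.pipe(
      reduce<ChatCompleteEvent, ChatCompleteResponse>(
        (acc, event) =>
          event.type === 'chunk'
            ? { ...acc, content: acc.content + event.content }
            : { ...acc, toolCalls: event.toolCalls },
        { content: '', toolCalls: [] }
      )
    )
  );
}
```

Under that assumption, awaiting `chatComplete({ ... })` and collapsing `chatComplete({ stream: true, ... })` resolve to the same shape, which is why moving the stream/no-stream decision down into the LLM adapters can be deferred.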