Support OpenAI Assistants API #601
Conversation
🦋 Changeset detected. Latest commit: f85458c. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
No need to review this file. It won't be included in the PR. Just using it for testing
the interfaces seem to fit in pretty well
```diff
 await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

 assistant = VoiceAssistant(
     vad=silero.VAD.load(),
     stt=deepgram.STT(),
-    llm=openai.LLM(),
+    llm=openai.AssistantLLM(
+        assistant_opts=openai.AssistantCreateOptions(
```

If I'm already using the Assistants API today, is there a one-to-one mapping for the options?

```diff
+        ),
+    ),
 )
```
```python
self._assistant_opts = assistant_opts
```
nit: it's a bit confusing to assign both types to the same variable and then later check that same variable without knowing which type it holds. It seems cleaner to keep them separate and only sync when it's an `AssistantCreateOptions`.
```python
    n: int | None = 1,
    parallel_tool_calls: bool | None = None,
):
    return AssistantLLMStream(
```
Would there be race issues if a user does:

```python
llm.chat(message1)
llm.chat(message2)
```

Is it possible for message1 to override the message context from message2 there? Should the `chat` function limit itself to a single active stream at any given time?
This is a limitation of the OpenAI API: for any given thread_id, only one stream can run at a time, and messages cannot be mutated while that stream is running. The llm instance represents the thread.

Logically it makes more sense for the chat_ctx to be the object that represents a thread, but since anyone can mutate and/or copy the chat_ctx at any time (VoiceAssistant), we needed additional logic and state that couldn't really live anywhere else.
Can we hold a "lock" while a stream is running, so we can raise an exception if someone tries to start a new stream while an older one is still active?
```python
for msg in self._chat_ctx.messages:
    msg_id = msg._metadata.get(MESSAGE_ID_KEY)
    if msg_id and msg_id not in added_messages_set:
        await self._client.beta.threads.messages.delete(
```
Can we run a task to sync all the changes that need to be made, instead of awaiting them one by one?
There isn't a way to order messages, so it was one by one. BUT! I just made an adjustment to use `additional_messages`, which creates the messages at stream time. This introduced a fair amount of additional complexity, because now we don't know the OpenAI message_id up front, but it's definitely worth it for latency.
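For the deletions themselves, which are order-independent, the concurrent sync suggested above could look roughly like this (`delete_message` is a hypothetical stand-in for the `client.beta.threads.messages.delete` call):

```python
import asyncio


async def delete_message(msg_id: str) -> str:
    # Stand-in for the HTTP round trip of a real delete call.
    await asyncio.sleep(0)
    return msg_id


async def sync_deletions(msg_ids: list[str]) -> list[str]:
    # Deletions don't depend on ordering, so they can run concurrently;
    # gather still returns results in input order.
    return await asyncio.gather(*(delete_message(m) for m in msg_ids))


deleted = asyncio.run(sync_deletions(["msg_1", "msg_2", "msg_3"]))
```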
```python
def __init__(
    self,
    *,
    assistant_opts: AssistantOptions,
```
What about `AssistantCreateOptions | AssistantLoadOptions` instead?
```python
from .utils import build_oai_message

DEFAULT_MODEL = "gpt-4o"
OPEN_AI_MESSAGE_ID_KEY = "__openai_message_id__"
```
nit:

```diff
-OPEN_AI_MESSAGE_ID_KEY = "__openai_message_id__"
+OPENAI_MESSAGE_ID_KEY = "__openai_message_id__"
```
```python
    n: int | None = 1,
    parallel_tool_calls: bool | None = None,
):
    if n is not None:
```
Seems like it will always enter this condition? `n` defaults to 1 here; I think the default should be `None`.
lg!
awesome!!
* Fix deepgram English check (livekit#625)
* Cartesia bump to 0.4.0 (livekit#624)
* Introduce manual package release (livekit#626)
* Use the correct working directory in the manual publish job (livekit#627)
* Modified RAG plugin (livekit#629) Co-authored-by: Théo Monnom <theo.monnom@outlook.com>
* Revert "nltk: fix broken punkt download" (livekit#630)
* Expose WorkerType explicitly (livekit#632)
* openai: allow sending user IDs (livekit#633)
* silero: fix vad padding & choppy audio (livekit#631)
* ipc: use our own duplex instead of mp.Queue (livekit#634)
* llm: fix optional arguments & non-hashable list (livekit#637)
* Add agent_name to WorkerOptions (livekit#636)
* Support OpenAI Assistants API (livekit#601)
* voiceassistant: fix will_synthesize_assistant_reply race (livekit#638)
* silero: adjust vad activation threshold (livekit#639)
* Version Packages (livekit#615) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* voiceassistant: fix llm not having the full chat context on bad interruption timing (livekit#640)
* livekit-plugins-browser: handle mouse/keyboard inputs on devmode (livekit#644)
* nltk: fix another semver break (livekit#647)
* livekit-plugins-browser: python API (livekit#645)
* Delete test.py (livekit#652)
* livekit-plugins-browser: prepare for release (livekit#653)
* Version Packages (livekit#641) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Revert "Version Packages" (livekit#659)
* fix release workflow (livekit#661)
* Version Packages (livekit#660) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Add ServerMessage.termination handler (livekit#635) Co-authored-by: Théo Monnom <theo.8bits@gmail.com>
* Introduce anthropic plugin (livekit#655)
* fix uninitialized SpeechHandle error on interruption (livekit#665)
* voiceassistant: avoid stacking assistant replies when allow_interruptions=False (livekit#667)
* fix: disconnect event may now have some arguments (livekit#668)
* Anthropic requires the first message to be a non empty 'user' role (livekit#669)
* support clova speech (livekit#439)
* Updated readme with LLM options (livekit#671)
* Update README.md (livekit#666)
* plugins: add docstrings explaining API keys (livekit#672)
* Disable anthropic test due to 429s (livekit#675)
* Remove duplicate entry from plugin table (livekit#673)
* Version Packages (livekit#662) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* deepgram: switch the default model to phonecall (livekit#676)
* update livekit to 0.14.0 and await tracksubscribed (livekit#678)
* Fix Google STT exception when no valid speech is recognized (livekit#680)
* Introduce easy api for starting tasks for remote participants (livekit#679)
* examples: document how to log chats (livekit#685)
* Version Packages (livekit#677) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* voiceassistant: keep punctuations when sending agent transcription (livekit#648)
* Pass context into participant entrypoint (livekit#694)
* Version Packages (livekit#693) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Update examples to use participant_entrypoint (livekit#695)
* voiceassistant: add VoiceAssistantState (livekit#654) Co-authored-by: Théo Monnom <theo.8bits@gmail.com>
* Fix anthropic package publishing (livekit#701)
* fix non pickleable log (livekit#691)
* Revert "Update examples to use participant_entrypoint" (livekit#702)
* google-tts: ignore wav header (livekit#703)
* fix examples (livekit#704)
* skip processing of choice.delta when it is None (livekit#705)
* delete duplicate code (livekit#707)
* voiceassistant: skip speech initialization if interrupted (livekit#715)
* Ensure room.name is available before connection (livekit#716)
* Add deepseek LLMs at OpenAI plugin (livekit#714)
* add threaded job runners (livekit#684)
* voiceassistant: add before_tts_cb callback (livekit#706)
* voiceassistant: fix mark_audio_segment_end with no audio data (livekit#719)
* add JobContext.wait_for_participant (livekit#712)
* Enable Google TTS with application default credentials (livekit#721)
* improve gracefully_cancel logic (livekit#720)
* bump required livekit version to 0.15.2 (livekit#722)
* elevenlabs: expose enable_ssml_parsing (livekit#723)
* Version Packages (livekit#697) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* release anthropic (livekit#724)
* Version Packages (livekit#725) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Update examples to use wait_for_participant (livekit#726) Co-authored-by: Théo Monnom <theo.8bits@gmail.com>
* Introduce function calling to OpenAI Assistants (livekit#710) Co-authored-by: Théo Monnom <theo.8bits@gmail.com>
* tts_forwarder: don't raise inside mark_{audio,text}_segment_end when nothing was pushed (livekit#730)
* Add Cerebras to OpenAI Plugin (livekit#731)
* Fixes to Anthropic Function Calling (livekit#708)
* ci: don't run tests on forks (livekit#739)
* Only send actual audio to Deepgram (livekit#738)
* Add support for cartesia voice control (livekit#740) Co-authored-by: Théo Monnom <theo.8bits@gmail.com>
* Version Packages (livekit#727) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Allow setting LLM temperature with VoiceAssistant (livekit#741)
* Update STT sample README (livekit#709)
* avoid returning tiny frames from TTS (livekit#747)
* run tests on main (and make skipping clearer) (livekit#748)
* voiceassistant: avoid tiny frames on playout (livekit#750)
* limit concurrent process init to 1 (livekit#751)
* windows: default to threaded executor & fix dev mode (livekit#755)
* improve graceful shutdown (livekit#756)
* better dev defaults (livekit#762)
* 11labs: send phoneme in one entire xml chunk (livekit#766)
* ipc: fix process not starting if num_idle_processes is zero (livekit#763)
* limit noisy logs & keep the root logger info (livekit#768)
* use os.exit to exit forcefully (livekit#770)
* Fix Assistant API Vision Capabilities (livekit#771)
* voiceassistant: allow to cancel llm generation inside before_llm_cb (livekit#753)
* Remove useless logs (livekit#773)
* voiceassistant: expose min_endpointing_delay (livekit#752)
* Add typing-extensions as a dependency (livekit#778)
* rename voice_assistant.state to agent.state (livekit#772) Co-authored-by: aoife cassidy <aoife@livekit.io>
* bump rtc (livekit#782)
* Version Packages (livekit#744) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* added livekit-plugins-playht text-to-speech (livekit#735)
* Fix function for OpenAI Assistants (livekit#784)
* fix the problem of infinite loop when agent speech is interrupted (livekit#790)

Co-authored-by: David Zhao <dz@livekit.io>
Co-authored-by: Neil Dwyer <neildwyer1991@gmail.com>
Co-authored-by: Alejandro Figar Gutierrez <afigar@me.com>
Co-authored-by: Théo Monnom <theo.monnom@outlook.com>
Co-authored-by: Théo Monnom <theo.8bits@gmail.com>
Co-authored-by: aoife cassidy <aoife@livekit.io>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: josephkieu <168809198+josephkieu@users.noreply.github.com>
Co-authored-by: Mehadi Hasan Menon <104126711+mehadi92@users.noreply.github.com>
Co-authored-by: lukasIO <mail@lukasseiler.de>
Co-authored-by: xsg22 <111886011+xsg22@users.noreply.github.com>
Co-authored-by: Yuan He <183649+lenage@users.noreply.github.com>
Co-authored-by: Ryan Sinnet <rsinnet@users.noreply.github.com>
Co-authored-by: Henry Tu <henry@henrytu.me>
Co-authored-by: Ben Cherry <bcherry@gmail.com>
Co-authored-by: Jaydev <jaydevjadav.015@gmail.com>
Co-authored-by: Jax <anyetiangong@qq.com>