From 8c4c0754fa3babea9087e2abcd424a548f3f4b83 Mon Sep 17 00:00:00 2001
From: Donny Yung
Date: Wed, 25 Sep 2024 21:16:39 -0400
Subject: [PATCH] Merge livekit-agent 0.9.0 (#4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Fix deepgram English check (#625)
* Cartesia bump to 0.4.0 (#624)
* Introduce manual package release (#626)
* Use the correct working directory in the manual publish job (#627)
* Modified RAG plugin (#629)

Co-authored-by: Théo Monnom

* Revert "nltk: fix broken punkt download" (#630)
* Expose WorkerType explicitly (#632)
* openai: allow sending user IDs (#633)
* silero: fix vad padding & choppy audio (#631)
* ipc: use our own duplex instead of mp.Queue (#634)
* llm: fix optional arguments & non-hashable list (#637)
* Add agent_name to WorkerOptions (#636)
* Support OpenAI Assistants API (#601)
* voiceassistant: fix will_synthesize_assistant_reply race (#638)
* silero: adjust vad activation threshold (#639)
* Version Packages (#615) Co-authored-by: github-actions[bot]
* voiceassistant: fix llm not having the full chat context on bad interruption timing (#640)
* livekit-plugins-browser: handle mouse/keyboard inputs on devmode (#644)
* nltk: fix another semver break (#647)
* livekit-plugins-browser: python API (#645)
* Delete test.py (#652)
* livekit-plugins-browser: prepare for release (#653)
* Version Packages (#641) Co-authored-by: github-actions[bot]
* Revert "Version Packages" (#659)
* fix release workflow (#661)
* Version Packages (#660) Co-authored-by: github-actions[bot]
* Add ServerMessage.termination handler (#635) Co-authored-by: Théo Monnom
* Introduce anthropic plugin (#655)
* fix uninitialized SpeechHandle error on interruption (#665)
* voiceassistant: avoid stacking assistant replies when allow_interruptions=False (#667)
* fix: disconnect event may now have some arguments (#668)
* Anthropic requires the first message to be a non-empty 'user' role (#669)
* support clova speech (#439)
* Updated readme with LLM options (#671)
* Update README.md (#666)
* plugins: add docstrings explaining API keys (#672)
* Disable anthropic test due to 429s (#675)
* Remove duplicate entry from plugin table (#673)
* Version Packages (#662) Co-authored-by: github-actions[bot]
* deepgram: switch the default model to phonecall (#676)
* update livekit to 0.14.0 and await tracksubscribed (#678)
* Fix Google STT exception when no valid speech is recognized (#680)
* Introduce easy api for starting tasks for remote participants (#679)
* examples: document how to log chats (#685)
* Version Packages (#677) Co-authored-by: github-actions[bot]
* voiceassistant: keep punctuations when sending agent transcription (#648)
* Pass context into participant entrypoint (#694)
* Version Packages (#693) Co-authored-by: github-actions[bot]
* Update examples to use participant_entrypoint (#695)
* voiceassistant: add VoiceAssistantState (#654) Co-authored-by: Théo Monnom
* Fix anthropic package publishing (#701)
* fix non pickleable log (#691)
* Revert "Update examples to use participant_entrypoint" (#702)
* google-tts: ignore wav header (#703)
* fix examples (#704)
* skip processing of choice.delta when it is None (#705)
* delete duplicate code (#707)
* voiceassistant: skip speech initialization if interrupted (#715)
* Ensure room.name is available before connection (#716)
* Add deepseek LLMs at OpenAI plugin (#714)
* add threaded job runners (#684)
* voiceassistant: add before_tts_cb callback (#706)
* voiceassistant: fix mark_audio_segment_end with no audio data (#719)
* add JobContext.wait_for_participant (#712)
* Enable Google TTS with application default credentials (#721)
* improve gracefully_cancel logic (#720)
* bump required livekit version to 0.15.2 (#722)
* elevenlabs: expose enable_ssml_parsing (#723)
* Version Packages (#697) Co-authored-by: github-actions[bot]
* release anthropic (#724)
* Version Packages (#725) Co-authored-by: github-actions[bot]
* Update examples to use wait_for_participant (#726) Co-authored-by: Théo Monnom
* Introduce function calling to OpenAI Assistants (#710) Co-authored-by: Théo Monnom
* tts_forwarder: don't raise inside mark_{audio,text}_segment_end when nothing was pushed (#730)
* Add Cerebras to OpenAI Plugin (#731)
* Fixes to Anthropic Function Calling (#708)
* ci: don't run tests on forks (#739)
* Only send actual audio to Deepgram (#738)
* Add support for cartesia voice control (#740) Co-authored-by: Théo Monnom
* Version Packages (#727) Co-authored-by: github-actions[bot]
* Allow setting LLM temperature with VoiceAssistant (#741)
* Update STT sample README (#709)
* avoid returning tiny frames from TTS (#747)
* run tests on main (and make skipping clearer) (#748)
* voiceassistant: avoid tiny frames on playout (#750)
* limit concurrent process init to 1 (#751)
* windows: default to threaded executor & fix dev mode (#755)
* improve graceful shutdown (#756)
* better dev defaults (#762)
* 11labs: send phoneme in one entire xml chunk (#766)
* ipc: fix process not starting if num_idle_processes is zero (#763)
* limit noisy logs & keep the root logger info (#768)
* use os.exit to exit forcefully (#770)
* Fix Assistant API Vision Capabilities (#771)
* voiceassistant: allow to cancel llm generation inside before_llm_cb (#753)
* Remove useless logs (#773)
* voiceassistant: expose min_endpointing_delay (#752)
* Add typing-extensions as a dependency (#778)
* rename voice_assistant.state to agent.state (#772) Co-authored-by: aoife cassidy
* bump rtc (#782)
* Version Packages (#744) Co-authored-by: github-actions[bot]
* added livekit-plugins-playht text-to-speech (#735)
* Fix function for OpenAI Assistants (#784)
* fix the problem of infinite loop when agent speech is interrupted (#790)

---------

Co-authored-by: David Zhao
Co-authored-by: Neil Dwyer
Co-authored-by: Alejandro Figar Gutierrez
Co-authored-by: Théo Monnom
Co-authored-by: Théo Monnom
Co-authored-by: aoife cassidy
Co-authored-by: github-actions[bot]
<41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot]
Co-authored-by: josephkieu <168809198+josephkieu@users.noreply.github.com>
Co-authored-by: Mehadi Hasan Menon <104126711+mehadi92@users.noreply.github.com>
Co-authored-by: lukasIO
Co-authored-by: xsg22 <111886011+xsg22@users.noreply.github.com>
Co-authored-by: Yuan He <183649+lenage@users.noreply.github.com>
Co-authored-by: Ryan Sinnet
Co-authored-by: Henry Tu
Co-authored-by: Ben Cherry
Co-authored-by: Jaydev
Co-authored-by: Jax
---
 .changeset/cuddly-eels-sin.md | 5 -
 .changeset/five-planes-drum.md | 7 -
 .changeset/itchy-ligers-exist.md | 5 -
 .changeset/lazy-cups-cross.md | 5 -
 .changeset/moody-doors-poke.md | 5 +
 .changeset/proud-birds-press.md | 5 -
 .changeset/red-taxis-smoke.md | 5 -
 .changeset/shaggy-apes-matter.md | 5 -
 .changeset/tidy-years-refuse.md | 6 +
 .github/workflows/build-package.yml | 98 ++
 .github/workflows/check-types.yml | 6 +-
 .github/workflows/publish-package.yml | 36 +-
 .github/workflows/tests.yml | 9 +-
 README.md | 35 +
 examples/browser/browser_track.py | 55 +
 examples/browser/standalone_app.py | 3 +
 examples/minimal_worker.py | 6 +-
 examples/participant-entrypoint/README.md | 30 +
 .../participant_entrypoint.py | 44 +
 .../participant-entrypoint/requirements.txt | 1 +
 examples/simple-color/agent.py | 15 +-
 examples/simple-color/requirements.txt | 2 +-
 examples/speech-to-text/README.md | 10 +-
 examples/speech-to-text/deepgram_stt.py | 3 +
 examples/speech-to-text/requirements.txt | 4 +-
 examples/text-to-speech/cartesia_tts.py | 43 +
 examples/text-to-speech/elevenlabs_tts.py | 7 +-
 examples/text-to-speech/openai_tts.py | 7 +-
 examples/text-to-speech/requirements.txt | 6 +-
 .../text-to-speech/sync_tts_transcription.py | 7 +-
 examples/voice-assistant/README.md | 22 +-
 .../voice-assistant/custom_pronunciation.py | 49 +
 examples/voice-assistant/function_calling.py | 115 --
 .../function_calling_weather.py | 85 ++
 examples/voice-assistant/minimal_assistant.py | 35 +-
 examples/voice-assistant/requirements.txt | 11 +-
 examples/voice-assistant/save_chatctx.py | 84 ++
 .../voice-assistant/simple-rag/assistant.py | 12 +-
 livekit-agents/CHANGELOG.md | 126 ++
 livekit-agents/livekit/agents/__init__.py | 10 +-
 livekit-agents/livekit/agents/cli/cli.py | 232 ++--
 livekit-agents/livekit/agents/cli/log.py | 21 +-
 livekit-agents/livekit/agents/cli/proto.py | 2 +-
 livekit-agents/livekit/agents/cli/watcher.py | 65 +-
 livekit-agents/livekit/agents/ipc/__init__.py | 18 +-
 .../livekit/agents/ipc/job_executor.py | 29 +
 .../agents/ipc/{proc_main.py => job_main.py} | 112 +-
 ...upervised_proc.py => proc_job_executor.py} | 61 +-
 .../livekit/agents/ipc/proc_lazy_main.py | 72 ++
 .../livekit/agents/ipc/proc_pool.py | 76 +-
 livekit-agents/livekit/agents/ipc/proto.py | 16 +-
 .../livekit/agents/ipc/thread_job_executor.py | 256 ++++
 livekit-agents/livekit/agents/job.py | 107 +-
 livekit-agents/livekit/agents/llm/_oai_api.py | 2 +-
 .../livekit/agents/llm/chat_context.py | 15 +-
 .../livekit/agents/llm/function_context.py | 80 +-
 livekit-agents/livekit/agents/log.py | 2 +-
 livekit-agents/livekit/agents/proto.py | 5 +
 .../livekit/agents/tokenize/__init__.py | 3 +-
 .../agents/tokenize/_basic_paragraph.py | 30 +-
 .../livekit/agents/tokenize/_basic_sent.py | 43 +-
 .../livekit/agents/tokenize/_basic_word.py | 47 +-
 .../livekit/agents/tokenize/basic.py | 20 +-
 .../livekit/agents/tokenize/token_stream.py | 82 +-
 .../livekit/agents/tokenize/tokenizer.py | 6 +
 .../livekit/agents/tokenize/utils.py | 82 ++
 .../agents/transcription/stt_forwarder.py | 27 +-
 .../agents/transcription/tts_forwarder.py | 265 ++--
 .../livekit/agents/utils/aio/__init__.py | 30 +-
 .../livekit/agents/utils/aio/duplex_unix.py | 25 +-
 .../livekit/agents/utils/aio/itertools.py | 114 ++
 livekit-agents/livekit/agents/utils/audio.py | 2 +-
 livekit-agents/livekit/agents/utils/misc.py | 2 +
 livekit-agents/livekit/agents/vad.py | 8 +-
 livekit-agents/livekit/agents/version.py | 2 +-
 .../agents/voice_assistant/__init__.py | 6 +-
 .../agents/voice_assistant/agent_output.py | 90 +-
 .../agents/voice_assistant/agent_playout.py | 83 +-
 .../agents/voice_assistant/human_input.py | 4 +-
 .../livekit/agents/voice_assistant/plotter.py | 52 +-
 .../agents/voice_assistant/speech_handle.py | 153 +++
 .../agents/voice_assistant/voice_assistant.py | 401 +++---
 livekit-agents/livekit/agents/worker.py | 133 +-
 livekit-agents/package.json | 2 +-
 livekit-agents/setup.py | 3 +-
 livekit-plugins/install_plugins_editable.sh | 1 +
 .../livekit-plugins-anthropic/CHANGELOG.md | 13 +
 .../livekit-plugins-anthropic/README.md | 13 +
 .../livekit/plugins/anthropic/__init__.py | 37 +
 .../livekit/plugins/anthropic/llm.py | 511 ++++++++
 .../livekit/plugins/anthropic/log.py | 3 +
 .../livekit/plugins/anthropic/models.py | 8 +
 .../livekit/plugins/anthropic/py.typed} | 0
 .../livekit/plugins/anthropic/version.py | 15 +
 .../livekit-plugins-anthropic/package.json | 5 +
 .../livekit-plugins-anthropic/pyproject.toml | 3 +
 .../livekit-plugins-anthropic/setup.py | 59 +
 .../livekit-plugins-azure/CHANGELOG.md | 6 +
 .../livekit/plugins/azure/stt.py | 7 +
 .../livekit/plugins/azure/tts.py | 52 +-
 .../livekit/plugins/azure/version.py | 2 +-
 .../livekit-plugins-azure/package.json | 2 +-
 .../{cef => }/.clang-format | 0
 .../{cef => }/.gitignore | 0
 .../livekit-plugins-browser/CHANGELOG.md | 7 +
 .../{cef => }/CMakeLists.txt | 3 +-
 .../{cef => }/LICENSE.txt | 0
 .../livekit-plugins-browser/README.md | 4 +
 .../cef/src/agents_python.cpp | 52 -
 .../cef/src/agents_python.hpp | 39 -
 .../livekit-plugins-browser/cef/src/app.hpp | 47 -
 .../cef/src/app_mac.mm | 146 ---
 .../cef/src/dev_renderer.cpp | 195 ---
 .../cef/src/handler.cpp | 156 ---
 .../cef/src/handler.hpp | 94 --
 .../cef/src/resources/lkcef-Info.plist | 36 -
 .../cef/src/run_browser.py | 27 -
 .../{cef => }/cmake/DownloadCEF.cmake | 0
 .../livekit/plugins/browser/__init__.py | 29 +
 .../livekit/plugins/browser/log.py | 3 +
 .../livekit/plugins/browser/proc.py | 239 ++++
 .../livekit/plugins/browser/proc_main.py | 193 +++
 .../livekit/plugins/browser/proto.py | 196 +++
 .../plugins/browser/py.typed} | 0
 .../plugins/browser/resources/__init__.py | 1 +
 .../livekit/plugins/browser/version.py | 15 +
 .../livekit-plugins-browser/package.json | 5 +
 .../livekit-plugins-browser/pyproject.toml | 9 +
 .../livekit-plugins-browser/setup.py | 126 ++
 .../livekit-plugins-browser/src/.gitignore | 3 +
 .../{cef => }/src/CMakeLists.txt | 28 +-
 .../src/agents_python.cpp | 138 +++
 .../src/agents_python.hpp | 69 ++
 .../{cef => }/src/app.cpp | 47 +-
 .../livekit-plugins-browser/src/app.hpp | 75 ++
 .../livekit-plugins-browser/src/app_mac.mm | 110 ++
 .../src/browser_handle.cpp | 15 +
 .../src/browser_handle.hpp | 72 ++
 .../src/dev_renderer.cpp | 593 +++++++++
 .../{cef => }/src/dev_renderer.hpp | 21 +-
 .../livekit-plugins-browser/src/dummy.cpp | 3 +
 .../livekit-plugins-browser/src/gleq.h | 419 +++++++
 .../livekit-plugins-browser/src/handler.cpp | 181 +++
 .../livekit-plugins-browser/src/handler.hpp | 104 ++
 .../{cef => }/src/helper_main_linux.cpp | 0
 .../{cef => }/src/helper_main_mac.mm | 0
 .../src/utils.hpp => src/helper_main_win.cpp} | 0
 .../src/keyboard_codes.h | 528 ++++++++
 .../src/resources/lkcefapp-Info.plist | 0
 .../src/resources/lkcefhelper-Info.plist | 0
 .../src/run_browser.py | 45 +
 .../livekit-plugins-cartesia/CHANGELOG.md | 13 +
 .../livekit/plugins/cartesia/models.py | 29 +-
 .../livekit/plugins/cartesia/tts.py | 40 +-
 .../livekit/plugins/cartesia/version.py | 2 +-
 .../livekit-plugins-cartesia/package.json | 2 +-
 .../livekit-plugins-clova/README.md | 13 +
 .../livekit/plugins/clova/__init__.py | 21 +
 .../livekit/plugins/clova/common.py | 13 +
 .../livekit/plugins/clova/constants.py | 2 +
 .../livekit/plugins/clova/log.py | 3 +
 .../livekit/plugins/clova/models.py | 17 +
 .../livekit/plugins/clova/stt.py | 132 ++
 .../livekit/plugins/clova/version.py | 15 +
 .../livekit-plugins-clova/pyproject.toml | 3 +
 .../livekit-plugins-clova/setup.py | 56 +
 .../livekit-plugins-deepgram/CHANGELOG.md | 20 +
 .../livekit/plugins/deepgram/stt.py | 19 +-
 .../livekit/plugins/deepgram/utils.py | 27 +
 .../livekit/plugins/deepgram/version.py | 2 +-
 .../livekit-plugins-deepgram/package.json | 2 +-
 .../livekit-plugins-deepgram/setup.py | 2 +-
 .../livekit-plugins-elevenlabs/CHANGELOG.md | 14 +
 .../livekit/plugins/elevenlabs/tts.py | 59 +-
 .../livekit/plugins/elevenlabs/version.py | 2 +-
 .../livekit-plugins-elevenlabs/package.json | 2 +-
 .../livekit-plugins-google/CHANGELOG.md | 22 +
 .../livekit-plugins-google/README.md | 2 +-
 .../livekit/plugins/google/stt.py | 45 +-
 .../livekit/plugins/google/tts.py | 20 +-
 .../livekit/plugins/google/version.py | 2 +-
 .../livekit-plugins-google/package.json | 2 +-
 .../livekit-plugins-google/setup.py | 1 +
 .../livekit-plugins-nltk/CHANGELOG.md | 12 +
 .../livekit/plugins/nltk/version.py | 2 +-
 .../livekit-plugins-nltk/package.json | 2 +-
 livekit-plugins/livekit-plugins-nltk/setup.py | 2 +-
 .../livekit-plugins-openai/CHANGELOG.md | 35 +
 .../livekit/plugins/openai/__init__.py | 3 +
 .../livekit/plugins/openai/beta/README.md | 78 ++
 .../livekit/plugins/openai/beta/__init__.py | 17 +
 .../plugins/openai/beta/assistant_llm.py | 590 +++++++++
 .../livekit/plugins/openai/llm.py | 299 +++--
 .../livekit/plugins/openai/models.py | 14 +
 .../livekit/plugins/openai/stt.py | 13 +
 .../livekit/plugins/openai/tts.py | 36 +-
 .../livekit/plugins/openai/utils.py | 88 +-
 .../livekit/plugins/openai/version.py | 2 +-
 .../livekit-plugins-openai/package.json | 2 +-
 .../livekit-plugins-playht/README.md | 13 +
 .../livekit/__init__.py | 0
 .../livekit/plugins/__init__.py | 0
 .../livekit/plugins/playht/__init__.py | 24 +
 .../livekit/plugins/playht/log.py | 3 +
 .../livekit/plugins/playht/models.py | 19 +
 .../livekit/plugins/playht/tts.py | 218 ++++
 .../livekit/plugins/playht/version.py | 1 +
 .../livekit-plugins-playht/package.json | 6 +
 .../livekit-plugins-playht/pyproject.toml | 3 +
 .../livekit-plugins-playht/setup.py | 44 +
 .../livekit-plugins-rag/CHANGELOG.md | 6 +
 .../livekit/plugins/rag/__init__.py | 3 +
 .../livekit/plugins/rag/version.py | 2 +-
 .../livekit-plugins-rag/package.json | 2 +-
 .../livekit-plugins-silero/CHANGELOG.md | 8 +
 .../livekit/plugins/silero/vad.py | 104 +-
 .../livekit/plugins/silero/version.py | 2 +-
 .../livekit-plugins-silero/package.json | 2 +-
 pnpm-lock.yaml | 1090 +++++++++--------
 test.py | 59 -
 tests/.gitignore | 1 +
 tests/test_ipc.py | 19 +-
 tests/test_llm.py | 224 ++--
 tests/test_tokenizer.py | 106 ++
 tests/test_tts.py | 2 +
 tests/test_vad.py | 66 +
 tests/utils.py | 60 +
 227 files changed, 9974 insertions(+), 2632 deletions(-)
delete mode 100644 .changeset/cuddly-eels-sin.md
delete mode 100644 .changeset/five-planes-drum.md
delete mode 100644 .changeset/itchy-ligers-exist.md
delete mode 100644 .changeset/lazy-cups-cross.md
create mode 100644 .changeset/moody-doors-poke.md
delete mode 100644 .changeset/proud-birds-press.md
delete mode 100644 .changeset/red-taxis-smoke.md
delete mode 100644 .changeset/shaggy-apes-matter.md
create mode 100644 .changeset/tidy-years-refuse.md
create mode 100644 .github/workflows/build-package.yml
create mode 100644 examples/browser/browser_track.py
create mode 100644 examples/browser/standalone_app.py
create mode 100644 examples/participant-entrypoint/README.md
create mode 100644 examples/participant-entrypoint/participant_entrypoint.py
create mode 100644 examples/participant-entrypoint/requirements.txt
create mode 100644 examples/text-to-speech/cartesia_tts.py
create mode 100644 examples/voice-assistant/custom_pronunciation.py
delete mode 100644 examples/voice-assistant/function_calling.py
create mode 100644 examples/voice-assistant/function_calling_weather.py
create mode 100644 examples/voice-assistant/save_chatctx.py
create mode 100644 livekit-agents/livekit/agents/ipc/job_executor.py
rename livekit-agents/livekit/agents/ipc/{proc_main.py => job_main.py} (71%)
rename livekit-agents/livekit/agents/ipc/{supervised_proc.py => proc_job_executor.py} (89%)
create mode 100644 livekit-agents/livekit/agents/ipc/proc_lazy_main.py
create mode 100644 livekit-agents/livekit/agents/ipc/thread_job_executor.py
create mode 100644 livekit-agents/livekit/agents/proto.py
create mode 100644 livekit-agents/livekit/agents/tokenize/utils.py
create mode 100644 livekit-agents/livekit/agents/utils/aio/itertools.py
create mode 100644 livekit-agents/livekit/agents/voice_assistant/speech_handle.py
create mode 100644 livekit-plugins/livekit-plugins-anthropic/CHANGELOG.md
create mode 100644 livekit-plugins/livekit-plugins-anthropic/README.md
create mode 100644 livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py
create mode 100644 livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/log.py
create mode 100644 livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/models.py
rename livekit-plugins/{livekit-plugins-browser/cef/src/helper_main_win.cpp => livekit-plugins-anthropic/livekit/plugins/anthropic/py.typed} (100%)
create mode 100644 livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/version.py
create mode 100644 livekit-plugins/livekit-plugins-anthropic/package.json
create mode 100644 livekit-plugins/livekit-plugins-anthropic/pyproject.toml
create mode 100644 livekit-plugins/livekit-plugins-anthropic/setup.py
rename livekit-plugins/livekit-plugins-browser/{cef => }/.clang-format (100%)
rename livekit-plugins/livekit-plugins-browser/{cef => }/.gitignore (100%)
create mode 100644 livekit-plugins/livekit-plugins-browser/CHANGELOG.md
rename livekit-plugins/livekit-plugins-browser/{cef => }/CMakeLists.txt (90%)
rename livekit-plugins/livekit-plugins-browser/{cef => }/LICENSE.txt (100%)
create mode 100644 livekit-plugins/livekit-plugins-browser/README.md
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/agents_python.cpp
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/agents_python.hpp
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/app.hpp
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/app_mac.mm
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.cpp
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/handler.cpp
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/handler.hpp
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcef-Info.plist
delete mode 100644 livekit-plugins/livekit-plugins-browser/cef/src/run_browser.py
rename livekit-plugins/livekit-plugins-browser/{cef => }/cmake/DownloadCEF.cmake (100%)
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/log.py
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc.py
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc_main.py
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proto.py
rename livekit-plugins/livekit-plugins-browser/{cef/src/utils.cpp => livekit/plugins/browser/py.typed} (100%)
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/resources/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/version.py
create mode 100644 livekit-plugins/livekit-plugins-browser/package.json
create mode 100644 livekit-plugins/livekit-plugins-browser/pyproject.toml
create mode 100644 livekit-plugins/livekit-plugins-browser/setup.py
create mode 100644 livekit-plugins/livekit-plugins-browser/src/.gitignore
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/CMakeLists.txt (90%)
create mode 100644 livekit-plugins/livekit-plugins-browser/src/agents_python.cpp
create mode 100644 livekit-plugins/livekit-plugins-browser/src/agents_python.hpp
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/app.cpp (50%)
create mode 100644 livekit-plugins/livekit-plugins-browser/src/app.hpp
create mode 100644 livekit-plugins/livekit-plugins-browser/src/app_mac.mm
create mode 100644 livekit-plugins/livekit-plugins-browser/src/browser_handle.cpp
create mode 100644 livekit-plugins/livekit-plugins-browser/src/browser_handle.hpp
create mode 100644 livekit-plugins/livekit-plugins-browser/src/dev_renderer.cpp
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/dev_renderer.hpp (62%)
create mode 100644 livekit-plugins/livekit-plugins-browser/src/dummy.cpp
create mode 100644 livekit-plugins/livekit-plugins-browser/src/gleq.h
create mode 100644 livekit-plugins/livekit-plugins-browser/src/handler.cpp
create mode 100644 livekit-plugins/livekit-plugins-browser/src/handler.hpp
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/helper_main_linux.cpp (100%)
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/helper_main_mac.mm (100%)
rename livekit-plugins/livekit-plugins-browser/{cef/src/utils.hpp => src/helper_main_win.cpp} (100%)
create mode 100644 livekit-plugins/livekit-plugins-browser/src/keyboard_codes.h
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/resources/lkcefapp-Info.plist (100%)
rename livekit-plugins/livekit-plugins-browser/{cef => }/src/resources/lkcefhelper-Info.plist (100%)
create mode 100644 livekit-plugins/livekit-plugins-browser/src/run_browser.py
create mode 100644 livekit-plugins/livekit-plugins-clova/README.md
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/common.py
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/constants.py
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/log.py
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/models.py
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/stt.py
create mode 100644 livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/version.py
create mode 100644 livekit-plugins/livekit-plugins-clova/pyproject.toml
create mode 100644 livekit-plugins/livekit-plugins-clova/setup.py
create mode 100644 livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/utils.py
create mode 100644 livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/README.md
create mode 100644 livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/assistant_llm.py
create mode 100644 livekit-plugins/livekit-plugins-playht/README.md
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/plugins/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/__init__.py
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/log.py
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/models.py
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/tts.py
create mode 100644 livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/version.py
create mode 100644 livekit-plugins/livekit-plugins-playht/package.json
create mode 100644 livekit-plugins/livekit-plugins-playht/pyproject.toml
create mode 100644 livekit-plugins/livekit-plugins-playht/setup.py
delete mode 100644 test.py
create mode 100644 tests/.gitignore

diff --git a/.changeset/cuddly-eels-sin.md b/.changeset/cuddly-eels-sin.md
deleted file mode 100644
index 64b87dd21..000000000
--- a/.changeset/cuddly-eels-sin.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"livekit-plugins-openai": patch
----
-
-add support for Ollama,
Perplexity, Fireworks, Octo, Together, and Groq LLMs through the OpenAI API
diff --git a/.changeset/five-planes-drum.md b/.changeset/five-planes-drum.md
deleted file mode 100644
index dd4bc1d65..000000000
--- a/.changeset/five-planes-drum.md
+++ /dev/null
@@ -1,7 +0,0 @@
----
-"livekit-agents": patch
-"livekit-plugins-cartesia": patch
----
-
-Switch Cartesia to a sentence tokenizer and keep the same context id throughout.
-Propagate segment_id through the basic sentence tokenizer
diff --git a/.changeset/itchy-ligers-exist.md b/.changeset/itchy-ligers-exist.md
deleted file mode 100644
index 0ff55363e..000000000
--- a/.changeset/itchy-ligers-exist.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"livekit-plugins-nltk": patch
----
-
-nltk: fix broken punkt download
diff --git a/.changeset/lazy-cups-cross.md b/.changeset/lazy-cups-cross.md
deleted file mode 100644
index 214469980..000000000
--- a/.changeset/lazy-cups-cross.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"livekit-agents": patch
----
-
-limit simultaneous process initialization
diff --git a/.changeset/moody-doors-poke.md b/.changeset/moody-doors-poke.md
new file mode 100644
index 000000000..ca70304ed
--- /dev/null
+++ b/.changeset/moody-doors-poke.md
@@ -0,0 +1,5 @@
+---
+"livekit-agents": patch
+---
+
+fix VoiceAssistant being stuck when interrupting before user speech is committed
diff --git a/.changeset/proud-birds-press.md b/.changeset/proud-birds-press.md
deleted file mode 100644
index d9b556918..000000000
--- a/.changeset/proud-birds-press.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"livekit-agents": patch
----
-
-voiceassistant: remove fade effect when interrupting #622
diff --git a/.changeset/red-taxis-smoke.md b/.changeset/red-taxis-smoke.md
deleted file mode 100644
index 506ac97ac..000000000
--- a/.changeset/red-taxis-smoke.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"livekit-agents": patch
----
-
-ipc improvements, fix slow shutdown & cleanup leaked resources
diff --git a/.changeset/shaggy-apes-matter.md b/.changeset/shaggy-apes-matter.md
deleted file mode 100644
index 38374e08b..000000000
--- a/.changeset/shaggy-apes-matter.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"livekit-plugins-deepgram": patch
----
-
-deepgram: fallback to nova-2-general when the language isn't supported
diff --git a/.changeset/tidy-years-refuse.md b/.changeset/tidy-years-refuse.md
new file mode 100644
index 000000000..5f22709af
--- /dev/null
+++ b/.changeset/tidy-years-refuse.md
@@ -0,0 +1,6 @@
+---
+"livekit-agents": patch
+"livekit-plugins-openai": patch
+---
+
+Fix function for OpenAI Assistants
diff --git a/.github/workflows/build-package.yml b/.github/workflows/build-package.yml
new file mode 100644
index 000000000..148271a88
--- /dev/null
+++ b/.github/workflows/build-package.yml
@@ -0,0 +1,98 @@
+name: Build package
+
+on:
+  workflow_call:
+    inputs:
+      package:
+        required: true
+        type: string
+      artifact_name:
+        required: true
+        type: string
+  workflow_dispatch:
+    inputs:
+      package:
+        description: 'Name of the package to build'
+        required: true
+        default: 'livekit-plugins-browser'
+      artifact_name:
+        description: 'Artifact name for the distribution package'
+        required: true
+        default: 'build-artifact'
+
+jobs:
+  build_plugins:
+    runs-on: ubuntu-latest
+    if: |
+      inputs.package == 'livekit-agents' ||
+      inputs.package == 'livekit-plugins-azure' ||
+      inputs.package == 'livekit-plugins-cartesia' ||
+      inputs.package == 'livekit-plugins-deepgram' ||
+      inputs.package == 'livekit-plugins-elevenlabs' ||
+      inputs.package == 'livekit-plugins-google' ||
+      inputs.package == 'livekit-plugins-minimal' ||
+      inputs.package == 'livekit-plugins-nltk' ||
+      inputs.package == 'livekit-plugins-openai' ||
+      inputs.package == 'livekit-plugins-rag' ||
+      inputs.package == 'livekit-plugins-silero' ||
+      inputs.package == 'livekit-plugins-anthropic'
+
+    defaults:
+      run:
+        working-directory: "${{ startsWith(inputs.package, 'livekit-plugin') && 'livekit-plugins/' || '' }}${{ inputs.package }}"
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.9"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build
+
+      - name: Build package
+        run: python -m build
+
+      - name: Upload distribution package
+        uses: actions/upload-artifact@v3
+        with:
+          name: ${{ inputs.artifact_name }}
+          path: "${{ startsWith(inputs.package, 'livekit-plugin') && 'livekit-plugins/' || '' }}${{ inputs.package }}/dist/"
+
+  build_browser:
+    if: inputs.package == 'livekit-plugins-browser'
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        os: [macos-14] # TODO(theomonnom): other platforms
+
+    defaults:
+      run:
+        working-directory: livekit-plugins/livekit-plugins-browser
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.9"
+
+      - name: Install cibuildwheel
+        run: |
+          python -m pip install --upgrade pip
+          pip install cibuildwheel
+
+      - name: Build wheels
+        run: cibuildwheel --output-dir dist
+        env:
+          CIBW_SKIP: pp* cp313-*
+          CIBW_BUILD_VERBOSITY: 3
+
+      - name: Upload distribution package
+        uses: actions/upload-artifact@v3
+        with:
+          name: ${{ inputs.artifact_name }}
+          path: livekit-plugins/livekit-plugins-browser/dist/
\ No newline at end of file
diff --git a/.github/workflows/check-types.yml b/.github/workflows/check-types.yml
index aa560e70c..927c9e2eb 100644
--- a/.github/workflows/check-types.yml
+++ b/.github/workflows/check-types.yml
@@ -40,7 +40,8 @@ jobs:
           ./livekit-plugins/livekit-plugins-elevenlabs \
           ./livekit-plugins/livekit-plugins-cartesia \
           ./livekit-plugins/livekit-plugins-rag \
-          ./livekit-plugins/livekit-plugins-azure
+          ./livekit-plugins/livekit-plugins-azure \
+          ./livekit-plugins/livekit-plugins-anthropic
 
       - name: Install stub packages
         run: |
@@ -67,4 +68,5 @@ jobs:
           -p livekit.plugins.elevenlabs \
           -p livekit.plugins.cartesia \
           -p livekit.plugins.rag \
-          -p livekit.plugins.azure
+          -p livekit.plugins.azure \
+          -p livekit.plugins.anthropic
diff --git a/.github/workflows/publish-package.yml b/.github/workflows/publish-package.yml
index 6f0686ccf..2b997895b 100644
--- a/.github/workflows/publish-package.yml
+++ b/.github/workflows/publish-package.yml
@@ -52,6 +52,7 @@ jobs:
           echo "exitcode=$?" >> $GITHUB_OUTPUT
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
       - name: Add changes
         if: ${{ steps.release_mode.outputs.exitcode == '0' }}
         uses: EndBug/add-and-commit@v9
@@ -79,38 +80,11 @@ jobs:
     strategy:
       matrix:
         package: ${{ fromJson(needs.bump.outputs.packages) }}
-    defaults:
-      run:
-        working-directory: "${{ startsWith(matrix.package.name, 'livekit-plugin') && 'livekit-plugins/' || '' }}${{ matrix.package.name }}"
-
-    runs-on: ubuntu-latest
-
-    steps:
-      - uses: actions/checkout@v4
-        with:
-          submodules: true
-          lfs: true
-        env:
-          GITHUB_TOKEN: ${{ secrets.CHANGESETS_PUSH_PAT }}
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: "3.9"
-
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install build
-
-      - name: Build package
-        run: python -m build
-
-      - name: Store the distribution packages
-        uses: actions/upload-artifact@v3
-        with:
-          name: python-package-distributions
-          path: "${{ startsWith(matrix.package.name, 'livekit-plugin') && 'livekit-plugins/' || '' }}${{ matrix.package.name }}/dist/"
+    uses: livekit/agents/.github/workflows/build-package.yml@main
+    with:
+      package: ${{ matrix.package.name }}
+      artifact_name: python-package-distributions
 
   publish:
     needs:
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
index 0b900ea92..9d6f73da0 100644
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -13,6 +13,11 @@ on:
 
 jobs:
   tests:
+    if: > # don't run tests for PRs on forks
+      ${{
+        !github.event.pull_request ||
+        github.event.pull_request.head.repo.full_name == github.repository
+      }}
     strategy:
       fail-fast: false
      matrix:
@@ -75,7 +80,8 @@ jobs:
           ./livekit-plugins/livekit-plugins-silero \
./livekit-plugins/livekit-plugins-elevenlabs \ ./livekit-plugins/livekit-plugins-cartesia \ - ./livekit-plugins/livekit-plugins-azure + ./livekit-plugins/livekit-plugins-azure \ + ./livekit-plugins/livekit-plugins-anthropic - name: Run tests shell: bash @@ -90,6 +96,7 @@ jobs: AZURE_SPEECH_KEY: ${{ secrets.AZURE_SPEECH_KEY }} AZURE_SPEECH_REGION: ${{ secrets.AZURE_SPEECH_REGION }} # nit: doesn't have to be secret GOOGLE_CREDENTIALS_JSON: ${{ secrets.GOOGLE_CREDENTIALS_JSON }} + ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} GOOGLE_APPLICATION_CREDENTIALS: google.json run: | echo $GOOGLE_CREDENTIALS_JSON > google.json diff --git a/README.md b/README.md index ce4b8bd7f..3204d2bab 100644 --- a/README.md +++ b/README.md @@ -61,6 +61,7 @@ The following plugins are available today: | Plugin | Features | | ---------------------------------------------------------------------------------- | ------------------------------- | +| [livekit-plugins-anthropic](https://pypi.org/project/livekit-plugins-anthropic/) | LLM | | [livekit-plugins-azure](https://pypi.org/project/livekit-plugins-azure/) | STT, TTS | | [livekit-plugins-cartesia](https://pypi.org/project/livekit-plugins-cartesia/) | TTS | | [livekit-plugins-deepgram](https://pypi.org/project/livekit-plugins-deepgram/) | STT | @@ -70,6 +71,38 @@ The following plugins are available today: | [livekit-plugins-openai](https://pypi.org/project/livekit-plugins-openai/) | LLM, STT, TTS | | [livekit-plugins-silero](https://pypi.org/project/livekit-plugins-silero/) | VAD | +## Using LLM models + +The Agents framework supports a wide range of LLMs and hosting providers. + +### OpenAI-compatible models + +Most LLM providers offer an OpenAI-compatible API, which can be used with the `livekit-plugins-openai` plugin.
+ +```python +from livekit.plugins.openai.llm import LLM +``` + +- OpenAI: `LLM(model="gpt-4o")` +- Azure: `LLM.with_azure(azure_endpoint="", azure_deployment="")` +- Cerebras: `LLM.with_cerebras(api_key="", model="")` +- Fireworks: `LLM.with_fireworks(api_key="", model="")` +- Groq: `LLM.with_groq(api_key="", model="")` +- OctoAI: `LLM.with_octo(api_key="", model="")` +- Ollama: `LLM.with_ollama(base_url="http://localhost:11434/v1", model="")` +- Perplexity: `LLM.with_perplexity(api_key="", model="")` +- TogetherAI: `LLM.with_together(api_key="", model="")` + +### Anthropic Claude + +Anthropic Claude can be used with the `livekit-plugins-anthropic` plugin. + +```python +from livekit.plugins.anthropic.llm import LLM + +myllm = LLM(model="claude-3-opus-20240229") +``` + ## Concepts - **Agent**: A function that defines the workflow of a programmable, server-side participant. This is your application code. @@ -153,7 +186,9 @@ class MyPlugin(Plugin): ``` +
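Since every provider in the list above speaks the same chat-completions wire format, switching providers conceptually amounts to swapping a base URL and an API key behind the same client. A framework-free sketch of that idea — only the Ollama URL comes from the list above; the Groq and Together endpoints shown here are illustrative assumptions, not taken from the plugin source:

```python
# Sketch: model OpenAI-compatible providers as (base_url, api_key, model) triples.
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfig:
    base_url: str
    model: str
    api_key: Optional[str] = None


# Ollama URL is from the README above; the other two are assumed endpoints.
PROVIDERS = {
    "ollama": "http://localhost:11434/v1",
    "groq": "https://api.groq.com/openai/v1",
    "together": "https://api.together.xyz/v1",
}


def openai_compatible(provider: str, model: str, api_key: Optional[str] = None) -> LLMConfig:
    """Build the connection config any OpenAI-compatible client would need."""
    try:
        base_url = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None
    return LLMConfig(base_url=base_url, model=model, api_key=api_key)


cfg = openai_compatible("ollama", model="llama3")
print(cfg.base_url)
```

The real plugin's `LLM.with_*` constructors play the same role as this lookup: they pin the base URL for you and leave model and credentials as parameters.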
+ diff --git a/examples/browser/browser_track.py b/examples/browser/browser_track.py new file mode 100644 index 000000000..998da7979 --- /dev/null +++ b/examples/browser/browser_track.py @@ -0,0 +1,55 @@ +import asyncio +import logging + +from dotenv import load_dotenv +from livekit import rtc +from livekit.agents import JobContext, WorkerOptions, cli +from livekit.plugins import browser + +WIDTH = 1920 +HEIGHT = 1080 + +load_dotenv() + + +async def entrypoint(job: JobContext): + await job.connect() + + ctx = browser.BrowserContext(dev_mode=True) + await ctx.initialize() + + page = await ctx.new_page(url="www.livekit.io") + + source = rtc.VideoSource(WIDTH, HEIGHT) + track = rtc.LocalVideoTrack.create_video_track("single-color", source) + options = rtc.TrackPublishOptions(source=rtc.TrackSource.SOURCE_CAMERA) + publication = await job.room.local_participant.publish_track(track, options) + logging.info("published track", extra={"track_sid": publication.sid}) + + @page.on("paint") + def on_paint(paint_data): + source.capture_frame(paint_data.frame) + + async def _test_cycle(): + urls = [ + "https://www.livekit.io", + "https://www.google.com", + ] + + i = 0 + async with ctx.playwright() as browser: + while True: + i += 1 + await asyncio.sleep(5) + defaultContext = browser.contexts[0] + defaultPage = defaultContext.pages[0] + try: + await defaultPage.goto(urls[i % len(urls)]) + except Exception: + logging.exception(f"failed to navigate to {urls[i % len(urls)]}") + + await _test_cycle() + + +if __name__ == "__main__": + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) diff --git a/examples/browser/standalone_app.py b/examples/browser/standalone_app.py new file mode 100644 index 000000000..fdc4bad04 --- /dev/null +++ b/examples/browser/standalone_app.py @@ -0,0 +1,3 @@ +from livekit.plugins import browser + +ctx = browser.BrowserContext(dev_mode=True) diff --git a/examples/minimal_worker.py b/examples/minimal_worker.py index aeca197cc..e3a9ed3b9 100644 --- 
a/examples/minimal_worker.py +++ b/examples/minimal_worker.py @@ -1,6 +1,6 @@ import logging -from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli +from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, WorkerType, cli logger = logging.getLogger("my-worker") logger.setLevel(logging.INFO) @@ -16,4 +16,6 @@ async def entrypoint(ctx: JobContext): if __name__ == "__main__": - cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) + # WorkerType.ROOM is the default worker type which will create an agent for every room. + # You can also use WorkerType.PUBLISHER to create a single agent for all participants that publish a track. + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, worker_type=WorkerType.ROOM)) diff --git a/examples/participant-entrypoint/README.md b/examples/participant-entrypoint/README.md new file mode 100644 index 000000000..249912907 --- /dev/null +++ b/examples/participant-entrypoint/README.md @@ -0,0 +1,30 @@ +# Participant Entrypoint Example + +This example shows how to run tasks when participants join. For example, a common use case is to fetch some external data based on the participant's attributes. + +## Run + +### Setup and activate a virtual env: + +`python -m venv venv` + +`source venv/bin/activate` + +### Set environment variables: + +```bash +export LIVEKIT_URL= +export LIVEKIT_API_KEY= +export LIVEKIT_API_SECRET= +``` + +### Install requirements: +`pip install -r requirements.txt` + +### Run the agent worker: + +`python participant_entrypoint.py dev` + +### Test with a LiveKit frontend: + +We've built [Agents Playground](https://agents-playground.livekit.io) so you don't have to build your own frontend while you iterate on your agent.
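The core of the participant-entrypoint pattern described in this README is a fan-out: every registered callback is started as its own task for every participant that joins, so the callbacks run concurrently and independently. A framework-free asyncio sketch of that dispatch — the `Participant` type and the `Dispatcher` class are illustrative stand-ins, not LiveKit APIs:

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List


@dataclass
class Participant:
    identity: str


EntryFn = Callable[[Participant], Awaitable[None]]


@dataclass
class Dispatcher:
    entrypoints: List[EntryFn] = field(default_factory=list)
    tasks: List["asyncio.Task"] = field(default_factory=list)

    def add_participant_entrypoint(self, fn: EntryFn) -> None:
        self.entrypoints.append(fn)

    def on_participant_connected(self, p: Participant) -> None:
        # one task per (callback, participant) pair, run concurrently
        for fn in self.entrypoints:
            self.tasks.append(asyncio.create_task(fn(p)))


async def main() -> List[str]:
    log: List[str] = []

    async def greet(p: Participant) -> None:
        log.append(f"greet:{p.identity}")

    async def fetch_profile(p: Participant) -> None:
        await asyncio.sleep(0)  # stand-in for a DB/API call
        log.append(f"profile:{p.identity}")

    d = Dispatcher()
    d.add_participant_entrypoint(greet)
    d.add_participant_entrypoint(fetch_profile)
    d.on_participant_connected(Participant("alice"))
    await asyncio.gather(*d.tasks)
    return log


print(asyncio.run(main()))
```

In the real example that follows, `ctx.add_participant_entrypoint` plays the role of `add_participant_entrypoint` here, and the framework invokes the fan-out for each joining participant after `ctx.connect`.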
diff --git a/examples/participant-entrypoint/participant_entrypoint.py b/examples/participant-entrypoint/participant_entrypoint.py new file mode 100644 index 000000000..5c8c38c69 --- /dev/null +++ b/examples/participant-entrypoint/participant_entrypoint.py @@ -0,0 +1,44 @@ +import asyncio +import logging + +from dotenv import load_dotenv +from livekit import rtc +from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli + +load_dotenv() + +logger = logging.getLogger("my-worker") +logger.setLevel(logging.INFO) + + +async def entrypoint(ctx: JobContext): + logger.info("starting entrypoint") + + async def participant_task_1(ctx: JobContext, p: rtc.RemoteParticipant): + # You can filter out participants you are not interested in + # if p.identity != "some_identity_of_interest": + # return + + logger.info(f"participant task 1 starting for {p.identity}") + # Do something with p.attributes, p.identity, p.metadata, etc. + # my_stuff = await fetch_stuff_from_my_db(p) + + # Do something + await asyncio.sleep(60) + logger.info(f"participant task done for {p.identity}") + + async def participant_task_2(ctx: JobContext, p: rtc.RemoteParticipant): + # multiple tasks can be run concurrently for each participant + logger.info(f"participant task 2 starting for {p.identity}") + await asyncio.sleep(10) + + # Add participant entrypoints before calling ctx.connect + ctx.add_participant_entrypoint(entrypoint_fnc=participant_task_1) + ctx.add_participant_entrypoint(entrypoint_fnc=participant_task_2) + + await ctx.connect(auto_subscribe=AutoSubscribe.SUBSCRIBE_ALL) + logger.info("connected to the room") + + +if __name__ == "__main__": + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) diff --git a/examples/participant-entrypoint/requirements.txt b/examples/participant-entrypoint/requirements.txt new file mode 100644 index 000000000..468a9e5d2 --- /dev/null +++ b/examples/participant-entrypoint/requirements.txt @@ -0,0 +1 @@ +livekit-agents>=0.9.0 diff --git 
a/examples/simple-color/agent.py b/examples/simple-color/agent.py index e64f5cfda..57fc99952 100644 --- a/examples/simple-color/agent.py +++ b/examples/simple-color/agent.py @@ -1,15 +1,17 @@ import asyncio import logging +import random +from dotenv import load_dotenv from livekit import rtc from livekit.agents import JobContext, WorkerOptions, cli +# Load environment variables +load_dotenv() + WIDTH = 640 HEIGHT = 480 -# change this color in dev mode and the agent will automatically update -COLOR = bytes([0, 255, 0, 255]) - async def entrypoint(job: JobContext): await job.connect() @@ -26,7 +28,12 @@ async def _draw_color(): while True: await asyncio.sleep(0.1) # 100ms - argb_frame[:] = COLOR * WIDTH * HEIGHT + # Create a new random color + r, g, b = [random.randint(0, 255) for _ in range(3)] + color = bytes([r, g, b, 255]) + + # Fill the frame with the new random color + argb_frame[:] = color * WIDTH * HEIGHT frame = rtc.VideoFrame(WIDTH, HEIGHT, rtc.VideoBufferType.RGBA, argb_frame) source.capture_frame(frame) diff --git a/examples/simple-color/requirements.txt b/examples/simple-color/requirements.txt index 0e6eb52ae..468a9e5d2 100644 --- a/examples/simple-color/requirements.txt +++ b/examples/simple-color/requirements.txt @@ -1 +1 @@ -livekit-agents>=0.8.5 +livekit-agents>=0.9.0 diff --git a/examples/speech-to-text/README.md b/examples/speech-to-text/README.md index 700497899..f468b601c 100644 --- a/examples/speech-to-text/README.md +++ b/examples/speech-to-text/README.md @@ -1,18 +1,14 @@ # Speech-to-text -This example shows how you can transcript real-time audio data into text. +This example shows realtime transcription from audio to text. -It uses Deepgram's STT API to transcript the audio data.
It can be switched to -other STT providers by changing this line: +It uses Deepgram's STT API, but supports other STT plugins by changing this line: ```python stt = deepgram.STT() ``` -All transcriptions are sent to clients in the room with LiveKit's transcription protocol. - -It's currently supported in the JS SDK and React Components. This will be made available for -all other SDKs in the coming weeks. +To render the transcriptions into your client application, refer to the [full documentation](https://docs.livekit.io/agents/build/transcriptions). ## Running the example diff --git a/examples/speech-to-text/deepgram_stt.py b/examples/speech-to-text/deepgram_stt.py index 6a8cc100e..24d770f68 100644 --- a/examples/speech-to-text/deepgram_stt.py +++ b/examples/speech-to-text/deepgram_stt.py @@ -1,6 +1,7 @@ import asyncio import logging +from dotenv import load_dotenv from livekit import rtc from livekit.agents import ( AutoSubscribe, @@ -12,6 +13,8 @@ ) from livekit.plugins import deepgram +load_dotenv() + logger = logging.getLogger("deepgram-stt-demo") logger.setLevel(logging.INFO) diff --git a/examples/speech-to-text/requirements.txt b/examples/speech-to-text/requirements.txt index 852095813..eb367925c 100644 --- a/examples/speech-to-text/requirements.txt +++ b/examples/speech-to-text/requirements.txt @@ -1,2 +1,2 @@ -livekit-agents>=0.8.5 -livekit-plugins-deepgram>=0.6.4 +livekit-agents>=0.9.0 +livekit-plugins-deepgram>=0.6.7 diff --git a/examples/text-to-speech/cartesia_tts.py b/examples/text-to-speech/cartesia_tts.py new file mode 100644 index 000000000..2f87ee975 --- /dev/null +++ b/examples/text-to-speech/cartesia_tts.py @@ -0,0 +1,43 @@ +import asyncio +import logging + +from dotenv import load_dotenv +from livekit import rtc +from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli +from livekit.plugins import cartesia + +load_dotenv() + +logger = logging.getLogger("cartesia-tts-demo") +logger.setLevel(logging.INFO) + + +async def 
entrypoint(job: JobContext): + logger.info("starting tts example agent") + + tts = cartesia.TTS( + speed="fastest", + emotion=["surprise:highest"], + ) + + source = rtc.AudioSource(tts.sample_rate, tts.num_channels) + track = rtc.LocalAudioTrack.create_audio_track("agent-mic", source) + options = rtc.TrackPublishOptions() + options.source = rtc.TrackSource.SOURCE_MICROPHONE + + await job.connect(auto_subscribe=AutoSubscribe.SUBSCRIBE_NONE) + publication = await job.room.local_participant.publish_track(track, options) + await publication.wait_for_subscription() + + logger.info('Saying "Hello!"') + async for output in tts.synthesize("Hello I hope you are having a great day."): + await source.capture_frame(output.frame) + + await asyncio.sleep(4) + logger.info('Saying "Goodbye."') + async for output in tts.synthesize("Goodbye I hope to see you again soon."): + await source.capture_frame(output.frame) + + +if __name__ == "__main__": + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) diff --git a/examples/text-to-speech/elevenlabs_tts.py b/examples/text-to-speech/elevenlabs_tts.py index 7f6180402..91e1bd7b5 100644 --- a/examples/text-to-speech/elevenlabs_tts.py +++ b/examples/text-to-speech/elevenlabs_tts.py @@ -2,6 +2,7 @@ import logging from typing import Optional +from dotenv import load_dotenv from livekit import rtc from livekit.agents import JobContext, WorkerOptions, cli from livekit.plugins import elevenlabs @@ -9,6 +10,8 @@ logger = logging.getLogger("elevenlabs-tts-demo") logger.setLevel(logging.INFO) +load_dotenv() + def _text_to_chunks(text: str) -> list[str]: """Split the text into chunks of 2, 3, and 4 words""" @@ -51,9 +54,9 @@ async def entrypoint(job: JobContext): options.source = rtc.TrackSource.SOURCE_MICROPHONE await job.connect() - await job.room.local_participant.publish_track(track, options) + publication = await job.room.local_participant.publish_track(track, options) + await publication.wait_for_subscription() - await asyncio.sleep(1) 
logger.info('Saying "Bonjour, comment allez-vous?"') async for output in tts_11labs.synthesize("Bonjour, comment allez-vous?"): await source.capture_frame(output.frame) diff --git a/examples/text-to-speech/openai_tts.py b/examples/text-to-speech/openai_tts.py index 0edcb3b7c..fce018309 100644 --- a/examples/text-to-speech/openai_tts.py +++ b/examples/text-to-speech/openai_tts.py @@ -1,10 +1,13 @@ import asyncio import logging +from dotenv import load_dotenv from livekit import rtc from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli from livekit.plugins import openai +load_dotenv() + logger = logging.getLogger("openai-tts-demo") logger.setLevel(logging.INFO) @@ -20,9 +23,9 @@ async def entrypoint(job: JobContext): options.source = rtc.TrackSource.SOURCE_MICROPHONE await job.connect(auto_subscribe=AutoSubscribe.SUBSCRIBE_NONE) - await job.room.local_participant.publish_track(track, options) + publication = await job.room.local_participant.publish_track(track, options) + await publication.wait_for_subscription() - await asyncio.sleep(1) logger.info('Saying "Hello!"') async for output in tts.synthesize("Hello!"): await source.capture_frame(output.frame) diff --git a/examples/text-to-speech/requirements.txt b/examples/text-to-speech/requirements.txt index 292d588ad..e81e20304 100644 --- a/examples/text-to-speech/requirements.txt +++ b/examples/text-to-speech/requirements.txt @@ -1,2 +1,4 @@ -livekit-agents>=0.8.5 -livekit-plugins-openai>=0.8.0 +livekit-agents>=0.9.0 +livekit-plugins-openai>=0.8.4 +livekit-plugins-cartesia>=0.4.2 +livekit-plugins-elevenlabs>=0.7.5 diff --git a/examples/text-to-speech/sync_tts_transcription.py b/examples/text-to-speech/sync_tts_transcription.py index 545247ccd..d7a349b56 100644 --- a/examples/text-to-speech/sync_tts_transcription.py +++ b/examples/text-to-speech/sync_tts_transcription.py @@ -2,6 +2,7 @@ import logging from typing import Optional +from dotenv import load_dotenv from livekit import rtc from 
livekit.agents import ( AutoSubscribe, @@ -13,6 +14,8 @@ ) from livekit.plugins import elevenlabs +load_dotenv() + logger = logging.getLogger("transcription-forwarding-demo") logger.setLevel(logging.INFO) @@ -27,14 +30,14 @@ async def entrypoint(ctx: JobContext): options = rtc.TrackPublishOptions(source=rtc.TrackSource.SOURCE_MICROPHONE) await ctx.connect(auto_subscribe=AutoSubscribe.SUBSCRIBE_NONE) - await ctx.room.local_participant.publish_track(track, options) + publication = await ctx.room.local_participant.publish_track(track, options) + await publication.wait_for_subscription() # start the transcription examples tts_forwarder = transcription.TTSSegmentsForwarder( room=ctx.room, participant=ctx.room.local_participant ) - await asyncio.sleep(2) await _eg_single_segment(tts_forwarder, tts_11labs, source) await asyncio.sleep(2) diff --git a/examples/voice-assistant/README.md b/examples/voice-assistant/README.md index d9b48ff82..6f7e176fb 100644 --- a/examples/voice-assistant/README.md +++ b/examples/voice-assistant/README.md @@ -1,15 +1,19 @@ -# Voice Assistant Example +# Voice Assistant Examples + +We have a few examples that show the various ways of using the VoiceAssistant class: -This example shows two usages of the VoiceAssistant class: - `minimal_assistant.py`: a basic conversational assistant -- `function_calling.py`: a voice assistant capable of obeying commands (turning on/off a mock room's lights) +- `function_calling_weather.py`: a weather assistant that calls an API endpoint to retrieve the weather +- `custom_pronunciation.py`: using the `before_tts_cb` hook to customize how TTS pronounces words +- `simple_rag`: a simple RAG assistant that answers questions by querying an embeddings index + +The demo assistants use: -- Deepgram for Speech-to-text -- OpenAI for LLM -- Elevenlabs for Text-to-speech +- OpenAI for LLM and Text-to-speech ## Run + Instructions for running the agents are identical; the following steps will
assume you are running `minimal_assistant.py` ### Setup and activate a virtual env: @@ -24,17 +28,13 @@ Instructions for running the two agents are identical, the following steps will export LIVEKIT_URL= export LIVEKIT_API_KEY= export LIVEKIT_API_SECRET= -export ELEVEN_API_KEY= export DEEPGRAM_API_KEY= export OPENAI_API_KEY= ``` ### Install requirements: -`pip install -r requirements.txt` - -### Download files (in this case, it downloads the model weights for Voice-activity-detection): -`python minimal_assistant.py download-files` +`pip install -r requirements.txt` ### Run the agent worker: diff --git a/examples/voice-assistant/custom_pronunciation.py b/examples/voice-assistant/custom_pronunciation.py new file mode 100644 index 000000000..e6ff7cd52 --- /dev/null +++ b/examples/voice-assistant/custom_pronunciation.py @@ -0,0 +1,49 @@ +from __future__ import annotations + +from typing import AsyncIterable + +from dotenv import load_dotenv +from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm, tokenize +from livekit.agents.voice_assistant import VoiceAssistant +from livekit.plugins import cartesia, deepgram, openai, silero + +load_dotenv() + + +async def entrypoint(ctx: JobContext): + initial_ctx = llm.ChatContext().append( + role="system", + text=( + "You are a voice assistant created by LiveKit. Your interface with users will be voice. " + "You should use short and concise responses, avoiding the use of unpronounceable punctuation."
+ ), + ) + + await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY) + + def _before_tts_cb(assistant: VoiceAssistant, text: str | AsyncIterable[str]): + # The TTS is incorrectly pronouncing "LiveKit", so we'll replace it with a phonetic + # spelling + return tokenize.utils.replace_words( + text=text, replacements={"livekit": r"<>"} + ) + + # for this example, we also boost the keyword "LiveKit" to make it more likely to be + # recognized by the STT + deepgram_stt = deepgram.STT(keywords=[("LiveKit", 3.5)]) + + assistant = VoiceAssistant( + vad=silero.VAD.load(), + stt=deepgram_stt, + llm=openai.LLM(), + tts=cartesia.TTS(), + chat_ctx=initial_ctx, + before_tts_cb=_before_tts_cb, + ) + assistant.start(ctx.room) + + await assistant.say("Hey, LiveKit is awesome!", allow_interruptions=True) + + +if __name__ == "__main__": + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) diff --git a/examples/voice-assistant/function_calling.py b/examples/voice-assistant/function_calling.py deleted file mode 100644 index 9392bc900..000000000 --- a/examples/voice-assistant/function_calling.py +++ /dev/null @@ -1,115 +0,0 @@ -import asyncio -import enum -import logging -from typing import Annotated - -from dotenv import load_dotenv -from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm -from livekit.agents.voice_assistant import VoiceAssistant -from livekit.plugins import deepgram, openai, silero - -load_dotenv() - -logger = logging.getLogger("function-calling-demo") -logger.setLevel(logging.INFO) - - -class Room(enum.Enum): - # ai_callable can understand enum types as a set of choices - # this is equivalent to: - # `Annotated[Room, llm.TypeInfo(choices=["bedroom", "living room", "kitchen", "bathroom", "office"])]` - BEDROOM = "bedroom" - LIVING_ROOM = "living room" - KITCHEN = "kitchen" - BATHROOM = "bathroom" - OFFICE = "office" - - -class AssistantFnc(llm.FunctionContext): - """ - The class defines a set of AI functions that the assistant
can execute. - """ - - def __init__(self) -> None: - super().__init__() - - # default state of the lights in each room - self._light_status = { - Room.BEDROOM: False, - Room.LIVING_ROOM: True, - Room.KITCHEN: True, - Room.BATHROOM: False, - Room.OFFICE: False, - } - - @property - def light_status(self): - return self._light_status - - # Simple demonstration of an AI function that can be called by the user with some arguments. - @llm.ai_callable(description="Turn on/off the lights in a room") - async def toggle_light( - self, - room: Annotated[Room, llm.TypeInfo(description="The specific room")], - status: bool, - ): - logger.info("toggle_light - room: %s status: %s", room, status) - self._light_status[room] = status - return f"Turned the lights in the {room} {'on' if status else 'off'}" - - -async def entrypoint(ctx: JobContext): - fnc_ctx = AssistantFnc() # create our fnc ctx instance - - async def _will_synthesize_assistant_reply( - assistant: VoiceAssistant, chat_ctx: llm.ChatContext - ): - # Inject the current state of the lights into the context of the LLM - chat_ctx = chat_ctx.copy() - chat_ctx.messages.append( - llm.ChatMessage( - content=( - "Current state of the lights:\n" - + "\n".join( - f"- {room}: {'on' if status else 'off'}" - for room, status in fnc_ctx.light_status.items() - ) - ), - role="system", - ) - ) - return assistant.llm.chat(chat_ctx=chat_ctx, fnc_ctx=assistant.fnc_ctx) - - initial_chat_ctx = llm.ChatContext() - initial_chat_ctx.messages.append( - llm.ChatMessage( - content=( - "You are a home assistant created by LiveKit. Your interface with users will be voice. " - "You should use short and concise responses, and avoiding usage of unpronouncable punctuation. 
" - ), - role="system", - ) - ) - - assistant = VoiceAssistant( - vad=silero.VAD.load(), - stt=deepgram.STT(), - llm=openai.LLM(), - tts=openai.TTS(), - fnc_ctx=fnc_ctx, - chat_ctx=initial_chat_ctx, - will_synthesize_assistant_reply=_will_synthesize_assistant_reply, - ) - - await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY) - - # Start the assistant. This will automatically publish a microphone track and listen to the first participant - # it finds in the current room. If you need to specify a particular participant, use the participant parameter. - assistant.start(ctx.room) - - await asyncio.sleep(2) - await assistant.say("Hey, how can I help you today?") - - -if __name__ == "__main__": - cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) diff --git a/examples/voice-assistant/function_calling_weather.py b/examples/voice-assistant/function_calling_weather.py new file mode 100644 index 000000000..82155cce1 --- /dev/null +++ b/examples/voice-assistant/function_calling_weather.py @@ -0,0 +1,85 @@ +import logging +from typing import Annotated + +import aiohttp +from dotenv import load_dotenv +from livekit.agents import ( + AutoSubscribe, + JobContext, + JobProcess, + WorkerOptions, + cli, + llm, +) +from livekit.agents.voice_assistant import VoiceAssistant +from livekit.plugins import deepgram, openai, silero + +load_dotenv() + +logger = logging.getLogger("weather-demo") +logger.setLevel(logging.INFO) + + +class AssistantFnc(llm.FunctionContext): + """ + The class defines a set of LLM functions that the assistant can execute. + """ + + @llm.ai_callable() + async def get_weather( + self, + location: Annotated[ + str, llm.TypeInfo(description="The location to get the weather for") + ], + ): + """Called when the user asks about the weather. 
This function will return the weather for the given location.""" + logger.info(f"getting weather for {location}") + url = f"https://wttr.in/{location}?format=%C+%t" + async with aiohttp.ClientSession() as session: + async with session.get(url) as response: + if response.status == 200: + weather_data = await response.text() + # response from the function call is returned to the LLM + return f"The weather in {location} is {weather_data}." + else: + raise Exception(f"Failed to get weather data, status code: {response.status}") + + +def prewarm_process(proc: JobProcess): + # preload silero VAD in memory to speed up session start + proc.userdata["vad"] = silero.VAD.load() + + +async def entrypoint(ctx: JobContext): + await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY) + fnc_ctx = AssistantFnc()  # create our fnc ctx instance + initial_chat_ctx = llm.ChatContext().append( + text=( + "You are a weather assistant created by LiveKit. Your interface with users will be voice. " + "You will provide weather information for a given location." + ), + role="system", + ) + participant = await ctx.wait_for_participant() + assistant = VoiceAssistant( + vad=ctx.proc.userdata["vad"], + stt=deepgram.STT(), + llm=openai.LLM(), + tts=openai.TTS(), + fnc_ctx=fnc_ctx, + chat_ctx=initial_chat_ctx, + ) + # Start the assistant. This will automatically publish a microphone track and listen to the participant. + assistant.start(ctx.room, participant) + await assistant.say( + "Hello from the weather station. Would you like to know the weather? If so, tell me your location."
+ ) + + +if __name__ == "__main__": + cli.run_app( + WorkerOptions( + entrypoint_fnc=entrypoint, + prewarm_fnc=prewarm_process, + ), + ) diff --git a/examples/voice-assistant/minimal_assistant.py b/examples/voice-assistant/minimal_assistant.py index 35e0dee8e..c1aec2a44 100644 --- a/examples/voice-assistant/minimal_assistant.py +++ b/examples/voice-assistant/minimal_assistant.py @@ -1,12 +1,25 @@ import asyncio +import logging from dotenv import load_dotenv from livekit import rtc -from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm +from livekit.agents import ( + AutoSubscribe, + JobContext, + JobProcess, + WorkerOptions, + cli, + llm, +) from livekit.agents.voice_assistant import VoiceAssistant from livekit.plugins import deepgram, openai, silero load_dotenv() +logger = logging.getLogger("voice-assistant") + + +def prewarm(proc: JobProcess): + proc.userdata["vad"] = silero.VAD.load() async def entrypoint(ctx: JobContext): @@ -18,16 +31,27 @@ async def entrypoint(ctx: JobContext): ), ) + logger.info(f"connecting to room {ctx.room.name}") await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY) + # wait for the first participant to connect + participant = await ctx.wait_for_participant() + logger.info(f"starting voice assistant for participant {participant.identity}") + + dg_model = "nova-2-general" + if participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP: + # use a model optimized for telephony + dg_model = "nova-2-phonecall" + assistant = VoiceAssistant( - vad=silero.VAD.load(), - stt=deepgram.STT(), + vad=ctx.proc.userdata["vad"], + stt=deepgram.STT(model=dg_model), llm=openai.LLM(), tts=openai.TTS(), chat_ctx=initial_ctx, ) - assistant.start(ctx.room) + + assistant.start(ctx.room, participant) # listen to incoming chat messages, only required if you'd like the agent to # answer incoming messages from Chat @@ -44,9 +68,8 @@ def on_chat_received(msg: rtc.ChatMessage): if msg.message: 
asyncio.create_task(answer_from_text(msg.message)) - await asyncio.sleep(1) await assistant.say("Hey, how can I help you today?", allow_interruptions=True) if __name__ == "__main__": - cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm)) diff --git a/examples/voice-assistant/requirements.txt b/examples/voice-assistant/requirements.txt index 1c92c23ae..7071396dc 100644 --- a/examples/voice-assistant/requirements.txt +++ b/examples/voice-assistant/requirements.txt @@ -1,5 +1,6 @@ -livekit-agents>=0.8.5 -livekit-plugins-openai>=0.8.0 -livekit-plugins-deepgram>=0.6.4 -livekit-plugins-silero>=0.6.3 -python-dotenv~=1.0 \ No newline at end of file +livekit-agents>=0.9.0 +livekit-plugins-openai>=0.8.4 +livekit-plugins-deepgram>=0.6.7 +livekit-plugins-silero>=0.6.4 +python-dotenv~=1.0 +aiofile~=3.8.8 diff --git a/examples/voice-assistant/save_chatctx.py b/examples/voice-assistant/save_chatctx.py new file mode 100644 index 000000000..d6b1b6ac6 --- /dev/null +++ b/examples/voice-assistant/save_chatctx.py @@ -0,0 +1,84 @@ +import asyncio +from datetime import datetime + +from aiofile import async_open as open +from dotenv import load_dotenv +from livekit import rtc +from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm +from livekit.agents.voice_assistant import VoiceAssistant +from livekit.plugins import deepgram, openai, silero + +load_dotenv() + + +async def entrypoint(ctx: JobContext): + initial_ctx = llm.ChatContext().append( + role="system", + text=( + "You are a voice assistant created by LiveKit. Your interface with users will be voice. " + "You should use short and concise responses, avoiding the use of unpronounceable punctuation."
+ ), + ) + + await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY) + + assistant = VoiceAssistant( + vad=silero.VAD.load(), + stt=deepgram.STT(), + llm=openai.LLM(), + tts=openai.TTS(), + chat_ctx=initial_ctx, + ) + assistant.start(ctx.room) + + # listen to incoming chat messages, only required if you'd like the agent to + # answer incoming messages from Chat + chat = rtc.ChatManager(ctx.room) + + async def answer_from_text(txt: str): + chat_ctx = assistant.chat_ctx.copy() + chat_ctx.append(role="user", text=txt) + stream = assistant.llm.chat(chat_ctx=chat_ctx) + await assistant.say(stream) + + @chat.on("message_received") + def on_chat_received(msg: rtc.ChatMessage): + if msg.message: + asyncio.create_task(answer_from_text(msg.message)) + + log_queue = asyncio.Queue() + + @assistant.on("user_speech_committed") + def on_user_speech_committed(msg: llm.ChatMessage): + # convert string lists to strings, drop images + if isinstance(msg.content, list): + msg.content = "\n".join( + "[image]" if isinstance(x, llm.ChatImage) else x for x in msg.content + ) + log_queue.put_nowait(f"[{datetime.now()}] USER:\n{msg.content}\n\n") + + @assistant.on("agent_speech_committed") + def on_agent_speech_committed(msg: llm.ChatMessage): + log_queue.put_nowait(f"[{datetime.now()}] AGENT:\n{msg.content}\n\n") + + async def write_transcription(): + async with open("transcriptions.log", "w") as f: + while True: + msg = await log_queue.get() + if msg is None: + break + await f.write(msg) + + write_task = asyncio.create_task(write_transcription()) + + async def finish_queue(): + log_queue.put_nowait(None) + await write_task + + ctx.add_shutdown_callback(finish_queue) + + await assistant.say("Hey, how can I help you today?", allow_interruptions=True) + + +if __name__ == "__main__": + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) diff --git a/examples/voice-assistant/simple-rag/assistant.py b/examples/voice-assistant/simple-rag/assistant.py index 84d9aa6ae..1bbcda056 100644 ---
a/examples/voice-assistant/simple-rag/assistant.py +++ b/examples/voice-assistant/simple-rag/assistant.py @@ -1,4 +1,3 @@ -import asyncio import pickle from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm @@ -13,9 +12,10 @@ async def entrypoint(ctx: JobContext): - async def _will_synthesize_assistant_answer( - assistant: VoiceAssistant, chat_ctx: llm.ChatContext - ): + async def _enrich_with_rag(assistant: VoiceAssistant, chat_ctx: llm.ChatContext): + # locate the last user message and use it to query the RAG model + # to get the most relevant paragraph + # then provide that as additional context to the LLM user_msg = chat_ctx.messages[-1] user_embedding = await openai.create_embeddings( input=[user_msg.content], @@ -28,7 +28,6 @@ async def _will_synthesize_assistant_answer( user_msg.content = ( "Context:\n" + paragraph + "\n\nUser question: " + user_msg.content ) - return assistant.llm.chat(chat_ctx=chat_ctx, fnc_ctx=assistant.fnc_ctx) initial_ctx = llm.ChatContext().append( role="system", @@ -47,12 +46,11 @@ async def _will_synthesize_assistant_answer( stt=deepgram.STT(), llm=openai.LLM(), tts=openai.TTS(), - will_synthesize_assistant_reply=_will_synthesize_assistant_answer, + before_llm_cb=_enrich_with_rag, ) assistant.start(ctx.room) - await asyncio.sleep(1) await assistant.say("Hey, how can I help you today?", allow_interruptions=True) diff --git a/livekit-agents/CHANGELOG.md b/livekit-agents/CHANGELOG.md index f56c62696..698682dc5 100644 --- a/livekit-agents/CHANGELOG.md +++ b/livekit-agents/CHANGELOG.md @@ -1,5 +1,131 @@ # livekit-agents +## 0.9.0 + +### Minor Changes + +- rename voice_assistant.state to lk.agent.state - [#772](https://github.com/livekit/agents/pull/772) ([@bcherry](https://github.com/bcherry)) + +### Patch Changes + +- bump rtc - [#782](https://github.com/livekit/agents/pull/782) ([@nbsp](https://github.com/nbsp)) + +- improve graceful shutdown - [#756](https://github.com/livekit/agents/pull/756) 
([@theomonnom](https://github.com/theomonnom)) + +- avoid returning tiny frames from TTS - [#747](https://github.com/livekit/agents/pull/747) ([@theomonnom](https://github.com/theomonnom)) + +- windows: default to threaded executor & fix dev mode - [#755](https://github.com/livekit/agents/pull/755) ([@theomonnom](https://github.com/theomonnom)) + +- 11labs: send phoneme in one entire xml chunk - [#766](https://github.com/livekit/agents/pull/766) ([@theomonnom](https://github.com/theomonnom)) + +- fix: process not starting if num_idle_processes is zero - [#763](https://github.com/livekit/agents/pull/763) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: avoid tiny frames on playout - [#750](https://github.com/livekit/agents/pull/750) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: expose turn_completion_delay - [#752](https://github.com/livekit/agents/pull/752) ([@theomonnom](https://github.com/theomonnom)) + +- limit concurrent process init to 1 - [#751](https://github.com/livekit/agents/pull/751) ([@theomonnom](https://github.com/theomonnom)) + +- Add typing-extensions as a dependency - [#778](https://github.com/livekit/agents/pull/778) ([@keepingitneil](https://github.com/keepingitneil)) + +- Allow setting LLM temperature with VoiceAssistant - [#741](https://github.com/livekit/agents/pull/741) ([@davidzhao](https://github.com/davidzhao)) + +- better dev defaults - [#762](https://github.com/livekit/agents/pull/762) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: allow to cancel llm generation inside before_llm_cb - [#753](https://github.com/livekit/agents/pull/753) ([@theomonnom](https://github.com/theomonnom)) + +- use os.exit to exit forcefully - [#770](https://github.com/livekit/agents/pull/770) ([@theomonnom](https://github.com/theomonnom)) + +## 0.8.12 + +### Patch Changes + +- tts_forwarder: don't raise inside mark_{audio,text}_segment_end when nothing was pushed - 
[#730](https://github.com/livekit/agents/pull/730) ([@theomonnom](https://github.com/theomonnom)) + +## 0.8.11 + +### Patch Changes + +- improve gracefully_cancel logic - [#720](https://github.com/livekit/agents/pull/720) ([@theomonnom](https://github.com/theomonnom)) + +- Make ctx.room.name available prior to connection - [#716](https://github.com/livekit/agents/pull/716) ([@davidzhao](https://github.com/davidzhao)) + +- ipc: add threaded job runner - [#684](https://github.com/livekit/agents/pull/684) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: add VoiceAssistantState - [#654](https://github.com/livekit/agents/pull/654) ([@lukasIO](https://github.com/lukasIO)) + +- add JobContext.wait_for_participant - [#712](https://github.com/livekit/agents/pull/712) ([@theomonnom](https://github.com/theomonnom)) + +- fix non pickleable log - [#691](https://github.com/livekit/agents/pull/691) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: skip speech initialization if interrupted - [#715](https://github.com/livekit/agents/pull/715) ([@theomonnom](https://github.com/theomonnom)) + +- bump required livekit version to 0.15.2 - [#722](https://github.com/livekit/agents/pull/722) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: add will_synthesize_assistant_speech - [#706](https://github.com/livekit/agents/pull/706) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: fix mark_audio_segment_end with no audio data - [#719](https://github.com/livekit/agents/pull/719) ([@theomonnom](https://github.com/theomonnom)) + +## 0.8.10 + +### Patch Changes + +- Pass JobContext to participant entrypoint function - [#694](https://github.com/livekit/agents/pull/694) ([@davidzhao](https://github.com/davidzhao)) + +- voiceassistant: keep punctuations when sending agent transcription - [#648](https://github.com/livekit/agents/pull/648) ([@theomonnom](https://github.com/theomonnom)) + +## 0.8.9 + +### Patch Changes + +- 
Introduce easy api for starting tasks for remote participants - [#679](https://github.com/livekit/agents/pull/679) ([@keepingitneil](https://github.com/keepingitneil)) + +- update livekit to 0.14.0 and await tracksubscribed - [#678](https://github.com/livekit/agents/pull/678) ([@nbsp](https://github.com/nbsp)) + +## 0.8.8 + +### Patch Changes + +- fix uninitialized SpeechHandle error on interruption - [#665](https://github.com/livekit/agents/pull/665) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: avoid stacking assistant replies when allow_interruptions=False - [#667](https://github.com/livekit/agents/pull/667) ([@theomonnom](https://github.com/theomonnom)) + +- fix: disconnect event may now have some arguments - [#668](https://github.com/livekit/agents/pull/668) ([@theomonnom](https://github.com/theomonnom)) + +- Add ServerMessage.termination handler - [#635](https://github.com/livekit/agents/pull/635) ([@nbsp](https://github.com/nbsp)) + +## 0.8.7 + +### Patch Changes + +- voiceassistant: fix llm not having the full chat context on bad interruption timing - [#659](https://github.com/livekit/agents/pull/659) ([@theomonnom](https://github.com/theomonnom)) + +## 0.8.6 + +### Patch Changes + +- voiceassistant: fix will_synthesize_assistant_reply race - [#638](https://github.com/livekit/agents/pull/638) ([@theomonnom](https://github.com/theomonnom)) + +- Switch Cartesia to a sentence tokenizer and keep the same context id throughout. 
- [#608](https://github.com/livekit/agents/pull/608) ([@keepingitneil](https://github.com/keepingitneil)) + Propagate segment_id through the basic sentence tokenizer + +- silero: adjust vad activation threshold - [#639](https://github.com/livekit/agents/pull/639) ([@theomonnom](https://github.com/theomonnom)) + +- limit simultaneous process initialization - [#621](https://github.com/livekit/agents/pull/621) ([@theomonnom](https://github.com/theomonnom)) + +- voiceassistant: remove fade effect when interrupting #622 - [#623](https://github.com/livekit/agents/pull/623) ([@theomonnom](https://github.com/theomonnom)) + +- ipc improvements, fix slow shutdown & cleanup leaked resources - [#607](https://github.com/livekit/agents/pull/607) ([@theomonnom](https://github.com/theomonnom)) + +- ipc: use our own duplex instead of mp.Queue - [#634](https://github.com/livekit/agents/pull/634) ([@theomonnom](https://github.com/theomonnom)) + +- Support OpenAI Assistants API as a beta feature under `livekit.plugins.openai.beta` - [#601](https://github.com/livekit/agents/pull/601) ([@keepingitneil](https://github.com/keepingitneil)) + Add \_metadata to ChatCtx and ChatMessage which can be used (in the case of OpenAI assistants) for bookkeeping to sync local state with remote OpenAI state + +- llm: fix optional arguments & non-hashable list - [#637](https://github.com/livekit/agents/pull/637) ([@theomonnom](https://github.com/theomonnom)) + +- silero: fix vad padding & static audio - [#631](https://github.com/livekit/agents/pull/631) ([@theomonnom](https://github.com/theomonnom)) + ## 0.8.5 ### Patch Changes diff --git a/livekit-agents/livekit/agents/__init__.py b/livekit-agents/livekit/agents/__init__.py index c3c168541..ce05f97f0 100644 --- a/livekit-agents/livekit/agents/__init__.py +++ b/livekit-agents/livekit/agents/__init__.py @@ -13,19 +13,25 @@ # limitations under the License. from . 
import ipc, llm, stt, tokenize, transcription, tts, utils, vad, voice_assistant -from .job import AutoSubscribe, JobContext, JobProcess, JobRequest +from .job import AutoSubscribe, JobContext, JobExecutorType, JobProcess, JobRequest from .plugin import Plugin +from .proto import ATTR_AGENT_STATE, AgentState from .version import __version__ -from .worker import Worker, WorkerOptions +from .worker import Worker, WorkerOptions, WorkerPermissions, WorkerType __all__ = [ "__version__", "Worker", "WorkerOptions", + "WorkerType", + "WorkerPermissions", "JobProcess", "JobContext", "JobRequest", + "JobExecutorType", "AutoSubscribe", + "AgentState", + "ATTR_AGENT_STATE", "Plugin", "ipc", "stt", diff --git a/livekit-agents/livekit/agents/cli/cli.py b/livekit-agents/livekit/agents/cli/cli.py index ed5c1a76b..54de0e712 100644 --- a/livekit-agents/livekit/agents/cli/cli.py +++ b/livekit-agents/livekit/agents/cli/cli.py @@ -1,5 +1,4 @@ import asyncio -import functools import pathlib import signal import sys @@ -15,7 +14,11 @@ from .log import setup_logging -def shared_args(func): +def run_app(opts: WorkerOptions) -> None: + """Run the CLI to interact with the worker""" + cli = click.Group() + + @cli.command(help="Start the worker in production mode.") @click.option( "--log-level", default="INFO", @@ -39,83 +42,6 @@ def shared_args(func): envvar="LIVEKIT_API_SECRET", help="LiveKit server or Cloud project's API secret", ) - @functools.wraps(func) - def wrapper(*args, **kwargs): - return func(*args, **kwargs) - - return wrapper - - -def shared_dev_args(func): - @click.option( - "--asyncio-debug/--no-asyncio-debug", - default=False, - help="Enable debugging feature of asyncio", - ) - @click.option( - "--watch/--no-watch", - default=True, - help="Watch for changes in the current directory and plugins in editable mode", - ) - @functools.wraps(func) - def wrapper(*args, **kwargs): - return func(*args, **kwargs) - - return wrapper - - -def _run_dev( - opts: WorkerOptions, - log_level: 
str, - url: str, - api_key: str, - api_secret: str, - asyncio_debug: bool, - watch: bool, - room: str = "", - participant_identity: str = "", -): - opts.ws_url = url or opts.ws_url - opts.api_key = api_key or opts.api_key - opts.api_secret = api_secret or opts.api_secret - args = proto.CliArgs( - opts=opts, - log_level=log_level, - production=False, - asyncio_debug=asyncio_debug, - watch=watch, - drain_timeout=0, - room=room, - participant_identity=participant_identity, - ) - - if watch: - from .watcher import WatchServer - - setup_logging(log_level, args.production) - main_file = pathlib.Path(sys.argv[0]).parent - - async def _run_loop(): - server = WatchServer( - run_worker, main_file, args, loop=asyncio.get_event_loop() - ) - await server.run() - - try: - asyncio.run(_run_loop()) - except KeyboardInterrupt: - pass - else: - run_worker(args) - - -def run_app(opts: WorkerOptions) -> None: - """Run the CLI to interact with the worker""" - - cli = click.Group() - - @cli.command(help="Start the worker in production mode.") - @shared_args @click.option( "--drain-timeout", default=60, @@ -130,7 +56,7 @@ def start( args = proto.CliArgs( opts=opts, log_level=log_level, - production=True, + devmode=False, asyncio_debug=False, watch=False, drain_timeout=drain_timeout, @@ -138,8 +64,39 @@ def start( run_worker(args) @cli.command(help="Start the worker in development mode") - @shared_args - @shared_dev_args + @click.option( + "--log-level", + default="DEBUG", + type=click.Choice( + ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], case_sensitive=False + ), + help="Set the logging level", + ) + @click.option( + "--url", + envvar="LIVEKIT_URL", + help="LiveKit server or Cloud project's websocket URL", + ) + @click.option( + "--api-key", + envvar="LIVEKIT_API_KEY", + help="LiveKit server or Cloud project's API key", + ) + @click.option( + "--api-secret", + envvar="LIVEKIT_API_SECRET", + help="LiveKit server or Cloud project's API secret", + ) + @click.option( + 
"--asyncio-debug/--no-asyncio-debug", + default=False, + help="Enable debugging feature of asyncio", + ) + @click.option( + "--watch/--no-watch", + default=True, + help="Watch for changes in the current directory and plugins in editable mode", + ) def dev( log_level: str, url: str, @@ -151,8 +108,39 @@ def dev( _run_dev(opts, log_level, url, api_key, api_secret, asyncio_debug, watch) @cli.command(help="Connect to a specific room") - @shared_args - @shared_dev_args + @click.option( + "--log-level", + default="DEBUG", + type=click.Choice( + ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], case_sensitive=False + ), + help="Set the logging level", + ) + @click.option( + "--url", + envvar="LIVEKIT_URL", + help="LiveKit server or Cloud project's websocket URL", + ) + @click.option( + "--api-key", + envvar="LIVEKIT_API_KEY", + help="LiveKit server or Cloud project's API key", + ) + @click.option( + "--api-secret", + envvar="LIVEKIT_API_SECRET", + help="LiveKit server or Cloud project's API secret", + ) + @click.option( + "--asyncio-debug/--no-asyncio-debug", + default=False, + help="Enable debugging feature of asyncio", + ) + @click.option( + "--watch/--no-watch", + default=True, + help="Watch for changes in the current directory and plugins in editable mode", + ) @click.option("--room", help="Room name to connect to", required=True) @click.option( "--participant-identity", help="Participant identity (JobType.JT_PUBLISHER)" @@ -179,10 +167,10 @@ def connect( participant_identity, ) - @cli.command(help="Download plugin dependency files (i.e. 
model weights, ...)") + @cli.command(help="Download plugin dependency files") @click.option( "--log-level", - default="INFO", + default="DEBUG", type=click.Choice( ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], case_sensitive=False ), @@ -199,14 +187,56 @@ def download_files(log_level: str) -> None: cli() -def run_worker(args: proto.CliArgs) -> None: - class Shutdown(SystemExit): - pass +def _run_dev( + opts: WorkerOptions, + log_level: str, + url: str, + api_key: str, + api_secret: str, + asyncio_debug: bool, + watch: bool, + room: str = "", + participant_identity: str = "", +): + opts.ws_url = url or opts.ws_url + opts.api_key = api_key or opts.api_key + opts.api_secret = api_secret or opts.api_secret + args = proto.CliArgs( + opts=opts, + log_level=log_level, + devmode=True, + asyncio_debug=asyncio_debug, + watch=watch, + drain_timeout=0, + room=room, + participant_identity=participant_identity, + ) + + if watch: + from .watcher import WatchServer + + setup_logging(log_level, args.devmode) + main_file = pathlib.Path(sys.argv[0]).parent - setup_logging(args.log_level, args.production) + async def _run_loop(): + server = WatchServer( + run_worker, main_file, args, loop=asyncio.get_event_loop() + ) + await server.run() + + try: + asyncio.run(_run_loop()) + except KeyboardInterrupt: + pass + else: + run_worker(args) + + +def run_worker(args: proto.CliArgs) -> None: + setup_logging(args.log_level, args.devmode) loop = asyncio.get_event_loop() - worker = Worker(args.opts, loop=loop) + worker = Worker(args.opts, devmode=args.devmode, loop=loop) loop.set_debug(args.asyncio_debug) loop.slow_callback_duration = 0.1 # 100ms @@ -222,7 +252,7 @@ def _connect_on_register(worker_id: str, server_info: models.ServerInfo): try: def _signal_handler(): - raise Shutdown + raise KeyboardInterrupt for sig in (signal.SIGINT, signal.SIGTERM): loop.add_signal_handler(sig, _signal_handler) @@ -249,16 +279,22 @@ async def _worker_run(worker: Worker) -> None: main_task = 
loop.create_task(_worker_run(worker), name="agent_runner") try: loop.run_until_complete(main_task) - except (Shutdown, KeyboardInterrupt): + except KeyboardInterrupt: pass - if args.production: - loop.run_until_complete(worker.drain(timeout=args.drain_timeout)) + try: + if not args.devmode: + loop.run_until_complete(worker.drain(timeout=args.drain_timeout)) - loop.run_until_complete(worker.aclose()) + loop.run_until_complete(worker.aclose()) + + if watch_client: + loop.run_until_complete(watch_client.aclose()) + except KeyboardInterrupt: + logger.warning("exiting forcefully") + import os - if watch_client: - loop.run_until_complete(watch_client.aclose()) + os._exit(1) # TODO(theomonnom): add aclose(force=True) in worker finally: try: tasks = asyncio.all_tasks(loop) diff --git a/livekit-agents/livekit/agents/cli/log.py b/livekit-agents/livekit/agents/cli/log.py index 520eb71ea..2223869a0 100644 --- a/livekit-agents/livekit/agents/cli/log.py +++ b/livekit-agents/livekit/agents/cli/log.py @@ -11,6 +11,15 @@ from ..plugin import Plugin +# noisy loggers are set to warn by default +NOISY_LOGGERS = [ + "httpx", + "httpcore", + "openai", + "livekit", + "watchfiles", +] + # skip default LogRecord attributes # http://docs.python.org/library/logging.html#logrecord-attributes _RESERVED_ATTRS: Tuple[str, ...] 
= ( @@ -92,6 +101,7 @@ def format(self, record: logging.LogRecord) -> str: """Formats a log record and serializes to json""" message_dict: Dict[str, Any] = {} message_dict["level"] = record.levelname + message_dict["name"] = record.name if isinstance(record.msg, dict): message_dict = record.msg @@ -180,10 +190,10 @@ def formatMessage(self, record: logging.LogRecord) -> str: return msg + self._esc_codes["esc_reset"] -def setup_logging(log_level: str, production: bool = True) -> None: +def setup_logging(log_level: str, devmode: bool) -> None: handler = logging.StreamHandler() - if not production: + if devmode: # colorful logs for dev (improves readability) colored_formatter = ColoredFormatter( "%(asctime)s - %(esc_levelcolor)s%(levelname)-4s%(esc_reset)s %(name)s - %(message)s %(extra)s" @@ -196,9 +206,12 @@ def setup_logging(log_level: str, production: bool = True) -> None: root = logging.getLogger() root.addHandler(handler) + root.setLevel(log_level) - if root.level == logging.NOTSET: - root.setLevel(logging.WARN) + for noisy_logger in NOISY_LOGGERS: + logger = logging.getLogger(noisy_logger) + if logger.level == logging.NOTSET: + logger.setLevel(logging.WARN) from ..log import logger diff --git a/livekit-agents/livekit/agents/cli/proto.py b/livekit-agents/livekit/agents/cli/proto.py index cc278445f..f03b5f669 100644 --- a/livekit-agents/livekit/agents/cli/proto.py +++ b/livekit-agents/livekit/agents/cli/proto.py @@ -16,7 +16,7 @@ class CliArgs: opts: WorkerOptions log_level: str - production: bool + devmode: bool asyncio_debug: bool watch: bool drain_timeout: int diff --git a/livekit-agents/livekit/agents/cli/watcher.py b/livekit-agents/livekit/agents/cli/watcher.py index acdf49480..1be355922 100644 --- a/livekit-agents/livekit/agents/cli/watcher.py +++ b/livekit-agents/livekit/agents/cli/watcher.py @@ -6,6 +6,7 @@ import pathlib import socket import urllib.parse +import urllib.request from importlib.metadata import Distribution, PackageNotFoundError from typing 
import Any, Callable, Set @@ -52,7 +53,9 @@ def _try_add(name: str) -> bool: path: str | None = durl_json.get("url") if path and path.startswith("file://"): parsed_url = urllib.parse.urlparse(path) - file_path = pathlib.Path(urllib.parse.unquote(parsed_url.path)) + file_url_path = urllib.parse.unquote(parsed_url.path) + local_path = urllib.request.url2pathname(file_url_path) + file_path = pathlib.Path(local_path) paths.append(file_path) return paths @@ -83,15 +86,18 @@ async def run(self) -> None: self._pch = await utils.aio.duplex_unix._AsyncDuplex.open(self._mp_pch) read_ipc_task = self._loop.create_task(self._read_ipc_task()) - await watchfiles.arun_process( - *watch_paths, - target=self._worker_runner, - args=(self._cli_args,), - watch_filter=watchfiles.filters.PythonFilter(), - callback=self._on_reload, - ) - await utils.aio.gracefully_cancel(read_ipc_task) + try: + await watchfiles.arun_process( + *watch_paths, + target=self._worker_runner, + args=(self._cli_args,), + watch_filter=watchfiles.filters.PythonFilter(), + callback=self._on_reload, + ) + finally: + await utils.aio.gracefully_cancel(read_ipc_task) + await self._pch.aclose() async def _on_reload(self, _: Set[watchfiles.main.FileChange]) -> None: if self._reloading_jobs: @@ -138,25 +144,28 @@ def start(self) -> None: @utils.log_exceptions(logger=logger) async def _run(self) -> None: - self._cch = await utils.aio.duplex_unix._AsyncDuplex.open(self._mp_cch) - - await channel.asend_message(self._cch, proto.ReloadJobsRequest()) - while True: - try: - msg = await channel.arecv_message(self._cch, proto.IPC_MESSAGES) - except utils.aio.duplex_unix.DuplexClosed: - break - - if isinstance(msg, proto.ActiveJobsRequest): - jobs = self._worker.active_jobs - - await channel.asend_message( - self._cch, proto.ActiveJobsResponse(jobs=jobs) - ) - elif isinstance(msg, proto.ReloadJobsResponse): - # TODO(theomonnom): wait for the worker to be fully initialized/connected - await self._worker._reload_jobs(msg.jobs) - 
await channel.asend_message(self._cch, proto.Reloaded()) + try: + self._cch = await utils.aio.duplex_unix._AsyncDuplex.open(self._mp_cch) + + await channel.asend_message(self._cch, proto.ReloadJobsRequest()) + while True: + try: + msg = await channel.arecv_message(self._cch, proto.IPC_MESSAGES) + except utils.aio.duplex_unix.DuplexClosed: + break + + if isinstance(msg, proto.ActiveJobsRequest): + jobs = self._worker.active_jobs + + await channel.asend_message( + self._cch, proto.ActiveJobsResponse(jobs=jobs) + ) + elif isinstance(msg, proto.ReloadJobsResponse): + # TODO(theomonnom): wait for the worker to be fully initialized/connected + await self._worker._reload_jobs(msg.jobs) + await channel.asend_message(self._cch, proto.Reloaded()) + except utils.aio.duplex_unix.DuplexClosed: + pass async def aclose(self) -> None: if not self._main_task: diff --git a/livekit-agents/livekit/agents/ipc/__init__.py b/livekit-agents/livekit/agents/ipc/__init__.py index de6c63381..ab04d6b5e 100644 --- a/livekit-agents/livekit/agents/ipc/__init__.py +++ b/livekit-agents/livekit/agents/ipc/__init__.py @@ -1,3 +1,17 @@ -from . import channel, proc_pool, proto, supervised_proc +from . import ( + channel, + job_executor, + proc_job_executor, + proc_pool, + proto, + thread_job_executor, +) -__all__ = ["proto", "channel", "proc_pool", "supervised_proc"] +__all__ = [ + "proto", + "channel", + "proc_pool", + "proc_job_executor", + "thread_job_executor", + "job_executor", +] diff --git a/livekit-agents/livekit/agents/ipc/job_executor.py b/livekit-agents/livekit/agents/ipc/job_executor.py new file mode 100644 index 000000000..8fe9b9848 --- /dev/null +++ b/livekit-agents/livekit/agents/ipc/job_executor.py @@ -0,0 +1,29 @@ +from __future__ import annotations + +from typing import Any, Protocol + +from ..job import RunningJobInfo + + +class JobExecutor(Protocol): + @property + def started(self) -> bool: ... + + @property + def start_arguments(self) -> Any | None: ... 
+ + @start_arguments.setter + def start_arguments(self, value: Any | None) -> None: ... + + @property + def running_job(self) -> RunningJobInfo | None: ... + + async def start(self) -> None: ... + + async def join(self) -> None: ... + + async def initialize(self) -> None: ... + + async def aclose(self) -> None: ... + + async def launch_job(self, info: RunningJobInfo) -> None: ... diff --git a/livekit-agents/livekit/agents/ipc/proc_main.py b/livekit-agents/livekit/agents/ipc/job_main.py similarity index 71% rename from livekit-agents/livekit/agents/ipc/proc_main.py rename to livekit-agents/livekit/agents/ipc/job_main.py index 8b1bb57c0..ff0dc54b9 100644 --- a/livekit-agents/livekit/agents/ipc/proc_main.py +++ b/livekit-agents/livekit/agents/ipc/job_main.py @@ -4,9 +4,12 @@ import contextlib import copy import logging -import multiprocessing as mp +import pickle +import queue import socket +import threading from dataclasses import dataclass +from typing import Any, Callable, Optional from livekit import rtc @@ -18,12 +21,33 @@ class LogQueueHandler(logging.Handler): - def __init__(self, queue: mp.Queue) -> None: + _sentinal = None + + def __init__(self, duplex: utils.aio.duplex_unix._Duplex) -> None: super().__init__() - self._q = queue + self._duplex = duplex + self._send_q = queue.SimpleQueue[Optional[bytes]]() + self._send_thread = threading.Thread( + target=self._forward_logs, name="ipc_log_forwarder" + ) + self._send_thread.start() + + def _forward_logs(self): + while True: + serialized_record = self._send_q.get() + if serialized_record is None: + break + + try: + self._duplex.send_bytes(serialized_record) + except duplex_unix.DuplexClosed: + break + + self._duplex.close() def emit(self, record: logging.LogRecord) -> None: try: + # from https://github.com/python/cpython/blob/91b7f2e7f6593acefda4fa860250dd87d6f849bf/Lib/logging/handlers.py#L1453 msg = self.format(record) record = copy.copy(record) record.message = msg @@ -31,10 +55,22 @@ def emit(self, record: 
logging.LogRecord) -> None: record.args = None record.exc_info = None record.exc_text = None - self._q.put_nowait(record) + record.stack_info = None + + # https://websockets.readthedocs.io/en/stable/topics/logging.html#logging-to-json + # the websockets library adds a "websocket" attribute to log records, which is not pickleable + if hasattr(record, "websocket"): + record.websocket = None + + self._send_q.put_nowait(pickle.dumps(record)) + except Exception: self.handleError(record) + def close(self) -> None: + super().close() + self._send_q.put_nowait(self._sentinal) + @dataclass class _ShutdownInfo: @@ -50,8 +86,8 @@ class JobTask: def _start_job( - args: proto.ProcStartArgs, proc: JobProcess, + job_entrypoint_fnc: Callable[[JobContext], Any], start_req: proto.StartJobRequest, exit_proc_fut: asyncio.Event, cch: utils.aio.duplex_unix._AsyncDuplex, @@ -82,6 +118,7 @@ def _on_ctx_shutdown(reason: str) -> None: ) info = start_req.running_job + room._info.name = info.job.room.name job_ctx = JobContext( proc=proc, info=info, @@ -94,7 +131,7 @@ def _on_ctx_shutdown(reason: str) -> None: async def _run_job_task() -> None: utils.http_context._new_session_ctx() job_entry_task = asyncio.create_task( - args.job_entrypoint_fnc(job_ctx), name="job_entrypoint" + job_entrypoint_fnc(job_ctx), name="job_entrypoint" ) async def _warn_not_connected_task(): @@ -152,7 +189,9 @@ def log_exception(t: asyncio.Task) -> None: async def _async_main( - args: proto.ProcStartArgs, proc: JobProcess, mp_cch: socket.socket + proc: JobProcess, + job_entrypoint_fnc: Callable[[JobContext], Any], + mp_cch: socket.socket, ) -> None: cch = await duplex_unix._AsyncDuplex.open(mp_cch) @@ -165,7 +204,8 @@ async def _read_ipc_task(): nonlocal job_task while True: msg = await channel.arecv_message(cch, proto.IPC_MESSAGES) - no_msg_timeout.reset() + with contextlib.suppress(utils.aio.SleepFinished): + no_msg_timeout.reset() if isinstance(msg, proto.PingRequest): pong = proto.PongResponse( @@ -175,7 +215,7 @@ async 
def _read_ipc_task(): if isinstance(msg, proto.StartJobRequest): assert job_task is None, "job task already running" - job_task = _start_job(args, proc, msg, exit_proc_fut, cch) + job_task = _start_job(proc, job_entrypoint_fnc, msg, exit_proc_fut, cch) if isinstance(msg, proto.ShutdownRequest): if job_task is None: @@ -209,48 +249,58 @@ def _done_cb(task: asyncio.Task) -> None: await cch.aclose() -def main(args: proto.ProcStartArgs) -> None: - root_logger = logging.getLogger() - root_logger.setLevel(logging.NOTSET) +@dataclass +class ProcStartArgs: + initialize_process_fnc: Callable[[JobProcess], Any] + job_entrypoint_fnc: Callable[[JobContext], Any] + log_cch: socket.socket + mp_cch: socket.socket + asyncio_debug: bool + user_arguments: Any | None = None + + +@dataclass +class ThreadStartArgs: + mp_cch: socket.socket + initialize_process_fnc: Callable[[JobProcess], Any] + job_entrypoint_fnc: Callable[[JobContext], Any] + user_arguments: Any | None + asyncio_debug: bool + join_fnc: Callable[[], None] - log_q = args.log_q - log_q.cancel_join_thread() - log_handler = LogQueueHandler(log_q) - root_logger.addHandler(log_handler) +def thread_main( + args: ThreadStartArgs, +) -> None: + """main function for the job process when using the ThreadedJobRunner""" + tid = threading.get_native_id() loop = asyncio.new_event_loop() asyncio.set_event_loop(loop) loop.set_debug(args.asyncio_debug) loop.slow_callback_duration = 0.1 # 100ms - utils.aio.debug.hook_slow_callbacks(2.0) cch = duplex_unix._Duplex.open(args.mp_cch) try: init_req = channel.recv_message(cch, proto.IPC_MESSAGES) - assert isinstance( init_req, proto.InitializeRequest ), "first message must be InitializeRequest" - job_proc = JobProcess(start_arguments=args.user_arguments) - logger.debug("initializing process", extra={"pid": job_proc.pid}) + + logger.debug("initializing job runner", extra={"tid": tid}) args.initialize_process_fnc(job_proc) - logger.debug("process initialized", extra={"pid": job_proc.pid}) + 
logger.debug("job runner initialized", extra={"tid": tid}) channel.send_message(cch, proto.InitializeResponse()) main_task = loop.create_task( - _async_main(args, job_proc, cch.detach()), name="job_proc_main" + _async_main(job_proc, args.job_entrypoint_fnc, cch.detach()), + name="job_proc_main", ) - while not main_task.done(): - try: - loop.run_until_complete(main_task) - except KeyboardInterrupt: - # ignore the keyboard interrupt, we handle the process shutdown ourselves on the worker process - pass + loop.run_until_complete(main_task) except duplex_unix.DuplexClosed: pass + except Exception: + logger.exception("error while running job process", extra={"tid": tid}) finally: - log_handler.close() - log_q.close() - cch.close() + args.join_fnc() loop.run_until_complete(loop.shutdown_default_executor()) diff --git a/livekit-agents/livekit/agents/ipc/supervised_proc.py b/livekit-agents/livekit/agents/ipc/proc_job_executor.py similarity index 89% rename from livekit-agents/livekit/agents/ipc/supervised_proc.py rename to livekit-agents/livekit/agents/ipc/proc_job_executor.py index 1cb4ae7da..f5f846130 100644 --- a/livekit-agents/livekit/agents/ipc/supervised_proc.py +++ b/livekit-agents/livekit/agents/ipc/proc_job_executor.py @@ -3,41 +3,40 @@ import asyncio import contextlib import logging -import multiprocessing as mp +import pickle import socket import sys import threading from dataclasses import dataclass from multiprocessing.context import BaseContext -from typing import Any, Callable, Coroutine +from typing import Any, Awaitable, Callable from .. import utils from ..job import JobContext, JobProcess, RunningJobInfo from ..log import logger from ..utils.aio import duplex_unix -from . import channel, proc_main, proto +from . 
import channel, job_main, proc_lazy_main, proto class LogQueueListener: - _sentinel = None - def __init__( - self, queue: mp.Queue, prepare_fnc: Callable[[logging.LogRecord], None] + self, + duplex: utils.aio.duplex_unix._Duplex, + prepare_fnc: Callable[[logging.LogRecord], None], ): self._thread: threading.Thread | None = None - self._q = queue + self._duplex = duplex self._prepare_fnc = prepare_fnc def start(self) -> None: - self._thread = t = threading.Thread( - target=self._monitor, daemon=True, name="log_listener" - ) - t.start() + self._thread = threading.Thread(target=self._monitor, name="ipc_log_listener") + self._thread.start() def stop(self) -> None: if self._thread is None: return - self._q.put_nowait(self._sentinel) + + self._duplex.close() self._thread.join() self._thread = None @@ -52,28 +51,30 @@ def handle(self, record: logging.LogRecord) -> None: def _monitor(self): while True: - record = self._q.get() - if record is self._sentinel: + try: + data = self._duplex.recv_bytes() + except utils.aio.duplex_unix.DuplexClosed: break + record = pickle.loads(data) self.handle(record) @dataclass class _ProcOpts: initialize_process_fnc: Callable[[JobProcess], Any] - job_entrypoint_fnc: Callable[[JobContext], Coroutine] + job_entrypoint_fnc: Callable[[JobContext], Awaitable[None]] mp_ctx: BaseContext initialize_timeout: float close_timeout: float -class SupervisedProc: +class ProcJobExecutor: def __init__( self, *, initialize_process_fnc: Callable[[JobProcess], Any], - job_entrypoint_fnc: Callable[[JobContext], Coroutine], + job_entrypoint_fnc: Callable[[JobContext], Awaitable[None]], initialize_timeout: float, close_timeout: float, mp_ctx: BaseContext, @@ -145,29 +146,32 @@ def _add_proc_ctx_log(record: logging.LogRecord) -> None: setattr(record, key, value) async with self._lock: - log_q = self._opts.mp_ctx.Queue() - log_q.cancel_join_thread() - mp_pch, mp_cch = socket.socketpair() + mp_log_pch, mp_log_cch = socket.socketpair() self._pch = await 
duplex_unix._AsyncDuplex.open(mp_pch) - log_listener = LogQueueListener(log_q, _add_proc_ctx_log) + + log_pch = duplex_unix._Duplex.open(mp_log_pch) + log_listener = LogQueueListener(log_pch, _add_proc_ctx_log) log_listener.start() - self._proc_args = proto.ProcStartArgs( + self._proc_args = job_main.ProcStartArgs( initialize_process_fnc=self._opts.initialize_process_fnc, job_entrypoint_fnc=self._opts.job_entrypoint_fnc, - log_q=log_q, + log_cch=mp_log_cch, mp_cch=mp_cch, asyncio_debug=self._loop.get_debug(), user_arguments=self._user_args, ) self._proc = self._opts.mp_ctx.Process( # type: ignore - target=proc_main.main, args=(self._proc_args,), name="job_proc" + target=proc_lazy_main.proc_main, + args=(self._proc_args,), + name="job_proc", ) self._proc.start() + mp_log_cch.close() mp_cch.close() self._pid = self._proc.pid @@ -176,7 +180,6 @@ def _add_proc_ctx_log(record: logging.LogRecord) -> None: def _sync_run(): self._proc.join() log_listener.stop() - log_q.close() try: self._loop.call_soon_threadsafe(self._join_fut.set_result, None) except RuntimeError: @@ -278,7 +281,7 @@ def _send_kill_signal(self) -> None: except ValueError: return - logger.debug("killing job process", extra=self.logging_extra()) + logger.info("killing job process", extra=self.logging_extra()) if sys.platform == "win32": self._proc.terminate() else: @@ -334,7 +337,7 @@ async def _monitor_task(self, pong_timeout: utils.aio.Sleep) -> None: pong_timeout.reset() if isinstance(msg, proto.Exiting): - logger.debug( + logger.info( "job exiting", extra={"reason": msg.reason, **self.logging_extra()} ) @@ -366,8 +369,8 @@ async def _pong_timeout_co(): finally: await utils.aio.gracefully_cancel(*tasks) - def logging_extra(self) -> dict: - extra: dict = { + def logging_extra(self): + extra: dict[str, Any] = { "pid": self.pid, } if self._running_job: diff --git a/livekit-agents/livekit/agents/ipc/proc_lazy_main.py b/livekit-agents/livekit/agents/ipc/proc_lazy_main.py new file mode 100644 index 
000000000..be09e7f5a --- /dev/null +++ b/livekit-agents/livekit/agents/ipc/proc_lazy_main.py @@ -0,0 +1,72 @@ +import multiprocessing + +if multiprocessing.current_process().name == "job_proc": + import signal + import sys + + # ignore signals in the jobs process (the parent process will handle them) + signal.signal(signal.SIGINT, signal.SIG_IGN) + signal.signal(signal.SIGTERM, signal.SIG_IGN) + + def _no_traceback_excepthook(exc_type, exc_val, traceback): + if isinstance(exc_val, KeyboardInterrupt): + return + sys.__excepthook__(exc_type, exc_val, traceback) + + sys.excepthook = _no_traceback_excepthook + + +def proc_main(args) -> None: + """main function for the job process when using the ProcessJobRunner""" + + # import every package lazily + import asyncio + import logging + + from .. import utils + from ..job import JobProcess + from ..log import logger + from . import channel, job_main, proto + + root_logger = logging.getLogger() + root_logger.setLevel(logging.NOTSET) + + log_cch = utils.aio.duplex_unix._Duplex.open(args.log_cch) + log_handler = job_main.LogQueueHandler(log_cch) + root_logger.addHandler(log_handler) + + loop = asyncio.new_event_loop() + asyncio.set_event_loop(loop) + loop.set_debug(args.asyncio_debug) + loop.slow_callback_duration = 0.1 # 100ms + utils.aio.debug.hook_slow_callbacks(2.0) + + cch = utils.aio.duplex_unix._Duplex.open(args.mp_cch) + try: + init_req = channel.recv_message(cch, proto.IPC_MESSAGES) + + assert isinstance( + init_req, proto.InitializeRequest + ), "first message must be InitializeRequest" + + job_proc = JobProcess(start_arguments=args.user_arguments) + logger.info("initializing process", extra={"pid": job_proc.pid}) + args.initialize_process_fnc(job_proc) + logger.info("process initialized", extra={"pid": job_proc.pid}) + channel.send_message(cch, proto.InitializeResponse()) + + main_task = loop.create_task( + job_main._async_main(job_proc, args.job_entrypoint_fnc, cch.detach()), + name="job_proc_main", + ) + while not 
main_task.done(): + try: + loop.run_until_complete(main_task) + except KeyboardInterrupt: + # ignore the keyboard interrupt, we handle the process shutdown ourselves on the worker process + pass + except (utils.aio.duplex_unix.DuplexClosed, KeyboardInterrupt): + pass + finally: + log_handler.close() + loop.run_until_complete(loop.shutdown_default_executor()) diff --git a/livekit-agents/livekit/agents/ipc/proc_pool.py b/livekit-agents/livekit/agents/ipc/proc_pool.py index e281aed96..307227876 100644 --- a/livekit-agents/livekit/agents/ipc/proc_pool.py +++ b/livekit-agents/livekit/agents/ipc/proc_pool.py @@ -2,13 +2,14 @@ import asyncio from multiprocessing.context import BaseContext -from typing import Any, Callable, Coroutine, Literal +from typing import Any, Awaitable, Callable, Literal from .. import utils -from ..job import JobContext, JobProcess, RunningJobInfo +from ..job import JobContext, JobExecutorType, JobProcess, RunningJobInfo from ..log import logger from ..utils import aio -from .supervised_proc import SupervisedProc +from . 
import proc_job_executor, thread_job_executor +from .job_executor import JobExecutor EventTypes = Literal[ "process_created", "process_started", "process_ready", "process_closed" @@ -22,14 +23,16 @@ def __init__( self, *, initialize_process_fnc: Callable[[JobProcess], Any], - job_entrypoint_fnc: Callable[[JobContext], Coroutine], + job_entrypoint_fnc: Callable[[JobContext], Awaitable[None]], num_idle_processes: int, initialize_timeout: float, close_timeout: float, + job_executor_type: JobExecutorType, mp_ctx: BaseContext, loop: asyncio.AbstractEventLoop, ) -> None: super().__init__() + self._job_executor_type = job_executor_type self._mp_ctx = mp_ctx self._initialize_process_fnc = initialize_process_fnc self._job_entrypoint_fnc = job_entrypoint_fnc @@ -37,16 +40,27 @@ def __init__( self._initialize_timeout = initialize_timeout self._loop = loop + self._num_idle_processes = num_idle_processes self._init_sem = asyncio.Semaphore(MAX_CONCURRENT_INITIALIZATIONS) self._proc_needed_sem = asyncio.Semaphore(num_idle_processes) - self._warmed_proc_queue = asyncio.Queue[SupervisedProc]() - self._processes: list[SupervisedProc] = [] + self._warmed_proc_queue = asyncio.Queue[JobExecutor]() + self._executors: list[JobExecutor] = [] self._started = False self._closed = False @property - def processes(self) -> list[SupervisedProc]: - return self._processes + def processes(self) -> list[JobExecutor]: + return self._executors + + def get_by_job_id(self, job_id: str) -> JobExecutor | None: + return next( + ( + x + for x in self._executors + if x.running_job and x.running_job.job.id == job_id + ), + None, + ) def start(self) -> None: if self._started: @@ -63,22 +77,40 @@ async def aclose(self) -> None: await aio.gracefully_cancel(self._main_atask) async def launch_job(self, info: RunningJobInfo) -> None: - proc = await self._warmed_proc_queue.get() - self._proc_needed_sem.release() # notify that a new process needs to be warmed/started + if self._num_idle_processes == 0: + 
self._proc_needed_sem.release()  # ask for a process if prewarmed processes are disabled
+            proc = await self._warmed_proc_queue.get()
+        else:
+            proc = await self._warmed_proc_queue.get()
+            self._proc_needed_sem.release()  # notify that a new process can be warmed/started
+
         await proc.launch_job(info)
 
     @utils.log_exceptions(logger=logger)
     async def _proc_watch_task(self) -> None:
-        proc = SupervisedProc(
-            initialize_process_fnc=self._initialize_process_fnc,
-            job_entrypoint_fnc=self._job_entrypoint_fnc,
-            initialize_timeout=self._initialize_timeout,
-            close_timeout=self._close_timeout,
-            mp_ctx=self._mp_ctx,
-            loop=self._loop,
-        )
+        proc: JobExecutor
+        if self._job_executor_type == JobExecutorType.THREAD:
+            proc = thread_job_executor.ThreadJobExecutor(
+                initialize_process_fnc=self._initialize_process_fnc,
+                job_entrypoint_fnc=self._job_entrypoint_fnc,
+                initialize_timeout=self._initialize_timeout,
+                close_timeout=self._close_timeout,
+                loop=self._loop,
+            )
+        elif self._job_executor_type == JobExecutorType.PROCESS:
+            proc = proc_job_executor.ProcJobExecutor(
+                initialize_process_fnc=self._initialize_process_fnc,
+                job_entrypoint_fnc=self._job_entrypoint_fnc,
+                initialize_timeout=self._initialize_timeout,
+                close_timeout=self._close_timeout,
+                mp_ctx=self._mp_ctx,
+                loop=self._loop,
+            )
+        else:
+            raise ValueError(f"unsupported job executor: {self._job_executor_type}")
+
         try:
-            self._processes.append(proc)
+            self._executors.append(proc)
 
             async with self._init_sem:
                 if self._closed:
@@ -99,11 +131,11 @@ async def _proc_watch_task(self) -> None:
             await proc.join()
             self.emit("process_closed", proc)
         finally:
-            self._processes.remove(proc)
+            self._executors.remove(proc)
 
     @utils.log_exceptions(logger=logger)
     async def _main_task(self) -> None:
-        watch_tasks = []
+        watch_tasks: list[asyncio.Task[None]] = []
        try:
            while True:
                await self._proc_needed_sem.acquire()
@@ -111,5 +143,5 @@ async def _main_task(self) -> None:
                watch_tasks.append(task)
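The `launch_job`/`_proc_watch_task` pair above implements a warm pool: `_proc_needed_sem` counts how many executors the watcher should prepare, and `_warmed_proc_queue` hands a warmed executor to each incoming job, which then releases the semaphore so a replacement gets warmed. A self-contained asyncio sketch of that pattern — the names are illustrative stand-ins, with strings playing the role of executors, not the livekit-agents API:

```python
import asyncio


async def main() -> list[tuple[str, str]]:
    num_idle = 2
    proc_needed = asyncio.Semaphore(num_idle)  # how many executors to warm
    warmed: asyncio.Queue[str] = asyncio.Queue()  # executors ready for a job
    created = 0
    served: list[tuple[str, str]] = []

    async def watcher() -> None:
        nonlocal created
        while True:
            await proc_needed.acquire()  # wait until a new executor is needed
            created += 1
            await warmed.put(f"proc-{created}")  # "warm up" an executor

    async def launch_job(job: str) -> None:
        proc = await warmed.get()  # take a warmed executor
        proc_needed.release()  # ask the watcher to warm a replacement
        served.append((job, proc))

    watch = asyncio.create_task(watcher())
    for i in range(3):
        await launch_job(f"job-{i}")
    watch.cancel()
    return served


served = asyncio.run(main())
print(served)
```

Releasing the semaphore only after taking an executor keeps the number of idle, pre-initialized executors roughly constant, so a job never pays initialization latency as long as the watcher keeps up.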
task.add_done_callback(watch_tasks.remove) except asyncio.CancelledError: - await asyncio.gather(*[proc.aclose() for proc in self._processes]) + await asyncio.gather(*[proc.aclose() for proc in self._executors]) await asyncio.gather(*watch_tasks) diff --git a/livekit-agents/livekit/agents/ipc/proto.py b/livekit-agents/livekit/agents/ipc/proto.py index 9e8567ffe..7dd7c29e3 100644 --- a/livekit-agents/livekit/agents/ipc/proto.py +++ b/livekit-agents/livekit/agents/ipc/proto.py @@ -1,14 +1,12 @@ from __future__ import annotations import io -import multiprocessing as mp -import socket from dataclasses import dataclass, field -from typing import Any, Callable, ClassVar, Coroutine +from typing import ClassVar from livekit.protocol import agent -from ..job import JobAcceptArguments, JobContext, JobProcess, RunningJobInfo +from ..job import JobAcceptArguments, RunningJobInfo from . import channel PING_INTERVAL = 2.5 @@ -17,16 +15,6 @@ NO_MESSAGE_TIMEOUT = 15.0 -@dataclass -class ProcStartArgs: - initialize_process_fnc: Callable[[JobProcess], Any] - job_entrypoint_fnc: Callable[[JobContext], Coroutine] - log_q: mp.Queue - mp_cch: socket.socket - asyncio_debug: bool - user_arguments: Any | None = None - - @dataclass class InitializeRequest: """sent by the main process to the subprocess to initialize it. this is going to call initialize_process_fnc""" diff --git a/livekit-agents/livekit/agents/ipc/thread_job_executor.py b/livekit-agents/livekit/agents/ipc/thread_job_executor.py new file mode 100644 index 000000000..99e75f74c --- /dev/null +++ b/livekit-agents/livekit/agents/ipc/thread_job_executor.py @@ -0,0 +1,256 @@ +from __future__ import annotations + +import asyncio +import contextlib +import socket +import threading +from dataclasses import dataclass +from typing import Any, Awaitable, Callable + +from .. import utils +from ..job import JobContext, JobProcess, RunningJobInfo +from ..log import logger +from ..utils.aio import duplex_unix +from . 
import channel, job_main, proto + + +@dataclass +class _ProcOpts: + initialize_process_fnc: Callable[[JobProcess], Any] + job_entrypoint_fnc: Callable[[JobContext], Awaitable[None]] + initialize_timeout: float + close_timeout: float + + +class ThreadJobExecutor: + def __init__( + self, + *, + initialize_process_fnc: Callable[[JobProcess], Any], + job_entrypoint_fnc: Callable[[JobContext], Awaitable[None]], + initialize_timeout: float, + close_timeout: float, + loop: asyncio.AbstractEventLoop, + ) -> None: + self._loop = loop + self._opts = _ProcOpts( + initialize_process_fnc=initialize_process_fnc, + job_entrypoint_fnc=job_entrypoint_fnc, + initialize_timeout=initialize_timeout, + close_timeout=close_timeout, + ) + + self._user_args: Any | None = None + self._running_job: RunningJobInfo | None = None + + self._main_atask: asyncio.Task[None] | None = None + self._closing = False + self._initialize_fut = asyncio.Future[None]() + + self._lock = asyncio.Lock() + + @property + def started(self) -> bool: + return self._main_atask is not None + + @property + def start_arguments(self) -> Any | None: + return self._user_args + + @start_arguments.setter + def start_arguments(self, value: Any | None) -> None: + self._user_args = value + + @property + def running_job(self) -> RunningJobInfo | None: + return self._running_job + + async def start(self) -> None: + if self.started: + raise RuntimeError("runner already started") + + if self._closing: + raise RuntimeError("runner is closed") + + await asyncio.shield(self._start()) + + async def _start(self) -> None: + async with self._lock: + # to simplify the runners implementation, we also use a duplex in the threaded executor + # (ThreadedRunners), so we can use the same protocol + mp_pch, mp_cch = socket.socketpair() + self._pch = await duplex_unix._AsyncDuplex.open(mp_pch) + + self._join_fut = asyncio.Future[None]() + + def _on_join() -> None: + with contextlib.suppress(RuntimeError): + 
self._loop.call_soon_threadsafe(self._join_fut.set_result, None) + + targs = job_main.ThreadStartArgs( + mp_cch=mp_cch, + initialize_process_fnc=self._opts.initialize_process_fnc, + job_entrypoint_fnc=self._opts.job_entrypoint_fnc, + user_arguments=self._user_args, + asyncio_debug=self._loop.get_debug(), + join_fnc=_on_join, + ) + + self._thread = t = threading.Thread( + target=job_main.thread_main, + args=(targs,), + name="job_thread_runner", + ) + t.start() + + self._main_atask = asyncio.create_task(self._main_task()) + + async def join(self) -> None: + """wait for the thread to finish""" + if not self.started: + raise RuntimeError("runner not started") + + async with self._lock: + if self._main_atask: + await asyncio.shield(self._main_atask) + + async def initialize(self) -> None: + await channel.asend_message(self._pch, proto.InitializeRequest()) + + try: + init_res = await asyncio.wait_for( + channel.arecv_message(self._pch, proto.IPC_MESSAGES), + timeout=self._opts.initialize_timeout, + ) + assert isinstance( + init_res, proto.InitializeResponse + ), "first message must be InitializeResponse" + except asyncio.TimeoutError: + self._initialize_fut.set_exception( + asyncio.TimeoutError("runner initialization timed out") + ) + logger.error( + "job initialization is taking too much time..", + extra=self.logging_extra(), + ) + raise + except Exception as e: # should be channel.ChannelClosed most of the time + self._initialize_fut.set_exception(e) + raise + else: + self._initialize_fut.set_result(None) + + async def aclose(self) -> None: + """ + attempt to gracefully close the job. 
warn if it takes too long to close + (in the threaded executor, the job can't be "killed") + """ + if not self.started: + return + + self._closing = True + with contextlib.suppress(utils.aio.duplex_unix.DuplexClosed): + await channel.asend_message(self._pch, proto.ShutdownRequest()) + + try: + if self._main_atask: + await asyncio.wait_for( + asyncio.shield(self._main_atask), timeout=self._opts.close_timeout + ) + except asyncio.TimeoutError: + logger.error( + "job shutdown is taking too much time..", extra=self.logging_extra() + ) + + async with self._lock: + if self._main_atask: + await asyncio.shield(self._main_atask) + + async def launch_job(self, info: RunningJobInfo) -> None: + """start/assign a job to the executor""" + if self._running_job is not None: + raise RuntimeError("executor already has a running job") + + self._running_job = info + start_req = proto.StartJobRequest() + start_req.running_job = info + await channel.asend_message(self._pch, start_req) + + @utils.log_exceptions(logger=logger) + async def _main_task(self) -> None: + try: + await self._initialize_fut + except asyncio.TimeoutError: + pass # this happens when the initialization takes longer than self._initialize_timeout + except Exception: + pass # initialization failed + + pong_timeout = utils.aio.sleep(proto.PING_TIMEOUT) + ping_task = asyncio.create_task(self._ping_pong_task(pong_timeout)) + monitor_task = asyncio.create_task(self._monitor_task(pong_timeout)) + + await self._join_fut + await utils.aio.gracefully_cancel(ping_task, monitor_task) + + with contextlib.suppress(duplex_unix.DuplexClosed): + await self._pch.aclose() + + @utils.log_exceptions(logger=logger) + async def _monitor_task(self, pong_timeout: utils.aio.Sleep) -> None: + while True: + try: + msg = await channel.arecv_message(self._pch, proto.IPC_MESSAGES) + except utils.aio.duplex_unix.DuplexClosed: + break + + if isinstance(msg, proto.PongResponse): + delay = utils.time_ms() - msg.timestamp + if delay > 
proto.HIGH_PING_THRESHOLD * 1000: + logger.warning( + "job executor is unresponsive", + extra={"delay": delay, **self.logging_extra()}, + ) + + with contextlib.suppress(utils.aio.SleepFinished): + pong_timeout.reset() + + if isinstance(msg, proto.Exiting): + logger.debug( + "job exiting", extra={"reason": msg.reason, **self.logging_extra()} + ) + + @utils.log_exceptions(logger=logger) + async def _ping_pong_task(self, pong_timeout: utils.aio.Sleep) -> None: + ping_interval = utils.aio.interval(proto.PING_INTERVAL) + + async def _send_ping_co(): + while True: + await ping_interval.tick() + try: + await channel.asend_message( + self._pch, proto.PingRequest(timestamp=utils.time_ms()) + ) + except utils.aio.duplex_unix.DuplexClosed: + break + + async def _pong_timeout_co(): + await pong_timeout + logger.error("job is unresponsive..", extra=self.logging_extra()) + + tasks = [ + asyncio.create_task(_send_ping_co()), + asyncio.create_task(_pong_timeout_co()), + ] + try: + await asyncio.gather(*tasks) + finally: + await utils.aio.gracefully_cancel(*tasks) + + def logging_extra(self): + extra: dict[str, Any] = { + "tid": self._thread.native_id, + } + if self._running_job: + extra["job_id"] = self._running_job.job.id + + return extra diff --git a/livekit-agents/livekit/agents/job.py b/livekit-agents/livekit/agents/job.py index 6d66abdd8..19574b71b 100644 --- a/livekit-agents/livekit/agents/job.py +++ b/livekit-agents/livekit/agents/job.py @@ -17,12 +17,20 @@ import asyncio import multiprocessing as mp from dataclasses import dataclass -from enum import Enum -from typing import Any, Callable, Coroutine +from enum import Enum, unique +from typing import Any, Callable, Coroutine, Tuple from livekit import rtc from livekit.protocol import agent, models +from .log import logger + + +@unique +class JobExecutorType(Enum): + PROCESS = "process" + THREAD = "thread" + class AutoSubscribe(str, Enum): SUBSCRIBE_ALL = "subscribe_all" @@ -61,27 +69,74 @@ def __init__( self._room = room 
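The `_monitor_task` above estimates executor lag by subtracting the timestamp echoed back in `PongResponse` from the supervisor's own clock and comparing it against `HIGH_PING_THRESHOLD`. A minimal sketch of that check — the threshold value here is illustrative, not the library's constant:

```python
import time

HIGH_PING_THRESHOLD = 0.05  # seconds; illustrative value for this sketch


def time_ms() -> int:
    return int(time.time() * 1000)


def pong_delay_ms(ping_timestamp_ms: int, now_ms: int) -> int:
    # the executor echoes the ping's timestamp back in its pong;
    # the supervisor subtracts it from its own clock to estimate lag
    return now_ms - ping_timestamp_ms


def is_unresponsive(ping_timestamp_ms: int, now_ms: int) -> bool:
    return pong_delay_ms(ping_timestamp_ms, now_ms) > HIGH_PING_THRESHOLD * 1000


sent = time_ms()
print(is_unresponsive(sent, sent + 5))    # 5 ms lag: healthy -> False
print(is_unresponsive(sent, sent + 500))  # 500 ms lag: flagged -> True
```

Because both timestamps come from the same process clock in the supervisor, this measures round-trip responsiveness of the executor's event loop rather than requiring synchronized clocks across processes.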
self._on_connect = on_connect self._on_shutdown = on_shutdown - self._shutdown_callbacks: list[Callable[[], Coroutine]] = [] + self._shutdown_callbacks: list[Callable[[], Coroutine[None, None, None]]] = [] + self._participant_entrypoints: list[ + Callable[[JobContext, rtc.RemoteParticipant], Coroutine[None, None, None]] + ] = [] + self._participant_tasks = dict[Tuple[str, Callable], asyncio.Task[None]]() + self._room.on("participant_connected", self._participant_available) @property def proc(self) -> JobProcess: + """Returns the process running the job. Useful for storing process-specific state.""" return self._proc @property def job(self) -> agent.Job: + """Returns the current job that the worker is executing.""" return self._info.job @property def room(self) -> rtc.Room: + """The Room object is the main interface that the worker should interact with. + + When the entrypoint is called, the worker has not connected to the Room yet. + Certain properties of Room would not be available before calling JobContext.connect() + """ return self._room @property def agent(self) -> rtc.LocalParticipant: return self._room.local_participant - def add_shutdown_callback(self, callback: Callable[[], Coroutine]) -> None: + def add_shutdown_callback( + self, callback: Callable[[], Coroutine[None, None, None]] + ) -> None: self._shutdown_callbacks.append(callback) + async def wait_for_participant( + self, *, identity: str | None = None + ) -> rtc.RemoteParticipant: + """ + Returns a participant that matches the given identity. If identity is None, the first + participant that joins the room will be returned. + If the participant has already joined, the function will return immediately. 
+ """ + if not self._room.isconnected(): + raise RuntimeError("room is not connected") + + fut = asyncio.Future[rtc.RemoteParticipant]() + + for p in self._room.remote_participants.values(): + if ( + identity is None or p.identity == identity + ) and p.kind != rtc.ParticipantKind.PARTICIPANT_KIND_AGENT: + fut.set_result(p) + break + + def _on_participant_connected(p: rtc.RemoteParticipant): + if ( + identity is None or p.identity == identity + ) and p.kind != rtc.ParticipantKind.PARTICIPANT_KIND_AGENT: + self._room.off("participant_connected", _on_participant_connected) + if not fut.done(): + fut.set_result(p) + + if not fut.done(): + self._room.on("participant_connected", _on_participant_connected) + + return await fut + async def connect( self, *, @@ -89,6 +144,13 @@ async def connect( auto_subscribe: AutoSubscribe = AutoSubscribe.SUBSCRIBE_ALL, rtc_config: rtc.RtcConfiguration | None = None, ) -> None: + """Connect to the room. This method should be called only once. + + Args: + e2ee: End-to-end encryption options. If provided, the Agent will utilize end-to-end encryption. Note: clients will also need to handle E2EE. + auto_subscribe: Whether to automatically subscribe to tracks. Default is AutoSubscribe.SUBSCRIBE_ALL. + rtc_config: Custom RTC configuration to use when connecting to the room. 
+ """ room_options = rtc.RoomOptions( e2ee=e2ee, auto_subscribe=auto_subscribe == AutoSubscribe.SUBSCRIBE_ALL, @@ -97,12 +159,43 @@ async def connect( await self._room.connect(self._info.url, self._info.token, options=room_options) self._on_connect() + for p in self._room.remote_participants.values(): + self._participant_available(p) _apply_auto_subscribe_opts(self._room, auto_subscribe) def shutdown(self, reason: str = "") -> None: self._on_shutdown(reason) + def add_participant_entrypoint( + self, + entrypoint_fnc: Callable[ + [JobContext, rtc.RemoteParticipant], Coroutine[None, None, None] + ], + ): + """Adds an entrypoint function to be run when a participant joins the room. In cases where + the participant has already joined, the entrypoint will be run immediately. Multiple unique entrypoints can be + added and they will each be run in parallel for each participant. + """ + + if entrypoint_fnc in self._participant_entrypoints: + raise ValueError("entrypoints cannot be added more than once") + + self._participant_entrypoints.append(entrypoint_fnc) + + def _participant_available(self, p: rtc.RemoteParticipant) -> None: + for coro in self._participant_entrypoints: + if (p.identity, coro) in self._participant_tasks: + logger.warning( + f"a participant has joined before a prior participant task matching the same identity has finished: '{p.identity}'" + ) + task_name = f"part-entry-{p.identity}-{coro.__name__}" + task = asyncio.create_task(coro(self, p), name=task_name) + self._participant_tasks[(p.identity, coro)] = task + task.add_done_callback( + lambda _: self._participant_tasks.pop((p.identity, coro)) + ) + def _apply_auto_subscribe_opts(room: rtc.Room, auto_subscribe: AutoSubscribe) -> None: if auto_subscribe not in (AutoSubscribe.AUDIO_ONLY, AutoSubscribe.VIDEO_ONLY): @@ -151,7 +244,7 @@ def __init__( self, *, job: agent.Job, - on_reject: Callable[[], Coroutine], + on_reject: Callable[[], Coroutine[None, None, None]], on_accept: Callable[[JobAcceptArguments], 
Coroutine[None, None, None]], ) -> None: self._job = job @@ -175,6 +268,10 @@ def room(self) -> models.Room: def publisher(self) -> models.ParticipantInfo | None: return self._job.participant + @property + def agent_name(self) -> str: + return self._job.agent_name + async def reject(self) -> None: """Reject the job request. The job may be assigned to another worker""" await self._on_reject() diff --git a/livekit-agents/livekit/agents/llm/_oai_api.py b/livekit-agents/livekit/agents/llm/_oai_api.py index bd46e7bf9..9d7dcf302 100644 --- a/livekit-agents/livekit/agents/llm/_oai_api.py +++ b/livekit-agents/livekit/agents/llm/_oai_api.py @@ -141,7 +141,7 @@ def type2str(t: type) -> str: def _sanitize_primitive( - *, value: Any, expected_type: type, choices: list | None + *, value: Any, expected_type: type, choices: tuple | None ) -> Any: if expected_type is str: if not isinstance(value, str): diff --git a/livekit-agents/livekit/agents/llm/chat_context.py b/livekit-agents/livekit/agents/llm/chat_context.py index 08fd9d630..081a33ad7 100644 --- a/livekit-agents/livekit/agents/llm/chat_context.py +++ b/livekit-agents/livekit/agents/llm/chat_context.py @@ -41,6 +41,8 @@ class ChatMessage: content: str | list[str | ChatImage] | None = None tool_calls: list[function_context.FunctionCallInfo] | None = None tool_call_id: str | None = None + tool_exception: Exception | None = None + _metadata: dict[str, Any] = field(default_factory=dict, repr=False, init=False) @staticmethod def create_tool_from_called_function( @@ -49,9 +51,12 @@ def create_tool_from_called_function( if not called_function.task.done(): raise ValueError("cannot create a tool result from a running ai function") + tool_exception: Exception | None = None try: content = called_function.task.result() except BaseException as e: + if isinstance(e, Exception): + tool_exception = e content = f"Error: {e}" return ChatMessage( @@ -59,6 +64,7 @@ def create_tool_from_called_function( 
name=called_function.call_info.function_info.name, content=content, tool_call_id=called_function.call_info.tool_call_id, + tool_exception=tool_exception, ) @staticmethod @@ -92,18 +98,21 @@ def copy(self): if tool_calls is not None: tool_calls = tool_calls.copy() - return ChatMessage( + copied_msg = ChatMessage( role=self.role, name=self.name, content=content, tool_calls=tool_calls, tool_call_id=self.tool_call_id, ) + copied_msg._metadata = self._metadata + return copied_msg @dataclass class ChatContext: messages: list[ChatMessage] = field(default_factory=list) + _metadata: dict[str, Any] = field(default_factory=dict, repr=False, init=False) def append( self, *, text: str = "", images: list[ChatImage] = [], role: ChatRole = "system" @@ -112,4 +121,6 @@ def append( return self def copy(self) -> ChatContext: - return ChatContext(messages=[m.copy() for m in self.messages]) + copied_chat_ctx = ChatContext(messages=[m.copy() for m in self.messages]) + copied_chat_ctx._metadata = self._metadata + return copied_chat_ctx diff --git a/livekit-agents/livekit/agents/llm/function_context.py b/livekit-agents/livekit/agents/llm/function_context.py index 42d893d96..9564c3a1c 100644 --- a/livekit-agents/livekit/agents/llm/function_context.py +++ b/livekit-agents/livekit/agents/llm/function_context.py @@ -19,7 +19,7 @@ import functools import inspect import typing -from dataclasses import dataclass, field +from dataclasses import dataclass from typing import Any, Callable, Tuple from ..log import logger @@ -33,10 +33,18 @@ class _UseDocMarker: USE_DOCSTRING = _UseDocMarker() -@dataclass(frozen=True) +@dataclass(frozen=True, init=False) class TypeInfo: - description: str = "" - choices: list[Any] = field(default_factory=list) + description: str + choices: tuple + + def __init__(self, description: str, choices: tuple | list[Any] = tuple()) -> None: + object.__setattr__(self, "description", description) + + if isinstance(choices, list): + choices = tuple(choices) + + 
object.__setattr__(self, "choices", choices) @dataclass(frozen=True) @@ -45,7 +53,7 @@ class FunctionArgInfo: description: str type: type default: Any - choices: list[Any] | None + choices: tuple | None @dataclass(frozen=True) @@ -137,8 +145,13 @@ def _register_ai_function(self, fnc: Callable) -> None: raise ValueError(f"duplicate ai_callable name: {fnc_name}") sig = inspect.signature(fnc) - type_hints = typing.get_type_hints(fnc) # Annotated[T, ...] -> T - args = dict() + + # get_type_hints with include_extra=True is needed when using Annotated + # using typing.get_args with param.Annotated is returning an empty tuple for some reason + type_hints = typing.get_type_hints( + fnc, include_extras=True + ) # Annotated[T, ...] -> T + args = dict[str, FunctionArgInfo]() for name, param in sig.parameters.items(): if param.kind not in ( @@ -147,37 +160,32 @@ def _register_ai_function(self, fnc: Callable) -> None: ): raise ValueError(f"{fnc_name}: unsupported parameter kind {param.kind}") - if param.annotation is inspect.Parameter.empty: - raise ValueError( - f"{fnc_name}: missing type annotation for parameter {name}" - ) + inner_th, type_info = _extract_types(type_hints[name]) - th = type_hints[name] - if not is_type_supported(th): + if not is_type_supported(inner_th): raise ValueError( - f"{fnc_name}: unsupported type {th} for parameter {name}" + f"{fnc_name}: unsupported type {inner_th} for parameter {name}" ) - type_info = _find_param_type_info(param.annotation) desc = type_info.description if type_info else "" choices = type_info.choices if type_info else None - is_optional, inner_type = _is_optional_type(th) + is_optional, optional_inner = _is_optional_type(inner_th) if is_optional: # when the type is optional, only the inner type is relevant # the argument info for default would be None - th = inner_type + inner_th = optional_inner - if issubclass(th, enum.Enum) and not choices: + if issubclass(inner_th, enum.Enum) and not choices: # the enum must be a str or int 
(and at least one value) # this is verified by is_type_supported - choices = [item.value for item in th] - th = type(choices[0]) + choices = tuple([item.value for item in inner_th]) + inner_th = type(choices[0]) args[name] = FunctionArgInfo( name=name, description=desc, - type=th, + type=inner_th, default=param.default, choices=choices, ) @@ -202,15 +210,33 @@ class _AIFncMetadata: auto_retry: bool -def _find_param_type_info(annotation: type) -> TypeInfo | None: +def _extract_types(annotation: type) -> tuple[type, TypeInfo | None]: + """Return inner_type, TypeInfo""" if typing.get_origin(annotation) is not typing.Annotated: - return None - - for a in typing.get_args(annotation): + # email: Annotated[ + # Optional[str], TypeInfo(description="The user address email") + # ] = None, + # + # An argument like the above will return us: + # `typing.Optional[typing.Annotated[typing.Optional[str], TypeInfo(description='The user address email', choices=())]]` + # So we ignore the first typing.Optional + + is_optional, optional_inner = _is_optional_type(annotation) + if is_optional: + return _extract_types(optional_inner) + + return annotation, None + + # assume the first argument is always the inner type the LLM will use + args = typing.get_args(annotation) + if len(args) < 2: + return args[0], None + + for a in args: if isinstance(a, TypeInfo): - return a + return args[0], a - return None + return args[0], None def _set_metadata( diff --git a/livekit-agents/livekit/agents/log.py b/livekit-agents/livekit/agents/log.py index f8236850c..7757aff59 100644 --- a/livekit-agents/livekit/agents/log.py +++ b/livekit-agents/livekit/agents/log.py @@ -1,6 +1,6 @@ import logging -DEV_LEVEL = 25 +DEV_LEVEL = 23 logging.addLevelName(DEV_LEVEL, "DEV") logger = logging.getLogger("livekit.agents") diff --git a/livekit-agents/livekit/agents/proto.py b/livekit-agents/livekit/agents/proto.py new file mode 100644 index 000000000..3fc3dbd31 --- /dev/null +++ b/livekit-agents/livekit/agents/proto.py 
@@ -0,0 +1,5 @@ +from typing import Literal, Union + +ATTR_AGENT_STATE = "lk.agent.state" + +AgentState = Union[Literal["initializing", "listening", "thinking", "speaking"], str] diff --git a/livekit-agents/livekit/agents/tokenize/__init__.py b/livekit-agents/livekit/agents/tokenize/__init__.py index 1a9eafb57..5b18d0e29 100644 --- a/livekit-agents/livekit/agents/tokenize/__init__.py +++ b/livekit-agents/livekit/agents/tokenize/__init__.py @@ -1,4 +1,4 @@ -from . import basic +from . import basic, utils from .token_stream import ( BufferedSentenceStream, BufferedWordStream, @@ -20,4 +20,5 @@ "BufferedSentenceStream", "BufferedWordStream", "basic", + "utils", ] diff --git a/livekit-agents/livekit/agents/tokenize/_basic_paragraph.py b/livekit-agents/livekit/agents/tokenize/_basic_paragraph.py index 726515103..263a87f33 100644 --- a/livekit-agents/livekit/agents/tokenize/_basic_paragraph.py +++ b/livekit-agents/livekit/agents/tokenize/_basic_paragraph.py @@ -1,12 +1,18 @@ -def split_paragraphs(text: str) -> list[str]: - sep = "\n\n" - - paragraphs = text.split(sep) - new_paragraphs = [] - for p in paragraphs: - p = p.strip() - if not p: - continue - new_paragraphs.append(p) - - return new_paragraphs +import re + + +def split_paragraphs(text: str) -> list[tuple[str, int, int]]: + """ + Split the text into paragraphs. + Returns a list of paragraphs with their start and end indices of the original text. 
+ """ + matches = re.finditer(r"\n{2,}", text) + paragraphs = [] + + for match in matches: + paragraph = match.group(0) + start_pos = match.start() + end_pos = match.end() + paragraphs.append((paragraph.strip(), start_pos, end_pos)) + + return paragraphs diff --git a/livekit-agents/livekit/agents/tokenize/_basic_sent.py b/livekit-agents/livekit/agents/tokenize/_basic_sent.py index 1e8721dc5..9b33fc4e2 100644 --- a/livekit-agents/livekit/agents/tokenize/_basic_sent.py +++ b/livekit-agents/livekit/agents/tokenize/_basic_sent.py @@ -1,9 +1,13 @@ import re -# rule based segmentation from https://stackoverflow.com/a/31505798, works surprisingly well -def split_sentences(text: str, min_sentence_len: int = 20) -> list[str]: - """the text can't contains substrings "" or """" +# rule based segmentation based on https://stackoverflow.com/a/31505798, works surprisingly well +def split_sentences( + text: str, min_sentence_len: int = 20 +) -> list[tuple[str, int, int]]: + """ + the text may not contain substrings "" or "" + """ alphabets = r"([A-Za-z])" prefixes = r"(Mr|St|Mrs|Ms|Dr)[.]" suffixes = r"(Inc|Ltd|Jr|Sr|Co)" @@ -14,12 +18,11 @@ def split_sentences(text: str, min_sentence_len: int = 20) -> list[str]: multiple_dots = r"\.{2,}" # fmt: off - text = " " + text + " " text = text.replace("\n"," ") - text = re.sub(prefixes,"\\1",text) - text = re.sub(websites,"\\1",text) + text = re.sub(prefixes,"\\1", text) + text = re.sub(websites,"\\1", text) text = re.sub(digits + "[.]" + digits,"\\1\\2",text) - #text = re.sub(multiple_dots, lambda match: "" * len(match.group(0)) + "", text) + # text = re.sub(multiple_dots, lambda match: "" * len(match.group(0)) + "", text) # TODO(theomonnom): need improvement for ""..." 
dots", check capital + next sentence should not be # small text = re.sub(multiple_dots, lambda match: "" * len(match.group(0)), text) @@ -44,21 +47,29 @@ def split_sentences(text: str, min_sentence_len: int = 20) -> list[str]: text = text.replace("?","?") text = text.replace("!","!") text = text.replace("",".") - sentences = text.split("") - sentences = [s.strip() for s in sentences] - if sentences and not sentences[-1]: - sentences = sentences[:-1] # fmt: on - new_sentences = [] + splitted_sentences = text.split("") + text = text.replace("", "") + + sentences: list[tuple[str, int, int]] = [] + buff = "" - for sentence in sentences: + start_pos = 0 + end_pos = 0 + for match in splitted_sentences: + sentence = match.strip() + if not sentence: + continue + buff += " " + sentence + end_pos += len(match) if len(buff) > min_sentence_len: - new_sentences.append(buff[1:]) + sentences.append((buff[1:], start_pos, end_pos)) + start_pos = end_pos buff = "" if buff: - new_sentences.append(buff[1:]) + sentences.append((buff[1:], start_pos, len(text) - 1)) - return new_sentences + return sentences diff --git a/livekit-agents/livekit/agents/tokenize/_basic_word.py b/livekit-agents/livekit/agents/tokenize/_basic_word.py index e19f8bac6..109ee7160 100644 --- a/livekit-agents/livekit/agents/tokenize/_basic_word.py +++ b/livekit-agents/livekit/agents/tokenize/_basic_word.py @@ -1,22 +1,31 @@ import re +from . 
import tokenizer -def split_words(text: str, ignore_punctuation: bool = True) -> list[str]: - # fmt: off - punctuations = [".", ",", "!", "?", ";", ":", "'", '"', "(", ")", "[", "]", "{", "}", "<", ">", - "—"] - # fmt: on - - if ignore_punctuation: - for p in punctuations: - # TODO(theomonnom): Ignore acronyms - text = text.replace(p, "") - - words = re.split("[ \n]+", text) - new_words = [] - for word in words: - if not word: - continue # ignore empty - new_words.append(word) - - return new_words + +def split_words( + text: str, ignore_punctuation: bool = True +) -> list[tuple[str, int, int]]: + """ + Split the text into words. + Returns a list of words with their start and end indices of the original text. + """ + matches = re.finditer(r"\S+", text) + words: list[tuple[str, int, int]] = [] + + for match in matches: + word = match.group(0) + start_pos = match.start() + end_pos = match.end() + + if ignore_punctuation: + # TODO(theomonnom): acronyms passthrough + translation_table = str.maketrans("", "", "".join(tokenizer.PUNCTUATIONS)) + word = word.translate(translation_table) + + if not word: + continue + + words.append((word, start_pos, end_pos)) + + return words diff --git a/livekit-agents/livekit/agents/tokenize/basic.py b/livekit-agents/livekit/agents/tokenize/basic.py index fd8f84c22..70bbd09cd 100644 --- a/livekit-agents/livekit/agents/tokenize/basic.py +++ b/livekit-agents/livekit/agents/tokenize/basic.py @@ -45,9 +45,12 @@ def __init__( ) def tokenize(self, text: str, *, language: str | None = None) -> list[str]: - return _basic_sent.split_sentences( - text, min_sentence_len=self._config.min_sentence_len - ) + return [ + tok[0] + for tok in _basic_sent.split_sentences( + text, min_sentence_len=self._config.min_sentence_len + ) + ] def stream(self, *, language: str | None = None) -> tokenizer.SentenceStream: return token_stream.BufferedSentenceStream( @@ -65,9 +68,12 @@ def __init__(self, *, ignore_punctuation: bool = True) -> None: 
self._ignore_punctuation = ignore_punctuation def tokenize(self, text: str, *, language: str | None = None) -> list[str]: - return _basic_word.split_words( - text, ignore_punctuation=self._ignore_punctuation - ) + return [ + tok[0] + for tok in _basic_word.split_words( + text, ignore_punctuation=self._ignore_punctuation + ) + ] def stream(self, *, language: str | None = None) -> tokenizer.WordStream: return token_stream.BufferedWordStream( @@ -84,4 +90,4 @@ def hyphenate_word(word: str) -> list[str]: def tokenize_paragraphs(text: str) -> list[str]: - return _basic_paragraph.split_paragraphs(text) + return [tok[0] for tok in _basic_paragraph.split_paragraphs(text)] diff --git a/livekit-agents/livekit/agents/tokenize/token_stream.py b/livekit-agents/livekit/agents/tokenize/token_stream.py index 9be14e2ec..a7e09734d 100644 --- a/livekit-agents/livekit/agents/tokenize/token_stream.py +++ b/livekit-agents/livekit/agents/tokenize/token_stream.py @@ -1,16 +1,21 @@ from __future__ import annotations -from typing import Callable +import typing +from typing import Callable, Union from ..utils import aio, shortuuid from .tokenizer import SentenceStream, TokenData, WordStream +# Tokenizers can either provide us with a list of tokens or a list of tokens along with their start and end indices. +# If the start and end indices are not available, we attempt to locate the token within the text using str.find. 
+TokenizeCallable = Callable[[str], Union[list[str], list[tuple[str, int, int]]]] + class BufferedTokenStream: def __init__( self, *, - tokenize_fnc: Callable[[str], list[str]], + tokenize_fnc: TokenizeCallable, min_token_len: int, min_ctx_len: int, ) -> None: @@ -21,53 +26,68 @@ def __init__( self._current_segment_id = shortuuid() self._buf_tokens: list[str] = [] # <= min_token_len - self._buf = "" + self._in_buf = "" + self._out_buf = "" + @typing.no_type_check def push_text(self, text: str) -> None: self._check_not_closed() - self._buf += text + self._in_buf += text - if len(self._buf) < self._min_ctx_len: + if len(self._in_buf) < self._min_ctx_len: return - tokens = self._tokenize_fnc(self._buf) + while True: + tokens = self._tokenize_fnc(self._in_buf) + if len(tokens) <= 1: + break - buf_toks = [] - buf = "" - while len(tokens) > 1: - if buf: - buf += " " + if self._out_buf: + self._out_buf += " " tok = tokens.pop(0) - buf += tok - buf_toks.append(tok) - if len(buf) >= self._min_token_len: + tok_text = tok + if isinstance(tok, tuple): + tok_text = tok[0] + + self._out_buf += tok_text + if len(self._out_buf) >= self._min_token_len: self._event_ch.send_nowait( - TokenData(token=buf, segment_id=self._current_segment_id) + TokenData(token=self._out_buf, segment_id=self._current_segment_id) ) - for i, tok in enumerate(buf_toks): - tok_i = self._buf.find(tok) - self._buf = self._buf[tok_i + len(tok) :].lstrip() + self._out_buf = "" - buf_toks = [] - buf = "" + if isinstance(tok, tuple): + self._in_buf = self._in_buf[tok[2] :] + else: + tok_i = max(self._in_buf.find(tok), 0) + self._in_buf = self._in_buf[tok_i + len(tok) :].lstrip() + @typing.no_type_check def flush(self) -> None: self._check_not_closed() - if self._buf: - tokens = self._tokenize_fnc(self._buf) + + if self._in_buf or self._out_buf: + tokens = self._tokenize_fnc(self._in_buf) if tokens: - buf = " ".join(tokens) - else: - buf = self._buf + if self._out_buf: + self._out_buf += " " + + if 
isinstance(tokens[0], tuple): + self._out_buf += " ".join([tok[0] for tok in tokens]) + else: + self._out_buf += " ".join(tokens) + + if self._out_buf: + self._event_ch.send_nowait( + TokenData(token=self._out_buf, segment_id=self._current_segment_id) + ) - self._event_ch.send_nowait( - TokenData(token=buf, segment_id=self._current_segment_id) - ) self._current_segment_id = shortuuid() - self._buf = "" + self._in_buf = "" + self._out_buf = "" def end_input(self) -> None: self.flush() @@ -92,7 +112,7 @@ class BufferedSentenceStream(BufferedTokenStream, SentenceStream): def __init__( self, *, - tokenizer: Callable[[str], list[str]], + tokenizer: TokenizeCallable, min_token_len: int, min_ctx_len: int, ) -> None: @@ -107,7 +127,7 @@ class BufferedWordStream(BufferedTokenStream, WordStream): def __init__( self, *, - tokenizer: Callable[[str], list[str]], + tokenizer: TokenizeCallable, min_token_len: int, min_ctx_len: int, ) -> None: diff --git a/livekit-agents/livekit/agents/tokenize/tokenizer.py b/livekit-agents/livekit/agents/tokenize/tokenizer.py index c4734a204..b785edb0e 100644 --- a/livekit-agents/livekit/agents/tokenize/tokenizer.py +++ b/livekit-agents/livekit/agents/tokenize/tokenizer.py @@ -6,6 +6,12 @@ from ..utils import aio +# fmt: off +PUNCTUATIONS = ['!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', ':', ';', '<', '=', '>', + '?', '@', '[', '\\', ']', '^', '_', '`', '{', '|', '}', '~', '±', '—', '‘', '’', '“', '”', '…'] + +# fmt: on + @dataclass class TokenData: diff --git a/livekit-agents/livekit/agents/tokenize/utils.py b/livekit-agents/livekit/agents/tokenize/utils.py new file mode 100644 index 000000000..82e8b302c --- /dev/null +++ b/livekit-agents/livekit/agents/tokenize/utils.py @@ -0,0 +1,82 @@ +from __future__ import annotations + +from typing import AsyncIterable, overload + +from . import _basic_word, tokenizer + + +@overload +def replace_words( + *, + text: str, + replacements: dict[str, str], +) -> str: ... 
+ + +@overload +def replace_words( + *, + text: AsyncIterable[str], + replacements: dict[str, str], +) -> AsyncIterable[str]: ... + + +def replace_words( + *, + text: str | AsyncIterable[str], + replacements: dict[str, str], +) -> str | AsyncIterable[str]: + """ + Replace words in the given (async) text. The replacements are case-insensitive and the + replacement will keep the case of the original word. + Args: + text: text to replace words in + replacements: dictionary of words to replace + """ + + replacements = {k.lower(): v for k, v in replacements.items()} + + def _process_words(text, words): + offset = 0 + processed_index = 0 + for word, start_index, end_index in words: + no_punctuation = word.rstrip("".join(tokenizer.PUNCTUATIONS)) + punctuation_off = len(word) - len(no_punctuation) + replacement = replacements.get(no_punctuation.lower()) + if replacement: + text = ( + text[: start_index + offset] + + replacement + + text[end_index + offset - punctuation_off :] + ) + offset += len(replacement) - len(word) + punctuation_off + + processed_index = end_index + offset + + return text, processed_index + + if isinstance(text, str): + words = _basic_word.split_words(text, ignore_punctuation=False) + text, _ = _process_words(text, words) + return text + else: + + async def _replace_words(): + buffer = "" + async for chunk in text: + buffer += chunk + words = _basic_word.split_words(buffer, ignore_punctuation=False) + + if len(words) <= 1: + continue + + buffer, processed_index = _process_words(buffer, words[:-1]) + yield buffer[:processed_index] + buffer = buffer[processed_index:] + + if buffer: + words = _basic_word.split_words(buffer, ignore_punctuation=False) + buffer, _ = _process_words(buffer, words) + yield buffer + + return _replace_words() diff --git a/livekit-agents/livekit/agents/transcription/stt_forwarder.py b/livekit-agents/livekit/agents/transcription/stt_forwarder.py index 410b343eb..0d526a3a6 --- 
a/livekit-agents/livekit/agents/transcription/stt_forwarder.py +++ b/livekit-agents/livekit/agents/transcription/stt_forwarder.py @@ -10,13 +10,16 @@ from ..log import logger from . import _utils -WillForwardTranscription = Callable[ +BeforeForwardCallback = Callable[ ["STTSegmentsForwarder", rtc.Transcription], Union[rtc.Transcription, Awaitable[Optional[rtc.Transcription]]], ] -def _default_will_forward_transcription( +WillForwardTranscription = BeforeForwardCallback + + +def _default_before_forward_cb( fwd: STTSegmentsForwarder, transcription: rtc.Transcription ) -> rtc.Transcription: return transcription @@ -33,7 +36,9 @@ def __init__( room: rtc.Room, participant: rtc.Participant | str, track: rtc.Track | rtc.TrackPublication | str | None = None, - will_forward_transcription: WillForwardTranscription = _default_will_forward_transcription, + before_forward_cb: BeforeForwardCallback = _default_before_forward_cb, + # backward compatibility + will_forward_transcription: WillForwardTranscription | None = None, ): identity = participant if isinstance(participant, str) else participant.identity if track is None: @@ -41,8 +46,14 @@ def __init__( elif isinstance(track, (rtc.TrackPublication, rtc.Track)): track = track.sid + if will_forward_transcription is not None: + logger.warning( + "will_forward_transcription is deprecated and will be removed in 1.5.0, use before_forward_cb instead", + ) + before_forward_cb = will_forward_transcription + self._room, self._participant_identity, self._track_id = room, identity, track - self._will_forward_transcription = will_forward_transcription + self._before_forward_cb = before_forward_cb self._queue = asyncio.Queue[Optional[rtc.TranscriptionSegment]]() self._main_task = asyncio.create_task(self._run()) self._current_id = _utils.segment_uuid() @@ -60,16 +71,12 @@ async def _run(self): segments=[seg], # no history for now ) - transcription = self._will_forward_transcription( - self, base_transcription - ) + transcription = 
self._before_forward_cb(self, base_transcription) if asyncio.iscoroutine(transcription): transcription = await transcription if not isinstance(transcription, rtc.Transcription): - transcription = _default_will_forward_transcription( - self, base_transcription - ) + transcription = _default_before_forward_cb(self, base_transcription) if transcription.segments and self._room.isconnected(): await self._room.local_participant.publish_transcription( diff --git a/livekit-agents/livekit/agents/transcription/tts_forwarder.py b/livekit-agents/livekit/agents/transcription/tts_forwarder.py index d613e2970..867e86732 100644 --- a/livekit-agents/livekit/agents/transcription/tts_forwarder.py +++ b/livekit-agents/livekit/agents/transcription/tts_forwarder.py @@ -3,27 +3,31 @@ import asyncio import contextlib import time -from collections import deque from dataclasses import dataclass -from typing import Awaitable, Callable, Deque, Optional, Union +from typing import Awaitable, Callable, Optional, Union from livekit import rtc +from livekit.rtc.participant import PublishTranscriptionError from .. import tokenize, utils from ..log import logger +from ..tokenize.tokenizer import PUNCTUATIONS from . import _utils # 3.83 is the "baseline", the number of hyphens per second TTS returns in avg. 
STANDARD_SPEECH_RATE = 3.83 -WillForwardTranscription = Callable[ +BeforeForwardCallback = Callable[ ["TTSSegmentsForwarder", rtc.Transcription], Union[rtc.Transcription, Awaitable[Optional[rtc.Transcription]]], ] -def _default_will_forward_transcription( +WillForwardTranscription = BeforeForwardCallback + + +def _default_before_forward_callback( fwd: TTSSegmentsForwarder, transcription: rtc.Transcription ) -> rtc.Transcription: return transcription @@ -40,27 +44,23 @@ class _TTSOptions: sentence_tokenizer: tokenize.SentenceTokenizer hyphenate_word: Callable[[str], list[str]] new_sentence_delay: float - will_forward_transcription: WillForwardTranscription + before_forward_cb: BeforeForwardCallback @dataclass -class _SegmentData: - segment_index: int - sentence_stream: tokenize.SentenceStream - pushed_text: str = "" +class _AudioData: pushed_duration: float = 0.0 - real_speed: float | None = None - processed_sentences: int = 0 - processed_hyphens: int = 0 - validated: bool = False - forward_start_time: float | None = 0.0 + done: bool = False @dataclass -class _FormingSegments: - audio: _SegmentData - text: _SegmentData - q: deque[_SegmentData] +class _TextData: + sentence_stream: tokenize.SentenceStream + pushed_text: str = "" + done: bool = False + + forwarded_hyphens: int = 0 + forwarded_sentences: int = 0 class TTSSegmentsForwarder: @@ -83,8 +83,10 @@ def __init__( word_tokenizer: tokenize.WordTokenizer = tokenize.basic.WordTokenizer(), sentence_tokenizer: tokenize.SentenceTokenizer = tokenize.basic.SentenceTokenizer(), hyphenate_word: Callable[[str], list[str]] = tokenize.basic.hyphenate_word, - will_forward_transcription: WillForwardTranscription = _default_will_forward_transcription, + before_forward_cb: BeforeForwardCallback = _default_before_forward_callback, loop: asyncio.AbstractEventLoop | None = None, + # backward compatibility + will_forward_transcription: WillForwardTranscription | None = None, ): """ Args: @@ -109,6 +111,12 @@ def __init__( elif 
isinstance(track, (rtc.TrackPublication, rtc.Track)): track = track.sid + if will_forward_transcription is not None: + logger.warning( + "will_forward_transcription is deprecated and will be removed in 1.5.0, use before_forward_cb instead", + ) + before_forward_cb = will_forward_transcription + speed = speed * STANDARD_SPEECH_RATE self._opts = _TTSOptions( room=room, @@ -120,31 +128,28 @@ def __init__( sentence_tokenizer=sentence_tokenizer, hyphenate_word=hyphenate_word, new_sentence_delay=new_sentence_delay, - will_forward_transcription=will_forward_transcription, + before_forward_cb=before_forward_cb, ) self._closed = False self._loop = loop or asyncio.get_event_loop() self._close_future = asyncio.Future[None]() - self._next_segment_index = 0 self._playing_seg_index = -1 self._finshed_seg_index = -1 - first_segment = self._create_segment() - segments_q: Deque[_SegmentData] = deque() - segments_q.append(first_segment) + self._text_q_changed = asyncio.Event() + self._text_q = list[Union[_TextData, None]]() + self._audio_q_changed = asyncio.Event() + self._audio_q = list[Union[_AudioData, None]]() - self._forming_segments = _FormingSegments( - audio=first_segment, text=first_segment, q=segments_q - ) + self._text_data: _TextData | None = None + self._audio_data: _AudioData | None = None + + self._played_text = "" - self._seg_queue = asyncio.Queue[Optional[_SegmentData]]() - self._seg_queue.put_nowait(first_segment) self._main_atask = self._loop.create_task(self._main_task()) self._task_set = utils.aio.TaskSet(loop) - self._played_text = "" - def segment_playout_started(self) -> None: """ Notify that the playout of the audio segment has started. 
@@ -164,47 +169,48 @@ def segment_playout_finished(self) -> None: def push_audio(self, frame: rtc.AudioFrame) -> None: self._check_not_closed() + + if self._audio_data is None: + self._audio_data = _AudioData() + self._audio_q.append(self._audio_data) + self._audio_q_changed.set() + frame_duration = frame.samples_per_channel / frame.sample_rate - cur_seg = self._forming_segments.audio - cur_seg.pushed_duration += frame_duration - cur_seg.validated = True + self._audio_data.pushed_duration += frame_duration def mark_audio_segment_end(self) -> None: self._check_not_closed() - try: - # get last ended segment (text always end before audio) - seg = self._forming_segments.q.popleft() - except IndexError: - raise IndexError( - "mark_audio_segment_end called before any mark_text_segment_end" - ) - if seg.pushed_duration > 0.0: - seg.real_speed = ( - len(self._calc_hyphens(seg.pushed_text)) / seg.pushed_duration - ) + if self._audio_data is None: + self.push_audio(rtc.AudioFrame(bytes(), 24000, 1, 0)) - seg.validated = True - self._forming_segments.audio = self._forming_segments.q[0] + assert self._audio_data is not None + self._audio_data.done = True + self._audio_data = None def push_text(self, text: str) -> None: self._check_not_closed() - cur_seg = self._forming_segments.text - cur_seg.pushed_text += text - cur_seg.sentence_stream.push_text(text) + + if self._text_data is None: + self._text_data = _TextData( + sentence_stream=self._opts.sentence_tokenizer.stream() + ) + self._text_q.append(self._text_data) + self._text_q_changed.set() + + self._text_data.pushed_text += text + self._text_data.sentence_stream.push_text(text) def mark_text_segment_end(self) -> None: self._check_not_closed() - stream = self._forming_segments.text.sentence_stream - stream.end_input() - # create a new segment on "mark_text_segment_end" - # further text can already be pushed even if mark_audio_segment_end has not been - # called yet - new_seg = self._create_segment() - 
self._forming_segments.text = new_seg - self._forming_segments.q.append(new_seg) - self._seg_queue.put_nowait(new_seg) + if self._text_data is None: + self.push_text("") + + assert self._text_data is not None + self._text_data.done = True + self._text_data.sentence_stream.end_input() + self._text_data = None @property def closed(self) -> bool: @@ -220,10 +226,15 @@ async def aclose(self) -> None: self._closed = True self._close_future.set_result(None) - self._seg_queue.put_nowait(None) - for seg in self._forming_segments.q: - await seg.sentence_stream.aclose() + for text_data in self._text_q: + assert text_data is not None + await text_data.sentence_stream.aclose() + + self._text_q.append(None) + self._audio_q.append(None) + self._text_q_changed.set() + self._audio_q_changed.set() await self._task_set.aclose() await self._main_atask @@ -231,78 +242,105 @@ async def aclose(self) -> None: @utils.log_exceptions(logger=logger) async def _main_task(self) -> None: """Main task that forwards the transcription to the room.""" - rtc_seg_q = asyncio.Queue[Optional[rtc.TranscriptionSegment]]() + rtc_seg_ch = utils.aio.Chan[rtc.TranscriptionSegment]() @utils.log_exceptions(logger=logger) async def _forward_task(): - while True: - seg = await rtc_seg_q.get() - if seg is None: - break - + async for rtc_seg in rtc_seg_ch: base_transcription = rtc.Transcription( participant_identity=self._opts.participant_identity, track_sid=self._opts.track_id, - segments=[seg], # no history for now + segments=[rtc_seg], # no history for now ) - transcription = self._opts.will_forward_transcription( - self, base_transcription - ) + transcription = self._opts.before_forward_cb(self, base_transcription) if asyncio.iscoroutine(transcription): transcription = await transcription # fallback to default impl if no custom/user stream is returned if not isinstance(transcription, rtc.Transcription): - transcription = _default_will_forward_transcription( + transcription = _default_before_forward_callback( 
self, base_transcription ) if transcription.segments and self._opts.room.isconnected(): - await self._opts.room.local_participant.publish_transcription( - transcription - ) + try: + await self._opts.room.local_participant.publish_transcription( + transcription + ) + except PublishTranscriptionError: + continue forward_task = asyncio.create_task(_forward_task()) - while True: - seg = await self._seg_queue.get() - if seg is None: - break + seg_index = 0 + q_done = False + while not q_done: + await self._text_q_changed.wait() + await self._audio_q_changed.wait() + + while self._text_q and self._audio_q: + text_data = self._text_q.pop(0) + audio_data = self._audio_q.pop(0) - # wait until the segment is validated and has started playing - while not self._closed: - if seg.validated and self._playing_seg_index >= seg.segment_index: + if text_data is None or audio_data is None: + q_done = True break - await self._sleep_if_not_closed(0.1) + # wait until the segment is validated and has started playing + while not self._closed: + if self._playing_seg_index >= seg_index: + break + + await self._sleep_if_not_closed(0.125) - sentence_stream = seg.sentence_stream - seg.forward_start_time = time.time() + sentence_stream = text_data.sentence_stream + forward_start_time = time.time() + + async for ev in sentence_stream: + await self._sync_sentence_co( + seg_index, + forward_start_time, + text_data, + audio_data, + ev.token, + rtc_seg_ch, + ) - async for ev in sentence_stream: - await self._sync_sentence_co(seg, ev.token, rtc_seg_q) + seg_index += 1 - rtc_seg_q.put_nowait(None) + self._text_q_changed.clear() + self._audio_q_changed.clear() + + rtc_seg_ch.close() await forward_task async def _sync_sentence_co( self, - seg: _SegmentData, - tokenized_sentence: str, - rtc_seg_q: asyncio.Queue[Optional[rtc.TranscriptionSegment]], + segment_index: int, + segment_start_time: float, + text_data: _TextData, + audio_data: _AudioData, + sentence: str, + rtc_seg_ch: 
utils.aio.Chan[rtc.TranscriptionSegment], ): """Synchronize the transcription with the audio playout for a given sentence.""" - assert seg.forward_start_time is not None - # put each sentence in a different transcription segment + + real_speed = None + if audio_data.pushed_duration > 0 and audio_data.done: + real_speed = ( + len(self._calc_hyphens(text_data.pushed_text)) + / audio_data.pushed_duration + ) + seg_id = _utils.segment_uuid() - words = self._opts.word_tokenizer.tokenize(text=tokenized_sentence) + words = self._opts.word_tokenizer.tokenize(text=sentence) processed_words: list[str] = [] og_text = self._played_text for word in words: - if seg.segment_index <= self._finshed_seg_index: + if segment_index <= self._finshed_seg_index: # playout of the audio segment already finished # break the loop and send the final transcription break @@ -315,19 +353,22 @@ async def _sync_sentence_co( processed_words.append(word) # elapsed time since the start of the seg - elapsed_time = time.time() - seg.forward_start_time + elapsed_time = time.time() - segment_start_time text = self._opts.word_tokenizer.format_words(processed_words) + # remove any punctuation at the end of a non-final transcript + text = text.rstrip("".join(PUNCTUATIONS)) + speed = self._opts.speed - if seg.real_speed is not None: - speed = seg.real_speed + if real_speed is not None: + speed = real_speed estimated_pauses_s = ( - seg.processed_sentences * self._opts.new_sentence_delay + text_data.forwarded_sentences * self._opts.new_sentence_delay ) hyph_pauses = estimated_pauses_s * speed target_hyphens = round(speed * elapsed_time) - dt = target_hyphens - seg.processed_hyphens - hyph_pauses + dt = target_hyphens - text_data.forwarded_hyphens - hyph_pauses to_wait_hyphens = max(0.0, word_hyphens - dt) delay = to_wait_hyphens / speed else: @@ -335,7 +376,8 @@ async def _sync_sentence_co( first_delay = min(delay / 2, 2 / speed) await self._sleep_if_not_closed(first_delay) - rtc_seg_q.put_nowait( + + 
rtc_seg_ch.send_nowait( rtc.TranscriptionSegment( id=seg_id, text=text, @@ -346,23 +388,24 @@ async def _sync_sentence_co( ) ) self._played_text = f"{og_text} {text}" + await self._sleep_if_not_closed(delay - first_delay) - seg.processed_hyphens += word_hyphens + text_data.forwarded_hyphens += word_hyphens - rtc_seg_q.put_nowait( + rtc_seg_ch.send_nowait( rtc.TranscriptionSegment( id=seg_id, - text=tokenized_sentence, + text=sentence, start_time=0, end_time=0, final=True, language=self._opts.language, ) ) - self._played_text = f"{og_text} {tokenized_sentence}" + self._played_text = f"{og_text} {sentence}" await self._sleep_if_not_closed(self._opts.new_sentence_delay) - seg.processed_sentences += 1 + text_data.forwarded_sentences += 1 async def _sleep_if_not_closed(self, delay: float) -> None: with contextlib.suppress(asyncio.TimeoutError): @@ -377,14 +420,6 @@ def _calc_hyphens(self, text: str) -> list[str]: return hyphens - def _create_segment(self) -> _SegmentData: - data = _SegmentData( - segment_index=self._next_segment_index, - sentence_stream=self._opts.sentence_tokenizer.stream(), - ) - self._next_segment_index += 1 - return data - def _check_not_closed(self) -> None: if self._closed: raise RuntimeError("TTSForwarder is closed") diff --git a/livekit-agents/livekit/agents/utils/aio/__init__.py b/livekit-agents/livekit/agents/utils/aio/__init__.py index 803e12f73..df97e26e9 100644 --- a/livekit-agents/livekit/agents/utils/aio/__init__.py +++ b/livekit-agents/livekit/agents/utils/aio/__init__.py @@ -1,7 +1,7 @@ import asyncio -import contextlib +import functools -from . import debug, duplex_unix +from . 
import debug, duplex_unix, itertools from .channel import Chan, ChanClosed, ChanReceiver, ChanSender from .interval import Interval, interval from .sleep import Sleep, SleepFinished, sleep @@ -9,11 +9,28 @@ async def gracefully_cancel(*futures: asyncio.Future): - for f in futures: - f.cancel() + loop = asyncio.get_running_loop() + waiters = [] - with contextlib.suppress(asyncio.CancelledError): - await asyncio.gather(*futures) + for fut in futures: + waiter = loop.create_future() + cb = functools.partial(_release_waiter, waiter) + waiters.append((waiter, cb)) + fut.add_done_callback(cb) + fut.cancel() + + try: + for waiter, _ in waiters: + await waiter + finally: + for i, fut in enumerate(futures): + _, cb = waiters[i] + fut.remove_done_callback(cb) + + +def _release_waiter(waiter, *args): + if not waiter.done(): + waiter.set_result(None) __all__ = [ @@ -31,4 +48,5 @@ async def gracefully_cancel(*futures: asyncio.Future): "debug", "gracefully_cancel", "duplex_unix", + "itertools", ] diff --git a/livekit-agents/livekit/agents/utils/aio/duplex_unix.py b/livekit-agents/livekit/agents/utils/aio/duplex_unix.py index de9b1c446..a679c2ed2 100644 --- a/livekit-agents/livekit/agents/utils/aio/duplex_unix.py +++ b/livekit-agents/livekit/agents/utils/aio/duplex_unix.py @@ -36,8 +36,7 @@ async def recv_bytes(self) -> bytes: len = struct.unpack("!I", len_bytes)[0] return await self._reader.readexactly(len) except ( - BrokenPipeError, - ConnectionResetError, + OSError, EOFError, asyncio.IncompleteReadError, ): @@ -49,7 +48,7 @@ async def send_bytes(self, data: bytes) -> None: self._writer.write(len_bytes) self._writer.write(data) await self._writer.drain() - except (ConnectionResetError, BrokenPipeError): + except OSError: raise DuplexClosed() async def aclose(self) -> None: @@ -57,7 +56,7 @@ async def aclose(self) -> None: self._writer.close() await self._writer.wait_closed() self._sock.close() - except (BrokenPipeError, ConnectionResetError): + except OSError: raise 
DuplexClosed() @@ -80,25 +79,31 @@ def open(sock: socket.socket) -> _Duplex: return _Duplex(sock) def recv_bytes(self) -> bytes: - assert self._sock is not None + if self._sock is None: + raise DuplexClosed() + try: len_bytes = _read_exactly(self._sock, 4) len = struct.unpack("!I", len_bytes)[0] return _read_exactly(self._sock, len) - except (BrokenPipeError, ConnectionResetError, EOFError): + except (OSError, EOFError): raise DuplexClosed() def send_bytes(self, data: bytes) -> None: - assert self._sock is not None + if self._sock is None: + raise DuplexClosed() + try: len_bytes = struct.pack("!I", len(data)) self._sock.sendall(len_bytes) self._sock.sendall(data) - except (BrokenPipeError, ConnectionResetError): + except OSError: raise DuplexClosed() def detach(self) -> socket.socket: - assert self._sock is not None + if self._sock is None: + raise DuplexClosed() + sock = self._sock self._sock = None return sock @@ -108,5 +113,5 @@ def close(self) -> None: if self._sock is not None: self._sock.close() self._sock = None - except (BrokenPipeError, ConnectionResetError): + except OSError: raise DuplexClosed() diff --git a/livekit-agents/livekit/agents/utils/aio/itertools.py b/livekit-agents/livekit/agents/utils/aio/itertools.py new file mode 100644 index 000000000..0076f8eb5 --- /dev/null +++ b/livekit-agents/livekit/agents/utils/aio/itertools.py @@ -0,0 +1,114 @@ +import asyncio +from collections import deque +from typing import ( + Any, + AsyncGenerator, + AsyncIterable, + AsyncIterator, + Deque, + Generic, + Iterator, + List, + Protocol, + Tuple, + TypeVar, + Union, + overload, + runtime_checkable, +) + +from typing_extensions import AsyncContextManager + +# based on https://github.com/maxfischer2781/asyncstdlib/blob/master/asyncstdlib/itertools.py + + +@runtime_checkable +class _ACloseable(Protocol): + async def aclose(self) -> None: + """Asynchronously close this object""" + + +T = TypeVar("T") + + +async def tee_peer( + iterator: AsyncIterator[T], + buffer: 
Deque[T], + peers: List[Deque[T]], + lock: AsyncContextManager[Any], +) -> AsyncGenerator[T, None]: + try: + while True: + if not buffer: + async with lock: + if buffer: + continue + try: + item = await iterator.__anext__() + except StopAsyncIteration: + break + else: + for peer_buffer in peers: + peer_buffer.append(item) + yield buffer.popleft() + finally: + for idx, peer_buffer in enumerate(peers): # pragma: no branch + if peer_buffer is buffer: + peers.pop(idx) + break + + if not peers and isinstance(iterator, _ACloseable): + await iterator.aclose() + + +class Tee(Generic[T]): + __slots__ = ("_iterator", "_buffers", "_children") + + def __init__( + self, + iterator: AsyncIterable[T], + n: int = 2, + ): + self._iterator = iterator.__aiter__() + self._buffers: List[Deque[T]] = [deque() for _ in range(n)] + + lock = asyncio.Lock() + self._children = tuple( + tee_peer( + iterator=self._iterator, + buffer=buffer, + peers=self._buffers, + lock=lock, + ) + for buffer in self._buffers + ) + + def __len__(self) -> int: + return len(self._children) + + @overload + def __getitem__(self, item: int) -> AsyncIterator[T]: ... + + @overload + def __getitem__(self, item: slice) -> Tuple[AsyncIterator[T], ...]: ... 
+ + def __getitem__( + self, item: Union[int, slice] + ) -> Union[AsyncIterator[T], Tuple[AsyncIterator[T], ...]]: + return self._children[item] + + def __iter__(self) -> Iterator[AsyncIterator[T]]: + yield from self._children + + async def __aenter__(self) -> "Tee[T]": + return self + + async def __aexit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None: + await self.aclose() + + async def aclose(self) -> None: + for child in self._children: + await child.aclose() + + +tee = Tee diff --git a/livekit-agents/livekit/agents/utils/audio.py b/livekit-agents/livekit/agents/utils/audio.py index 2c6975b0d..a8cbf1c17 100644 --- a/livekit-agents/livekit/agents/utils/audio.py +++ b/livekit-agents/livekit/agents/utils/audio.py @@ -18,7 +18,7 @@ def __init__( self._num_channels = num_channels if samples_per_channel is None: - samples_per_channel = sample_rate // 50 # 20ms by default + samples_per_channel = sample_rate // 10 # 100ms by default self._bytes_per_frame = ( num_channels * samples_per_channel * ctypes.sizeof(ctypes.c_int16) diff --git a/livekit-agents/livekit/agents/utils/misc.py b/livekit-agents/livekit/agents/utils/misc.py index 7720a53db..f85ae15b7 100644 --- a/livekit-agents/livekit/agents/utils/misc.py +++ b/livekit-agents/livekit/agents/utils/misc.py @@ -1,3 +1,5 @@ +from __future__ import annotations + import time import uuid from typing import List, Union diff --git a/livekit-agents/livekit/agents/vad.py b/livekit-agents/livekit/agents/vad.py index 55c64a5b8..ea42e9158 100644 --- a/livekit-agents/livekit/agents/vad.py +++ b/livekit-agents/livekit/agents/vad.py @@ -27,7 +27,11 @@ class VADEvent: silence_duration: float """duration of the silence in seconds""" frames: List[rtc.AudioFrame] = field(default_factory=list) - """list of audio frames of the speech""" + """list of audio frames of the speech + + start_of_speech: contains the complete audio chunks that triggered the detection + + end_of_speech: contains the complete user speech + """ probability:
float = 0.0 """smoothed probability of the speech (only for INFERENCE_DONE event)""" inference_duration: float = 0.0 @@ -65,7 +69,7 @@ def __init__(self): self._task.add_done_callback(lambda _: self._event_ch.close()) @abstractmethod - def _main_task(self) -> None: ... + async def _main_task(self) -> None: ... def push_frame(self, frame: rtc.AudioFrame) -> None: """Push some text to be synthesized""" diff --git a/livekit-agents/livekit/agents/version.py b/livekit-agents/livekit/agents/version.py index 3292eba82..654ad56ec 100644 --- a/livekit-agents/livekit/agents/version.py +++ b/livekit-agents/livekit/agents/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -__version__ = "0.8.5" +__version__ = "0.9.0" diff --git a/livekit-agents/livekit/agents/voice_assistant/__init__.py b/livekit-agents/livekit/agents/voice_assistant/__init__.py index c36009752..f151ac5d4 100644 --- a/livekit-agents/livekit/agents/voice_assistant/__init__.py +++ b/livekit-agents/livekit/agents/voice_assistant/__init__.py @@ -4,4 +4,8 @@ VoiceAssistant, ) -__all__ = ["VoiceAssistant", "AssistantCallContext", "AssistantTranscriptionOptions"] +__all__ = [ + "VoiceAssistant", + "AssistantCallContext", + "AssistantTranscriptionOptions", +] diff --git a/livekit-agents/livekit/agents/voice_assistant/agent_output.py b/livekit-agents/livekit/agents/voice_assistant/agent_output.py index ec50fde98..f7747af6b 100644 --- a/livekit-agents/livekit/agents/voice_assistant/agent_output.py +++ b/livekit-agents/livekit/agents/voice_assistant/agent_output.py @@ -2,7 +2,7 @@ import asyncio import time -from typing import Any, AsyncIterable, Callable, Union +from typing import Any, AsyncIterable, Awaitable, Callable, Union from livekit import rtc @@ -12,7 +12,7 @@ from .agent_playout import AgentPlayout, PlayoutHandle from .log import logger -SpeechSource = Union[AsyncIterable[str], str] +SpeechSource = Union[AsyncIterable[str], str, 
Awaitable[str]] class SynthesisHandle: @@ -20,13 +20,21 @@ def __init__( self, *, speech_id: str, - speech_source: SpeechSource, + tts_source: SpeechSource, + transcript_source: SpeechSource, agent_playout: AgentPlayout, tts: text_to_speech.TTS, transcription_fwd: agent_transcription.TTSSegmentsForwarder, ) -> None: - self._speech_source, self._agent_playout, self._tts, self._tr_fwd = ( - speech_source, + ( + self._tts_source, + self._transcript_source, + self._agent_playout, + self._tts, + self._tr_fwd, + ) = ( + tts_source, + transcript_source, agent_playout, tts, transcription_fwd, @@ -113,14 +121,15 @@ def synthesize( self, *, speech_id: str, - transcript: SpeechSource, + tts_source: SpeechSource, + transcript_source: SpeechSource, transcription: bool, transcription_speed: float, sentence_tokenizer: tokenize.SentenceTokenizer, word_tokenizer: tokenize.WordTokenizer, hyphenate_word: Callable[[str], list[str]], ) -> SynthesisHandle: - def _will_forward_transcription( + def _before_forward( fwd: agent_transcription.TTSSegmentsForwarder, transcription: rtc.Transcription, ): @@ -136,11 +145,12 @@ def _will_forward_transcription( sentence_tokenizer=sentence_tokenizer, word_tokenizer=word_tokenizer, hyphenate_word=hyphenate_word, - will_forward_transcription=_will_forward_transcription, + before_forward_cb=_before_forward, ) handle = SynthesisHandle( - speech_source=transcript, + tts_source=tts_source, + transcript_source=transcript_source, agent_playout=self._agent_playout, tts=self._tts, transcription_fwd=transcription_fwd, @@ -155,10 +165,16 @@ def _will_forward_transcription( @utils.log_exceptions(logger=logger) async def _synthesize_task(self, handle: SynthesisHandle) -> None: """Synthesize speech from the source""" - if isinstance(handle._speech_source, str): - co = _str_synthesis_task(handle._speech_source, handle) + tts_source = handle._tts_source + transcript_source = handle._transcript_source + + if isinstance(tts_source, Awaitable): + tts_source = await 
tts_source + co = _str_synthesis_task(tts_source, transcript_source, handle) + elif isinstance(tts_source, str): + co = _str_synthesis_task(tts_source, transcript_source, handle) else: - co = _stream_synthesis_task(handle._speech_source, handle) + co = _stream_synthesis_task(tts_source, transcript_source, handle) synth = asyncio.create_task(co) synth.add_done_callback(lambda _: handle._buf_ch.close()) @@ -171,17 +187,19 @@ async def _synthesize_task(self, handle: SynthesisHandle) -> None: @utils.log_exceptions(logger=logger) -async def _str_synthesis_task(text: str, handle: SynthesisHandle) -> None: +async def _str_synthesis_task( + tts_text: str, transcript: str, handle: SynthesisHandle +) -> None: """synthesize speech from a string""" if not handle.tts_forwarder.closed: - handle.tts_forwarder.push_text(text) + handle.tts_forwarder.push_text(transcript) handle.tts_forwarder.mark_text_segment_end() start_time = time.time() first_frame = True try: - async for audio in handle._tts.synthesize(text): + async for audio in handle._tts.synthesize(tts_text): if first_frame: first_frame = False logger.debug( @@ -206,7 +224,9 @@ async def _str_synthesis_task(text: str, handle: SynthesisHandle) -> None: @utils.log_exceptions(logger=logger) async def _stream_synthesis_task( - streamed_text: AsyncIterable[str], handle: SynthesisHandle + tts_source: AsyncIterable[str], + transcript_source: AsyncIterable[str], + handle: SynthesisHandle, ) -> None: """synthesize speech from streamed text""" @@ -232,33 +252,41 @@ async def _read_generated_audio_task(): handle._buf_ch.send_nowait(audio.frame) if handle._tr_fwd and not handle._tr_fwd.closed: - # mark_audio_segment_end must be called *after* mart_text_segment_end handle._tr_fwd.mark_audio_segment_end() + @utils.log_exceptions(logger=logger) + async def _read_transcript_task(): + async for seg in transcript_source: + if not handle._tr_fwd.closed: + handle._tr_fwd.push_text(seg) + + if not handle.tts_forwarder.closed: + 
handle.tts_forwarder.mark_text_segment_end() + # otherwise, stream the text to the TTS tts_stream = handle._tts.stream() - read_atask: asyncio.Task | None = None + read_tts_atask: asyncio.Task | None = None + read_transcript_atask: asyncio.Task | None = None try: - async for seg in streamed_text: - if not handle.tts_forwarder.closed: - handle.tts_forwarder.push_text(seg) - - if read_atask is None: + async for seg in tts_source: + if read_tts_atask is None: # start the task when we receive the first text segment (so start_time is more accurate) - read_atask = asyncio.create_task(_read_generated_audio_task()) + read_tts_atask = asyncio.create_task(_read_generated_audio_task()) + read_transcript_atask = asyncio.create_task(_read_transcript_task()) tts_stream.push_text(seg) - if not handle.tts_forwarder.closed: - handle.tts_forwarder.mark_text_segment_end() - tts_stream.end_input() - if read_atask is not None: - await read_atask + if read_tts_atask is not None: + assert read_transcript_atask is not None + await read_tts_atask + await read_transcript_atask + finally: - if read_atask is not None: - await utils.aio.gracefully_cancel(read_atask) + if read_tts_atask is not None: + assert read_transcript_atask is not None + await utils.aio.gracefully_cancel(read_tts_atask, read_transcript_atask) await tts_stream.aclose() diff --git a/livekit-agents/livekit/agents/voice_assistant/agent_playout.py b/livekit-agents/livekit/agents/voice_assistant/agent_playout.py index ee32f3608..cd7ddc320 100644 --- a/livekit-agents/livekit/agents/voice_assistant/agent_playout.py +++ b/livekit-agents/livekit/agents/voice_assistant/agent_playout.py @@ -15,16 +15,22 @@ class PlayoutHandle: def __init__( self, speech_id: str, + audio_source: rtc.AudioSource, playout_source: AsyncIterable[rtc.AudioFrame], transcription_fwd: transcription.TTSSegmentsForwarder, ) -> None: self._playout_source = playout_source + self._audio_source = audio_source self._tr_fwd = transcription_fwd self._interrupted = 
False - self._time_played = 0.0 + self._int_fut = asyncio.Future[None]() self._done_fut = asyncio.Future[None]() self._speech_id = speech_id + self._pushed_duration = 0.0 + + self._total_played_time: float | None = None # set when the playout is done + @property def speech_id(self) -> str: return self._speech_id @@ -35,15 +41,19 @@ def interrupted(self) -> bool: @property def time_played(self) -> float: - return self._time_played + if self._total_played_time is not None: + return self._total_played_time + + return self._pushed_duration - self._audio_source.queued_duration def done(self) -> bool: - return self._done_fut.done() + return self._done_fut.done() or self._interrupted def interrupt(self) -> None: if self.done(): return + self._int_fut.set_result(None) self._interrupted = True def join(self) -> asyncio.Future: @@ -51,9 +61,9 @@ class AgentPlayout(utils.EventEmitter[EventTypes]): - def __init__(self, *, source: rtc.AudioSource, alpha: float = 0.95) -> None: + def __init__(self, *, audio_source: rtc.AudioSource) -> None: super().__init__() - self._source = source + self._audio_source = audio_source self._target_volume = 1.0 self._playout_atask: asyncio.Task[None] | None = None self._closed = False @@ -90,6 +100,7 @@ def play( handle = PlayoutHandle( speech_id=speech_id, + audio_source=self._audio_source, playout_source=playout_source, transcription_fwd=transcription_fwd, ) @@ -103,12 +114,24 @@ async def _playout_task( self, old_task: asyncio.Task[None] | None, handle: PlayoutHandle ) -> None: - first_frame = True + if old_task is not None: + await utils.aio.gracefully_cancel(old_task) - try: - if old_task is not None: - await utils.aio.gracefully_cancel(old_task) + if self._audio_source.queued_duration > 0: + # this should not happen, but log it just in case + logger.warning( + "new playout while the source is still playing", + extra={ + "speech_id": handle.speech_id, + "queued_duration":
self._audio_source.queued_duration, + }, + ) + + first_frame = True + @utils.log_exceptions(logger=logger) + async def _capture_task(): + nonlocal first_frame async for frame in handle._playout_source: if first_frame: handle._tr_fwd.segment_playout_started() @@ -121,29 +144,27 @@ async def _playout_task( self.emit("playout_started") first_frame = False - if handle.interrupted: - break - - # divide the frame by chunks of 20ms - ms20 = frame.sample_rate // 50 - i = 0 - while i < len(frame.data): - if handle.interrupted: - break - - rem = min(ms20, len(frame.data) - i) - data = frame.data[i : i + rem] - i += rem - - chunk_frame = rtc.AudioFrame( - data=data.tobytes(), - sample_rate=frame.sample_rate, - num_channels=frame.num_channels, - samples_per_channel=rem, - ) - await self._source.capture_frame(chunk_frame) - handle._time_played += rem / frame.sample_rate + handle._pushed_duration += frame.samples_per_channel / frame.sample_rate + await self._audio_source.capture_frame(frame) + + await self._audio_source.wait_for_playout() + + capture_task = asyncio.create_task(_capture_task()) + try: + await asyncio.wait( + [capture_task, handle._int_fut], + return_when=asyncio.FIRST_COMPLETED, + ) finally: + await utils.aio.gracefully_cancel(capture_task) + + handle._total_played_time = ( + handle._pushed_duration - self._audio_source.queued_duration + ) + + if handle.interrupted or capture_task.exception(): + self._audio_source.clear_queue() # make sure to remove any queued frames + if not first_frame: if not handle.interrupted: handle._tr_fwd.segment_playout_finished() diff --git a/livekit-agents/livekit/agents/voice_assistant/human_input.py b/livekit-agents/livekit/agents/voice_assistant/human_input.py index a3ddc5248..22fec121e 100644 --- a/livekit-agents/livekit/agents/voice_assistant/human_input.py +++ b/livekit-agents/livekit/agents/voice_assistant/human_input.py @@ -101,7 +101,7 @@ async def _recognize_task(self, audio_stream: rtc.AudioStream) -> None: vad_stream = 
self._vad.stream() stt_stream = self._stt.stream() - def _will_forward_transcription( + def _before_forward( fwd: transcription.STTSegmentsForwarder, transcription: rtc.Transcription ): if not self._transcription: @@ -113,7 +113,7 @@ def _will_forward_transcription( room=self._room, participant=self._participant, track=self._subscribed_track, - will_forward_transcription=_will_forward_transcription, + before_forward_cb=_before_forward, ) async def _audio_stream_co() -> None: diff --git a/livekit-agents/livekit/agents/voice_assistant/plotter.py b/livekit-agents/livekit/agents/voice_assistant/plotter.py index 3b8d583ae..c0a9a1ca9 100644 --- a/livekit-agents/livekit/agents/voice_assistant/plotter.py +++ b/livekit-agents/livekit/agents/voice_assistant/plotter.py @@ -1,10 +1,14 @@ import asyncio +import contextlib import io import multiprocessing as mp +import selectors +import socket import time from dataclasses import dataclass from typing import ClassVar, Literal, Tuple +from .. import utils from ..ipc import channel PlotType = Literal["vad_probability", "raw_vol", "smoothed_vol"] @@ -57,7 +61,7 @@ def read(self, b: io.BytesIO) -> None: } -def _draw_plot(reader): +def _draw_plot(mp_cch): try: import matplotlib as mpl # type: ignore import matplotlib.pyplot as plt # type: ignore @@ -77,11 +81,18 @@ def _draw_plot(reader): max_points = 250 - plot_rx = channel.ProcChannel(conn=reader, messages=PLT_MESSAGES) + duplex = utils.aio.duplex_unix._Duplex.open(mp_cch) + + selector = selectors.DefaultSelector() + selector.register(mp_cch, selectors.EVENT_READ) def _draw_cb(sp, pv): - while reader.poll(): - msg = plot_rx.recv() + while True: + events = selector.select(timeout=0.01) + if not events: + break + + msg = channel.recv_message(duplex, PLT_MESSAGES) if isinstance(msg, PlotMessage): data = plot_data.setdefault(msg.which, ([], [])) data[0].append(msg.x) @@ -129,7 +140,7 @@ def _draw_cb(sp, pv): fig.canvas.draw() - timer = fig.canvas.new_timer(interval=150) + timer = 
fig.canvas.new_timer(interval=33) timer.add_callback(_draw_cb, sp, pv) timer.start() plt.show() @@ -140,18 +151,18 @@ def __init__(self, loop: asyncio.AbstractEventLoop) -> None: self._loop = loop self._started = False - def start(self): + async def start(self): if self._started: return - mp_pch, mp_cch = mp.Pipe(duplex=True) - self._plot_tx = channel.AsyncProcChannel( - conn=mp_pch, loop=self._loop, messages=PLT_MESSAGES - ) + mp_pch, mp_cch = socket.socketpair() + self._duplex = await utils.aio.duplex_unix._AsyncDuplex.open(mp_pch) self._plot_proc = mp.Process(target=_draw_plot, args=(mp_cch,), daemon=True) self._plot_proc.start() + mp_cch.close() self._started = True + self._closed = False self._start_time = time.time() def plot_value(self, which: PlotType, y: float): @@ -159,17 +170,32 @@ def plot_value(self, which: PlotType, y: float): return ts = time.time() - self._start_time - asyncio.ensure_future(self._plot_tx.asend(PlotMessage(which=which, x=ts, y=y))) + self._send_message(PlotMessage(which=which, x=ts, y=y)) def plot_event(self, which: EventType): if not self._started: return ts = time.time() - self._start_time - asyncio.ensure_future(self._plot_tx.asend(PlotEventMessage(which=which, x=ts))) + self._send_message(PlotEventMessage(which=which, x=ts)) + + def _send_message(self, msg: channel.Message) -> None: + if self._closed: + return + + async def _asend_message(): + try: + await channel.asend_message(self._duplex, msg) + except Exception: + self._closed = True - def terminate(self): + asyncio.ensure_future(_asend_message()) + + async def terminate(self): if not self._started: return self._plot_proc.terminate() + + with contextlib.suppress(utils.aio.duplex_unix.DuplexClosed): + await self._duplex.aclose() diff --git a/livekit-agents/livekit/agents/voice_assistant/speech_handle.py b/livekit-agents/livekit/agents/voice_assistant/speech_handle.py new file mode 100644 index 000000000..684bf1933 --- /dev/null +++ 
b/livekit-agents/livekit/agents/voice_assistant/speech_handle.py @@ -0,0 +1,153 @@ +from __future__ import annotations + +import asyncio +from typing import AsyncIterable + +from .. import utils +from ..llm import LLMStream +from .agent_output import SynthesisHandle + + +class SpeechHandle: + def __init__( + self, + *, + id: str, + allow_interruptions: bool, + add_to_chat_ctx: bool, + is_reply: bool, + user_question: str, + ) -> None: + self._id = id + self._allow_interruptions = allow_interruptions + self._add_to_chat_ctx = add_to_chat_ctx + + # is_reply is True when the speech is answering a user question + self._is_reply = is_reply + self._user_question = user_question + self._user_commited = False + + self._init_fut: asyncio.Future[None] = asyncio.Future() + self._initialized = False + self._speech_commited = False # speech committed (interrupted or not) + + # source and synthesis_handle are None until the speech is initialized + self._source: str | LLMStream | AsyncIterable[str] | None = None + self._synthesis_handle: SynthesisHandle | None = None + + @staticmethod + def create_assistant_reply( + *, + allow_interruptions: bool, + add_to_chat_ctx: bool, + user_question: str, + ) -> SpeechHandle: + return SpeechHandle( + id=utils.shortuuid(), + allow_interruptions=allow_interruptions, + add_to_chat_ctx=add_to_chat_ctx, + is_reply=True, + user_question=user_question, + ) + + @staticmethod + def create_assistant_speech( + *, + allow_interruptions: bool, + add_to_chat_ctx: bool, + ) -> SpeechHandle: + return SpeechHandle( + id=utils.shortuuid(), + allow_interruptions=allow_interruptions, + add_to_chat_ctx=add_to_chat_ctx, + is_reply=False, + user_question="", + ) + + async def wait_for_initialization(self) -> None: + await asyncio.shield(self._init_fut) + + def initialize( + self, + *, + source: str | LLMStream | AsyncIterable[str], + synthesis_handle: SynthesisHandle, + ) -> None: + if self.interrupted: + raise RuntimeError("speech is interrupted") + 
self._source = source + self._synthesis_handle = synthesis_handle + self._initialized = True + self._init_fut.set_result(None) + + def mark_user_commited(self) -> None: + self._user_commited = True + + def mark_speech_commited(self) -> None: + self._speech_commited = True + + @property + def user_commited(self) -> bool: + return self._user_commited + + @property + def speech_commited(self) -> bool: + return self._speech_commited + + @property + def id(self) -> str: + return self._id + + @property + def allow_interruptions(self) -> bool: + return self._allow_interruptions + + @property + def add_to_chat_ctx(self) -> bool: + return self._add_to_chat_ctx + + @property + def source(self) -> str | LLMStream | AsyncIterable[str]: + if self._source is None: + raise RuntimeError("speech not initialized") + return self._source + + @property + def synthesis_handle(self) -> SynthesisHandle: + if self._synthesis_handle is None: + raise RuntimeError("speech not initialized") + return self._synthesis_handle + + @synthesis_handle.setter + def synthesis_handle(self, synthesis_handle: SynthesisHandle) -> None: + """synthesis handle can be replaced for the same speech. + This is useful when we need to do a new generation. 
(e.g: for automatic function call answers)""" + if self._synthesis_handle is None: + raise RuntimeError("speech not initialized") + + self._synthesis_handle = synthesis_handle + + @property + def initialized(self) -> bool: + return self._initialized + + @property + def is_reply(self) -> bool: + return self._is_reply + + @property + def user_question(self) -> str: + return self._user_question + + @property + def interrupted(self) -> bool: + return self._init_fut.cancelled() or ( + self._synthesis_handle is not None and self._synthesis_handle.interrupted + ) + + def interrupt(self) -> None: + self._init_fut.cancel() + + if self._synthesis_handle is not None: + self._synthesis_handle.interrupt() diff --git a/livekit-agents/livekit/agents/voice_assistant/voice_assistant.py b/livekit-agents/livekit/agents/voice_assistant/voice_assistant.py index 60ceed1d0..a1c7e465e 100644 --- a/livekit-agents/livekit/agents/voice_assistant/voice_assistant.py +++ b/livekit-agents/livekit/agents/voice_assistant/voice_assistant.py @@ -10,31 +10,27 @@ from ..
import stt, tokenize, tts, utils, vad from ..llm import LLM, ChatContext, ChatMessage, FunctionContext, LLMStream +from ..proto import ATTR_AGENT_STATE, AgentState from .agent_output import AgentOutput, SynthesisHandle from .agent_playout import AgentPlayout from .human_input import HumanInput from .log import logger from .plotter import AssistantPlotter +from .speech_handle import SpeechHandle +BeforeLLMCallback = Callable[ + ["VoiceAssistant", ChatContext], + Union[Optional[LLMStream], Awaitable[Optional[LLMStream]], Literal[False]], +] -@dataclass -class _SpeechInfo: - id: str # useful to recognize a specific speech in logs - source: str | LLMStream | AsyncIterable[str] - allow_interruptions: bool - add_to_chat_ctx: bool - synthesis_handle: SynthesisHandle - - # is_reply = True when the speech is answering to a user question - is_reply: bool = False - user_question: str = "" - +WillSynthesizeAssistantReply = BeforeLLMCallback -WillSynthesizeAssistantReply = Callable[ - ["VoiceAssistant", ChatContext], - Union[Optional[LLMStream], Awaitable[Optional[LLMStream]]], +BeforeTTSCallback = Callable[ + ["VoiceAssistant", Union[str, AsyncIterable[str]]], + Union[str, AsyncIterable[str], Awaitable[str]], ] + EventTypes = Literal[ "user_started_speaking", "user_stopped_speaking", @@ -47,7 +43,6 @@ class _SpeechInfo: "function_calls_finished", ] - _CallContextVar = contextvars.ContextVar["AssistantCallContext"]( "voice_assistant_contextvar" ) @@ -77,10 +72,19 @@ def llm_stream(self) -> LLMStream: return self._llm_stream -def _default_will_synthesize_assistant_reply( +def _default_before_llm_cb( assistant: VoiceAssistant, chat_ctx: ChatContext ) -> LLMStream: - return assistant.llm.chat(chat_ctx=chat_ctx, fnc_ctx=assistant.fnc_ctx) + return assistant.llm.chat( + chat_ctx=chat_ctx, + fnc_ctx=assistant.fnc_ctx, + ) + + +def _default_before_tts_cb( + assistant: VoiceAssistant, text: str | AsyncIterable[str] +) -> str | AsyncIterable[str]: + return text @dataclass(frozen=True) 
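Reviewer note: the hunk above replaces `will_synthesize_assistant_reply` with the more general `BeforeLLMCallback` extension point. A minimal sketch of how a caller might use it to inject retrieved context (RAG-style) before the default LLM call — `ChatContext`/`ChatMessage` here are simplified stand-ins for illustration, not the real livekit classes, and `rag_before_llm_cb` is a hypothetical name:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Simplified stand-ins for livekit.agents.llm.ChatContext / ChatMessage,
# just enough to illustrate the callback contract.
@dataclass
class ChatMessage:
    role: str
    text: str

@dataclass
class ChatContext:
    messages: List[ChatMessage] = field(default_factory=list)

def rag_before_llm_cb(assistant: object, chat_ctx: ChatContext) -> None:
    """A BeforeLLMCallback-style hook: mutate the chat context, then return
    None so the assistant falls back to its default llm.chat() stream."""
    # find the most recent user message to retrieve context for
    last_user = next(
        (m for m in reversed(chat_ctx.messages) if m.role == "user"), None
    )
    if last_user is not None:
        # prepend a system message carrying (hypothetical) retrieved documents
        chat_ctx.messages.insert(
            0, ChatMessage(role="system", text=f"Context for: {last_user.text}")
        )
    return None

ctx = ChatContext(messages=[ChatMessage(role="user", text="what is LiveKit?")])
rag_before_llm_cb(None, ctx)
print(ctx.messages[0].role)  # the injected system message comes first
```

Per the docstring added later in this patch, returning `None` yields the default LLM stream, returning your own `LLMStream` overrides it, and returning `False` cancels the reply entirely.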
@@ -88,8 +92,10 @@ class _ImplOptions: allow_interruptions: bool int_speech_duration: float int_min_words: int + min_endpointing_delay: float preemptive_synthesis: bool - will_synthesize_assistant_reply: WillSynthesizeAssistantReply + before_llm_cb: BeforeLLMCallback + before_tts_cb: BeforeTTSCallback plotting: bool transcription: AssistantTranscriptionOptions @@ -106,7 +112,9 @@ class AssistantTranscriptionOptions: sentence_tokenizer: tokenize.SentenceTokenizer = tokenize.basic.SentenceTokenizer() """The tokenizer used to split the speech into sentences. This is used to decide when to mark a transcript as final for the agent transcription.""" - word_tokenizer: tokenize.WordTokenizer = tokenize.basic.WordTokenizer() + word_tokenizer: tokenize.WordTokenizer = tokenize.basic.WordTokenizer( + ignore_punctuation=False + ) """The tokenizer used to split the speech into words. This is used to simulate the "interim results" of the agent transcription.""" hyphenate_word: Callable[[str], list[str]] = tokenize.basic.hyphenate_word @@ -130,11 +138,15 @@ def __init__( allow_interruptions: bool = True, interrupt_speech_duration: float = 0.5, interrupt_min_words: int = 0, + min_endpointing_delay: float = 0.5, preemptive_synthesis: bool = True, transcription: AssistantTranscriptionOptions = AssistantTranscriptionOptions(), - will_synthesize_assistant_reply: WillSynthesizeAssistantReply = _default_will_synthesize_assistant_reply, + before_llm_cb: BeforeLLMCallback = _default_before_llm_cb, + before_tts_cb: BeforeTTSCallback = _default_before_tts_cb, plotting: bool = False, loop: asyncio.AbstractEventLoop | None = None, + # backward compatibility + will_synthesize_assistant_reply: WillSynthesizeAssistantReply | None = None, ) -> None: """ Create a new VoiceAssistant. @@ -150,23 +162,41 @@ def __init__( interrupt_speech_duration: Minimum duration of speech to consider for interruption. interrupt_min_words: Minimum number of words to consider for interruption. 
Defaults to 0 as this may increase the latency depending on the STT. + min_endpointing_delay: Delay to wait before considering the user finished speaking. preemptive_synthesis: Whether to preemptively synthesize responses. transcription: Options for assistant transcription. - will_synthesize_assistant_reply: Callback called when the assistant is about to synthesize a reply. + before_llm_cb: Callback called when the assistant is about to synthesize a reply. This can be used to customize the reply (e.g: inject context/RAG). + + Returning None will create a default LLM stream. You can also return your own llm + stream by calling the llm.chat() method. + + Returning False will cancel the synthesis of the reply. + before_tts_cb: Callback called when the assistant is about to + synthesize a speech. This can be used to customize text before the speech synthesis. + (e.g: editing the pronunciation of a word). plotting: Whether to enable plotting for debugging. matplotlib must be installed. loop: Event loop to use. Default to asyncio.get_event_loop(). 
""" super().__init__() self._loop = loop or asyncio.get_event_loop() + + if will_synthesize_assistant_reply is not None: + logger.warning( + "will_synthesize_assistant_reply is deprecated and will be removed in 1.5.0, use before_llm_cb instead", + ) + before_llm_cb = will_synthesize_assistant_reply + self._opts = _ImplOptions( plotting=plotting, allow_interruptions=allow_interruptions, int_speech_duration=interrupt_speech_duration, int_min_words=interrupt_min_words, + min_endpointing_delay=min_endpointing_delay, preemptive_synthesis=preemptive_synthesis, transcription=transcription, - will_synthesize_assistant_reply=will_synthesize_assistant_reply, + before_llm_cb=before_llm_cb, + before_tts_cb=before_tts_cb, ) self._plotter = AssistantPlotter(self._loop) @@ -199,21 +229,25 @@ def __init__( # done when the agent output track is published self._track_published_fut = asyncio.Future[None]() - self._pending_agent_reply: _SpeechInfo | None = None - self._pending_agent_reply_task: asyncio.Task[None] | None = None + self._pending_agent_reply: SpeechHandle | None = None + self._agent_reply_task: asyncio.Task[None] | None = None - self._playing_speech: _SpeechInfo | None = None + self._playing_speech: SpeechHandle | None = None self._transcribed_text, self._transcribed_interim_text = "", "" self._deferred_validation = _DeferredReplyValidation( - self._validate_reply_if_possible, loop=self._loop + self._validate_reply_if_possible, + self._opts.min_endpointing_delay, + loop=self._loop, ) - self._speech_q: list[_SpeechInfo] = [] + self._speech_q: list[SpeechHandle] = [] self._speech_q_changed = asyncio.Event() self._last_end_of_speech_time: float | None = None + self._update_state_task: asyncio.Task | None = None + @property def fnc_ctx(self) -> FunctionContext | None: return self._fnc_ctx @@ -306,16 +340,30 @@ async def say( add_to_chat_ctx: Whether to add the speech to the chat context. 
""" await self._track_published_fut - speech_id = utils.shortuuid() - self._add_speech_for_playout( - _SpeechInfo( - id=speech_id, - source=source, - allow_interruptions=allow_interruptions, - add_to_chat_ctx=add_to_chat_ctx, - synthesis_handle=self._synthesize_agent_speech(speech_id, source), - ) + + new_handle = SpeechHandle.create_assistant_speech( + allow_interruptions=allow_interruptions, add_to_chat_ctx=add_to_chat_ctx ) + synthesis_handle = self._synthesize_agent_speech(new_handle.id, source) + new_handle.initialize(source=source, synthesis_handle=synthesis_handle) + self._add_speech_for_playout(new_handle) + + def _update_state(self, state: AgentState, delay: float = 0.0): + """Set the current state of the agent""" + + @utils.log_exceptions(logger=logger) + async def _run_task(delay: float) -> None: + await asyncio.sleep(delay) + + if self._room.isconnected(): + await self._room.local_participant.set_attributes( + {ATTR_AGENT_STATE: state} + ) + + if self._update_state_task is not None: + self._update_state_task.cancel() + + self._update_state_task = asyncio.create_task(_run_task(delay)) async def aclose(self) -> None: """Close the voice assistant""" @@ -349,6 +397,7 @@ def _on_start_of_speech(ev: vad.VADEvent) -> None: self._plotter.plot_event("user_started_speaking") self.emit("user_started_speaking") self._deferred_validation.on_human_start_of_speech(ev) + self._update_state("listening") def _on_vad_updated(ev: vad.VADEvent) -> None: if not self._track_published_fut.done(): @@ -407,15 +456,16 @@ def _on_final_transcript(ev: stt.SpeechEvent) -> None: @utils.log_exceptions(logger=logger) async def _main_task(self) -> None: if self._opts.plotting: - self._plotter.start() + await self._plotter.start() + self._update_state("initializing") audio_source = rtc.AudioSource(self._tts.sample_rate, self._tts.num_channels) track = rtc.LocalAudioTrack.create_audio_track("assistant_voice", audio_source) self._agent_publication = await 
self._room.local_participant.publish_track( track, rtc.TrackPublishOptions(source=rtc.TrackSource.SOURCE_MICROPHONE) ) - agent_playout = AgentPlayout(source=audio_source) + agent_playout = AgentPlayout(audio_source=audio_source) self._agent_output = AgentOutput( room=self._room, agent_playout=agent_playout, @@ -426,10 +476,12 @@ async def _main_task(self) -> None: def _on_playout_started() -> None: self._plotter.plot_event("agent_started_speaking") self.emit("agent_started_speaking") + self._update_state("speaking") def _on_playout_stopped(interrupted: bool) -> None: self._plotter.plot_event("agent_stopped_speaking") self.emit("agent_stopped_speaking") + self._update_state("listening") agent_playout.on("playout_started", _on_playout_started) agent_playout.on("playout_stopped", _on_playout_stopped) @@ -448,103 +500,119 @@ def _on_playout_stopped(interrupted: bool) -> None: self._speech_q_changed.clear() - def _synthesize_agent_reply(self, *, validated: bool = False) -> None: + def _synthesize_agent_reply(self) -> None: """Synthesize the agent reply to the user question, also make sure only one reply is synthesized/played at a time""" - @utils.log_exceptions(logger=logger) - async def _synthesize_answer_task( - old_task: asyncio.Task[None], user_transcript: str - ) -> None: - if old_task is not None: - await utils.aio.gracefully_cancel(old_task) - - user_msg = ChatMessage.create(text=user_transcript, role="user") - copied_ctx = self._chat_ctx.copy() - copied_ctx.messages.append(user_msg) - - llm_stream = self._opts.will_synthesize_assistant_reply(self, copied_ctx) - if asyncio.iscoroutine(llm_stream): - llm_stream = await llm_stream - - # fallback to default impl if no custom/user stream is returned - if not isinstance(llm_stream, LLMStream): - llm_stream = _default_will_synthesize_assistant_reply( - self, chat_ctx=copied_ctx + if self._pending_agent_reply is not None: + self._pending_agent_reply.interrupt() + + if self._human_input is not None and not 
self._human_input.speaking: + self._update_state("thinking", 0.2) + + self._pending_agent_reply = new_handle = SpeechHandle.create_assistant_reply( + allow_interruptions=self._opts.allow_interruptions, + add_to_chat_ctx=True, + user_question=self._transcribed_text, + ) + + self._agent_reply_task = asyncio.create_task( + self._synthesize_answer_task(self._agent_reply_task, new_handle) + ) + + @utils.log_exceptions(logger=logger) + async def _synthesize_answer_task( + self, old_task: asyncio.Task[None], handle: SpeechHandle + ) -> None: + if old_task is not None: + await utils.aio.gracefully_cancel(old_task) + + copied_ctx = self._chat_ctx.copy() + + playing_speech = self._playing_speech + if playing_speech is not None and playing_speech.initialized: + if ( + not playing_speech.user_question or playing_speech.user_commited + ) and not playing_speech.speech_commited: + # the speech is playing but not committed yet, add it to the chat context for this new reply synthesis + copied_ctx.messages.append( + ChatMessage.create( + text=playing_speech.synthesis_handle.tts_forwarder.played_text, + role="assistant", + ) ) - speech_id = utils.shortuuid() - reply = _SpeechInfo( - id=speech_id, - source=llm_stream, - allow_interruptions=self._opts.allow_interruptions, - add_to_chat_ctx=True, - synthesis_handle=self._synthesize_agent_speech(speech_id, llm_stream), - is_reply=True, - user_question=user_transcript, - ) + copied_ctx.messages.append( + ChatMessage.create(text=handle.user_question, role="user") + ) - if self._last_end_of_speech_time is not None: - elapsed = round(time.time() - self._last_end_of_speech_time, 3) - else: - elapsed = -1.0 + llm_stream = self._opts.before_llm_cb(self, copied_ctx) + if llm_stream is False: + return - logger.debug( - "synthesizing agent reply", - extra={ - "user_transcript": user_transcript, - "validated": validated, - "speech_id": reply.id, - "elapsed": elapsed, - }, - ) + if asyncio.iscoroutine(llm_stream): + llm_stream = await llm_stream - 
if validated: - self._add_speech_for_playout(reply) - else: - self._pending_agent_reply = reply + # fallback to default impl if no custom/user stream is returned + if not isinstance(llm_stream, LLMStream): + llm_stream = _default_before_llm_cb(self, chat_ctx=copied_ctx) - # interrupt the current reply synthesis - if self._pending_agent_reply is not None: - self._pending_agent_reply.synthesis_handle.interrupt() - self._pending_agent_reply = None + if handle.interrupted: + return - self._pending_agent_reply_task = asyncio.create_task( - _synthesize_answer_task( - self._pending_agent_reply_task, self._transcribed_text - ) + synthesis_handle = self._synthesize_agent_speech(handle.id, llm_stream) + handle.initialize(source=llm_stream, synthesis_handle=synthesis_handle) + + # TODO(theomonnom): Find a more reliable way to get the elapsed time from the last end of speech + # (VAD could not have detected any speech - maybe unlikely?) + if self._last_end_of_speech_time is not None: + elapsed = round(time.time() - self._last_end_of_speech_time, 3) + else: + elapsed = -1.0 + + logger.debug( + "synthesizing agent reply", + extra={ + "user_transcript": handle.user_question, + "speech_id": handle.id, + "elapsed": elapsed, + }, ) - async def _play_speech(self, speech_info: _SpeechInfo) -> None: - synthesis_handle = speech_info.synthesis_handle + async def _play_speech(self, speech_handle: SpeechHandle) -> None: + try: + await speech_handle.wait_for_initialization() + except asyncio.CancelledError: + return + + await self._agent_publication.wait_for_subscription() + + synthesis_handle = speech_handle.synthesis_handle if synthesis_handle.interrupted: return - user_question = speech_info.user_question - user_speech_committed = False + user_question = speech_handle.user_question play_handle = synthesis_handle.play() join_fut = play_handle.join() def _commit_user_question_if_needed() -> None: - nonlocal user_speech_committed - if ( not user_question or synthesis_handle.interrupted - or 
user_speech_committed + or speech_handle.user_commited ): return - is_using_tools = isinstance(speech_info.source, LLMStream) and len( - speech_info.source.function_calls + is_using_tools = isinstance(speech_handle.source, LLMStream) and len( + speech_handle.source.function_calls ) # make sure at least some speech was played before committing the user message # since we try to validate as fast as possible it is possible the agent gets interrupted # really quickly (barely audible), we don't want to mark this question as "answered". if ( - speech_info.allow_interruptions + speech_handle.allow_interruptions and not is_using_tools and ( play_handle.time_played < self.MIN_TIME_PLAYED_FOR_COMMIT @@ -561,7 +629,7 @@ def _commit_user_question_if_needed() -> None: self.emit("user_speech_committed", user_msg) self._transcribed_text = self._transcribed_text[len(user_question) :] - user_speech_committed = True + speech_handle.mark_user_commited() # wait for the play_handle to finish and check every 1s if the user question should be committed _commit_user_question_if_needed() @@ -572,13 +640,16 @@ def _commit_user_question_if_needed() -> None: ) _commit_user_question_if_needed() + + if speech_handle.interrupted: + break _commit_user_question_if_needed() - collected_text = speech_info.synthesis_handle.tts_forwarder.played_text - interrupted = speech_info.synthesis_handle.interrupted - is_using_tools = isinstance(speech_info.source, LLMStream) and len( - speech_info.source.function_calls + collected_text = speech_handle.synthesis_handle.tts_forwarder.played_text + interrupted = speech_handle.interrupted + is_using_tools = isinstance(speech_handle.source, LLMStream) and len( + speech_handle.source.function_calls ) extra_tools_messages = [] # additional messages from the functions to add to the context if needed @@ -586,16 +657,16 @@ def _commit_user_question_if_needed() -> None: # if the answer is using tools, execute the functions and automatically generate # a response to the 
user question from the returned values if is_using_tools and not interrupted: - assert isinstance(speech_info.source, LLMStream) + assert isinstance(speech_handle.source, LLMStream) assert ( - not user_question or user_speech_committed + not user_question or speech_handle.user_commited ), "user speech should have been committed before using tools" # execute functions - call_ctx = AssistantCallContext(self, speech_info.source) + call_ctx = AssistantCallContext(self, speech_handle.source) tk = _CallContextVar.set(call_ctx) - self.emit("function_calls_collected", speech_info.source.function_calls) - called_fncs_info = speech_info.source.function_calls + self.emit("function_calls_collected", speech_handle.source.function_calls) + called_fncs_info = speech_handle.source.function_calls called_fncs = [] for fnc in called_fncs_info: @@ -605,7 +676,7 @@ def _commit_user_question_if_needed() -> None: "executing ai function", extra={ "function": fnc.function_info.name, - "speech_id": speech_info.id, + "speech_id": speech_handle.id, }, ) try: @@ -633,24 +704,27 @@ def _commit_user_question_if_needed() -> None: extra_tools_messages.append(ChatMessage.create_tool_calls(tool_calls)) extra_tools_messages.extend(tool_calls_results_msg) - chat_ctx = speech_info.source.chat_ctx.copy() + chat_ctx = speech_handle.source.chat_ctx.copy() chat_ctx.messages.extend(extra_tools_messages) answer_llm_stream = self._llm.chat( - chat_ctx=chat_ctx, fnc_ctx=self._fnc_ctx + chat_ctx=chat_ctx, + fnc_ctx=self._fnc_ctx, ) answer_synthesis = self._synthesize_agent_speech( - speech_info.id, answer_llm_stream + speech_handle.id, answer_llm_stream ) # replace the synthesis handle with the new one to allow interruption - speech_info.synthesis_handle = answer_synthesis + speech_handle.synthesis_handle = answer_synthesis play_handle = answer_synthesis.play() await play_handle.join() collected_text = answer_synthesis.tts_forwarder.played_text interrupted = answer_synthesis.interrupted - if 
speech_info.add_to_chat_ctx and (not user_question or user_speech_committed): + if speech_handle.add_to_chat_ctx and ( + not user_question or speech_handle.user_commited + ): self._chat_ctx.messages.extend(extra_tools_messages) if interrupted: @@ -659,6 +733,8 @@ def _commit_user_question_if_needed() -> None: msg = ChatMessage.create(text=collected_text, role="assistant") self._chat_ctx.messages.append(msg) + speech_handle.mark_speech_commited() + if interrupted: self.emit("agent_speech_interrupted", msg) else: @@ -669,7 +745,7 @@ def _commit_user_question_if_needed() -> None: extra={ "agent_transcript": collected_text, "interrupted": interrupted, - "speech_id": speech_info.id, + "speech_id": speech_handle.id, }, ) @@ -685,9 +761,19 @@ def _synthesize_agent_speech( if isinstance(source, LLMStream): source = _llm_stream_to_str_iterable(speech_id, source) + og_source = source + transcript_source = source + if isinstance(og_source, AsyncIterable): + og_source, transcript_source = utils.aio.itertools.tee(og_source, 2) + + tts_source = self._opts.before_tts_cb(self, og_source) + if tts_source is None: + logger.error("before_tts_cb must return str or AsyncIterable[str]") + return self._agent_output.synthesize( speech_id=speech_id, - transcript=source, + tts_source=tts_source, + transcript_source=transcript_source, transcription=self._opts.transcription.agent_transcription, transcription_speed=self._opts.transcription.agent_transcription_speed, sentence_tokenizer=self._opts.transcription.sentence_tokenizer, @@ -697,35 +783,42 @@ def _synthesize_agent_speech( def _validate_reply_if_possible(self) -> None: """Check if the new agent speech should be played""" - if ( - self._pending_agent_reply is not None - and not self._pending_agent_reply.synthesis_handle.interrupted - ): - # in some timing, we could end up with two pushed agent replies inside the speech queue. 
- # so make sure we directly interrupt every reply when pushing a new one - for speech in self._speech_q: - if speech.allow_interruptions and speech.is_reply: - speech.synthesis_handle.interrupt() - logger.debug( - "validated agent reply", - extra={"speech_id": self._pending_agent_reply.id}, - ) - self._add_speech_for_playout(self._pending_agent_reply) - self._pending_agent_reply = None - elif not self._opts.preemptive_synthesis and self._transcribed_text: - # validated=True is going to call _add_speech_for_playout - self._synthesize_agent_reply(validated=True) + if self._pending_agent_reply is None: + if self._opts.preemptive_synthesis or not self._transcribed_text: + return - # self._transcribed_text is reset after MIN_TIME_PLAYED_FOR_COMMIT, see self._play_speech + self._synthesize_agent_reply() # this will populate self._pending_agent_reply + + assert self._pending_agent_reply is not None + + # in some bad timing, we could end up with two pushed agent replies inside the speech queue. 
+ # so make sure we directly interrupt every reply when validating a new one + for speech in self._speech_q: + if not speech.is_reply: + continue + + if not speech.allow_interruptions: + return # we shouldn't validate this speech to avoid stacking replies + + speech.interrupt() + + logger.debug( + "validated agent reply", + extra={"speech_id": self._pending_agent_reply.id}, + ) + + self._add_speech_for_playout(self._pending_agent_reply) + self._pending_agent_reply = None self._transcribed_interim_text = "" + # self._transcribed_text is reset after MIN_TIME_PLAYED_FOR_COMMIT, see self._play_speech def _interrupt_if_possible(self) -> None: """Check whether the current assistant speech should be interrupted""" if ( self._playing_speech is None or not self._playing_speech.allow_interruptions - or self._playing_speech.synthesis_handle.interrupted + or self._playing_speech.interrupted ): return @@ -738,10 +831,10 @@ def _interrupt_if_possible(self) -> None: if len(interim_words) < self._opts.int_min_words: return - self._playing_speech.synthesis_handle.interrupt() + self._playing_speech.interrupt() - def _add_speech_for_playout(self, speech: _SpeechInfo) -> None: - self._speech_q.append(speech) + def _add_speech_for_playout(self, speech_handle: SpeechHandle) -> None: + self._speech_q.append(speech_handle) self._speech_q_changed.set() @@ -773,14 +866,15 @@ class _DeferredReplyValidation: # if the STT gives us punctuation, we can try validate the reply faster. PUNCTUATION = ".!?" 
- PUNCTUATION_REDUCE_FACTOR = 0.5 + PUNCTUATION_REDUCE_FACTOR = 0.75 - DEFER_DELAY_END_OF_SPEECH = 0.2 - DEFER_DELAY_FINAL_TRANSCRIPT = 1.0 LATE_TRANSCRIPT_TOLERANCE = 1.5 # late compared to end of speech def __init__( - self, validate_fnc: Callable[[], None], loop: asyncio.AbstractEventLoop + self, + validate_fnc: Callable[[], None], + min_endpointing_delay: float, + loop: asyncio.AbstractEventLoop | None = None, ) -> None: self._validate_fnc = validate_fnc self._validating_task: asyncio.Task | None = None @@ -788,6 +882,9 @@ def __init__( self._last_recv_end_of_speech_time: float = 0.0 self._speaking = False + self._end_of_speech_delay = min_endpointing_delay + self._final_transcript_delay = min_endpointing_delay + 1.0 + @property def validating(self) -> bool: return self._validating_task is not None and not self._validating_task.done() @@ -803,9 +900,9 @@ def on_human_final_transcript(self, transcript: str) -> None: < self.LATE_TRANSCRIPT_TOLERANCE ) delay = ( - self.DEFER_DELAY_END_OF_SPEECH + self._end_of_speech_delay if has_recent_end_of_speech - else self.DEFER_DELAY_FINAL_TRANSCRIPT + else self._final_transcript_delay ) delay = ( delay * self.PUNCTUATION_REDUCE_FACTOR @@ -827,7 +924,7 @@ def on_human_end_of_speech(self, ev: vad.VADEvent) -> None: if self._last_final_transcript: delay = ( - self.DEFER_DELAY_END_OF_SPEECH * self.PUNCTUATION_REDUCE_FACTOR + self._end_of_speech_delay * self.PUNCTUATION_REDUCE_FACTOR if self._end_with_punctuation() else 1.0 ) diff --git a/livekit-agents/livekit/agents/worker.py b/livekit-agents/livekit/agents/worker.py index 8aaf56d6c..1193ea269 100644 --- a/livekit-agents/livekit/agents/worker.py +++ b/livekit-agents/livekit/agents/worker.py @@ -17,12 +17,15 @@ import asyncio import contextlib import datetime +import math import multiprocessing as mp import os +import sys import threading from dataclasses import dataclass, field +from enum import Enum from functools import reduce -from typing import Any, Callable, Coroutine, 
Literal +from typing import Any, Awaitable, Callable, Generic, Literal, TypeVar from urllib.parse import urljoin, urlparse import aiohttp @@ -33,7 +36,14 @@ from . import http_server, ipc, utils from .exceptions import AssignmentTimeoutError -from .job import JobAcceptArguments, JobContext, JobProcess, JobRequest, RunningJobInfo +from .job import ( + JobAcceptArguments, + JobContext, + JobExecutorType, + JobProcess, + JobRequest, + RunningJobInfo, +) from .log import DEV_LEVEL, logger from .version import __version__ @@ -49,6 +59,11 @@ async def _default_request_fnc(ctx: JobRequest) -> None: await ctx.accept() +class WorkerType(Enum): + ROOM = agent.JobType.JT_ROOM + PUBLISHER = agent.JobType.JT_PUBLISHER + + class _DefaultLoadCalc: _instance = None @@ -89,12 +104,34 @@ class WorkerPermissions: hidden: bool = False +if sys.platform.startswith("win"): + # Some python versions on Windows gets a BrokenPipeError when creating a new process + _default_job_executor_type = JobExecutorType.THREAD +else: + _default_job_executor_type = JobExecutorType.PROCESS + + +T = TypeVar("T") + + +@dataclass(frozen=True) +class _WorkerEnvOption(Generic[T]): + dev_default: T + prod_default: T + + @staticmethod + def getvalue(opt: T | _WorkerEnvOption[T], devmode: bool) -> T: + if isinstance(opt, _WorkerEnvOption): + return opt.dev_default if devmode else opt.prod_default + return opt + + # NOTE: this object must be pickle-able @dataclass class WorkerOptions: - entrypoint_fnc: Callable[[JobContext], Coroutine] + entrypoint_fnc: Callable[[JobContext], Awaitable[None]] """Entrypoint function that will be called when a job is assigned to this worker.""" - request_fnc: Callable[[JobRequest], Coroutine] = _default_request_fnc + request_fnc: Callable[[JobRequest], Awaitable[None]] = _default_request_fnc """Inspect the request and decide if the current worker should handle it. 
When left empty, all jobs are accepted.""" @@ -102,9 +139,18 @@ class WorkerOptions: """A function to perform any necessary initialization before the job starts.""" load_fnc: Callable[[], float] = _DefaultLoadCalc.get_load """Called to determine the current load of the worker. Should return a value between 0 and 1.""" - load_threshold: float = 0.65 - """When the load exceeds this threshold, the worker will be marked as unavailable.""" - num_idle_processes: int = 3 + job_executor_type: JobExecutorType = _default_job_executor_type + """Which executor to use to run jobs. (currently thread or process are supported)""" + load_threshold: float | _WorkerEnvOption[float] = _WorkerEnvOption( + dev_default=math.inf, prod_default=0.75 + ) + """When the load exceeds this threshold, the worker will be marked as unavailable. + + Defaults to 0.75 on "production" mode, and is disabled in "development" mode. + """ + num_idle_processes: int | _WorkerEnvOption[int] = _WorkerEnvOption( + dev_default=0, prod_default=3 + ) """Number of idle processes to keep warm.""" shutdown_process_timeout: float = 60.0 """Maximum amount of time to wait for a job to shut down gracefully""" @@ -114,7 +160,9 @@ class WorkerOptions: """Namespace for the agent to be in""" permissions: WorkerPermissions = field(default_factory=WorkerPermissions) """Permissions that the agent should join the room with.""" - worker_type: agent.JobType = agent.JobType.JT_ROOM + agent_name: str = "" + """Agent name can be used when multiple agents are required to join the same room. 
The LiveKit SFU will dispatch jobs to unique agent_name workers independently.""" + worker_type: WorkerType = WorkerType.ROOM """Whether to spin up an agent for each room or publisher.""" max_retry: int = 16 """Maximum number of times to retry connecting to LiveKit.""" @@ -131,10 +179,13 @@ class WorkerOptions: By default it uses ``LIVEKIT_API_SECRET`` from environment""" host: str = "" # default to all interfaces - port: int = 8081 + port: int | _WorkerEnvOption[int] = _WorkerEnvOption( + dev_default=0, prod_default=8081 + ) """Port for local HTTP server to listen on. - The HTTP server is used as a health check endpoint.""" + The HTTP server is used as a health check endpoint. + """ EventTypes = Literal["worker_registered"] @@ -142,10 +193,14 @@ class WorkerOptions: class Worker(utils.EventEmitter[EventTypes]): def __init__( - self, opts: WorkerOptions, *, loop: asyncio.AbstractEventLoop | None = None + self, + opts: WorkerOptions, + *, + devmode: bool = True, + loop: asyncio.AbstractEventLoop | None = None, ) -> None: super().__init__() - opts.ws_url = opts.ws_url or opts.ws_url or os.environ.get("LIVEKIT_URL") or "" + opts.ws_url = opts.ws_url or os.environ.get("LIVEKIT_URL") or "" opts.api_key = opts.api_key or os.environ.get("LIVEKIT_API_KEY") or "" opts.api_secret = opts.api_secret or os.environ.get("LIVEKIT_API_SECRET") or "" @@ -173,6 +228,7 @@ def __init__( self._pending_assignments: dict[str, asyncio.Future[agent.JobAssignment]] = {} self._close_future: asyncio.Future[None] | None = None self._msg_chan = utils.aio.Chan[agent.WorkerMessage](128, loop=self._loop) + self._devmode = devmode # using spawn context for all platforms. 
We may have further optimizations for # Linux with forkserver, but for now, this is the safest option @@ -180,8 +236,11 @@ def __init__( self._proc_pool = ipc.proc_pool.ProcPool( initialize_process_fnc=opts.prewarm_fnc, job_entrypoint_fnc=opts.entrypoint_fnc, - num_idle_processes=opts.num_idle_processes, + num_idle_processes=_WorkerEnvOption.getvalue( + opts.num_idle_processes, self._devmode + ), loop=self._loop, + job_executor_type=opts.job_executor_type, mp_ctx=mp_ctx, initialize_timeout=opts.initialize_process_timeout, close_timeout=opts.shutdown_process_timeout, @@ -190,10 +249,12 @@ def __init__( self._api: api.LiveKitAPI | None = None self._http_session: aiohttp.ClientSession | None = None self._http_server = http_server.HttpServer( - opts.host, opts.port, loop=self._loop + opts.host, + _WorkerEnvOption.getvalue(opts.port, self._devmode), + loop=self._loop, ) - self._main_task: asyncio.Task | None = None + self._main_task: asyncio.Task[None] | None = None async def run(self): if not self._closed: @@ -346,7 +407,7 @@ async def _worker_task(self) -> None: # register the worker req = agent.WorkerMessage() - req.register.type = self._opts.worker_type + req.register.type = self._opts.worker_type.value req.register.allowed_permissions.CopyFrom( models.ParticipantPermission( can_publish=self._opts.permissions.can_publish, @@ -409,7 +470,9 @@ async def _load_task(): None, self._opts.load_fnc ) - is_full = current_load >= self._opts.load_threshold + is_full = current_load >= _WorkerEnvOption.getvalue( + self._opts.load_threshold, self._devmode + ) currently_available = not is_full and not self._draining current_status = ( @@ -479,6 +542,13 @@ async def _recv_task(): self._handle_availability(msg.availability) elif which == "assignment": self._handle_assignment(msg.assignment) + elif which == "termination": + user_task = self._loop.create_task( + self._handle_termination(msg.termination), + name="agent_job_termination", + ) + self._tasks.add(user_task) + 
user_task.add_done_callback(self._tasks.discard) tasks = [ asyncio.create_task(_load_task()), @@ -492,7 +562,11 @@ async def _recv_task(): async def _reload_jobs(self, jobs: list[RunningJobInfo]) -> None: for aj in jobs: - logger.log(DEV_LEVEL, "reloading job", extra={"job_id": aj.job.id}) + logger.log( + DEV_LEVEL, + "reloading job", + extra={"job_id": aj.job.id, "agent_name": aj.job.agent_name}, + ) url = self._opts.ws_url # take the original jwt token and extend it while keeping all the same data that was generated @@ -561,7 +635,7 @@ async def _on_accept(args: JobAcceptArguments) -> None: except asyncio.TimeoutError: logger.warning( f"assignment for job {job_req.id} timed out", - extra={"job_request": job_req}, + extra={"job_request": job_req, "agent_name": self._opts.agent_name}, ) raise AssignmentTimeoutError() @@ -579,7 +653,11 @@ async def _on_accept(args: JobAcceptArguments) -> None: logger.info( "received job request", - extra={"job_request": msg.job, "resuming": msg.resuming}, + extra={ + "job_request": msg.job, + "resuming": msg.resuming, + "agent_name": self._opts.agent_name, + }, ) @utils.log_exceptions(logger=logger) @@ -588,13 +666,14 @@ async def _job_request_task(): await self._opts.request_fnc(job_req) except Exception: logger.exception( - "job_request_fnc failed", extra={"job_request": job_req} + "job_request_fnc failed", + extra={"job_request": job_req, "agent_name": self._opts.agent_name}, ) if not answered: logger.warning( "no answer was given inside the job_request_fnc, automatically rejecting the job", - extra={"job_request": job_req}, + extra={"job_request": job_req, "agent_name": self._opts.agent_name}, ) await _on_reject() @@ -609,5 +688,13 @@ def _handle_assignment(self, assignment: agent.JobAssignment): fut.set_result(assignment) else: logger.warning( - "received assignment for an unknown job", extra={"job": assignment.job} + "received assignment for an unknown job", + extra={"job": assignment.job, "agent_name": self._opts.agent_name}, 
) + + async def _handle_termination(self, msg: agent.JobTermination): + proc = self._proc_pool.get_by_job_id(msg.job_id) + if not proc: + # safe to ignore + return + await proc.aclose() diff --git a/livekit-agents/package.json b/livekit-agents/package.json index cc0f161f3..2327f51d3 100644 --- a/livekit-agents/package.json +++ b/livekit-agents/package.json @@ -1,5 +1,5 @@ { "name": "livekit-agents", "private": true, - "version": "0.8.5" + "version": "0.9.0" } diff --git a/livekit-agents/setup.py b/livekit-agents/setup.py index ca4a6eae7..93716feb1 100644 --- a/livekit-agents/setup.py +++ b/livekit-agents/setup.py @@ -48,7 +48,7 @@ python_requires=">=3.9.0", install_requires=[ "click~=8.1", - "livekit~=0.12", + "livekit>=0.16.3", "livekit-api~=0.6", "livekit-protocol~=0.6", "protobuf>=3", @@ -57,6 +57,7 @@ "watchfiles~=0.22", "psutil~=5.9", "aiohttp~=3.10", + "typing-extensions~=4.12", ], extras_require={ ':sys_platform=="win32"': [ diff --git a/livekit-plugins/install_plugins_editable.sh b/livekit-plugins/install_plugins_editable.sh index a56570708..eead3d9f8 100755 --- a/livekit-plugins/install_plugins_editable.sh +++ b/livekit-plugins/install_plugins_editable.sh @@ -16,3 +16,4 @@ pip install -e ./livekit-plugins-nltk --config-settings editable_mode=strict pip install -e ./livekit-plugins-openai --config-settings editable_mode=strict pip install -e ./livekit-plugins-rag --config-settings editable_mode=strict pip install -e ./livekit-plugins-silero --config-settings editable_mode=strict +pip install -e ./livekit-plugins-browser --config-settings editable_mode=strict diff --git a/livekit-plugins/livekit-plugins-anthropic/CHANGELOG.md b/livekit-plugins/livekit-plugins-anthropic/CHANGELOG.md new file mode 100644 index 000000000..81b9b2221 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/CHANGELOG.md @@ -0,0 +1,13 @@ +# livekit-plugins-anthropic + +## 0.2.1 + +### Patch Changes + +- Fixes to Anthropic Function Calling - 
[#708](https://github.com/livekit/agents/pull/708) ([@keepingitneil](https://github.com/keepingitneil)) + +## 0.2.0 + +### Minor Changes + +- bump anthropic for release - [#724](https://github.com/livekit/agents/pull/724) ([@theomonnom](https://github.com/theomonnom)) diff --git a/livekit-plugins/livekit-plugins-anthropic/README.md b/livekit-plugins/livekit-plugins-anthropic/README.md new file mode 100644 index 000000000..3eabfa1c2 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/README.md @@ -0,0 +1,13 @@ +# LiveKit Plugins Anthropic + +Agent Framework plugin for services from Anthropic. + +## Installation + +```bash +pip install livekit-plugins-anthropic +``` + +## Pre-requisites + +You'll need an API key from Anthropic. It can be set as an environment variable: `ANTHROPIC_API_KEY` diff --git a/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/__init__.py b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/__init__.py new file mode 100644 index 000000000..464766951 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/__init__.py @@ -0,0 +1,37 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +from .llm import LLM, LLMStream +from .log import logger +from .models import ChatModels +from .version import __version__ + +__all__ = [ + "LLM", + "LLMStream", + "ChatModels", + "logger", + "__version__", +] + +from livekit.agents import Plugin + + +class AnthropicPlugin(Plugin): + def __init__(self) -> None: + super().__init__(__name__, __version__, __package__, logger) + + +Plugin.register_plugin(AnthropicPlugin()) diff --git a/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py new file mode 100644 index 000000000..6fa6df13c --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py @@ -0,0 +1,511 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import base64 +import inspect +import json +import os +from dataclasses import dataclass +from typing import Any, Awaitable, List, Tuple, get_args, get_origin + +import httpx +from livekit import rtc +from livekit.agents import llm, utils + +import anthropic + +from .log import logger +from .models import ( + ChatModels, +) + + +@dataclass +class LLMOptions: + model: str | ChatModels + user: str | None + temperature: float | None + + +class LLM(llm.LLM): + def __init__( + self, + *, + model: str | ChatModels = "claude-3-haiku-20240307", + api_key: str | None = None, + base_url: str | None = None, + user: str | None = None, + client: anthropic.AsyncClient | None = None, + temperature: float | None = None, + ) -> None: + """ + Create a new instance of Anthropic LLM. + + ``api_key`` must be set to your Anthropic API key, either using the argument or by setting + the ``ANTHROPIC_API_KEY`` environmental variable. + """ + # throw an error on our end + api_key = api_key or os.environ.get("ANTHROPIC_API_KEY") + if api_key is None: + raise ValueError("Anthropic API key is required") + + self._opts = LLMOptions(model=model, user=user, temperature=temperature) + self._client = client or anthropic.AsyncClient( + api_key=api_key, + base_url=base_url, + http_client=httpx.AsyncClient( + timeout=5.0, + follow_redirects=True, + limits=httpx.Limits( + max_connections=1000, + max_keepalive_connections=100, + keepalive_expiry=120, + ), + ), + ) + + def chat( + self, + *, + chat_ctx: llm.ChatContext, + fnc_ctx: llm.FunctionContext | None = None, + temperature: float | None = None, + n: int | None = 1, + parallel_tool_calls: bool | None = None, + ) -> "LLMStream": + if temperature is None: + temperature = self._opts.temperature + + opts: dict[str, Any] = dict() + if fnc_ctx and len(fnc_ctx.ai_functions) > 0: + fncs_desc: list[anthropic.types.ToolParam] = [] + for fnc in fnc_ctx.ai_functions.values(): + 
fncs_desc.append(_build_function_description(fnc)) + + opts["tools"] = fncs_desc + + if fnc_ctx and parallel_tool_calls is not None: + opts["parallel_tool_calls"] = parallel_tool_calls + + latest_system_message = _latest_system_message(chat_ctx) + anthropic_ctx = _build_anthropic_context(chat_ctx.messages, id(self)) + collaped_anthropic_ctx = _merge_messages(anthropic_ctx) + stream = self._client.messages.create( + max_tokens=opts.get("max_tokens", 1000), + system=latest_system_message, + messages=collaped_anthropic_ctx, + model=self._opts.model, + temperature=temperature or anthropic.NOT_GIVEN, + top_k=n or anthropic.NOT_GIVEN, + stream=True, + **opts, + ) + + return LLMStream(anthropic_stream=stream, chat_ctx=chat_ctx, fnc_ctx=fnc_ctx) + + +class LLMStream(llm.LLMStream): + def __init__( + self, + *, + anthropic_stream: Awaitable[ + anthropic.AsyncStream[anthropic.types.RawMessageStreamEvent] + ], + chat_ctx: llm.ChatContext, + fnc_ctx: llm.FunctionContext | None, + ) -> None: + super().__init__(chat_ctx=chat_ctx, fnc_ctx=fnc_ctx) + self._awaitable_anthropic_stream = anthropic_stream + self._anthropic_stream: ( + anthropic.AsyncStream[anthropic.types.RawMessageStreamEvent] | None + ) = None + + # current function call that we're waiting for full completion (args are streamed) + self._tool_call_id: str | None = None + self._fnc_name: str | None = None + self._fnc_raw_arguments: str | None = None + + async def aclose(self) -> None: + if self._anthropic_stream: + await self._anthropic_stream.close() + + return await super().aclose() + + async def __anext__(self): + if not self._anthropic_stream: + self._anthropic_stream = await self._awaitable_anthropic_stream + + fn_calling_enabled = self._fnc_ctx is not None + ignore = False + + async for event in self._anthropic_stream: + if event.type == "message_start": + pass + elif event.type == "message_delta": + pass + elif event.type == "message_stop": + pass + elif event.type == "content_block_start": + if 
event.content_block.type == "tool_use": + self._tool_call_id = event.content_block.id + self._fnc_raw_arguments = "" + self._fnc_name = event.content_block.name + elif event.type == "content_block_delta": + delta = event.delta + if delta.type == "text_delta": + text = delta.text + + # Anthropic seems to add a prompt when tool calling is enabled + # where responses always start with a "" block containing + # the LLM's chain of thought. It's very verbose and not useful for voice + # applications. + if fn_calling_enabled: + if text.startswith(""): + ignore = True + + if "" in text: + text = text.split("")[-1] + ignore = False + + if ignore: + continue + + return llm.ChatChunk( + choices=[ + llm.Choice( + delta=llm.ChoiceDelta(content=text, role="assistant") + ) + ] + ) + elif delta.type == "input_json_delta": + assert self._fnc_raw_arguments is not None + self._fnc_raw_arguments += delta.partial_json + + elif event.type == "content_block_stop": + if self._tool_call_id is not None and self._fnc_ctx: + assert self._fnc_name is not None + assert self._fnc_raw_arguments is not None + fnc_info = _create_ai_function_info( + self._fnc_ctx, + self._tool_call_id, + self._fnc_name, + self._fnc_raw_arguments, + ) + self._function_calls_info.append(fnc_info) + chunk = llm.ChatChunk( + choices=[ + llm.Choice( + delta=llm.ChoiceDelta( + role="assistant", tool_calls=[fnc_info] + ), + index=0, + ) + ] + ) + self._tool_call_id = None + self._fnc_raw_arguments = None + self._fnc_name = None + return chunk + + raise StopAsyncIteration + + +def _latest_system_message(chat_ctx: llm.ChatContext) -> str: + latest_system_message: llm.ChatMessage | None = None + for m in chat_ctx.messages: + if m.role == "system": + latest_system_message = m + continue + + latest_system_str = "" + if latest_system_message: + if isinstance(latest_system_message.content, str): + latest_system_str = latest_system_message.content + elif isinstance(latest_system_message.content, list): + latest_system_str = " 
".join( + [c for c in latest_system_message.content if isinstance(c, str)] + ) + return latest_system_str + + +def _merge_messages( + messages: List[anthropic.types.MessageParam], +) -> List[anthropic.types.MessageParam]: + # Anthropic enforces alternating messages + combined_messages: list[anthropic.types.MessageParam] = [] + for m in messages: + if len(combined_messages) == 0 or m["role"] != combined_messages[-1]["role"]: + combined_messages.append(m) + continue + last_message = combined_messages[-1] + if not isinstance(last_message["content"], list) or not isinstance( + m["content"], list + ): + logger.error("message content is not a list") + continue + + last_message["content"].extend(m["content"]) + + if len(combined_messages) == 0 or combined_messages[0]["role"] != "user": + combined_messages.insert( + 0, {"role": "user", "content": [{"type": "text", "text": "(empty)"}]} + ) + + return combined_messages + + +def _build_anthropic_context( + chat_ctx: List[llm.ChatMessage], cache_key: Any +) -> List[anthropic.types.MessageParam]: + result: List[anthropic.types.MessageParam] = [] + for msg in chat_ctx: + a_msg = _build_anthropic_message(msg, cache_key, chat_ctx) + if a_msg: + result.append(a_msg) + return result + + +def _build_anthropic_message( + msg: llm.ChatMessage, cache_key: Any, chat_ctx: List[llm.ChatMessage] +) -> anthropic.types.MessageParam | None: + if msg.role == "user" or msg.role == "assistant": + a_msg: anthropic.types.MessageParam = { + "role": msg.role, + "content": [], + } + assert isinstance(a_msg["content"], list) + a_content = a_msg["content"] + + # add content if provided + if isinstance(msg.content, str): + a_msg["content"].append( + anthropic.types.TextBlock( + text=msg.content, + type="text", + ) + ) + elif isinstance(msg.content, list): + for cnt in msg.content: + if isinstance(cnt, str): + content: anthropic.types.TextBlock = anthropic.types.TextBlock( + text=cnt, + type="text", + ) + a_content.append(content) + elif isinstance(cnt, 
llm.ChatImage): + a_content.append(_build_anthropic_image_content(cnt, cache_key)) + + if msg.tool_calls is not None: + for fnc in msg.tool_calls: + tool_use = anthropic.types.ToolUseBlockParam( + id=fnc.tool_call_id, + type="tool_use", + name=fnc.function_info.name, + input=fnc.arguments, + ) + a_content.append(tool_use) + + return a_msg + elif msg.role == "tool": + if not isinstance(msg.content, str): + logger.warning("tool message content is not a string") + return None + if not msg.tool_call_id: + return None + + u_content = anthropic.types.ToolResultBlockParam( + tool_use_id=msg.tool_call_id, + type="tool_result", + content=msg.content, + is_error=msg.tool_exception is not None, + ) + return { + "role": "user", + "content": [u_content], + } + + return None + + +def _build_anthropic_image_content( + image: llm.ChatImage, cache_key: Any +) -> anthropic.types.ImageBlockParam: + if isinstance(image.image, str): # image url + logger.warning( + "image url not supported by anthropic, skipping image '%s'", image.image + ) + elif isinstance(image.image, rtc.VideoFrame): # VideoFrame + if cache_key not in image._cache: + # inside our internal implementation, we allow to put extra metadata to + # each ChatImage (avoid to reencode each time we do a chatcompletion request) + opts = utils.images.EncodeOptions() + if image.inference_width and image.inference_height: + opts.resize_options = utils.images.ResizeOptions( + width=image.inference_width, + height=image.inference_height, + strategy="center_aspect_fit", + ) + + encoded_data = utils.images.encode(image.image, opts) + image._cache[cache_key] = base64.b64encode(encoded_data).decode("utf-8") + + return { + "type": "image", + "source": { + "type": "base64", + "data": image._cache[cache_key], + "media_type": "image/jpeg", + }, + } + + raise ValueError(f"unknown image type {type(image.image)}") + + +def _create_ai_function_info( + fnc_ctx: llm.function_context.FunctionContext, + tool_call_id: str, + fnc_name: str, + 
raw_arguments: str, # JSON string +) -> llm.function_context.FunctionCallInfo: + if fnc_name not in fnc_ctx.ai_functions: + raise ValueError(f"AI function {fnc_name} not found") + + parsed_arguments: dict[str, Any] = {} + try: + if raw_arguments: # ignore empty string + parsed_arguments = json.loads(raw_arguments) + except json.JSONDecodeError: + raise ValueError( + f"AI function {fnc_name} received invalid JSON arguments - {raw_arguments}" + ) + + fnc_info = fnc_ctx.ai_functions[fnc_name] + + # Ensure all necessary arguments are present and of the correct type. + sanitized_arguments: dict[str, Any] = {} + for arg_info in fnc_info.arguments.values(): + if arg_info.name not in parsed_arguments: + if arg_info.default is inspect.Parameter.empty: + raise ValueError( + f"AI function {fnc_name} missing required argument {arg_info.name}" + ) + continue + + arg_value = parsed_arguments[arg_info.name] + if get_origin(arg_info.type) is not None: + if not isinstance(arg_value, list): + raise ValueError( + f"AI function {fnc_name} argument {arg_info.name} should be a list" + ) + + inner_type = get_args(arg_info.type)[0] + sanitized_value = [ + _sanitize_primitive( + value=v, expected_type=inner_type, choices=arg_info.choices + ) + for v in arg_value + ] + else: + sanitized_value = _sanitize_primitive( + value=arg_value, expected_type=arg_info.type, choices=arg_info.choices + ) + + sanitized_arguments[arg_info.name] = sanitized_value + + return llm.function_context.FunctionCallInfo( + tool_call_id=tool_call_id, + raw_arguments=raw_arguments, + function_info=fnc_info, + arguments=sanitized_arguments, + ) + + +def _build_function_description( + fnc_info: llm.function_context.FunctionInfo, +) -> anthropic.types.ToolParam: + def build_schema_field(arg_info: llm.function_context.FunctionArgInfo): + def type2str(t: type) -> str: + if t is str: + return "string" + elif t in (int, float): + return "number" + elif t is bool: + return "boolean" + + raise ValueError(f"unsupported type {t} 
for ai_property") + + p: dict[str, Any] = {} + if arg_info.default is inspect.Parameter.empty: + p["required"] = True + else: + p["required"] = False + + if arg_info.description: + p["description"] = arg_info.description + + if get_origin(arg_info.type) is list: + inner_type = get_args(arg_info.type)[0] + p["type"] = "array" + p["items"] = {} + p["items"]["type"] = type2str(inner_type) + + if arg_info.choices: + p["items"]["enum"] = arg_info.choices + else: + p["type"] = type2str(arg_info.type) + if arg_info.choices: + p["enum"] = arg_info.choices + + return p + + input_schema: dict[str, object] = {"type": "object"} + + for arg_info in fnc_info.arguments.values(): + input_schema[arg_info.name] = build_schema_field(arg_info) + + return { + "name": fnc_info.name, + "description": fnc_info.description, + "input_schema": input_schema, + } + + +def _sanitize_primitive( + *, value: Any, expected_type: type, choices: Tuple[Any] | None +) -> Any: + if expected_type is str: + if not isinstance(value, str): + raise ValueError(f"expected str, got {type(value)}") + elif expected_type in (int, float): + if not isinstance(value, (int, float)): + raise ValueError(f"expected number, got {type(value)}") + + if expected_type is int: + if value % 1 != 0: + raise ValueError("expected int, got float") + + value = int(value) + elif expected_type is float: + value = float(value) + + elif expected_type is bool: + if not isinstance(value, bool): + raise ValueError(f"expected bool, got {type(value)}") + + if choices and value not in choices: + raise ValueError(f"invalid value {value}, not in {choices}") + + return value diff --git a/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/log.py b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/log.py new file mode 100644 index 000000000..aac7cf6eb --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/log.py @@ -0,0 +1,3 @@ +import logging + +logger = 
logging.getLogger("livekit.plugins.anthropic") diff --git a/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/models.py b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/models.py new file mode 100644 index 000000000..502d52d03 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/models.py @@ -0,0 +1,8 @@ +from typing import Literal + +ChatModels = Literal[ + "claude-3-5-sonnet-20240620", + "claude-3-opus-20240229", + "claude-3-sonnet-20240229", + "claude-3-haiku-20240307", +] diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/helper_main_win.cpp b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/py.typed similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/helper_main_win.cpp rename to livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/py.typed diff --git a/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/version.py b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/version.py new file mode 100644 index 000000000..875ee5214 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/version.py @@ -0,0 +1,15 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +__version__ = "0.2.1" diff --git a/livekit-plugins/livekit-plugins-anthropic/package.json b/livekit-plugins/livekit-plugins-anthropic/package.json new file mode 100644 index 000000000..3394ee822 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/package.json @@ -0,0 +1,5 @@ +{ + "name": "livekit-plugins-anthropic", + "private": true, + "version": "0.2.1" +} diff --git a/livekit-plugins/livekit-plugins-anthropic/pyproject.toml b/livekit-plugins/livekit-plugins-anthropic/pyproject.toml new file mode 100644 index 000000000..8cf32563a --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/pyproject.toml @@ -0,0 +1,3 @@ +[build-system] +requires = ["setuptools>=61.0"] +build-backend = "setuptools.build_meta" \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-anthropic/setup.py b/livekit-plugins/livekit-plugins-anthropic/setup.py new file mode 100644 index 000000000..5cbeb9625 --- /dev/null +++ b/livekit-plugins/livekit-plugins-anthropic/setup.py @@ -0,0 +1,59 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import pathlib + +import setuptools +import setuptools.command.build_py + +here = pathlib.Path(__file__).parent.resolve() +about = {} +with open( + os.path.join(here, "livekit", "plugins", "anthropic", "version.py"), "r" +) as f: + exec(f.read(), about) + + +setuptools.setup( + name="livekit-plugins-anthropic", + version=about["__version__"], + description="Agent Framework plugin for services from Anthropic", + long_description=(here / "README.md").read_text(encoding="utf-8"), + long_description_content_type="text/markdown", + url="https://github.com/livekit/agents", + cmdclass={}, + classifiers=[ + "Intended Audience :: Developers", + "License :: OSI Approved :: Apache Software License", + "Topic :: Multimedia :: Sound/Audio", + "Topic :: Multimedia :: Video", + "Topic :: Scientific/Engineering :: Artificial Intelligence", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3 :: Only", + ], + keywords=["webrtc", "realtime", "audio", "video", "livekit"], + license="Apache-2.0", + packages=setuptools.find_namespace_packages(include=["livekit.*"]), + python_requires=">=3.9.0", + install_requires=["livekit-agents~=0.8", "anthropic ~= 0.34"], + package_data={"livekit.plugins.anthropic": ["py.typed"]}, + project_urls={ + "Documentation": "https://docs.livekit.io", + "Website": "https://livekit.io/", + "Source": "https://github.com/livekit/agents", + }, +) diff --git a/livekit-plugins/livekit-plugins-azure/CHANGELOG.md b/livekit-plugins/livekit-plugins-azure/CHANGELOG.md index 7a8b527bf..fcee6ff88 100644 --- a/livekit-plugins/livekit-plugins-azure/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-azure/CHANGELOG.md @@ -1,5 +1,11 @@ # livekit-plugins-azure +## 0.3.2 + +### Patch Changes + +- avoid returning tiny frames from TTS - [#747](https://github.com/livekit/agents/pull/747) ([@theomonnom](https://github.com/theomonnom)) + ## 0.3.1 ### Patch 
Changes diff --git a/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py b/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py index 98fa8de2f..b3ae6b9ee 100644 --- a/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py +++ b/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py @@ -45,6 +45,13 @@ def __init__( num_channels: int = 1, languages: list[str] = [], # when empty, auto-detect the language ): + """ + Create a new instance of Azure STT. + + ``speech_key`` and ``speech_region`` must be set, either using arguments or by setting the + ``AZURE_SPEECH_KEY`` and ``AZURE_SPEECH_REGION`` environmental variables, respectively. + """ + super().__init__( capabilities=stt.STTCapabilities(streaming=True, interim_results=True) ) diff --git a/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/tts.py b/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/tts.py index 3efeea38a..a21d9e948 100644 --- a/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/tts.py +++ b/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/tts.py @@ -16,7 +16,6 @@ import os from dataclasses import dataclass -from livekit import rtc from livekit.agents import tts, utils import azure.cognitiveservices.speech as speechsdk # type: ignore @@ -42,6 +41,13 @@ def __init__( speech_region: str | None = None, voice: str | None = None, ) -> None: + """ + Create a new instance of Azure TTS. + + ``speech_key`` and ``speech_region`` must be set, either using arguments or by setting the + ``AZURE_SPEECH_KEY`` and ``AZURE_SPEECH_REGION`` environmental variables, respectively. 
+ """ + super().__init__( capabilities=tts.TTSCapabilities( streaming=False, @@ -73,17 +79,18 @@ def __init__(self, text: str, opts: _TTSOptions) -> None: @utils.log_exceptions() async def _main_task(self): - stream_callback = _PushAudioOutputStreamCallback( - asyncio.get_running_loop(), self._event_ch + stream_callback = speechsdk.audio.PushAudioOutputStream( + _PushAudioOutputStreamCallback(asyncio.get_running_loop(), self._event_ch) ) synthesizer = _create_speech_synthesizer( config=self._opts, - stream=speechsdk.audio.PushAudioOutputStream(stream_callback), + stream=stream_callback, ) def _synthesize() -> speechsdk.SpeechSynthesisResult: return synthesizer.speak_text_async(self._text).get() # type: ignore + result = None try: result = await asyncio.to_thread(_synthesize) if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted: @@ -93,8 +100,11 @@ def _synthesize() -> speechsdk.SpeechSynthesisResult: finally: def _cleanup() -> None: - nonlocal synthesizer, result + # cleanup resources inside an Executor + # to avoid blocking the event loop + nonlocal synthesizer, stream_callback, result del synthesizer + del stream_callback del result await asyncio.to_thread(_cleanup) @@ -112,20 +122,30 @@ def __init__( self._request_id = utils.shortuuid() self._segment_id = utils.shortuuid() - def write(self, audio_buffer: memoryview) -> int: - audio = tts.SynthesizedAudio( - request_id=self._request_id, - segment_id=self._segment_id, - frame=rtc.AudioFrame( - data=audio_buffer, - sample_rate=AZURE_SAMPLE_RATE, - num_channels=AZURE_NUM_CHANNELS, - samples_per_channel=audio_buffer.nbytes // 2, - ), + self._bstream = utils.audio.AudioByteStream( + sample_rate=AZURE_SAMPLE_RATE, num_channels=AZURE_NUM_CHANNELS ) - self._loop.call_soon_threadsafe(self._event_ch.send_nowait, audio) + + def write(self, audio_buffer: memoryview) -> int: + for frame in self._bstream.write(audio_buffer.tobytes()): + audio = tts.SynthesizedAudio( + request_id=self._request_id, + 
segment_id=self._segment_id, + frame=frame, + ) + self._loop.call_soon_threadsafe(self._event_ch.send_nowait, audio) + return audio_buffer.nbytes + def close(self) -> None: + for frame in self._bstream.flush(): + audio = tts.SynthesizedAudio( + request_id=self._request_id, + segment_id=self._segment_id, + frame=frame, + ) + self._loop.call_soon_threadsafe(self._event_ch.send_nowait, audio) + def _create_speech_synthesizer( *, config: _TTSOptions, stream: speechsdk.audio.AudioOutputStream diff --git a/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/version.py b/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/version.py index 8787f001e..38fc4a80e 100644 --- a/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/version.py +++ b/livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -__version__ = "0.3.1" +__version__ = "0.3.2" diff --git a/livekit-plugins/livekit-plugins-azure/package.json b/livekit-plugins/livekit-plugins-azure/package.json index 40342724b..e1db756ac 100644 --- a/livekit-plugins/livekit-plugins-azure/package.json +++ b/livekit-plugins/livekit-plugins-azure/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-azure", "private": true, - "version": "0.3.1" + "version": "0.3.2" } diff --git a/livekit-plugins/livekit-plugins-browser/cef/.clang-format b/livekit-plugins/livekit-plugins-browser/.clang-format similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/.clang-format rename to livekit-plugins/livekit-plugins-browser/.clang-format diff --git a/livekit-plugins/livekit-plugins-browser/cef/.gitignore b/livekit-plugins/livekit-plugins-browser/.gitignore similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/.gitignore rename to livekit-plugins/livekit-plugins-browser/.gitignore diff --git 
a/livekit-plugins/livekit-plugins-browser/CHANGELOG.md b/livekit-plugins/livekit-plugins-browser/CHANGELOG.md new file mode 100644 index 000000000..f000991ea --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/CHANGELOG.md @@ -0,0 +1,7 @@ +# livekit-plugins-browser + +## 0.0.2 + +### Patch Changes + +- livekit-plugins-browser: prepare for release - [#659](https://github.com/livekit/agents/pull/659) ([@theomonnom](https://github.com/theomonnom)) diff --git a/livekit-plugins/livekit-plugins-browser/cef/CMakeLists.txt b/livekit-plugins/livekit-plugins-browser/CMakeLists.txt similarity index 90% rename from livekit-plugins/livekit-plugins-browser/cef/CMakeLists.txt rename to livekit-plugins/livekit-plugins-browser/CMakeLists.txt index 0d113bd32..30b9e1255 100644 --- a/livekit-plugins/livekit-plugins-browser/cef/CMakeLists.txt +++ b/livekit-plugins/livekit-plugins-browser/CMakeLists.txt @@ -11,7 +11,8 @@ set(USE_SANDBOX OFF) # TODO(theomonnom): I don't think we want to enable sandbox # Specify the CEF distribution version. 
if(NOT DEFINED CEF_VERSION) - set(CEF_VERSION "122.1.10+gc902316+chromium-122.0.6261.112") + # set(CEF_VERSION "122.1.10+gc902316+chromium-122.0.6261.112") + set(CEF_VERSION "127.3.5+g114ea2a+chromium-127.0.6533.120") endif() if("${CMAKE_SYSTEM_NAME}" STREQUAL "Darwin") diff --git a/livekit-plugins/livekit-plugins-browser/cef/LICENSE.txt b/livekit-plugins/livekit-plugins-browser/LICENSE.txt similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/LICENSE.txt rename to livekit-plugins/livekit-plugins-browser/LICENSE.txt diff --git a/livekit-plugins/livekit-plugins-browser/README.md b/livekit-plugins/livekit-plugins-browser/README.md new file mode 100644 index 000000000..ae9207bfd --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/README.md @@ -0,0 +1,4 @@ +# LiveKit Plugins Browser + +Chromium Embedded Framework (CEF) for LiveKit Agents + diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/agents_python.cpp b/livekit-plugins/livekit-plugins-browser/cef/src/agents_python.cpp deleted file mode 100644 index 7e44d624a..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/agents_python.cpp +++ /dev/null @@ -1,52 +0,0 @@ -#include "agents_python.hpp" - -#include "app.hpp" -#include "include/internal/cef_mac.h" - -#include -#include - -namespace py = pybind11; - -BrowserApp::BrowserApp(const AppOptions& options) : options_(options) { - app_ = new AgentApp(options_.dev_mode, options_.initialized_callback); -} - -std::shared_ptr BrowserApp::CreateBrowser( - const std::string& url, - const BrowserOptions& options) { - - app_->CreateBrowser(url, options.framerate, options.created_callback); - return nullptr;//std::make_shared(); -} - -int BrowserApp::Run() { - return RunAgentApp(app_); -} - -BrowserImpl::BrowserImpl() {} - -void BrowserImpl::SetSize(int width, int height) {} - -PYBIND11_MODULE(lkcef_python, m) { - // Isn't that fucking cool? 
llm using browsers - m.doc() = "Chromium Embedded Framework (CEF) for LiveKit Agents"; - - py::class_(m, "AppOptions") - .def(py::init()) - .def_readwrite("dev_mode", &AppOptions::dev_mode) - .def_readwrite("initialized_callback", &AppOptions::initialized_callback); - - py::class_(m, "BrowserOptions") - .def(py::init()) - .def_readwrite("framerate", &BrowserOptions::framerate) - .def_readwrite("created_callback", &BrowserOptions::created_callback); - - py::class_(m, "BrowserApp") - .def(py::init()) - .def("create_browser", &BrowserApp::CreateBrowser) - .def("run", &BrowserApp::Run); - - py::class_(m, "BrowserImpl") - .def("set_size", &BrowserImpl::SetSize); -} diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/agents_python.hpp b/livekit-plugins/livekit-plugins-browser/cef/src/agents_python.hpp deleted file mode 100644 index e77a59776..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/agents_python.hpp +++ /dev/null @@ -1,39 +0,0 @@ -#ifndef LKCEF_AGENTS_PYTHON_HPP -#define LKCEF_AGENTS_PYTHON_HPP - -#include -#include - -#include "app.hpp" - -class BrowserImpl; - -struct AppOptions { - bool dev_mode = false; - std::function initialized_callback = nullptr; -}; - -struct BrowserOptions { - int framerate = 30; - std::function created_callback = nullptr; -}; - -struct BrowserApp { - BrowserApp(const AppOptions& options); - - std::shared_ptr CreateBrowser(const std::string& url, - const BrowserOptions& options); - int Run(); - - private: - AppOptions options_; - CefRefPtr app_; -}; - -struct BrowserImpl { - BrowserImpl(); - - void SetSize(int width, int height); -}; - -#endif // LKCEF_AGENTS_PYTHON_HPP diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/app.hpp b/livekit-plugins/livekit-plugins-browser/cef/src/app.hpp deleted file mode 100644 index aa5b8d1ab..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/app.hpp +++ /dev/null @@ -1,47 +0,0 @@ -#ifndef LKCEF_APP_HPP -#define LKCEF_APP_HPP - -#include "dev_renderer.hpp" 
-#include "handler.hpp" -#include "include/cef_app.h" -#include "include/cef_base.h" -#include "include/cef_browser_process_handler.h" -#include "include/cef_client.h" -#include "include/internal/cef_ptr.h" - -class AgentApp : public CefApp, public CefBrowserProcessHandler { - public: - AgentApp(bool dev_mode, std::function initialized_callback); - - CefRefPtr GetBrowserProcessHandler() override { - return this; - } - - void OnBeforeCommandLineProcessing( - const CefString& process_type, - CefRefPtr command_line) override; - - void OnContextInitialized() override; - - CefRefPtr GetDefaultClient() override; - - CefRefPtr CreateBrowser( - const std::string& url, - int framerate, - std::function created_callback); - - int Run(); - - private: - IMPLEMENT_REFCOUNTING(AgentApp); - - CefRefPtr client_; - CefRefPtr dev_renderer_; - - bool dev_mode_; - std::function initialized_callback_; -}; - -int RunAgentApp(CefRefPtr app); - -#endif // LKCEF_APP_HPP diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/app_mac.mm b/livekit-plugins/livekit-plugins-browser/cef/src/app_mac.mm deleted file mode 100644 index 3136303eb..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/app_mac.mm +++ /dev/null @@ -1,146 +0,0 @@ - -#import - -#include - -#include "app.hpp" -#include "handler.hpp" -#include "include/cef_application_mac.h" -#include "include/cef_command_line.h" -#include "include/wrapper/cef_library_loader.h" - -// Receives notifications from the application. -@interface AgentsAppDelegate : NSObject - -- (void)createApplication:(id)object; -- (void)tryToTerminateApplication:(NSApplication*)app; -@end - -// Provide the CefAppProtocol implementation required by CEF. 
-@interface AgentsApplication : NSApplication { - @private - BOOL handlingSendEvent_; -} -@end - -@implementation AgentsApplication -- (BOOL)isHandlingSendEvent { - return handlingSendEvent_; -} - -- (void)setHandlingSendEvent:(BOOL)handlingSendEvent { - handlingSendEvent_ = handlingSendEvent; -} - -- (void)sendEvent:(NSEvent*)event { - CefScopedSendingEvent sendingEventScoper; - [super sendEvent:event]; -} - -- (void)terminate:(id)sender { - AgentsAppDelegate* delegate = - static_cast([NSApp delegate]); - [delegate tryToTerminateApplication:self]; - // Return, don't exit. The application is responsible for exiting on its own. -} -@end - -@implementation AgentsAppDelegate - -// Create the application on the UI thread. -- (void)createApplication:(id)object { - [[NSBundle mainBundle] loadNibNamed:@"MainMenu" - owner:NSApp - topLevelObjects:nil]; - - // Set the delegate for application events. - [[NSApplication sharedApplication] setDelegate:self]; -} - -- (void)tryToTerminateApplication:(NSApplication*)app { -} - -- (NSApplicationTerminateReply)applicationShouldTerminate: - (NSApplication*)sender { - return NSTerminateNow; -} - -// Called when the user clicks the app dock icon while the application is -// already running. -- (BOOL)applicationShouldHandleReopen:(NSApplication*)theApplication - hasVisibleWindows:(BOOL)flag { - return NO; -} -@end - -// Entry point function for the browser process. -int RunAgentApp(CefRefPtr app) { - CefMainArgs main_args(0, nullptr); - - @autoreleasepool { - [AgentsApplication sharedApplication]; - - // If there was an invocation to NSApp prior to this method, then the NSApp - // will not be a AgentsApplication, but will instead be an NSApplication. - // This is undesirable and we must enforce that this doesn't happen. 
- CHECK([NSApp isKindOfClass:[AgentsApplication class]]); - - std::string framework_path = - "/Users/theomonnom/livekit/agents/livekit-plugins/" - "livekit-plugins-browser/cef/src/Debug/lkcef_app.app/Contents/" - "Frameworks/Chromium Embedded Framework.framework"; - std::string main_bundle_path = - "/Users/theomonnom/livekit/agents/livekit-plugins/" - "livekit-plugins-browser/cef/src/Debug/lkcef_app.app"; - std::string subprocess_path = - "/Users/theomonnom/livekit/agents/livekit-plugins/" - "livekit-plugins-browser/cef/src/Debug/lkcef_app.app/Contents/" - "Frameworks/lkcef Helper.app/Contents/MacOS/lkcef Helper"; - - std::string framework_lib = framework_path + "/Chromium Embedded Framework"; - if (!cef_load_library(framework_lib.c_str())) { - std::cerr << "lkcef: Failed to load CEF library" << std::endl; - return 1; - } - - CefSettings settings{}; - // settings.remote_debugging_port = 8088; - CefString(&settings.framework_dir_path).FromString(framework_path); - CefString(&settings.main_bundle_path).FromString(main_bundle_path); - CefString(&settings.browser_subprocess_path).FromString(subprocess_path); - - settings.no_sandbox = true; // No sandbox for MacOS, for livekit-agents, - // we're only going to support Linux - settings.windowless_rendering_enabled = true; - - // Initialize the CEF browser process. May return false if initialization - // fails or if early exit is desired (for example, due to process singleton - // relaunch behavior). - if (!CefInitialize(main_args, settings, app.get(), nullptr)) { - std::cerr << "lkcef: Failed to initialize CEF" << std::endl; - // TODO(theomonnom): Use CefGetExitCode(); - return 1; - } - - // Create the application delegate. - AgentsAppDelegate* delegate = [[AgentsAppDelegate alloc] init]; - // Set as the delegate for application events. 
- NSApp.delegate = delegate; - - [delegate performSelectorOnMainThread:@selector(createApplication:) - withObject:nil - waitUntilDone:NO]; - - app->Run(); - - CefShutdown(); - cef_unload_library(); - -#if !__has_feature(objc_arc) - [delegate release]; -#endif // !__has_feature(objc_arc) - delegate = nil; - } // @autoreleasepool - - return 0; -} diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.cpp b/livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.cpp deleted file mode 100644 index a1e10d316..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.cpp +++ /dev/null @@ -1,195 +0,0 @@ -#include "dev_renderer.hpp" - -#include - -#include "imgui.h" -#include "imgui_impl_glfw.h" -#include "imgui_impl_opengl3.h" - -#include "include/wrapper/cef_helpers.h" - -#include "include/cef_app.h" - -// DCHECK on gl errors. -#if DCHECK_IS_ON() -#define VERIFY_NO_ERROR \ - { \ - int _gl_error = glGetError(); \ - DCHECK(_gl_error == GL_NO_ERROR) << "glGetError returned " << _gl_error; \ - } -#else -#define VERIFY_NO_ERROR -#endif - -static void glfw_error_callback(int error, const char* description) { - fprintf(stderr, "GLFW Error %d: %s\n", error, description); -} - -DevRenderer::DevRenderer() { -} - -void DevRenderer::OnAfterCreated(CefRefPtr browser) { - CEF_REQUIRE_UI_THREAD(); - int identifier = browser->GetIdentifier(); - - unsigned int texture_id; - glGenTextures(1, &texture_id); - VERIFY_NO_ERROR; - - RenderData render_data{}; - render_data.texture_id = texture_id; - render_data_.insert({identifier, render_data}); - - glBindTexture(GL_TEXTURE_2D, texture_id); - VERIFY_NO_ERROR; - glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST); - VERIFY_NO_ERROR; - glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST); -} - -void DevRenderer::OnPaint(CefRefPtr browser, - CefRenderHandler::PaintElementType type, - const CefRenderHandler::RectList& dirtyRects, - const void* buffer, - int width, - int height) 
{ - CEF_REQUIRE_UI_THREAD(); - - if (type != CefRenderHandler::PaintElementType::PET_VIEW){ - std::cout << "Ignoring PET_POPUP" << std::endl; - return; // Ignore PET_POPUP for now, bc I'm lazy - } - - int identifier = browser->GetIdentifier(); - RenderData* render_data = &render_data_[identifier]; - - int old_width = render_data->view_width; - int old_height = render_data->view_height; - - render_data->view_width = width; - render_data->view_height = height; - - glBindTexture(GL_TEXTURE_2D, render_data->texture_id); - - glPixelStorei(GL_UNPACK_ROW_LENGTH, width); - VERIFY_NO_ERROR; - - bool has_fullscreen_rect = dirtyRects.size() == 1 && - dirtyRects[0] == CefRect(0, 0, width, height); - - if (old_width != width || old_height != height || has_fullscreen_rect) { - glPixelStorei(GL_UNPACK_SKIP_PIXELS, 0); - VERIFY_NO_ERROR; - glPixelStorei(GL_UNPACK_SKIP_ROWS, 0); - VERIFY_NO_ERROR; - glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, - GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, buffer); - VERIFY_NO_ERROR; - } else { - CefRenderHandler::RectList::const_iterator i = dirtyRects.begin(); - for (; i != dirtyRects.end(); ++i) { - const CefRect& rect = *i; - glPixelStorei(GL_UNPACK_SKIP_PIXELS, rect.x); - VERIFY_NO_ERROR; - glPixelStorei(GL_UNPACK_SKIP_ROWS, rect.y); - VERIFY_NO_ERROR; - glTexSubImage2D(GL_TEXTURE_2D, 0, rect.x, rect.y, rect.width, - rect.height, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, - buffer); - VERIFY_NO_ERROR; - } - } -} - -void DevRenderer::OnBeforeClose(CefRefPtr browser) { - CEF_REQUIRE_UI_THREAD(); - int identifier = browser->GetIdentifier(); - RenderData* render_data = &render_data_[identifier]; - glDeleteTextures(1, &render_data->texture_id); - render_data_.erase(identifier); -} - -void DevRenderer::Run() { - glfwSetErrorCallback(glfw_error_callback); - - if (!glfwInit()) { - std::cerr << "Failed to initialize GLFW" << std::endl; - return; - } - - const char* glsl_version = "#version 150"; - glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3); - 
glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 2); - glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE); - glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GL_TRUE); - - window_ = - glfwCreateWindow(800, 600, "livekit-plugins-browser (Development Window)", - nullptr, nullptr); - - if (!window_) { - std::cerr << "Failed to create GLFW window" << std::endl; - glfwTerminate(); - return; - } - glfwMakeContextCurrent(window_); - glfwSwapInterval(1); // Enable vsync - - IMGUI_CHECKVERSION(); - - ImGui::CreateContext(); - ImGuiIO& io = ImGui::GetIO(); - io.ConfigFlags |= ImGuiConfigFlags_NavEnableKeyboard; - io.ConfigFlags |= ImGuiConfigFlags_DockingEnable; - - // Setup Platform/Renderer backends - ImGui_ImplGlfw_InitForOpenGL(window_, true); - ImGui_ImplOpenGL3_Init(glsl_version); - - - ImVec4 clear_color = ImVec4(0.45f, 0.55f, 0.60f, 1.00f); - while (!glfwWindowShouldClose(window_)) { - glfwPollEvents(); - - CefDoMessageLoopWork(); - - ImGui_ImplOpenGL3_NewFrame(); - ImGui_ImplGlfw_NewFrame(); - ImGui::NewFrame(); - ImGui::ShowDemoWindow(); - - - for (auto& [identifier, render_data] : render_data_) { - ImGui::Begin("Browser"); - ImGui::Text("Browser %d", identifier); - ImGui::Image((void*)(intptr_t)render_data.texture_id, - ImVec2(render_data.view_width, render_data.view_height)); - ImGui::End(); - } - - - - // Rendering - ImGui::Render(); - int display_w, display_h; - glfwGetFramebufferSize(window_, &display_w, &display_h); - glViewport(0, 0, display_w, display_h); - glClearColor(clear_color.x * clear_color.w, clear_color.y * clear_color.w, - clear_color.z * clear_color.w, clear_color.w); - glClear(GL_COLOR_BUFFER_BIT); - ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData()); - - glfwSwapBuffers(window_); - } - - ImGui_ImplOpenGL3_Shutdown(); - ImGui_ImplGlfw_Shutdown(); - ImGui::DestroyContext(); - - glfwDestroyWindow(window_); - glfwTerminate(); -} - -void DevRenderer::Close() { - //glfwSetWindowShouldClose(window_, GLFW_TRUE); -} diff --git 
a/livekit-plugins/livekit-plugins-browser/cef/src/handler.cpp b/livekit-plugins/livekit-plugins-browser/cef/src/handler.cpp deleted file mode 100644 index 8ca9c88c5..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/handler.cpp +++ /dev/null @@ -1,156 +0,0 @@ -#include "handler.hpp" - -#include - -#include "include/base/cef_callback.h" -#include "include/cef_app.h" -#include "include/cef_parser.h" -#include "include/views/cef_browser_view.h" -#include "include/views/cef_window.h" -#include "include/wrapper/cef_closure_task.h" -#include "include/wrapper/cef_helpers.h" - -namespace { - -// Returns a data: URI with the specified contents. -std::string GetDataURI(const std::string& data, const std::string& mime_type) { - return "data:" + mime_type + ";base64," + - CefURIEncode(CefBase64Encode(data.data(), data.size()), false) - .ToString(); -} - -} // namespace - -AgentHandler::AgentHandler(CefRefPtr dev_renderer) - : dev_renderer_(dev_renderer) {} - -void AgentHandler::OnTitleChange(CefRefPtr browser, - const CefString& title) { - CEF_REQUIRE_UI_THREAD(); -} - -void AgentHandler::OnPaint(CefRefPtr browser, - PaintElementType type, - const RectList& dirtyRects, - const void* buffer, - int width, - int height) { - - std::cout << "OnPaint" << std::endl; - - if (dev_renderer_) - dev_renderer_->OnPaint(browser, type, dirtyRects, buffer, width, height); -} - -void AgentHandler::GetViewRect(CefRefPtr browser, CefRect& rect) { - CEF_REQUIRE_UI_THREAD(); - rect.Set(0, 0, 800, 600); -}; - -void AgentHandler::OnAudioStreamPacket(CefRefPtr browser, - const float** data, - int frames, - int64_t pts) { - std::cout << "OnAudioStreamPacket" << std::endl; -} - -void AgentHandler::OnAudioStreamStarted(CefRefPtr browser, - const CefAudioParameters& params, - int channels) {} - -void AgentHandler::OnAudioStreamStopped(CefRefPtr browser) {} - -void AgentHandler::OnAudioStreamError(CefRefPtr browser, - const CefString& message) {} - -void 
AgentHandler::OnAfterCreated(CefRefPtr browser) { - CEF_REQUIRE_UI_THREAD(); - - int identifier = browser->GetIdentifier(); - CefRefPtr handle = pending_handles_.front(); - pending_handles_.pop_front(); - - handle->browser_ = browser; - if (handle->created_callback_) - handle->created_callback_(); - - browser_handles_[identifier] = handle; - - if (dev_renderer_) - dev_renderer_->OnAfterCreated(browser); -} - -bool AgentHandler::DoClose(CefRefPtr browser) { - CEF_REQUIRE_UI_THREAD(); - - return false; -} - -void AgentHandler::OnBeforeClose(CefRefPtr browser) { - CEF_REQUIRE_UI_THREAD(); - - - if (dev_renderer_) - dev_renderer_->OnBeforeClose(browser); -} - -void AgentHandler::OnLoadError(CefRefPtr browser, - CefRefPtr frame, - ErrorCode errorCode, - const CefString& errorText, - const CefString& failedUrl) { - CEF_REQUIRE_UI_THREAD(); - - // Allow Chrome to show the error page. - if (IsChromeRuntimeEnabled()) { - return; - } - - // Don't display an error for downloaded files. - if (errorCode == ERR_ABORTED) { - return; - } - - // Display a load error message using a data: URI. - std::stringstream ss; - ss << "" - "
<h2>
Failed to load URL " - << std::string(failedUrl) << " with error " << std::string(errorText) - << " (" << errorCode << ").
</h2></body></html>
"; - - frame->LoadURL(GetDataURI(ss.str(), "text/html")); -} - -/* -void AgentHandler::CloseAllBrowsers(bool force_close) { - if (!CefCurrentlyOn(TID_UI)) { - // Execute on the UI thread. - CefPostTask(TID_UI, base::BindOnce(&AgentHandler::CloseAllBrowsers, this, - force_close)); - return; - } - - if (browser_list_.empty()) { - return; - } - - BrowserList::const_iterator it = browser_list_.begin(); - for (; it != browser_list_.end(); ++it) { - (*it)->GetHost()->CloseBrowser(force_close); - } -} - */ - -bool AgentHandler::IsChromeRuntimeEnabled() { - static bool enabled = []() { - return CefCommandLine::GetGlobalCommandLine()->HasSwitch( - "enable-chrome-runtime"); - }(); - return enabled; -} - -#if !defined(OS_MAC) -void AgentHandler::PlatformShowWindow(CefRefPtr browser) { - NOTIMPLEMENTED(); -} -#endif diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/handler.hpp b/livekit-plugins/livekit-plugins-browser/cef/src/handler.hpp deleted file mode 100644 index 2406c6d91..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/handler.hpp +++ /dev/null @@ -1,94 +0,0 @@ -#ifndef LKCEF_HANDLER_HPP -#define LKCEF_HANDLER_HPP - -#include "include/cef_client.h" - -#include "dev_renderer.hpp" -#include - -class BrowserHandle: public CefBaseRefCounted{ - public: - BrowserHandle(std::function created_callback) : created_callback_(created_callback) {} - - - CefRefPtr browser_ = nullptr; - std::function created_callback_ = nullptr; - - - IMPLEMENT_REFCOUNTING(BrowserHandle); -}; - - -class AgentHandler : public CefClient, - public CefDisplayHandler, - public CefRenderHandler, - public CefAudioHandler, - public CefLifeSpanHandler, - public CefLoadHandler { - -public: - AgentHandler(CefRefPtr dev_renderer); - - CefRefPtr GetDisplayHandler() override { return this; } - CefRefPtr GetRenderHandler() override { return this; } - CefRefPtr GetAudioHandler() override { return this; } - CefRefPtr GetLifeSpanHandler() override { return this; } - CefRefPtr GetLoadHandler() 
override { return this; } - - // CefDisplayHandler methods - void OnTitleChange(CefRefPtr browser, - const CefString &title) override; - - // CefRenderHandler methods - void OnPaint(CefRefPtr browser, PaintElementType type, - const RectList &dirtyRects, const void *buffer, int width, - int height) override; - - void GetViewRect(CefRefPtr browser, CefRect &rect) override; - - // CefAudioHandler methods - void OnAudioStreamPacket(CefRefPtr browser, const float **data, - int frames, int64_t pts) override; - - void OnAudioStreamStarted(CefRefPtr browser, - const CefAudioParameters ¶ms, - int channels) override; - - void OnAudioStreamStopped(CefRefPtr browser) override; - - void OnAudioStreamError(CefRefPtr browser, - const CefString &message) override; - - // CefLifeSpanHandler methods - void OnAfterCreated(CefRefPtr browser) override; - bool DoClose(CefRefPtr browser) override; - void OnBeforeClose(CefRefPtr browser) override; - - // CefLoadHandler methods - void OnLoadError(CefRefPtr browser, CefRefPtr frame, - ErrorCode errorCode, const CefString &errorText, - const CefString &failedUrl) override; - - //void CloseAllBrowsers(bool force_close); - - static bool IsChromeRuntimeEnabled(); - - - void AddPendingHandle(CefRefPtr handle) { - pending_handles_.push_back(handle); - } - - void RemovePendingHandle(CefRefPtr handle) { - pending_handles_.remove(handle); - } - -private: - std::unordered_map> browser_handles_; - std::list> pending_handles_; - - CefRefPtr dev_renderer_; - - IMPLEMENT_REFCOUNTING(AgentHandler); -}; - -#endif // LKCEF_HANDLER_HPP diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcef-Info.plist b/livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcef-Info.plist deleted file mode 100644 index ce63cb8f6..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcef-Info.plist +++ /dev/null @@ -1,36 +0,0 @@ - - - - - CFBundleDevelopmentRegion - en - CFBundleDisplayName - ${EXECUTABLE_NAME} - 
CFBundleExecutable - ${EXECUTABLE_NAME} - CFBundleIdentifier - io.livekit.cef.helper${BUNDLE_ID_SUFFIX} - CFBundleInfoDictionaryVersion - 6.0 - CFBundleName - ${PRODUCT_NAME} - CFBundlePackageType - APPL - CFBundleSignature - ???? - LSEnvironment - - MallocNanoZone - 0 - - LSFileQuarantineEnabled - - LSMinimumSystemVersion - 10.11.0 - LSUIElement - 1 - NSSupportsAutomaticGraphicsSwitching - - - - diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/run_browser.py b/livekit-plugins/livekit-plugins-browser/cef/src/run_browser.py deleted file mode 100644 index c12ab3744..000000000 --- a/livekit-plugins/livekit-plugins-browser/cef/src/run_browser.py +++ /dev/null @@ -1,27 +0,0 @@ -# flake8: noqa - -import sys - -print("cwd: ", sys.path[0]) - -sys.path.insert(0, "./Debug") -import lkcef_python as lkcef - -print("lkcef __dict__: ", lkcef.__dict__) -print("BrowserImpl __dict__: ", lkcef.BrowserImpl.__dict__) - - -def _context_initialized(): - opts = lkcef.BrowserOptions() - opts.framerate = 30 - - app.create_browser("http://www.livekit.io", opts) - print("LOL: Context initialized") - - -opts = lkcef.AppOptions() -opts.dev_mode = True -opts.initialized_callback = _context_initialized - -app = lkcef.BrowserApp(opts) -app.run() diff --git a/livekit-plugins/livekit-plugins-browser/cef/cmake/DownloadCEF.cmake b/livekit-plugins/livekit-plugins-browser/cmake/DownloadCEF.cmake similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/cmake/DownloadCEF.cmake rename to livekit-plugins/livekit-plugins-browser/cmake/DownloadCEF.cmake diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/__init__.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/__init__.py new file mode 100644 index 000000000..66009b84e --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/__init__.py @@ -0,0 +1,29 @@ +# Copyright 2023 LiveKit, Inc. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from livekit.agents import Plugin + +from .log import logger +from .proc import BrowserContext, BrowserPage +from .version import __version__ + +__all__ = ["BrowserContext", "BrowserPage"] + + +class BrowserPlugin(Plugin): + def __init__(self): + super().__init__(__name__, __version__, __package__, logger) + + +Plugin.register_plugin(BrowserPlugin()) diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/log.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/log.py new file mode 100644 index 000000000..8179ee6a5 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/log.py @@ -0,0 +1,3 @@ +import logging + +logger = logging.getLogger("livekit.plugins.browser") diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc.py new file mode 100644 index 000000000..6910a0ba9 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc.py @@ -0,0 +1,239 @@ +from __future__ import annotations + +import asyncio +import contextlib +import multiprocessing as mp +import multiprocessing.context as mpc +import multiprocessing.shared_memory as mp_shm +import socket +import tempfile +from contextlib import asynccontextmanager +from dataclasses import dataclass +from typing import Callable, Literal + +from livekit import rtc +from livekit.agents 
import ipc, utils + +from . import logger, proc_main, proto + + +@dataclass +class _PageOptions: + page_id: int + url: str + width: int + height: int + framerate: int + + +EventTypes = Literal["paint"] + + +@dataclass +class PaintData: + dirty_rects: list[tuple[int, int, int, int]] + frame: rtc.VideoFrame + width: int + height: int + + +@dataclass +class BrowserOptions: + url: str + framerate: int + width: int + height: int + paint_callback: Callable[[PaintData], None] + + +class BrowserPage(utils.EventEmitter[EventTypes]): + def __init__( + self, + mp_ctx: mpc.SpawnContext, + opts: _PageOptions, + ctx_duplex: utils.aio.duplex_unix._AsyncDuplex, + ) -> None: + super().__init__() + self._mp_ctx = mp_ctx + self._opts = opts + self._ctx_duplex = ctx_duplex + + self._view_width = 0 + self._view_height = 0 + + self._created_fut = asyncio.Future() + self._close_fut = asyncio.Future() + + @property + def id(self) -> int: + return self._opts.page_id + + async def start(self) -> None: + shm_name = f"lkcef_browser_{utils.shortuuid()}" + self._shm = mp_shm.SharedMemory( + create=True, + size=proto.SHM_MAX_WIDTH * proto.SHM_MAX_HEIGHT * 4, + name=shm_name, + ) + + self._framebuffer = rtc.VideoFrame( + proto.SHM_MAX_WIDTH, + proto.SHM_MAX_HEIGHT, + rtc.VideoBufferType.BGRA, + bytearray(proto.SHM_MAX_WIDTH * proto.SHM_MAX_HEIGHT * 4), + ) + + req = proto.CreateBrowserRequest( + page_id=self._opts.page_id, + width=self._opts.width, + height=self._opts.height, + shm_name=shm_name, + url=self._opts.url, + framerate=self._opts.framerate, + ) + + await ipc.channel.asend_message(self._ctx_duplex, req) + + # TODO(theomonnom): create timeout (would prevent never resolving futures if the + # browser process crashed for some reasons) + await asyncio.shield(self._created_fut) + + async def aclose(self) -> None: + await ipc.channel.asend_message( + self._ctx_duplex, proto.CloseBrowserRequest(page_id=self.id) + ) + await asyncio.shield(self._close_fut) + + self._shm.unlink() + 
self._shm.close() + + async def _handle_created(self, msg: proto.CreateBrowserResponse) -> None: + self._created_fut.set_result(None) + + async def _handle_paint(self, acq: proto.AcquirePaintData) -> None: + old_width = self._view_width + old_height = self._view_height + self._view_width = acq.width + self._view_height = acq.height + + # TODO(theomonnom): remove hacky alloc-free resizing + self._framebuffer._width = acq.width + self._framebuffer._height = acq.height + + proto.copy_paint_data( + acq, old_width, old_height, self._shm.buf, self._framebuffer.data + ) + + paint_data = PaintData( + dirty_rects=acq.dirty_rects, + frame=self._framebuffer, + width=acq.width, + height=acq.height, + ) + self.emit("paint", paint_data) + + release_paint = proto.ReleasePaintData(page_id=acq.page_id) + await ipc.channel.asend_message(self._ctx_duplex, release_paint) + + async def _handle_close(self, msg: proto.BrowserClosed) -> None: + logger.debug("browser page closed", extra={"page_id": self.id}) + self._close_fut.set_result(None) + + +class BrowserContext: + def __init__(self, *, dev_mode: bool, remote_debugging_port: int = 0) -> None: + self._mp_ctx = mp.get_context("spawn") + self._pages: dict[int, BrowserPage] = {} + self._dev_mode = dev_mode + self._initialized = False + self._next_page_id = 1 + self._remote_debugging_port = remote_debugging_port + + async def initialize(self) -> None: + mp_pch, mp_cch = socket.socketpair() + self._duplex = await utils.aio.duplex_unix._AsyncDuplex.open(mp_pch) + + self._proc = self._mp_ctx.Process(target=proc_main.main, args=(mp_cch,)) + self._proc.start() + mp_cch.close() + + if not self._remote_debugging_port: + with contextlib.closing( + socket.socket(socket.AF_INET, socket.SOCK_STREAM) + ) as s: + s.bind(("", 0)) + s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) + self._remote_debugging_port = s.getsockname()[1] + + logger.debug("using remote debugging port %d", self._remote_debugging_port) + + await ipc.channel.asend_message( 
+ self._duplex, + proto.InitializeContextRequest( + dev_mode=self._dev_mode, + remote_debugging_port=self._remote_debugging_port, + root_cache_path=tempfile.mkdtemp(), # TODO(theomonnom): cleanup + ), + ) + resp = await ipc.channel.arecv_message(self._duplex, proto.IPC_MESSAGES) + assert isinstance(resp, proto.ContextInitializedResponse) + self._initialized = True + logger.debug("browser context initialized", extra={"pid": self._proc.pid}) + + self._main_atask = asyncio.create_task(self._main_task(self._duplex)) + + @asynccontextmanager + async def playwright(self, timeout: float | None = None): + if not self._initialized: + raise RuntimeError("BrowserContext not initialized") + + from playwright.async_api import async_playwright + + async with async_playwright() as p: + url = f"http://localhost:{self._remote_debugging_port}" + browser = await p.chromium.connect_over_cdp(url, timeout=timeout) + try: + yield browser + finally: + await browser.close() + + @utils.log_exceptions(logger) + async def _main_task(self, duplex: utils.aio.duplex_unix._AsyncDuplex) -> None: + while True: + try: + msg = await ipc.channel.arecv_message(duplex, proto.IPC_MESSAGES) + except utils.aio.duplex_unix.DuplexClosed: + break + + if isinstance(msg, proto.CreateBrowserResponse): + page = self._pages[msg.page_id] + await page._handle_created(msg) + elif isinstance(msg, proto.AcquirePaintData): + page = self._pages[msg.page_id] + await page._handle_paint(msg) + elif isinstance(msg, proto.BrowserClosed): + page = self._pages[msg.page_id] + await page._handle_close(msg) + + async def new_page( + self, *, url: str, width: int = 800, height: int = 600, framerate: int = 30 + ) -> BrowserPage: + if not self._initialized: + raise RuntimeError("BrowserContext not initialized") + + page_id = self._next_page_id + self._next_page_id += 1 + page = BrowserPage( + self._mp_ctx, + _PageOptions( + page_id=page_id, + url=url, + width=width, + height=height, + framerate=framerate, + ), + self._duplex, + ) + 
self._pages[page_id] = page + await page.start() + return page diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc_main.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc_main.py new file mode 100644 index 000000000..c9ca11706 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proc_main.py @@ -0,0 +1,193 @@ +import importlib.resources +import multiprocessing.shared_memory as mp_shm +import socket +import threading + +from livekit.agents import ipc, utils + +from . import logger, proto + + +class BrowserServer: + def __init__( + self, + duplex: utils.aio.duplex_unix._Duplex, + shm: mp_shm.SharedMemory, + page_id: int, + ): + self._duplex = duplex + self._shm = shm + self._page_id = page_id + + self._view_width = 0 + self._view_height = 0 + + self._closing = False + self._release_paint_e = threading.Event() + + @staticmethod + def create( + *, + duplex: utils.aio.duplex_unix._Duplex, + create_req: proto.CreateBrowserRequest, + browser_app, + ) -> "BrowserServer": + logger.debug( + "creating browser", + extra={ + "page_id": create_req.page_id, + "url": create_req.url, + "framerate": create_req.framerate, + "width": create_req.width, + "height": create_req.height, + "shm_name": create_req.shm_name, + }, + ) + + import lkcef_python as lkcef + + opts = lkcef.BrowserOptions() + opts.framerate = create_req.framerate + opts.width = create_req.width + opts.height = create_req.height + + shm = mp_shm.SharedMemory(name=create_req.shm_name) + bserver = BrowserServer(duplex, shm, create_req.page_id) + + opts.created_callback = bserver._browser_created + opts.paint_callback = bserver._paint + opts.close_callback = bserver._closed + browser_app.create_browser(create_req.url, opts) + return bserver + + def _browser_created(self, impl): + browser_id = impl.identifier() + logger.debug( + "browser created", + extra={"browser_id": browser_id, "page_id": self._page_id}, + ) + + self._impl = impl + + 
try: + ipc.channel.send_message( + self._duplex, + proto.CreateBrowserResponse( + page_id=self._page_id, browser_id=browser_id + ), + ) + except utils.aio.duplex_unix.DuplexClosed: + logger.exception("failed to send CreateBrowserResponse") + + def _paint(self, frame_data): + if self._closing: + return # make sure to not use the shm + + acq = proto.AcquirePaintData() + acq.page_id = self._page_id + acq.width = frame_data.width + acq.height = frame_data.height + + dirty_rects = [] + for rect in frame_data.dirty_rects: + dirty_rects.append((rect.x, rect.y, rect.width, rect.height)) + + acq.dirty_rects = dirty_rects + + old_width = self._view_width + old_height = self._view_height + self._view_width = frame_data.width + self._view_height = frame_data.height + + proto.copy_paint_data( + acq, old_width, old_height, frame_data.buffer, self._shm.buf + ) + + try: + ipc.channel.send_message(self._duplex, acq) + self._release_paint_e.wait() # wait for release + self._release_paint_e.clear() + except utils.aio.duplex_unix.DuplexClosed: + logger.exception("failed to send AcquirePaintData") + + def _closed(self) -> None: + ipc.channel.send_message( + self._duplex, proto.BrowserClosed(page_id=self._page_id) + ) + + def handle_release_paint(self, msg: proto.ReleasePaintData): + self._release_paint_e.set() + + def handle_close(self, msg: proto.CloseBrowserRequest): + self._closing = True + self._impl.close() + + +def _manager_thread(duplex: utils.aio.duplex_unix._Duplex, browser_app): + browsers: dict[int, BrowserServer] = {} + + while True: + try: + msg = ipc.channel.recv_message(duplex, proto.IPC_MESSAGES) + except utils.aio.duplex_unix.DuplexClosed: + break + + if isinstance(msg, proto.CreateBrowserRequest): + server = BrowserServer.create( + duplex=duplex, create_req=msg, browser_app=browser_app + ) + browsers[msg.page_id] = server + elif isinstance(msg, proto.ReleasePaintData): + server = browsers[msg.page_id] + server.handle_release_paint(msg) + elif isinstance(msg, 
proto.CloseBrowserRequest): + server = browsers[msg.page_id] + server.handle_close(msg) + del browsers[msg.page_id] + + +def main(mp_cch: socket.socket): + import lkcef_python as lkcef + + duplex = utils.aio.duplex_unix._Duplex.open(mp_cch) + + init_req = ipc.channel.recv_message(duplex, proto.IPC_MESSAGES) + assert isinstance(init_req, proto.InitializeContextRequest) + + logger.debug("initializing browser context", extra={"dev_mode": init_req.dev_mode}) + + def _context_initialized(): + try: + ipc.channel.send_message(duplex, proto.ContextInitializedResponse()) + except utils.aio.duplex_unix.DuplexClosed: + logger.exception("failed to send ContextInitializedResponse") + + opts = lkcef.AppOptions() + opts.dev_mode = init_req.dev_mode + opts.remote_debugging_port = init_req.remote_debugging_port + opts.root_cache_path = init_req.root_cache_path + opts.initialized_callback = _context_initialized + + res = ( + importlib.resources.files("livekit.plugins.browser.resources") / "lkcef_app.app" + ) + with importlib.resources.as_file(res) as path: + opts.framework_path = str( + path / "Contents" / "Frameworks" / "Chromium Embedded Framework.framework" + ) + opts.main_bundle_path = str(path) + opts.subprocess_path = str( + path + / "Contents" + / "Frameworks" + / "lkcef Helper.app" + / "Contents" + / "MacOS" + / "lkcef Helper" + ) + + app = lkcef.BrowserApp(opts) + man_t = threading.Thread(target=_manager_thread, args=(duplex, app)) + man_t.start() + + app.run() # run indefinitely diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proto.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proto.py new file mode 100644 index 000000000..17d0cac0f --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/proto.py @@ -0,0 +1,196 @@ +import io +from dataclasses import dataclass, field +from typing import ClassVar + +import numpy as np +from livekit.agents.ipc import channel + +# there is no risk to increase these 
values. just using these defaults for now +SHM_MAX_WIDTH = 1920 +SHM_MAX_HEIGHT = 1080 + + +@dataclass +class InitializeContextRequest: + MSG_ID: ClassVar[int] = 0 + dev_mode: bool = False + remote_debugging_port: int = 0 + root_cache_path: str = "" + + def write(self, b: io.BytesIO) -> None: + channel.write_bool(b, self.dev_mode) + channel.write_int(b, self.remote_debugging_port) + channel.write_string(b, self.root_cache_path) + + def read(self, b: io.BytesIO) -> None: + self.dev_mode = channel.read_bool(b) + self.remote_debugging_port = channel.read_int(b) + self.root_cache_path = channel.read_string(b) + + +@dataclass +class ContextInitializedResponse: + MSG_ID: ClassVar[int] = 1 + + +@dataclass +class CreateBrowserRequest: + MSG_ID: ClassVar[int] = 2 + page_id: int = -1 + url: str = "" + framerate: int = 0 + width: int = 0 + height: int = 0 + shm_name: str = "" + + def write(self, b: io.BytesIO) -> None: + channel.write_int(b, self.page_id) + channel.write_string(b, self.url) + channel.write_int(b, self.framerate) + channel.write_int(b, self.width) + channel.write_int(b, self.height) + channel.write_string(b, self.shm_name) + + def read(self, b: io.BytesIO) -> None: + self.page_id = channel.read_int(b) + self.url = channel.read_string(b) + self.framerate = channel.read_int(b) + self.width = channel.read_int(b) + self.height = channel.read_int(b) + self.shm_name = channel.read_string(b) + + +@dataclass +class CreateBrowserResponse: + """ + This is going to wait for the created_callback to be called. 
+ (The create_browser function will be async) + """ + + MSG_ID: ClassVar[int] = 3 + page_id: int = -1 + browser_id: int = 0 + + def write(self, b: io.BytesIO) -> None: + channel.write_int(b, self.page_id) + channel.write_int(b, self.browser_id) + + def read(self, b: io.BytesIO) -> None: + self.page_id = channel.read_int(b) + self.browser_id = channel.read_int(b) + + +@dataclass +class AcquirePaintData: + MSG_ID: ClassVar[int] = 4 + page_id: int = -1 + width: int = 0 + height: int = 0 + dirty_rects: list[tuple[int, int, int, int]] = field(default_factory=list) + + def write(self, b: io.BytesIO) -> None: + channel.write_int(b, self.page_id) + channel.write_int(b, self.width) + channel.write_int(b, self.height) + channel.write_int(b, len(self.dirty_rects)) + for rect in self.dirty_rects: + channel.write_int(b, rect[0]) + channel.write_int(b, rect[1]) + channel.write_int(b, rect[2]) + channel.write_int(b, rect[3]) + + def read(self, b: io.BytesIO) -> None: + self.page_id = channel.read_int(b) + self.width = channel.read_int(b) + self.height = channel.read_int(b) + num_rects = channel.read_int(b) + self.dirty_rects = [] + for _ in range(num_rects): + x = channel.read_int(b) + y = channel.read_int(b) + width = channel.read_int(b) + height = channel.read_int(b) + self.dirty_rects.append((x, y, width, height)) + + +@dataclass +class ReleasePaintData: + MSG_ID: ClassVar[int] = 5 + page_id: int = -1 + + def write(self, b: io.BytesIO) -> None: + channel.write_int(b, self.page_id) + + def read(self, b: io.BytesIO) -> None: + self.page_id = channel.read_int(b) + + +@dataclass +class CloseBrowserRequest: + MSG_ID: ClassVar[int] = 6 + page_id: int = -1 + + def write(self, b: io.BytesIO) -> None: + channel.write_int(b, self.page_id) + + def read(self, b: io.BytesIO) -> None: + self.page_id = channel.read_int(b) + + +@dataclass +class BrowserClosed: + MSG_ID: ClassVar[int] = 7 + page_id: int = -1 + + def write(self, b: io.BytesIO) -> None: + channel.write_int(b, self.page_id) + + 
def read(self, b: io.BytesIO) -> None: + self.page_id = channel.read_int(b) + + +IPC_MESSAGES = { + InitializeContextRequest.MSG_ID: InitializeContextRequest, + ContextInitializedResponse.MSG_ID: ContextInitializedResponse, + CreateBrowserRequest.MSG_ID: CreateBrowserRequest, + CreateBrowserResponse.MSG_ID: CreateBrowserResponse, + AcquirePaintData.MSG_ID: AcquirePaintData, + ReleasePaintData.MSG_ID: ReleasePaintData, + CloseBrowserRequest.MSG_ID: CloseBrowserRequest, + BrowserClosed.MSG_ID: BrowserClosed, +} + + +def copy_paint_data( + acq: AcquirePaintData, + old_width: int, + old_height: int, + source: memoryview, + dest: memoryview, +): + dirty_rects = acq.dirty_rects + + # source_arr = np.frombuffer(source, dtype=np.uint32).reshape((acq.height, acq.width)) + source_arr = np.ndarray( + (acq.height, acq.width), + dtype=np.uint32, + buffer=source, + ) + dest_arr = np.ndarray( + (acq.height, acq.width), + dtype=np.uint32, + buffer=dest, + ) + + has_fullscreen_rect = len(dirty_rects) == 1 and dirty_rects[0] == ( + 0, + 0, + acq.width, + acq.height, + ) + if old_width != acq.width or old_height != acq.height or has_fullscreen_rect: + np.copyto(dest_arr, source_arr) + else: + for rect in dirty_rects: + x, y, w, h = rect + dest_arr[y : y + h, x : x + w] = source_arr[y : y + h, x : x + w] diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/utils.cpp b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/py.typed similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/utils.cpp rename to livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/py.typed diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/resources/__init__.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/resources/__init__.py new file mode 100644 index 000000000..2133c6432 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/resources/__init__.py @@ -0,0 +1 @@ +"""Used by 
importlib.resources and setuptools""" diff --git a/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/version.py b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/version.py new file mode 100644 index 000000000..f3454fa71 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/livekit/plugins/browser/version.py @@ -0,0 +1,15 @@ +# Copyright 2023 LiveKit, Inc. + +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +__version__ = "0.0.2" diff --git a/livekit-plugins/livekit-plugins-browser/package.json b/livekit-plugins/livekit-plugins-browser/package.json new file mode 100644 index 000000000..795a90d4e --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/package.json @@ -0,0 +1,5 @@ +{ + "name": "livekit-plugins-browser", + "private": true, + "version": "0.0.2" +} diff --git a/livekit-plugins/livekit-plugins-browser/pyproject.toml b/livekit-plugins/livekit-plugins-browser/pyproject.toml new file mode 100644 index 000000000..4ece2e4c8 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/pyproject.toml @@ -0,0 +1,9 @@ +[build-system] +requires = ["setuptools>=61.0"] +build-backend = "setuptools.build_meta" + +[tool.cibuildwheel.macos] +repair-wheel-command = "" # getting issues with unresolved files + +[tool.cibuildwheel] +before-build = "pip install pybind11[global]" \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-browser/setup.py b/livekit-plugins/livekit-plugins-browser/setup.py new file mode 100644 
index 000000000..96c557142 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/setup.py @@ -0,0 +1,126 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import pathlib +import re +import subprocess +import sys +from pathlib import Path + +import setuptools +from setuptools import Extension +from setuptools.command.build_ext import build_ext + +here = pathlib.Path(__file__).parent.resolve() +about = {} +with open(os.path.join(here, "livekit", "plugins", "browser", "version.py"), "r") as f: + exec(f.read(), about) + + +class CMakeExtension(Extension): + def __init__(self, name: str, sourcedir: str = "") -> None: + super().__init__(name, sources=[]) + self.sourcedir = os.fspath(Path(sourcedir).resolve()) + + +class CMakeBuild(build_ext): + def build_extension(self, ext: CMakeExtension) -> None: + # Must be in this form due to bug in .resolve() only fixed in Python 3.10+ + ext_fullpath = Path.cwd() / self.get_ext_fullpath(ext.name) + extdir = ext_fullpath.parent.resolve() + + debug = int(os.environ.get("DEBUG", 0)) if self.debug is None else self.debug + cfg = "Debug" if debug else "Release" + + cmake_args = [ + f"-DCMAKE_LIBRARY_OUTPUT_DIRECTORY={extdir}", + f"-DPYTHON_EXECUTABLE={sys.executable}", + f"-DCMAKE_BUILD_TYPE={cfg}", + ] + + print(f"cmake_args: {cmake_args}") + + if sys.platform.startswith("darwin"): + # Cross-compile support for macOS - respect ARCHFLAGS if set + archs = re.findall(r"-arch (\S+)", 
os.environ.get("ARCHFLAGS", "")) + if archs: + cmake_args += ["-DCMAKE_OSX_ARCHITECTURES={}".format(";".join(archs))] + + self.build_temp = Path(self.build_temp) / ext.name + if not self.build_temp.exists(): + self.build_temp.mkdir(parents=True) + + subprocess.run( + ["cmake", ext.sourcedir, *cmake_args], cwd=self.build_temp, check=True + ) + subprocess.run(["cmake", "--build", "."], cwd=self.build_temp, check=True) + + build_output = self.build_temp / "src" / cfg + + for f in build_output.iterdir(): + if f.suffix == ".so": + self.copy_file(f, extdir / f.name) + + if sys.platform.startswith("darwin"): + # on macos, copy the dummy app + app = build_output / "lkcef_app.app" + self.copy_tree( + app, + str( + extdir + / "livekit" + / "plugins" + / "browser" + / "resources" + / "lkcef_app.app" + ), + ) + + +setuptools.setup( + name="livekit-plugins-browser", + version=about["__version__"], + description="Chromium Embedded Framework (CEF) for LiveKit Agents", + long_description=(here / "README.md").read_text(encoding="utf-8"), + long_description_content_type="text/markdown", + url="https://github.com/livekit/agents", + classifiers=[ + "Intended Audience :: Developers", + "License :: OSI Approved :: Apache Software License", + "Topic :: Multimedia :: Sound/Audio", + "Topic :: Multimedia :: Video", + "Topic :: Scientific/Engineering :: Artificial Intelligence", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3 :: Only", + ], + keywords=["webrtc", "realtime", "audio", "video", "livekit"], + license="Apache-2.0", + ext_modules=[CMakeExtension("lkcef_python")], + cmdclass={"build_ext": CMakeBuild}, + packages=setuptools.find_namespace_packages(include=["livekit.*"]), + python_requires=">=3.9.0", + install_requires=["livekit-agents>=0.8.0"], + package_data={ + "livekit.plugins.browser": ["py.typed"], + "livekit.plugins.browser.resources": ["**", 
"lkcef_app.app"], + }, + project_urls={ + "Documentation": "https://docs.livekit.io", + "Website": "https://livekit.io/", + "Source": "https://github.com/livekit/agents", + }, +) diff --git a/livekit-plugins/livekit-plugins-browser/src/.gitignore b/livekit-plugins/livekit-plugins-browser/src/.gitignore new file mode 100644 index 000000000..28f37169b --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/.gitignore @@ -0,0 +1,3 @@ +Debug/ +Release/ +lib* \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/CMakeLists.txt b/livekit-plugins/livekit-plugins-browser/src/CMakeLists.txt similarity index 90% rename from livekit-plugins/livekit-plugins-browser/cef/src/CMakeLists.txt rename to livekit-plugins/livekit-plugins-browser/src/CMakeLists.txt index 648f16864..298ee3c37 100644 --- a/livekit-plugins/livekit-plugins-browser/cef/src/CMakeLists.txt +++ b/livekit-plugins/livekit-plugins-browser/src/CMakeLists.txt @@ -15,14 +15,17 @@ FetchContent_Declare(imgui GIT_REPOSITORY https://github.com/ocornut/imgui GIT_T FetchContent_GetProperties(imgui) FetchContent_MakeAvailable(imgui) file(GLOB IMGUI_SOURCES ${imgui_SOURCE_DIR}/*.cpp) -file(GLOB IMGUI_HEADERS ${imgui_SOURCE_DIR}/*.h) -add_library(imgui STATIC ${IMGUI_SOURCES} ${IMGUI_SOURCES} ${imgui_SOURCE_DIR}/backends/imgui_impl_glfw.cpp ${imgui_SOURCE_DIR}/backends/imgui_impl_opengl3.cpp) +add_library(imgui STATIC ${IMGUI_SOURCES} + ${imgui_SOURCE_DIR}/backends/imgui_impl_glfw.cpp + ${imgui_SOURCE_DIR}/backends/imgui_impl_opengl3.cpp + ${imgui_SOURCE_DIR}/misc/cpp/imgui_stdlib.cpp +) set_target_properties(imgui PROPERTIES CXX_STANDARD 17) -target_include_directories(imgui PUBLIC ${imgui_SOURCE_DIR} ${imgui_SOURCE_DIR}/backends ${GLFW_INCLUDE_DIR}) +target_include_directories(imgui PUBLIC ${imgui_SOURCE_DIR} ${imgui_SOURCE_DIR}/misc/cpp ${imgui_SOURCE_DIR}/backends ${GLFW_INCLUDE_DIR}) target_link_libraries(imgui PRIVATE glfw) -set(LKCEF_SRCS app.cpp app.hpp handler.hpp 
handler.cpp dev_renderer.hpp dev_renderer.cpp) +set(LKCEF_SRCS app.cpp app.hpp handler.hpp handler.cpp dev_renderer.hpp dev_renderer.cpp gleq.h browser_handle.hpp browser_handle.cpp) set(LKCEF_SRCS_LINUX main_linux.cpp) set(LKCEF_SRCS_MAC app_mac.mm) set(LKCEF_SRCS_WINDOWS main_win.cpp ) @@ -86,8 +89,12 @@ if(OS_MAC) cmake_policy(SET CMP0068 NEW) endif() - # output path for the main app bundle. - set(LKCEF_APP "${CEF_TARGET_OUT_DIR}/lkcef_app.app") + add_executable(lkcef_app MACOSX_BUNDLE dummy.cpp) # dummy app + set_target_properties(lkcef_app PROPERTIES + MACOSX_BUNDLE_INFO_PLIST "${CMAKE_CURRENT_SOURCE_DIR}/resources/lkcefapp-Info.plist" + OUTPUT_NAME "lkcef_app" + ) + # library target. add_library(lkcef STATIC ${LKCEF_SRCS}) @@ -103,10 +110,7 @@ if(OS_MAC) COMMAND ${CMAKE_COMMAND} -E copy_directory "${CEF_BINARY_DIR}/Chromium Embedded Framework.framework" - "${LKCEF_APP}/Contents/Frameworks/Chromium Embedded Framework.framework" - # Copy the library into the main app bindle. COMMAND ${CMAKE_COMMAND} -E - # copy_if_different "${CEF_TARGET_OUT_DIR}/liblkcef.dylib" - # "${LKCEF_APP}/Contents/MacOS/liblkcef.dylib" + "$<TARGET_BUNDLE_DIR:lkcef_app>/Contents/Frameworks/Chromium Embedded Framework.framework" VERBATIM) # Create the multiple Helper app bundle targets.
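For reference, the dirty-rect copy strategy implemented by `copy_paint_data` earlier in this patch can be exercised standalone. A minimal sketch (hypothetical helper name; a tiny 8x8 uint32 surface stands in for the real 1920x1080 shared-memory framebuffer):

```python
# Sketch of the dirty-rect copy used by copy_paint_data: copy only damaged
# regions, falling back to a full-frame copy on resize or full damage.
# Assumptions (not from the patch): helper name and 8x8 surface size.
import numpy as np


def copy_dirty_rects(source, dest, dirty_rects, old_shape):
    """Copy (x, y, w, h) regions from source to dest; full copy on resize."""
    h, w = source.shape
    full_damage = len(dirty_rects) == 1 and dirty_rects[0] == (0, 0, w, h)
    if old_shape != source.shape or full_damage:
        np.copyto(dest, source)  # cheap full-frame path
    else:
        for x, y, rw, rh in dirty_rects:
            dest[y:y + rh, x:x + rw] = source[y:y + rh, x:x + rw]


src = np.arange(64, dtype=np.uint32).reshape(8, 8)
dst = np.zeros((8, 8), dtype=np.uint32)
copy_dirty_rects(src, dst, [(2, 3, 4, 2)], old_shape=(8, 8))
assert (dst[3:5, 2:6] == src[3:5, 2:6]).all()  # dirty region copied
assert dst[1, 1] == 0                          # untouched elsewhere
```

A resize (`old_shape` differing from the current surface) forces the full copy, mirroring the `old_width`/`old_height` check in the patch.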
@@ -140,6 +144,8 @@ if(OS_MAC) add_dependencies(${_helper_target} libcef_dll_wrapper) target_link_libraries(${_helper_target} libcef_dll_wrapper ${CEF_STANDARD_LIBS}) + + set_target_properties( ${_helper_target} PROPERTIES MACOSX_BUNDLE_INFO_PLIST ${_helper_info_plist} @@ -155,7 +161,7 @@ if(OS_MAC) COMMAND ${CMAKE_COMMAND} -E copy_directory "${CEF_TARGET_OUT_DIR}/${_helper_output_name}.app" - "${LKCEF_APP}/Contents/Frameworks/${_helper_output_name}.app" + "$<TARGET_BUNDLE_DIR:lkcef_app>/Contents/Frameworks/${_helper_output_name}.app" VERBATIM) endforeach() endif() diff --git a/livekit-plugins/livekit-plugins-browser/src/agents_python.cpp b/livekit-plugins/livekit-plugins-browser/src/agents_python.cpp new file mode 100644 index 000000000..bf344f867 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/agents_python.cpp @@ -0,0 +1,138 @@ +#include "agents_python.hpp" + +#include <pybind11/pybind11.h> +#include <pybind11/functional.h> +#include <pybind11/stl.h> + +#include "app.hpp" +#include "include/base/cef_callback.h" +#include "include/internal/cef_mac.h" +#include "include/wrapper/cef_closure_task.h" + +namespace py = pybind11; + +BrowserApp::BrowserApp(const AppOptions& options) : options_(options) { + app_ = new AgentApp(options_.dev_mode, options.remote_debugging_port, + options.root_cache_path, options.framework_path, + options.main_bundle_path, options.subprocess_path, + options_.initialized_callback); +} + +bool BrowserApp::CreateBrowser(const std::string& url, + const BrowserOptions& options) { + if (CefCurrentlyOn(TID_UI)) { + CreateBrowserOnUIThread(url, options); + return true; + } + + // TODO(theomonnom): Document base::Unretained + CefPostTask(TID_UI, base::BindOnce(&BrowserApp::CreateBrowserOnUIThread, + base::Unretained(this), url, options)); + + return true; +} + +void BrowserApp::CreateBrowserOnUIThread(const std::string& url, + const BrowserOptions& options) { + std::shared_ptr<BrowserImpl> browser_impl = std::make_shared<BrowserImpl>(); + browsers_.push_back(browser_impl); + + CefRefPtr<BrowserHandle> handle = app_->CreateBrowser( + url, options.framerate,
options.width, options.height, + [options, browser_impl]() { options.created_callback(browser_impl); }, + [options](std::vector<CefRect> dirtyRects, const void* buffer, int width, + int height) { + PaintData event{}; + std::vector<PaintRect> rects; + rects.reserve(dirtyRects.size()); + + for (const auto& rect : dirtyRects) { + rects.push_back({rect.x, rect.y, rect.width, rect.height}); + } + + event.dirtyRect = rects; + event.buffer = buffer; + event.width = width; + event.height = height; + options.paint_callback(event); + }, + options.close_callback); + + browser_impl->handle = handle; +} + +int BrowserApp::Run() { + return RunAgentApp(app_); +} + +BrowserImpl::BrowserImpl() {} + +void BrowserImpl::SetSize(int width, int height) { + if (handle) + handle->SetSize(width, height); +} + +void BrowserImpl::Close() { + if (handle) + handle->Close(); +} + +int BrowserImpl::Identifier() const { + return handle->GetBrowser()->GetIdentifier(); +} + +py::memoryview paint_data_to_memoryview(const PaintData& event) { + return py::memoryview::from_buffer( + const_cast<uint32_t*>(static_cast<const uint32_t*>(event.buffer)), + {event.height * event.width}, {sizeof(uint32_t)}, true); +} + +PYBIND11_MODULE(lkcef_python, m) { + // Pretty neat: LLMs driving browsers. + m.doc() = "Chromium Embedded Framework (CEF) for LiveKit Agents"; + + py::class_<AppOptions>(m, "AppOptions") + .def(py::init<>()) + .def_readwrite("dev_mode", &AppOptions::dev_mode) + .def_readwrite("remote_debugging_port", + &AppOptions::remote_debugging_port) + .def_readwrite("root_cache_path", &AppOptions::root_cache_path) + .def_readwrite("framework_path", &AppOptions::framework_path) + .def_readwrite("main_bundle_path", &AppOptions::main_bundle_path) + .def_readwrite("subprocess_path", &AppOptions::subprocess_path) + .def_readwrite("initialized_callback", &AppOptions::initialized_callback); + + py::class_<BrowserOptions>(m, "BrowserOptions") + .def(py::init<>()) + .def_readwrite("framerate", &BrowserOptions::framerate) + .def_readwrite("width", &BrowserOptions::width) + .def_readwrite("height", &BrowserOptions::height) + .def_readwrite("created_callback", &BrowserOptions::created_callback) + .def_readwrite("paint_callback", &BrowserOptions::paint_callback) + .def_readwrite("close_callback", &BrowserOptions::close_callback); + + py::class_<BrowserApp>(m, "BrowserApp") + .def(py::init<const AppOptions&>()) + .def("create_browser", &BrowserApp::CreateBrowser) + .def("run", &BrowserApp::Run, py::call_guard<py::gil_scoped_release>()); + + py::class_<BrowserImpl, std::shared_ptr<BrowserImpl>>(m, "BrowserImpl") + .def("set_size", &BrowserImpl::SetSize) + .def("close", &BrowserImpl::Close) + .def("identifier", &BrowserImpl::Identifier); + + py::class_<PaintRect>(m, "PaintRect") + .def_readwrite("x", &PaintRect::x) + .def_readwrite("y", &PaintRect::y) + .def_readwrite("width", &PaintRect::width) + .def_readwrite("height", &PaintRect::height); + + py::class_<PaintData>(m, "PaintData") + .def(py::init<>()) + .def_readwrite("dirty_rects", &PaintData::dirtyRect) + .def_readwrite("width", &PaintData::width) + .def_readwrite("height", &PaintData::height) + .def_property_readonly("buffer", [](const PaintData& event) { + return paint_data_to_memoryview(event); + }); +} diff --git a/livekit-plugins/livekit-plugins-browser/src/agents_python.hpp b/livekit-plugins/livekit-plugins-browser/src/agents_python.hpp
new file mode 100644 index 000000000..7312b464c --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/agents_python.hpp @@ -0,0 +1,69 @@ +#ifndef LKCEF_AGENTS_PYTHON_HPP +#define LKCEF_AGENTS_PYTHON_HPP + +#include <functional> +#include <memory> + +#include "app.hpp" + +class BrowserImpl; +struct PaintData; + +struct AppOptions { + bool dev_mode = false; + int remote_debugging_port = 0; + std::string root_cache_path; + std::string framework_path; + std::string main_bundle_path; + std::string subprocess_path; + std::function<void()> initialized_callback = nullptr; +}; + +struct BrowserOptions { + int framerate = 30; + int width = 800; + int height = 600; + std::function<void(std::shared_ptr<BrowserImpl>)> created_callback = nullptr; + std::function<void(PaintData)> paint_callback = nullptr; + std::function<void()> close_callback = nullptr; +}; + +struct BrowserApp { + BrowserApp(const AppOptions& options); + + bool CreateBrowser(const std::string& url, const BrowserOptions& options); + void CreateBrowserOnUIThread(const std::string& url, const BrowserOptions& options); + + int Run(); + + private: + AppOptions options_; + CefRefPtr<AgentApp> app_; + std::list<std::shared_ptr<BrowserImpl>> browsers_; +}; + +struct BrowserImpl { + BrowserImpl(); + + void SetSize(int width, int height); + void Close(); + int Identifier() const; + + CefRefPtr<BrowserHandle> handle = nullptr; +}; + +struct PaintRect { + int x = 0; + int y = 0; + int width = 0; + int height = 0; +}; + +struct PaintData { + std::vector<PaintRect> dirtyRect; + const void* buffer; + int width; + int height; +}; + +#endif // LKCEF_AGENTS_PYTHON_HPP diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/app.cpp b/livekit-plugins/livekit-plugins-browser/src/app.cpp similarity index 50% rename from livekit-plugins/livekit-plugins-browser/cef/src/app.cpp rename to livekit-plugins/livekit-plugins-browser/src/app.cpp index 1dfbe7976..ae688bb54 100644 --- a/livekit-plugins/livekit-plugins-browser/cef/src/app.cpp +++ b/livekit-plugins/livekit-plugins-browser/src/app.cpp @@ -8,11 +8,24 @@ #include "include/views/cef_window.h" #include
"include/wrapper/cef_helpers.h" -AgentApp::AgentApp(bool dev_mode, std::function<void()> initialized_callback) +AgentApp::AgentApp(bool dev_mode, + int remote_debugging_port, + std::string root_cache_path, + std::string framework_path, + std::string main_bundle_path, + std::string subprocess_path, + std::function<void()> initialized_callback) : dev_mode_(dev_mode), + remote_debugging_port_(remote_debugging_port), + root_cache_path_(std::move(root_cache_path)), + framework_path_(std::move(framework_path)), + main_bundle_path_(std::move(main_bundle_path)), + subprocess_path_(std::move(subprocess_path)), initialized_callback_(std::move(initialized_callback)) { + browser_store_ = CefRefPtr<BrowserStore>(new BrowserStore()); + if (dev_mode) - dev_renderer_ = CefRefPtr<DevRenderer>(new DevRenderer()); + dev_renderer_ = CefRefPtr<DevRenderer>(new DevRenderer(browser_store_)); } void AgentApp::OnBeforeCommandLineProcessing( @@ -20,12 +33,15 @@ void AgentApp::OnBeforeCommandLineProcessing( CefRefPtr<CefCommandLine> command_line) { command_line->AppendSwitch("--disable-gpu"); command_line->AppendSwitch("--disable-gpu-compositing"); + command_line->AppendSwitch("--enable-chrome-runtime"); // command_line->AppendSwitch("--enable-begin-frame-scheduling"); } void AgentApp::OnContextInitialized() { CEF_REQUIRE_UI_THREAD(); // Main thread in our case - client_ = CefRefPtr<AgentHandler>(new AgentHandler(dev_renderer_)); + client_ = + CefRefPtr<AgentHandler>(new AgentHandler(browser_store_, dev_renderer_)); + dev_client_ = CefRefPtr<DevToolsHandler>(new DevToolsHandler()); if (initialized_callback_) initialized_callback_(); @@ -38,27 +54,34 @@ CefRefPtr<CefClient> AgentApp::GetDefaultClient() { CefRefPtr<BrowserHandle> AgentApp::CreateBrowser( const std::string& url, int framerate, - std::function<void()> created_callback) { + int width, + int height, + std::function<void()> created_callback, + std::function<void(std::vector<CefRect> dirtyRects, + const void* buffer, + int width, + int height)> paint_callback, + std::function<void()> close_callback) { CEF_REQUIRE_UI_THREAD(); - CefWindowInfo windowInfo; + //
windowInfo.SetAsWindowless(dev_renderer_->getNativeWindowHandle()); + CefWindowInfo windowInfo; windowInfo.SetAsWindowless(nullptr); - CefRefPtr<CefCommandLine> command_line = - CefCommandLine::GetGlobalCommandLine(); - CefBrowserSettings settings; + settings.windowless_frame_rate = framerate; settings.background_color = CefColorSetARGB(255, 255, 255, 255); CefRefPtr<BrowserHandle> browser_handle = - new BrowserHandle(created_callback); + new BrowserHandle(std::move(created_callback), std::move(paint_callback), + std::move(close_callback), width, height); - client_->AddPendingHandle(browser_handle); + browser_store_->AddPendingHandle(browser_handle); bool result = CefBrowserHost::CreateBrowser(windowInfo, client_, url, settings, nullptr, nullptr); if (!result) { - client_->RemovePendingHandle(browser_handle); + browser_store_->RemovePendingHandle(browser_handle); return nullptr; } return browser_handle; @@ -71,5 +94,7 @@ int AgentApp::Run() { CefRunMessageLoop(); } + // Close all browsers + return 0; } diff --git a/livekit-plugins/livekit-plugins-browser/src/app.hpp b/livekit-plugins/livekit-plugins-browser/src/app.hpp new file mode 100644 index 000000000..da7a27cee --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/app.hpp @@ -0,0 +1,75 @@ +#ifndef LKCEF_APP_HPP +#define LKCEF_APP_HPP + +#include "browser_handle.hpp" +#include "dev_renderer.hpp" +#include "handler.hpp" +#include "include/cef_app.h" +#include "include/cef_base.h" +#include "include/cef_browser_process_handler.h" +#include "include/cef_client.h" +#include "include/internal/cef_ptr.h" + +class AgentApp : public CefApp, public CefBrowserProcessHandler { + public: + AgentApp(bool dev_mode, + int remote_debugging_port, + std::string root_cache_path, + std::string framework_path, + std::string main_bundle_path, + std::string subprocess_path, + std::function<void()> initialized_callback); + + CefRefPtr<CefBrowserProcessHandler> GetBrowserProcessHandler() override { + return this; + } + + void OnBeforeCommandLineProcessing( + const CefString& process_type,
+ CefRefPtr<CefCommandLine> command_line) override; + + void OnContextInitialized() override; + + CefRefPtr<CefClient> GetDefaultClient() override; + + CefRefPtr<BrowserHandle> CreateBrowser( + const std::string& url, + int framerate, + int width, + int height, + std::function<void()> created_callback, + std::function<void(std::vector<CefRect> dirtyRect, + const void* buffer, + int width, + int height)> paint_callback, + std::function<void()> close_callback); + + int Run(); + + bool IsDevMode() const { return dev_mode_; } + int GetRemoteDebuggingPort() const { return remote_debugging_port_; } + std::string GetRootCachePath() const { return root_cache_path_; } + std::string GetFrameworkPath() const { return framework_path_; } + std::string GetMainBundlePath() const { return main_bundle_path_; } + std::string GetSubprocessPath() const { return subprocess_path_; } + + private: + IMPLEMENT_REFCOUNTING(AgentApp); + + CefRefPtr<BrowserStore> browser_store_; + CefRefPtr<AgentHandler> client_; + CefRefPtr<DevToolsHandler> dev_client_; + CefRefPtr<DevRenderer> dev_renderer_; + + bool dev_mode_; + int remote_debugging_port_; + std::string root_cache_path_; + std::string framework_path_; + std::string main_bundle_path_; + std::string subprocess_path_; + std::function<void()> initialized_callback_; +}; + +int RunAgentApp(CefRefPtr<AgentApp> app); + +#endif // LKCEF_APP_HPP diff --git a/livekit-plugins/livekit-plugins-browser/src/app_mac.mm b/livekit-plugins/livekit-plugins-browser/src/app_mac.mm new file mode 100644 index 000000000..68a5822bf --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/app_mac.mm @@ -0,0 +1,110 @@ + +#import <Cocoa/Cocoa.h> + +#include <iostream> + +#import <objc/runtime.h> +#include <string> + +#include "app.hpp" +#include "handler.hpp" +#include "include/cef_application_mac.h" +#include "include/cef_command_line.h" +#include "include/wrapper/cef_library_loader.h" + +BOOL g_handling_send_event = false; + +@interface NSApplication (AgentsApplication) <CefAppProtocol> + +- (BOOL)isHandlingSendEvent; +- (void)setHandlingSendEvent:(BOOL)handlingSendEvent; +- (void)_swizzled_sendEvent:(NSEvent*)event; +- (void)_swizzled_terminate:(id)sender; + +@end + +@implementation
NSApplication (AgentsApplication) + +// This selector is called very early during the application initialization. ++ (void)load { + NSLog(@"AgentsApplication::load"); + // Swap NSApplication::sendEvent with _swizzled_sendEvent. + Method original = class_getInstanceMethod(self, @selector(sendEvent:)); + Method swizzled = + class_getInstanceMethod(self, @selector(_swizzled_sendEvent:)); + method_exchangeImplementations(original, swizzled); + + Method originalTerm = class_getInstanceMethod(self, @selector(terminate:)); + Method swizzledTerm = + class_getInstanceMethod(self, @selector(_swizzled_terminate:)); + method_exchangeImplementations(originalTerm, swizzledTerm); +} + +- (BOOL)isHandlingSendEvent { + return g_handling_send_event; +} + +- (void)setHandlingSendEvent:(BOOL)handlingSendEvent { + g_handling_send_event = handlingSendEvent; +} + +- (void)_swizzled_sendEvent:(NSEvent*)event { + CefScopedSendingEvent sendingEventScoper; + // Calls NSApplication::sendEvent due to the swizzling. + [self _swizzled_sendEvent:event]; +} + +- (void)_swizzled_terminate:(id)sender { + [self _swizzled_terminate:sender]; +} + +@end + +// Entry point function for the browser process. +int RunAgentApp(CefRefPtr<AgentApp> app) { + CefMainArgs main_args(0, nullptr); + + @autoreleasepool { + [NSApplication sharedApplication]; + + // If there was an invocation to NSApp prior to this method, then the NSApp + // will not be an AgentsApplication, but will instead be an NSApplication. + // This is undesirable and we must enforce that this doesn't happen.
+ CHECK([NSApp isKindOfClass:[NSApplication class]]); + + std::string framework_lib = app->GetFrameworkPath() + "/Chromium Embedded Framework"; + if (!cef_load_library(framework_lib.c_str())) { + std::cerr << "lkcef: Failed to load CEF library" << std::endl; + return 1; + } + + CefSettings settings{}; + settings.chrome_runtime = true; + settings.external_message_pump = app->IsDevMode(); + settings.remote_debugging_port = app->GetRemoteDebuggingPort(); + CefString(&settings.root_cache_path).FromString(app->GetRootCachePath()); + CefString(&settings.framework_dir_path).FromString(app->GetFrameworkPath()); + CefString(&settings.main_bundle_path).FromString(app->GetMainBundlePath()); + CefString(&settings.browser_subprocess_path).FromString(app->GetSubprocessPath()); + + settings.no_sandbox = true; // No sandbox on macOS; for livekit-agents + // we only plan to support Linux + settings.windowless_rendering_enabled = true; + + // Initialize the CEF browser process. May return false if initialization + // fails or if early exit is desired (for example, due to process singleton + // relaunch behavior).
+ if (!CefInitialize(main_args, settings, app.get(), nullptr)) { + std::cerr << "lkcef: Failed to initialize CEF" << std::endl; + // TODO(theomonnom): Use CefGetExitCode(); + return 1; + } + + app->Run(); + CefShutdown(); + + cef_unload_library(); + } // @autoreleasepool + + return 0; +} diff --git a/livekit-plugins/livekit-plugins-browser/src/browser_handle.cpp b/livekit-plugins/livekit-plugins-browser/src/browser_handle.cpp new file mode 100644 index 000000000..9e0893bef --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/browser_handle.cpp @@ -0,0 +1,15 @@ +#include "browser_handle.hpp" + +void BrowserHandle::SetSize(int width, int height) { + width_ = width; + height_ = height; + + if (browser_) + browser_->GetHost()->WasResized(); +} + + +void BrowserHandle::Close() { + if (browser_) + browser_->GetHost()->CloseBrowser(true); +} diff --git a/livekit-plugins/livekit-plugins-browser/src/browser_handle.hpp b/livekit-plugins/livekit-plugins-browser/src/browser_handle.hpp new file mode 100644 index 000000000..d93da9dad --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/browser_handle.hpp @@ -0,0 +1,72 @@ +#ifndef LKCEF_BROWSER_HANDLE_HPP +#define LKCEF_BROWSER_HANDLE_HPP + +#include <functional> + +#include "include/cef_client.h" +#include "include/wrapper/cef_helpers.h" + +class BrowserHandle : public CefBaseRefCounted { + public: + BrowserHandle( + std::function<void()> created_callback, + std::function<void(std::vector<CefRect> dirtyRects, + const void* buffer, + int width, + int height)> paint_callback, + std::function<void()> close_callback, + int width, + int height) + : created_callback_(std::move(created_callback)), + paint_callback_(std::move(paint_callback)), + close_callback_(std::move(close_callback)), + width_(width), + height_(height) {} + + CefRefPtr<CefBrowser> browser_ = nullptr; + std::function<void()> created_callback_ = nullptr; + std::function<void(std::vector<CefRect> dirtyRect, + const void* buffer, + int width, + int height)> + paint_callback_ = nullptr; + std::function<void()> close_callback_ = nullptr; + + void
SetSize(int width, int height); + void Close(); + + int GetWidth() const { return width_; } + int GetHeight() const { return height_; } + + CefRefPtr<CefBrowser> GetBrowser() const { return browser_; } + + private: + int width_ = 0; + int height_ = 0; + + IMPLEMENT_REFCOUNTING(BrowserHandle); +}; + +struct BrowserStore : public CefBaseRefCounted { + std::unordered_map<int, CefRefPtr<BrowserHandle>> browser_handles_; + std::list<CefRefPtr<BrowserHandle>> pending_handles_; + + void AddPendingHandle(CefRefPtr<BrowserHandle> handle) { + CEF_REQUIRE_UI_THREAD(); + pending_handles_.push_back(handle); + } + + void RemovePendingHandle(CefRefPtr<BrowserHandle> handle) { + CEF_REQUIRE_UI_THREAD(); + pending_handles_.remove(handle); + } + + CefRefPtr<BrowserHandle> GetBrowserHandle(int identifier) { + CEF_REQUIRE_UI_THREAD(); + return browser_handles_[identifier]; + } + + IMPLEMENT_REFCOUNTING(BrowserStore); +}; + +#endif // LKCEF_BROWSER_HANDLE_HPP diff --git a/livekit-plugins/livekit-plugins-browser/src/dev_renderer.cpp b/livekit-plugins/livekit-plugins-browser/src/dev_renderer.cpp new file mode 100644 index 000000000..1eed5c94e --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/dev_renderer.cpp @@ -0,0 +1,593 @@ +#include "dev_renderer.hpp" + +#include + +#include "handler.hpp" + +#define IMGUI_DEFINE_MATH_OPERATORS +#include "imgui.h" +#include "imgui_impl_glfw.h" +#include "imgui_impl_opengl3.h" +#include "imgui_stdlib.h" +#include "include/cef_app.h" +#include "include/wrapper/cef_helpers.h" +#include "keyboard_codes.h" + +#define GLEQ_IMPLEMENTATION +#define GLEQ_STATIC +#include "gleq.h" + +// DCHECK on gl errors.
+#if DCHECK_IS_ON() +#define VERIFY_NO_ERROR \ + { \ + int _gl_error = glGetError(); \ + DCHECK(_gl_error == GL_NO_ERROR) << "glGetError returned " << _gl_error; \ + } +#else +#define VERIFY_NO_ERROR +#endif + +int glfw_key_to_cef_key(int glfwKey) { + switch (glfwKey) { + case GLFW_KEY_SPACE: + return WebCore::VK_SPACE; + case GLFW_KEY_APOSTROPHE: + return WebCore::VK_OEM_7; + case GLFW_KEY_COMMA: + return WebCore::VK_OEM_COMMA; + case GLFW_KEY_MINUS: + return WebCore::VK_OEM_MINUS; + case GLFW_KEY_PERIOD: + return WebCore::VK_OEM_PERIOD; + case GLFW_KEY_SLASH: + return WebCore::VK_OEM_2; + case GLFW_KEY_0: + return WebCore::VK_0; + case GLFW_KEY_1: + return WebCore::VK_1; + case GLFW_KEY_2: + return WebCore::VK_2; + case GLFW_KEY_3: + return WebCore::VK_3; + case GLFW_KEY_4: + return WebCore::VK_4; + case GLFW_KEY_5: + return WebCore::VK_5; + case GLFW_KEY_6: + return WebCore::VK_6; + case GLFW_KEY_7: + return WebCore::VK_7; + case GLFW_KEY_8: + return WebCore::VK_8; + case GLFW_KEY_9: + return WebCore::VK_9; + case GLFW_KEY_SEMICOLON: + return WebCore::VK_OEM_1; + case GLFW_KEY_EQUAL: + return WebCore::VK_OEM_PLUS; + case GLFW_KEY_A: + return WebCore::VK_A; + case GLFW_KEY_B: + return WebCore::VK_B; + case GLFW_KEY_C: + return WebCore::VK_C; + case GLFW_KEY_D: + return WebCore::VK_D; + case GLFW_KEY_E: + return WebCore::VK_E; + case GLFW_KEY_F: + return WebCore::VK_F; + case GLFW_KEY_G: + return WebCore::VK_G; + case GLFW_KEY_H: + return WebCore::VK_H; + case GLFW_KEY_I: + return WebCore::VK_I; + case GLFW_KEY_J: + return WebCore::VK_J; + case GLFW_KEY_K: + return WebCore::VK_K; + case GLFW_KEY_L: + return WebCore::VK_L; + case GLFW_KEY_M: + return WebCore::VK_M; + case GLFW_KEY_N: + return WebCore::VK_N; + case GLFW_KEY_O: + return WebCore::VK_O; + case GLFW_KEY_P: + return WebCore::VK_P; + case GLFW_KEY_Q: + return WebCore::VK_Q; + case GLFW_KEY_R: + return WebCore::VK_R; + case GLFW_KEY_S: + return WebCore::VK_S; + case GLFW_KEY_T: + return WebCore::VK_T; + 
case GLFW_KEY_U: + return WebCore::VK_U; + case GLFW_KEY_V: + return WebCore::VK_V; + case GLFW_KEY_W: + return WebCore::VK_W; + case GLFW_KEY_X: + return WebCore::VK_X; + case GLFW_KEY_Y: + return WebCore::VK_Y; + case GLFW_KEY_Z: + return WebCore::VK_Z; + case GLFW_KEY_LEFT_BRACKET: + return WebCore::VK_OEM_4; + case GLFW_KEY_BACKSLASH: + return WebCore::VK_OEM_5; + case GLFW_KEY_RIGHT_BRACKET: + return WebCore::VK_OEM_6; + case GLFW_KEY_GRAVE_ACCENT: + return WebCore::VK_OEM_3; + case GLFW_KEY_ESCAPE: + return WebCore::VK_ESCAPE; + case GLFW_KEY_ENTER: + return WebCore::VK_RETURN; + case GLFW_KEY_TAB: + return WebCore::VK_TAB; + case GLFW_KEY_BACKSPACE: + return WebCore::VK_BACK; + case GLFW_KEY_INSERT: + return WebCore::VK_INSERT; + case GLFW_KEY_DELETE: + return WebCore::VK_DELETE; + case GLFW_KEY_RIGHT: + return WebCore::VK_RIGHT; + case GLFW_KEY_LEFT: + return WebCore::VK_LEFT; + case GLFW_KEY_DOWN: + return WebCore::VK_DOWN; + case GLFW_KEY_UP: + return WebCore::VK_UP; + case GLFW_KEY_PAGE_UP: + return WebCore::VK_PRIOR; + case GLFW_KEY_PAGE_DOWN: + return WebCore::VK_NEXT; + case GLFW_KEY_HOME: + return WebCore::VK_HOME; + case GLFW_KEY_END: + return WebCore::VK_END; + case GLFW_KEY_CAPS_LOCK: + return WebCore::VK_CAPITAL; + case GLFW_KEY_SCROLL_LOCK: + return WebCore::VK_SCROLL; + case GLFW_KEY_NUM_LOCK: + return WebCore::VK_NUMLOCK; + case GLFW_KEY_PRINT_SCREEN: + return WebCore::VK_SNAPSHOT; + case GLFW_KEY_PAUSE: + return WebCore::VK_PAUSE; + case GLFW_KEY_F1: + return WebCore::VK_F1; + case GLFW_KEY_F2: + return WebCore::VK_F2; + case GLFW_KEY_F3: + return WebCore::VK_F3; + case GLFW_KEY_F4: + return WebCore::VK_F4; + case GLFW_KEY_F5: + return WebCore::VK_F5; + case GLFW_KEY_F6: + return WebCore::VK_F6; + case GLFW_KEY_F7: + return WebCore::VK_F7; + case GLFW_KEY_F8: + return WebCore::VK_F8; + case GLFW_KEY_F9: + return WebCore::VK_F9; + case GLFW_KEY_F10: + return WebCore::VK_F10; + case GLFW_KEY_F11: + return WebCore::VK_F11; + case GLFW_KEY_F12: + 
return WebCore::VK_F12;
+    // Add more cases as needed
+    default:
+      return WebCore::VK_UNKNOWN;
+  }
+}
+
+static uint32_t glfw_mods_to_cef_mods(int glfw_mods) {
+  uint32_t cef_flags = 0;
+
+  if (glfw_mods & GLFW_MOD_SHIFT) {
+    cef_flags |= EVENTFLAG_SHIFT_DOWN;
+  }
+  if (glfw_mods & GLFW_MOD_CONTROL) {
+    cef_flags |= EVENTFLAG_CONTROL_DOWN;
+  }
+  if (glfw_mods & GLFW_MOD_ALT) {
+    cef_flags |= EVENTFLAG_ALT_DOWN;
+  }
+  if (glfw_mods & GLFW_MOD_SUPER) {
+    cef_flags |= EVENTFLAG_COMMAND_DOWN;  // Super key -> Command on Mac
+  }
+  if (glfw_mods & GLFW_MOD_CAPS_LOCK) {
+    cef_flags |= EVENTFLAG_CAPS_LOCK_ON;
+  }
+  if (glfw_mods & GLFW_MOD_NUM_LOCK) {
+    cef_flags |= EVENTFLAG_NUM_LOCK_ON;
+  }
+
+  return cef_flags;
+}
+
+static std::optional<CefBrowserHost::MouseButtonType> glfw_button_to_cef_button(
+    int button) {
+  switch (button) {
+    case GLFW_MOUSE_BUTTON_LEFT:
+      return CefBrowserHost::MouseButtonType::MBT_LEFT;
+    case GLFW_MOUSE_BUTTON_MIDDLE:
+      return CefBrowserHost::MouseButtonType::MBT_MIDDLE;
+    case GLFW_MOUSE_BUTTON_RIGHT:
+      return CefBrowserHost::MouseButtonType::MBT_RIGHT;
+    default:
+      return std::nullopt;
+  }
+}
+
+static void glfw_error_callback(int error, const char* description) {
+  fprintf(stderr, "GLFW Error %d: %s\n", error, description);
+}
+
+DevRenderer::DevRenderer(CefRefPtr<BrowserStore> browser_store)
+    : browser_store_(browser_store) {}
+
+void DevRenderer::OnTitleChange(CefRefPtr<CefBrowser> browser,
+                                const CefString& title) {
+  CEF_REQUIRE_UI_THREAD();
+  int identifier = browser->GetIdentifier();
+  BrowserData* data = &browser_data_[identifier];
+  data->title = title;
+}
+
+void DevRenderer::OnLoadingStateChange(CefRefPtr<CefBrowser> browser,
+                                       bool isLoading,
+                                       bool canGoBack,
+                                       bool canGoForward) {
+  if (!isLoading) {
+    int identifier = browser->GetIdentifier();
+    BrowserData* data = &browser_data_[identifier];
+    data->url = browser->GetMainFrame()->GetURL();
+  }
+}
+
+void
DevRenderer::OnAfterCreated(CefRefPtr<CefBrowser> browser) {
+  CEF_REQUIRE_UI_THREAD();
+  int identifier = browser->GetIdentifier();
+
+  unsigned int texture_id;
+  glGenTextures(1, &texture_id);
+  VERIFY_NO_ERROR;
+
+  BrowserData data{};
+  data.browser = browser;
+  data.texture_id = texture_id;
+  browser_data_.insert({identifier, data});
+
+  glBindTexture(GL_TEXTURE_2D, texture_id);
+  VERIFY_NO_ERROR;
+  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
+  VERIFY_NO_ERROR;
+  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
+}
+
+void DevRenderer::OnPaint(CefRefPtr<CefBrowser> browser,
+                          CefRenderHandler::PaintElementType type,
+                          const CefRenderHandler::RectList& dirtyRects,
+                          const void* buffer,
+                          int width,
+                          int height) {
+  CEF_REQUIRE_UI_THREAD();
+
+  if (type != CefRenderHandler::PaintElementType::PET_VIEW) {
+    return;  // Ignore PET_POPUP for now.
+  }
+
+  int identifier = browser->GetIdentifier();
+  BrowserData* data = &browser_data_[identifier];
+
+  int old_width = data->view_width;
+  int old_height = data->view_height;
+
+  data->view_width = width;
+  data->view_height = height;
+
+  glBindTexture(GL_TEXTURE_2D, data->texture_id);
+
+  glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
+  VERIFY_NO_ERROR;
+
+  bool has_fullscreen_rect =
+      dirtyRects.size() == 1 && dirtyRects[0] == CefRect(0, 0, width, height);
+
+  if (old_width != width || old_height != height || has_fullscreen_rect) {
+    glPixelStorei(GL_UNPACK_SKIP_PIXELS, 0);
+    VERIFY_NO_ERROR;
+    glPixelStorei(GL_UNPACK_SKIP_ROWS, 0);
+    VERIFY_NO_ERROR;
+    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_BGRA,
+                 GL_UNSIGNED_INT_8_8_8_8_REV, buffer);
+    VERIFY_NO_ERROR;
+  } else {
+    CefRenderHandler::RectList::const_iterator i = dirtyRects.begin();
+    for (; i != dirtyRects.end(); ++i) {
+      const CefRect& rect = *i;
+      glPixelStorei(GL_UNPACK_SKIP_PIXELS, rect.x);
+      VERIFY_NO_ERROR;
+      glPixelStorei(GL_UNPACK_SKIP_ROWS, rect.y);
+      VERIFY_NO_ERROR;
+      glTexSubImage2D(GL_TEXTURE_2D, 0, rect.x,
rect.y, rect.width, rect.height,
+                      GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, buffer);
+      VERIFY_NO_ERROR;
+    }
+  }
+}
+
+void DevRenderer::OnBeforeClose(CefRefPtr<CefBrowser> browser) {
+  CEF_REQUIRE_UI_THREAD();
+  int identifier = browser->GetIdentifier();
+  BrowserData* data = &browser_data_[identifier];
+  glDeleteTextures(1, &data->texture_id);
+  browser_data_.erase(identifier);
+}
+
+void DevRenderer::Run() {
+  glfwSetErrorCallback(glfw_error_callback);
+
+  if (!glfwInit()) {
+    std::cerr << "Failed to initialize GLFW" << std::endl;
+    return;
+  }
+
+  gleqInit();
+
+  const char* glsl_version = "#version 150";
+  glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
+  glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 2);
+  glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
+  glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GL_TRUE);
+
+  window_ =
+      glfwCreateWindow(800, 600, "livekit-plugins-browser (Development Window)",
+                       nullptr, nullptr);
+
+  if (!window_) {
+    std::cerr << "Failed to create GLFW window" << std::endl;
+    glfwTerminate();
+    return;
+  }
+
+  // Track the window only after creation is known to have succeeded.
+  gleqTrackWindow(window_);
+
+  glfwMakeContextCurrent(window_);
+  glfwSwapInterval(1);  // Enable vsync
+
+  IMGUI_CHECKVERSION();
+
+  ImGui::CreateContext();
+  ImGuiIO& io = ImGui::GetIO();
+  io.ConfigFlags |= ImGuiConfigFlags_NavEnableKeyboard;
+  io.ConfigFlags |= ImGuiConfigFlags_DockingEnable;
+
+  ImGui_ImplGlfw_InitForOpenGL(window_, true);
+  ImGui_ImplOpenGL3_Init(glsl_version);
+
+  ImVec4 clear_color = ImVec4(0.03f, 0.03f, 0.03f, 1.0f);
+  while (!glfwWindowShouldClose(window_)) {
+    glfwPollEvents();
+
+    CefDoMessageLoopWork();
+
+    ImGui_ImplOpenGL3_NewFrame();
+    ImGui_ImplGlfw_NewFrame();
+    ImGui::NewFrame();
+
+    // Flags used for the "invisible" dockspace frame
+    ImGuiWindowFlags windowFlags =
+        ImGuiWindowFlags_NoDocking | ImGuiWindowFlags_NoTitleBar |
+        ImGuiWindowFlags_NoCollapse | ImGuiWindowFlags_NoResize |
+        ImGuiWindowFlags_NoMove | ImGuiWindowFlags_NoBringToFrontOnFocus |
+        ImGuiWindowFlags_NoNavFocus |
ImGuiWindowFlags_NoBackground;
+
+    ImGuiViewport* viewport = ImGui::GetMainViewport();
+    ImGui::SetNextWindowPos(viewport->Pos);
+    ImGui::SetNextWindowSize(viewport->Size);
+    ImGui::SetNextWindowViewport(viewport->ID);
+
+    ImGui::PushStyleVar(ImGuiStyleVar_WindowRounding, 0);
+    ImGui::PushStyleVar(ImGuiStyleVar_WindowBorderSize, 0);
+    ImGui::PushStyleVar(ImGuiStyleVar_WindowPadding, ImVec2(0, 0));
+    ImGui::Begin("Editor", nullptr, windowFlags);
+    ImGui::PopStyleVar(3);
+    ImGui::DockSpace(ImGui::GetID("EditorDockSpace"), ImVec2(),
+                     ImGuiDockNodeFlags_PassthruCentralNode);
+
+    // Focused browser input states
+    BrowserData* focused_browser = nullptr;
+    int browser_view_x = 0;
+    int browser_view_y = 0;
+
+    for (auto& [identifier, data] : browser_data_) {
+      std::string name =
+          (data.title.empty() ? "Browser #" + std::to_string(identifier)
+                              : data.title) +
+          "###Browser" + std::to_string(identifier);
+
+      ImGui::PushStyleVar(ImGuiStyleVar_WindowPadding, ImVec2(0, 0));
+      if (ImGui::Begin(name.c_str())) {
+        ImGui::BeginDisabled(!data.browser->CanGoBack());
+        if (ImGui::ArrowButton("##BrowserBack", ImGuiDir_Left)) {
+          data.browser->GoBack();
+        }
+        ImGui::EndDisabled();
+        ImGui::SameLine();
+
+        ImGui::BeginDisabled(!data.browser->CanGoForward());
+        if (ImGui::ArrowButton("##BrowserForward", ImGuiDir_Right)) {
+          data.browser->GoForward();
+        }
+        ImGui::EndDisabled();
+        ImGui::SameLine();
+
+        if (ImGui::InputText("##BrowserURL", &data.url,
+                             ImGuiInputTextFlags_EnterReturnsTrue)) {
+          data.browser->GetMainFrame()->LoadURL(data.url);
+        }
+
+        ImGui::SameLine();
+
+        if (ImGui::Button("Show DevTools")) {
+          CefWindowInfo windowInfo{};
+          CefBrowserSettings settings{};
+
+          data.browser->GetHost()->ShowDevTools(
+              windowInfo, DevToolsHandler::GetInstance(), settings, CefPoint());
+        }
+
+        ImVec2 size = ImGui::GetContentRegionAvail();
+
+        // Resize the browser view if needed
+        if (size.x > 0 && size.y > 0 &&
+            (data.view_width != static_cast<int>(size.x) ||
+             data.view_height != static_cast<int>(size.y))) {
+          browser_store_->GetBrowserHandle(identifier)
+              ->SetSize(static_cast<int>(size.x), static_cast<int>(size.y));
+        }
+
+        ImVec2 cursor_pos = ImGui::GetCursorScreenPos();
+
+        bool is_focused = ImGui::IsWindowFocused();
+        if (is_focused) {
+          focused_browser = &data;
+          browser_view_x = static_cast<int>(cursor_pos.x);
+          browser_view_y = static_cast<int>(cursor_pos.y);
+          data.browser->GetHost()->SetFocus(true);
+        }
+
+        // Render the browser texture
+        ImGui::Image((void*)(intptr_t)data.texture_id,
+                     ImVec2((float)data.view_width, (float)data.view_height));
+      }
+      ImGui::End();
+      ImGui::PopStyleVar();
+    }
+
+    GLEQevent event;
+
+    while (gleqNextEvent(&event)) {
+      switch (event.type) {
+        case GLEQ_CURSOR_MOVED:
+        case GLEQ_BUTTON_PRESSED:
+        case GLEQ_SCROLLED:
+        case GLEQ_BUTTON_RELEASED:
+          if (focused_browser) {
+            CefMouseEvent cef_event;
+
+            if (event.type == GLEQ_CURSOR_MOVED) {
+              cef_event.x = event.pos.x - browser_view_x;
+              cef_event.y = event.pos.y - browser_view_y;
+              focused_browser->browser->GetHost()->SendMouseMoveEvent(cef_event,
+                                                                      false);
+            } else if (event.type == GLEQ_SCROLLED) {
+              double xpos, ypos;
+              glfwGetCursorPos(window_, &xpos, &ypos);
+              cef_event.x = static_cast<int>(xpos) - browser_view_x;
+              cef_event.y = static_cast<int>(ypos) - browser_view_y;
+
+              static const int scrollbarPixelsPerTick = 20;
+              int scroll_x =
+                  static_cast<int>(event.scroll.x * scrollbarPixelsPerTick);
+              int scroll_y =
+                  static_cast<int>(event.scroll.y * scrollbarPixelsPerTick);
+
+              focused_browser->browser->GetHost()->SendMouseWheelEvent(
+                  cef_event, scroll_x, scroll_y);
+            } else {
+              double xpos, ypos;
+              glfwGetCursorPos(window_, &xpos, &ypos);
+              cef_event.x = static_cast<int>(xpos) - browser_view_x;
+              cef_event.y = static_cast<int>(ypos) - browser_view_y;
+              cef_event.modifiers = glfw_mods_to_cef_mods(event.mouse.mods);
+
+              std::optional<CefBrowserHost::MouseButtonType> cef_button =
+                  glfw_button_to_cef_button(event.mouse.button);
+
+              if (cef_button.has_value()) {
+                focused_browser->browser->GetHost()->SendMouseClickEvent(
+                    cef_event,
cef_button.value(), + event.type == GLEQ_BUTTON_RELEASED, 1); + } + } + } + break; + case GLEQ_KEY_PRESSED: + case GLEQ_KEY_RELEASED: + if (focused_browser) { + CefKeyEvent cef_event; + cef_event.windows_key_code = + glfw_key_to_cef_key(event.keyboard.key); + cef_event.native_key_code = event.keyboard.scancode; + cef_event.modifiers = glfw_mods_to_cef_mods(event.keyboard.mods); + cef_event.is_system_key = false; + + if (event.type == GLEQ_KEY_PRESSED) { + cef_event.type = KEYEVENT_RAWKEYDOWN; + focused_browser->browser->GetHost()->SendKeyEvent(cef_event); + } else { + cef_event.type = KEYEVENT_KEYUP; + focused_browser->browser->GetHost()->SendKeyEvent(cef_event); + } + } + break; + case GLEQ_CODEPOINT_INPUT: + if (focused_browser) { + CefKeyEvent cef_event; + cef_event.type = KEYEVENT_CHAR; + cef_event.windows_key_code = 0; + cef_event.native_key_code = 0; + cef_event.modifiers = 0; + cef_event.is_system_key = false; + cef_event.unmodified_character = event.codepoint; + cef_event.character = event.codepoint; + focused_browser->browser->GetHost()->SendKeyEvent(cef_event); + } + break; + default: + break; + } + + gleqFreeEvent(&event); + } + + ImGui::End(); + ImGui::Render(); + int display_w, display_h; + glfwGetFramebufferSize(window_, &display_w, &display_h); + glViewport(0, 0, display_w, display_h); + glClearColor(clear_color.x * clear_color.w, clear_color.y * clear_color.w, + clear_color.z * clear_color.w, clear_color.w); + glClear(GL_COLOR_BUFFER_BIT); + ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData()); + + glfwSwapBuffers(window_); + } + + ImGui_ImplOpenGL3_Shutdown(); + ImGui_ImplGlfw_Shutdown(); + ImGui::DestroyContext(); + + glfwDestroyWindow(window_); + glfwTerminate(); +} + +void DevRenderer::Close() { + // glfwSetWindowShouldClose(window_, GLFW_TRUE); +} diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.hpp b/livekit-plugins/livekit-plugins-browser/src/dev_renderer.hpp similarity index 62% rename from 
livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.hpp
rename to livekit-plugins/livekit-plugins-browser/src/dev_renderer.hpp
index 673674474..c6110a742 100644
--- a/livekit-plugins/livekit-plugins-browser/cef/src/dev_renderer.hpp
+++ b/livekit-plugins/livekit-plugins-browser/src/dev_renderer.hpp
@@ -2,6 +2,7 @@
 #define LKCEF_DEV_RENDERER_HPP
 
 #include "include/cef_app.h"
+#include "browser_handle.hpp"
 
 #define GL_SILENCE_DEPRECATION
 #include <GLFW/glfw3.h>  // Will drag system OpenGL headers
@@ -13,11 +14,18 @@ class DevRenderer: public CefBaseRefCounted {
  public:
-  DevRenderer();
+  DevRenderer(CefRefPtr<BrowserStore> browser_store);
 
   void Run();
 
   void Close();
 
+  void OnTitleChange(CefRefPtr<CefBrowser> browser,
+                     const CefString &title);
+
+  void OnLoadingStateChange(CefRefPtr<CefBrowser> browser,
+                            bool isLoading,
+                            bool canGoBack,
+                            bool canGoForward);
 
   void OnAfterCreated(CefRefPtr<CefBrowser> browser);
 
@@ -30,19 +38,24 @@ class DevRenderer: public CefBaseRefCounted {
 
   void OnBeforeClose(CefRefPtr<CefBrowser> browser);
 
-  void* getNativeWindowHandle() {
+  void* getNativeWindowHandle() const {
     return glfwGetCocoaWindow(window_);
   }
 
  private:
-  struct RenderData{
+  struct BrowserData{
+    CefRefPtr<CefBrowser> browser;
     unsigned int texture_id;
     int view_width;
     int view_height;
+    std::string title;
+    std::string url;
   };
 
   GLFWwindow* window_ = nullptr;
-  std::unordered_map<int, RenderData> render_data_;
+  std::unordered_map<int, BrowserData> browser_data_;
+
+  CefRefPtr<BrowserStore> browser_store_;
 
   IMPLEMENT_REFCOUNTING(DevRenderer);
 };
diff --git a/livekit-plugins/livekit-plugins-browser/src/dummy.cpp b/livekit-plugins/livekit-plugins-browser/src/dummy.cpp
new file mode 100644
index 000000000..d269c8943
--- /dev/null
+++ b/livekit-plugins/livekit-plugins-browser/src/dummy.cpp
@@ -0,0 +1,3 @@
+int main() {
+  return 0;
+}
\ No newline at end of file
diff --git a/livekit-plugins/livekit-plugins-browser/src/gleq.h b/livekit-plugins/livekit-plugins-browser/src/gleq.h
new file mode 100644
index 000000000..69a9e6293
--- /dev/null
+++ b/livekit-plugins/livekit-plugins-browser/src/gleq.h
@@ -0,0 +1,419 @@
+/*
+* GLEQ - A basic event queue for GLFW 3
+* Copyright © Camilla Löwy
+*
+* This software is provided 'as-is', without any express or implied
+* warranty. In no event will the authors be held liable for any damages
+* arising from the use of this software.
+*
+* Permission is granted to anyone to use this software for any purpose,
+* including commercial applications, and to alter it and redistribute it
+* freely, subject to the following restrictions:
+*
+* 1. The origin of this software must not be misrepresented; you must not
+*    claim that you wrote the original software. If you use this software
+*    in a product, an acknowledgment in the product documentation would
+*    be appreciated but is not required.
+*
+* 2. Altered source versions must be plainly marked as such, and must not
+*    be misrepresented as being the original software.
+*
+* 3. This notice may not be removed or altered from any source
+*    distribution.
+*/
+
+#ifndef GLEQ_HEADER_FILE
+#define GLEQ_HEADER_FILE
+
+#include <GLFW/glfw3.h>
+
+#ifdef GLEQ_STATIC
+#define GLEQDEF static
+#else
+#define GLEQDEF extern
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef enum
+{
+    GLEQ_NONE,
+    GLEQ_WINDOW_MOVED,
+    GLEQ_WINDOW_RESIZED,
+    GLEQ_WINDOW_CLOSED,
+    GLEQ_WINDOW_REFRESH,
+    GLEQ_WINDOW_FOCUSED,
+    GLEQ_WINDOW_DEFOCUSED,
+    GLEQ_WINDOW_ICONIFIED,
+    GLEQ_WINDOW_UNICONIFIED,
+    GLEQ_FRAMEBUFFER_RESIZED,
+    GLEQ_BUTTON_PRESSED,
+    GLEQ_BUTTON_RELEASED,
+    GLEQ_CURSOR_MOVED,
+    GLEQ_CURSOR_ENTERED,
+    GLEQ_CURSOR_LEFT,
+    GLEQ_SCROLLED,
+    GLEQ_KEY_PRESSED,
+    GLEQ_KEY_REPEATED,
+    GLEQ_KEY_RELEASED,
+    GLEQ_CODEPOINT_INPUT,
+    GLEQ_MONITOR_CONNECTED,
+    GLEQ_MONITOR_DISCONNECTED,
+#if GLFW_VERSION_MINOR >= 1
+    GLEQ_FILE_DROPPED,
+#endif
+#if GLFW_VERSION_MINOR >= 2
+    GLEQ_JOYSTICK_CONNECTED,
+    GLEQ_JOYSTICK_DISCONNECTED,
+#endif
+#if GLFW_VERSION_MINOR >= 3
+    GLEQ_WINDOW_MAXIMIZED,
+    GLEQ_WINDOW_UNMAXIMIZED,
+    GLEQ_WINDOW_SCALE_CHANGED,
+#endif
+} GLEQtype;
+
+typedef struct GLEQevent
+{
+    GLEQtype type;
+    union {
+
GLFWwindow* window;
+        GLFWmonitor* monitor;
+        int joystick;
+    };
+    union {
+        struct {
+            int x;
+            int y;
+        } pos;
+        struct {
+            int width;
+            int height;
+        } size;
+        struct {
+            double x;
+            double y;
+        } scroll;
+        struct {
+            int key;
+            int scancode;
+            int mods;
+        } keyboard;
+        struct {
+            int button;
+            int mods;
+        } mouse;
+        unsigned int codepoint;
+#if GLFW_VERSION_MINOR >= 1
+        struct {
+            char** paths;
+            int count;
+        } file;
+#endif
+#if GLFW_VERSION_MINOR >= 3
+        struct {
+            float x;
+            float y;
+        } scale;
+#endif
+    };
+} GLEQevent;
+
+GLEQDEF void gleqInit(void);
+GLEQDEF void gleqTrackWindow(GLFWwindow* window);
+
+GLEQDEF int gleqNextEvent(GLEQevent* event);
+GLEQDEF void gleqFreeEvent(GLEQevent* event);
+
+#ifdef __cplusplus
+}
+#endif
+
+#ifdef GLEQ_IMPLEMENTATION
+
+#include <assert.h>
+#include <string.h>
+#include <stdlib.h>
+
+#ifndef GLEQ_CAPACITY
+#define GLEQ_CAPACITY 1024
+#endif
+
+static struct
+{
+    GLEQevent events[GLEQ_CAPACITY];
+    size_t head;
+    size_t tail;
+} gleq_queue = { {}, 0, 0 };
+
+static char* gleq_strdup(const char* string)
+{
+    const size_t size = strlen(string) + 1;
+    char* result = (char*) malloc(size);
+    memcpy(result, string, size);
+    return result;
+}
+
+static GLEQevent* gleq_new_event(void)
+{
+    GLEQevent* event = gleq_queue.events + gleq_queue.head;
+    gleq_queue.head = (gleq_queue.head + 1) % GLEQ_CAPACITY;
+    assert(gleq_queue.head != gleq_queue.tail);
+    memset(event, 0, sizeof(GLEQevent));
+    return event;
+}
+
+static void gleq_window_pos_callback(GLFWwindow* window, int x, int y)
+{
+    GLEQevent* event = gleq_new_event();
+    event->type = GLEQ_WINDOW_MOVED;
+    event->window = window;
+    event->pos.x = x;
+    event->pos.y = y;
+}
+
+static void gleq_window_size_callback(GLFWwindow* window, int width, int height)
+{
+    GLEQevent* event = gleq_new_event();
+    event->type = GLEQ_WINDOW_RESIZED;
+    event->window = window;
+    event->size.width = width;
+    event->size.height = height;
+}
+
+static void gleq_window_close_callback(GLFWwindow* window)
+{
+    GLEQevent* event =
gleq_new_event(); + event->type = GLEQ_WINDOW_CLOSED; + event->window = window; +} + +static void gleq_window_refresh_callback(GLFWwindow* window) +{ + GLEQevent* event = gleq_new_event(); + event->type = GLEQ_WINDOW_REFRESH; + event->window = window; +} + +static void gleq_window_focus_callback(GLFWwindow* window, int focused) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + + if (focused) + event->type = GLEQ_WINDOW_FOCUSED; + else + event->type = GLEQ_WINDOW_DEFOCUSED; +} + +static void gleq_window_iconify_callback(GLFWwindow* window, int iconified) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + + if (iconified) + event->type = GLEQ_WINDOW_ICONIFIED; + else + event->type = GLEQ_WINDOW_UNICONIFIED; +} + +static void gleq_framebuffer_size_callback(GLFWwindow* window, int width, int height) +{ + GLEQevent* event = gleq_new_event(); + event->type = GLEQ_FRAMEBUFFER_RESIZED; + event->window = window; + event->size.width = width; + event->size.height = height; +} + +static void gleq_mouse_button_callback(GLFWwindow* window, int button, int action, int mods) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + event->mouse.button = button; + event->mouse.mods = mods; + + if (action == GLFW_PRESS) + event->type = GLEQ_BUTTON_PRESSED; + else if (action == GLFW_RELEASE) + event->type = GLEQ_BUTTON_RELEASED; +} + +static void gleq_cursor_pos_callback(GLFWwindow* window, double x, double y) +{ + GLEQevent* event = gleq_new_event(); + event->type = GLEQ_CURSOR_MOVED; + event->window = window; + event->pos.x = (int) x; + event->pos.y = (int) y; +} + +static void gleq_cursor_enter_callback(GLFWwindow* window, int entered) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + + if (entered) + event->type = GLEQ_CURSOR_ENTERED; + else + event->type = GLEQ_CURSOR_LEFT; +} + +static void gleq_scroll_callback(GLFWwindow* window, double x, double y) +{ + GLEQevent* event = gleq_new_event(); + event->type = 
GLEQ_SCROLLED; + event->window = window; + event->scroll.x = x; + event->scroll.y = y; +} + +static void gleq_key_callback(GLFWwindow* window, int key, int scancode, int action, int mods) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + event->keyboard.key = key; + event->keyboard.scancode = scancode; + event->keyboard.mods = mods; + + if (action == GLFW_PRESS) + event->type = GLEQ_KEY_PRESSED; + else if (action == GLFW_RELEASE) + event->type = GLEQ_KEY_RELEASED; + else if (action == GLFW_REPEAT) + event->type = GLEQ_KEY_REPEATED; +} + +static void gleq_char_callback(GLFWwindow* window, unsigned int codepoint) +{ + GLEQevent* event = gleq_new_event(); + event->type = GLEQ_CODEPOINT_INPUT; + event->window = window; + event->codepoint = codepoint; +} + +static void gleq_monitor_callback(GLFWmonitor* monitor, int action) +{ + GLEQevent* event = gleq_new_event(); + event->monitor = monitor; + + if (action == GLFW_CONNECTED) + event->type = GLEQ_MONITOR_CONNECTED; + else if (action == GLFW_DISCONNECTED) + event->type = GLEQ_MONITOR_DISCONNECTED; +} + +#if GLFW_VERSION_MINOR >= 1 +static void gleq_file_drop_callback(GLFWwindow* window, int count, const char** paths) +{ + GLEQevent* event = gleq_new_event(); + event->type = GLEQ_FILE_DROPPED; + event->window = window; + event->file.paths = (char**) malloc(count * sizeof(char*)); + event->file.count = count; + + while (count--) + event->file.paths[count] = gleq_strdup(paths[count]); +} +#endif + +#if GLFW_VERSION_MINOR >= 2 +static void gleq_joystick_callback(int jid, int action) +{ + GLEQevent* event = gleq_new_event(); + event->joystick = jid; + + if (action == GLFW_CONNECTED) + event->type = GLEQ_JOYSTICK_CONNECTED; + else if (action == GLFW_DISCONNECTED) + event->type = GLEQ_JOYSTICK_DISCONNECTED; +} +#endif + +#if GLFW_VERSION_MINOR >= 3 +static void gleq_window_maximize_callback(GLFWwindow* window, int maximized) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + + if 
(maximized) + event->type = GLEQ_WINDOW_MAXIMIZED; + else + event->type = GLEQ_WINDOW_UNMAXIMIZED; +} + +static void gleq_window_content_scale_callback(GLFWwindow* window, float xscale, float yscale) +{ + GLEQevent* event = gleq_new_event(); + event->window = window; + event->type = GLEQ_WINDOW_SCALE_CHANGED; + event->scale.x = xscale; + event->scale.y = yscale; +} +#endif + +GLEQDEF void gleqInit(void) +{ + glfwSetMonitorCallback(gleq_monitor_callback); +#if GLFW_VERSION_MINOR >= 2 + glfwSetJoystickCallback(gleq_joystick_callback); +#endif +} + +GLEQDEF void gleqTrackWindow(GLFWwindow* window) +{ + glfwSetWindowPosCallback(window, gleq_window_pos_callback); + glfwSetWindowSizeCallback(window, gleq_window_size_callback); + glfwSetWindowCloseCallback(window, gleq_window_close_callback); + glfwSetWindowRefreshCallback(window, gleq_window_refresh_callback); + glfwSetWindowFocusCallback(window, gleq_window_focus_callback); + glfwSetWindowIconifyCallback(window, gleq_window_iconify_callback); + glfwSetFramebufferSizeCallback(window, gleq_framebuffer_size_callback); + glfwSetMouseButtonCallback(window, gleq_mouse_button_callback); + glfwSetCursorPosCallback(window, gleq_cursor_pos_callback); + glfwSetCursorEnterCallback(window, gleq_cursor_enter_callback); + glfwSetScrollCallback(window, gleq_scroll_callback); + glfwSetKeyCallback(window, gleq_key_callback); + glfwSetCharCallback(window, gleq_char_callback); +#if GLFW_VERSION_MINOR >= 1 + glfwSetDropCallback(window, gleq_file_drop_callback); +#endif +#if GLFW_VERSION_MINOR >= 3 + glfwSetWindowMaximizeCallback(window, gleq_window_maximize_callback); + glfwSetWindowContentScaleCallback(window, gleq_window_content_scale_callback); +#endif +} + +GLEQDEF int gleqNextEvent(GLEQevent* event) +{ + memset(event, 0, sizeof(GLEQevent)); + + if (gleq_queue.head != gleq_queue.tail) + { + *event = gleq_queue.events[gleq_queue.tail]; + gleq_queue.tail = (gleq_queue.tail + 1) % GLEQ_CAPACITY; + } + + return event->type != GLEQ_NONE; +} 
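The queue above is a single fixed-capacity ring buffer: GLFW callbacks write events at `head`, `gleqNextEvent` drains them at `tail`, and both indices wrap modulo the capacity, with an assert guarding overflow. A minimal self-contained C++ sketch of that scheme (names like `Event`, `EventQueue`, and `kCapacity` are illustrative, not gleq's actual API):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of a gleq-style fixed-capacity event ring buffer.
struct Event {
  int type = 0;
};

constexpr std::size_t kCapacity = 8;

struct EventQueue {
  Event events[kCapacity];
  std::size_t head = 0;  // next slot a producer writes into
  std::size_t tail = 0;  // next slot the consumer reads from

  // Producer side; gleq does this inside its GLFW callbacks.
  void push(int type) {
    events[head] = Event{type};
    head = (head + 1) % kCapacity;
    // gleq asserts rather than growing: overflow means events were dropped.
    assert(head != tail && "event queue overflow");
  }

  // Consumer side, one event per call like gleqNextEvent();
  // returns false when the queue is empty (head == tail).
  bool pop(Event* out) {
    if (head == tail) {
      return false;
    }
    *out = events[tail];
    tail = (tail + 1) % kCapacity;
    return true;
  }
};
```

Because `head == tail` doubles as the empty condition, the queue holds at most `kCapacity - 1` events before the assert fires, which is why a generous default capacity (1024 in gleq) is paired with per-frame draining in the render loop.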
+
+GLEQDEF void gleqFreeEvent(GLEQevent* event)
+{
+#if GLFW_VERSION_MINOR >= 1
+    if (event->type == GLEQ_FILE_DROPPED)
+    {
+        while (event->file.count--)
+            free(event->file.paths[event->file.count]);
+
+        free(event->file.paths);
+    }
+#endif
+
+    memset(event, 0, sizeof(GLEQevent));
+}
+
+#endif /* GLEQ_IMPLEMENTATION */
+
+#endif /* GLEQ_HEADER_FILE */
diff --git a/livekit-plugins/livekit-plugins-browser/src/handler.cpp b/livekit-plugins/livekit-plugins-browser/src/handler.cpp
new file mode 100644
index 000000000..1c5e95972
--- /dev/null
+++ b/livekit-plugins/livekit-plugins-browser/src/handler.cpp
@@ -0,0 +1,181 @@
+#include "handler.hpp"
+
+#include <iostream>
+
+#include "include/base/cef_callback.h"
+#include "include/cef_parser.h"
+#include "include/views/cef_browser_view.h"
+#include "include/wrapper/cef_closure_task.h"
+#include "include/wrapper/cef_helpers.h"
+
+DevToolsHandler* g_dev_instance = nullptr;
+
+DevToolsHandler::DevToolsHandler() {
+  g_dev_instance = this;
+}
+
+DevToolsHandler::~DevToolsHandler() {
+  g_dev_instance = nullptr;
+}
+
+DevToolsHandler* DevToolsHandler::GetInstance() {
+  return g_dev_instance;
+}
+
+AgentHandler* g_instance = nullptr;
+
+AgentHandler::AgentHandler(CefRefPtr<BrowserStore> browser_store,
+                           CefRefPtr<DevRenderer> dev_renderer)
+    : browser_store_(std::move(browser_store)),
+      dev_renderer_(std::move(dev_renderer)) {
+  g_instance = this;
+}
+
+AgentHandler::~AgentHandler() {
+  g_instance = nullptr;
+}
+
+AgentHandler* AgentHandler::GetInstance() {
+  return g_instance;
+}
+
+void AgentHandler::OnTitleChange(CefRefPtr<CefBrowser> browser,
+                                 const CefString& title) {
+  CEF_REQUIRE_UI_THREAD();
+  if (dev_renderer_)
+    dev_renderer_->OnTitleChange(browser, title);
+}
+
+void AgentHandler::OnPaint(CefRefPtr<CefBrowser> browser,
+                           PaintElementType type,
+                           const RectList& dirtyRects,
+                           const void* buffer,
+                           int width,
+                           int height) {
+  CEF_REQUIRE_UI_THREAD();
+
+  int identifier = browser->GetIdentifier();
+  CefRefPtr<BrowserHandle> handle =
+      browser_store_->browser_handles_[identifier];
+  if
(handle->paint_callback_)
+    handle->paint_callback_(dirtyRects, buffer, width, height);
+
+  if (dev_renderer_)
+    dev_renderer_->OnPaint(browser, type, dirtyRects, buffer, width, height);
+}
+
+void AgentHandler::GetViewRect(CefRefPtr<CefBrowser> browser, CefRect& rect) {
+  CEF_REQUIRE_UI_THREAD();
+
+  int identifier = browser->GetIdentifier();
+  CefRefPtr<BrowserHandle>& handle =
+      browser_store_->browser_handles_[identifier];
+  rect.Set(0, 0, handle->GetWidth(), handle->GetHeight());
+}
+
+void AgentHandler::OnAudioStreamPacket(CefRefPtr<CefBrowser> browser,
+                                       const float** data,
+                                       int frames,
+                                       int64_t pts) {
+  // std::cout << "OnAudioStreamPacket" << std::endl;
+}
+
+void AgentHandler::OnAudioStreamStarted(CefRefPtr<CefBrowser> browser,
+                                        const CefAudioParameters& params,
+                                        int channels) {}
+
+void AgentHandler::OnAudioStreamStopped(CefRefPtr<CefBrowser> browser) {}
+
+void AgentHandler::OnAudioStreamError(CefRefPtr<CefBrowser> browser,
+                                      const CefString& message) {}
+
+bool AgentHandler::OnBeforePopup(CefRefPtr<CefBrowser> browser,
+                                 CefRefPtr<CefFrame> frame,
+                                 const CefString& target_url,
+                                 const CefString& target_frame_name,
+                                 WindowOpenDisposition target_disposition,
+                                 bool user_gesture,
+                                 const CefPopupFeatures& popupFeatures,
+                                 CefWindowInfo& windowInfo,
+                                 CefRefPtr<CefClient>& client,
+                                 CefBrowserSettings& settings,
+                                 CefRefPtr<CefDictionaryValue>& extra_info,
+                                 bool* no_javascript_access) {
+  browser->GetMainFrame()->LoadURL(target_url);
+  return true;
+}
+
+void AgentHandler::OnAfterCreated(CefRefPtr<CefBrowser> browser) {
+  CEF_REQUIRE_UI_THREAD();
+
+  if (browser->IsPopup()) {
+    return;
+  }
+
+  int identifier = browser->GetIdentifier();
+  CefRefPtr<BrowserHandle> handle = browser_store_->pending_handles_.front();
+  browser_store_->pending_handles_.pop_front();
+
+  handle->browser_ = browser;
+  browser_store_->browser_handles_[identifier] = handle;
+
+  if (handle->created_callback_)
+    handle->created_callback_();
+
+  if (dev_renderer_)
+    dev_renderer_->OnAfterCreated(browser);
+}
+
+bool AgentHandler::DoClose(CefRefPtr<CefBrowser> browser) {
+  CEF_REQUIRE_UI_THREAD();
+  int identifier = browser->GetIdentifier();
+
CefRefPtr<BrowserHandle> handle =
+      browser_store_->browser_handles_[identifier];
+  browser_store_->browser_handles_.erase(identifier);
+
+  if (handle->close_callback_)
+    handle->close_callback_();
+
+  return false;
+}
+
+void AgentHandler::OnBeforeClose(CefRefPtr<CefBrowser> browser) {
+  CEF_REQUIRE_UI_THREAD();
+
+  if (dev_renderer_)
+    dev_renderer_->OnBeforeClose(browser);
+}
+
+void AgentHandler::OnLoadingStateChange(CefRefPtr<CefBrowser> browser,
+                                        bool isLoading,
+                                        bool canGoBack,
+                                        bool canGoForward) {
+  CEF_REQUIRE_UI_THREAD();
+
+  if (dev_renderer_)
+    dev_renderer_->OnLoadingStateChange(browser, isLoading, canGoBack,
+                                        canGoForward);
+}
+
+void AgentHandler::CloseAllBrowsers(bool force_close) {
+  if (!CefCurrentlyOn(TID_UI)) {
+    // Execute on the UI thread.
+    CefPostTask(TID_UI, base::BindOnce(&AgentHandler::CloseAllBrowsers, this,
+                                       force_close));
+    return;
+  }
+
+  if (browser_store_->browser_handles_.empty()) {
+    return;
+  }
+
+  for (const auto& pair : browser_store_->browser_handles_) {
+    pair.second->browser_->GetHost()->CloseBrowser(force_close);
+  }
+}
+
+#if !defined(OS_MAC)
+void AgentHandler::PlatformShowWindow(CefRefPtr<CefBrowser> browser) {
+  NOTIMPLEMENTED();
+}
+#endif
diff --git a/livekit-plugins/livekit-plugins-browser/src/handler.hpp b/livekit-plugins/livekit-plugins-browser/src/handler.hpp
new file mode 100644
index 000000000..3967ee7b2
--- /dev/null
+++ b/livekit-plugins/livekit-plugins-browser/src/handler.hpp
@@ -0,0 +1,104 @@
+#ifndef LKCEF_HANDLER_HPP
+#define LKCEF_HANDLER_HPP
+
+#include
+
+#include "dev_renderer.hpp"
+#include "browser_handle.hpp"
+#include "include/cef_client.h"
+#include "include/wrapper/cef_helpers.h"
+
+class DevToolsHandler : public CefClient {
+ public:
+  DevToolsHandler();
+  ~DevToolsHandler();
+
+  static DevToolsHandler* GetInstance();
+
+ private:
+  IMPLEMENT_REFCOUNTING(DevToolsHandler);
+};
+
+class AgentHandler : public CefClient,
+                     public CefDisplayHandler,
+                     public CefRenderHandler,
+                     public CefAudioHandler,
+                     public CefLifeSpanHandler,
+                     public
CefLoadHandler {
+ public:
+  AgentHandler(CefRefPtr<BrowserStore> browser_store,
+               CefRefPtr<DevRenderer> dev_renderer);
+  ~AgentHandler();
+
+  static AgentHandler* GetInstance();
+
+  CefRefPtr<CefDisplayHandler> GetDisplayHandler() override { return this; }
+  CefRefPtr<CefRenderHandler> GetRenderHandler() override { return this; }
+  CefRefPtr<CefAudioHandler> GetAudioHandler() override { return this; }
+  CefRefPtr<CefLifeSpanHandler> GetLifeSpanHandler() override { return this; }
+  CefRefPtr<CefLoadHandler> GetLoadHandler() override { return this; }
+
+  // CefDisplayHandler methods
+  void OnTitleChange(CefRefPtr<CefBrowser> browser,
+                     const CefString& title) override;
+
+  // CefRenderHandler methods
+  void OnPaint(CefRefPtr<CefBrowser> browser,
+               PaintElementType type,
+               const RectList& dirtyRects,
+               const void* buffer,
+               int width,
+               int height) override;
+
+  void GetViewRect(CefRefPtr<CefBrowser> browser, CefRect& rect) override;
+
+  // CefAudioHandler methods
+  void OnAudioStreamPacket(CefRefPtr<CefBrowser> browser,
+                           const float** data,
+                           int frames,
+                           int64_t pts) override;
+
+  void OnAudioStreamStarted(CefRefPtr<CefBrowser> browser,
+                            const CefAudioParameters& params,
+                            int channels) override;
+
+  void OnAudioStreamStopped(CefRefPtr<CefBrowser> browser) override;
+
+  void OnAudioStreamError(CefRefPtr<CefBrowser> browser,
+                          const CefString& message) override;
+
+  // CefLifeSpanHandler methods
+
+  bool OnBeforePopup(CefRefPtr<CefBrowser> browser,
+                     CefRefPtr<CefFrame> frame,
+                     const CefString& target_url,
+                     const CefString& target_frame_name,
+                     WindowOpenDisposition target_disposition,
+                     bool user_gesture,
+                     const CefPopupFeatures& popupFeatures,
+                     CefWindowInfo& windowInfo,
+                     CefRefPtr<CefClient>& client,
+                     CefBrowserSettings& settings,
+                     CefRefPtr<CefDictionaryValue>& extra_info,
+                     bool* no_javascript_access) override;
+
+  void OnAfterCreated(CefRefPtr<CefBrowser> browser) override;
+  bool DoClose(CefRefPtr<CefBrowser> browser) override;
+  void OnBeforeClose(CefRefPtr<CefBrowser> browser) override;
+
+  // CefLoadHandler methods
+
+  void OnLoadingStateChange(CefRefPtr<CefBrowser> browser,
+                            bool isLoading,
+                            bool canGoBack,
+                            bool canGoForward) override;
+
+  void CloseAllBrowsers(bool force_close);
+
+ private:
+  CefRefPtr<BrowserStore> browser_store_;
+  CefRefPtr<DevRenderer>
dev_renderer_; + + IMPLEMENT_REFCOUNTING(AgentHandler); +}; + +#endif // LKCEF_HANDLER_HPP diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/helper_main_linux.cpp b/livekit-plugins/livekit-plugins-browser/src/helper_main_linux.cpp similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/helper_main_linux.cpp rename to livekit-plugins/livekit-plugins-browser/src/helper_main_linux.cpp diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/helper_main_mac.mm b/livekit-plugins/livekit-plugins-browser/src/helper_main_mac.mm similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/helper_main_mac.mm rename to livekit-plugins/livekit-plugins-browser/src/helper_main_mac.mm diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/utils.hpp b/livekit-plugins/livekit-plugins-browser/src/helper_main_win.cpp similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/utils.hpp rename to livekit-plugins/livekit-plugins-browser/src/helper_main_win.cpp diff --git a/livekit-plugins/livekit-plugins-browser/src/keyboard_codes.h b/livekit-plugins/livekit-plugins-browser/src/keyboard_codes.h new file mode 100644 index 000000000..5a3b67e82 --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/keyboard_codes.h @@ -0,0 +1,528 @@ +#ifndef LKCEF_KEYBOARD_CODES_H +#define LKCEF_KEYBOARD_CODES_H + +namespace WebCore { +// VK_LBUTTON (01) Left mouse button +// VK_RBUTTON (02) Right mouse button +// VK_CANCEL (03) Control-break processing +// VK_MBUTTON (04) Middle mouse button (three-button mouse) +// VK_XBUTTON1 (05) +// VK_XBUTTON2 (06) + +// VK_BACK (08) BACKSPACE key +const int VK_BACK = 0x08; + +// VK_TAB (09) TAB key +const int VK_TAB = 0x09; + +// VK_CLEAR (0C) CLEAR key +const int VK_CLEAR = 0x0C; + +// VK_RETURN (0D) +const int VK_RETURN = 0x0D; + +// VK_SHIFT (10) SHIFT key +const int VK_SHIFT = 0x10; + +// VK_CONTROL (11) CTRL key +const int VK_CONTROL = 0x11; + +// VK_MENU (12) 
ALT key +const int VK_MENU = 0x12; + +// VK_PAUSE (13) PAUSE key +const int VK_PAUSE = 0x13; + +// VK_CAPITAL (14) CAPS LOCK key +const int VK_CAPITAL = 0x14; + +// VK_KANA (15) Input Method Editor (IME) Kana mode +const int VK_KANA = 0x15; + +// VK_HANGUEL (15) IME Hanguel mode (maintained for compatibility; use +// VK_HANGUL) VK_HANGUL (15) IME Hangul mode +const int VK_HANGUL = 0x15; + +// VK_JUNJA (17) IME Junja mode +const int VK_JUNJA = 0x17; + +// VK_FINAL (18) IME final mode +const int VK_FINAL = 0x18; + +// VK_HANJA (19) IME Hanja mode +const int VK_HANJA = 0x19; + +// VK_KANJI (19) IME Kanji mode +const int VK_KANJI = 0x19; + +// VK_ESCAPE (1B) ESC key +const int VK_ESCAPE = 0x1B; + +// VK_CONVERT (1C) IME convert +const int VK_CONVERT = 0x1C; + +// VK_NONCONVERT (1D) IME nonconvert +const int VK_NONCONVERT = 0x1D; + +// VK_ACCEPT (1E) IME accept +const int VK_ACCEPT = 0x1E; + +// VK_MODECHANGE (1F) IME mode change request +const int VK_MODECHANGE = 0x1F; + +// VK_SPACE (20) SPACEBAR +const int VK_SPACE = 0x20; + +// VK_PRIOR (21) PAGE UP key +const int VK_PRIOR = 0x21; + +// VK_NEXT (22) PAGE DOWN key +const int VK_NEXT = 0x22; + +// VK_END (23) END key +const int VK_END = 0x23; + +// VK_HOME (24) HOME key +const int VK_HOME = 0x24; + +// VK_LEFT (25) LEFT ARROW key +const int VK_LEFT = 0x25; + +// VK_UP (26) UP ARROW key +const int VK_UP = 0x26; + +// VK_RIGHT (27) RIGHT ARROW key +const int VK_RIGHT = 0x27; + +// VK_DOWN (28) DOWN ARROW key +const int VK_DOWN = 0x28; + +// VK_SELECT (29) SELECT key +const int VK_SELECT = 0x29; + +// VK_PRINT (2A) PRINT key +const int VK_PRINT = 0x2A; + +// VK_EXECUTE (2B) EXECUTE key +const int VK_EXECUTE = 0x2B; + +// VK_SNAPSHOT (2C) PRINT SCREEN key +const int VK_SNAPSHOT = 0x2C; + +// VK_INSERT (2D) INS key +const int VK_INSERT = 0x2D; + +// VK_DELETE (2E) DEL key +const int VK_DELETE = 0x2E; + +// VK_HELP (2F) HELP key +const int VK_HELP = 0x2F; + +// (30) 0 key +const int VK_0 = 0x30; + +// (31) 1 key +const int 
VK_1 = 0x31; + +// (32) 2 key +const int VK_2 = 0x32; + +// (33) 3 key +const int VK_3 = 0x33; + +// (34) 4 key +const int VK_4 = 0x34; + +// (35) 5 key; + +const int VK_5 = 0x35; + +// (36) 6 key +const int VK_6 = 0x36; + +// (37) 7 key +const int VK_7 = 0x37; + +// (38) 8 key +const int VK_8 = 0x38; + +// (39) 9 key +const int VK_9 = 0x39; + +// (41) A key +const int VK_A = 0x41; + +// (42) B key +const int VK_B = 0x42; + +// (43) C key +const int VK_C = 0x43; + +// (44) D key +const int VK_D = 0x44; + +// (45) E key +const int VK_E = 0x45; + +// (46) F key +const int VK_F = 0x46; + +// (47) G key +const int VK_G = 0x47; + +// (48) H key +const int VK_H = 0x48; + +// (49) I key +const int VK_I = 0x49; + +// (4A) J key +const int VK_J = 0x4A; + +// (4B) K key +const int VK_K = 0x4B; + +// (4C) L key +const int VK_L = 0x4C; + +// (4D) M key +const int VK_M = 0x4D; + +// (4E) N key +const int VK_N = 0x4E; + +// (4F) O key +const int VK_O = 0x4F; + +// (50) P key +const int VK_P = 0x50; + +// (51) Q key +const int VK_Q = 0x51; + +// (52) R key +const int VK_R = 0x52; + +// (53) S key +const int VK_S = 0x53; + +// (54) T key +const int VK_T = 0x54; + +// (55) U key +const int VK_U = 0x55; + +// (56) V key +const int VK_V = 0x56; + +// (57) W key +const int VK_W = 0x57; + +// (58) X key +const int VK_X = 0x58; + +// (59) Y key +const int VK_Y = 0x59; + +// (5A) Z key +const int VK_Z = 0x5A; + +// VK_LWIN (5B) Left Windows key (Microsoft Natural keyboard) +const int VK_LWIN = 0x5B; + +// VK_RWIN (5C) Right Windows key (Natural keyboard) +const int VK_RWIN = 0x5C; + +// VK_APPS (5D) Applications key (Natural keyboard) +const int VK_APPS = 0x5D; + +// VK_SLEEP (5F) Computer Sleep key +const int VK_SLEEP = 0x5F; + +// VK_NUMPAD0 (60) Numeric keypad 0 key +const int VK_NUMPAD0 = 0x60; + +// VK_NUMPAD1 (61) Numeric keypad 1 key +const int VK_NUMPAD1 = 0x61; + +// VK_NUMPAD2 (62) Numeric keypad 2 key +const int VK_NUMPAD2 = 0x62; + +// VK_NUMPAD3 (63) Numeric keypad 3 key 
+const int VK_NUMPAD3 = 0x63; + +// VK_NUMPAD4 (64) Numeric keypad 4 key +const int VK_NUMPAD4 = 0x64; + +// VK_NUMPAD5 (65) Numeric keypad 5 key +const int VK_NUMPAD5 = 0x65; + +// VK_NUMPAD6 (66) Numeric keypad 6 key +const int VK_NUMPAD6 = 0x66; + +// VK_NUMPAD7 (67) Numeric keypad 7 key +const int VK_NUMPAD7 = 0x67; + +// VK_NUMPAD8 (68) Numeric keypad 8 key +const int VK_NUMPAD8 = 0x68; + +// VK_NUMPAD9 (69) Numeric keypad 9 key +const int VK_NUMPAD9 = 0x69; + +// VK_MULTIPLY (6A) Multiply key +const int VK_MULTIPLY = 0x6A; + +// VK_ADD (6B) Add key +const int VK_ADD = 0x6B; + +// VK_SEPARATOR (6C) Separator key +const int VK_SEPARATOR = 0x6C; + +// VK_SUBTRACT (6D) Subtract key +const int VK_SUBTRACT = 0x6D; + +// VK_DECIMAL (6E) Decimal key +const int VK_DECIMAL = 0x6E; + +// VK_DIVIDE (6F) Divide key +const int VK_DIVIDE = 0x6F; + +// VK_F1 (70) F1 key +const int VK_F1 = 0x70; + +// VK_F2 (71) F2 key +const int VK_F2 = 0x71; + +// VK_F3 (72) F3 key +const int VK_F3 = 0x72; + +// VK_F4 (73) F4 key +const int VK_F4 = 0x73; + +// VK_F5 (74) F5 key +const int VK_F5 = 0x74; + +// VK_F6 (75) F6 key +const int VK_F6 = 0x75; + +// VK_F7 (76) F7 key +const int VK_F7 = 0x76; + +// VK_F8 (77) F8 key +const int VK_F8 = 0x77; + +// VK_F9 (78) F9 key +const int VK_F9 = 0x78; + +// VK_F10 (79) F10 key +const int VK_F10 = 0x79; + +// VK_F11 (7A) F11 key +const int VK_F11 = 0x7A; + +// VK_F12 (7B) F12 key +const int VK_F12 = 0x7B; + +// VK_F13 (7C) F13 key +const int VK_F13 = 0x7C; + +// VK_F14 (7D) F14 key +const int VK_F14 = 0x7D; + +// VK_F15 (7E) F15 key +const int VK_F15 = 0x7E; + +// VK_F16 (7F) F16 key +const int VK_F16 = 0x7F; + +// VK_F17 (80H) F17 key +const int VK_F17 = 0x80; + +// VK_F18 (81H) F18 key +const int VK_F18 = 0x81; + +// VK_F19 (82H) F19 key +const int VK_F19 = 0x82; + +// VK_F20 (83H) F20 key +const int VK_F20 = 0x83; + +// VK_F21 (84H) F21 key +const int VK_F21 = 0x84; + +// VK_F22 (85H) F22 key +const int VK_F22 = 0x85; + +// VK_F23 (86H) F23 key 
+const int VK_F23 = 0x86; + +// VK_F24 (87H) F24 key +const int VK_F24 = 0x87; + +// VK_NUMLOCK (90) NUM LOCK key +const int VK_NUMLOCK = 0x90; + +// VK_SCROLL (91) SCROLL LOCK key +const int VK_SCROLL = 0x91; + +// VK_LSHIFT (A0) Left SHIFT key +const int VK_LSHIFT = 0xA0; + +// VK_RSHIFT (A1) Right SHIFT key +const int VK_RSHIFT = 0xA1; + +// VK_LCONTROL (A2) Left CONTROL key +const int VK_LCONTROL = 0xA2; + +// VK_RCONTROL (A3) Right CONTROL key +const int VK_RCONTROL = 0xA3; + +// VK_LMENU (A4) Left MENU key +const int VK_LMENU = 0xA4; + +// VK_RMENU (A5) Right MENU key +const int VK_RMENU = 0xA5; + +// VK_BROWSER_BACK (A6) Windows 2000/XP: Browser Back key +const int VK_BROWSER_BACK = 0xA6; + +// VK_BROWSER_FORWARD (A7) Windows 2000/XP: Browser Forward key +const int VK_BROWSER_FORWARD = 0xA7; + +// VK_BROWSER_REFRESH (A8) Windows 2000/XP: Browser Refresh key +const int VK_BROWSER_REFRESH = 0xA8; + +// VK_BROWSER_STOP (A9) Windows 2000/XP: Browser Stop key +const int VK_BROWSER_STOP = 0xA9; + +// VK_BROWSER_SEARCH (AA) Windows 2000/XP: Browser Search key +const int VK_BROWSER_SEARCH = 0xAA; + +// VK_BROWSER_FAVORITES (AB) Windows 2000/XP: Browser Favorites key +const int VK_BROWSER_FAVORITES = 0xAB; + +// VK_BROWSER_HOME (AC) Windows 2000/XP: Browser Start and Home key +const int VK_BROWSER_HOME = 0xAC; + +// VK_VOLUME_MUTE (AD) Windows 2000/XP: Volume Mute key +const int VK_VOLUME_MUTE = 0xAD; + +// VK_VOLUME_DOWN (AE) Windows 2000/XP: Volume Down key +const int VK_VOLUME_DOWN = 0xAE; + +// VK_VOLUME_UP (AF) Windows 2000/XP: Volume Up key +const int VK_VOLUME_UP = 0xAF; + +// VK_MEDIA_NEXT_TRACK (B0) Windows 2000/XP: Next Track key +const int VK_MEDIA_NEXT_TRACK = 0xB0; + +// VK_MEDIA_PREV_TRACK (B1) Windows 2000/XP: Previous Track key +const int VK_MEDIA_PREV_TRACK = 0xB1; + +// VK_MEDIA_STOP (B2) Windows 2000/XP: Stop Media key +const int VK_MEDIA_STOP = 0xB2; + +// VK_MEDIA_PLAY_PAUSE (B3) Windows 2000/XP: Play/Pause Media key +const int 
VK_MEDIA_PLAY_PAUSE = 0xB3; + +// VK_LAUNCH_MAIL (B4) Windows 2000/XP: Start Mail key +const int VK_MEDIA_LAUNCH_MAIL = 0xB4; + +// VK_LAUNCH_MEDIA_SELECT (B5) Windows 2000/XP: Select Media key +const int VK_MEDIA_LAUNCH_MEDIA_SELECT = 0xB5; + +// VK_LAUNCH_APP1 (B6) Windows 2000/XP: Start Application 1 key +const int VK_MEDIA_LAUNCH_APP1 = 0xB6; + +// VK_LAUNCH_APP2 (B7) Windows 2000/XP: Start Application 2 key +const int VK_MEDIA_LAUNCH_APP2 = 0xB7; + +// VK_OEM_1 (BA) Used for miscellaneous characters; it can vary by keyboard. +// Windows 2000/XP: For the US standard keyboard, the ';:' key +const int VK_OEM_1 = 0xBA; + +// VK_OEM_PLUS (BB) Windows 2000/XP: For any country/region, the '+' key +const int VK_OEM_PLUS = 0xBB; + +// VK_OEM_COMMA (BC) Windows 2000/XP: For any country/region, the ',' key +const int VK_OEM_COMMA = 0xBC; + +// VK_OEM_MINUS (BD) Windows 2000/XP: For any country/region, the '-' key +const int VK_OEM_MINUS = 0xBD; + +// VK_OEM_PERIOD (BE) Windows 2000/XP: For any country/region, the '.' key +const int VK_OEM_PERIOD = 0xBE; + +// VK_OEM_2 (BF) Used for miscellaneous characters; it can vary by keyboard. +// Windows 2000/XP: For the US standard keyboard, the '/?' key +const int VK_OEM_2 = 0xBF; + +// VK_OEM_3 (C0) Used for miscellaneous characters; it can vary by keyboard. +// Windows 2000/XP: For the US standard keyboard, the '`~' key +const int VK_OEM_3 = 0xC0; + +// VK_OEM_4 (DB) Used for miscellaneous characters; it can vary by keyboard. +// Windows 2000/XP: For the US standard keyboard, the '[{' key +const int VK_OEM_4 = 0xDB; + +// VK_OEM_5 (DC) Used for miscellaneous characters; it can vary by keyboard. +// Windows 2000/XP: For the US standard keyboard, the '\|' key +const int VK_OEM_5 = 0xDC; + +// VK_OEM_6 (DD) Used for miscellaneous characters; it can vary by keyboard. 
+// Windows 2000/XP: For the US standard keyboard, the ']}' key +const int VK_OEM_6 = 0xDD; + +// VK_OEM_7 (DE) Used for miscellaneous characters; it can vary by keyboard. +// Windows 2000/XP: For the US standard keyboard, the +// 'single-quote/double-quote' key +const int VK_OEM_7 = 0xDE; + +// VK_OEM_8 (DF) Used for miscellaneous characters; it can vary by keyboard. +const int VK_OEM_8 = 0xDF; + +// VK_OEM_102 (E2) Windows 2000/XP: Either the angle bracket key or the +// backslash key on the RT 102-key keyboard +const int VK_OEM_102 = 0xE2; + +// VK_PROCESSKEY (E5) Windows 95/98/Me, Windows NT 4.0, Windows 2000/XP: IME +// PROCESS key +const int VK_PROCESSKEY = 0xE5; + +// VK_PACKET (E7) Windows 2000/XP: Used to pass Unicode characters as if they +// were keystrokes. The VK_PACKET key is the low word of a 32-bit Virtual Key +// value used for non-keyboard input methods. For more information, see Remark +// in KEYBDINPUT,SendInput, WM_KEYDOWN, and WM_KEYUP +const int VK_PACKET = 0xE7; + +// VK_ATTN (F6) Attn key +const int VK_ATTN = 0xF6; + +// VK_CRSEL (F7) CrSel key +const int VK_CRSEL = 0xF7; + +// VK_EXSEL (F8) ExSel key +const int VK_EXSEL = 0xF8; + +// VK_EREOF (F9) Erase EOF key +const int VK_EREOF = 0xF9; + +// VK_PLAY (FA) Play key +const int VK_PLAY = 0xFA; + +// VK_ZOOM (FB) Zoom key +const int VK_ZOOM = 0xFB; + +// VK_NONAME (FC) Reserved for future use +const int VK_NONAME = 0xFC; + +// VK_PA1 (FD) PA1 key +const int VK_PA1 = 0xFD; + +// VK_OEM_CLEAR (FE) Clear key +const int VK_OEM_CLEAR = 0xFE; + +const int VK_UNKNOWN = 0; +} // namespace WebCore + +#endif // LKCEF_KEYBOARD_CODES_H diff --git a/livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcefapp-Info.plist b/livekit-plugins/livekit-plugins-browser/src/resources/lkcefapp-Info.plist similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcefapp-Info.plist rename to livekit-plugins/livekit-plugins-browser/src/resources/lkcefapp-Info.plist diff --git 
a/livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcefhelper-Info.plist b/livekit-plugins/livekit-plugins-browser/src/resources/lkcefhelper-Info.plist similarity index 100% rename from livekit-plugins/livekit-plugins-browser/cef/src/resources/lkcefhelper-Info.plist rename to livekit-plugins/livekit-plugins-browser/src/resources/lkcefhelper-Info.plist diff --git a/livekit-plugins/livekit-plugins-browser/src/run_browser.py b/livekit-plugins/livekit-plugins-browser/src/run_browser.py new file mode 100644 index 000000000..e43c2e63a --- /dev/null +++ b/livekit-plugins/livekit-plugins-browser/src/run_browser.py @@ -0,0 +1,45 @@ +# flake8: noqa + +import sys + +print("cwd: ", sys.path[0]) + +sys.path.insert(0, "./Debug") +import lkcef_python as lkcef + +print("lkcef __dict__: ", lkcef.__dict__) +print("BrowserImpl __dict__: ", lkcef.BrowserImpl.__dict__) + + +def _context_initialized(): + opts = lkcef.BrowserOptions() + opts.framerate = 30 + + def _browser_created(browser_impl): + print("run_browser.py - Browser created") + + opts.created_callback = _browser_created + + def _on_paint(frame_data): + pass + + opts.paint_callback = _on_paint + + def _on_closed(): + print("run_browser.py - Browser closed") + + opts.close_callback = _on_closed + + app.create_browser("http://www.livekit.io", opts) + print("run_browser.py - Context initialized") + + +opts = lkcef.AppOptions() +opts.dev_mode = True +opts.initialized_callback = _context_initialized +opts.framework_path = "/Users/theomonnom/livekit/agents/livekit-plugins/livekit-plugins-browser/cef/src/Debug/lkcef_app.app/Contents/Frameworks/Chromium Embedded Framework.framework" +opts.main_bundle_path = "/Users/theomonnom/livekit/agents/livekit-plugins/livekit-plugins-browser/cef/src/Debug/lkcef_app.app" +opts.subprocess_path = "/Users/theomonnom/livekit/agents/livekit-plugins/livekit-plugins-browser/cef/src/Debug/lkcef_app.app/Contents/Frameworks/lkcef Helper.app/Contents/MacOS/lkcef Helper" + +app = 
lkcef.BrowserApp(opts) +app.run() diff --git a/livekit-plugins/livekit-plugins-cartesia/CHANGELOG.md b/livekit-plugins/livekit-plugins-cartesia/CHANGELOG.md index 3f5c40b47..d92a10504 100644 --- a/livekit-plugins/livekit-plugins-cartesia/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-cartesia/CHANGELOG.md @@ -1,5 +1,18 @@ # livekit-plugins-cartesia +## 0.4.2 + +### Patch Changes + +- Add support for cartesia voice control - [#740](https://github.com/livekit/agents/pull/740) ([@bcherry](https://github.com/bcherry)) + +## 0.4.1 + +### Patch Changes + +- Switch Cartesia to a sentence tokenizer and keep the same context id throughout. - [#608](https://github.com/livekit/agents/pull/608) ([@keepingitneil](https://github.com/keepingitneil)) + Propagate segment_id through the basic sentence tokenizer + ## 0.3.0 ### Minor Changes diff --git a/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/models.py b/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/models.py index ca238356c..309448bdd 100644 --- a/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/models.py +++ b/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/models.py @@ -8,7 +8,34 @@ # "pcm_alaw", ] - TTSModels = Literal["sonic-english", "sonic-multilingual"] TTSLanguages = Literal["en", "es", "fr", "de", "pt", "zh", "ja"] TTSDefaultVoiceId = "c2ac25f9-ecc4-4f56-9095-651354df60c0" +TTSVoiceSpeed = Literal["fastest", "fast", "normal", "slow", "slowest"] +TTSVoiceEmotion = Literal[ + "anger:lowest", + "anger:low", + "anger", + "anger:high", + "anger:highest", + "positivity:lowest", + "positivity:low", + "positivity", + "positivity:high", + "positivity:highest", + "surprise:lowest", + "surprise:low", + "surprise", + "surprise:high", + "surprise:highest", + "sadness:lowest", + "sadness:low", + "sadness", + "sadness:high", + "sadness:highest", + "curiosity:lowest", + "curiosity:low", + "curiosity", + "curiosity:high", + "curiosity:highest", +] diff 
--git a/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/tts.py b/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/tts.py index 7a93a2ab6..42b830efb 100644 --- a/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/tts.py +++ b/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/tts.py @@ -25,7 +25,13 @@ from livekit.agents import tokenize, tts, utils from .log import logger -from .models import TTSDefaultVoiceId, TTSEncoding, TTSModels +from .models import ( + TTSDefaultVoiceId, + TTSEncoding, + TTSModels, + TTSVoiceEmotion, + TTSVoiceSpeed, +) API_AUTH_HEADER = "X-API-Key" API_VERSION_HEADER = "Cartesia-Version" @@ -41,6 +47,8 @@ class _TTSOptions: encoding: TTSEncoding sample_rate: int voice: str | list[float] + speed: TTSVoiceSpeed | float | None + emotion: list[TTSVoiceEmotion | str] | None api_key: str language: str @@ -53,10 +61,29 @@ def __init__( self, *, language: str = "en", encoding: TTSEncoding = "pcm_s16le", voice: str | list[float] = TTSDefaultVoiceId, + speed: TTSVoiceSpeed | float | None = None, + emotion: list[TTSVoiceEmotion | str] | None = None, sample_rate: int = 24000, api_key: str | None = None, http_session: aiohttp.ClientSession | None = None, ) -> None: + """ + Create a new instance of Cartesia TTS. + + See https://docs.cartesia.ai/reference/web-socket/stream-speech/stream-speech for more details on the Cartesia API. + + Args: + model (TTSModels, optional): The Cartesia TTS model to use. Defaults to "sonic-english". + language (str, optional): The language code for synthesis. Defaults to "en". + encoding (TTSEncoding, optional): The audio encoding format. Defaults to "pcm_s16le". + voice (str | list[float], optional): The voice ID or embedding array. 
+ speed (TTSVoiceSpeed | float, optional): Voice Control - Speed (https://docs.cartesia.ai/user-guides/voice-control) + emotion (list[TTSVoiceEmotion], optional): Voice Control - Emotion (https://docs.cartesia.ai/user-guides/voice-control) + sample_rate (int, optional): The audio sample rate in Hz. Defaults to 24000. + api_key (str, optional): The Cartesia API key. If not provided, it will be read from the CARTESIA_API_KEY environment variable. + http_session (aiohttp.ClientSession | None, optional): An existing aiohttp ClientSession to use. If not provided, a new session will be created. + """ + super().__init__( capabilities=tts.TTSCapabilities(streaming=True), sample_rate=sample_rate, @@ -73,6 +100,8 @@ def __init__( encoding=encoding, sample_rate=sample_rate, voice=voice, + speed=speed, + emotion=emotion, api_key=api_key, ) self._session = http_session @@ -268,6 +297,15 @@ def _to_cartesia_options(opts: _TTSOptions) -> dict[str, Any]: voice["mode"] = "embedding" voice["embedding"] = opts.voice + voice_controls: dict = {} + if opts.speed is not None: + voice_controls["speed"] = opts.speed + if opts.emotion is not None: + voice_controls["emotion"] = opts.emotion + + if voice_controls: + voice["__experimental_controls"] = voice_controls + return { "model_id": opts.model, "voice": voice, diff --git a/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/version.py b/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/version.py index 00a7bde1d..608b5cd5a 100644 --- a/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/version.py +++ b/livekit-plugins/livekit-plugins-cartesia/livekit/plugins/cartesia/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-__version__ = "0.3.0" +__version__ = "0.4.2" diff --git a/livekit-plugins/livekit-plugins-cartesia/package.json b/livekit-plugins/livekit-plugins-cartesia/package.json index 48a4aeb31..2bf74f9a9 100644 --- a/livekit-plugins/livekit-plugins-cartesia/package.json +++ b/livekit-plugins/livekit-plugins-cartesia/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-cartesia", "private": true, - "version": "0.3.0" + "version": "0.4.2" } diff --git a/livekit-plugins/livekit-plugins-clova/README.md b/livekit-plugins/livekit-plugins-clova/README.md new file mode 100644 index 000000000..013cb7fe4 --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/README.md @@ -0,0 +1,13 @@ +# LiveKit Plugins Clova + +Agent Framework plugin for [Clova](https://api.ncloud-docs.com/docs/)'s speech API. Currently supports speech-to-text. + +## Installation + +```bash +pip install livekit-plugins-clova +``` + +## Prerequisites + +You need an invoke URL and a secret key from the Naver Cloud Platform (Clova Speech), set as the environment variables `CLOVA_STT_INVOKE_URL` and `CLOVA_STT_SECRET_KEY`. diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/__init__.py b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/__init__.py new file mode 100644 index 000000000..d554599f0 --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/__init__.py @@ -0,0 +1,21 @@ +from .stt import STT +from .version import __version__ + +__all__ = [ + "STT", + "__version__", +] + + +from livekit.agents import Plugin + + +class ClovaSTTPlugin(Plugin): + def __init__(self): + super().__init__(__name__, __version__, __package__) + + def download_files(self): + pass + + +Plugin.register_plugin(ClovaSTTPlugin()) diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/common.py b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/common.py new file mode 100644 index 000000000..3418dd8bf --- /dev/null +++ 
b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/common.py @@ -0,0 +1,13 @@ +import io + +from pydub import AudioSegment + + +def resample_audio(audio_bytes, original_sample_rate, target_sample_rate): + resampled_audio = AudioSegment.from_raw( + io.BytesIO(audio_bytes), + sample_width=2, + frame_rate=original_sample_rate, + channels=1, + ).set_frame_rate(target_sample_rate) + return resampled_audio.raw_data diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/constants.py b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/constants.py new file mode 100644 index 000000000..ec109084f --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/constants.py @@ -0,0 +1,2 @@ +CLOVA_INPUT_SAMPLE_RATE = 16000 +LIVEKIT_INPUT_SAMPLE_RATE = 48000 diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/log.py b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/log.py new file mode 100644 index 000000000..e28e00f47 --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/log.py @@ -0,0 +1,3 @@ +import logging + +logger = logging.getLogger("livekit.plugins.clova") diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/models.py b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/models.py new file mode 100644 index 000000000..490ab9660 --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/models.py @@ -0,0 +1,17 @@ +from typing import Literal + +ClovaSttLanguages = Literal["ko-KR", "en-US", "enko", "ja", "zh-cn", "zh-tw"] + +ClovaSpeechAPIType = Literal[ + "recognizer/object-storage", "recognizer/url", "recognizer/upload" +] + +clova_languages_mapping = { + "en": "en-US", + "ko-KR": "ko-KR", + "en-US": "en-US", + "enko": "enko", + "ja": "ja", + "zh-cn": "zh-cn", + "zh-tw": "zh-tw", +} diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/stt.py 
b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/stt.py new file mode 100644 index 000000000..308aa8b15 --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/stt.py @@ -0,0 +1,132 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import io +import json +import os +import time +import wave +from typing import Optional, Union + +import aiohttp +from livekit.agents import stt, utils +from livekit.agents.stt import SpeechEventType, STTCapabilities +from livekit.agents.utils import AudioBuffer, merge_frames +from livekit.plugins.clova.constants import CLOVA_INPUT_SAMPLE_RATE + +from .common import resample_audio +from .log import logger +from .models import ClovaSpeechAPIType, ClovaSttLanguages, clova_languages_mapping + + +class STT(stt.STT): + def __init__( + self, + *, + language: ClovaSttLanguages = "en-US", + secret: Optional[str] = None, + invoke_url: Optional[str] = None, + http_session: Optional[aiohttp.ClientSession] = None, + threshold: float = 0.5, + ): + """ + Create a new instance of Clova STT. + + ``secret`` and ``invoke_url`` must be set, either using arguments or by setting the + ``CLOVA_STT_SECRET_KEY`` and ``CLOVA_STT_INVOKE_URL`` environmental variables, respectively. 
+ """ + + super().__init__( + capabilities=STTCapabilities(streaming=False, interim_results=True) + ) + self._secret = secret or os.environ.get("CLOVA_STT_SECRET_KEY") + self._invoke_url = invoke_url or os.environ.get("CLOVA_STT_INVOKE_URL") + self._language = clova_languages_mapping.get(language, language) + self._session = http_session + if self._secret is None: + raise ValueError( + "Clova STT secret key is required. It should be set with env CLOVA_STT_SECRET_KEY" + ) + if self._invoke_url is None: + raise ValueError( + "Clova STT invoke URL is required. It should be set with env CLOVA_STT_INVOKE_URL" + ) + self.threshold = threshold + + def _ensure_session(self) -> aiohttp.ClientSession: + if not self._session: + self._session = utils.http_context.http_session() + return self._session + + def url_builder( + self, process_method: ClovaSpeechAPIType = "recognizer/upload" + ) -> str: + return f"{self._invoke_url}/{process_method}" + + async def recognize( + self, + *, + buffer: AudioBuffer, + language: Union[ClovaSttLanguages, str, None] = None, + ) -> stt.SpeechEvent: + try: + url = self.url_builder() + payload = json.dumps({"language": self._language, "completion": "sync"}) + + buffer = merge_frames(buffer) + buffer_bytes = resample_audio( + buffer.data.tobytes(), buffer.sample_rate, CLOVA_INPUT_SAMPLE_RATE + ) + + io_buffer = io.BytesIO() + with wave.open(io_buffer, "wb") as wav: + wav.setnchannels(1) + wav.setsampwidth(2)  # 16-bit + wav.setframerate(CLOVA_INPUT_SAMPLE_RATE) + wav.writeframes(buffer_bytes) + io_buffer.seek(0) + + headers = {"X-CLOVASPEECH-API-KEY": self._secret} + form_data = aiohttp.FormData() + form_data.add_field("params", payload) + form_data.add_field( + "media", io_buffer, filename="audio.wav", content_type="audio/wav" + ) + start = time.time() + async with self._ensure_session().post( + url, data=form_data, headers=headers + ) as response: + response_data = await response.json() + end = time.time() + text = response_data.get("text") + confidence = response_data.get("confidence") + logger.info(f"{text} | {confidence} | total_seconds: {end - start}") + if not text or "error" 
in response_data: + raise ValueError(f"Unexpected response: {response_data}") + if confidence < self.threshold: + raise ValueError( + f"Confidence: {confidence} is below the threshold {self.threshold}. Skipping." + ) + logger.info(f"final event: {response_data}") + return self._transcription_to_speech_event(text=text) + except Exception as ex: + logger.error(f"{ex}") + return self._transcription_to_speech_event( + event_type=stt.SpeechEventType.FINAL_TRANSCRIPT, text="" + ) + + def _transcription_to_speech_event( + self, + event_type: SpeechEventType = stt.SpeechEventType.INTERIM_TRANSCRIPT, + text: str = "", + ) -> stt.SpeechEvent: + return stt.SpeechEvent( + type=event_type, + alternatives=[stt.SpeechData(text=text, language=self._language)], + ) diff --git a/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/version.py b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/version.py new file mode 100644 index 000000000..18b2a337f --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/livekit/plugins/clova/version.py @@ -0,0 +1,15 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +__version__ = "0.0.2" diff --git a/livekit-plugins/livekit-plugins-clova/pyproject.toml b/livekit-plugins/livekit-plugins-clova/pyproject.toml new file mode 100644 index 000000000..8cf32563a --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/pyproject.toml @@ -0,0 +1,3 @@ +[build-system] +requires = ["setuptools>=61.0"] +build-backend = "setuptools.build_meta" \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-clova/setup.py b/livekit-plugins/livekit-plugins-clova/setup.py new file mode 100644 index 000000000..b6bc2fd09 --- /dev/null +++ b/livekit-plugins/livekit-plugins-clova/setup.py @@ -0,0 +1,56 @@ +# Copyright 2023 LiveKit, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import pathlib + +import setuptools +import setuptools.command.build_py + +here = pathlib.Path(__file__).parent.resolve() +about = {} +with open(os.path.join(here, "livekit", "plugins", "clova", "version.py"), "r") as f: + exec(f.read(), about) + + +setuptools.setup( + name="livekit-plugins-clova", + version=about["__version__"], + description="LiveKit Agents Plugin for LINE Clova STT", + long_description=(here / "README.md").read_text(encoding="utf-8"), + long_description_content_type="text/markdown", + url="https://github.com/livekit/agents", + cmdclass={}, + classifiers=[ + "Intended Audience :: Developers", + "License :: OSI Approved :: Apache Software License", + "Topic :: Multimedia :: Sound/Audio", + "Topic :: Multimedia :: Video", + "Topic :: Scientific/Engineering :: Artificial Intelligence", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3 :: Only", + ], + keywords=["webrtc", "realtime", "audio", "video", "livekit"], + license="Apache-2.0", + packages=setuptools.find_namespace_packages(include=["livekit.*"]), + python_requires=">=3.9.0", + install_requires=["livekit-agents>=0.8.0.dev0", "pydub~=0.25.1"], + project_urls={ + "Documentation": "https://docs.livekit.io", + "Website": "https://livekit.io/", + "Source": "https://github.com/livekit/agents", + }, +) diff --git a/livekit-plugins/livekit-plugins-deepgram/CHANGELOG.md b/livekit-plugins/livekit-plugins-deepgram/CHANGELOG.md index f81ba4feb..c6b2dfe99 100644 --- a/livekit-plugins/livekit-plugins-deepgram/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-deepgram/CHANGELOG.md @@ -1,5 +1,25 @@ # livekit-plugins-deepgram +## 0.6.7 + +### Patch Changes + +- Only send actual audio to Deepgram using a basic audio RMS filter - [#738](https://github.com/livekit/agents/pull/738) ([@keepingitneil](https://github.com/keepingitneil)) + +- defaults to nova-2-general model - 
[#726](https://github.com/livekit/agents/pull/726) ([@davidzhao](https://github.com/davidzhao)) + +## 0.6.6 + +### Patch Changes + +- deepgram: switch the default model to phonecall - [#676](https://github.com/livekit/agents/pull/676) ([@theomonnom](https://github.com/theomonnom)) + +## 0.6.5 + +### Patch Changes + +- deepgram: fallback to nova-2-general when the language isn't supported - [#623](https://github.com/livekit/agents/pull/623) ([@theomonnom](https://github.com/theomonnom)) + ## 0.6.4 ### Patch Changes diff --git a/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py b/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py index c56a2d74b..b1d593abb 100644 --- a/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py +++ b/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/stt.py @@ -30,6 +30,7 @@ from .log import logger from .models import DeepgramLanguages, DeepgramModels +from .utils import BasicAudioEnergyFilter BASE_URL = "https://api.deepgram.com/v1/listen" BASE_URL_WS = "wss://api.deepgram.com/v1/listen" @@ -55,7 +56,7 @@ class STT(stt.STT): def __init__( self, *, - model: DeepgramModels = "nova-2-conversationalai", + model: DeepgramModels = "nova-2-general", language: DeepgramLanguages = "en-US", detect_language: bool = False, interim_results: bool = True, @@ -68,6 +69,13 @@ def __init__( api_key: str | None = None, http_session: aiohttp.ClientSession | None = None, ) -> None: + """ + Create a new instance of Deepgram STT. + + ``api_key`` must be set to your Deepgram API key, either using the argument or by setting + the ``DEEPGRAM_API_KEY`` environmental variable. 
+ """ + super().__init__( capabilities=stt.STTCapabilities( streaming=True, interim_results=interim_results @@ -78,7 +86,7 @@ def __init__( if api_key is None: raise ValueError("Deepgram API key is required") - if (language != "en-US" or language != "en") and model in ( + if language not in ("en-US", "en") and model in ( "nova-2-meeting", "nova-2-phonecall", "nova-2-finance", @@ -193,6 +201,7 @@ def __init__( self._session = http_session self._speaking = False self._max_retry = max_retry + self._audio_energy_filter = BasicAudioEnergyFilter(cooldown_seconds=1) @utils.log_exceptions(logger=logger) async def _main_task(self) -> None: @@ -284,10 +293,12 @@ async def send_task(): if isinstance(data, self._FlushSentinel): frames = audio_bstream.flush() else: - frames = audio_bstream.write(data.data) + frames = audio_bstream.write(data.data.tobytes()) for frame in frames: - await ws.send_bytes(frame.data.tobytes()) + has_audio = self._audio_energy_filter.push_frame(frame) + if has_audio: + await ws.send_bytes(frame.data.tobytes()) # tell deepgram we are done sending audio/inputs closing_ws = True diff --git a/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/utils.py b/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/utils.py new file mode 100644 index 000000000..c9c9ee452 --- /dev/null +++ b/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/utils.py @@ -0,0 +1,27 @@ +import numpy as np +from livekit import rtc + +# This is the magic number during testing that we use to determine if a frame is loud enough +# to possibly contain speech. It's very conservative. 
+MAGIC_NUMBER_THRESHOLD = 0.004 + + +class BasicAudioEnergyFilter: + def __init__(self, *, cooldown_seconds: float = 1): + self._cooldown_seconds = cooldown_seconds + self._cooldown = cooldown_seconds + + def push_frame(self, frame: rtc.AudioFrame) -> bool: + arr = np.frombuffer(frame.data, dtype=np.int16) + float_arr = arr.astype(np.float32) / 32768.0 + rms = np.sqrt(np.mean(np.square(float_arr))) + if rms > MAGIC_NUMBER_THRESHOLD: + self._cooldown = self._cooldown_seconds + return True + + duration_seconds = frame.samples_per_channel / frame.sample_rate + self._cooldown -= duration_seconds + if self._cooldown > 0: + return True + + return False diff --git a/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/version.py b/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/version.py index 4f1df5fb6..9aacd12fa 100644 --- a/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/version.py +++ b/livekit-plugins/livekit-plugins-deepgram/livekit/plugins/deepgram/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. 
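The energy filter added in `utils.py` above gates Deepgram input on a conservative RMS threshold with a cooldown "hangover". A hedged standalone sketch of the same idea, assuming 16-bit mono PCM bytes instead of `rtc.AudioFrame` (frame size and sample rate below are illustrative, not the plugin's defaults):

```python
import numpy as np

MAGIC_NUMBER_THRESHOLD = 0.004  # same conservative RMS threshold as the plugin


class EnergyGate:
    """Standalone sketch of the cooldown-based RMS gate from the diff above."""

    def __init__(self, cooldown_seconds: float = 1.0, sample_rate: int = 16000):
        self._cooldown_seconds = cooldown_seconds
        self._cooldown = cooldown_seconds
        self._sample_rate = sample_rate

    def push(self, pcm16: bytes) -> bool:
        # normalize int16 samples to [-1, 1] and compute RMS energy
        arr = np.frombuffer(pcm16, dtype=np.int16).astype(np.float32) / 32768.0
        rms = float(np.sqrt(np.mean(np.square(arr))))
        if rms > MAGIC_NUMBER_THRESHOLD:
            self._cooldown = self._cooldown_seconds  # speech: reset the hangover
            return True
        # silence: keep passing frames until the cooldown drains, so we don't
        # clip the tail of an utterance
        self._cooldown -= len(arr) / self._sample_rate
        return self._cooldown > 0
```

A loud frame passes immediately and rearms the cooldown; silent frames keep passing until `cooldown_seconds` of silence have elapsed, after which frames are dropped.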
-__version__ = "0.6.4" +__version__ = "0.6.7" diff --git a/livekit-plugins/livekit-plugins-deepgram/package.json b/livekit-plugins/livekit-plugins-deepgram/package.json index d28dfabe7..d317146ed 100644 --- a/livekit-plugins/livekit-plugins-deepgram/package.json +++ b/livekit-plugins/livekit-plugins-deepgram/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-deepgram", "private": true, - "version": "0.6.4" + "version": "0.6.7" } diff --git a/livekit-plugins/livekit-plugins-deepgram/setup.py b/livekit-plugins/livekit-plugins-deepgram/setup.py index 37b739565..98a4b82ba 100644 --- a/livekit-plugins/livekit-plugins-deepgram/setup.py +++ b/livekit-plugins/livekit-plugins-deepgram/setup.py @@ -47,7 +47,7 @@ license="Apache-2.0", packages=setuptools.find_namespace_packages(include=["livekit.*"]), python_requires=">=3.9.0", - install_requires=["livekit-agents>=0.8.0.dev0"], + install_requires=["livekit-agents>=0.8.0", "numpy~=1.21"], package_data={"livekit.plugins.deepgram": ["py.typed"]}, project_urls={ "Documentation": "https://docs.livekit.io", diff --git a/livekit-plugins/livekit-plugins-elevenlabs/CHANGELOG.md b/livekit-plugins/livekit-plugins-elevenlabs/CHANGELOG.md index d78e8d7d0..aabc2cbac 100644 --- a/livekit-plugins/livekit-plugins-elevenlabs/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-elevenlabs/CHANGELOG.md @@ -1,5 +1,19 @@ # livekit-plugins-elevenlabs +## 0.7.5 + +### Patch Changes + +- avoid returning tiny frames from TTS - [#747](https://github.com/livekit/agents/pull/747) ([@theomonnom](https://github.com/theomonnom)) + +- 11labs: send phoneme in one entire xml chunk - [#766](https://github.com/livekit/agents/pull/766) ([@theomonnom](https://github.com/theomonnom)) + +## 0.7.4 + +### Patch Changes + +- elevenlabs: expose enable_ssml_parsing - [#723](https://github.com/livekit/agents/pull/723) ([@theomonnom](https://github.com/theomonnom)) + ## 0.7.3 ### Patch Changes diff --git 
a/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.py b/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.py index 72b2490a0..a1907cdf6 100644 --- a/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.py +++ b/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.py @@ -86,6 +86,7 @@ class _TTSOptions: streaming_latency: int word_tokenizer: tokenize.WordTokenizer chunk_length_schedule: list[int] + enable_ssml_parsing: bool class TTS(tts.TTS): @@ -101,9 +102,17 @@ def __init__( word_tokenizer: tokenize.WordTokenizer = tokenize.basic.WordTokenizer( ignore_punctuation=False # punctuation can help for intonation ), + enable_ssml_parsing: bool = False, chunk_length_schedule: list[int] = [80, 120, 200, 260], # range is [50, 500] http_session: aiohttp.ClientSession | None = None, ) -> None: + """ + Create a new instance of ElevenLabs TTS. + + ``api_key`` must be set to your ElevenLabs API key, either using the argument or by setting + the ``ELEVEN_API_KEY`` environmental variable. 
+ """ + super().__init__( capabilities=tts.TTSCapabilities( streaming=True, @@ -125,6 +134,7 @@ def __init__( streaming_latency=streaming_latency, word_tokenizer=word_tokenizer, chunk_length_schedule=chunk_length_schedule, + enable_ssml_parsing=enable_ssml_parsing, ) self._session = http_session @@ -187,17 +197,19 @@ async def _main_task(self) -> None: content = await resp.text() logger.error("11labs returned non-audio data: %s", content) return + encoding = _encoding_from_format(self._opts.encoding) if encoding == "mp3": async for bytes_data, _ in resp.content.iter_chunks(): for frame in self._mp3_decoder.decode_chunk(bytes_data): - self._event_ch.send_nowait( - tts.SynthesizedAudio( - request_id=request_id, - segment_id=segment_id, - frame=frame, + for frame in bstream.write(frame.data.tobytes()): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, + segment_id=segment_id, + frame=frame, + ) ) - ) else: async for bytes_data, _ in resp.content.iter_chunks(): for frame in bstream.write(bytes_data): @@ -209,12 +221,12 @@ async def _main_task(self) -> None: ) ) - for frame in bstream.flush(): - self._event_ch.send_nowait( - tts.SynthesizedAudio( - request_id=request_id, segment_id=segment_id, frame=frame - ) + for frame in bstream.flush(): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, segment_id=segment_id, frame=frame ) + ) class SynthesizeStream(tts.SynthesizeStream): @@ -313,15 +325,34 @@ async def _run_ws( async def send_task(): nonlocal eos_sent + xml_content = [] async for data in word_stream: + text = data.token + + # send the xml phoneme in one go + if ( + self._opts.enable_ssml_parsing + and data.token.startswith("") > -1: + text = self._opts.word_tokenizer.format_words(xml_content) + xml_content = [] + else: + continue + # try_trigger_generation=True is a bad practice, we expose # chunk_length_schedule instead data_pkt = dict( - text=f"{data.token} ", # must always end with a space + text=f"{text} 
", # must always end with a space try_trigger_generation=False, ) await ws_conn.send_str(json.dumps(data_pkt)) + if xml_content: + logger.warning("11labs stream ended with incomplete xml content") + # no more token, mark eos eos_pkt = dict(text="") await ws_conn.send_str(json.dumps(eos_pkt)) @@ -434,7 +465,9 @@ def _stream_url(opts: _TTSOptions) -> str: model_id = opts.model_id output_format = opts.encoding latency = opts.streaming_latency + enable_ssml = str(opts.enable_ssml_parsing).lower() return ( f"{base_url}/text-to-speech/{voice_id}/stream-input?" - f"model_id={model_id}&output_format={output_format}&optimize_streaming_latency={latency}" + f"model_id={model_id}&output_format={output_format}&optimize_streaming_latency={latency}&" + f"enable_ssml_parsing={enable_ssml}" ) diff --git a/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/version.py b/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/version.py index 20d8a2226..7bd26ee36 100644 --- a/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/version.py +++ b/livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-__version__ = "0.7.3" +__version__ = "0.7.5" diff --git a/livekit-plugins/livekit-plugins-elevenlabs/package.json b/livekit-plugins/livekit-plugins-elevenlabs/package.json index 16ebaf330..78fe20504 100644 --- a/livekit-plugins/livekit-plugins-elevenlabs/package.json +++ b/livekit-plugins/livekit-plugins-elevenlabs/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-elevenlabs", "private": true, - "version": "0.7.3" + "version": "0.7.5" } diff --git a/livekit-plugins/livekit-plugins-google/CHANGELOG.md b/livekit-plugins/livekit-plugins-google/CHANGELOG.md index 9977a53a0..7a187de9a 100644 --- a/livekit-plugins/livekit-plugins-google/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-google/CHANGELOG.md @@ -1,5 +1,27 @@ # livekit-plugins-google +## 0.7.1 + +### Patch Changes + +- avoid returning tiny frames from TTS - [#747](https://github.com/livekit/agents/pull/747) ([@theomonnom](https://github.com/theomonnom)) + +## 0.7.0 + +### Minor Changes + +- Enable use of Google STT with Application Default Credentials. - [#721](https://github.com/livekit/agents/pull/721) ([@rsinnet](https://github.com/rsinnet)) + +### Patch Changes + +- google-tts: ignore wav header - [#703](https://github.com/livekit/agents/pull/703) ([@theomonnom](https://github.com/theomonnom)) + +## 0.6.3 + +### Patch Changes + +- Fix Google STT exception when no valid speech is recognized - [#680](https://github.com/livekit/agents/pull/680) ([@davidzhao](https://github.com/davidzhao)) + ## 0.6.2 ### Patch Changes diff --git a/livekit-plugins/livekit-plugins-google/README.md b/livekit-plugins/livekit-plugins-google/README.md index 746e94473..b0fffb41e 100644 --- a/livekit-plugins/livekit-plugins-google/README.md +++ b/livekit-plugins/livekit-plugins-google/README.md @@ -10,4 +10,4 @@ pip install livekit-plugins-google ## Pre-requisites -For credentials, you'll need a Google Cloud account and obtain the correct credentials. 
Credentials can be passed directly or set as [GOOGLE_APPLICATION_CREDENTIALS](https://cloud.google.com/docs/authentication/application-default-credentials) environment variable. +For credentials, you'll need a Google Cloud account and obtain the correct credentials. Credentials can be passed directly or via Application Default Credentials as specified in [How Application Default Credentials works](https://cloud.google.com/docs/authentication/application-default-credentials). diff --git a/livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py b/livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py index 4946a1de9..afff6f93a 100644 --- a/livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py +++ b/livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py @@ -16,13 +16,14 @@ import asyncio import dataclasses -import os from dataclasses import dataclass from typing import AsyncIterable, List, Union from livekit import agents, rtc from livekit.agents import stt, utils +from google.auth import default as gauth_default +from google.auth.exceptions import DefaultCredentialsError from google.cloud.speech_v2 import SpeechAsyncClient from google.cloud.speech_v2.types import cloud_speech @@ -58,8 +59,11 @@ def __init__( credentials_file: str | None = None, ): """ - if no credentials is provided, it will use the credentials on the environment - GOOGLE_APPLICATION_CREDENTIALS (default behavior of Google SpeechAsyncClient) + Create a new instance of Google STT. 
+ + Credentials must be provided, either by using the ``credentials_info`` dict, or reading + from the file specified in ``credentials_file`` or via Application Default Credentials as + described in https://cloud.google.com/docs/authentication/application-default-credentials """ super().__init__( capabilities=stt.STTCapabilities(streaming=True, interim_results=True) @@ -70,10 +74,13 @@ def __init__( self._credentials_file = credentials_file if credentials_file is None and credentials_info is None: - creds = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS") - if not creds: + try: + gauth_default() + except DefaultCredentialsError: raise ValueError( - "GOOGLE_APPLICATION_CREDENTIALS must be set if no credentials is provided" + "Application default credentials must be available " + "when using Google STT without explicitly passing " + "credentials through credentials_info or credentials_file." ) if isinstance(languages, str): @@ -109,7 +116,12 @@ def _recognizer(self) -> str: # recognizers may improve latency https://cloud.google.com/speech-to-text/v2/docs/recognizers#understand_recognizers # TODO(theomonnom): find a better way to access the project_id - project_id = self._ensure_client().transport._credentials.project_id # type: ignore + try: + project_id = self._ensure_client().transport._credentials.project_id # type: ignore + except AttributeError: + from google.auth import default as ga_default + + _, project_id = ga_default() return f"projects/{project_id}/locations/global/recognizers/_" def _sanitize_options(self, *, language: str | None = None) -> STTOptions: @@ -278,22 +290,22 @@ async def _run_stream( == cloud_speech.StreamingRecognizeResponse.SpeechEventType.SPEECH_EVENT_TYPE_UNSPECIFIED ): result = resp.results[0] + speech_data = _streaming_recognize_response_to_speech_data(resp) + if speech_data is None: + continue + if not result.is_final: self._event_ch.send_nowait( stt.SpeechEvent( type=stt.SpeechEventType.INTERIM_TRANSCRIPT, - alternatives=[ - 
_streaming_recognize_response_to_speech_data(resp) - ], + alternatives=[speech_data], ) ) else: self._event_ch.send_nowait( stt.SpeechEvent( type=stt.SpeechEventType.FINAL_TRANSCRIPT, - alternatives=[ - _streaming_recognize_response_to_speech_data(resp) - ], + alternatives=[speech_data], ) ) @@ -337,16 +349,21 @@ def _recognize_response_to_speech_event( def _streaming_recognize_response_to_speech_data( resp: cloud_speech.StreamingRecognizeResponse, -) -> stt.SpeechData: +) -> stt.SpeechData | None: text = "" confidence = 0.0 for result in resp.results: + if len(result.alternatives) == 0: + continue text += result.alternatives[0].transcript confidence += result.alternatives[0].confidence confidence /= len(resp.results) lg = resp.results[0].language_code + if text == "": + return None + data = stt.SpeechData( language=lg, start_time=0, end_time=0, confidence=confidence, text=text ) diff --git a/livekit-plugins/livekit-plugins-google/livekit/plugins/google/tts.py b/livekit-plugins/livekit-plugins-google/livekit/plugins/google/tts.py index 433ec84d2..f6fdb23e1 100644 --- a/livekit-plugins/livekit-plugins-google/livekit/plugins/google/tts.py +++ b/livekit-plugins/livekit-plugins-google/livekit/plugins/google/tts.py @@ -51,9 +51,13 @@ def __init__( credentials_file: str | None = None, ) -> None: """ - if no credentials is provided, it will use the credentials on the environment - GOOGLE_APPLICATION_CREDENTIALS (default behavior of Google TextToSpeechAsyncClient) + Create a new instance of Google TTS. + + Credentials must be provided, either by using the ``credentials_info`` dict, or reading + from the file specified in ``credentials_file`` or the ``GOOGLE_APPLICATION_CREDENTIALS`` + environmental variable. 
""" + super().__init__( capabilities=tts.TTSCapabilities( streaming=False, @@ -137,13 +141,25 @@ async def _main_task(self) -> None: data = response.audio_content if self._opts.audio_config.audio_encoding == "mp3": decoder = utils.codecs.Mp3StreamDecoder() + bstream = utils.audio.AudioByteStream( + sample_rate=self._opts.audio_config.sample_rate_hertz, num_channels=1 + ) for frame in decoder.decode_chunk(data): + for frame in bstream.write(frame.data): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, segment_id=segment_id, frame=frame + ) + ) + + for frame in bstream.flush(): self._event_ch.send_nowait( tts.SynthesizedAudio( request_id=request_id, segment_id=segment_id, frame=frame ) ) else: + data = data[44:] # skip WAV header self._event_ch.send_nowait( tts.SynthesizedAudio( request_id=request_id, diff --git a/livekit-plugins/livekit-plugins-google/livekit/plugins/google/version.py b/livekit-plugins/livekit-plugins-google/livekit/plugins/google/version.py index 61bb6ddc4..947379190 100644 --- a/livekit-plugins/livekit-plugins-google/livekit/plugins/google/version.py +++ b/livekit-plugins/livekit-plugins-google/livekit/plugins/google/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-__version__ = "0.6.2" +__version__ = "0.7.1" diff --git a/livekit-plugins/livekit-plugins-google/package.json b/livekit-plugins/livekit-plugins-google/package.json index b837c6a0f..96c90e560 100644 --- a/livekit-plugins/livekit-plugins-google/package.json +++ b/livekit-plugins/livekit-plugins-google/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-google", "private": true, - "version": "0.6.2" + "version": "0.7.1" } diff --git a/livekit-plugins/livekit-plugins-google/setup.py b/livekit-plugins/livekit-plugins-google/setup.py index b3d601e02..02441a882 100644 --- a/livekit-plugins/livekit-plugins-google/setup.py +++ b/livekit-plugins/livekit-plugins-google/setup.py @@ -48,6 +48,7 @@ packages=setuptools.find_namespace_packages(include=["livekit.*"]), python_requires=">=3.9.0", install_requires=[ + "google-auth >= 2, < 3", "google-cloud-speech >= 2, < 3", "google-cloud-texttospeech >= 2, < 3", "livekit-agents>=0.8.0.dev0", diff --git a/livekit-plugins/livekit-plugins-nltk/CHANGELOG.md b/livekit-plugins/livekit-plugins-nltk/CHANGELOG.md index a5e977792..6ee2124fe 100644 --- a/livekit-plugins/livekit-plugins-nltk/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-nltk/CHANGELOG.md @@ -1,5 +1,17 @@ # livekit-plugins-nltk +## 0.7.2 + +### Patch Changes + +- fix another semver break - [#659](https://github.com/livekit/agents/pull/659) ([@theomonnom](https://github.com/theomonnom)) + +## 0.7.1 + +### Patch Changes + +- Revert "nltk: fix broken punkt download" - [#630](https://github.com/livekit/agents/pull/630) ([@theomonnom](https://github.com/theomonnom)) + ## 0.7.0 ### Minor Changes diff --git a/livekit-plugins/livekit-plugins-nltk/livekit/plugins/nltk/version.py b/livekit-plugins/livekit-plugins-nltk/livekit/plugins/nltk/version.py index 6d6d0deb7..d40c15247 100644 --- a/livekit-plugins/livekit-plugins-nltk/livekit/plugins/nltk/version.py +++ b/livekit-plugins/livekit-plugins-nltk/livekit/plugins/nltk/version.py @@ -12,4 +12,4 @@ # See the License for the 
specific language governing permissions and # limitations under the License. -__version__ = "0.7.0" +__version__ = "0.7.2" diff --git a/livekit-plugins/livekit-plugins-nltk/package.json b/livekit-plugins/livekit-plugins-nltk/package.json index f7bd7b3b2..66a8eb3fa 100644 --- a/livekit-plugins/livekit-plugins-nltk/package.json +++ b/livekit-plugins/livekit-plugins-nltk/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-nltk", "private": true, - "version": "0.7.0" + "version": "0.7.2" } diff --git a/livekit-plugins/livekit-plugins-nltk/setup.py b/livekit-plugins/livekit-plugins-nltk/setup.py index 3f1307ba3..49ce4a921 100644 --- a/livekit-plugins/livekit-plugins-nltk/setup.py +++ b/livekit-plugins/livekit-plugins-nltk/setup.py @@ -46,7 +46,7 @@ license="Apache-2.0", packages=setuptools.find_namespace_packages(include=["livekit.*"]), python_requires=">=3.9.0", - install_requires=["livekit-agents>=0.8.0.dev0", "nltk >= 3.8.2, < 4"], + install_requires=["livekit-agents>=0.8.0.dev0", "nltk >= 3.9.1, < 4"], package_data={"livekit.plugins.nltk": ["py.typed"]}, project_urls={ "Documentation": "https://docs.livekit.io", diff --git a/livekit-plugins/livekit-plugins-openai/CHANGELOG.md b/livekit-plugins/livekit-plugins-openai/CHANGELOG.md index f7e55bef2..686c38acf 100644 --- a/livekit-plugins/livekit-plugins-openai/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-openai/CHANGELOG.md @@ -1,5 +1,40 @@ # livekit-plugins-openai +## 0.8.4 + +### Patch Changes + +- avoid returning tiny frames from TTS - [#747](https://github.com/livekit/agents/pull/747) ([@theomonnom](https://github.com/theomonnom)) + +- Fixing Assistant API Vision Capabilities - [#771](https://github.com/livekit/agents/pull/771) ([@keepingitneil](https://github.com/keepingitneil)) + +## 0.8.3 + +### Patch Changes + +- Introduce function calling to OpenAI Assistants - [#710](https://github.com/livekit/agents/pull/710) ([@keepingitneil](https://github.com/keepingitneil)) + +- Add Cerebras to OpenAI Plugin - 
[#731](https://github.com/livekit/agents/pull/731) ([@henrytwo](https://github.com/henrytwo)) + +## 0.8.2 + +### Patch Changes + +- Add deepseek LLMs at OpenAI plugin - [#714](https://github.com/livekit/agents/pull/714) ([@lenage](https://github.com/lenage)) + +- skip processing of choice.delta when it is None - [#705](https://github.com/livekit/agents/pull/705) ([@theomonnom](https://github.com/theomonnom)) + +## 0.8.1 + +### Patch Changes + +- add support for Ollama, Perplexity, Fireworks, Octo, Together, and Groq LLMs through the OpenAI API - [#611](https://github.com/livekit/agents/pull/611) ([@nbsp](https://github.com/nbsp)) + +- allow sending user IDs - [#633](https://github.com/livekit/agents/pull/633) ([@nbsp](https://github.com/nbsp)) + +- Support OpenAI Assistants API as a beta feature under `livekit.plugins.openai.beta` - [#601](https://github.com/livekit/agents/pull/601) ([@keepingitneil](https://github.com/keepingitneil)) + Add \_metadata to ChatCtx and ChatMessage which can be used (in the case of OpenAI assistants) for bookkeeping to sync local state with remote OpenAI state + ## 0.8.0 ### Minor Changes diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/__init__.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/__init__.py index e0fa12e4b..a09f09fb9 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/__init__.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/__init__.py @@ -12,6 +12,8 @@ # See the License for the specific language governing permissions and # limitations under the License. + +from .
import beta from .embeddings import EmbeddingData, create_embeddings from .llm import LLM, LLMStream from .models import TTSModels, TTSVoices, WhisperModels @@ -25,6 +27,7 @@ "LLM", "LLMStream", "WhisperModels", + "beta", "TTSModels", "TTSVoices", "create_embeddings", diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/README.md b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/README.md new file mode 100644 index 000000000..99827b787 --- /dev/null +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/README.md @@ -0,0 +1,78 @@ +# OpenAI Beta Features + +## Assistants API + +Example usage: + +```python +import asyncio + +from dotenv import load_dotenv +from livekit import rtc +from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli, llm +from livekit.agents.voice_assistant import VoiceAssistant +from livekit.plugins import deepgram, openai, silero +from livekit.plugins.openai.beta import ( + AssistantCreateOptions, + AssistantLLM, + AssistantOptions, + OnFileUploadedInfo +) + +load_dotenv() + + +async def entrypoint(ctx: JobContext): + initial_ctx = llm.ChatContext() + + await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY) + + # When using vision capabilities, files are uploaded. + # It's up to you to remove them if desired or otherwise manage + # them going forward. + def on_file_uploaded(info: OnFileUploadedInfo): + pass + + assistant = VoiceAssistant( + vad=silero.VAD.load(), + stt=deepgram.STT(), + llm=AssistantLLM( + assistant_opts=AssistantOptions( + create_options=AssistantCreateOptions( + model="gpt-4o", + instructions="You are a voice assistant created by LiveKit.
Your interface with users will be voice.", + name="KITT", + ) + ) + ), + tts=openai.TTS(), + chat_ctx=initial_ctx, + on_file_uploaded=on_file_uploaded, + ) + assistant.start(ctx.room) + + # listen to incoming chat messages, only required if you'd like the agent to + # answer incoming messages from Chat + chat = rtc.ChatManager(ctx.room) + + async def answer_from_text(txt: str): + chat_ctx = assistant.chat_ctx.copy() + chat_ctx.append(role="user", text=txt) + stream = assistant.llm.chat(chat_ctx=chat_ctx) + await assistant.say(stream) + + @chat.on("message_received") + def on_chat_received(msg: rtc.ChatMessage): + if msg.message: + asyncio.create_task(answer_from_text(msg.message)) + + await asyncio.sleep(1) + await assistant.say("Hey, how can I help you today?", allow_interruptions=True) + + +if __name__ == "__main__": + cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) +``` + +## TODO +- tool calling \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/__init__.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/__init__.py new file mode 100644 index 000000000..f062606fb --- /dev/null +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/__init__.py @@ -0,0 +1,17 @@ +from .assistant_llm import ( + AssistantCreateOptions, + AssistantLLM, + AssistantLoadOptions, + AssistantOptions, + OnFileUploaded, + OnFileUploadedInfo, +) + +__all__ = [ + "AssistantLLM", + "AssistantOptions", + "AssistantCreateOptions", + "AssistantLoadOptions", + "OnFileUploaded", + "OnFileUploadedInfo", +] diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/assistant_llm.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/assistant_llm.py new file mode 100644 index 000000000..01fd60bc4 --- /dev/null +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/beta/assistant_llm.py @@ -0,0 +1,590 @@ +# Copyright 2023 LiveKit, Inc.
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import asyncio +import json +import uuid +from dataclasses import dataclass +from typing import Any, Callable, Dict, Literal, MutableSet, Union + +import httpx +from livekit import rtc +from livekit.agents import llm, utils + +from openai import AsyncAssistantEventHandler, AsyncClient +from openai.types.beta.threads import Text, TextDelta +from openai.types.beta.threads.run_create_params import AdditionalMessage +from openai.types.beta.threads.run_submit_tool_outputs_params import ToolOutput +from openai.types.beta.threads.runs import ( + CodeInterpreterToolCall, + FileSearchToolCall, + FunctionToolCall, + ToolCall, +) +from openai.types.file_object import FileObject + +from ..log import logger +from ..models import ChatModels + +DEFAULT_MODEL = "gpt-4o" +OPENAI_MESSAGE_ID_KEY = "__openai_message_id__" +LIVEKIT_MESSAGE_ID_KEY = "__livekit_message_id__" +OPENAI_MESSAGES_ADDED_KEY = "__openai_messages_added__" +OPENAI_FILE_ID_KEY = "__openai_file_id__" + + +@dataclass +class LLMOptions: + model: str | ChatModels + + +@dataclass +class AssistantOptions: + """Options for creating (on-the-fly) or loading an assistant. 
Only one of create_options or load_options should be set.""" + + create_options: AssistantCreateOptions | None = None + load_options: AssistantLoadOptions | None = None + + +@dataclass +class AssistantCreateOptions: + name: str + instructions: str + model: ChatModels + temperature: float | None = None + # TODO: when we implement code_interpreter and file_search tools + # tool_resources: ToolResources | None = None + # tools: list[AssistantTools] = field(default_factory=list) + + +@dataclass +class AssistantLoadOptions: + assistant_id: str + thread_id: str | None + + +@dataclass +class OnFileUploadedInfo: + type: Literal["image"] + original_file: llm.ChatImage + openai_file_object: FileObject + + +OnFileUploaded = Callable[[OnFileUploadedInfo], None] + + +class AssistantLLM(llm.LLM): + def __init__( + self, + *, + assistant_opts: AssistantOptions, + client: AsyncClient | None = None, + api_key: str | None = None, + base_url: str | None = None, + on_file_uploaded: OnFileUploaded | None = None, + ) -> None: + test_ctx = llm.ChatContext() + if not hasattr(test_ctx, "_metadata"): + raise Exception( + "This beta feature of 'livekit-plugins-openai' requires a newer version of 'livekit-agents'" + ) + self._client = client or AsyncClient( + api_key=api_key, + base_url=base_url, + http_client=httpx.AsyncClient( + timeout=httpx.Timeout(timeout=30, connect=10, read=5, pool=5), + follow_redirects=True, + limits=httpx.Limits( + max_connections=1000, + max_keepalive_connections=100, + keepalive_expiry=120, + ), + ), + ) + self._assistant_opts = assistant_opts + self._running_fncs: MutableSet[asyncio.Task[Any]] = set() + self._on_file_uploaded = on_file_uploaded + self._tool_call_run_id_lookup = dict[str, str]() + self._submitted_tool_calls = set[str]() + + self._sync_openai_task: asyncio.Task[AssistantLoadOptions] | None = None + try: + self._sync_openai_task = asyncio.create_task(self._sync_openai()) + except Exception: + logger.error( + "failed to create sync openai task. 
This can happen when instantiating without a running asyncio event loop (such as when running tests)" + ) + self._done_futures = list[asyncio.Future[None]]() + + async def _sync_openai(self) -> AssistantLoadOptions: + if self._assistant_opts.create_options: + kwargs: Dict[str, Any] = { + "model": self._assistant_opts.create_options.model, + "name": self._assistant_opts.create_options.name, + "instructions": self._assistant_opts.create_options.instructions, + # "tools": [ + # {"type": t} for t in self._assistant_opts.create_options.tools + # ], + # "tool_resources": self._assistant_opts.create_options.tool_resources, + } + # TODO when we implement code_interpreter and file_search tools + # if self._assistant_opts.create_options.tool_resources: + # kwargs["tool_resources"] = ( + # self._assistant_opts.create_options.tool_resources + # ) + if self._assistant_opts.create_options.temperature: + kwargs["temperature"] = self._assistant_opts.create_options.temperature + assistant = await self._client.beta.assistants.create(**kwargs) + + thread = await self._client.beta.threads.create() + return AssistantLoadOptions(assistant_id=assistant.id, thread_id=thread.id) + elif self._assistant_opts.load_options: + if not self._assistant_opts.load_options.thread_id: + thread = await self._client.beta.threads.create() + self._assistant_opts.load_options.thread_id = thread.id + return self._assistant_opts.load_options + + raise Exception("One of create_options or load_options must be set") + + def chat( + self, + *, + chat_ctx: llm.ChatContext, + fnc_ctx: llm.FunctionContext | None = None, + temperature: float | None = None, + n: int | None = None, + parallel_tool_calls: bool | None = None, + ): + if n is not None: + logger.warning("OpenAI Assistants does not support the 'n' parameter") + + if parallel_tool_calls is not None: + logger.warning( + "OpenAI Assistants does not support the 'parallel_tool_calls' parameter" + ) + + if not self._sync_openai_task: + self._sync_openai_task = 
asyncio.create_task(self._sync_openai()) + + return AssistantLLMStream( + temperature=temperature, + assistant_llm=self, + sync_openai_task=self._sync_openai_task, + client=self._client, + chat_ctx=chat_ctx, + fnc_ctx=fnc_ctx, + on_file_uploaded=self._on_file_uploaded, + ) + + async def _register_tool_call(self, tool_call_id: str, run_id: str) -> None: + self._tool_call_run_id_lookup[tool_call_id] = run_id + + async def _submit_tool_call_result(self, tool_call_id: str, result: str) -> None: + if tool_call_id in self._submitted_tool_calls: + return + logger.debug(f"submitting tool call {tool_call_id} result") + run_id = self._tool_call_run_id_lookup.get(tool_call_id) + if not run_id: + logger.error(f"tool call {tool_call_id} not found") + return + + if not self._sync_openai_task: + logger.error("sync_openai_task not set") + return + + thread_id = (await self._sync_openai_task).thread_id + if not thread_id: + logger.error("thread_id not set") + return + tool_output = ToolOutput(output=result, tool_call_id=tool_call_id) + await self._client.beta.threads.runs.submit_tool_outputs_and_poll( + tool_outputs=[tool_output], run_id=run_id, thread_id=thread_id + ) + self._submitted_tool_calls.add(tool_call_id) + logger.debug(f"submitted tool call {tool_call_id} result") + + +class AssistantLLMStream(llm.LLMStream): + class EventHandler(AsyncAssistantEventHandler): + def __init__( + self, + llm: AssistantLLM, + llm_stream: AssistantLLMStream, + output_queue: asyncio.Queue[llm.ChatChunk | Exception | None], + chat_ctx: llm.ChatContext, + fnc_ctx: llm.FunctionContext | None = None, + ): + super().__init__() + self._llm = llm + self._llm_stream = llm_stream + self._chat_ctx = chat_ctx + self._output_queue = output_queue + self._fnc_ctx = fnc_ctx + + async def on_text_delta(self, delta: TextDelta, snapshot: Text): + self._output_queue.put_nowait( + llm.ChatChunk( + choices=[ + llm.Choice( + delta=llm.ChoiceDelta(role="assistant", content=delta.value) + ) + ] + ) + ) + + async def 
on_tool_call_created(self, tool_call: ToolCall): + if not self.current_run: + logger.error("tool call created without run") + return + await self._llm._register_tool_call(tool_call.id, self.current_run.id) + + async def on_tool_call_done( + self, + tool_call: CodeInterpreterToolCall | FileSearchToolCall | FunctionToolCall, + ) -> None: + if tool_call.type == "code_interpreter": + logger.warning("code interpreter tool call not yet implemented") + elif tool_call.type == "file_search": + logger.warning("file_search tool call not yet implemented") + elif tool_call.type == "function": + if not self._fnc_ctx: + logger.error("function tool called without function context") + return + + fnc = llm.FunctionCallInfo( + function_info=self._fnc_ctx.ai_functions[tool_call.function.name], + arguments=json.loads(tool_call.function.arguments), + tool_call_id=tool_call.id, + raw_arguments=tool_call.function.arguments, + ) + + self._llm_stream._function_calls_info.append(fnc) + chunk = llm.ChatChunk( + choices=[ + llm.Choice( + delta=llm.ChoiceDelta(role="assistant", tool_calls=[fnc]), + index=0, + ) + ] + ) + self._output_queue.put_nowait(chunk) + + def __init__( + self, + *, + assistant_llm: AssistantLLM, + client: AsyncClient, + sync_openai_task: asyncio.Task[AssistantLoadOptions], + chat_ctx: llm.ChatContext, + fnc_ctx: llm.FunctionContext | None, + temperature: float | None, + on_file_uploaded: OnFileUploaded | None, + ) -> None: + super().__init__(chat_ctx=chat_ctx, fnc_ctx=fnc_ctx) + self._llm = assistant_llm + self._client = client + self._temperature = temperature + self._on_file_uploaded = on_file_uploaded + + # current function call that we're waiting for full completion (args are streamed) + self._tool_call_id: str | None = None + self._fnc_name: str | None = None + self._fnc_raw_arguments: str | None = None + self._output_queue = asyncio.Queue[Union[llm.ChatChunk, Exception, None]]() + self._create_stream_task = asyncio.create_task(self._create_stream()) + 
self._sync_openai_task = sync_openai_task + + # Running stream is used to ensure that we only have one stream running at a time + self._done_future: asyncio.Future[None] = asyncio.Future() + + async def _create_stream(self) -> None: + # This function's complexity is due to the fact that we need to sync chat_ctx messages with OpenAI. + # OpenAI also does not allow us to modify messages while a stream is running. So we need to make sure streams run + # sequentially. The strategy is as follows: + # + # 1. ensure that we have a thread_id and assistant_id from OpenAI. This comes from the _sync_openai_task + # 2. make sure all previous streams are done before starting a new one + # 3. delete messages that are no longer in the chat_ctx but are still in OpenAI by using the OpenAI message id + # 4. add new messages to OpenAI that are in the chat_ctx but not in OpenAI. We don't know the OpenAI message id yet + # so we create a random uuid (we call it the LiveKit message id) and set that in the metadata. + # 5. start the stream and wait for it to finish + # 6. get the OpenAI message ids for the messages we added to OpenAI by using the metadata + # 7. Resolve the OpenAI message id with all messages that have a LiveKit message id. + try: + load_options = await self._sync_openai_task + + # The assistants api does not let us modify messages while a stream is running. + # So we have to make sure previous streams are done before starting a new one. + await asyncio.gather(*self._llm._done_futures) + self._llm._done_futures.clear() + self._llm._done_futures.append(self._done_future) + + # OpenAI requires submitting tool call outputs manually. We iterate + # tool outputs in the chat_ctx (from previous runs) and submit them + # before continuing. 
+ for msg in self._chat_ctx.messages: + if msg.role == "tool": + if not msg.tool_call_id: + logger.error("tool message without tool_call_id") + continue + if not isinstance(msg.content, str): + logger.error("tool message content is not str") + continue + await self._llm._submit_tool_call_result( + msg.tool_call_id, msg.content + ) + + # At the chat_ctx level, create a map of thread_id to message_ids + # This is used to keep track of which messages have been added to the thread + # and which we may need to delete from OpenAI + if OPENAI_MESSAGES_ADDED_KEY not in self._chat_ctx._metadata: + self._chat_ctx._metadata[OPENAI_MESSAGES_ADDED_KEY] = dict() + + if ( + load_options.thread_id + not in self._chat_ctx._metadata[OPENAI_MESSAGES_ADDED_KEY] + ): + self._chat_ctx._metadata[OPENAI_MESSAGES_ADDED_KEY][ + load_options.thread_id + ] = set() + + # Keep this handy to make the code more readable later on + openai_addded_messages_set: set[str] = self._chat_ctx._metadata[ + OPENAI_MESSAGES_ADDED_KEY + ][load_options.thread_id] + + # Keep track of messages that are no longer in the chat_ctx but are still in OpenAI + # Note: unfortunately, this adds latency. Usually it's just one message, so we loop over it, but + # it creates an extra round trip to OpenAI before we can run inference. + # TODO: parallelize it? 
+ for msg in self._chat_ctx.messages: + msg_id = msg._metadata.get(OPENAI_MESSAGE_ID_KEY, {}).get( + load_options.thread_id + ) + assert load_options.thread_id + if msg_id and msg_id not in openai_addded_messages_set: + await self._client.beta.threads.messages.delete( + thread_id=load_options.thread_id, + message_id=msg_id, + ) + logger.debug( + f"Deleted message '{msg_id}' in thread '{load_options.thread_id}'" + ) + openai_addded_messages_set.remove(msg_id) + + # Upload any images in the chat_ctx that have not been uploaded to OpenAI + for msg in self._chat_ctx.messages: + if msg.role != "user": + continue + + if not isinstance(msg.content, list): + continue + + for cnt in msg.content: + if ( + not isinstance(cnt, llm.ChatImage) + or OPENAI_FILE_ID_KEY in cnt._cache + ): + continue + + if isinstance(cnt.image, str): + continue + + file_obj = await self._upload_frame( + cnt.image, cnt.inference_width, cnt.inference_height + ) + cnt._cache[OPENAI_FILE_ID_KEY] = file_obj.id + if self._on_file_uploaded: + self._on_file_uploaded( + OnFileUploadedInfo( + type="image", + original_file=cnt, + openai_file_object=file_obj, + ) + ) + + # Keep track of the new messages in the chat_ctx that we need to add to OpenAI + additional_messages: list[AdditionalMessage] = [] + for msg in self._chat_ctx.messages: + if msg.role != "user": + continue + + msg_id = str(uuid.uuid4()) + if OPENAI_MESSAGE_ID_KEY not in msg._metadata: + msg._metadata[OPENAI_MESSAGE_ID_KEY] = dict[str, str]() + + if LIVEKIT_MESSAGE_ID_KEY not in msg._metadata: + msg._metadata[LIVEKIT_MESSAGE_ID_KEY] = dict[str, str]() + + oai_msg_id_dict = msg._metadata[OPENAI_MESSAGE_ID_KEY] + lk_msg_id_dict = msg._metadata[LIVEKIT_MESSAGE_ID_KEY] + + if load_options.thread_id not in oai_msg_id_dict: + converted_msg = build_oai_message(msg) + converted_msg["private_message_id"] = msg_id + additional_messages.append( + AdditionalMessage( + role="user", + content=converted_msg["content"], + metadata={LIVEKIT_MESSAGE_ID_KEY: 
msg_id}, + ) + ) + lk_msg_id_dict[load_options.thread_id] = msg_id + + eh = AssistantLLMStream.EventHandler( + llm=self._llm, + output_queue=self._output_queue, + chat_ctx=self._chat_ctx, + fnc_ctx=self._fnc_ctx, + llm_stream=self, + ) + assert load_options.thread_id + kwargs: dict[str, Any] = { + "additional_messages": additional_messages, + "thread_id": load_options.thread_id, + "assistant_id": load_options.assistant_id, + "event_handler": eh, + "temperature": self._temperature, + } + if self._fnc_ctx: + kwargs["tools"] = [ + llm._oai_api.build_oai_function_description(f) + for f in self._fnc_ctx.ai_functions.values() + ] + + async with self._client.beta.threads.runs.stream(**kwargs) as stream: + await stream.until_done() + + await self._output_queue.put(None) + + # Populate the openai_message_id for the messages we added to OpenAI. Note: we do this after + # sending None to close the iterator so that it is done in parallel with any users of + # the stream. However, the next stream will not start until this is done. 
+ lk_to_oai_lookup = dict[str, str]() + messages = await self._client.beta.threads.messages.list( + thread_id=load_options.thread_id, + limit=10, # We could be smarter and make a more exact query, but this is probably fine + ) + for oai_msg in messages.data: + if oai_msg.metadata.get(LIVEKIT_MESSAGE_ID_KEY): # type: ignore + lk_to_oai_lookup[oai_msg.metadata[LIVEKIT_MESSAGE_ID_KEY]] = ( # type: ignore + oai_msg.id + ) + + for msg in self._chat_ctx.messages: + if msg.role != "user": + continue + oai_msg_id_dict = msg._metadata.get(OPENAI_MESSAGE_ID_KEY) + lk_msg_id_dict = msg._metadata.get(LIVEKIT_MESSAGE_ID_KEY) + if oai_msg_id_dict is None or lk_msg_id_dict is None: + continue + + lk_msg_id = lk_msg_id_dict.get(load_options.thread_id) + if lk_msg_id and lk_msg_id in lk_to_oai_lookup: + oai_msg_id = lk_to_oai_lookup[lk_msg_id] + oai_msg_id_dict[load_options.thread_id] = oai_msg_id + openai_addded_messages_set.add(oai_msg_id) + # We don't need the LiveKit message id anymore + lk_msg_id_dict.pop(load_options.thread_id) + + except Exception as e: + await self._output_queue.put(e) + finally: + self._done_future.set_result(None) + + async def _upload_frame( + self, + frame: rtc.VideoFrame, + inference_width: int | None, + inference_height: int | None, + ): + # inside our internal implementation, we allow attaching extra metadata to + # each ChatImage (to avoid re-encoding on every chat completion request) + opts = utils.images.EncodeOptions() + if inference_width and inference_height: + opts.resize_options = utils.images.ResizeOptions( + width=inference_width, + height=inference_height, + strategy="center_aspect_fit", + ) + + encoded_data = utils.images.encode(frame, opts) + fileObj = await self._client.files.create( + file=("image.jpg", encoded_data), + purpose="vision", + ) + + return fileObj + + async def __anext__(self): + item = await self._output_queue.get() + if item is None: + raise StopAsyncIteration + + if isinstance(item, Exception): + raise item + + return 
item + + +def build_oai_message(msg: llm.ChatMessage): + oai_msg: dict[str, Any] = {"role": msg.role} + + if msg.name: + oai_msg["name"] = msg.name + + # add content if provided + if isinstance(msg.content, str): + oai_msg["content"] = msg.content + elif isinstance(msg.content, list): + oai_content: list[dict[str, Any]] = [] + for cnt in msg.content: + if isinstance(cnt, str): + oai_content.append({"type": "text", "text": cnt}) + elif isinstance(cnt, llm.ChatImage): + if cnt._cache[OPENAI_FILE_ID_KEY]: + oai_content.append( + { + "type": "image_file", + "image_file": {"file_id": cnt._cache[OPENAI_FILE_ID_KEY]}, + } + ) + + oai_msg["content"] = oai_content + + # make sure to provide when function has been called inside the context + # (+ raw_arguments) + if msg.tool_calls is not None: + tool_calls: list[dict[str, Any]] = [] + oai_msg["tool_calls"] = tool_calls + for fnc in msg.tool_calls: + tool_calls.append( + { + "id": fnc.tool_call_id, + "type": "function", + "function": { + "name": fnc.function_info.name, + "arguments": fnc.raw_arguments, + }, + } + ) + + # tool_call_id is set when the message is a response/result to a function call + # (content is a string in this case) + if msg.tool_call_id: + oai_msg["tool_call_id"] = msg.tool_call_id + + return oai_msg diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py index 07f75cc85..d8cfccc9f 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py @@ -15,13 +15,12 @@ from __future__ import annotations import asyncio -import base64 +import os from dataclasses import dataclass from typing import Any, Awaitable, MutableSet import httpx -from livekit import rtc -from livekit.agents import llm, utils +from livekit.agents import llm import openai from openai.types.chat import ChatCompletionChunk, ChatCompletionMessageParam @@ 
-29,18 +28,22 @@ from .log import logger from .models import ( + CerebrasChatModels, ChatModels, + DeepSeekChatModels, GroqChatModels, OctoChatModels, PerplexityChatModels, TogetherChatModels, ) -from .utils import AsyncAzureADTokenProvider +from .utils import AsyncAzureADTokenProvider, build_oai_message @dataclass class LLMOptions: model: str | ChatModels + user: str | None + temperature: float | None class LLM(llm.LLM): @@ -50,14 +53,27 @@ def __init__( model: str | ChatModels = "gpt-4o", api_key: str | None = None, base_url: str | None = None, + user: str | None = None, client: openai.AsyncClient | None = None, + temperature: float | None = None, ) -> None: - self._opts = LLMOptions(model=model) + """ + Create a new instance of OpenAI LLM. + + ``api_key`` must be set to your OpenAI API key, either using the argument or by setting the + ``OPENAI_API_KEY`` environmental variable. + """ + # throw an error on our end + api_key = api_key or os.environ.get("OPENAI_API_KEY") + if api_key is None: + raise ValueError("OpenAI API key is required") + + self._opts = LLMOptions(model=model, user=user, temperature=temperature) self._client = client or openai.AsyncClient( api_key=api_key, base_url=base_url, http_client=httpx.AsyncClient( - timeout=5.0, + timeout=httpx.Timeout(timeout=30, connect=10, read=5, pool=5), follow_redirects=True, limits=httpx.Limits( max_connections=1000, @@ -81,6 +97,8 @@ def with_azure( organization: str | None = None, project: str | None = None, base_url: str | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: """ This automatically infers the following arguments from their corresponding environment variables if they are not provided: @@ -104,7 +122,38 @@ def with_azure( base_url=base_url, ) # type: ignore - return LLM(model=model, client=azure_client) + return LLM(model=model, client=azure_client, user=user, temperature=temperature) + + @staticmethod + def with_cerebras( + *, + model: str | CerebrasChatModels = 
"llama3.1-8b", + api_key: str | None = None, + base_url: str | None = "https://api.cerebras.ai/v1", + client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, + ) -> LLM: + """ + Create a new instance of Cerebras LLM. + + ``api_key`` must be set to your Cerebras API key, either using the argument or by setting + the ``CEREBRAS_API_KEY`` environmental variable. + """ + + # shim for not using OPENAI_API_KEY + api_key = api_key or os.environ.get("CEREBRAS_API_KEY") + if api_key is None: + raise ValueError("Cerebras API key is required") + + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) @staticmethod def with_fireworks( @@ -113,8 +162,29 @@ def with_fireworks( api_key: str | None = None, base_url: str | None = "https://api.fireworks.ai/inference/v1", client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: - return LLM(model=model, api_key=api_key, base_url=base_url, client=client) + """ + Create a new instance of Fireworks LLM. + + ``api_key`` must be set to your Fireworks API key, either using the argument or by setting + the ``FIREWORKS_API_KEY`` environmental variable. + """ + + # shim for not using OPENAI_API_KEY + api_key = api_key or os.environ.get("FIREWORKS_API_KEY") + if api_key is None: + raise ValueError("Fireworks API key is required") + + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) @staticmethod def with_groq( @@ -123,8 +193,60 @@ def with_groq( api_key: str | None = None, base_url: str | None = "https://api.groq.com/openai/v1", client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: - return LLM(model=model, api_key=api_key, base_url=base_url, client=client) + """ + Create a new instance of Groq LLM. 
+ + ``api_key`` must be set to your Groq API key, either using the argument or by setting + the ``GROQ_API_KEY`` environmental variable. + """ + + # shim for not using OPENAI_API_KEY + api_key = api_key or os.environ.get("GROQ_API_KEY") + if api_key is None: + raise ValueError("Groq API key is required") + + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) + + @staticmethod + def with_deepseek( + *, + model: str | DeepSeekChatModels = "deepseek-chat", + api_key: str | None = None, + base_url: str | None = "https://api.deepseek.com/v1", + client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, + ) -> LLM: + """ + Create a new instance of DeepSeek LLM. + + ``api_key`` must be set to your DeepSeek API key, either using the argument or by setting + the ``DEEPSEEK_API_KEY`` environmental variable. + """ + + # shim for not using OPENAI_API_KEY + api_key = api_key or os.environ.get("DEEPSEEK_API_KEY") + if api_key is None: + raise ValueError("DeepSeek API key is required") + + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) @staticmethod def with_octo( @@ -133,8 +255,29 @@ def with_octo( api_key: str | None = None, base_url: str | None = "https://text.octoai.run/v1", client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: - return LLM(model=model, api_key=api_key, base_url=base_url, client=client) + """ + Create a new instance of OctoAI LLM. + + ``api_key`` must be set to your OctoAI API key, either using the argument or by setting + the ``OCTOAI_TOKEN`` environmental variable. 
+ """ + + # shim for not using OPENAI_API_KEY + api_key = api_key or os.environ.get("OCTOAI_TOKEN") + if api_key is None: + raise ValueError("OctoAI API key is required") + + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) @staticmethod def with_ollama( @@ -142,8 +285,19 @@ def with_ollama( model: str = "llama3.1", base_url: str | None = "http://localhost:11434/v1", client: openai.AsyncClient | None = None, + temperature: float | None = None, ) -> LLM: - return LLM(model=model, api_key="ollama", base_url=base_url, client=client) + """ + Create a new instance of Ollama LLM. + """ + + return LLM( + model=model, + api_key="ollama", + base_url=base_url, + client=client, + temperature=temperature, + ) @staticmethod def with_perplexity( @@ -152,8 +306,17 @@ def with_perplexity( api_key: str | None = None, base_url: str | None = "https://api.perplexity.ai", client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: - return LLM(model=model, api_key=api_key, base_url=base_url, client=client) + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) @staticmethod def with_together( @@ -162,8 +325,29 @@ def with_together( api_key: str | None = None, base_url: str | None = "https://api.together.xyz/v1", client: openai.AsyncClient | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: - return LLM(model=model, api_key=api_key, base_url=base_url, client=client) + """ + Create a new instance of TogetherAI LLM. + + ``api_key`` must be set to your TogetherAI API key, either using the argument or by setting + the ``TOGETHER_API_KEY`` environmental variable. 
+ """ + + # shim for not using OPENAI_API_KEY + api_key = api_key or os.environ.get("TOGETHER_API_KEY") + if api_key is None: + raise ValueError("TogetherAI API key is required") + + return LLM( + model=model, + api_key=api_key, + base_url=base_url, + client=client, + user=user, + temperature=temperature, + ) @staticmethod def create_azure_client( @@ -178,6 +362,8 @@ def create_azure_client( organization: str | None = None, project: str | None = None, base_url: str | None = None, + user: str | None = None, + temperature: float | None = None, ) -> LLM: logger.warning("This alias is deprecated. Use LLM.with_azure() instead") return LLM.with_azure( @@ -190,6 +376,8 @@ def create_azure_client( organization=organization, project=project, base_url=base_url, + user=user, + temperature=temperature, ) def chat( @@ -212,6 +400,10 @@ def chat( if fnc_ctx and parallel_tool_calls is not None: opts["parallel_tool_calls"] = parallel_tool_calls + user = self._opts.user or openai.NOT_GIVEN + if temperature is None: + temperature = self._opts.temperature + messages = _build_oai_context(chat_ctx, id(self)) cmp = self._client.chat.completions.create( messages=messages, @@ -219,6 +411,7 @@ def chat( n=n, temperature=temperature, stream=True, + user=user, **opts, ) @@ -263,6 +456,11 @@ async def __anext__(self): def _parse_choice(self, choice: Choice) -> llm.ChatChunk | None: delta = choice.delta + # https://github.com/livekit/agents/issues/688 + # the delta can be None when using Azure OpenAI using content filtering + if delta is None: + return None + if delta.tool_calls: # check if we have functions to calls for tool in delta.tool_calls: @@ -332,77 +530,4 @@ def _try_run_function(self, choice: Choice) -> llm.ChatChunk | None: def _build_oai_context( chat_ctx: llm.ChatContext, cache_key: Any ) -> list[ChatCompletionMessageParam]: - return [_build_oai_message(msg, cache_key) for msg in chat_ctx.messages] # type: ignore - - -def _build_oai_message(msg: llm.ChatMessage, cache_key: Any): - 
oai_msg: dict = {"role": msg.role} - - if msg.name: - oai_msg["name"] = msg.name - - # add content if provided - if isinstance(msg.content, str): - oai_msg["content"] = msg.content - elif isinstance(msg.content, list): - oai_content = [] - for cnt in msg.content: - if isinstance(cnt, str): - oai_content.append({"type": "text", "text": cnt}) - elif isinstance(cnt, llm.ChatImage): - oai_content.append(_build_oai_image_content(cnt, cache_key)) - - oai_msg["content"] = oai_content - - # make sure to provide when function has been called inside the context - # (+ raw_arguments) - if msg.tool_calls is not None: - tool_calls: list[dict[str, Any]] = [] - oai_msg["tool_calls"] = tool_calls - for fnc in msg.tool_calls: - tool_calls.append( - { - "id": fnc.tool_call_id, - "type": "function", - "function": { - "name": fnc.function_info.name, - "arguments": fnc.raw_arguments, - }, - } - ) - - # tool_call_id is set when the message is a response/result to a function call - # (content is a string in this case) - if msg.tool_call_id: - oai_msg["tool_call_id"] = msg.tool_call_id - - return oai_msg - - -def _build_oai_image_content(image: llm.ChatImage, cache_key: Any): - if isinstance(image.image, str): # image url - return { - "type": "image_url", - "image_url": {"url": image.image, "detail": "auto"}, - } - elif isinstance(image.image, rtc.VideoFrame): # VideoFrame - if cache_key not in image._cache: - # inside our internal implementation, we allow to put extra metadata to - # each ChatImage (avoid to reencode each time we do a chatcompletion request) - opts = utils.images.EncodeOptions() - if image.inference_width and image.inference_height: - opts.resize_options = utils.images.ResizeOptions( - width=image.inference_width, - height=image.inference_height, - strategy="center_aspect_fit", - ) - - encoded_data = utils.images.encode(image.image, opts) - image._cache[cache_key] = base64.b64encode(encoded_data).decode("utf-8") - - return { - "type": "image_url", - "image_url": {"url": 
f"data:image/jpeg;base64,{image._cache[cache_key]}"}, - } - - raise ValueError(f"unknown image type {type(image.image)}") + return [build_oai_message(msg, cache_key) for msg in chat_ctx.messages] # type: ignore diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py index 95e81aa66..3815826e4 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/models.py @@ -7,6 +7,8 @@ ChatModels = Literal[ "gpt-4o", "gpt-4o-2024-05-13", + "gpt-4o-mini", + "gpt-4o-mini-2024-07-18", "gpt-4-turbo", "gpt-4-turbo-2024-04-09", "gpt-4-turbo-preview", @@ -31,8 +33,15 @@ "text-embedding-ada-002", "text-embedding-3-small", "text-embedding-3-large" ] +AssistantTools = Literal["code_interpreter", "file_search", "function"] + # adapters for OpenAI-compatible LLMs +CerebrasChatModels = Literal[ + "llama3.1-8b", + "llama3.1-70b", +] + PerplexityChatModels = Literal[ "llama-3.1-sonar-small-128k-online", "llama-3.1-sonar-small-128k-chat", @@ -56,6 +65,11 @@ "gemma2-9b-it", ] +DeepSeekChatModels = Literal[ + "deepseek-coder", + "deepseek-chat", +] + TogetherChatModels = Literal[ "Austism/chronos-hermes-13b", "Gryphe/MythoMax-L2-13b", diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/stt.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/stt.py index f9356a1cb..6cb949b9d 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/stt.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/stt.py @@ -16,6 +16,7 @@ import dataclasses import io +import os import wave from dataclasses import dataclass @@ -47,6 +48,13 @@ def __init__( api_key: str | None = None, client: openai.AsyncClient | None = None, ): + """ + Create a new instance of OpenAI STT. 
+ + ``api_key`` must be set to your OpenAI API key, either using the argument or by setting the + ``OPENAI_API_KEY`` environmental variable. + """ + super().__init__( capabilities=stt.STTCapabilities(streaming=False, interim_results=False) ) @@ -59,6 +67,11 @@ def __init__( model=model, ) + # throw an error on our end + api_key = api_key or os.environ.get("OPENAI_API_KEY") + if api_key is None: + raise ValueError("OpenAI API key is required") + self._client = client or openai.AsyncClient( api_key=api_key, base_url=base_url, diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/tts.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/tts.py index 27f62df13..fed67c9c5 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/tts.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/tts.py @@ -14,6 +14,7 @@ from __future__ import annotations +import os from dataclasses import dataclass from typing import AsyncContextManager @@ -48,6 +49,13 @@ def __init__( api_key: str | None = None, client: openai.AsyncClient | None = None, ) -> None: + """ + Create a new instance of OpenAI TTS. + + ``api_key`` must be set to your OpenAI API key, either using the argument or by setting the + ``OPENAI_API_KEY`` environmental variable. 
+ """ + super().__init__( capabilities=tts.TTSCapabilities( streaming=False, @@ -56,6 +64,11 @@ def __init__( num_channels=OPENAI_TTS_CHANNELS, ) + # throw an error on our end + api_key = api_key or os.environ.get("OPENAI_API_KEY") + if api_key is None: + raise ValueError("OpenAI API key is required") + self._client = client or openai.AsyncClient( api_key=api_key, base_url=base_url, @@ -144,11 +157,26 @@ async def _main_task(self): request_id = utils.shortuuid() segment_id = utils.shortuuid() decoder = utils.codecs.Mp3StreamDecoder() + audio_bstream = utils.audio.AudioByteStream( + sample_rate=OPENAI_TTS_SAMPLE_RATE, + num_channels=OPENAI_TTS_CHANNELS, + ) + async with self._oai_stream as stream: - async for data in stream.iter_bytes(4096): + async for data in stream.iter_bytes(): for frame in decoder.decode_chunk(data): - self._event_ch.send_nowait( - tts.SynthesizedAudio( - request_id=request_id, segment_id=segment_id, frame=frame + for frame in audio_bstream.write(frame.data): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, + segment_id=segment_id, + frame=frame, + ) ) + + for frame in audio_bstream.flush(): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, segment_id=segment_id, frame=frame ) + ) diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/utils.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/utils.py index 55e7d8d13..40d95037f 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/utils.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/utils.py @@ -1,3 +1,89 @@ -from typing import Awaitable, Callable, Union +from __future__ import annotations + +import base64 +import os +from typing import Any, Awaitable, Callable, Optional, Union + +from livekit import rtc +from livekit.agents import llm, utils AsyncAzureADTokenProvider = Callable[[], Union[str, Awaitable[str]]] + + +def get_base_url(base_url: Optional[str]) -> 
str: + if not base_url: + base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1") + return base_url + + +def build_oai_message(msg: llm.ChatMessage, cache_key: Any): + oai_msg: dict[str, Any] = {"role": msg.role} + + if msg.name: + oai_msg["name"] = msg.name + + # add content if provided + if isinstance(msg.content, str): + oai_msg["content"] = msg.content + elif isinstance(msg.content, list): + oai_content: list[dict[str, Any]] = [] + for cnt in msg.content: + if isinstance(cnt, str): + oai_content.append({"type": "text", "text": cnt}) + elif isinstance(cnt, llm.ChatImage): + oai_content.append(_build_oai_image_content(cnt, cache_key)) + + oai_msg["content"] = oai_content + + # include tool_calls (with their raw_arguments) when a function was called + # inside this context + if msg.tool_calls is not None: + tool_calls: list[dict[str, Any]] = [] + oai_msg["tool_calls"] = tool_calls + for fnc in msg.tool_calls: + tool_calls.append( + { + "id": fnc.tool_call_id, + "type": "function", + "function": { + "name": fnc.function_info.name, + "arguments": fnc.raw_arguments, + }, + } + ) + + # tool_call_id is set when the message is a response/result to a function call + # (content is a string in this case) + if msg.tool_call_id: + oai_msg["tool_call_id"] = msg.tool_call_id + + return oai_msg + + +def _build_oai_image_content(image: llm.ChatImage, cache_key: Any): + if isinstance(image.image, str): # image url + return { + "type": "image_url", + "image_url": {"url": image.image, "detail": "auto"}, + } + elif isinstance(image.image, rtc.VideoFrame): # VideoFrame + if cache_key not in image._cache: + # cache the encoded frame on the ChatImage itself to avoid + # re-encoding it on every chat completion request + opts = utils.images.EncodeOptions() + if image.inference_width and image.inference_height: + opts.resize_options = utils.images.ResizeOptions( + width=image.inference_width, + height=image.inference_height, +
strategy="center_aspect_fit", + ) + + encoded_data = utils.images.encode(image.image, opts) + image._cache[cache_key] = base64.b64encode(encoded_data).decode("utf-8") + + return { + "type": "image_url", + "image_url": {"url": f"data:image/jpeg;base64,{image._cache[cache_key]}"}, + } + + raise ValueError(f"unknown image type {type(image.image)}") diff --git a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/version.py b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/version.py index fc4dcfeb4..bdeeae9e4 100644 --- a/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/version.py +++ b/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -__version__ = "0.8.0" +__version__ = "0.8.4" diff --git a/livekit-plugins/livekit-plugins-openai/package.json b/livekit-plugins/livekit-plugins-openai/package.json index b27e3946f..20eeab2f4 100644 --- a/livekit-plugins/livekit-plugins-openai/package.json +++ b/livekit-plugins/livekit-plugins-openai/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-openai", "private": true, - "version": "0.8.0" + "version": "0.8.4" } diff --git a/livekit-plugins/livekit-plugins-playht/README.md b/livekit-plugins/livekit-plugins-playht/README.md new file mode 100644 index 000000000..53badc144 --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/README.md @@ -0,0 +1,13 @@ +# LiveKit Plugins PlayHT + +Agent Framework plugin for voice synthesis with [PlayHT](https://play.ht/) API. + +## Installation + +```bash +pip install livekit-plugins-playht +``` + +## Pre-requisites + +You'll need USER ID and API Secret KEY from PlayHT. 
They can be set as the environment variables `PLAYHT_USER_ID` and `PLAYHT_API_KEY`. \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-playht/livekit/__init__.py b/livekit-plugins/livekit-plugins-playht/livekit/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/livekit-plugins/livekit-plugins-playht/livekit/plugins/__init__.py b/livekit-plugins/livekit-plugins-playht/livekit/plugins/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/__init__.py b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/__init__.py new file mode 100644 index 000000000..366012953 --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/__init__.py @@ -0,0 +1,24 @@ + +from .models import TTSEngines +from .tts import DEFAULT_VOICE, TTS, Voice +from .version import __version__ + +__all__ = [ + "TTS", + "Voice", + "DEFAULT_VOICE", + "TTSEngines", + "__version__", +] + +from livekit.agents import Plugin + + +class PlayHTPlugin(Plugin): + def __init__(self) -> None: + super().__init__(__name__, __version__, __package__) + + def download_files(self) -> None: + pass # nothing to download for this plugin + +Plugin.register_plugin(PlayHTPlugin()) diff --git a/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/log.py b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/log.py new file mode 100644 index 000000000..fe278b042 --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/log.py @@ -0,0 +1,3 @@ +import logging + +logger = logging.getLogger("livekit.custom_tts_plugins.playht") \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/models.py b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/models.py new file mode 100644 index 000000000..942560b9b --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/models.py @@ -0,0 +1,19
@@ +from typing import Literal + +TTSEngines = Literal[ + 'PlayHT2.0', + 'PlayHT1.0', + 'PlayHT2.0-turbo' +] + +TTSEncoding = Literal[ + "mp3_22050_32", + "mp3_44100_32", + "mp3_44100_64", + "mp3_44100_96", + "mp3_44100_128", + "mp3_44100_192", + "pcm_16000", + "pcm_22050", + "pcm_44100", +] \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/tts.py b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/tts.py new file mode 100644 index 000000000..4ce65e7fc --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/tts.py @@ -0,0 +1,218 @@ +from __future__ import annotations + +import asyncio +import base64 +import dataclasses +import json +import os +import io +from dataclasses import dataclass +from typing import Any, List, Literal +from pyht import Client, TTSOptions, Format + +import aiohttp +from livekit.agents import tts, utils, tokenize +from livekit import rtc + +from .log import logger +from .models import TTSEncoding, TTSEngines + +_Encoding = Literal["mp3", "pcm"] + + +def _sample_rate_from_format(output_format: TTSEncoding) -> int: + split = output_format.split("_") + return int(split[1]) + + +def _encoding_from_format(output_format: TTSEncoding) -> _Encoding: + if output_format.startswith("mp3"): + return "mp3" + elif output_format.startswith("pcm"): + return "pcm" + + raise ValueError(f"Unknown format: {output_format}") + + +@dataclass +class Voice: + id: str + name: str + voice_engine: TTSEngines + + +DEFAULT_VOICE = Voice( + id="s3://peregrine-voices/mel22/manifest.json", + name="Will", + voice_engine="PlayHT2.0" +) + +ACCEPT_HEADER = { + "mp3": "audio/mpeg", + "wav": "audio/wav", + "ogg": "audio/ogg", + "flac": "audio/flac", + "mulaw": "audio/basic" # commonly used for mulaw +} + +API_BASE_URL_V1 = "https://api.play.ht/api/v2" +AUTHORIZATION_HEADER = "AUTHORIZATION" +USERID_HEADER = "X-USER-ID" +PLAYHT_TTS_SAMPLE_RATE = 24000 +PLAYHT_TTS_CHANNELS = 1 + + +@dataclass 
+class _TTSOptions: + api_key: str + user_id: str + voice: Voice + base_url: str + sample_rate: int + encoding: TTSEncoding + + + + +class TTS(tts.TTS): + def __init__( + self, + *, + voice: Voice = DEFAULT_VOICE, + api_key: str | None = None, + user_id: str | None = None, + base_url: str | None = None, + encoding: Literal["mp3", "wav", "ogg", "flac", "mulaw"] | None = "wav", + http_session: aiohttp.ClientSession | None = None, + ) -> None: + super().__init__( + capabilities=tts.TTSCapabilities( + streaming=False, + ), + sample_rate=PLAYHT_TTS_SAMPLE_RATE, + num_channels=PLAYHT_TTS_CHANNELS, + ) + api_key = api_key or os.environ.get("PLAYHT_API_KEY") + if not api_key: + raise ValueError("PLAYHT_API_KEY must be set") + + user_id = user_id or os.environ.get("PLAYHT_USER_ID") + if not user_id: + raise ValueError("PLAYHT_USER_ID must be set") + + self._opts = _TTSOptions( + voice=voice, + user_id=user_id, + api_key=api_key, + base_url=base_url or API_BASE_URL_V1, + sample_rate=self.sample_rate, + encoding=encoding, + ) + self._session = http_session + + def _ensure_session(self) -> aiohttp.ClientSession: + if not self._session: + self._session = utils.http_context.http_session() + + return self._session + + async def list_voices(self) -> List[Voice]: + async with self._ensure_session().get( + f"{self._opts.base_url}/voices", + headers={ + "accept": "application/json", + AUTHORIZATION_HEADER: self._opts.api_key, + USERID_HEADER: self._opts.user_id + }, + ) as resp: + return _dict_to_voices_list(await resp.json()) + + def synthesize(self, text: str) -> "ChunkedStream": + return ChunkedStream(text, self._opts, self._ensure_session()) + + +class ChunkedStream(tts.ChunkedStream): + """Synthesize using the chunked API endpoint""" + + def __init__( + self, text: str, opts: _TTSOptions, session: aiohttp.ClientSession + ) -> None: + super().__init__() + self._text, self._opts, self._session = text, opts, session + + @utils.log_exceptions(logger=logger) + async def
_main_task(self) -> None: + stream = utils.audio.AudioByteStream( + sample_rate=self._opts.sample_rate, num_channels=1 + ) + self._mp3_decoder = utils.codecs.Mp3StreamDecoder() + request_id = utils.shortuuid() + segment_id = utils.shortuuid() + url = f"{self._opts.base_url}/tts/stream" + headers = { + "accept": ACCEPT_HEADER[self._opts.encoding], + "content-type": "application/json", + AUTHORIZATION_HEADER: self._opts.api_key, + USERID_HEADER: self._opts.user_id + } + json_data = { + "text": self._text, + "output_format": self._opts.encoding, + "voice": self._opts.voice.id, + } + async with self._session.post(url=url, headers=headers, json=json_data) as resp: + if not resp.content_type.startswith("audio/"): + content = await resp.text() + logger.error("playHT returned non-audio data: %s", content) + return + + encoding = _encoding_from_format(self._opts.encoding) + if encoding == "mp3": + async for bytes_data, _ in resp.content.iter_chunks(): + for frame in self._mp3_decoder.decode_chunk(bytes_data): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, + segment_id=segment_id, + frame=frame, + ) + ) + else: + async for bytes_data, _ in resp.content.iter_chunks(): + for frame in stream.write(bytes_data): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, + segment_id=segment_id, + frame=frame, + ) + ) + + for frame in stream.flush(): + self._event_ch.send_nowait( + tts.SynthesizedAudio( + request_id=request_id, segment_id=segment_id, frame=frame + ) + ) + + +def _dict_to_voices_list(data: dict[str, Any]): + voices: List[Voice] = [] + for voice in data["text"]: + voices.append( + Voice( +
id=voice["id"], + name=voice["name"], + voice_engine=voice["voice_engine"] + ) + ) + return voices + diff --git a/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/version.py b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/version.py new file mode 100644 index 000000000..5becc17c0 --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/livekit/plugins/playht/version.py @@ -0,0 +1 @@ +__version__ = "1.0.0" diff --git a/livekit-plugins/livekit-plugins-playht/package.json b/livekit-plugins/livekit-plugins-playht/package.json new file mode 100644 index 000000000..0ac1584b7 --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/package.json @@ -0,0 +1,6 @@ +{ + "name": "livekit-plugins-playht", + "private": true, + "version": "1.0.0" + } + \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-playht/pyproject.toml b/livekit-plugins/livekit-plugins-playht/pyproject.toml new file mode 100644 index 000000000..8cf32563a --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/pyproject.toml @@ -0,0 +1,3 @@ +[build-system] +requires = ["setuptools>=61.0"] +build-backend = "setuptools.build_meta" \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-playht/setup.py b/livekit-plugins/livekit-plugins-playht/setup.py new file mode 100644 index 000000000..ba8d7f293 --- /dev/null +++ b/livekit-plugins/livekit-plugins-playht/setup.py @@ -0,0 +1,44 @@ + +import os +import pathlib + +import setuptools +import setuptools.command.build_py + +here = pathlib.Path(__file__).parent.resolve() +about = {} +with open( + os.path.join(here, "livekit", "plugins", "playht", "version.py"), "r" +) as f: + exec(f.read(), about) + + +setuptools.setup( + name="livekit-plugins-playht", + version=about["__version__"], + description="Agent Framework plugin for voice synthesis with PlayHT's API.", + long_description=(here / "README.md").read_text(encoding="utf-8"), + long_description_content_type="text/markdown", + 
url="https://github.com/livekit/agents", + cmdclass={}, + classifiers=[ + "Intended Audience :: Developers", + "Topic :: Multimedia :: Sound/Audio", + "Topic :: Scientific/Engineering :: Artificial Intelligence", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3 :: Only", + ], + keywords=["webrtc", "realtime", "audio", "livekit", "playHT"], + license="Apache-2.0", + packages=setuptools.find_namespace_packages(include=["livekit.*"]), + python_requires=">=3.9.0", + install_requires=["livekit-agents[codecs]>=0.8.0.dev0", "pyht", "aiohttp", "livekit"], + package_data={"livekit.plugins.playht": ["py.typed"]}, + project_urls={ + "Documentation": "https://docs.livekit.io", + "Website": "https://livekit.io/", + "Source": "https://github.com/livekit/agents", + }, +) \ No newline at end of file diff --git a/livekit-plugins/livekit-plugins-rag/CHANGELOG.md b/livekit-plugins/livekit-plugins-rag/CHANGELOG.md index 875d3beee..6a7effef5 100644 --- a/livekit-plugins/livekit-plugins-rag/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-rag/CHANGELOG.md @@ -1,5 +1,11 @@ # livekit-plugins-rag +## 0.2.2 + +### Patch Changes + +- rag: fix backward compatibility - [#629](https://github.com/livekit/agents/pull/629) ([@afigar](https://github.com/afigar)) + ## 0.2.1 ### Patch Changes diff --git a/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/__init__.py b/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/__init__.py index 3cd283ce6..7042c3fa7 100644 --- a/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/__init__.py +++ b/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/__init__.py @@ -27,5 +27,8 @@ class RAGPlugin(Plugin): def __init__(self) -> None: super().__init__(__name__, __version__, __package__, logger) + def download_files(self) -> None: + pass + Plugin.register_plugin(RAGPlugin()) diff --git 
a/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/version.py b/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/version.py index 875ee5214..2985d9da1 100644 --- a/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/version.py +++ b/livekit-plugins/livekit-plugins-rag/livekit/plugins/rag/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -__version__ = "0.2.1" +__version__ = "0.2.2" diff --git a/livekit-plugins/livekit-plugins-rag/package.json b/livekit-plugins/livekit-plugins-rag/package.json index ccac72cfa..897e16552 100644 --- a/livekit-plugins/livekit-plugins-rag/package.json +++ b/livekit-plugins/livekit-plugins-rag/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-rag", "private": true, - "version": "0.2.1" + "version": "0.2.2" } diff --git a/livekit-plugins/livekit-plugins-silero/CHANGELOG.md b/livekit-plugins/livekit-plugins-silero/CHANGELOG.md index 8e754db97..5fd5671c5 100644 --- a/livekit-plugins/livekit-plugins-silero/CHANGELOG.md +++ b/livekit-plugins/livekit-plugins-silero/CHANGELOG.md @@ -1,5 +1,13 @@ # livekit-plugins-silero +## 0.6.4 + +### Patch Changes + +- silero: adjust vad activation threshold - [#639](https://github.com/livekit/agents/pull/639) ([@theomonnom](https://github.com/theomonnom)) + +- silero: fix vad padding & static audio - [#631](https://github.com/livekit/agents/pull/631) ([@theomonnom](https://github.com/theomonnom)) + ## 0.6.3 ### Patch Changes diff --git a/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/vad.py b/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/vad.py index 2cd23c9da..7ac763508 100644 --- a/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/vad.py +++ b/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/vad.py @@ -27,6 +27,8 @@ from . 
import onnx_model from .log import logger +SLOW_INFERENCE_THRESHOLD = 0.2 # late by 200ms + @dataclass class _VADOptions: @@ -47,7 +49,7 @@ def load( min_silence_duration: float = 0.25, padding_duration: float = 0.1, max_buffered_speech: float = 60.0, - activation_threshold: float = 0.25, + activation_threshold: float = 0.5, sample_rate: int = 16000, force_cpu: bool = True, ) -> "VAD": @@ -108,11 +110,14 @@ def __init__(self, opts: _VADOptions, model: onnx_model.OnnxModel) -> None: self._task.add_done_callback(lambda _: self._executor.shutdown(wait=False)) self._exp_filter = utils.ExpFilter(alpha=0.35) + self._extra_inference_time = 0.0 + @agents.utils.log_exceptions(logger=logger) async def _main_task(self): og_sample_rate = 0 og_needed_samples = 0 # needed samples to complete the window data og_window_size_samples = 0 # size in samples of og_window_data + og_padding_size_samples = 0 # size in samples of padding data og_window_data: np.ndarray | None = None index_step = 0 @@ -143,16 +148,22 @@ async def _main_task(self): elif og_window_data is None: # alloc the og buffers now that we know the pushed sample rate og_sample_rate = frame.sample_rate + og_window_size_samples = int( (self._model.window_size_samples / self._model.sample_rate) * og_sample_rate ) + og_padding_size_samples = int( + self._opts.padding_duration * og_sample_rate + ) og_window_data = np.empty(og_window_size_samples, dtype=np.int16) og_needed_samples = og_window_size_samples index_step = frame.sample_rate // 16000 speech_buffer = np.empty( - int(self._opts.max_buffered_speech * og_sample_rate), dtype=np.int16 + int(self._opts.max_buffered_speech * og_sample_rate) + + int(self._opts.padding_duration * og_sample_rate) * 2, + dtype=np.int16, ) elif og_sample_rate != frame.sample_rate: logger.error("a frame with another sample rate was already pushed") @@ -160,11 +171,15 @@ async def _main_task(self): frame_data = np.frombuffer(frame.data, dtype=np.int16) remaining_samples = len(frame_data) + while 
remaining_samples > 0: to_copy = min(remaining_samples, og_needed_samples) - index = len(og_window_data) - og_needed_samples - og_window_data[index : index + to_copy] = frame_data[:to_copy] + window_index = og_window_size_samples - og_needed_samples + frame_index = len(frame_data) - remaining_samples + og_window_data[window_index : window_index + to_copy] = frame_data[ + frame_index : frame_index + to_copy + ] remaining_samples -= to_copy og_needed_samples -= to_copy @@ -183,45 +198,74 @@ async def _main_task(self): ) # run the inference - start_time = time.time() + start_time = time.perf_counter() raw_prob = await self._loop.run_in_executor( self._executor, self._model, inference_window_data ) + inference_duration = time.perf_counter() - start_time + prob_change = abs(raw_prob - self._exp_filter.filtered()) exp = 0.5 if prob_change > 0.25 else 1 raw_prob = self._exp_filter.apply(exp=exp, sample=raw_prob) - inference_duration = time.time() - start_time window_duration = ( self._model.window_size_samples / self._opts.sample_rate ) - if inference_duration > window_duration: + + self._extra_inference_time = max( + 0.0, + self._extra_inference_time + inference_duration - window_duration, + ) + if inference_duration > SLOW_INFERENCE_THRESHOLD: logger.warning( - "vad inference took too long - slower than realtime: %f", - inference_duration, + "inference is slower than realtime", + extra={"delay": self._extra_inference_time}, ) pub_current_sample += og_window_size_samples - def _copy_window(): + def _copy_inference_window(): nonlocal speech_buffer_index - to_copy = min( - og_window_size_samples, - len(speech_buffer) - speech_buffer_index, - ) + available_space = len(speech_buffer) - speech_buffer_index + to_copy = min(og_window_size_samples, available_space) if to_copy <= 0: - # max_buffered_speech reached - return + return # max_buffered_speech reached speech_buffer[ speech_buffer_index : speech_buffer_index + to_copy - ] = og_window_data - speech_buffer_index += 
og_window_size_samples + ] = og_window_data[:to_copy] + speech_buffer_index += to_copy + + def _reset_write_cursor(): + nonlocal speech_buffer_index + if speech_buffer_index <= og_padding_size_samples: + return + + padding_data = speech_buffer[ + speech_buffer_index + - og_padding_size_samples : speech_buffer_index + ] + + speech_buffer[:og_padding_size_samples] = padding_data + speech_buffer_index = og_padding_size_samples + + def _copy_speech_buffer() -> rtc.AudioFrame: + # copy the data from speech_buffer + assert speech_buffer is not None + speech_data = speech_buffer[:speech_buffer_index].tobytes() + + return rtc.AudioFrame( + sample_rate=og_sample_rate, + num_channels=1, + samples_per_channel=speech_buffer_index, + data=speech_data, + ) + + _copy_inference_window() if pub_speaking: pub_speech_duration += window_duration - _copy_window() else: pub_silence_duration += window_duration @@ -242,8 +286,6 @@ def _copy_window(): silence_threshold_duration = 0.0 if not pub_speaking: - _copy_window() - if speech_threshold_duration >= self._opts.min_speech_duration: pub_speaking = True pub_silence_duration = 0.0 @@ -255,6 +297,7 @@ def _copy_window(): samples_index=pub_current_sample, silence_duration=pub_silence_duration, speech_duration=pub_speech_duration, + frames=[_copy_speech_buffer()], speaking=True, ) ) @@ -263,37 +306,26 @@ def _copy_window(): speech_threshold_duration = 0.0 if not pub_speaking: - speech_buffer_index = 0 + _reset_write_cursor() if ( pub_speaking and silence_threshold_duration - >= self._opts.min_silence_duration + >= self._opts.min_silence_duration + self._opts.padding_duration ): pub_speaking = False pub_speech_duration = 0.0 pub_silence_duration = silence_threshold_duration - speech_data = speech_buffer[ - :speech_buffer_index - ].tobytes() # copy the data from speech_buffer - self._event_ch.send_nowait( agents.vad.VADEvent( type=agents.vad.VADEventType.END_OF_SPEECH, samples_index=pub_current_sample, silence_duration=pub_silence_duration, 
speech_duration=pub_speech_duration, - frames=[ - rtc.AudioFrame( - sample_rate=og_sample_rate, - num_channels=1, - samples_per_channel=speech_buffer_index, - data=speech_data, - ) - ], + frames=[_copy_speech_buffer()], speaking=False, ) ) - speech_buffer_index = 0 + _reset_write_cursor() diff --git a/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/version.py b/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/version.py index b315b98ad..4f1df5fb6 100644 --- a/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/version.py +++ b/livekit-plugins/livekit-plugins-silero/livekit/plugins/silero/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -__version__ = "0.6.3" +__version__ = "0.6.4" diff --git a/livekit-plugins/livekit-plugins-silero/package.json b/livekit-plugins/livekit-plugins-silero/package.json index 39ad000c4..5d0bc7ed4 100644 --- a/livekit-plugins/livekit-plugins-silero/package.json +++ b/livekit-plugins/livekit-plugins-silero/package.json @@ -1,5 +1,5 @@ { "name": "livekit-plugins-silero", "private": true, - "version": "0.6.3" + "version": "0.6.4" } diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index 635bf97df..3d3b0e9e1 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -17,37 +17,507 @@ importers: livekit-agents: {} + livekit-plugins/livekit-plugins-anthropic: {} + livekit-plugins/livekit-plugins-azure: {} - livekit-plugins/livekit-plugins-cartesia: {} + livekit-plugins/livekit-plugins-browser: {} + + livekit-plugins/livekit-plugins-cartesia: {} + + livekit-plugins/livekit-plugins-deepgram: {} + + livekit-plugins/livekit-plugins-elevenlabs: {} + + livekit-plugins/livekit-plugins-google: {} + + livekit-plugins/livekit-plugins-minimal: {} + + livekit-plugins/livekit-plugins-nltk: {} + + livekit-plugins/livekit-plugins-openai: {} + + livekit-plugins/livekit-plugins-rag: {} + + livekit-plugins/livekit-plugins-silero: {} + +packages: + + 
'@babel/runtime@7.24.8': + resolution: {integrity: sha512-5F7SDGs1T72ZczbRwbGO9lQi0NLjQxzl6i4lJxLxfW9U5UluCSyEJeniWvnhl3/euNiqQVbo8zruhsDfid0esA==} + engines: {node: '>=6.9.0'} + + '@changesets/apply-release-plan@7.0.4': + resolution: {integrity: sha512-HLFwhKWayKinWAul0Vj+76jVx1Pc2v55MGPVjZ924Y/ROeSsBMFutv9heHmCUj48lJyRfOTJG5+ar+29FUky/A==} + + '@changesets/assemble-release-plan@6.0.3': + resolution: {integrity: sha512-bLNh9/Lgl1VwkjWZTq8JmRqH+hj7/Yzfz0jsQ/zJJ+FTmVqmqPj3szeKOri8O/hEM8JmHW019vh2gTO9iq5Cuw==} + + '@changesets/changelog-git@0.2.0': + resolution: {integrity: sha512-bHOx97iFI4OClIT35Lok3sJAwM31VbUM++gnMBV16fdbtBhgYu4dxsphBF/0AZZsyAHMrnM0yFcj5gZM1py6uQ==} + + '@changesets/cli@2.27.7': + resolution: {integrity: sha512-6lr8JltiiXPIjDeYg4iM2MeePP6VN/JkmqBsVA5XRiy01hGS3y629LtSDvKcycj/w/5Eur1rEwby/MjcYS+e2A==} + hasBin: true + + '@changesets/config@3.0.2': + resolution: {integrity: sha512-cdEhS4t8woKCX2M8AotcV2BOWnBp09sqICxKapgLHf9m5KdENpWjyrFNMjkLqGJtUys9U+w93OxWT0czorVDfw==} + + '@changesets/errors@0.2.0': + resolution: {integrity: sha512-6BLOQUscTpZeGljvyQXlWOItQyU71kCdGz7Pi8H8zdw6BI0g3m43iL4xKUVPWtG+qrrL9DTjpdn8eYuCQSRpow==} + + '@changesets/get-dependents-graph@2.1.1': + resolution: {integrity: sha512-LRFjjvigBSzfnPU2n/AhFsuWR5DK++1x47aq6qZ8dzYsPtS/I5mNhIGAS68IAxh1xjO9BTtz55FwefhANZ+FCA==} + + '@changesets/get-github-info@0.5.2': + resolution: {integrity: sha512-JppheLu7S114aEs157fOZDjFqUDpm7eHdq5E8SSR0gUBTEK0cNSHsrSR5a66xs0z3RWuo46QvA3vawp8BxDHvg==} + + '@changesets/get-release-plan@4.0.3': + resolution: {integrity: sha512-6PLgvOIwTSdJPTtpdcr3sLtGatT+Jr22+cQwEBJBy6wP0rjB4yJ9lv583J9fVpn1bfQlBkDa8JxbS2g/n9lIyA==} + + '@changesets/get-version-range-type@0.4.0': + resolution: {integrity: sha512-hwawtob9DryoGTpixy1D3ZXbGgJu1Rhr+ySH2PvTLHvkZuQ7sRT4oQwMh0hbqZH1weAooedEjRsbrWcGLCeyVQ==} + + '@changesets/git@3.0.0': + resolution: {integrity: sha512-vvhnZDHe2eiBNRFHEgMiGd2CT+164dfYyrJDhwwxTVD/OW0FUD6G7+4DIx1dNwkwjHyzisxGAU96q0sVNBns0w==} + + 
'@changesets/logger@0.1.0': + resolution: {integrity: sha512-pBrJm4CQm9VqFVwWnSqKEfsS2ESnwqwH+xR7jETxIErZcfd1u2zBSqrHbRHR7xjhSgep9x2PSKFKY//FAshA3g==} + + '@changesets/parse@0.4.0': + resolution: {integrity: sha512-TS/9KG2CdGXS27S+QxbZXgr8uPsP4yNJYb4BC2/NeFUj80Rni3TeD2qwWmabymxmrLo7JEsytXH1FbpKTbvivw==} + + '@changesets/pre@2.0.0': + resolution: {integrity: sha512-HLTNYX/A4jZxc+Sq8D1AMBsv+1qD6rmmJtjsCJa/9MSRybdxh0mjbTvE6JYZQ/ZiQ0mMlDOlGPXTm9KLTU3jyw==} + + '@changesets/read@0.6.0': + resolution: {integrity: sha512-ZypqX8+/im1Fm98K4YcZtmLKgjs1kDQ5zHpc2U1qdtNBmZZfo/IBiG162RoP0CUF05tvp2y4IspH11PLnPxuuw==} + + '@changesets/should-skip-package@0.1.0': + resolution: {integrity: sha512-FxG6Mhjw7yFStlSM7Z0Gmg3RiyQ98d/9VpQAZ3Fzr59dCOM9G6ZdYbjiSAt0XtFr9JR5U2tBaJWPjrkGGc618g==} + + '@changesets/types@4.1.0': + resolution: {integrity: sha512-LDQvVDv5Kb50ny2s25Fhm3d9QSZimsoUGBsUioj6MC3qbMUCuC8GPIvk/M6IvXx3lYhAs0lwWUQLb+VIEUCECw==} + + '@changesets/types@5.2.1': + resolution: {integrity: sha512-myLfHbVOqaq9UtUKqR/nZA/OY7xFjQMdfgfqeZIBK4d0hA6pgxArvdv8M+6NUzzBsjWLOtvApv8YHr4qM+Kpfg==} + + '@changesets/types@6.0.0': + resolution: {integrity: sha512-b1UkfNulgKoWfqyHtzKS5fOZYSJO+77adgL7DLRDr+/7jhChN+QcHnbjiQVOz/U+Ts3PGNySq7diAItzDgugfQ==} + + '@changesets/write@0.3.1': + resolution: {integrity: sha512-SyGtMXzH3qFqlHKcvFY2eX+6b0NGiFcNav8AFsYwy5l8hejOeoeTDemu5Yjmke2V5jpzY+pBvM0vCCQ3gdZpfw==} + + '@livekit/changesets-changelog-github@0.0.4': + resolution: {integrity: sha512-MXaiLYwgkYciZb8G2wkVtZ1pJJzZmVx5cM30Q+ClslrIYyAqQhRbPmZDM79/5CGxb1MTemR/tfOM25tgJgAK0g==} + + '@manypkg/find-root@1.1.0': + resolution: {integrity: sha512-mki5uBvhHzO8kYYix/WRy2WX8S3B5wdVSc9D6KcU5lQNglP2yt58/VfLuAK49glRXChosY8ap2oJ1qgma3GUVA==} + + '@manypkg/get-packages@1.1.3': + resolution: {integrity: sha512-fo+QhuU3qE/2TQMQmbVMqaQ6EWbMhi4ABWP+O4AM1NqPBuy0OrApV5LO6BrrgnhtAHS2NH6RrVk9OL181tTi8A==} + + '@nodelib/fs.scandir@2.1.5': + resolution: {integrity: 
sha512-vq24Bq3ym5HEQm2NKCr3yXDwjc7vTsEThRDnkp2DK9p1uqLR+DHurm/NOTo0KG7HYHU7eppKZj3MyqYuMBf62g==} + engines: {node: '>= 8'} + + '@nodelib/fs.stat@2.0.5': + resolution: {integrity: sha512-RkhPPp2zrqDAQA/2jNhnztcPAlv64XdhIp7a7454A5ovI7Bukxgt7MX7udwAu3zg1DcpPU0rz3VV1SeaqvY4+A==} + engines: {node: '>= 8'} + + '@nodelib/fs.walk@1.2.8': + resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==} + engines: {node: '>= 8'} + + '@types/node@12.20.55': + resolution: {integrity: sha512-J8xLz7q2OFulZ2cyGTLE1TbbZcjpno7FaN6zdJNrgAdrJ+DZzh/uFR6YrTb4C+nXakvud8Q4+rbhoIWlYQbUFQ==} + + '@types/semver@7.5.8': + resolution: {integrity: sha512-I8EUhyrgfLrcTkzV3TSsGyl1tSuPrEDzr0yd5m90UgNxQkyDXULk3b6MlQqTCpZpNtWe1K0hzclnZkTcLBe2UQ==} + + ansi-colors@4.1.3: + resolution: {integrity: sha512-/6w/C21Pm1A7aZitlI5Ni/2J6FFQN8i1Cvz3kHABAAbw93v/NlvKdVOqz7CCWz/3iv/JplRSEEZ83XION15ovw==} + engines: {node: '>=6'} + + ansi-regex@5.0.1: + resolution: {integrity: sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ==} + engines: {node: '>=8'} + + ansi-styles@3.2.1: + resolution: {integrity: sha512-VT0ZI6kZRdTh8YyJw3SMbYm/u+NqfsAxEpWO0Pf9sq8/e94WxxOpPKx9FR1FlyCtOVDNOQ+8ntlqFxiRc+r5qA==} + engines: {node: '>=4'} + + argparse@1.0.10: + resolution: {integrity: sha512-o5Roy6tNG4SL/FOkCAN6RzjiakZS25RLYFrcMttJqbdd8BWrnA+fGz57iN5Pb06pvBGvl5gQ0B48dJlslXvoTg==} + + array-union@2.1.0: + resolution: {integrity: sha512-HGyxoOTYUyCM6stUe6EJgnd4EoewAI7zMdfqO+kGjnlZmBDz/cR5pf8r/cR4Wq60sL/p0IkcjUEEPwS3GFrIyw==} + engines: {node: '>=8'} + + better-path-resolve@1.0.0: + resolution: {integrity: sha512-pbnl5XzGBdrFU/wT4jqmJVPn2B6UHPBOhzMQkY/SPUPB6QtUXtmBHBIwCbXJol93mOpGMnQyP/+BB19q04xj7g==} + engines: {node: '>=4'} + + braces@3.0.3: + resolution: {integrity: sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==} + engines: {node: '>=8'} + + chalk@2.4.2: + resolution: {integrity: 
sha512-Mti+f9lpJNcwF4tWV8/OrTTtF1gZi+f8FqlyAdouralcFWFQWF2+NgCHShjkCb+IFBLq9buZwE1xckQU4peSuQ==} + engines: {node: '>=4'} + + chardet@0.7.0: + resolution: {integrity: sha512-mT8iDcrh03qDGRRmoA2hmBJnxpllMR+0/0qlzjqZES6NdiWDcZkCNAk4rPFZ9Q85r27unkiNNg8ZOiwZXBHwcA==} + + ci-info@3.9.0: + resolution: {integrity: sha512-NIxF55hv4nSqQswkAeiOi1r83xy8JldOFDTWiug55KBu9Jnblncd2U6ViHmYgHf01TPZS77NJBhBMKdWj9HQMQ==} + engines: {node: '>=8'} + + color-convert@1.9.3: + resolution: {integrity: sha512-QfAUtd+vFdAtFQcC8CCyYt1fYWxSqAiK2cSD6zDB8N3cpsEBAvRxp9zOGg6G/SHHJYAT88/az/IuDGALsNVbGg==} + + color-name@1.1.3: + resolution: {integrity: sha512-72fSenhMw2HZMTVHeCA9KCmpEIbzWiQsjN+BHcBbS9vr1mtt+vJjPdksIBNUmKAW8TFUDPJK5SUU3QhE9NEXDw==} + + cross-spawn@5.1.0: + resolution: {integrity: sha512-pTgQJ5KC0d2hcY8eyL1IzlBPYjTkyH72XRZPnLyKus2mBfNjQs3klqbJU2VILqZryAZUt9JOb3h/mWMy23/f5A==} + + dataloader@1.4.0: + resolution: {integrity: sha512-68s5jYdlvasItOJnCuI2Q9s4q98g0pCyL3HrcKJu8KNugUl8ahgmZYg38ysLTgQjjXX3H8CJLkAvWrclWfcalw==} + + detect-indent@6.1.0: + resolution: {integrity: sha512-reYkTUJAZb9gUuZ2RvVCNhVHdg62RHnJ7WJl8ftMi4diZ6NWlciOzQN88pUhSELEwflJht4oQDv0F0BMlwaYtA==} + engines: {node: '>=8'} + + dir-glob@3.0.1: + resolution: {integrity: sha512-WkrWp9GR4KXfKGYzOLmTuGVi1UWFfws377n9cc55/tb6DuqyF6pcQ5AbiHEshaDpY9v6oaSr2XCDidGmMwdzIA==} + engines: {node: '>=8'} + + dotenv@8.6.0: + resolution: {integrity: sha512-IrPdXQsk2BbzvCBGBOTmmSH5SodmqZNt4ERAZDmW4CT+tL8VtvinqywuANaFu4bOMWki16nqf0e4oC0QIaDr/g==} + engines: {node: '>=10'} + + enquirer@2.4.1: + resolution: {integrity: sha512-rRqJg/6gd538VHvR3PSrdRBb/1Vy2YfzHqzvbhGIQpDRKIa4FgV/54b5Q1xYSxOOwKvjXweS26E0Q+nAMwp2pQ==} + engines: {node: '>=8.6'} + + escape-string-regexp@1.0.5: + resolution: {integrity: sha512-vbRorB5FUQWvla16U8R/qgaFIya2qGzwDrNmCZuYKrbdSUMG6I1ZCGQRefkRVhuOkIGVne7BQ35DSfo1qvJqFg==} + engines: {node: '>=0.8.0'} + + esprima@4.0.1: + resolution: {integrity: 
sha512-eGuFFw7Upda+g4p+QHvnW0RyTX/SVeJBDM/gCtMARO0cLuT2HcEKnTPvhjV6aGeqrCB/sbNop0Kszm0jsaWU4A==} + engines: {node: '>=4'} + hasBin: true + + extendable-error@0.1.7: + resolution: {integrity: sha512-UOiS2in6/Q0FK0R0q6UY9vYpQ21mr/Qn1KOnte7vsACuNJf514WvCCUHSRCPcgjPT2bAhNIJdlE6bVap1GKmeg==} + + external-editor@3.1.0: + resolution: {integrity: sha512-hMQ4CX1p1izmuLYyZqLMO/qGNw10wSv9QDCPfzXfyFrOaCSSoRfqE1Kf1s5an66J5JZC62NewG+mK49jOCtQew==} + engines: {node: '>=4'} + + fast-glob@3.3.2: + resolution: {integrity: sha512-oX2ruAFQwf/Orj8m737Y5adxDQO0LAB7/S5MnxCdTNDd4p6BsyIVsv9JQsATbTSq8KHRpLwIHbVlUNatxd+1Ow==} + engines: {node: '>=8.6.0'} + + fastq@1.17.1: + resolution: {integrity: sha512-sRVD3lWVIXWg6By68ZN7vho9a1pQcN/WBFaAAsDDFzlJjvoGx0P8z7V1t72grFJfJhu3YPZBuu25f7Kaw2jN1w==} + + fill-range@7.1.1: + resolution: {integrity: sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==} + engines: {node: '>=8'} + + find-up@4.1.0: + resolution: {integrity: sha512-PpOwAdQ/YlXQ2vj8a3h8IipDuYRi3wceVQQGYWxNINccq40Anw7BlsEXCMbt1Zt+OLA6Fq9suIpIWD0OsnISlw==} + engines: {node: '>=8'} + + find-up@5.0.0: + resolution: {integrity: sha512-78/PXT1wlLLDgTzDs7sjq9hzz0vXD+zn+7wypEe4fXQxCmdmqfGsEPQxmiCSQI3ajFV91bVSsvNtrJRiW6nGng==} + engines: {node: '>=10'} + + find-yarn-workspace-root2@1.2.16: + resolution: {integrity: sha512-hr6hb1w8ePMpPVUK39S4RlwJzi+xPLuVuG8XlwXU3KD5Yn3qgBWVfy3AzNlDhWvE1EORCE65/Qm26rFQt3VLVA==} + + fs-extra@7.0.1: + resolution: {integrity: sha512-YJDaCJZEnBmcbw13fvdAM9AwNOJwOzrE4pqMqBq5nFiEqXUqHwlK4B+3pUw6JNvfSPtX05xFHtYy/1ni01eGCw==} + engines: {node: '>=6 <7 || >=8'} + + fs-extra@8.1.0: + resolution: {integrity: sha512-yhlQgA6mnOJUKOsRUFsgJdQCvkKhcz8tlZG5HBQfReYZy46OwLcY+Zia0mtdHsOo9y/hP+CxMN0TU9QxoOtG4g==} + engines: {node: '>=6 <7 || >=8'} + + glob-parent@5.1.2: + resolution: {integrity: sha512-AOIgSQCepiJYwP3ARnGx+5VnTu2HBYdzbGP45eLw1vr3zB3vZLeyed1sC9hnbcOc9/SrMyM5RPQrkGz4aS9Zow==} + engines: {node: '>= 6'} + + globby@11.1.0: + 
resolution: {integrity: sha512-jhIXaOzy1sb8IyocaruWSn1TjmnBVs8Ayhcy83rmxNJ8q2uWKCAj3CnJY+KpGSXCueAPc0i05kVvVKtP1t9S3g==} + engines: {node: '>=10'} + + graceful-fs@4.2.11: + resolution: {integrity: sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ==} + + has-flag@3.0.0: + resolution: {integrity: sha512-sKJf1+ceQBr4SMkvQnBDNDtf4TXpVhVGateu0t918bl30FnbE2m4vNLX+VWe/dpjlb+HugGYzW7uQXH98HPEYw==} + engines: {node: '>=4'} + + human-id@1.0.2: + resolution: {integrity: sha512-UNopramDEhHJD+VR+ehk8rOslwSfByxPIZyJRfV739NDhN5LF1fa1MqnzKm2lGTQRjNrjK19Q5fhkgIfjlVUKw==} + + iconv-lite@0.4.24: + resolution: {integrity: sha512-v3MXnZAcvnywkTUEZomIActle7RXXeedOR31wwl7VlyoXO4Qi9arvSenNQWne1TcRwhCL1HwLI21bEqdpj8/rA==} + engines: {node: '>=0.10.0'} + + ignore@5.3.1: + resolution: {integrity: sha512-5Fytz/IraMjqpwfd34ke28PTVMjZjJG2MPn5t7OE4eUCUNf8BAa7b5WUS9/Qvr6mwOQS7Mk6vdsMno5he+T8Xw==} + engines: {node: '>= 4'} + + is-extglob@2.1.1: + resolution: {integrity: sha512-SbKbANkN603Vi4jEZv49LeVJMn4yGwsbzZworEoyEiutsN3nJYdbO36zfhGJ6QEDpOZIFkDtnq5JRxmvl3jsoQ==} + engines: {node: '>=0.10.0'} + + is-glob@4.0.3: + resolution: {integrity: sha512-xelSayHH36ZgE7ZWhli7pW34hNbNl8Ojv5KVmkJD4hBdD3th8Tfk9vYasLM+mXWOZhFkgZfxhLSnrwRr4elSSg==} + engines: {node: '>=0.10.0'} + + is-number@7.0.0: + resolution: {integrity: sha512-41Cifkg6e8TylSpdtTpeLVMqvSBEVzTttHvERD741+pnZ8ANv0004MRL43QKPDlK9cGvNp6NZWZUBlbGXYxxng==} + engines: {node: '>=0.12.0'} + + is-subdir@1.2.0: + resolution: {integrity: sha512-2AT6j+gXe/1ueqbW6fLZJiIw3F8iXGJtt0yDrZaBhAZEG1raiTxKWU+IPqMCzQAXOUCKdA4UDMgacKH25XG2Cw==} + engines: {node: '>=4'} + + is-windows@1.0.2: + resolution: {integrity: sha512-eXK1UInq2bPmjyX6e3VHIzMLobc4J94i4AWn+Hpq3OU5KkrRC96OAcR3PRJ/pGu6m8TRnBHP9dkXQVsT/COVIA==} + engines: {node: '>=0.10.0'} + + isexe@2.0.0: + resolution: {integrity: sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==} + + js-yaml@3.14.1: + resolution: {integrity: 
sha512-okMH7OXXJ7YrN9Ok3/SXrnu4iX9yOk+25nqX4imS2npuvTYDmo/QEZoqwZkYaIDk3jVvBOTOIEgEhaLOynBS9g==} + hasBin: true + + jsonfile@4.0.0: + resolution: {integrity: sha512-m6F1R3z8jjlf2imQHS2Qez5sjKWQzbuuhuJ/FKYFRZvPE3PuHcSMVZzfsLhGVOkfd20obL5SWEBew5ShlquNxg==} + + load-yaml-file@0.2.0: + resolution: {integrity: sha512-OfCBkGEw4nN6JLtgRidPX6QxjBQGQf72q3si2uvqyFEMbycSFFHwAZeXx6cJgFM9wmLrf9zBwCP3Ivqa+LLZPw==} + engines: {node: '>=6'} + + locate-path@5.0.0: + resolution: {integrity: sha512-t7hw9pI+WvuwNJXwk5zVHpyhIqzg2qTlklJOf0mVxGSbe3Fp2VieZcduNYjaLDoy6p9uGpQEGWG87WpMKlNq8g==} + engines: {node: '>=8'} + + locate-path@6.0.0: + resolution: {integrity: sha512-iPZK6eYjbxRu3uB4/WZ3EsEIMJFMqAoopl3R+zuq0UjcAm/MO6KCweDgPfP3elTztoKP3KtnVHxTn2NHBSDVUw==} + engines: {node: '>=10'} + + lodash.startcase@4.4.0: + resolution: {integrity: sha512-+WKqsK294HMSc2jEbNgpHpd0JfIBhp7rEV4aqXWqFr6AlXov+SlcgB1Fv01y2kGe3Gc8nMW7VA0SrGuSkRfIEg==} + + lru-cache@4.1.5: + resolution: {integrity: sha512-sWZlbEP2OsHNkXrMl5GYk/jKk70MBng6UU4YI/qGDYbgf6YbP4EvmqISbXCoJiRKs+1bSpFHVgQxvJ17F2li5g==} + + merge2@1.4.1: + resolution: {integrity: sha512-8q7VEgMJW4J8tcfVPy8g09NcQwZdbwFEqhe/WZkoIzjn/3TGDwtOCYtXGxA3O8tPzpczCCDgv+P2P5y00ZJOOg==} + engines: {node: '>= 8'} + + micromatch@4.0.7: + resolution: {integrity: sha512-LPP/3KorzCwBxfeUuZmaR6bG2kdeHSbe0P2tY3FLRU4vYrjYz5hI4QZwV0njUx3jeuKe67YukQ1LSPZBKDqO/Q==} + engines: {node: '>=8.6'} + + mri@1.2.0: + resolution: {integrity: sha512-tzzskb3bG8LvYGFF/mDTpq3jpI6Q9wc3LEmBaghu+DdCssd1FakN7Bc0hVNmEyGq1bq3RgfkCb3cmQLpNPOroA==} + engines: {node: '>=4'} + + node-fetch@2.7.0: + resolution: {integrity: sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==} + engines: {node: 4.x || >=6.0.0} + peerDependencies: + encoding: ^0.1.0 + peerDependenciesMeta: + encoding: + optional: true + + os-tmpdir@1.0.2: + resolution: {integrity: sha512-D2FR03Vir7FIu45XBY20mTb+/ZSWB00sjU9jdQXt83gDrI4Ztz5Fs7/yy74g2N5SVQY4xY1qDr4rNddwYRVX0g==} + engines: 
{node: '>=0.10.0'} + + outdent@0.5.0: + resolution: {integrity: sha512-/jHxFIzoMXdqPzTaCpFzAAWhpkSjZPF4Vsn6jAfNpmbH/ymsmd7Qc6VE9BGn0L6YMj6uwpQLxCECpus4ukKS9Q==} + + p-filter@2.1.0: + resolution: {integrity: sha512-ZBxxZ5sL2HghephhpGAQdoskxplTwr7ICaehZwLIlfL6acuVgZPm8yBNuRAFBGEqtD/hmUeq9eqLg2ys9Xr/yw==} + engines: {node: '>=8'} + + p-limit@2.3.0: + resolution: {integrity: sha512-//88mFWSJx8lxCzwdAABTJL2MyWB12+eIY7MDL2SqLmAkeKU9qxRvWuSyTjm3FUmpBEMuFfckAIqEaVGUDxb6w==} + engines: {node: '>=6'} + + p-limit@3.1.0: + resolution: {integrity: sha512-TYOanM3wGwNGsZN2cVTYPArw454xnXj5qmWF1bEoAc4+cU/ol7GVh7odevjp1FNHduHc3KZMcFduxU5Xc6uJRQ==} + engines: {node: '>=10'} + + p-locate@4.1.0: + resolution: {integrity: sha512-R79ZZ/0wAxKGu3oYMlz8jy/kbhsNrS7SKZ7PxEHBgJ5+F2mtFW2fK2cOtBh1cHYkQsbzFV7I+EoRKe6Yt0oK7A==} + engines: {node: '>=8'} + + p-locate@5.0.0: + resolution: {integrity: sha512-LaNjtRWUBY++zB5nE/NwcaoMylSPk+S+ZHNB1TzdbMJMny6dynpAGt7X/tl/QYq3TIeE6nxHppbo2LGymrG5Pw==} + engines: {node: '>=10'} + + p-map@2.1.0: + resolution: {integrity: sha512-y3b8Kpd8OAN444hxfBbFfj1FY/RjtTd8tzYwhUqNYXx0fXx2iX4maP4Qr6qhIKbQXI02wTLAda4fYUbDagTUFw==} + engines: {node: '>=6'} + + p-try@2.2.0: + resolution: {integrity: sha512-R4nPAVTAU0B9D35/Gk3uJf/7XYbQcyohSKdvAxIRSNghFl4e71hVoGnBNQz9cWaXxO2I10KTC+3jMdvvoKw6dQ==} + engines: {node: '>=6'} + + path-exists@4.0.0: + resolution: {integrity: sha512-ak9Qy5Q7jYb2Wwcey5Fpvg2KoAc/ZIhLSLOSBmRmygPsGwkVVt0fZa0qrtMz+m6tJTAHfZQ8FnmB4MG4LWy7/w==} + engines: {node: '>=8'} + + path-type@4.0.0: + resolution: {integrity: sha512-gDKb8aZMDeD/tZWs9P6+q0J9Mwkdl6xMV8TjnGP3qJVJ06bdMgkbBlLU8IdfOsIsFz2BW1rNVT3XuNEl8zPAvw==} + engines: {node: '>=8'} + + picomatch@2.3.1: + resolution: {integrity: sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA==} + engines: {node: '>=8.6'} + + pify@4.0.1: + resolution: {integrity: sha512-uB80kBFb/tfd68bVleG9T5GGsGPjJrLAUpR5PZIrhBnIaRTQRjqdJSsIKkOP6OAIFbj7GOrcudc5pNjZ+geV2g==} + engines: {node: 
'>=6'} + + pkg-dir@4.2.0: + resolution: {integrity: sha512-HRDzbaKjC+AOWVXxAU/x54COGeIv9eb+6CkDSQoNTt4XyWoIJvuPsXizxu/Fr23EiekbtZwmh1IcIG/l/a10GQ==} + engines: {node: '>=8'} + + preferred-pm@3.1.4: + resolution: {integrity: sha512-lEHd+yEm22jXdCphDrkvIJQU66EuLojPPtvZkpKIkiD+l0DMThF/niqZKJSoU8Vl7iuvtmzyMhir9LdVy5WMnA==} + engines: {node: '>=10'} + + prettier@2.8.8: + resolution: {integrity: sha512-tdN8qQGvNjw4CHbY+XXk0JgCXn9QiF21a55rBe5LJAU+kDyC4WQn4+awm2Xfk2lQMk5fKup9XgzTZtGkjBdP9Q==} + engines: {node: '>=10.13.0'} + hasBin: true + + pseudomap@1.0.2: + resolution: {integrity: sha512-b/YwNhb8lk1Zz2+bXXpS/LK9OisiZZ1SNsSLxN1x2OXVEhW2Ckr/7mWE5vrC1ZTiJlD9g19jWszTmJsB+oEpFQ==} + + queue-microtask@1.2.3: + resolution: {integrity: sha512-NuaNSa6flKT5JaSYQzJok04JzTL1CA6aGhv5rfLW3PgqA+M2ChpZQnAC8h8i4ZFkBS8X5RqkDBHA7r4hej3K9A==} + + read-yaml-file@1.1.0: + resolution: {integrity: sha512-VIMnQi/Z4HT2Fxuwg5KrY174U1VdUIASQVWXXyqtNRtxSr9IYkn1rsI6Tb6HsrHCmB7gVpNwX6JxPTHcH6IoTA==} + engines: {node: '>=6'} + + regenerator-runtime@0.14.1: + resolution: {integrity: sha512-dYnhHh0nJoMfnkZs6GmmhFknAGRrLznOu5nc9ML+EJxGvrx6H7teuevqVqCuPcPK//3eDrrjQhehXVx9cnkGdw==} + + resolve-from@5.0.0: + resolution: {integrity: sha512-qYg9KP24dD5qka9J47d0aVky0N+b4fTU89LN9iDnjB5waksiC49rvMB0PrUJQGoTmH50XPiqOvAjDfaijGxYZw==} + engines: {node: '>=8'} + + reusify@1.0.4: + resolution: {integrity: sha512-U9nH88a3fc/ekCF1l0/UP1IosiuIjyTh7hBvXVMHYgVcfGvt897Xguj2UOLDeI5BG2m7/uwyaLVT6fbtCwTyzw==} + engines: {iojs: '>=1.0.0', node: '>=0.10.0'} + + run-parallel@1.2.0: + resolution: {integrity: sha512-5l4VyZR86LZ/lDxZTR6jqL8AFE2S0IFLMP26AbjsLVADxHdhB/c0GUsH+y39UfCi3dzz8OlQuPmnaJOMoDHQBA==} + + safer-buffer@2.1.2: + resolution: {integrity: sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg==} + + semver@7.6.3: + resolution: {integrity: sha512-oVekP1cKtI+CTDvHWYFUcMtsK/00wmAEfyqKfNdARm8u1wNVhSgaX7A8d4UuIlUI5e84iEwOhs7ZPYRmzU9U6A==} + engines: {node: '>=10'} + hasBin: true + + 
shebang-command@1.2.0: + resolution: {integrity: sha512-EV3L1+UQWGor21OmnvojK36mhg+TyIKDh3iFBKBohr5xeXIhNBcx8oWdgkTEEQ+BEFFYdLRuqMfd5L84N1V5Vg==} + engines: {node: '>=0.10.0'} + + shebang-regex@1.0.0: + resolution: {integrity: sha512-wpoSFAxys6b2a2wHZ1XpDSgD7N9iVjg29Ph9uV/uaP9Ex/KXlkTZTeddxDPSYQpgvzKLGJke2UU0AzoGCjNIvQ==} + engines: {node: '>=0.10.0'} + + signal-exit@3.0.7: + resolution: {integrity: sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==} + + slash@3.0.0: + resolution: {integrity: sha512-g9Q1haeby36OSStwb4ntCGGGaKsaVSjQ68fBxoQcutl5fS1vuY18H3wSt3jFyFtrkx+Kz0V1G85A4MyAdDMi2Q==} + engines: {node: '>=8'} + + spawndamnit@2.0.0: + resolution: {integrity: sha512-j4JKEcncSjFlqIwU5L/rp2N5SIPsdxaRsIv678+TZxZ0SRDJTm8JrxJMjE/XuiEZNEir3S8l0Fa3Ke339WI4qA==} - livekit-plugins/livekit-plugins-deepgram: {} + sprintf-js@1.0.3: + resolution: {integrity: sha512-D9cPgkvLlV3t3IzL0D0YLvGA9Ahk4PcvVwUbN0dSGr1aP0Nrt4AEnTUbuGvquEC0mA64Gqt1fzirlRs5ibXx8g==} - livekit-plugins/livekit-plugins-elevenlabs: {} + strip-ansi@6.0.1: + resolution: {integrity: sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==} + engines: {node: '>=8'} - livekit-plugins/livekit-plugins-google: {} + strip-bom@3.0.0: + resolution: {integrity: sha512-vavAMRXOgBVNF6nyEEmL3DBK19iRpDcoIwW+swQ+CbGiu7lju6t+JklA1MHweoWtadgt4ISVUsXLyDq34ddcwA==} + engines: {node: '>=4'} - livekit-plugins/livekit-plugins-minimal: {} + supports-color@5.5.0: + resolution: {integrity: sha512-QjVjwdXIt408MIiAqCX4oUKsgU2EqAGzs2Ppkm4aQYbjm+ZEWEcW4SfFNTr4uMNZma0ey4f5lgLrkB0aX0QMow==} + engines: {node: '>=4'} - livekit-plugins/livekit-plugins-nltk: {} + term-size@2.2.1: + resolution: {integrity: sha512-wK0Ri4fOGjv/XPy8SBHZChl8CM7uMc5VML7SqiQ0zG7+J5Vr+RMQDoHa2CNT6KHUnTGIXH34UDMkPzAUyapBZg==} + engines: {node: '>=8'} - livekit-plugins/livekit-plugins-openai: {} + tmp@0.0.33: + resolution: {integrity: 
sha512-jRCJlojKnZ3addtTOjdIqoRuPEKBvNXcGYqzO6zWZX8KfKEpnGY5jfggJQ3EjKuu8D4bJRr0y+cYJFmYbImXGw==} + engines: {node: '>=0.6.0'} - livekit-plugins/livekit-plugins-rag: {} + to-regex-range@5.0.1: + resolution: {integrity: sha512-65P7iz6X5yEr1cwcgvQxbbIw7Uk3gOy5dIdtZ4rDveLqhrdJP+Li/Hx6tyK0NEb+2GCyneCMJiGqrADCSNk8sQ==} + engines: {node: '>=8.0'} - livekit-plugins/livekit-plugins-silero: {} + tr46@0.0.3: + resolution: {integrity: sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==} -packages: + universalify@0.1.2: + resolution: {integrity: sha512-rBJeI5CXAlmy1pV+617WB9J63U6XcazHHF2f2dbJix4XzpUF0RS3Zbj0FGIOCAva5P/d/GBOYaACQ1w+0azUkg==} + engines: {node: '>= 4.0.0'} - /@babel/runtime@7.24.8: - resolution: {integrity: sha512-5F7SDGs1T72ZczbRwbGO9lQi0NLjQxzl6i4lJxLxfW9U5UluCSyEJeniWvnhl3/euNiqQVbo8zruhsDfid0esA==} - engines: {node: '>=6.9.0'} + webidl-conversions@3.0.1: + resolution: {integrity: sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==} + + whatwg-url@5.0.0: + resolution: {integrity: sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==} + + which-pm@2.2.0: + resolution: {integrity: sha512-MOiaDbA5ZZgUjkeMWM5EkJp4loW5ZRoa5bc3/aeMox/PJelMhE6t7S/mLuiY43DBupyxH+S0U1bTui9kWUlmsw==} + engines: {node: '>=8.15'} + + which@1.3.1: + resolution: {integrity: sha512-HxJdYWq1MTIQbJ3nw0cqssHoTNU267KlrDuGZ1WYlxDStUtKUhOaJmh112/TZmHxxUfuJqPXSOm7tDyas0OSIQ==} + hasBin: true + + yallist@2.1.2: + resolution: {integrity: sha512-ncTzHV7NvsQZkYe1DW7cbDLm0YpzHmZF5r/iyP3ZnQtMiJ+pjzisCiMNI+Sj+xQF5pXhSHxSB3uDbsBTzY/c2A==} + + yocto-queue@0.1.0: + resolution: {integrity: sha512-rVksvsnNCdJ/ohGc6xgPwyN8eheCxsiLM8mxuE/t/mOVqJewPuO1miLpTHQiRgTKCLexL4MeAFVagts7HmNZ2Q==} + engines: {node: '>=10'} + +snapshots: + + '@babel/runtime@7.24.8': dependencies: regenerator-runtime: 0.14.1 - dev: true - /@changesets/apply-release-plan@7.0.4: - resolution: {integrity: 
sha512-HLFwhKWayKinWAul0Vj+76jVx1Pc2v55MGPVjZ924Y/ROeSsBMFutv9heHmCUj48lJyRfOTJG5+ar+29FUky/A==} + '@changesets/apply-release-plan@7.0.4': dependencies: '@babel/runtime': 7.24.8 '@changesets/config': 3.0.2 @@ -63,10 +533,8 @@ packages: prettier: 2.8.8 resolve-from: 5.0.0 semver: 7.6.3 - dev: true - /@changesets/assemble-release-plan@6.0.3: - resolution: {integrity: sha512-bLNh9/Lgl1VwkjWZTq8JmRqH+hj7/Yzfz0jsQ/zJJ+FTmVqmqPj3szeKOri8O/hEM8JmHW019vh2gTO9iq5Cuw==} + '@changesets/assemble-release-plan@6.0.3': dependencies: '@babel/runtime': 7.24.8 '@changesets/errors': 0.2.0 @@ -75,17 +543,12 @@ packages: '@changesets/types': 6.0.0 '@manypkg/get-packages': 1.1.3 semver: 7.6.3 - dev: true - /@changesets/changelog-git@0.2.0: - resolution: {integrity: sha512-bHOx97iFI4OClIT35Lok3sJAwM31VbUM++gnMBV16fdbtBhgYu4dxsphBF/0AZZsyAHMrnM0yFcj5gZM1py6uQ==} + '@changesets/changelog-git@0.2.0': dependencies: '@changesets/types': 6.0.0 - dev: true - /@changesets/cli@2.27.7: - resolution: {integrity: sha512-6lr8JltiiXPIjDeYg4iM2MeePP6VN/JkmqBsVA5XRiy01hGS3y629LtSDvKcycj/w/5Eur1rEwby/MjcYS+e2A==} - hasBin: true + '@changesets/cli@2.27.7': dependencies: '@babel/runtime': 7.24.8 '@changesets/apply-release-plan': 7.0.4 @@ -119,10 +582,8 @@ packages: semver: 7.6.3 spawndamnit: 2.0.0 term-size: 2.2.1 - dev: true - /@changesets/config@3.0.2: - resolution: {integrity: sha512-cdEhS4t8woKCX2M8AotcV2BOWnBp09sqICxKapgLHf9m5KdENpWjyrFNMjkLqGJtUys9U+w93OxWT0czorVDfw==} + '@changesets/config@3.0.2': dependencies: '@changesets/errors': 0.2.0 '@changesets/get-dependents-graph': 2.1.1 @@ -131,35 +592,27 @@ packages: '@manypkg/get-packages': 1.1.3 fs-extra: 7.0.1 micromatch: 4.0.7 - dev: true - /@changesets/errors@0.2.0: - resolution: {integrity: sha512-6BLOQUscTpZeGljvyQXlWOItQyU71kCdGz7Pi8H8zdw6BI0g3m43iL4xKUVPWtG+qrrL9DTjpdn8eYuCQSRpow==} + '@changesets/errors@0.2.0': dependencies: extendable-error: 0.1.7 - dev: true - /@changesets/get-dependents-graph@2.1.1: - resolution: {integrity: 
sha512-LRFjjvigBSzfnPU2n/AhFsuWR5DK++1x47aq6qZ8dzYsPtS/I5mNhIGAS68IAxh1xjO9BTtz55FwefhANZ+FCA==} + '@changesets/get-dependents-graph@2.1.1': dependencies: '@changesets/types': 6.0.0 '@manypkg/get-packages': 1.1.3 chalk: 2.4.2 fs-extra: 7.0.1 semver: 7.6.3 - dev: true - /@changesets/get-github-info@0.5.2: - resolution: {integrity: sha512-JppheLu7S114aEs157fOZDjFqUDpm7eHdq5E8SSR0gUBTEK0cNSHsrSR5a66xs0z3RWuo46QvA3vawp8BxDHvg==} + '@changesets/get-github-info@0.5.2': dependencies: dataloader: 1.4.0 node-fetch: 2.7.0 transitivePeerDependencies: - encoding - dev: true - /@changesets/get-release-plan@4.0.3: - resolution: {integrity: sha512-6PLgvOIwTSdJPTtpdcr3sLtGatT+Jr22+cQwEBJBy6wP0rjB4yJ9lv583J9fVpn1bfQlBkDa8JxbS2g/n9lIyA==} + '@changesets/get-release-plan@4.0.3': dependencies: '@babel/runtime': 7.24.8 '@changesets/assemble-release-plan': 6.0.3 @@ -168,14 +621,10 @@ packages: '@changesets/read': 0.6.0 '@changesets/types': 6.0.0 '@manypkg/get-packages': 1.1.3 - dev: true - /@changesets/get-version-range-type@0.4.0: - resolution: {integrity: sha512-hwawtob9DryoGTpixy1D3ZXbGgJu1Rhr+ySH2PvTLHvkZuQ7sRT4oQwMh0hbqZH1weAooedEjRsbrWcGLCeyVQ==} - dev: true + '@changesets/get-version-range-type@0.4.0': {} - /@changesets/git@3.0.0: - resolution: {integrity: sha512-vvhnZDHe2eiBNRFHEgMiGd2CT+164dfYyrJDhwwxTVD/OW0FUD6G7+4DIx1dNwkwjHyzisxGAU96q0sVNBns0w==} + '@changesets/git@3.0.0': dependencies: '@babel/runtime': 7.24.8 '@changesets/errors': 0.2.0 @@ -184,33 +633,25 @@ packages: is-subdir: 1.2.0 micromatch: 4.0.7 spawndamnit: 2.0.0 - dev: true - /@changesets/logger@0.1.0: - resolution: {integrity: sha512-pBrJm4CQm9VqFVwWnSqKEfsS2ESnwqwH+xR7jETxIErZcfd1u2zBSqrHbRHR7xjhSgep9x2PSKFKY//FAshA3g==} + '@changesets/logger@0.1.0': dependencies: chalk: 2.4.2 - dev: true - /@changesets/parse@0.4.0: - resolution: {integrity: sha512-TS/9KG2CdGXS27S+QxbZXgr8uPsP4yNJYb4BC2/NeFUj80Rni3TeD2qwWmabymxmrLo7JEsytXH1FbpKTbvivw==} + '@changesets/parse@0.4.0': dependencies: '@changesets/types': 6.0.0 
js-yaml: 3.14.1 - dev: true - /@changesets/pre@2.0.0: - resolution: {integrity: sha512-HLTNYX/A4jZxc+Sq8D1AMBsv+1qD6rmmJtjsCJa/9MSRybdxh0mjbTvE6JYZQ/ZiQ0mMlDOlGPXTm9KLTU3jyw==} + '@changesets/pre@2.0.0': dependencies: '@babel/runtime': 7.24.8 '@changesets/errors': 0.2.0 '@changesets/types': 6.0.0 '@manypkg/get-packages': 1.1.3 fs-extra: 7.0.1 - dev: true - /@changesets/read@0.6.0: - resolution: {integrity: sha512-ZypqX8+/im1Fm98K4YcZtmLKgjs1kDQ5zHpc2U1qdtNBmZZfo/IBiG162RoP0CUF05tvp2y4IspH11PLnPxuuw==} + '@changesets/read@0.6.0': dependencies: '@babel/runtime': 7.24.8 '@changesets/git': 3.0.0 @@ -220,59 +661,43 @@ packages: chalk: 2.4.2 fs-extra: 7.0.1 p-filter: 2.1.0 - dev: true - /@changesets/should-skip-package@0.1.0: - resolution: {integrity: sha512-FxG6Mhjw7yFStlSM7Z0Gmg3RiyQ98d/9VpQAZ3Fzr59dCOM9G6ZdYbjiSAt0XtFr9JR5U2tBaJWPjrkGGc618g==} + '@changesets/should-skip-package@0.1.0': dependencies: '@babel/runtime': 7.24.8 '@changesets/types': 6.0.0 '@manypkg/get-packages': 1.1.3 - dev: true - /@changesets/types@4.1.0: - resolution: {integrity: sha512-LDQvVDv5Kb50ny2s25Fhm3d9QSZimsoUGBsUioj6MC3qbMUCuC8GPIvk/M6IvXx3lYhAs0lwWUQLb+VIEUCECw==} - dev: true + '@changesets/types@4.1.0': {} - /@changesets/types@5.2.1: - resolution: {integrity: sha512-myLfHbVOqaq9UtUKqR/nZA/OY7xFjQMdfgfqeZIBK4d0hA6pgxArvdv8M+6NUzzBsjWLOtvApv8YHr4qM+Kpfg==} - dev: true + '@changesets/types@5.2.1': {} - /@changesets/types@6.0.0: - resolution: {integrity: sha512-b1UkfNulgKoWfqyHtzKS5fOZYSJO+77adgL7DLRDr+/7jhChN+QcHnbjiQVOz/U+Ts3PGNySq7diAItzDgugfQ==} - dev: true + '@changesets/types@6.0.0': {} - /@changesets/write@0.3.1: - resolution: {integrity: sha512-SyGtMXzH3qFqlHKcvFY2eX+6b0NGiFcNav8AFsYwy5l8hejOeoeTDemu5Yjmke2V5jpzY+pBvM0vCCQ3gdZpfw==} + '@changesets/write@0.3.1': dependencies: '@babel/runtime': 7.24.8 '@changesets/types': 6.0.0 fs-extra: 7.0.1 human-id: 1.0.2 prettier: 2.8.8 - dev: true - /@livekit/changesets-changelog-github@0.0.4: - resolution: {integrity: 
sha512-MXaiLYwgkYciZb8G2wkVtZ1pJJzZmVx5cM30Q+ClslrIYyAqQhRbPmZDM79/5CGxb1MTemR/tfOM25tgJgAK0g==} + '@livekit/changesets-changelog-github@0.0.4': dependencies: '@changesets/get-github-info': 0.5.2 '@changesets/types': 5.2.1 dotenv: 8.6.0 transitivePeerDependencies: - encoding - dev: true - /@manypkg/find-root@1.1.0: - resolution: {integrity: sha512-mki5uBvhHzO8kYYix/WRy2WX8S3B5wdVSc9D6KcU5lQNglP2yt58/VfLuAK49glRXChosY8ap2oJ1qgma3GUVA==} + '@manypkg/find-root@1.1.0': dependencies: '@babel/runtime': 7.24.8 '@types/node': 12.20.55 find-up: 4.1.0 fs-extra: 8.1.0 - dev: true - /@manypkg/get-packages@1.1.3: - resolution: {integrity: sha512-fo+QhuU3qE/2TQMQmbVMqaQ6EWbMhi4ABWP+O4AM1NqPBuy0OrApV5LO6BrrgnhtAHS2NH6RrVk9OL181tTi8A==} + '@manypkg/get-packages@1.1.3': dependencies: '@babel/runtime': 7.24.8 '@changesets/types': 4.1.0 @@ -280,243 +705,142 @@ packages: fs-extra: 8.1.0 globby: 11.1.0 read-yaml-file: 1.1.0 - dev: true - /@nodelib/fs.scandir@2.1.5: - resolution: {integrity: sha512-vq24Bq3ym5HEQm2NKCr3yXDwjc7vTsEThRDnkp2DK9p1uqLR+DHurm/NOTo0KG7HYHU7eppKZj3MyqYuMBf62g==} - engines: {node: '>= 8'} + '@nodelib/fs.scandir@2.1.5': dependencies: '@nodelib/fs.stat': 2.0.5 run-parallel: 1.2.0 - dev: true - /@nodelib/fs.stat@2.0.5: - resolution: {integrity: sha512-RkhPPp2zrqDAQA/2jNhnztcPAlv64XdhIp7a7454A5ovI7Bukxgt7MX7udwAu3zg1DcpPU0rz3VV1SeaqvY4+A==} - engines: {node: '>= 8'} - dev: true + '@nodelib/fs.stat@2.0.5': {} - /@nodelib/fs.walk@1.2.8: - resolution: {integrity: sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg==} - engines: {node: '>= 8'} + '@nodelib/fs.walk@1.2.8': dependencies: '@nodelib/fs.scandir': 2.1.5 fastq: 1.17.1 - dev: true - /@types/node@12.20.55: - resolution: {integrity: sha512-J8xLz7q2OFulZ2cyGTLE1TbbZcjpno7FaN6zdJNrgAdrJ+DZzh/uFR6YrTb4C+nXakvud8Q4+rbhoIWlYQbUFQ==} - dev: true + '@types/node@12.20.55': {} - /@types/semver@7.5.8: - resolution: {integrity: 
sha512-I8EUhyrgfLrcTkzV3TSsGyl1tSuPrEDzr0yd5m90UgNxQkyDXULk3b6MlQqTCpZpNtWe1K0hzclnZkTcLBe2UQ==} - dev: true + '@types/semver@7.5.8': {} - /ansi-colors@4.1.3: - resolution: {integrity: sha512-/6w/C21Pm1A7aZitlI5Ni/2J6FFQN8i1Cvz3kHABAAbw93v/NlvKdVOqz7CCWz/3iv/JplRSEEZ83XION15ovw==} - engines: {node: '>=6'} - dev: true + ansi-colors@4.1.3: {} - /ansi-regex@5.0.1: - resolution: {integrity: sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ==} - engines: {node: '>=8'} - dev: true + ansi-regex@5.0.1: {} - /ansi-styles@3.2.1: - resolution: {integrity: sha512-VT0ZI6kZRdTh8YyJw3SMbYm/u+NqfsAxEpWO0Pf9sq8/e94WxxOpPKx9FR1FlyCtOVDNOQ+8ntlqFxiRc+r5qA==} - engines: {node: '>=4'} + ansi-styles@3.2.1: dependencies: color-convert: 1.9.3 - dev: true - /argparse@1.0.10: - resolution: {integrity: sha512-o5Roy6tNG4SL/FOkCAN6RzjiakZS25RLYFrcMttJqbdd8BWrnA+fGz57iN5Pb06pvBGvl5gQ0B48dJlslXvoTg==} + argparse@1.0.10: dependencies: sprintf-js: 1.0.3 - dev: true - /array-union@2.1.0: - resolution: {integrity: sha512-HGyxoOTYUyCM6stUe6EJgnd4EoewAI7zMdfqO+kGjnlZmBDz/cR5pf8r/cR4Wq60sL/p0IkcjUEEPwS3GFrIyw==} - engines: {node: '>=8'} - dev: true + array-union@2.1.0: {} - /better-path-resolve@1.0.0: - resolution: {integrity: sha512-pbnl5XzGBdrFU/wT4jqmJVPn2B6UHPBOhzMQkY/SPUPB6QtUXtmBHBIwCbXJol93mOpGMnQyP/+BB19q04xj7g==} - engines: {node: '>=4'} + better-path-resolve@1.0.0: dependencies: is-windows: 1.0.2 - dev: true - /braces@3.0.3: - resolution: {integrity: sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==} - engines: {node: '>=8'} + braces@3.0.3: dependencies: fill-range: 7.1.1 - dev: true - /chalk@2.4.2: - resolution: {integrity: sha512-Mti+f9lpJNcwF4tWV8/OrTTtF1gZi+f8FqlyAdouralcFWFQWF2+NgCHShjkCb+IFBLq9buZwE1xckQU4peSuQ==} - engines: {node: '>=4'} + chalk@2.4.2: dependencies: ansi-styles: 3.2.1 escape-string-regexp: 1.0.5 supports-color: 5.5.0 - dev: true - /chardet@0.7.0: - resolution: {integrity: 
sha512-mT8iDcrh03qDGRRmoA2hmBJnxpllMR+0/0qlzjqZES6NdiWDcZkCNAk4rPFZ9Q85r27unkiNNg8ZOiwZXBHwcA==} - dev: true + chardet@0.7.0: {} - /ci-info@3.9.0: - resolution: {integrity: sha512-NIxF55hv4nSqQswkAeiOi1r83xy8JldOFDTWiug55KBu9Jnblncd2U6ViHmYgHf01TPZS77NJBhBMKdWj9HQMQ==} - engines: {node: '>=8'} - dev: true + ci-info@3.9.0: {} - /color-convert@1.9.3: - resolution: {integrity: sha512-QfAUtd+vFdAtFQcC8CCyYt1fYWxSqAiK2cSD6zDB8N3cpsEBAvRxp9zOGg6G/SHHJYAT88/az/IuDGALsNVbGg==} + color-convert@1.9.3: dependencies: color-name: 1.1.3 - dev: true - /color-name@1.1.3: - resolution: {integrity: sha512-72fSenhMw2HZMTVHeCA9KCmpEIbzWiQsjN+BHcBbS9vr1mtt+vJjPdksIBNUmKAW8TFUDPJK5SUU3QhE9NEXDw==} - dev: true + color-name@1.1.3: {} - /cross-spawn@5.1.0: - resolution: {integrity: sha512-pTgQJ5KC0d2hcY8eyL1IzlBPYjTkyH72XRZPnLyKus2mBfNjQs3klqbJU2VILqZryAZUt9JOb3h/mWMy23/f5A==} + cross-spawn@5.1.0: dependencies: lru-cache: 4.1.5 shebang-command: 1.2.0 which: 1.3.1 - dev: true - /dataloader@1.4.0: - resolution: {integrity: sha512-68s5jYdlvasItOJnCuI2Q9s4q98g0pCyL3HrcKJu8KNugUl8ahgmZYg38ysLTgQjjXX3H8CJLkAvWrclWfcalw==} - dev: true + dataloader@1.4.0: {} - /detect-indent@6.1.0: - resolution: {integrity: sha512-reYkTUJAZb9gUuZ2RvVCNhVHdg62RHnJ7WJl8ftMi4diZ6NWlciOzQN88pUhSELEwflJht4oQDv0F0BMlwaYtA==} - engines: {node: '>=8'} - dev: true + detect-indent@6.1.0: {} - /dir-glob@3.0.1: - resolution: {integrity: sha512-WkrWp9GR4KXfKGYzOLmTuGVi1UWFfws377n9cc55/tb6DuqyF6pcQ5AbiHEshaDpY9v6oaSr2XCDidGmMwdzIA==} - engines: {node: '>=8'} + dir-glob@3.0.1: dependencies: path-type: 4.0.0 - dev: true - /dotenv@8.6.0: - resolution: {integrity: sha512-IrPdXQsk2BbzvCBGBOTmmSH5SodmqZNt4ERAZDmW4CT+tL8VtvinqywuANaFu4bOMWki16nqf0e4oC0QIaDr/g==} - engines: {node: '>=10'} - dev: true + dotenv@8.6.0: {} - /enquirer@2.4.1: - resolution: {integrity: sha512-rRqJg/6gd538VHvR3PSrdRBb/1Vy2YfzHqzvbhGIQpDRKIa4FgV/54b5Q1xYSxOOwKvjXweS26E0Q+nAMwp2pQ==} - engines: {node: '>=8.6'} + enquirer@2.4.1: dependencies: ansi-colors: 4.1.3 
strip-ansi: 6.0.1 - dev: true - /escape-string-regexp@1.0.5: - resolution: {integrity: sha512-vbRorB5FUQWvla16U8R/qgaFIya2qGzwDrNmCZuYKrbdSUMG6I1ZCGQRefkRVhuOkIGVne7BQ35DSfo1qvJqFg==} - engines: {node: '>=0.8.0'} - dev: true + escape-string-regexp@1.0.5: {} - /esprima@4.0.1: - resolution: {integrity: sha512-eGuFFw7Upda+g4p+QHvnW0RyTX/SVeJBDM/gCtMARO0cLuT2HcEKnTPvhjV6aGeqrCB/sbNop0Kszm0jsaWU4A==} - engines: {node: '>=4'} - hasBin: true - dev: true + esprima@4.0.1: {} - /extendable-error@0.1.7: - resolution: {integrity: sha512-UOiS2in6/Q0FK0R0q6UY9vYpQ21mr/Qn1KOnte7vsACuNJf514WvCCUHSRCPcgjPT2bAhNIJdlE6bVap1GKmeg==} - dev: true + extendable-error@0.1.7: {} - /external-editor@3.1.0: - resolution: {integrity: sha512-hMQ4CX1p1izmuLYyZqLMO/qGNw10wSv9QDCPfzXfyFrOaCSSoRfqE1Kf1s5an66J5JZC62NewG+mK49jOCtQew==} - engines: {node: '>=4'} + external-editor@3.1.0: dependencies: chardet: 0.7.0 iconv-lite: 0.4.24 tmp: 0.0.33 - dev: true - /fast-glob@3.3.2: - resolution: {integrity: sha512-oX2ruAFQwf/Orj8m737Y5adxDQO0LAB7/S5MnxCdTNDd4p6BsyIVsv9JQsATbTSq8KHRpLwIHbVlUNatxd+1Ow==} - engines: {node: '>=8.6.0'} + fast-glob@3.3.2: dependencies: '@nodelib/fs.stat': 2.0.5 '@nodelib/fs.walk': 1.2.8 glob-parent: 5.1.2 merge2: 1.4.1 micromatch: 4.0.7 - dev: true - /fastq@1.17.1: - resolution: {integrity: sha512-sRVD3lWVIXWg6By68ZN7vho9a1pQcN/WBFaAAsDDFzlJjvoGx0P8z7V1t72grFJfJhu3YPZBuu25f7Kaw2jN1w==} + fastq@1.17.1: dependencies: reusify: 1.0.4 - dev: true - /fill-range@7.1.1: - resolution: {integrity: sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==} - engines: {node: '>=8'} + fill-range@7.1.1: dependencies: to-regex-range: 5.0.1 - dev: true - /find-up@4.1.0: - resolution: {integrity: sha512-PpOwAdQ/YlXQ2vj8a3h8IipDuYRi3wceVQQGYWxNINccq40Anw7BlsEXCMbt1Zt+OLA6Fq9suIpIWD0OsnISlw==} - engines: {node: '>=8'} + find-up@4.1.0: dependencies: locate-path: 5.0.0 path-exists: 4.0.0 - dev: true - /find-up@5.0.0: - resolution: {integrity: 
sha512-78/PXT1wlLLDgTzDs7sjq9hzz0vXD+zn+7wypEe4fXQxCmdmqfGsEPQxmiCSQI3ajFV91bVSsvNtrJRiW6nGng==} - engines: {node: '>=10'} + find-up@5.0.0: dependencies: locate-path: 6.0.0 path-exists: 4.0.0 - dev: true - /find-yarn-workspace-root2@1.2.16: - resolution: {integrity: sha512-hr6hb1w8ePMpPVUK39S4RlwJzi+xPLuVuG8XlwXU3KD5Yn3qgBWVfy3AzNlDhWvE1EORCE65/Qm26rFQt3VLVA==} + find-yarn-workspace-root2@1.2.16: dependencies: micromatch: 4.0.7 pkg-dir: 4.2.0 - dev: true - /fs-extra@7.0.1: - resolution: {integrity: sha512-YJDaCJZEnBmcbw13fvdAM9AwNOJwOzrE4pqMqBq5nFiEqXUqHwlK4B+3pUw6JNvfSPtX05xFHtYy/1ni01eGCw==} - engines: {node: '>=6 <7 || >=8'} + fs-extra@7.0.1: dependencies: graceful-fs: 4.2.11 jsonfile: 4.0.0 universalify: 0.1.2 - dev: true - /fs-extra@8.1.0: - resolution: {integrity: sha512-yhlQgA6mnOJUKOsRUFsgJdQCvkKhcz8tlZG5HBQfReYZy46OwLcY+Zia0mtdHsOo9y/hP+CxMN0TU9QxoOtG4g==} - engines: {node: '>=6 <7 || >=8'} + fs-extra@8.1.0: dependencies: graceful-fs: 4.2.11 jsonfile: 4.0.0 universalify: 0.1.2 - dev: true - /glob-parent@5.1.2: - resolution: {integrity: sha512-AOIgSQCepiJYwP3ARnGx+5VnTu2HBYdzbGP45eLw1vr3zB3vZLeyed1sC9hnbcOc9/SrMyM5RPQrkGz4aS9Zow==} - engines: {node: '>= 6'} + glob-parent@5.1.2: dependencies: is-glob: 4.0.3 - dev: true - /globby@11.1.0: - resolution: {integrity: sha512-jhIXaOzy1sb8IyocaruWSn1TjmnBVs8Ayhcy83rmxNJ8q2uWKCAj3CnJY+KpGSXCueAPc0i05kVvVKtP1t9S3g==} - engines: {node: '>=10'} + globby@11.1.0: dependencies: array-union: 2.1.0 dir-glob: 3.0.1 @@ -524,400 +848,210 @@ packages: ignore: 5.3.1 merge2: 1.4.1 slash: 3.0.0 - dev: true - /graceful-fs@4.2.11: - resolution: {integrity: sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ==} - dev: true + graceful-fs@4.2.11: {} - /has-flag@3.0.0: - resolution: {integrity: sha512-sKJf1+ceQBr4SMkvQnBDNDtf4TXpVhVGateu0t918bl30FnbE2m4vNLX+VWe/dpjlb+HugGYzW7uQXH98HPEYw==} - engines: {node: '>=4'} - dev: true + has-flag@3.0.0: {} - /human-id@1.0.2: - resolution: {integrity: 
sha512-UNopramDEhHJD+VR+ehk8rOslwSfByxPIZyJRfV739NDhN5LF1fa1MqnzKm2lGTQRjNrjK19Q5fhkgIfjlVUKw==} - dev: true + human-id@1.0.2: {} - /iconv-lite@0.4.24: - resolution: {integrity: sha512-v3MXnZAcvnywkTUEZomIActle7RXXeedOR31wwl7VlyoXO4Qi9arvSenNQWne1TcRwhCL1HwLI21bEqdpj8/rA==} - engines: {node: '>=0.10.0'} + iconv-lite@0.4.24: dependencies: safer-buffer: 2.1.2 - dev: true - /ignore@5.3.1: - resolution: {integrity: sha512-5Fytz/IraMjqpwfd34ke28PTVMjZjJG2MPn5t7OE4eUCUNf8BAa7b5WUS9/Qvr6mwOQS7Mk6vdsMno5he+T8Xw==} - engines: {node: '>= 4'} - dev: true + ignore@5.3.1: {} - /is-extglob@2.1.1: - resolution: {integrity: sha512-SbKbANkN603Vi4jEZv49LeVJMn4yGwsbzZworEoyEiutsN3nJYdbO36zfhGJ6QEDpOZIFkDtnq5JRxmvl3jsoQ==} - engines: {node: '>=0.10.0'} - dev: true + is-extglob@2.1.1: {} - /is-glob@4.0.3: - resolution: {integrity: sha512-xelSayHH36ZgE7ZWhli7pW34hNbNl8Ojv5KVmkJD4hBdD3th8Tfk9vYasLM+mXWOZhFkgZfxhLSnrwRr4elSSg==} - engines: {node: '>=0.10.0'} + is-glob@4.0.3: dependencies: is-extglob: 2.1.1 - dev: true - /is-number@7.0.0: - resolution: {integrity: sha512-41Cifkg6e8TylSpdtTpeLVMqvSBEVzTttHvERD741+pnZ8ANv0004MRL43QKPDlK9cGvNp6NZWZUBlbGXYxxng==} - engines: {node: '>=0.12.0'} - dev: true + is-number@7.0.0: {} - /is-subdir@1.2.0: - resolution: {integrity: sha512-2AT6j+gXe/1ueqbW6fLZJiIw3F8iXGJtt0yDrZaBhAZEG1raiTxKWU+IPqMCzQAXOUCKdA4UDMgacKH25XG2Cw==} - engines: {node: '>=4'} + is-subdir@1.2.0: dependencies: better-path-resolve: 1.0.0 - dev: true - /is-windows@1.0.2: - resolution: {integrity: sha512-eXK1UInq2bPmjyX6e3VHIzMLobc4J94i4AWn+Hpq3OU5KkrRC96OAcR3PRJ/pGu6m8TRnBHP9dkXQVsT/COVIA==} - engines: {node: '>=0.10.0'} - dev: true + is-windows@1.0.2: {} - /isexe@2.0.0: - resolution: {integrity: sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==} - dev: true + isexe@2.0.0: {} - /js-yaml@3.14.1: - resolution: {integrity: sha512-okMH7OXXJ7YrN9Ok3/SXrnu4iX9yOk+25nqX4imS2npuvTYDmo/QEZoqwZkYaIDk3jVvBOTOIEgEhaLOynBS9g==} - hasBin: true + 
js-yaml@3.14.1: dependencies: argparse: 1.0.10 esprima: 4.0.1 - dev: true - /jsonfile@4.0.0: - resolution: {integrity: sha512-m6F1R3z8jjlf2imQHS2Qez5sjKWQzbuuhuJ/FKYFRZvPE3PuHcSMVZzfsLhGVOkfd20obL5SWEBew5ShlquNxg==} + jsonfile@4.0.0: optionalDependencies: graceful-fs: 4.2.11 - dev: true - /load-yaml-file@0.2.0: - resolution: {integrity: sha512-OfCBkGEw4nN6JLtgRidPX6QxjBQGQf72q3si2uvqyFEMbycSFFHwAZeXx6cJgFM9wmLrf9zBwCP3Ivqa+LLZPw==} - engines: {node: '>=6'} + load-yaml-file@0.2.0: dependencies: graceful-fs: 4.2.11 js-yaml: 3.14.1 pify: 4.0.1 strip-bom: 3.0.0 - dev: true - /locate-path@5.0.0: - resolution: {integrity: sha512-t7hw9pI+WvuwNJXwk5zVHpyhIqzg2qTlklJOf0mVxGSbe3Fp2VieZcduNYjaLDoy6p9uGpQEGWG87WpMKlNq8g==} - engines: {node: '>=8'} + locate-path@5.0.0: dependencies: p-locate: 4.1.0 - dev: true - /locate-path@6.0.0: - resolution: {integrity: sha512-iPZK6eYjbxRu3uB4/WZ3EsEIMJFMqAoopl3R+zuq0UjcAm/MO6KCweDgPfP3elTztoKP3KtnVHxTn2NHBSDVUw==} - engines: {node: '>=10'} + locate-path@6.0.0: dependencies: p-locate: 5.0.0 - dev: true - /lodash.startcase@4.4.0: - resolution: {integrity: sha512-+WKqsK294HMSc2jEbNgpHpd0JfIBhp7rEV4aqXWqFr6AlXov+SlcgB1Fv01y2kGe3Gc8nMW7VA0SrGuSkRfIEg==} - dev: true + lodash.startcase@4.4.0: {} - /lru-cache@4.1.5: - resolution: {integrity: sha512-sWZlbEP2OsHNkXrMl5GYk/jKk70MBng6UU4YI/qGDYbgf6YbP4EvmqISbXCoJiRKs+1bSpFHVgQxvJ17F2li5g==} + lru-cache@4.1.5: dependencies: pseudomap: 1.0.2 yallist: 2.1.2 - dev: true - /merge2@1.4.1: - resolution: {integrity: sha512-8q7VEgMJW4J8tcfVPy8g09NcQwZdbwFEqhe/WZkoIzjn/3TGDwtOCYtXGxA3O8tPzpczCCDgv+P2P5y00ZJOOg==} - engines: {node: '>= 8'} - dev: true + merge2@1.4.1: {} - /micromatch@4.0.7: - resolution: {integrity: sha512-LPP/3KorzCwBxfeUuZmaR6bG2kdeHSbe0P2tY3FLRU4vYrjYz5hI4QZwV0njUx3jeuKe67YukQ1LSPZBKDqO/Q==} - engines: {node: '>=8.6'} + micromatch@4.0.7: dependencies: braces: 3.0.3 picomatch: 2.3.1 - dev: true - /mri@1.2.0: - resolution: {integrity: 
sha512-tzzskb3bG8LvYGFF/mDTpq3jpI6Q9wc3LEmBaghu+DdCssd1FakN7Bc0hVNmEyGq1bq3RgfkCb3cmQLpNPOroA==} - engines: {node: '>=4'} - dev: true + mri@1.2.0: {} - /node-fetch@2.7.0: - resolution: {integrity: sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==} - engines: {node: 4.x || >=6.0.0} - peerDependencies: - encoding: ^0.1.0 - peerDependenciesMeta: - encoding: - optional: true + node-fetch@2.7.0: dependencies: whatwg-url: 5.0.0 - dev: true - /os-tmpdir@1.0.2: - resolution: {integrity: sha512-D2FR03Vir7FIu45XBY20mTb+/ZSWB00sjU9jdQXt83gDrI4Ztz5Fs7/yy74g2N5SVQY4xY1qDr4rNddwYRVX0g==} - engines: {node: '>=0.10.0'} - dev: true + os-tmpdir@1.0.2: {} - /outdent@0.5.0: - resolution: {integrity: sha512-/jHxFIzoMXdqPzTaCpFzAAWhpkSjZPF4Vsn6jAfNpmbH/ymsmd7Qc6VE9BGn0L6YMj6uwpQLxCECpus4ukKS9Q==} - dev: true + outdent@0.5.0: {} - /p-filter@2.1.0: - resolution: {integrity: sha512-ZBxxZ5sL2HghephhpGAQdoskxplTwr7ICaehZwLIlfL6acuVgZPm8yBNuRAFBGEqtD/hmUeq9eqLg2ys9Xr/yw==} - engines: {node: '>=8'} + p-filter@2.1.0: dependencies: p-map: 2.1.0 - dev: true - /p-limit@2.3.0: - resolution: {integrity: sha512-//88mFWSJx8lxCzwdAABTJL2MyWB12+eIY7MDL2SqLmAkeKU9qxRvWuSyTjm3FUmpBEMuFfckAIqEaVGUDxb6w==} - engines: {node: '>=6'} + p-limit@2.3.0: dependencies: p-try: 2.2.0 - dev: true - /p-limit@3.1.0: - resolution: {integrity: sha512-TYOanM3wGwNGsZN2cVTYPArw454xnXj5qmWF1bEoAc4+cU/ol7GVh7odevjp1FNHduHc3KZMcFduxU5Xc6uJRQ==} - engines: {node: '>=10'} + p-limit@3.1.0: dependencies: yocto-queue: 0.1.0 - dev: true - /p-locate@4.1.0: - resolution: {integrity: sha512-R79ZZ/0wAxKGu3oYMlz8jy/kbhsNrS7SKZ7PxEHBgJ5+F2mtFW2fK2cOtBh1cHYkQsbzFV7I+EoRKe6Yt0oK7A==} - engines: {node: '>=8'} + p-locate@4.1.0: dependencies: p-limit: 2.3.0 - dev: true - /p-locate@5.0.0: - resolution: {integrity: sha512-LaNjtRWUBY++zB5nE/NwcaoMylSPk+S+ZHNB1TzdbMJMny6dynpAGt7X/tl/QYq3TIeE6nxHppbo2LGymrG5Pw==} - engines: {node: '>=10'} + p-locate@5.0.0: dependencies: p-limit: 3.1.0 - dev: true - 
/p-map@2.1.0: - resolution: {integrity: sha512-y3b8Kpd8OAN444hxfBbFfj1FY/RjtTd8tzYwhUqNYXx0fXx2iX4maP4Qr6qhIKbQXI02wTLAda4fYUbDagTUFw==} - engines: {node: '>=6'} - dev: true + p-map@2.1.0: {} - /p-try@2.2.0: - resolution: {integrity: sha512-R4nPAVTAU0B9D35/Gk3uJf/7XYbQcyohSKdvAxIRSNghFl4e71hVoGnBNQz9cWaXxO2I10KTC+3jMdvvoKw6dQ==} - engines: {node: '>=6'} - dev: true + p-try@2.2.0: {} - /path-exists@4.0.0: - resolution: {integrity: sha512-ak9Qy5Q7jYb2Wwcey5Fpvg2KoAc/ZIhLSLOSBmRmygPsGwkVVt0fZa0qrtMz+m6tJTAHfZQ8FnmB4MG4LWy7/w==} - engines: {node: '>=8'} - dev: true + path-exists@4.0.0: {} - /path-type@4.0.0: - resolution: {integrity: sha512-gDKb8aZMDeD/tZWs9P6+q0J9Mwkdl6xMV8TjnGP3qJVJ06bdMgkbBlLU8IdfOsIsFz2BW1rNVT3XuNEl8zPAvw==} - engines: {node: '>=8'} - dev: true + path-type@4.0.0: {} - /picomatch@2.3.1: - resolution: {integrity: sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA==} - engines: {node: '>=8.6'} - dev: true + picomatch@2.3.1: {} - /pify@4.0.1: - resolution: {integrity: sha512-uB80kBFb/tfd68bVleG9T5GGsGPjJrLAUpR5PZIrhBnIaRTQRjqdJSsIKkOP6OAIFbj7GOrcudc5pNjZ+geV2g==} - engines: {node: '>=6'} - dev: true + pify@4.0.1: {} - /pkg-dir@4.2.0: - resolution: {integrity: sha512-HRDzbaKjC+AOWVXxAU/x54COGeIv9eb+6CkDSQoNTt4XyWoIJvuPsXizxu/Fr23EiekbtZwmh1IcIG/l/a10GQ==} - engines: {node: '>=8'} + pkg-dir@4.2.0: dependencies: find-up: 4.1.0 - dev: true - /preferred-pm@3.1.4: - resolution: {integrity: sha512-lEHd+yEm22jXdCphDrkvIJQU66EuLojPPtvZkpKIkiD+l0DMThF/niqZKJSoU8Vl7iuvtmzyMhir9LdVy5WMnA==} - engines: {node: '>=10'} + preferred-pm@3.1.4: dependencies: find-up: 5.0.0 find-yarn-workspace-root2: 1.2.16 path-exists: 4.0.0 which-pm: 2.2.0 - dev: true - /prettier@2.8.8: - resolution: {integrity: sha512-tdN8qQGvNjw4CHbY+XXk0JgCXn9QiF21a55rBe5LJAU+kDyC4WQn4+awm2Xfk2lQMk5fKup9XgzTZtGkjBdP9Q==} - engines: {node: '>=10.13.0'} - hasBin: true - dev: true + prettier@2.8.8: {} - /pseudomap@1.0.2: - resolution: {integrity: 
sha512-b/YwNhb8lk1Zz2+bXXpS/LK9OisiZZ1SNsSLxN1x2OXVEhW2Ckr/7mWE5vrC1ZTiJlD9g19jWszTmJsB+oEpFQ==} - dev: true + pseudomap@1.0.2: {} - /queue-microtask@1.2.3: - resolution: {integrity: sha512-NuaNSa6flKT5JaSYQzJok04JzTL1CA6aGhv5rfLW3PgqA+M2ChpZQnAC8h8i4ZFkBS8X5RqkDBHA7r4hej3K9A==} - dev: true + queue-microtask@1.2.3: {} - /read-yaml-file@1.1.0: - resolution: {integrity: sha512-VIMnQi/Z4HT2Fxuwg5KrY174U1VdUIASQVWXXyqtNRtxSr9IYkn1rsI6Tb6HsrHCmB7gVpNwX6JxPTHcH6IoTA==} - engines: {node: '>=6'} + read-yaml-file@1.1.0: dependencies: graceful-fs: 4.2.11 js-yaml: 3.14.1 pify: 4.0.1 strip-bom: 3.0.0 - dev: true - /regenerator-runtime@0.14.1: - resolution: {integrity: sha512-dYnhHh0nJoMfnkZs6GmmhFknAGRrLznOu5nc9ML+EJxGvrx6H7teuevqVqCuPcPK//3eDrrjQhehXVx9cnkGdw==} - dev: true + regenerator-runtime@0.14.1: {} - /resolve-from@5.0.0: - resolution: {integrity: sha512-qYg9KP24dD5qka9J47d0aVky0N+b4fTU89LN9iDnjB5waksiC49rvMB0PrUJQGoTmH50XPiqOvAjDfaijGxYZw==} - engines: {node: '>=8'} - dev: true + resolve-from@5.0.0: {} - /reusify@1.0.4: - resolution: {integrity: sha512-U9nH88a3fc/ekCF1l0/UP1IosiuIjyTh7hBvXVMHYgVcfGvt897Xguj2UOLDeI5BG2m7/uwyaLVT6fbtCwTyzw==} - engines: {iojs: '>=1.0.0', node: '>=0.10.0'} - dev: true + reusify@1.0.4: {} - /run-parallel@1.2.0: - resolution: {integrity: sha512-5l4VyZR86LZ/lDxZTR6jqL8AFE2S0IFLMP26AbjsLVADxHdhB/c0GUsH+y39UfCi3dzz8OlQuPmnaJOMoDHQBA==} + run-parallel@1.2.0: dependencies: queue-microtask: 1.2.3 - dev: true - /safer-buffer@2.1.2: - resolution: {integrity: sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg==} - dev: true + safer-buffer@2.1.2: {} - /semver@7.6.3: - resolution: {integrity: sha512-oVekP1cKtI+CTDvHWYFUcMtsK/00wmAEfyqKfNdARm8u1wNVhSgaX7A8d4UuIlUI5e84iEwOhs7ZPYRmzU9U6A==} - engines: {node: '>=10'} - hasBin: true - dev: true + semver@7.6.3: {} - /shebang-command@1.2.0: - resolution: {integrity: sha512-EV3L1+UQWGor21OmnvojK36mhg+TyIKDh3iFBKBohr5xeXIhNBcx8oWdgkTEEQ+BEFFYdLRuqMfd5L84N1V5Vg==} - 
engines: {node: '>=0.10.0'} + shebang-command@1.2.0: dependencies: shebang-regex: 1.0.0 - dev: true - /shebang-regex@1.0.0: - resolution: {integrity: sha512-wpoSFAxys6b2a2wHZ1XpDSgD7N9iVjg29Ph9uV/uaP9Ex/KXlkTZTeddxDPSYQpgvzKLGJke2UU0AzoGCjNIvQ==} - engines: {node: '>=0.10.0'} - dev: true + shebang-regex@1.0.0: {} - /signal-exit@3.0.7: - resolution: {integrity: sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==} - dev: true + signal-exit@3.0.7: {} - /slash@3.0.0: - resolution: {integrity: sha512-g9Q1haeby36OSStwb4ntCGGGaKsaVSjQ68fBxoQcutl5fS1vuY18H3wSt3jFyFtrkx+Kz0V1G85A4MyAdDMi2Q==} - engines: {node: '>=8'} - dev: true + slash@3.0.0: {} - /spawndamnit@2.0.0: - resolution: {integrity: sha512-j4JKEcncSjFlqIwU5L/rp2N5SIPsdxaRsIv678+TZxZ0SRDJTm8JrxJMjE/XuiEZNEir3S8l0Fa3Ke339WI4qA==} + spawndamnit@2.0.0: dependencies: cross-spawn: 5.1.0 signal-exit: 3.0.7 - dev: true - /sprintf-js@1.0.3: - resolution: {integrity: sha512-D9cPgkvLlV3t3IzL0D0YLvGA9Ahk4PcvVwUbN0dSGr1aP0Nrt4AEnTUbuGvquEC0mA64Gqt1fzirlRs5ibXx8g==} - dev: true + sprintf-js@1.0.3: {} - /strip-ansi@6.0.1: - resolution: {integrity: sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==} - engines: {node: '>=8'} + strip-ansi@6.0.1: dependencies: ansi-regex: 5.0.1 - dev: true - /strip-bom@3.0.0: - resolution: {integrity: sha512-vavAMRXOgBVNF6nyEEmL3DBK19iRpDcoIwW+swQ+CbGiu7lju6t+JklA1MHweoWtadgt4ISVUsXLyDq34ddcwA==} - engines: {node: '>=4'} - dev: true + strip-bom@3.0.0: {} - /supports-color@5.5.0: - resolution: {integrity: sha512-QjVjwdXIt408MIiAqCX4oUKsgU2EqAGzs2Ppkm4aQYbjm+ZEWEcW4SfFNTr4uMNZma0ey4f5lgLrkB0aX0QMow==} - engines: {node: '>=4'} + supports-color@5.5.0: dependencies: has-flag: 3.0.0 - dev: true - /term-size@2.2.1: - resolution: {integrity: sha512-wK0Ri4fOGjv/XPy8SBHZChl8CM7uMc5VML7SqiQ0zG7+J5Vr+RMQDoHa2CNT6KHUnTGIXH34UDMkPzAUyapBZg==} - engines: {node: '>=8'} - dev: true + term-size@2.2.1: {} - /tmp@0.0.33: - 
resolution: {integrity: sha512-jRCJlojKnZ3addtTOjdIqoRuPEKBvNXcGYqzO6zWZX8KfKEpnGY5jfggJQ3EjKuu8D4bJRr0y+cYJFmYbImXGw==} - engines: {node: '>=0.6.0'} + tmp@0.0.33: dependencies: os-tmpdir: 1.0.2 - dev: true - /to-regex-range@5.0.1: - resolution: {integrity: sha512-65P7iz6X5yEr1cwcgvQxbbIw7Uk3gOy5dIdtZ4rDveLqhrdJP+Li/Hx6tyK0NEb+2GCyneCMJiGqrADCSNk8sQ==} - engines: {node: '>=8.0'} + to-regex-range@5.0.1: dependencies: is-number: 7.0.0 - dev: true - /tr46@0.0.3: - resolution: {integrity: sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==} - dev: true + tr46@0.0.3: {} - /universalify@0.1.2: - resolution: {integrity: sha512-rBJeI5CXAlmy1pV+617WB9J63U6XcazHHF2f2dbJix4XzpUF0RS3Zbj0FGIOCAva5P/d/GBOYaACQ1w+0azUkg==} - engines: {node: '>= 4.0.0'} - dev: true + universalify@0.1.2: {} - /webidl-conversions@3.0.1: - resolution: {integrity: sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==} - dev: true + webidl-conversions@3.0.1: {} - /whatwg-url@5.0.0: - resolution: {integrity: sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==} + whatwg-url@5.0.0: dependencies: tr46: 0.0.3 webidl-conversions: 3.0.1 - dev: true - /which-pm@2.2.0: - resolution: {integrity: sha512-MOiaDbA5ZZgUjkeMWM5EkJp4loW5ZRoa5bc3/aeMox/PJelMhE6t7S/mLuiY43DBupyxH+S0U1bTui9kWUlmsw==} - engines: {node: '>=8.15'} + which-pm@2.2.0: dependencies: load-yaml-file: 0.2.0 path-exists: 4.0.0 - dev: true - /which@1.3.1: - resolution: {integrity: sha512-HxJdYWq1MTIQbJ3nw0cqssHoTNU267KlrDuGZ1WYlxDStUtKUhOaJmh112/TZmHxxUfuJqPXSOm7tDyas0OSIQ==} - hasBin: true + which@1.3.1: dependencies: isexe: 2.0.0 - dev: true - /yallist@2.1.2: - resolution: {integrity: sha512-ncTzHV7NvsQZkYe1DW7cbDLm0YpzHmZF5r/iyP3ZnQtMiJ+pjzisCiMNI+Sj+xQF5pXhSHxSB3uDbsBTzY/c2A==} - dev: true + yallist@2.1.2: {} - /yocto-queue@0.1.0: - resolution: {integrity: 
sha512-rVksvsnNCdJ/ohGc6xgPwyN8eheCxsiLM8mxuE/t/mOVqJewPuO1miLpTHQiRgTKCLexL4MeAFVagts7HmNZ2Q==} - engines: {node: '>=10'} - dev: true + yocto-queue@0.1.0: {} diff --git a/test.py b/test.py deleted file mode 100644 index e5d5b542b..000000000 --- a/test.py +++ /dev/null @@ -1,59 +0,0 @@ -import asyncio -import multiprocessing as mp -import os -import socket - - -async def async_send(loop, sock, message): - await loop.sock_sendall(sock, message.encode("utf-8")) - - -async def async_recv(loop, sock, buffer_size=1024): - data = await loop.sock_recv(sock, buffer_size) - return data.decode("utf-8") - - -def worker_process(send_sock): - # This will run in the worker process - loop = asyncio.get_event_loop() - - async def worker_task(): - # Simulate sending messages from the worker to the main process - for i in range(5): - message = f"Message {i} from process {os.getpid()}" - print(f"Sending: {message}") - await async_send(loop, send_sock, message) - await asyncio.sleep(1) - - loop.run_until_complete(worker_task()) - send_sock.close() - - -async def main(): - parent_sock, child_sock = socket.socketpair() - - ctx = mp.get_context("spawn") - process = ctx.Process(target=worker_process, args=(child_sock,)) - process.start() - - child_sock.close() # Close the child socket in the main process - - loop = asyncio.get_event_loop() - - # Asynchronously receive messages from the worker process - async def receive_messages(): - while True: - message = await async_recv(loop, parent_sock) - if not message: - break - print(f"Received: {message}") - - await receive_messages() - - # Wait for the process to finish - process.join() - parent_sock.close() - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/tests/.gitignore b/tests/.gitignore new file mode 100644 index 000000000..ef3d21d85 --- /dev/null +++ b/tests/.gitignore @@ -0,0 +1 @@ +**/test_vad*.wav \ No newline at end of file diff --git a/tests/test_ipc.py b/tests/test_ipc.py index 9256cb1d8..d77715dde 100644 --- 
a/tests/test_ipc.py +++ b/tests/test_ipc.py @@ -56,6 +56,7 @@ async def _pong(): msg = await ipc.channel.arecv_message(cch, IPC_MESSAGES) await ipc.channel.asend_message(cch, msg) except utils.aio.duplex_unix.DuplexClosed: + print("_echo_main, duplex closed..") break asyncio.run(_pong()) @@ -192,6 +193,7 @@ async def test_proc_pool(): initialize_process_fnc=_initialize_proc, job_entrypoint_fnc=_job_entrypoint, num_idle_processes=num_idle_processes, + job_executor_type=job.JobExecutorType.PROCESS, initialize_timeout=20.0, close_timeout=20.0, mp_ctx=mp_ctx, @@ -208,21 +210,21 @@ async def test_proc_pool(): exitcodes = [] @pool.on("process_created") - def _process_created(proc: ipc.proc_pool.SupervisedProc): + def _process_created(proc: ipc.proc_job_executor.ProcJobExecutor): created_q.put_nowait(None) proc.start_arguments = start_args @pool.on("process_started") - def _process_started(proc: ipc.proc_pool.SupervisedProc): + def _process_started(proc: ipc.proc_job_executor.ProcJobExecutor): start_q.put_nowait(None) pids.append(proc.pid) @pool.on("process_ready") - def _process_ready(proc: ipc.proc_pool.SupervisedProc): + def _process_ready(proc: ipc.proc_job_executor.ProcJobExecutor): ready_q.put_nowait(None) @pool.on("process_closed") - def _process_closed(proc: ipc.proc_pool.SupervisedProc): + def _process_closed(proc: ipc.proc_job_executor.ProcJobExecutor): close_q.put_nowait(None) exitcodes.append(proc.exitcode) @@ -264,6 +266,7 @@ async def test_slow_initialization(): loop = asyncio.get_running_loop() num_idle_processes = 2 pool = ipc.proc_pool.ProcPool( + job_executor_type=job.JobExecutorType.PROCESS, initialize_process_fnc=_initialize_proc, job_entrypoint_fnc=_job_entrypoint, num_idle_processes=num_idle_processes, @@ -282,12 +285,12 @@ async def test_slow_initialization(): exitcodes = [] @pool.on("process_created") - def _process_created(proc: ipc.proc_pool.SupervisedProc): + def _process_created(proc: ipc.proc_job_executor.ProcJobExecutor): proc.start_arguments 
= start_args start_q.put_nowait(None) @pool.on("process_closed") - def _process_closed(proc: ipc.proc_pool.SupervisedProc): + def _process_closed(proc: ipc.proc_job_executor.ProcJobExecutor): close_q.put_nowait(None) pids.append(proc.pid) exitcodes.append(proc.exitcode) @@ -313,10 +316,10 @@ def _create_proc( close_timeout: float, mp_ctx: BaseContext, initialize_timeout: float = 20.0, -) -> (ipc.supervised_proc.SupervisedProc, _StartArgs): +) -> tuple[ipc.proc_job_executor.ProcJobExecutor, _StartArgs]: start_args = _new_start_args(mp_ctx) loop = asyncio.get_running_loop() - proc = ipc.supervised_proc.SupervisedProc( + proc = ipc.proc_job_executor.ProcJobExecutor( initialize_process_fnc=_initialize_proc, job_entrypoint_fnc=_job_entrypoint, initialize_timeout=initialize_timeout, diff --git a/tests/test_llm.py b/tests/test_llm.py index 97ff6f033..44ce9c434 100644 --- a/tests/test_llm.py +++ b/tests/test_llm.py @@ -1,10 +1,14 @@ +from __future__ import annotations + import asyncio +import uuid from enum import Enum -from typing import Annotated +from typing import Annotated, Callable, Optional +import pytest from livekit.agents import llm from livekit.agents.llm import ChatContext, FunctionContext, TypeInfo, ai_callable -from livekit.plugins import openai +from livekit.plugins import anthropic, openai class Unit(Enum): @@ -13,18 +17,6 @@ class Unit(Enum): class FncCtx(FunctionContext): - def __init__(self) -> None: - super().__init__() - self._get_weather_calls = 0 - self._play_music_calls = 0 - self._toggle_light_calls = 0 - self._select_currency_calls = 0 - self._change_volume_calls = 0 - - self._toggle_light_cancelled = False - self._selected_currencies = None - self._selected_volume = None - @ai_callable( description="Get the current weather in a given location", auto_retry=True ) @@ -36,8 +28,7 @@ def get_weather( unit: Annotated[ Unit, TypeInfo(description="The temperature unit to use.") ] = Unit.CELSIUS, - ) -> None: - self._get_weather_calls += 1 + ) -> None: 
... @ai_callable(description="Play a music") def play_music( @@ -45,8 +36,7 @@ def play_music( self, name: Annotated[ str, TypeInfo(description="The artist and the name of the song") ], - ) -> None: - self._play_music_calls += 1 + ) -> None: ... # test for cancelled calls @ai_callable(description="Turn on/off the lights in a room") async def toggle_light( self, room: Annotated[str, TypeInfo(description="The room to control")], on: bool = True, ) -> None: - self._toggle_light_calls += 1 - try: - await asyncio.sleep(60) - except asyncio.CancelledError: - self._toggle_light_cancelled = True + await asyncio.sleep(60) # used to test arrays as arguments - @ai_callable(description="Currencies of a specific country") + @ai_callable(description="Currencies of a specific area") def select_currencies( self, currencies: Annotated[ list[str], TypeInfo( - description="The currency to select", + description="The currencies to select", choices=["usd", "eur", "gbp", "jpy", "sek"], ), ], - ) -> None: - self._select_currency_calls += 1 - self._selected_currencies = currencies + ) -> None: ... # test choices on int @ai_callable(description="Change the volume") def change_volume( self, volume: Annotated[ int, TypeInfo(description="The volume level", choices=[0, 11, 30, 83, 99]) ], - ) -> None: - self._change_volume_calls += 1 - self._selected_volume = volume - - -async def test_chat(): - llm = openai.LLM(model="gpt-4o") + ) -> None: ... + @ai_callable(description="Update user info") + def update_user_info( + self, + email: Annotated[ + Optional[str], TypeInfo(description="The user email address") + ] = None, + name: Annotated[Optional[str], TypeInfo(description="The user name")] = None, + address: Optional[ + Annotated[str, TypeInfo(description="The user address")] + ] = None, + ) -> None: ...
+ + +def test_hashable_typeinfo(): + typeinfo = TypeInfo(description="testing", choices=[1, 2, 3]) + # TypeInfo must be hashable when used in combination with typing.Annotated + hash(typeinfo) + + +LLMS: list[llm.LLM | Callable[[], llm.LLM]] = [ + openai.LLM(), + lambda: openai.beta.AssistantLLM( + assistant_opts=openai.beta.AssistantOptions( + create_options=openai.beta.AssistantCreateOptions( + name=f"test-{uuid.uuid4()}", + instructions="You are a basic assistant", + model="gpt-4o", + ) + ) + ), + # anthropic.LLM(), +] + + +@pytest.mark.parametrize("input_llm", LLMS) +async def test_chat(input_llm: llm.LLM | Callable[[], llm.LLM]): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() chat_ctx = ChatContext().append( text='You are an assistant at a drive-thru restaurant "Live-Burger". Ask the customer what they would like to order.' ) - stream = llm.chat(chat_ctx=chat_ctx) + # Anthropic's LLM requires at least one message (system messages don't count) + if isinstance(input_llm, anthropic.LLM): + chat_ctx.append( + text="Hello", + role="user", + ) + + stream = input_llm.chat(chat_ctx=chat_ctx) text = "" async for chunk in stream: content = chunk.choices[0].delta.content @@ -105,23 +128,28 @@ async def test_chat(): assert len(text) > 0 -async def test_fnc_calls(): +@pytest.mark.parametrize("input_llm", LLMS) +async def test_basic_fnc_calls(input_llm: Callable[[], llm.LLM] | llm.LLM): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() fnc_ctx = FncCtx() - llm = openai.LLM(model="gpt-4o") stream = await _request_fnc_call( - llm, "What's the weather in San Francisco and Paris?", fnc_ctx + input_llm, + "What's the weather in San Francisco and what's the weather in Paris?", + fnc_ctx, ) - fns = stream.execute_functions() - await asyncio.gather(*[f.task for f in fns]) + calls = stream.execute_functions() + await asyncio.gather(*[f.task for f in calls]) await stream.aclose() + assert len(calls) == 2, "get_weather should be called twice" - assert
fnc_ctx._get_weather_calls == 2, "get_weather should be called twice" - -async def test_fnc_calls_runtime_addition(): +@pytest.mark.parametrize("input_llm", LLMS) +async def test_runtime_addition(input_llm: Callable[[], llm.LLM] | llm.LLM): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() fnc_ctx = FncCtx() - llm = openai.LLM(model="gpt-4o") called_msg = "" @fnc_ctx.ai_callable(description="Show a message on the screen") @@ -132,7 +160,7 @@ async def show_message( called_msg = message stream = await _request_fnc_call( - llm, "Can you show 'Hello LiveKit!' on the screen?", fnc_ctx + input_llm, "Can you show 'Hello LiveKit!' on the screen?", fnc_ctx ) fns = stream.execute_functions() await asyncio.gather(*[f.task for f in fns]) @@ -141,63 +169,107 @@ async def show_message( assert called_msg == "Hello LiveKit!", "send_message should be called" -async def test_cancelled_calls(): +@pytest.mark.parametrize("input_llm", LLMS) +async def test_cancelled_calls(input_llm: Callable[[], llm.LLM] | llm.LLM): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() fnc_ctx = FncCtx() - llm = openai.LLM(model="gpt-4o") stream = await _request_fnc_call( - llm, "Turn off the lights in the Theo's bedroom", fnc_ctx + input_llm, "Turn off the lights in the Theo's bedroom", fnc_ctx ) - stream.execute_functions() - - # Need to wait for the task to start - await asyncio.sleep(0) + calls = stream.execute_functions() + await asyncio.sleep(0.2) # wait for the loop executor to start the task - # don't wait for gather_function_results and directly close + # don't wait for gather_function_results and directly close (this should cancel the ongoing calls) await stream.aclose() - assert fnc_ctx._toggle_light_calls == 1 - assert fnc_ctx._toggle_light_cancelled, "toggle_light should be cancelled" + assert len(calls) == 1 + assert isinstance( + calls[0].exception, asyncio.CancelledError + ), "toggle_light should have been cancelled" -async def test_calls_arrays(): 
+@pytest.mark.parametrize("input_llm", LLMS) +async def test_calls_arrays(input_llm: Callable[[], llm.LLM] | llm.LLM): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() fnc_ctx = FncCtx() - llm = openai.LLM(model="gpt-4o") stream = await _request_fnc_call( - llm, "Can you select all currencies in Europe at once?", fnc_ctx + input_llm, + "Can you select all currencies in Europe at once?", + fnc_ctx, + temperature=0.2, ) - fns = stream.execute_functions() - await asyncio.gather(*[f.task for f in fns]) + calls = stream.execute_functions() + await asyncio.gather(*[f.task for f in calls]) await stream.aclose() - assert fnc_ctx._select_currency_calls == 1 - assert fnc_ctx._selected_currencies is not None - assert len(fnc_ctx._selected_currencies) == 3 + assert len(calls) == 1, "select_currencies should have been called only once" - assert "eur" in fnc_ctx._selected_currencies - assert "gbp" in fnc_ctx._selected_currencies - assert "sek" in fnc_ctx._selected_currencies + call = calls[0] + currencies = call.call_info.arguments["currencies"] + assert len(currencies) == 3, "select_currencies should have 3 currencies" + assert ( + "eur" in currencies and "gbp" in currencies and "sek" in currencies + ), "select_currencies should have eur, gbp, sek" -async def test_calls_choices(): +@pytest.mark.parametrize("input_llm", LLMS) +async def test_calls_choices(input_llm: Callable[[], llm.LLM] | llm.LLM): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() fnc_ctx = FncCtx() - llm = openai.LLM(model="gpt-4o") - stream = await _request_fnc_call(llm, "Set the volume to 30", fnc_ctx) - fns = stream.execute_functions() - await asyncio.gather(*[f.task for f in fns]) + stream = await _request_fnc_call(input_llm, "Set the volume to 30", fnc_ctx) + calls = stream.execute_functions() + await asyncio.gather(*[f.task for f in calls]) await stream.aclose() - assert fnc_ctx._change_volume_calls == 1 - assert fnc_ctx._selected_volume == 30 + assert len(calls) == 1, 
"change_volume should have been called only once" + + call = calls[0] + volume = call.call_info.arguments["volume"] + assert volume == 30, "change_volume should have been called with volume 30" + + +@pytest.mark.parametrize("input_llm", LLMS) +async def test_optional_args(input_llm: Callable[[], llm.LLM] | llm.LLM): + if not isinstance(input_llm, llm.LLM): + input_llm = input_llm() + fnc_ctx = FncCtx() + + stream = await _request_fnc_call( + input_llm, "Can you update my information? My name is Theo", fnc_ctx + ) + + calls = stream.execute_functions() + await asyncio.gather(*[f.task for f in calls]) + await stream.aclose() + + assert len(calls) == 1, "update_user_info should have been called only once" + + call = calls[0] + name = call.call_info.arguments.get("name", None) + email = call.call_info.arguments.get("email", None) + address = call.call_info.arguments.get("address", None) + + assert name == "Theo", "update_user_info should have been called with name 'Theo'" + assert email is None, "update_user_info should have been called with email None" + assert address is None, "update_user_info should have been called with address None" async def _request_fnc_call( - model: llm.LLM, request: str, fnc_ctx: FncCtx + model: llm.LLM, + request: str, + fnc_ctx: FncCtx, + temperature: float | None = None, ) -> llm.LLMStream: stream = model.chat( - chat_ctx=ChatContext().append(text=request, role="user"), fnc_ctx=fnc_ctx + chat_ctx=ChatContext().append(text=request, role="user"), + fnc_ctx=fnc_ctx, + temperature=temperature, ) async for _ in stream: diff --git a/tests/test_tokenizer.py b/tests/test_tokenizer.py index 931713eeb..bead760b7 100644 --- a/tests/test_tokenizer.py +++ b/tests/test_tokenizer.py @@ -118,6 +118,60 @@ async def test_streamed_word_tokenizer(tokenizer: tokenize.WordTokenizer): assert ev.token == WORDS_EXPECTED[i] +WORDS_PUNCT_TEXT = 'This is actually tricky to handle.' 
+ +WORDS_PUNCT_EXPECTED = [ + "This", + "is", + "actually", + "tricky", + "to", + "handle.", +] + +WORD_PUNCT_TOKENIZERS = [basic.WordTokenizer(ignore_punctuation=False)] + + +@pytest.mark.parametrize("tokenizer", WORD_PUNCT_TOKENIZERS) +def test_punct_word_tokenizer(tokenizer: tokenize.WordTokenizer): + tokens = tokenizer.tokenize(text=WORDS_PUNCT_TEXT) + for i, token in enumerate(WORDS_PUNCT_EXPECTED): + assert token == tokens[i] + + +@pytest.mark.parametrize("tokenizer", WORD_PUNCT_TOKENIZERS) +async def test_streamed_punct_word_tokenizer(tokenizer: tokenize.WordTokenizer): + # divide text by chunks of arbitrary length (1-4) + pattern = [1, 2, 4] + text = WORDS_PUNCT_TEXT + chunks = [] + pattern_iter = iter(pattern * (len(text) // sum(pattern) + 1)) + + for chunk_size in pattern_iter: + if not text: + break + chunks.append(text[:chunk_size]) + text = text[chunk_size:] + + stream = tokenizer.stream() + for chunk in chunks: + stream.push_text(chunk) + + stream.end_input() + + for i in range(len(WORDS_PUNCT_EXPECTED)): + ev = await stream.__anext__() + assert ev.token == WORDS_PUNCT_EXPECTED[i] + + HYPHENATOR_TEXT = [ "Segment", "expected", @@ -141,3 +195,55 @@ def test_hyphenate_word(): for i, word in enumerate(HYPHENATOR_TEXT): hyphenated = basic.hyphenate_word(word) assert hyphenated == HYPHENATOR_EXPECTED[i] + + +REPLACE_TEXT = ( + "This is a test. Hello world, I'm creating this agents.. framework. Once again " + "framework. A.B.C" +) +REPLACE_EXPECTED = ( + "This is a test. Hello universe, I'm creating this assistants.. library. twice again " + "library. 
A.B.C.D" +) + +REPLACE_REPLACEMENTS = { + "world": "universe", + "framework": "library", + "a.b.c": "A.B.C.D", + "once": "twice", + "agents": "assistants", +} + + +def test_replace_words(): + replaced = tokenize.utils.replace_words( + text=REPLACE_TEXT, replacements=REPLACE_REPLACEMENTS + ) + assert replaced == REPLACE_EXPECTED + + +async def test_replace_words_async(): + pattern = [1, 2, 4] + text = REPLACE_TEXT + chunks = [] + pattern_iter = iter(pattern * (len(text) // sum(pattern) + 1)) + + for chunk_size in pattern_iter: + if not text: + break + chunks.append(text[:chunk_size]) + text = text[chunk_size:] + + async def _replace_words_async(): + for chunk in chunks: + yield chunk + + replaced_chunks = [] + + async for chunk in tokenize.utils.replace_words( + text=_replace_words_async(), replacements=REPLACE_REPLACEMENTS + ): + replaced_chunks.append(chunk) + + replaced = "".join(replaced_chunks) + assert replaced == REPLACE_EXPECTED diff --git a/tests/test_tts.py b/tests/test_tts.py index 5b2ebe1d4..cd1858607 100644 --- a/tests/test_tts.py +++ b/tests/test_tts.py @@ -41,6 +41,7 @@ async def _assert_valid_synthesized_audio( google.TTS(), azure.TTS(), cartesia.TTS(), + cartesia.TTS(speed="fastest", emotion=["surprise:highest"]), ] @@ -61,6 +62,7 @@ async def test_synthesize(tts: agents.tts.TTS): elevenlabs.TTS(), elevenlabs.TTS(encoding="pcm_44100"), cartesia.TTS(), + cartesia.TTS(speed="fastest", emotion=["surprise:highest"]), agents.tts.StreamAdapter( tts=openai.TTS(), sentence_tokenizer=STREAM_SENT_TOKENIZER ), diff --git a/tests/test_vad.py b/tests/test_vad.py index e69de29bb..15d066571 100644 --- a/tests/test_vad.py +++ b/tests/test_vad.py @@ -0,0 +1,66 @@ +from livekit.agents import vad +from livekit.plugins import silero + +from . 
import utils + +VAD = silero.VAD.load( + min_speech_duration=0.5, min_silence_duration=0.5, padding_duration=1.0 +) + + +async def test_chunks_vad() -> None: + frames, transcript = utils.make_test_audio(chunk_duration_ms=10) + assert len(frames) > 1, "frames aren't chunked" + + stream = VAD.stream() + + for frame in frames: + stream.push_frame(frame) + + stream.end_input() + + start_of_speech_i = 0 + end_of_speech_i = 0 + async for ev in stream: + if ev.type == vad.VADEventType.START_OF_SPEECH: + with open( + f"test_vad.start_of_speech_frames_{start_of_speech_i}.wav", "wb" + ) as f: + f.write(utils.make_wav_file(ev.frames)) + + start_of_speech_i += 1 + + if ev.type == vad.VADEventType.END_OF_SPEECH: + with open( + f"test_vad.end_of_speech_frames_{end_of_speech_i}.wav", "wb" + ) as f: + f.write(utils.make_wav_file(ev.frames)) + + end_of_speech_i += 1 + + assert start_of_speech_i > 0, "no start of speech detected" + assert start_of_speech_i == end_of_speech_i, "start and end of speech mismatch" + + +async def test_file_vad(): + frames, transcript = utils.make_test_audio() + assert len(frames) == 1, "one frame should be the whole audio" + + stream = VAD.stream() + + for frame in frames: + stream.push_frame(frame) + + stream.end_input() + + start_of_speech_i = 0 + end_of_speech_i = 0 + async for ev in stream: + if ev.type == vad.VADEventType.START_OF_SPEECH: + start_of_speech_i += 1 + + if ev.type == vad.VADEventType.END_OF_SPEECH: + end_of_speech_i += 1 + + assert start_of_speech_i > 0, "no start of speech detected" + assert start_of_speech_i == end_of_speech_i, "start and end of speech mismatch" diff --git a/tests/utils.py b/tests/utils.py index efcc6f964..bd1d6fe1e 100644 --- a/tests/utils.py +++ b/tests/utils.py @@ -1,4 +1,18 @@ +from __future__ import annotations + +import io +import os +import pathlib +import wave + import jiwer as tr +from livekit import rtc +from livekit.agents import utils + +TEST_AUDIO_FILEPATH = os.path.join(os.path.dirname(__file__), 
"long.mp3") +TEST_AUDIO_TRANSCRIPT = pathlib.Path( + os.path.dirname(__file__), "long_transcript.txt" +).read_text() def wer(hypothesis: str, reference: str) -> float: @@ -21,3 +35,49 @@ def wer(hypothesis: str, reference: str) -> float: reference_transform=wer_standardize_contiguous, hypothesis_transform=wer_standardize_contiguous, ) + + +def read_mp3_file(path) -> rtc.AudioFrame: + mp3 = utils.codecs.Mp3StreamDecoder() + frames: list[rtc.AudioFrame] = [] + with open(path, "rb") as file: + while True: + chunk = file.read(4096) + if not chunk: + break + + frames.extend(mp3.decode_chunk(chunk)) + + return utils.merge_frames(frames) # merging just for ease of use + + +def make_test_audio( + chunk_duration_ms: int | None = None, +) -> tuple[list[rtc.AudioFrame], str]: + mp3_audio = read_mp3_file(TEST_AUDIO_FILEPATH) + + if not chunk_duration_ms: + return [mp3_audio], TEST_AUDIO_TRANSCRIPT + + chunk_size = int(mp3_audio.sample_rate / (1000 / chunk_duration_ms)) + bstream = utils.audio.AudioByteStream( + sample_rate=mp3_audio.sample_rate, + num_channels=mp3_audio.num_channels, + samples_per_channel=chunk_size, + ) + + frames = bstream.write(mp3_audio.data.tobytes()) + frames.extend(bstream.flush()) + return frames, TEST_AUDIO_TRANSCRIPT + + +def make_wav_file(frames: list[rtc.AudioFrame]) -> bytes: + buffer = utils.merge_frames(frames) + io_buffer = io.BytesIO() + with wave.open(io_buffer, "wb") as wav: + wav.setnchannels(buffer.num_channels) + wav.setsampwidth(2) # 16-bit + wav.setframerate(buffer.sample_rate) + wav.writeframes(buffer.data) + + return io_buffer.getvalue()