Improvements to end of turn plugin (#1195)
Co-authored-by: jeradf <jeradfields@gmail.com> Co-authored-by: Long Chen <longch1024@gmail.com> Co-authored-by: Jayesh Parmar <60539217+jayeshp19@users.noreply.github.com>
1 parent 6b7f21b commit 6b4e903
Showing 10 changed files with 155 additions and 31 deletions.
@@ -0,0 +1,8 @@
---
"livekit-plugins-azure": minor
"livekit-plugins-turn-detector": patch
"livekit-plugins-openai": patch
"livekit-agents": patch
---

Improvements to end of turn plugin, ensure STT language settings.
@@ -1,2 +1,48 @@
# LiveKit Plugins Turn Detector

This plugin introduces end-of-turn detection for LiveKit Agents using a custom open-weight model to determine when a user has finished speaking.

Traditional voice agents use VAD (voice activity detection) for end-of-turn detection. However, VAD models lack language understanding, often causing false positives where the agent interrupts the user before they finish speaking.

By leveraging a language model specifically trained for this task, this plugin offers a more accurate and robust method for detecting end-of-turns. The current version supports English only and should not be used when targeting other languages.

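As a rough illustration of the false-positive problem described above, the sketch below contrasts silence-only endpointing with a check that also looks at the transcript. The word list and both function names are hypothetical, and the keyword heuristic is only a stand-in for a trained model; it is not how this plugin makes its decision.

```python
# Toy illustration only: the heuristic below stands in for a trained
# end-of-turn model and is NOT how the plugin works internally.

# Hypothetical set of words that rarely end a finished English utterance.
DANGLING_WORDS = {"and", "but", "or", "so", "because", "the", "a", "to", "my", "is"}


def vad_only_end_of_turn(silence_ms: int, threshold_ms: int = 500) -> bool:
    """Silence-based endpointing: fires whenever the pause is long enough."""
    return silence_ms >= threshold_ms


def text_aware_end_of_turn(transcript: str, silence_ms: int, threshold_ms: int = 500) -> bool:
    """Require a long enough pause AND a transcript that does not trail off."""
    if not vad_only_end_of_turn(silence_ms, threshold_ms):
        return False
    words = transcript.lower().rstrip(".?! ").split()
    return bool(words) and words[-1] not in DANGLING_WORDS


partial = "I'd like to book a flight to"
complete = "I'd like to book a flight to Paris."

# The same 600 ms pause follows both utterances.
print(vad_only_end_of_turn(silence_ms=600))
print(text_aware_end_of_turn(partial, silence_ms=600))
print(text_aware_end_of_turn(complete, silence_ms=600))
```

With a 600 ms pause, the silence-only check fires regardless of what was said, while the text-aware check holds back on the unfinished sentence and fires on the complete one.
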
## Installation

```bash
pip install livekit-plugins-turn-detector
```

## Usage

This plugin is designed to be used with the `VoicePipelineAgent`:

```python
from livekit.plugins import turn_detector

agent = VoicePipelineAgent(
    ...
    turn_detector=turn_detector.EOUModel(),
)
```

## Running your agent

This plugin requires model files. Before starting your agent for the first time, or when building Docker images for deployment, run the following command to download the model files:

```bash
python my_agent.py download-files
```

## Model system requirements

The end-of-turn model is optimized to run on CPUs with modest system requirements. It is designed to run on the same server hosting your agents. On a 4-core server instance, it completes inference in under 100ms with minimal CPU usage.

The model requires 1.5GB of RAM and runs within a shared inference server, supporting multiple concurrent sessions.

We are working to reduce the CPU and memory requirements in future releases.

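A deployment script could sanity-check a host against the figures above before starting workers. The following is a minimal sketch and not part of the plugin: `check_host` and `detect_total_ram_bytes` are hypothetical helpers, 1.5GB is assumed to mean 1.5 GiB, and the 4-core figure is treated as the benchmark reference rather than a hard minimum.

```python
import os
from typing import List, Optional

# Assumption: "1.5GB" in the text above is read as 1.5 GiB.
REQUIRED_RAM_BYTES = int(1.5 * 1024**3)
# The sub-100 ms figure was quoted for a 4-core instance; this is a
# reference point for comparison, not a stated minimum.
BENCHMARK_CORES = 4


def check_host(total_ram_bytes: int, cpu_cores: int) -> List[str]:
    """Return human-readable warnings; an empty list means no concerns."""
    warnings = []
    if total_ram_bytes < REQUIRED_RAM_BYTES:
        warnings.append("less RAM than the model's stated 1.5GB requirement")
    if cpu_cores < BENCHMARK_CORES:
        warnings.append("fewer cores than the 4-core benchmark instance")
    return warnings


def detect_total_ram_bytes() -> Optional[int]:
    """Best-effort physical RAM lookup; returns None where unsupported."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None


ram = detect_total_ram_bytes()
cores = os.cpu_count() or 1
if ram is not None:
    for warning in check_host(ram, cores):
        print("warning:", warning)
```

Total RAM is only a loose bound, since the agent process and anything else on the host share the same memory.
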
## License

The plugin source code is licensed under the Apache-2.0 license.

The end-of-turn model is licensed under the [LiveKit Model License](https://huggingface.co/livekit/turn-detector/blob/main/LICENSE).
livekit-plugins/livekit-plugins-turn-detector/livekit/plugins/turn_detector/log.py
2 changes: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 import logging

-logger = logging.getLogger("livekit.plugins.eou")
+logger = logging.getLogger("livekit.plugins.turn_detector")
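Since the logger is now named `livekit.plugins.turn_detector`, any logging configuration keyed on the old `livekit.plugins.eou` name would presumably need updating. A minimal sketch using only the standard `logging` module (the DEBUG and ERROR levels are arbitrary choices for illustration):

```python
import logging

# Logger name after this commit; it was "livekit.plugins.eou" before.
TURN_DETECTOR_LOGGER = "livekit.plugins.turn_detector"

# Keep the root logger at INFO, but surface debug output from the plugin.
logging.basicConfig(level=logging.INFO)
turn_logger = logging.getLogger(TURN_DETECTOR_LOGGER)
turn_logger.setLevel(logging.DEBUG)

# Settings applied to the old name no longer reach the plugin's logger:
# the two names are siblings under "livekit.plugins" in the hierarchy.
old_logger = logging.getLogger("livekit.plugins.eou")
old_logger.setLevel(logging.ERROR)

print(turn_logger.getEffectiveLevel() == logging.DEBUG)
```

Setting a level on the shared parent, `livekit.plugins`, would instead apply to every plugin logger that has no level of its own.
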