Improve speech recognition accuracy #44

Merged on May 12, 2024 (3 commits)

@uezo (Owner) commented on May 12, 2024

Overview

Improved speech recognition accuracy by measuring the ambient noise level and automatically setting the voice detection threshold from it.

[INFO] 2024-05-12 11:15:54,704 : Input device: [1] MacBook Airのマイク
[INFO] 2024-05-12 11:15:54,704 : Output device: [2] MacBook Airのスピーカー
[INFO] 2024-05-12 11:15:54,816 : Measuring noise levels...
Noise level: -61.68dB
[INFO] 2024-05-12 11:15:57,964 : Set volume threshold: -41.0dB

Additionally, the volume is now measured at a fixed interval of 0.05 seconds, so that all audio data is analyzed consistently.
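
As an illustration of fixed-interval measurement, here is a minimal sketch that splits a buffer into 0.05-second chunks and computes one level per chunk. It is not the code from this PR: the function names `rms_db` and `levels_per_interval`, the use of float samples in the range -1.0 to 1.0, and the RMS-based dB formula are all assumptions made for the example.

```python
import math


def rms_db(samples):
    """Return the RMS level of float samples in [-1.0, 1.0] in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    # 10 * log10(mean square) equals 20 * log10(RMS)
    return 10.0 * math.log10(mean_square)


def levels_per_interval(samples, sample_rate, interval_sec=0.05):
    """Split samples into fixed-length chunks and return one dB level per chunk.

    Hypothetical helper: the 0.05 s default mirrors the interval named in this PR.
    """
    chunk_size = max(1, round(sample_rate * interval_sec))
    return [
        rms_db(samples[start:start + chunk_size])
        for start in range(0, len(samples), chunk_size)
    ]


if __name__ == "__main__":
    # 0.1 s of a quiet constant signal followed by 0.05 s at full scale, at 16 kHz
    audio = [0.1] * 1600 + [1.0] * 800
    for level in levels_per_interval(audio, sample_rate=16000):
        print(f"{level:.2f}dB")
```

Because every chunk has the same length, each level covers the same amount of audio and no samples are skipped between measurements.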

Threshold adjustment

Introduced a new parameter, noise_margin, that sets the sensitivity margin above the measured noise level.
Use it to fine-tune the voice detection threshold for the ambient noise conditions.

app = AIAvatar(
    openai_api_key=OPENAI_API_KEY,
    google_api_key=GOOGLE_API_KEY,
    noise_margin=10.0   # Margin above the measured noise level
)
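
To make the role of the margin concrete, the following sketch assumes the threshold is simply the measured noise level plus noise_margin, both in dB. The function `volume_threshold_from_noise` is hypothetical, and the library may compute or round the value differently.

```python
def volume_threshold_from_noise(noise_level_db, noise_margin):
    """Place the detection threshold noise_margin dB above the measured noise level.

    Assumption for illustration only: plain addition, with no rounding or clamping.
    """
    return noise_level_db + noise_margin


if __name__ == "__main__":
    # Noise level taken from the log in this PR; the margin matches the example above
    threshold = volume_threshold_from_noise(-61.68, 10.0)
    print(f"Set volume threshold: {threshold:.1f}dB")
```

Under this assumption, a larger margin makes detection less sensitive to background noise, and a smaller margin makes it easier to pick up quiet speech.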

Set noise filter level manually

To set the noise filter level for voice detection manually, set auto_noise_filter_threshold to False and specify volume_threshold_db in decibels (dB).

app = AIAvatar(
    openai_api_key=OPENAI_API_KEY,
    google_api_key=GOOGLE_API_KEY,
    auto_noise_filter_threshold=False,
    volume_threshold_db=-40   # Set the voice detection threshold to -40 dB
)
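
For intuition about what a fixed threshold does, the sketch below treats each measured level as voice only when it is above volume_threshold_db. The function `detect_voice_chunks` and the strict comparison are assumptions for this example, not the library's actual detection logic.

```python
def detect_voice_chunks(levels_db, volume_threshold_db=-40.0):
    """Return one boolean per level: True when the level is above the threshold.

    Hypothetical helper; the default of -40 dB mirrors the manual setting shown above.
    """
    return [level > volume_threshold_db for level in levels_db]


if __name__ == "__main__":
    # Example levels in dB: background noise, speech, exactly at threshold, loud speech
    levels = [-61.7, -35.0, -40.0, -12.3]
    print(detect_voice_chunks(levels, volume_threshold_db=-40.0))
```

A threshold closer to 0 dB rejects more background noise but can miss quiet speech, so a manual value is usually chosen a little above the loudest noise expected in the room.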

uezo added 3 commits May 12, 2024 11:14
@uezo merged commit 93e522b into main on May 12, 2024