Is there any additional way to improve the performance? #14

Closed
KihongK opened this issue Jun 27, 2024 · 1 comment
Comments

KihongK commented Jun 27, 2024

We are currently using the Whisper-Tiny multilingual model and seeking ways to improve its performance. We would appreciate any insights or suggestions on how to enhance the model's accuracy, speed, and overall efficiency.

Model:
Whisper-Tiny Multilingual TFLite (decord_id is Korean) nyadla-sys/whisper.tflite#15 (comment)

Apply:
#4 (comment)
In addition, to achieve real-time speech processing, we call sendData(sample) in Recorder.java.

The accuracy is low; are there any ways to improve it?

I understand that the low accuracy is partly due to using the Tiny model and converting it to TFLite, but I would still like to improve performance as much as possible.

vilassn (Owner) commented Jul 12, 2024

@KihongK The main cause of the Whisper-Tiny model's low accuracy is how the audio is segmented. Currently, we feed the model fixed 3-second audio clips without regard to natural pauses in speech, which often cuts off words and phrases. To improve accuracy, we should segment the audio at pauses in speech rather than at fixed time intervals. This can be done with voice activity detection (VAD), which detects where speech starts and stops.

In short, to improve performance, improve the audio segmentation: use VAD so that clips are cut at natural pauses rather than mid-word or mid-sentence.
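
As a rough illustration of pause-based segmentation, here is a minimal sketch. It is not the app's actual code: the class `PauseSegmenter`, the method `segment`, and every parameter value are hypothetical, and a plain RMS energy threshold stands in for a real VAD. It assumes 16-bit mono PCM samples already collected in a `short[]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of this project): splits 16-bit mono PCM
 * into speech segments at pauses, using a simple RMS energy threshold
 * as a stand-in for a real voice activity detector.
 */
final class PauseSegmenter {

    /** Root-mean-square amplitude of samples[from, to). */
    static double rms(short[] samples, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) {
            sum += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sum / (to - from));
    }

    /**
     * @param pcm              16-bit mono samples
     * @param frameSize        samples per analysis frame (e.g. 320 = 20 ms at 16 kHz)
     * @param rmsThreshold     frames with RMS at or above this count as speech (assumed value, needs tuning)
     * @param minSilenceFrames consecutive silent frames that end a segment
     * @return one array per detected speech segment, trimmed to its last speech frame
     */
    static List<short[]> segment(short[] pcm, int frameSize,
                                 double rmsThreshold, int minSilenceFrames) {
        List<short[]> segments = new ArrayList<>();
        int start = -1;         // first sample of the open segment, -1 if none
        int lastSpeechEnd = 0;  // one past the last sample of the latest speech frame
        int silentRun = 0;      // consecutive silent frames inside the open segment

        // A trailing partial frame (shorter than frameSize) is ignored.
        for (int pos = 0; pos + frameSize <= pcm.length; pos += frameSize) {
            boolean speech = rms(pcm, pos, pos + frameSize) >= rmsThreshold;
            if (speech) {
                if (start < 0) {
                    start = pos;
                }
                lastSpeechEnd = pos + frameSize;
                silentRun = 0;
            } else if (start >= 0 && ++silentRun >= minSilenceFrames) {
                // The pause is long enough: close the segment at the last speech frame.
                segments.add(Arrays.copyOfRange(pcm, start, lastSpeechEnd));
                start = -1;
                silentRun = 0;
            }
        }
        if (start >= 0) {
            // Flush a segment that was still open when the input ended.
            segments.add(Arrays.copyOfRange(pcm, start, lastSpeechEnd));
        }
        return segments;
    }
}
```

With 16 kHz audio, a call such as `PauseSegmenter.segment(pcm, 320, 500.0, 15)` would treat roughly 300 ms of silence as a segment boundary, and each returned array could then be passed to the model in place of a fixed 3-second clip. The threshold and pause length here are guesses and would need tuning for the microphone and noise level. A dedicated VAD would likely be more robust than an energy threshold in noisy conditions.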

KihongK closed this as completed Oct 8, 2024