Slow inference time using CPU #271
Comments
Hi. First, you are not measuring the inference time correctly: you should include the loop. Second, what are you comparing the inference time against? It can only be slow relative to something else (e.g. slower than openai/whisper or whisper.cpp).
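To illustrate the point about including the loop: if `transcribe()` returns a lazy generator, timing only the call itself measures almost nothing. The sketch below uses a stdlib stand-in (`fake_transcribe` is hypothetical, not the library's API) to show the difference between timing the call and timing the loop that consumes the segments.

```python
import time

def fake_transcribe(n_segments=3, delay_s=0.02):
    """Stand-in for a transcribe() call that returns a lazy generator:
    almost no work happens until the segments are iterated."""
    def segments():
        for i in range(n_segments):
            time.sleep(delay_s)  # simulate per-segment decoding work
            yield f"segment {i}"
    return segments()

# Misleading: this only times the creation of the generator.
t0 = time.perf_counter()
segments = fake_transcribe()
creation_s = time.perf_counter() - t0

# Correct: time the loop that actually consumes the segments.
t0 = time.perf_counter()
texts = [s for s in segments]
full_s = time.perf_counter() - t0

print(f"creation: {creation_s:.4f}s, full decode: {full_s:.4f}s")
```

The creation time is microseconds regardless of how long the actual decoding takes, which is why a benchmark that stops before the loop looks deceptively fast.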
Thank you for responding.
You may need to reduce the threads. Even on my MacBook, using more than 6 threads gave slower results.
@archive-r Reducing from 8 to 2 threads resulted in doubling the time.
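The two reports above point in opposite directions, which suggests the sweet spot is workload- and machine-dependent: too many threads can oversubscribe the cores, too few leaves them idle. A hedged sketch of picking a capped thread count (the `WhisperModel` constructor and its `cpu_threads`/`compute_type` parameters are from faster-whisper; the cap of 4 is an assumption, not a recommendation from the maintainers):

```python
import os

# Cap the thread count rather than using every core; the comments above
# report slowdowns both with too many threads and with too few.
cpu_threads = min(4, os.cpu_count() or 1)

def build_model(model_size="small"):
    # Sketch only: constructing WhisperModel downloads weights on first use,
    # so this function is defined here but not called.
    from faster_whisper import WhisperModel
    return WhisperModel(model_size, device="cpu",
                        compute_type="int8", cpu_threads=cpu_threads)

print(f"would use cpu_threads={cpu_threads}")
```

Trying a few values (e.g. 2, 4, 6) on the target machine is likely the only reliable way to find the best setting.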
What is the sampling rate of the WAV file you used for your test?
16 kHz
So the inference time may be slow relative to your expectation, but it's still much faster than openai/whisper, for example. You should use a smaller model if it's still too slow for your usage.
@guillaumekln Thanks.
Sure, the call to …
Indeed, it helped to accelerate the inference!
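The exchange above is about lazy evaluation: the `transcribe()` call returns quickly because it does no decoding, and the real work only runs when the result is iterated. A minimal stdlib demonstration (the `lazy_transcribe` helper is hypothetical, standing in for the library call):

```python
# The transcription work is deferred: creating the generator does nothing
# observable; iterating it is what triggers the computation.
decoded = []

def lazy_transcribe():
    def segments():
        decoded.append("work happened")  # side effect marks real work
        yield "hello world"
    return segments()

segments = lazy_transcribe()
assert decoded == []      # nothing has run yet, despite the "fast" call
texts = list(segments)    # consuming the generator triggers the work
assert decoded == ["work happened"]
print(texts)
```

So wrapping the generator in `list(...)`, or looping over it, is what actually forces (and therefore times) the transcription.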
Indeed, it's quite slow: a 2-minute MP3 takes 6 minutes to finish, while OpenAI's Whisper finishes within 2 minutes. Did I do something wrong? Here's the code I used:
Hi All,
How are you?
Thank you for your valuable and amazing contribution.
I tried to run inference with the large-v2 Whisper model using this repo, and the inference time seems pretty slow. I looked at some of the posted issues and benchmarks and saw that it should be lightning fast, so obviously I'm doing something wrong. Here is the code snippet I used:
I ran it over multiple short samples (<10 sec each) using two different systems with the int8 compute_type:
Mac M2 with 32 GB RAM -> took around 5 sec to transcribe.
EC2 r3.4xlarge machine with Linux OS, 16 CPU cores, and 120 GB RAM -> took around 10 sec to transcribe.
Note that changing the passed arguments (cpu_threads, beam_size, etc.) did not help either.
Thank you in advance!
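One way to judge whether these numbers are actually slow is the real-time factor: processing time divided by audio duration, where values below 1.0 mean faster than real time. Assuming the clips are roughly 10 seconds long (the post only says under 10 sec), the reported figures work out to roughly RTF 0.5 on the M2 and 1.0 on the EC2 box. A small helper to make the arithmetic explicit (`real_time_factor` is a hypothetical name, not part of the library):

```python
def real_time_factor(processing_s, audio_s):
    """RTF below 1.0 means the transcription runs faster than real time."""
    return processing_s / audio_s

# Rough figures from the report above, assuming ~10 s clips.
print(real_time_factor(5.0, 10.0))   # Mac M2
print(real_time_factor(10.0, 10.0))  # EC2 r3.4xlarge
```

Note that for very short clips a fixed per-call overhead (model warm-up, feature extraction) can dominate, so RTF measured on 10-second samples may look worse than on longer audio.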