
Whisperx Optimization #2258

Open · wants to merge 8 commits into master
Conversation

hauck-jvsh (Member):

Closes #1539

hauck-jvsh (Member, Author):

This must be used with our fork of WhisperX, as I had to change the library to accept more than one file and to skip ffmpeg; we already convert the audio to WAV before sending it to be transcribed.
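For context, stock whisperx.load_audio() spawns an ffmpeg subprocess to decode and resample every input, which is wasted work when the files have already been converted to 16 kHz WAVs upstream. A minimal sketch of an ffmpeg-free load path, assuming pre-converted input; the load_wav name and its validation are illustrative, not the fork's actual code:

```python
import numpy as np
import soundfile as sf

SAMPLE_RATE = 16_000  # Whisper-family models expect 16 kHz mono float32 input


def load_wav(path: str) -> np.ndarray:
    """Read a pre-converted 16 kHz mono WAV directly, with no ffmpeg call.

    Assumes the caller has already decoded, resampled, and downmixed
    the audio to WAV before transcription.
    """
    audio, sr = sf.read(path, dtype="float32")
    if sr != SAMPLE_RATE:
        raise ValueError(f"expected {SAMPLE_RATE} Hz, got {sr} Hz: {path}")
    if audio.ndim > 1:  # defensive: fold any leftover stereo down to mono
        audio = audio.mean(axis=1)
    return audio
```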

lfcnassif (Member) commented Jul 10, 2024

Thank you very very much @hauck-jvsh! I'll run a basic accuracy test soon to make sure accuracy wasn't affected by the changes. Have you pushed the changes to WhisperX to some branch in our fork?

To let others know: with the batch inference approach suggested in #1539 (processing up to 16 audios at the same time) and by avoiding a duplicated ffmpeg run inside the library, @hauck-jvsh was able to speed up WhisperX inference on a big batch of audios with different durations by up to 5x-6x on an RTX 3090!
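As a rough illustration of the batching idea (the fork's real implementation lives in the multi-audio branch; the function name and the duration-sorting step below are assumptions, not confirmed by the thread):

```python
from typing import Iterator, List

import soundfile as sf

BATCH_SIZE = 16  # the PR transcribes up to 16 audios at the same time


def batches(paths: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield groups of up to `size` audio paths for one batched inference pass."""
    # Assumption, not from the thread: sorting by duration groups
    # similar-length audios together, which reduces padding overhead
    # when they are stacked into a single tensor.
    ordered = sorted(paths, key=lambda p: sf.info(p).duration)
    for start in range(0, len(ordered), size):
        yield ordered[start:start + size]
```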

lfcnassif changed the title from Whisperx to Whisperx Optimization on Jul 10, 2024
hauck-jvsh (Member, Author) commented Jul 10, 2024

Yes, I have pushed the changes to a branch called multi-audio
https://github.com/sepinf-inc/whisperX/tree/multi-audio

hauck-jvsh (Member, Author):

I forgot to run the tests with all the changes disabled to compare. I only ran tests against the wav2vec model, and I managed to get WhisperX down to only 1.6 times slower than wav2vec.

lfcnassif (Member) commented Jul 10, 2024

> I forgot to run the tests with all the changes disabled to compare. I only ran tests against the wav2vec model, and I managed to get WhisperX down to only 1.6 times slower than wav2vec.

Default WhisperX is about 9x slower than the Wav2Vec2 large model on an RTX 3090 with my test data sets, so 9 / 1.6 = 5.6x speed up :-)

I've just computed the WER numbers for 2 relevant data sets:

| WER              | TedX (3.8h) | Real data set (1h) |
|------------------|-------------|--------------------|
| WhisperX-LargeV3 | 0.134       | 0.193              |
| ThisPR-LargeV3   | 0.134       | 0.187              |

So accuracy seems fine!
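For anyone wanting to reproduce this kind of comparison, word error rate is commonly computed with the jiwer package; a sketch under that assumption (the file names are placeholders, and the thread does not say which tool was actually used):

```python
import jiwer

# Placeholder paths: one ground-truth transcript line and one model output
# line per audio, aligned by position.
with open("reference.txt", encoding="utf-8") as f:
    references = f.read().splitlines()
with open("hypothesis.txt", encoding="utf-8") as f:
    hypotheses = f.read().splitlines()

# jiwer applies its default text normalization; lower is better.
print(f"WER: {jiwer.wer(references, hypotheses):.3f}")
```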

Successfully merging this pull request may close: Try to speed up Whisper transcription inference (#1539)

2 participants