streaming.exe + command.exe transcription much lower quality than main.exe with otherwise identical setup #1641

dgm3333 · 2023-12-14T16:17:05Z

I'm trying to get live transcription with streaming.exe (and/or command.exe) to be as accurate as main.exe on the same audio input. Ideally I would also like the input transcribed and saved as a .wav file at the same time for later reprocessing, but I'm not attempting both at once yet, since even basic streaming transcription is not working.

If I record audio from the PC mic with a C++ SDL2 program, save it as a 16 kHz WAV file (AUDIO_FORMAT = AUDIO_S16LSB), and then load it into whisper main.exe, main.exe transcribes it slightly faster than real time with reasonable accuracy (implying processing time isn't the limiting factor).
When I play the same audio into the same microphone (or speak normally), the transcription quality is significantly worse with streaming.exe or command.exe, and even on the high-spec machine some chunks of audio are ignored entirely.

I've tried this on multiple Windows 10 PCs, including top-end desktops (12 cores + 64GB + GPU) and relatively basic i5s with only 8GB and no GPU, and the difference is the same on all of them. I tested both ad-hoc voice and a track played from a speaker into the microphone, so that the input is identical in both cases. I've also had the same issue with every whisper version I've tried over the past year.

I've tried setting keep-context to true (the -kc / --keep-context option).

I've tried changing the following common-sdl.cpp settings, with no success:
changing format:
    AUDIO_F32 -> AUDIO_S16LSB
changing buffer size:
    capture_spec_requested.samples = 1024; -> 16384;
boosting SDL thread priority:
    SDL_SetHintWithPriority(SDL_HINT_AUDIO_RESAMPLING_MODE, "medium", SDL_HINT_OVERRIDE);
    -> added SDL_SetThreadPriority(SDL_THREAD_PRIORITY_HIGH);
setting C++ thread priority:
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_ABOVE_NORMAL);
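Two of the settings above are easier to reason about with numbers. whisper's whisper_full() takes 32-bit float PCM samples, so a capture format of AUDIO_S16LSB would presumably need a conversion step before the samples reach the model. The `samples` field of the SDL audio spec is a count of sample frames per callback, so at 16 kHz it translates directly into a callback period. The sketch below illustrates both points only. It is not code from common-sdl.cpp, the helper names s16_to_f32 and buffer_ms are hypothetical, and scaling by 1/32768 is one common convention, assumed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert signed 16-bit PCM to float PCM in [-1.0, 1.0).
// Dividing by 32768 is an assumed (but common) scaling convention.
std::vector<float> s16_to_f32(const std::int16_t* samples, std::size_t count) {
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<float>(samples[i]) / 32768.0f;
    }
    return out;
}

// Duration in milliseconds covered by one capture callback that delivers
// `frames` sample frames at `sample_rate` Hz.
double buffer_ms(int frames, int sample_rate) {
    return 1000.0 * frames / sample_rate;
}
```

Under these assumptions, buffer_ms(1024, 16000) is 64 ms and buffer_ms(16384, 16000) is 1024 ms, so the larger setting delivers audio to the callback roughly once per second instead of about 16 times per second.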

dgm3333 · 2023-12-19T20:53:24Z

I've updated some libraries on the build PC and upgraded it to Windows 11, and it is now working well, so this was potentially an external issue.

Works in realtime using:
cd C:\temp && C:\bin\stream.exe -m C:\bin\models\ggml-small.en.bin -c 0 -sa

dgm3333 mentioned this issue Dec 19, 2023: Can real-time transcription be achieved? #1653 (Open)

dgm3333 closed this as completed Dec 19, 2023
