Zero-pad the audio, not the spectrogram #579

Closed
ggerganov opened this issue Mar 7, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@ggerganov
Owner

Makes sense - hopefully will reduce hallucinations

openai/whisper#838 (comment)

@ggerganov ggerganov added the enhancement New feature or request label Mar 7, 2023
@lunixbochs

lunixbochs commented Mar 7, 2023

(Removed previous incorrect assumption about the featurization).

As long as you don't change the featurization, it looks like you can just switch your padding to -1.5 if it's more convenient to keep padding at the spectrogram level.

It seems like your speed_up could interfere with the padding otherwise
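
As a rough illustration of the two options discussed above (padding the waveform with zeros versus padding the spectrogram with a constant), here is a minimal sketch. It assumes 16 kHz audio and 30-second windows, and it assumes the log-mel normalization is `log10` with a floor of `1e-10` followed by `(x + 4) / 4`; the dynamic-range clamp relative to the spectrogram maximum is deliberately left out. The names `zero_pad_audio` and `normalized_log_mel` are hypothetical helpers, not functions from whisper.cpp or Whisper.

```python
import math

SAMPLE_RATE = 16000                      # assumed: 16 kHz mono input
CHUNK_SECONDS = 30                       # assumed: 30-second encoder window
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # samples per window


def zero_pad_audio(samples, target_len=N_SAMPLES):
    """Hypothetical helper: pad (or truncate) a waveform to target_len samples.

    Padding happens on the raw audio, before any spectrogram is computed.
    """
    if len(samples) >= target_len:
        return list(samples[:target_len])
    return list(samples) + [0.0] * (target_len - len(samples))


def normalized_log_mel(power, eps=1e-10):
    """Hypothetical helper: assumed normalization of a single mel power value.

    Floors the power at eps, takes log10, then rescales with (x + 4) / 4.
    The clamp against the spectrogram maximum is omitted in this sketch.
    """
    return (math.log10(max(power, eps)) + 4.0) / 4.0


# Pad a short clip with zeros at the audio level.
clip = [0.5, -0.5]
padded = zero_pad_audio(clip, target_len=5)

# Under the assumed normalization, a frame of pure silence (zero power)
# lands at roughly -1.5, which is where the suggested padding value comes from.
silence_value = normalized_log_mel(0.0)
```

Under these assumptions, filling spectrogram frames with about -1.5 should approximate what zero-padded audio would produce, as long as the featurization itself is unchanged and the omitted clamp does not raise the floor.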

@meakbiyik
Contributor

Oh this might be a game-changer, I wasn't able to find a way to reduce them so far.

@albino1

albino1 commented Mar 23, 2023

This was merged into Whisper as of 20230307. Is there a chance we'll see it in whisper.cpp soon?

ggerganov added a commit that referenced this issue Apr 15, 2023
Also, fallback only if more temperatures are available and if we are
at least 3 seconds before the end of the audio
jacobwu-b pushed a commit to jacobwu-b/Transcriptify-by-whisper.cpp that referenced this issue Oct 24, 2023
Also, fallback only if more temperatures are available and if we are
at least 3 seconds before the end of the audio
landtanin pushed a commit to landtanin/whisper.cpp that referenced this issue Dec 16, 2023
Also, fallback only if more temperatures are available and if we are
at least 3 seconds before the end of the audio
iThalay pushed a commit to iThalay/whisper.cpp that referenced this issue Sep 23, 2024
Also, fallback only if more temperatures are available and if we are
at least 3 seconds before the end of the audio

4 participants