[Feature Request] Constrain Available Languages when Autodetecting Language #1164
Comments
You can already do this,
I see. I don't think the version of Faster-Whisper I was using (1.0.3) allowed you to return language probabilities like this. I wrote some code to return the desired languages. It works fine, but I still think it would be simpler for the user if you could just pass a language list to the transcribe function. I'll let you decide whether to close this issue.
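For anyone landing here, a minimal sketch of that kind of workaround (not the code referred to above, and not part of Faster-Whisper): it assumes you already have a list of (language, probability) pairs, such as the `all_language_probs` field on the info object returned by `transcribe`, and picks the most likely language from an allowed set. The helper name `pick_allowed_language` is hypothetical.

```python
from typing import Iterable, List, Tuple


def pick_allowed_language(
    all_language_probs: List[Tuple[str, float]],
    allowed: Iterable[str],
) -> Tuple[str, float]:
    """Return the most likely allowed language and its renormalized probability.

    `all_language_probs` is assumed to be a list of (language_code, probability)
    pairs covering every language the model can detect.
    """
    allowed_set = set(allowed)
    # Keep only the candidates the caller is willing to accept.
    candidates = [(lang, p) for lang, p in all_language_probs if lang in allowed_set]
    if not candidates:
        raise ValueError("none of the allowed languages appear in the probabilities")
    # Renormalize so the returned probability is relative to the allowed set only.
    total = sum(p for _, p in candidates)
    best_lang, best_p = max(candidates, key=lambda pair: pair[1])
    return best_lang, (best_p / total if total > 0 else 0.0)


# Hypothetical usage with Faster-Whisper (assumes info.all_language_probs is
# populated when no language is passed; check this against your version):
#
#   _, info = model.transcribe("audio.mp3", beam_size=5)
#   language, _ = pick_allowed_language(info.all_language_probs, ["en", "es", "fr"])
#   segments, info = model.transcribe("audio.mp3", beam_size=5, language=language)
```

This only constrains the single detected language for the whole file; it does not help in the code-switched case discussed further down.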
Hi @MahmoudAshraf97, what if multilingual=True? It does not seem possible here to limit the possible languages in a code-switched setting?
Yes, it's not possible
Is it possible to pass two language tokens into the model prompt for whisper? Or is that a fundamental limitation?
It's possible
Would you mind giving a pointer on how to do this? I'm happy to make a PR if that is useful.
Currently Faster-Whisper only allows you to specify a single language or attempt to detect the language out of a pool of 94 languages. I would like to be able to limit which languages can be detected. Something like the following would limit autodetection to only English, Spanish and French:
model.transcribe("audio.mp3", beam_size=5, language=["en", "es", "fr"])