[Feature Request] Constrain Available Languages when Autodetecting Language #1164

Open
WesleyFister opened this issue Nov 22, 2024 · 7 comments

Comments

@WesleyFister

Currently, Faster-Whisper only allows you to specify a single language or attempt to detect the language out of a pool of 94 languages. I would like to be able to limit which languages can be detected. For example, something like the following would limit autodetection to only English, Spanish and French:
model.transcribe("audio.mp3", beam_size=5, language=["en", "es", "fr"])

@MahmoudAshraf97
Collaborator

You can already do this. The detect_language function returns the probabilities of all languages; you can then exclude all languages except these 3, choose the one with the highest probability, and pass it manually to transcribe.

@WesleyFister
Author

I see. I don't think the version of Faster-Whisper I was using (1.0.3) allowed you to return language probabilities like this. I wrote some code to return the desired language. It works fine, but I still think it would be simpler for the user if you could just pass a language list to the transcribe function. I'll let you decide whether to close this issue or not.

```python
import numpy as np
from faster_whisper import WhisperModel
from scipy.io import wavfile

def limit_languages(audio, allowed_languages):
    # Assumes a 16 kHz mono 16-bit PCM WAV file; scale samples to float32 in [-1, 1).
    sampling_rate, audio_data = wavfile.read(audio)
    audio_data = audio_data.astype(np.float32) / 32768.0

    model = WhisperModel("large-v2", device="cpu", compute_type="int8")
    language, language_probability, all_language_probs = model.detect_language(audio_data)

    # Keep the allowed language with the highest probability (None if no match).
    score = 0
    detected_language = None
    for language_code, language_prob in all_language_probs:
        if language_code in allowed_languages and language_prob > score:
            score = language_prob
            detected_language = language_code

    return detected_language
```
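
The remaining step suggested above, passing the chosen language manually to transcribe, could look roughly like the sketch below. The helper names `pick_allowed_language` and `transcribe_with_allowed_languages` are hypothetical and not part of Faster-Whisper. The sketch assumes a release whose `detect_language` returns the full list of language probabilities, and it uses `decode_audio` so that the input is resampled to 16 kHz float32 regardless of the file format.

```python
from typing import Iterable, Optional, Sequence, Tuple


def pick_allowed_language(
    all_language_probs: Iterable[Tuple[str, float]],
    allowed_languages: Sequence[str],
) -> Optional[str]:
    """Return the allowed language code with the highest probability, or None."""
    allowed = set(allowed_languages)
    candidates = [(code, prob) for code, prob in all_language_probs if code in allowed]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])[0]


def transcribe_with_allowed_languages(model, audio_path, allowed_languages, **kwargs):
    """Detect the language among the allowed ones only, then transcribe with it.

    `model` is assumed to be a faster_whisper.WhisperModel whose detect_language()
    returns (language, probability, all_language_probs).
    """
    # Imported lazily so the selection helper above has no third-party dependency.
    from faster_whisper import decode_audio

    audio = decode_audio(audio_path)  # float32 samples at 16 kHz by default
    _, _, all_language_probs = model.detect_language(audio)
    language = pick_allowed_language(all_language_probs, allowed_languages)
    # If nothing matched, language=None leaves transcribe() to autodetect as usual.
    return model.transcribe(audio, language=language, **kwargs)
```

With this in place, `transcribe_with_allowed_languages(model, "audio.mp3", ["en", "es", "fr"], beam_size=5)` would behave like the call proposed at the top of this issue, with the model loaded once and reused for both detection and transcription.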

@George0828Zhang

George0828Zhang commented Dec 18, 2024

> You can already do this. The detect_language function returns the probabilities of all languages; you can then exclude all languages except these 3, choose the one with the highest probability, and pass it manually to transcribe.

Hi @MahmoudAshraf97, what if multilingual=True? It does not seem possible to limit the candidate languages in a code-switched setting?

@MahmoudAshraf97
Collaborator

Yes, it's not possible

@mariano54

Is it possible to pass two language tokens into the model prompt for whisper? Or is that a fundamental limitation?

@MahmoudAshraf97
Collaborator

It's possible

@mariano54

Would you mind giving a pointer on how to do this? I'm happy to make a PR if that is useful.
