bug: Cannot change transcription language #20

Closed
mxpucci opened this issue Mar 2, 2023 · 4 comments · Fixed by #52
Labels
bug Something isn't working

Comments


mxpucci commented Mar 2, 2023

Describe the bug

Even when the original audio is not in English, the transcription is always translated into English.
I tried changing the language property of the params with api.Params.language = 'it', but it had no effect.

To reproduce

import ffmpeg
import numpy as np
from whispercpp import Whisper
from whispercpp import api

try:
    y, _ = (
        ffmpeg.input("/Users/michelangelopucci/Downloads/untitled folder 2/output.wav", threads=0)
        .output("-", format="s16le", acodec="pcm_s16le", ac=1)
        .run(
            cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True
        )
    )
except ffmpeg.Error as e:
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

api.Params.language = 'it'
w = Whisper.from_pretrained("large")
a = w.transcribe(arr)
print(a)

Expected behavior

No response

Environment

Python 3.9.6

mxpucci added the bug label Mar 2, 2023
Owner

aarnphm commented Mar 2, 2023

You have to change it on the Whisper instance:

w = Whisper.from_pretrained("tiny")
w.params.language = "it"
w.transcribe(arr)
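
A minimal pure-Python sketch of why an assignment on the class can have no effect while an assignment on the instance does. The Params and Engine classes below are hypothetical stand-ins written for illustration only; they are not whispercpp's actual bindings, and the real behaviour of api.Params may differ.

```python
class Params:
    """Hypothetical stand-in for a bound params object (not whispercpp's class)."""

    def __init__(self):
        self._language = "en"  # the value the engine actually reads

    @property
    def language(self):
        return self._language

    @language.setter
    def language(self, value):
        self._language = value


class Engine:
    """Hypothetical stand-in for an object that owns its own Params."""

    def __init__(self):
        self.params = Params()

    def effective_language(self):
        # Reads the stored state directly, as native code would read a struct field.
        return self.params._language


# Setting the attribute on the instance goes through the property setter.
first = Engine()
first.params.language = "it"
instance_level = first.effective_language()

# Setting it on the class rebinds the class attribute (replacing the property);
# the state held by existing instances is left untouched.
second = Engine()
Params.language = "it"
class_level = second.effective_language()

print(instance_level, class_level)
```

In this sketch the instance-level assignment changes the language the engine sees, while the class-level assignment only replaces the property object on the class.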

Author

mxpucci commented Mar 2, 2023

I also tried that, but I get this error: whisper_lang_id: unknown language 'ӄ'
In fact, once the language property has been set, accessing w.params.language raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 0: invalid continuation byte

Owner

aarnphm commented Mar 3, 2023

I think there is currently a bug in how the params c_str is handled. Feel free to put up a PR to fix it. The Params object is defined in src/whispercpp/api_export.cc.
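
For reference, a small sketch of how the UnicodeDecodeError reported above arises in general: it is what Python raises when bytes that are not valid UTF-8 get decoded as text. The byte values below are made up for illustration; this does not reproduce the whispercpp bug or confirm its root cause, it only shows the failure mode that a string buffer with unexpected contents would produce.

```python
# 0xCE opens a two-byte UTF-8 sequence, so it must be followed by a
# continuation byte (0x80-0xBF). These example bytes are made up.
valid = bytes([0xCE, 0xB1])    # decodes to the Greek letter alpha
invalid = bytes([0xCE, 0x41])  # 0x41 ('A') is not a continuation byte

decoded = valid.decode("utf-8")

try:
    invalid.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    # Keep the exception so its position and reason can be inspected.
    error = exc

print(repr(decoded))
print(error)
```

The same two fields seen in the report (the offending byte position and the "invalid continuation byte" reason) are available on the exception as error.start and error.reason.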

@lasseedfast

Wish I could but I cannot... Hope someone else can fix this soon!

3 participants