Streaming Microphone for CLI #35
Looks really good so far! Excellent approach taking the code from the example and adding it to a shared class. I'm curious about the `Transcriber` protocol, can you elaborate on your use case for that?
Since we have this new `AudioWarper` here now, do you think there's any recording code in `AudioProcessor` that would fit into here as well? It would be nice to have a few debug logs from `Logging.debug` in this section as well.
One other thing: I think the microphone streaming should be explicit in `swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3"`, such as a `--stream` boolean argument. The reason is to give people a heads-up if they forgot to include `--audio-path`, and to request the microphone only if we are sure they want to stream.
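To make the suggestion concrete, here is a minimal sketch of how an explicit `--stream` flag could gate microphone access. It is not the PR's implementation: `TranscribeOptions`, `InputMode`, `OptionError`, `parseOptions`, and `resolveInputMode` are hypothetical names, the hand-rolled parsing only stands in for whatever argument parser the CLI actually uses, and treating `--stream` together with `--audio-path` as an error is an assumption of this sketch.

```swift
import Foundation

// Hypothetical option set for illustration; not the CLI's real types.
struct TranscribeOptions: Equatable {
    var audioPath: String?
    var stream = false
}

enum InputMode: Equatable {
    case file(String)
    case microphone
}

enum OptionError: Error, Equatable {
    case missingInput            // neither --audio-path nor --stream was given
    case conflictingInput        // both were given (assumed to be an error here)
    case missingValue(String)    // an option that needs a value had none
}

// Minimal hand-rolled parser that only models the two options discussed above.
func parseOptions(_ args: [String]) throws -> TranscribeOptions {
    var options = TranscribeOptions()
    var index = 0
    while index < args.count {
        switch args[index] {
        case "--stream":
            options.stream = true
        case "--audio-path":
            guard index + 1 < args.count else {
                throw OptionError.missingValue("--audio-path")
            }
            index += 1
            options.audioPath = args[index]
        default:
            break // options this sketch does not model, e.g. --model-path
        }
        index += 1
    }
    return options
}

// Decide where audio comes from. The microphone is chosen only when
// --stream was passed explicitly, so a forgotten --audio-path produces
// an error instead of a surprise microphone permission prompt.
func resolveInputMode(_ options: TranscribeOptions) throws -> InputMode {
    if let path = options.audioPath {
        if options.stream { throw OptionError.conflictingInput }
        return .file(path)
    }
    guard options.stream else { throw OptionError.missingInput }
    return .microphone
}
```

With this shape, running the command with only `--model-path` fails with `missingInput`, which is the heads-up described above, and the microphone is requested only after `resolveInputMode` returns `.microphone`.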
This should allow significant cleanup for the example apps with this shared interface, but that can happen separately, nicely done!
@ZachNagengast thanks for your comments, added some changes, lmk what you think
In order to do the transcription in
added more logging, changed name to
added
I could work on this cleanup ofc
Alright tried this out and have just some minor tweaks to the UI:
Everything else looks good, I'll try to help with this as well after word timestamps.
I'm using the model loading function from
changed the state change callback in
Great work @jkrukowski and @ZachNagengast! We can work on improving the output formatting next week in a separate PR.
This PR:
- `ContentView.swift`
- `AudioWarper` / `AudioStreamTranscriber` to combine streaming audio from the microphone, processing it, and transcribing it in real time
- `Transcriber` protocol to expose `WhisperKit` transcribe methods
- `AudioWarper` / `AudioStreamTranscriber` can replace the streaming logic in the example app as well

Resolves: #25
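
As a rough illustration of why a protocol in front of the transcribe call helps a streaming class, the sketch below lets a buffering component depend on a small protocol rather than on a concrete model, so it can be exercised with a stand-in. Everything here is an assumption for illustration: the method name and signature on `Transcriber` are guesses rather than the PR's actual declaration, and `FakeTranscriber` and `StreamingSession` are hypothetical names, not WhisperKit types.

```swift
import Foundation

// Hypothetical shape of the protocol; the PR's real `Transcriber`
// may declare different methods and return types.
protocol Transcriber: Sendable {
    func transcribe(audioArray: [Float]) async throws -> String
}

// Stand-in conformer so the streaming logic can run without a model.
struct FakeTranscriber: Transcriber {
    func transcribe(audioArray: [Float]) async throws -> String {
        "heard \(audioArray.count) samples"
    }
}

// Hypothetical streaming helper: accumulates microphone samples and asks
// the transcriber for text once enough audio has been buffered.
actor StreamingSession<T: Transcriber> {
    private let transcriber: T
    private let minSamples: Int
    private var buffer: [Float] = []

    init(transcriber: T, minSamples: Int) {
        self.transcriber = transcriber
        self.minSamples = minSamples
    }

    // Returns nil while the buffer is still too short to transcribe,
    // otherwise the transcription of everything buffered so far.
    func append(_ samples: [Float]) async throws -> String? {
        buffer.append(contentsOf: samples)
        guard buffer.count >= minSamples else { return nil }
        return try await transcriber.transcribe(audioArray: buffer)
    }
}
```

A caller would create `StreamingSession(transcriber:minSamples:)` with any conforming type and feed it chunks via `try await session.append(chunk)`. Re-transcribing the whole buffer on every call is a simplification of this sketch, not a claim about how the PR handles audio windows.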