Transcribe audio files using the OpenAI Whisper (speech-to-text) API #451
…he result. Fire this when a new attachment is added
…rate transcriptions
…ctions class to better support multiple providers, following what was done in #437
…er has access. Use this method anywhere we output our functionality. This fixes a bug our e2e tests found; thanks tests :)
@dkotter Thanks for the amazing work and very detailed information in the PR description. PR looks good to me and it's working fine.
Just added 2 minor notes to check and we are ready to merge this.
Thanks.
	return new WP_Error( 'not_enabled', esc_html__( 'Transcripts are not enabled.', 'classifai' ) );
}

return true;
I think we should also add a check for authenticated. Without this, when we don't have the API key saved and the enable checkbox is checked, it shows the "Transcribe audio" option in bulk actions.
// Check if valid authentication is in place.
if ( empty( $settings ) || ( isset( $settings['authenticated'] ) && false === $settings['authenticated'] ) ) {
return new WP_Error( 'auth', esc_html__( 'Please set up valid authentication with OpenAI.', 'classifai' ) );
}
textField.value = resp;
}
}
},
},
buttonText: __('Re-transcribe', 'classifai'),
This keeps the button text as "Re-transcribe" after the process is complete. Currently, it becomes "Rescan".
…is enabled. Ensure button text stays correct
Description of the Change
This PR introduces a new integration with the OpenAI Whisper (speech-to-text) API. This integration will automatically create a transcript for any supported audio files and store that transcript as the post_content for the item (shows in the Description field).
Workflow
A new settings section is added under Tools > ClassifAI > Language Processing > OpenAI Whisper. Here there are three options to choose from:
Once configured, whenever a valid audio file is uploaded (it must be under 25 MB and the file type has to be one of: mp3, mp4, mpeg, mpga, m4a, wav, or webm), we send that file to the Whisper API. If we get a successful response back, we store the transcript as the post_content of the attachment item.
For existing audio items, there are a few options to generate transcripts. You can go to the Media Library grid view and click on an audio file. Within the modal that pops up, there will be an option to (Re-) Transcribe:
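As a rough illustration of the upload constraints described above (under 25 MB, one of the listed file types), here is a minimal sketch in JavaScript. It is not code from this PR: the function name isTranscribable, the constant names, and the choice to read "25 MB" as 25 * 1024 * 1024 bytes are all assumptions made for the example.

```javascript
// Hypothetical helper, for illustration only; not part of ClassifAI's actual API.

// Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
const MAX_BYTES = 25 * 1024 * 1024;

// File types listed in the PR description.
const SUPPORTED_EXTENSIONS = ['mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'wav', 'webm'];

/**
 * Return true when a file looks eligible for transcription:
 * its extension is in the supported list and its size is strictly under the limit.
 */
function isTranscribable(fileName, sizeInBytes) {
  const dot = fileName.lastIndexOf('.');
  if (dot === -1) {
    return false; // no extension, so the type cannot be determined
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.includes(extension) && sizeInBytes < MAX_BYTES;
}
```

A check like this would only decide whether to attempt a request; the API's own response remains the source of truth for whether a file is accepted.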
You can also go to the single media view and there will be a custom metabox with an option to (Re-) Transcribe the item. Check the box and then save the item:
If you prefer using the Media Library list view, there's a Transcribe audio bulk edit option as well as an inline Transcribe option:
How to test the Change
… Tools > ClassifAI > Language Processing > OpenAI Whisper and configure the feature
… Description field has content
Changelog Entry
Credits
Props @dkotter
Checklist: