Add WP-CLI command to bulk process audio transcriptions #514

Merged: 7 commits merged on Jul 12, 2023

Conversation

dkotter
Collaborator

@dkotter dkotter commented Jun 28, 2023

Description of the Change

In #451 we added the ability to generate a transcription from an audio file. This PR is a follow-up that adds a WP-CLI command to do the same thing. It is useful for processing a large number of audio files or for having more control over which files are processed.

The new command looks like:

wp classifai transcribe_audio

and has the following options:

  • A comma-delimited list of attachment IDs to process
  • A per_page argument that controls how many items are processed in each batch. Defaults to 100. For example, if you have 1000 items to process, all of them will be processed, but in batches of 100 for performance reasons
  • A force argument that defaults to false. When false, only items without data saved in post_content are processed, so items that already have a transcription saved are not re-processed
  • A dry-run argument that defaults to true. You must pass false to actually run the command
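
To make the per_page and force semantics above concrete, here is a minimal shell sketch. It is not the plugin's PHP implementation: the function names `batch_offsets` and `should_process` are hypothetical, and treating an empty string as "no transcription saved" is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical helpers that mirror the option semantics
# described above. This is not code from ClassifAI.

# Print the starting offset of each batch for <total> items,
# taken <per_page> at a time.
batch_offsets() {
  local total="$1"
  local per_page="${2:-100}"   # 100 is the documented default
  local offset=0
  # Guard against a non-positive batch size, which would loop forever.
  if [ "$per_page" -le 0 ]; then
    echo "per_page must be greater than 0" >&2
    return 1
  fi
  while [ "$offset" -lt "$total" ]; do
    echo "$offset"
    offset=$((offset + per_page))
  done
}

# Succeed (exit status 0) when an item should be processed:
# always when force is true, otherwise only when nothing is saved yet.
# Assumption: an empty <post_content> means no transcription is saved.
should_process() {
  local force="$1"
  local post_content="$2"
  [ "$force" = "true" ] || [ -z "$post_content" ]
}
```

With these helpers, `batch_offsets 1000 100` prints ten offsets (0 through 900), which matches the 1000-item example, and `should_process false "existing text"` fails, so that item would be skipped unless force is true.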

Here are some example commands that can be run:

wp classifai transcribe_audio 1,2,3 --dry-run=false
wp classifai transcribe_audio --dry-run=false
wp classifai transcribe_audio --per_page=5 --force=true --dry-run=false
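
Because dry-run defaults to true, one reasonable workflow is to preview a run first and only then repeat it with --dry-run=false. The sketch below builds both command lines from a list of attachment IDs. The helper name `transcribe_cmd` is hypothetical and not part of ClassifAI or WP-CLI; it only prints the command and uses nothing beyond the arguments shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ClassifAI or WP-CLI): prints the
# transcribe_audio command for a set of attachment IDs.
# Usage: transcribe_cmd <dry_run: true|false> [attachment_id ...]
transcribe_cmd() {
  local dry_run="$1"
  shift
  local ids=""
  local id
  for id in "$@"; do
    # Accept digits only, so a stray value never reaches the command line.
    case "$id" in
      ''|*[!0-9]*)
        echo "skipping invalid attachment ID: $id" >&2
        continue
        ;;
    esac
    # Join the IDs with commas, as the command expects.
    ids="${ids:+$ids,}$id"
  done
  local cmd="wp classifai transcribe_audio"
  if [ -n "$ids" ]; then
    cmd="$cmd $ids"
  fi
  # dry-run is on by default, so the flag is only added to turn it off.
  if [ "$dry_run" = "false" ]; then
    cmd="$cmd --dry-run=false"
  fi
  echo "$cmd"
}
```

For example, `transcribe_cmd true 1 2 3` prints the preview command, and `transcribe_cmd false 1 2 3` prints the first example command listed above, which can then be run once the preview output looks right.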

Closes #498

How to test the Change

Run several of the WP-CLI commands described above and confirm they all work as expected.

Changelog Entry

Added - Custom WP-CLI command that can be used to generate audio transcriptions in bulk

Credits

Props @dkotter

Checklist:

  • I agree to follow this project's Code of Conduct.
  • I have updated the documentation accordingly.
  • I have added tests to cover my change.
  • All new and existing tests pass.

…dio attachment items and process those. Add helper method that can be used to determine if an item should be processed to avoid duplicate code
@dkotter dkotter added this to the 2.2.3 milestone Jun 28, 2023
@dkotter dkotter requested review from a team and jeffpaul as code owners June 28, 2023 21:36
@dkotter dkotter self-assigned this Jun 28, 2023
@dkotter dkotter requested review from a team and Sidsector9 and removed request for a team and jeffpaul June 28, 2023 21:36
@Sidsector9
Member

@dkotter there are some conflicts in this PR.

@dkotter
Collaborator Author

dkotter commented Jul 5, 2023

@Sidsector9 Thanks for pointing that out. Those should be cleaned up now

@dkotter dkotter mentioned this pull request Jul 6, 2023
18 tasks
Member

@Sidsector9 Sidsector9 left a comment


The code looks good 👍

These are the points I noted:

1. Logs both error and success messages

[Screenshot: 2023-07-11 at 12:58:23 PM]

Steps to reproduce:
  1. Disable the "Generate transcripts from audio files" setting.
  2. Run wp classifai transcribe_audio <audio_id>.

2. Logging types can be changed

  • \WP_CLI::log( sprintf( '%d items had errors', $errors ) ); can use \WP_CLI::error() instead.
  • \WP_CLI::success( sprintf( '%d items would have had transcriptions added', $count ) ); can use \WP_CLI::log() instead.

3. Minor inconsistency between CLI and GUI

When the "Generate transcripts from audio files" setting is disabled, the Re-Transcribe button is not rendered, but we can still re-transcribe from the CLI. The setting description says "Automatically generate transcripts for supported audio files", whereas re-transcribing is a manual process. So IMO we should render that button at all times, whether this setting is enabled or disabled.

Adding to that, we should add a separate setting to enable/disable the feature, and use that setting's value to decide whether to render the button in the GUI.

Let me know your thoughts on this.

@dkotter
Collaborator Author

dkotter commented Jul 11, 2023

@Sidsector9 I've updated the log types you've mentioned.

> When the "Generate transcripts from audio files" setting is disabled, the Re-Transcribe button is not rendered, but we can still re-transcribe from the CLI. The setting description says "Automatically generate transcripts for supported audio files", whereas re-transcribing is a manual process. So IMO we should render that button at all times, whether this setting is enabled or disabled.
>
> Adding to that, we should add a separate setting to enable/disable the feature, and use that setting's value to decide whether to render the button in the GUI.

I think these are both good points, though I wouldn't tackle them as part of this PR. I'd suggest opening a new issue to track them so we can address them in a separate PR.

@dkotter dkotter requested a review from Sidsector9 July 11, 2023 14:48
@dkotter
Collaborator Author

dkotter commented Jul 12, 2023

@Sidsector9 Let me know if you have any thoughts on the above. Hoping to get this finished off so I can proceed with a 2.2.3 release. Thanks!

@dkotter dkotter merged commit 3d7cd23 into develop Jul 12, 2023
12 checks passed
@dkotter dkotter deleted the feature/498 branch July 12, 2023 16:59
Linked issue: Add WP-CLI command to bulk transcribe audio files