Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic excerpt generation using OpenAI's ChatGPT API #405

Merged
merged 40 commits into from
Mar 21, 2023

Conversation

dkotter
Copy link
Collaborator

@dkotter dkotter commented Mar 8, 2023

Description of the Change

This PR adds an integration with OpenAI as a new provider in the Language Processing service, specifically integrating with ChatGPT. The specific integration being added here is utilizing ChatGPT to provide a summary of a piece of content and then storing that summary in the excerpt field.

Setup

Setup with this provider only requires an API key. There's validation done on the settings page, anytime settings are saved, to verify if the API key is valid. This can give you one of three errors:

No API Key Invalid API Key Rate limit reached
No API key entered Invalid API key entered Rate limit reached

If the API key is valid, you won't get an error message and you'll get a success message instead. You then have a few settings to change. The most important is turning on the Generate excerpt option. If a valid API key is added but this setting isn't on, no integration will happen. The Allowed roles setting allows you to control which user roles see the Generate excerpt button (and are allowed to access the REST API endpoint). The Excerpt length setting controls how many words the final excerpt will be. This defaults to the excerpt length that WordPress has set (that can be changed by the core excerpt_length filter). The Temperature value is one of the config options the API supports. There are other options there but I decided not to bring those over for now.

ChatGPT settings

Excerpt integration

I debated a few different approaches on how to actually integrate with the excerpt generation. My priorities were the following:

  1. Limit how often we hit the API since OpenAI charges by usage
  2. Only try generating an excerpt once the content is (mostly) finalized. No sense in generating an excerpt on an in-progress post that doesn't have much content
  3. Have the ability to see what the generated excerpt will be before publishing
  4. Have the ability to add your own excerpt, modify the generated excerpt or remove the excerpt all together without those changes being overwritten

I initially was thinking of automatically generating an excerpt on save (draft, publish, ...) but this goes against point 1 and 2. I then considered automatically generating only on publish but that goes against 3 (and possibly 4). I eventually landed on the approach of adding a Generate excerpt button in the Excerpt panel that will send content to OpenAI when clicked and populate the excerpt with whatever value is returned.

This solves all the points above, as you are able to choose how often the API is hit and when the excerpt is generated (only when the button is pressed). And if you don't want to generate the excerpt, you don't have to. It does make the process more manual, as you have to click the button but I think that's a fine trade-off. I am open to other ideas on the best integration here though (I had considered adding something to the pre-publish panel but I don't think that's enough by itself. May be worth adding in addition to what else we have here).

| Excerpt panel | Excerpt loading in |

API integration

When the Generate excerpt button is clicked, a request is made to a new REST endpoint (wp-json/classifai/v1/generate-excerpt/POST_ID). This endpoint verifies the current user has permission to edit the post, we are properly authenticated with OpenAI and the Generate excerpt setting is on.

Assuming all that passes, a new Tokenizer class has been added that will try and determine how many tokens the content has and how many tokens the final excerpt will be. The ChatGPT API has a limit of 4096 tokens per request and this includes both the data you send and the data that is sent back. Unfortunately tokens are equivalent to words or characters (roughly 4 characters is 1 token) but we do some basic calculations, erring on the side of being too aggressive, to ensure our request doesn't go over the limit.

A new APIRequest class has been added here as well (followed the approach in the Watson APIRequest class) to make it easier to integrate with the API, not only for this feature but for any other OpenAI features we may add in the future (the Tokenizer class should also be reusable for future integrations).

The request is sent and then the response is parsed and returned, whether we get a successful response including our excerpt or we get an error. If it's an error, that will be shown to the user. If success, we set the returned value as our excerpt.

Reviewer notes

  • I couldn't find a way to add a custom button to the core Excerpt panel so I removed that panel all together and replaced it with our own, copying most of the code from Gutenberg and adding in our custom handling
  • There's no WP-CLI integration in this PR. I think it makes sense to add that in a followup PR
  • I added new details about OpenAI into the readme files and also updated those in a few places to make it more clear that we have multiple Language Processing providers now
  • I added a new image to the readme as well and noticed the existing images were not optimized, so I ran those through an optimization step as well as reduced the dimensions on one image. I can revert that last change if we want that image to stay super large but currently displays weird in the readme
  • get_plugin_settings was updated to account for multiple providers instead of just always using the first provider. There are other places in the code that should be updated to account for this but I'm planning to tackle that in a followup PR

How to test the Change

A valid OpenAI API key is needed to fully test this feature. OpenAI does offer a free $5 credit for new users so if you haven't signed up before, you can sign up and get an API key (ping me in Slack if you want to use my API key for testing).

  1. Log in to your OpenAI account and go to your API key section. Generate a new API key there and copy it
  2. Go to ClassifAI > Language Processing > OpenAI and paste in your API key
  3. Turn on the Generate excerpt setting. The other settings can be left default. Save changes and ensure no error message is shown
  4. Create a new post, ensuring it has at least a few paragraphs of content
  5. Open the Excerpt panel, ensure you can see the Generate excerpt button then click on that
  6. Ensure an excerpt gets populated and no errors are shown
  7. Can run through these same tests with no API key entered, an invalid API key entered and/or the Generate excerpt option is off, ensuring proper error messages are shown and functionality is removed

Changelog Entry

Added - Automatic excerpt generation using OpenAI's ChatGPT API

Credits

Props @dkotter, @jeffpaul, @zamanq

Checklist:

  • I agree to follow this project's Code of Conduct.
  • I have updated the documentation accordingly.
  • I have added tests to cover my change.
  • All new and existing tests pass.

@dkotter dkotter self-assigned this Mar 8, 2023
@dkotter dkotter requested review from jeffpaul and a team as code owners March 8, 2023 22:43
@dkotter dkotter linked an issue Mar 8, 2023 that may be closed by this pull request
4 tasks
@dkotter
Copy link
Collaborator Author

dkotter commented Mar 8, 2023

Note that E2E tests are failing here but they seem to have been failing for a bit (all recent PRs are failing as well). I'm going to look to see what needs fixed on those and tackle that separately from this PR

@jeffpaul jeffpaul added this to the 1.9.0 milestone Mar 9, 2023
@jeffpaul
Copy link
Member

jeffpaul commented Mar 9, 2023

@fabiankaegy per Darin's comment of:

I couldn't find a way to add a custom button the core Excerpt panel so I removed that panel all together and replaced it with our own, copying most of the code from Gutenberg and adding in our custom handling

...are you aware of a way to add a button into that panel or is the approach here the best given the current state of the editor?

…ate to determine when this shows. This allows us to only show the panel if an excerpt was added prior to the panel showing.
…s. Don't load our custom JS if the current user role doesn't match
@jeffpaul
Copy link
Member

@iamdharmesh tagging you for code review here as this week you've got some OSS time, hoping to get this ready for release as expeditiously as we can (hoping to get 1-2 features released before Summit as feasible)

iamdharmesh
iamdharmesh previously approved these changes Mar 15, 2023
Copy link
Member

@iamdharmesh iamdharmesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this @dkotter. This looks amazing. 🎉

I just added 2-3 minor notes to discuss but otherwise, all looks great.

includes/Classifai/Helpers.php Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto-populate missing meta tags and descriptions
6 participants