[Enhancement] Add LLM API key checks to LLM-based evaluators #1989

Conversation

Member

@aybruhm aybruhm commented Aug 14, 2024

Description

This PR enhances the backend by adding checks to ensure that an OpenAI API key is present before LLM-based evaluators run, with clear exception messages when the key is missing. Test coverage has also been expanded to cover scenarios where the OpenAI API key is required, including a new test case for auto_ai_critique.
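
As an illustration only (not the actual backend code), a minimal sketch of this kind of guard might look like the following; the exception class, registry name, and message wording are all hypothetical:

```python
from typing import Optional


class MissingLLMApiKeyError(Exception):
    """Raised when an LLM-based evaluator is invoked without the required provider key."""


# Hypothetical registry of evaluator keys that depend on an LLM provider key.
EVALUATORS_REQUIRING_LLM_KEYS = {"auto_ai_critique"}


def ensure_llm_api_key(evaluator_key: str, openai_api_key: Optional[str]) -> None:
    """Fail fast with a clear message if an LLM-based evaluator has no OpenAI API key."""
    if evaluator_key in EVALUATORS_REQUIRING_LLM_KEYS and not openai_api_key:
        raise MissingLLMApiKeyError(
            f"Evaluator '{evaluator_key}' requires an OpenAI API key. "
            "Set one in the LLM Keys view before running this evaluator."
        )
```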

Related Issue

Closes AGE-532 & AGE-569

Acceptance Tests

Test 1: OpenAI API Key is Required for LLM-Based Evaluators

  • Precondition: Ensure no OpenAI API key is set in the LLM Keys view on the frontend.
  • Action:
    1. Run an LLM-based evaluator (e.g., auto_ai_critique) without an OpenAI API key.
  • Expected Outcome:
    • The evaluator should raise a clear, informative exception indicating that the OpenAI API key is missing.

Test 2: LLM-based Evaluator Runs Successfully with a Valid API Key

  • Precondition: Ensure a valid OpenAI API key is set in the LLM Keys view on the frontend.
  • Action:
    1. Run an LLM-based evaluator (e.g., auto_ai_critique) with the valid OpenAI API key.
  • Expected Outcome:
    • The evaluator should run without errors.
    • The output should be consistent with expected results based on the inputs provided.

Test 3 (automated): Test for AI Critique (LLM as a Judge) API Key Checks

  • Precondition: Ensure the test suite includes scenarios where the OpenAI API key is required.
  • Action:
    1. This will be run automatically by GitHub.
  • Expected Outcome:
    • The test suite should pass, covering scenarios with and without an OpenAI API key.
    • The new test case for auto_ai_critique should be included and verified as part of the coverage (a sketch of such a test follows).
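
For illustration, an automated test along the lines of Test 3 could look like this (pytest, reusing the hypothetical ensure_llm_api_key helper sketched in the description; the module path and names are assumptions, not the repository's actual test code):

```python
import pytest

# Hypothetical import; assumes the ensure_llm_api_key sketch from the
# description is available to the test suite as a module.
from llm_key_checks import MissingLLMApiKeyError, ensure_llm_api_key


def test_auto_ai_critique_requires_openai_api_key():
    """Without an OpenAI API key, the evaluator check should raise a clear error."""
    with pytest.raises(MissingLLMApiKeyError):
        ensure_llm_api_key("auto_ai_critique", openai_api_key=None)


def test_auto_ai_critique_runs_with_openai_api_key():
    """With a (dummy) key present, the check should pass silently."""
    ensure_llm_api_key("auto_ai_critique", openai_api_key="sk-dummy-key")
```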

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Aug 14, 2024

vercel bot commented Aug 14, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name | Status | Preview | Comments | Updated (UTC)
agenta | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Aug 26, 2024 1:16pm
agenta-documentation | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Aug 26, 2024 1:16pm

@aybruhm aybruhm requested a review from jp-agenta August 14, 2024 07:07
@dosubot dosubot bot added Backend enhancement New feature or request labels Aug 14, 2024
Contributor

@jp-agenta jp-agenta left a comment

Thanks @aybruhm !
There is one thing missing though.
See 👇

success, response = await check_ai_critique_inputs(

We'd need to update that so that we immediately check for API keys for any LLM-based evaluator, not just AI critique. Does it make sense?
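
As a rough illustration of that suggestion (hypothetical names and dict keys; the real helper quoted above is check_ai_critique_inputs), a generalised check could look like:

```python
from typing import Optional, Tuple

# Hypothetical registry of LLM-based evaluator keys; in practice this would
# come from the centralised evaluator configuration.
LLM_BASED_EVALUATORS = {"auto_ai_critique"}


async def check_llm_evaluator_inputs(
    evaluator_key: str,
    lm_providers_keys: Optional[dict],
) -> Tuple[bool, Optional[str]]:
    """Return (success, error_message) for any LLM-based evaluator, not just AI critique."""
    if evaluator_key not in LLM_BASED_EVALUATORS:
        return True, None
    # Using "OPENAI_API_KEY" as the dict key is an assumption for this sketch.
    if not lm_providers_keys or not lm_providers_keys.get("OPENAI_API_KEY"):
        return False, f"Evaluator '{evaluator_key}' requires an OpenAI API key, but none was provided."
    return True, None
```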

- format llm provider keys
- and to ensure required llm keys exist in the provided evaluator configs
- properly format llm provider keys
- and check that the required llm keys exist
Contributor

@jp-agenta jp-agenta left a comment

Left one suggestion to centralise evaluator info.
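
A possible shape for that centralisation, purely as a sketch (evaluator keys and field names are illustrative), so the backend checks and the test fixtures read from a single definition:

```python
import pytest

# Hypothetical single source of truth for evaluator metadata.
EVALUATORS = {
    "auto_ai_critique": {"requires_llm_api_keys": True},
    "auto_exact_match": {"requires_llm_api_keys": False},
    "auto_regex_test": {"requires_llm_api_keys": False},
}


def evaluators_requiring_llm_keys() -> list[str]:
    """Evaluator keys that cannot run without an LLM provider key."""
    return [key for key, meta in EVALUATORS.items() if meta["requires_llm_api_keys"]]


# Illustrative pytest fixture built on the centralised registry.
@pytest.fixture
def llm_key_evaluators():
    return evaluators_requiring_llm_keys()
```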

- configurable setting to evaluators requiring llm api keys
- update fixture to make use of centralized evaluators
@dosubot dosubot bot removed the size:L This PR changes 100-499 lines, ignoring generated files. label Aug 20, 2024
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Aug 22, 2024
…i-to-playground' into feature/age-532-poc-1e-add-llm-api-key-checks-in-llm-based-evaluators
Contributor

@jp-agenta jp-agenta left a comment

This can't go to QA until we fix the issue whereby the LLM doesn't respond in numeric format anymore.
-- @mmabrouk @aybruhm

Member Author

aybruhm commented Aug 26, 2024

This can't go to QA until we fix the issue whereby the LLM doesn't respond in numeric format anymore. -- @mmabrouk @aybruhm

It works now, and I didn’t change anything. Safe to say the language model was probably hallucinating at the time, right?

[image attachment]

Member Author

aybruhm commented Aug 29, 2024

@aybruhm aybruhm merged commit 9212c7b into feature/age-491-poc-1e-expose-running-evaluators-via-api-to-playground Aug 29, 2024
7 checks passed
@aybruhm aybruhm deleted the feature/age-532-poc-1e-add-llm-api-key-checks-in-llm-based-evaluators branch August 29, 2024 08:22
@zenUnicorn
Contributor

I have QA'd the PR and got the expected result ✅
Works fine!
