-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluators can access all columns #1606
Conversation
aakrem
commented
May 2, 2024
•
edited
Loading
edited
- Now evaluators have access to whatever correct answer column from the testset.
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t understand why you specify correct_answer_key as a special argument for all evaluators. This makes the assumption that the evaluator would need access to one column in the test set that has the correct_answer. However, it could be that the evaluator requires multiple columns (which do not even need to be semantically correct answers), or it might not require anything other than the output of the LLM app (as for instance the case of a toxicity evaluator).
Instead, what I propose is to have the correct_answer_key as part of the configuration of the evaluators (i.e. in settings_values). This would simplify the code (less arguments in the calls), and make the logic more general.
The second comment is about the UI, I think that per default, the evaluators we have should use the correct_answer column. We can have this option hidden inside a collapse.
Removed langchain and use openai directly
Fix evaluation PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @aakrem
I reviewed the migration. It makes sense to me from logical perspective. For the syntax itself (using .dict() and deleting __dict__
) I trust you have tested it.
This has been fixed. Now you configure in the evaluator which columns it has access to and it would be able to access them
It depends on the evaluator. Some of them, like JSON check, which checks whether the llm output is in json format does not require access to any ground truth obviously. |
Responded to review and clarified issues in #1606 (comment)