Moved to extract_answer from #148 and back to gpt-4o-mini #161

Merged: 3 commits into main on Dec 18, 2024

Conversation

jamesbraza
Collaborator

I incorporated #148's evaluation scheme into #157. @whitead's prompting scheme was much better, so now we can move back to gpt-4o-mini for evaluation.

@jamesbraza jamesbraza added the enhancement New feature or request label Dec 18, 2024
@jamesbraza jamesbraza requested review from sidnarayanan and a team December 18, 2024 21:50
@jamesbraza jamesbraza self-assigned this Dec 18, 2024
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Dec 18, 2024
eval_prompt = self.EVALUATION_PROMPT_TEMPLATE.format(
    qa_prompt=self.question_prompt, qa_answer=answer
)
raw_evaluation = await prompt_runner(eval_prompt)
Collaborator Author

Also, thanks to this change, we no longer need raw_evaluation.
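
For readers without the full diff, the hunk under review formats a prompt template with the question and the proposed answer, then awaits a prompt runner. Below is a minimal sketch of that pattern, not the PR's actual code. The template text, the `extract_letter` helper, and `fake_prompt_runner` are all hypothetical stand-ins; only the names `EVALUATION_PROMPT_TEMPLATE`, `prompt_runner`, `qa_prompt`, and `qa_answer` come from the hunk.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical template text: the real EVALUATION_PROMPT_TEMPLATE is not shown in this thread.
EVALUATION_PROMPT_TEMPLATE = (
    "Given the question and the proposed answer below, "
    "reply with only the letter of the chosen option.\n\n"
    "Question: {qa_prompt}\n\n"
    "Proposed answer: {qa_answer}\n\n"
    "Letter:"
)


async def extract_letter(
    question_prompt: str,
    answer: str,
    prompt_runner: Callable[[str], Awaitable[str]],
) -> str:
    """Format the evaluation prompt, await the runner, and return its stripped reply."""
    eval_prompt = EVALUATION_PROMPT_TEMPLATE.format(
        qa_prompt=question_prompt, qa_answer=answer
    )
    return (await prompt_runner(eval_prompt)).strip()


async def fake_prompt_runner(prompt: str) -> str:
    """Stand-in for an LLM call (assumption); always replies with option B."""
    return " B\n"


def run_example() -> str:
    """Run the sketch end to end with the fake runner."""
    question = "What is 2 + 2?\nA) 3\nB) 4"
    return asyncio.run(extract_letter(question, "I think it is 4.", fake_prompt_runner))
```

In this sketch the runner is injected as a plain async callable, so the grading logic can be exercised without a model call, for example by calling `run_example()`.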

Contributor

@whitead whitead left a comment

looks great

@jamesbraza jamesbraza requested a review from a team December 18, 2024 21:57
@jamesbraza jamesbraza merged commit afa6961 into main Dec 18, 2024
6 checks passed
@jamesbraza jamesbraza deleted the better-mc-grading branch December 18, 2024 21:58
Labels
enhancement (New feature or request)
size:XXL (This PR changes 1000+ lines, ignoring generated files.)
2 participants