
chore: get prediction for eval dataset #414

Merged · 5 commits into evaluation from eval-implementation · Jul 15, 2024
Conversation

@Yuan325 Yuan325 (Collaborator) commented Jun 14, 2024

Add a function that gets a prediction for each query in golden_dataset. The predictions are compared against the golden responses to compute evaluation metrics.

Usage example:

```
from evaluation import run_llm_for_eval, goldens

# set up orchestration, session, set uuid
eval_list = await run_llm_for_eval(goldens, orchestration, session, session_id)
```
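A minimal sketch of what a function like this might look like. Only the call signature comes from the usage example above; the `EvalData` dataclass and the `orchestration.invoke` method are illustrative assumptions, not the PR's actual implementation:

```python
# Hedged sketch: signature taken from the usage example; internals are assumed.
from dataclasses import dataclass


@dataclass
class EvalData:
    """One evaluation record: a golden query and the model's prediction."""
    query: str
    prediction: str = ""


async def run_llm_for_eval(goldens, orchestration, session, session_id):
    """Ask the LLM each golden query and collect its predictions.

    `orchestration.invoke` is a hypothetical API that returns the model's
    answer text for a query within the given session.
    """
    eval_list = []
    for golden in goldens:
        answer = await orchestration.invoke(golden.query, session, session_id)
        eval_list.append(EvalData(query=golden.query, prediction=answer))
    return eval_list
```

The returned list pairs each golden query with its prediction, ready to be scored against the golden responses by a downstream metrics step.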

@Yuan325 Yuan325 requested a review from a team as a code owner June 14, 2024 23:44
@Yuan325 Yuan325 force-pushed the eval-implementation branch 3 times, most recently from 6c84c5d to c10073d, on June 15, 2024 00:03
llm_demo/evaluation/evaluation.py — 3 review threads (outdated, resolved)
@Yuan325 Yuan325 force-pushed the eval-implementation branch 3 times, most recently from 5f88f32 to 9a1e174, on June 24, 2024 22:14
@Yuan325 Yuan325 requested a review from kurtisvg June 24, 2024 22:15
Base automatically changed from eval-dataset to evaluation July 11, 2024 20:07
llm_demo/evaluation/eval_golden.py — review thread (outdated, resolved)
@Yuan325 Yuan325 merged commit cbbd98b into evaluation Jul 15, 2024
2 checks passed
@Yuan325 Yuan325 deleted the eval-implementation branch July 15, 2024 17:41
Yuan325 added a commit that referenced this pull request Jul 26, 2024
Yuan325 added a commit that referenced this pull request Aug 8, 2024
Yuan325 added a commit that referenced this pull request Aug 19, 2024
Yuan325 added a commit that referenced this pull request Aug 19, 2024