RAG Evaluation Workflows

First create a raw knowledge base:

union run --remote union_rag/simple_rag.py get_documents \
   --include_union \
   --exclude_patterns '["/api/", "/_tags/"]'

Then synthesize a question-and-answer dataset:

union run --remote union_rag/synthesize_data.py data_synthesis_workflow \
   --n_questions 1 \
   --n_answers 5

Register the data annotation workflow:

union register union_rag/annotate_data.py

Run a single annotation session to test it out:

union run --remote union_rag/annotate_data.py create_annotation_set --random_seed 42 --n_samples 10

Annotator App

The annotator app needs a Union Serverless API key. Create a secrets.txt file to store it. This file is ignored by git and should look something like this:

UNIONAI_SERVERLESS_API_KEY=<UNIONAI_SERVERLESS_API_KEY>

Export the secrets to your environment:

export $(cat secrets.txt | xargs)
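The one-liner above relies on shell word splitting, so it can misbehave if a value contains spaces or if the file has comment lines. A minimal alternative sketch follows, assuming secrets.txt contains only KEY=VALUE lines that are valid shell assignments (with the placeholder replaced by a real value, quoted if it contains special characters). The `load_secrets` helper name is hypothetical and not part of this repository:

```shell
# Hypothetical helper (not part of the repo): export every KEY=VALUE
# assignment from a file by sourcing it while auto-export is enabled.
load_secrets() {
    secrets_file="${1:-secrets.txt}"
    if [ ! -f "$secrets_file" ]; then
        echo "missing secrets file: $secrets_file" >&2
        return 1
    fi
    # A bare file name may be looked up on PATH by `.`, so anchor it to the
    # current directory unless it already contains a slash.
    case "$secrets_file" in
        */*) ;;
        *) secrets_file="./$secrets_file" ;;
    esac
    set -a              # export every variable assigned from here on
    . "$secrets_file"   # run the assignments in the current shell
    rc=$?
    set +a              # stop auto-exporting
    return "$rc"
}
```

Calling `load_secrets` with no argument from the directory that holds secrets.txt should leave the same variables exported as the one-liner does.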

Run the app:

streamlit run streamlit/annotation_app.py
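If the export step was skipped, the app would start without its credentials. A small guard can catch that before launch. This is a sketch that assumes the app reads UNIONAI_SERVERLESS_API_KEY from the environment; the `check_api_key` function name is hypothetical:

```shell
# Hypothetical guard (not part of the repo): fail early when the API key
# from secrets.txt has not been exported into the current shell.
check_api_key() {
    if [ -z "${UNIONAI_SERVERLESS_API_KEY:-}" ]; then
        echo "UNIONAI_SERVERLESS_API_KEY is not set; export it from secrets.txt first" >&2
        return 1
    fi
}
```

It can be chained in front of the launch command, for example `check_api_key && streamlit run streamlit/annotation_app.py`.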

Create the eval dataset:

union run --remote union_rag/eval_dataset.py create_eval_dataset --min_annotations_per_question 1

Evaluate a RAG experiment:

union run --remote union_rag/eval_rag.py evaluate_simple_rag --eval_configs config/eval_inputs_prompt.yaml

Experiment with different chunk sizes:

union run --remote union_rag/eval_rag.py evaluate_simple_rag --eval_configs config/eval_inputs_chunksize.yaml

Experiment with different splitters:

union run --remote union_rag/eval_rag.py evaluate_simple_rag --eval_configs config/eval_inputs_splitter.yaml
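The three evaluation commands differ only in the config file they pass, so they can be run in one loop. A minimal sketch, using only the commands and config paths listed in this README; the `run_all_evals` function name is hypothetical:

```shell
# Hypothetical wrapper (not part of the repo): launch evaluate_simple_rag
# once for each eval config named in this README.
run_all_evals() {
    for name in prompt chunksize splitter; do
        config="config/eval_inputs_${name}.yaml"
        echo "Evaluating with ${config}"
        # Stop at the first command that fails.
        union run --remote union_rag/eval_rag.py evaluate_simple_rag \
            --eval_configs "${config}" || return 1
    done
}
```

Call `run_all_evals` from the repository root once the eval dataset has been created.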