Add variance/std when displaying evaluations with repetitions #768

Open

Bradley-Butcher opened this issue Jun 6, 2024 · 1 comment

@Bradley-Butcher
Feature request

Currently, only the mean of each metric is displayed in the UI when repetitions > 1. It would be nice to also show the variance/standard deviation.

Motivation

Many LLM APIs (including OpenAI's) struggle to return deterministic completions. It would be nice to see the stability of a metric at a glance, without inspecting the individual repetitions.
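For illustration, a minimal sketch of the statistic being requested (the scores list here is hypothetical, not taken from any real evaluation): given the per-repetition scores of one metric on one example, report the sample standard deviation alongside the mean.

```python
import statistics

# Hypothetical per-repetition scores for a single example and metric;
# in practice these would come from the evaluation results.
scores = [0.82, 0.79, 0.85, 0.80, 0.84]

mean = statistics.mean(scores)
# Sample standard deviation (n - 1 denominator); undefined for a single repetition.
std = statistics.stdev(scores) if len(scores) > 1 else 0.0

print(f"mean={mean:.3f}, std={std:.3f}")  # mean=0.820, std=0.025
```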

@hinthornw
Collaborator

Yeah, I've been wanting this myself. We'll put it on the roadmap; as of now I can't promise a timeline.
