NLU and NLG datasets developed within the Latvian Language Technology Initiative
- ALPACA-LV is a machine-translated Alpaca instruction dataset for Latvian.
- COPA is a machine-translated COPA benchmark dataset for Latvian.
- MMLU is a machine-translated MMLU benchmark dataset for Latvian. The sociology_postedited.json file contains a post-edited collection of the first 100 tasks in the sociology subject.
- Multiple-choice questions (MCQ) from Latvian Centralized High School Exams.