Solution of Kaggle competition: LMSYS - Chatbot Arena Human Preference Predictions

tascj/kaggle-lmsys-chatbot-arena

LMSYS - Chatbot Arena Human Preference Predictions

Competition

Requirements

Hardware

A100 SXM 80G x4

Software

Base Image

nvcr.io/nvidia/pytorch:24.04-py3

Packages

detectron2==0.6
transformers==4.43.3
datasets==2.19.0
flash-attn==2.6.2
optimi==0.2.1
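
The pins above can be compared against the current environment before training. The following is a minimal sketch, not part of the repository: the helper name check_versions is hypothetical, and the package names are copied verbatim from the list, so they may not match the distribution names pip uses (optimi, for instance, may be published under a different name). The exact string comparison may also be too strict for source builds that report a local version suffix.

```python
from importlib import metadata

# Pins copied from the package list above. The keys are assumed to be
# the installed distribution names, which may not hold for every entry.
PINNED = {
    "detectron2": "0.6",
    "transformers": "4.43.3",
    "datasets": "2.19.0",
    "flash-attn": "2.6.2",
    "optimi": "0.2.1",
}


def check_versions(pinned, lookup=metadata.version):
    """Return {name: (expected, found)} for each mismatch.

    found is None when the distribution is not installed.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version.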

Training

The directory structure should be as follows:

├── data
│   ├── train.csv
│   └── test.csv
├── artifacts
│   ├── dtrainval.csv
│   ├── lmsys-33k-deduplicated.csv
│   ├── ...
│   ├── stage1
│   ├── ...
│   └── stage3
└── src  # this repo
    ├── configs
    ├── human_pref
    └── main.py

  1. Run python scripts/prepare_dataset.py and download the 21k external data from abdullahmeda.
  2. Train stage1.
  3. Make pseudo labels.
  4. Train stage2.
  5. Train stage3.
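
Before running the steps above, the directory layout can be checked with a short script. This is a sketch under assumptions, not a script from the repository: the helper name missing_paths is hypothetical, only the input paths named in the tree are checked, entries elided with "..." are left out, and the files and stage directories under artifacts are assumed to be produced along the way.

```python
from pathlib import Path

# Input paths taken from the directory tree above, relative to the
# project root. Elided ("...") entries and everything under artifacts
# are intentionally not listed (assumed to be created during the steps).
EXPECTED = [
    "data/train.csv",
    "data/test.csv",
    "src/configs",
    "src/human_pref",
    "src/main.py",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    print("layout OK" if not missing else f"missing: {missing}")
```

Run it from the directory that contains data, artifacts, and src.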

Inference

The following reference scripts convert checkpoints for inference:

python scripts/prepare_gemma2_for_submission.py
python scripts/prepare_llama3_for_submission.py

Kaggle Notebook
