Skip to content

Latest commit

 

History

History
30 lines (17 loc) · 1.14 KB

File metadata and controls

30 lines (17 loc) · 1.14 KB

image

Curator is an open-source tool to curate large scale datasets for post-training LLMs.

Curator was used to curate Bespoke-Stratos-17k, a reasoning dataset to train a fully open reasoning model Bespoke-Stratos.

Curator supports:

  • Calling Deepseek API for scalable synthetic data curation
  • Easy structured data extraction
  • Caching and automatic recovery
  • Dataset visualization
  • Saving $$$ using batch mode

Call Deepseek API with Curator easily:

image

Get Started here