
Find Hugging Face datasets that are missing tags, then help fill them in one by one.


Hugging-Face-Supporter/datacards


Datacard

This repo aims to find and update the missing dataset cards for Hugging Face datasets.

If you find this a worthwhile pursuit, feel free to reach out and let's try to make the Hugging Face datasets complete 😉

Setup

# install poetry
git clone --recurse-submodules --remote-submodules git@github.com:Hugging-Face-Supporter/datacards.git
cd datacards
git submodule update

poetry install

Run

poetry shell
python datacards/main.py

WIP

  • Look into how to provide multiple answers in a dataset card (e.g. the GLUE dataset)
  • Find the datasets that are missing information by parsing the README
  • Find ways to know what categories are valid answers
  • Create method to filter for missing datasets
  • Incorporate argparse to filter for certain things
  • Toggle between datasets to annotate.
  • Save modified files to the README again
  • Once done, find ways to create automatic PRs to Hugging Face datasets
  • Incorporate the Hugging Face Hub API
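
As a rough illustration of the README-parsing and argparse items above, here is a minimal sketch, not the code in datacards/main.py. It assumes a dataset README starts with a YAML block between two --- lines and only checks which top-level keys are present. The names EXPECTED_TAGS, front_matter_keys, missing_tags and parse_args are hypothetical, and the tag list is an assumption about what a complete card should carry.

```python
import argparse
import re

# Assumption: illustrative tags only; the authoritative list would come
# from the Hugging Face dataset card documentation.
EXPECTED_TAGS = ("license", "language", "task_categories", "size_categories")

# YAML front matter: a block delimited by "---" lines at the very top.
FRONT_MATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
# A top-level key is an unindented "name:" at the start of a line.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:", re.MULTILINE)


def front_matter_keys(readme_text):
    """Return the top-level keys found in the README front matter."""
    match = FRONT_MATTER.match(readme_text)
    if match is None:
        return set()
    return set(TOP_LEVEL_KEY.findall(match.group(1)))


def missing_tags(readme_text, expected=EXPECTED_TAGS):
    """List the expected tags that are absent from the front matter."""
    present = front_matter_keys(readme_text)
    return [tag for tag in expected if tag not in present]


def parse_args(argv=None):
    """Hypothetical CLI: a README path plus an optional tag filter."""
    parser = argparse.ArgumentParser(description="Report missing dataset card tags")
    parser.add_argument("readme", help="path to a dataset README.md")
    parser.add_argument("--tag", action="append",
                        help="only check this tag (can be repeated)")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    with open(args.readme, encoding="utf-8") as handle:
        text = handle.read()
    for tag in missing_tags(text, args.tag or EXPECTED_TAGS):
        print(tag)
```

Calling main(["path/to/README.md", "--tag", "license"]) would print "license" only if that key is absent. A real implementation would likely use a proper YAML parser so that empty or malformed values can be detected too.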
