I started adding the dataset card for OSCAR!
For now it's just basic info for all the different configurations in the Dataset Structure section. In particular, the Data Splits section tells how many samples there are for each config. The Data Instances section shows an example for each config, and it also shows the size in MB. Since the Data Instances section is very long, the user has to click to expand the info. I was able to generate it thanks to the tools made by @madlag and @yjernite :D
Cc @pjox, could you help me with the other sections? (Dataset Description, Dataset Creation, Considerations for Using the Data, Additional Information)