Make inspect.get_dataset_config_names always return a non-empty list of configs #3135

Closed
severo opened this issue Oct 22, 2021 · 2 comments · Fixed by #3159
Labels
dataset-viewer Related to the dataset viewer on huggingface.co enhancement New feature or request

Comments

@severo
Collaborator

severo commented Oct 22, 2021

Is your feature request related to a problem? Please describe.

Currently, some datasets have a configuration, while others don't. It would be simpler for the user to always have configuration names to refer to.

Describe the solution you'd like

In that sense inspect.get_dataset_config_names should always return at least one configuration name, be it default or Check___region_1 (for community datasets like Check/region_1).

def get_dataset_config_names(
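To make the request concrete, here is a minimal sketch of the fallback behavior described above. It does not reproduce the library's code: `ensure_config_names`, its parameters, and the `DEFAULT_CONFIG_NAME` constant are hypothetical names introduced for illustration, and the rule of replacing `/` with `___` is only inferred from the `Check/region_1` example.

```python
from typing import List, Optional

# Assumption: "default" is the fallback name, as suggested in the issue text.
DEFAULT_CONFIG_NAME = "default"


def ensure_config_names(
    declared_names: Optional[List[str]],
    dataset_path: str,
    use_path_as_fallback: bool = False,
) -> List[str]:
    """Return a non-empty list of config names (hypothetical helper).

    declared_names: names the dataset script declares, possibly empty or None.
    dataset_path: dataset identifier such as "glue" or "Check/region_1".
    use_path_as_fallback: if True, derive the fallback name from the path.
    """
    if declared_names:
        # The dataset has named configs: return them unchanged.
        return list(declared_names)
    if use_path_as_fallback:
        # Assumed naming rule: "Check/region_1" -> "Check___region_1".
        return [dataset_path.replace("/", "___")]
    return [DEFAULT_CONFIG_NAME]
```

With this shape, a caller can always index into the result: `ensure_config_names([], "acronym_identification")[0]` yields a usable name instead of failing on an empty list.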

@severo severo added the enhancement New feature or request label Oct 22, 2021
@severo severo added the dataset-viewer Related to the dataset viewer on huggingface.co label Oct 22, 2021
@severo severo changed the title Make inspect.get_dataset_config_names always return a nen-empty list of configs Make inspect.get_dataset_config_names always return a non-empty list of configs Oct 22, 2021
@albertvillanova albertvillanova self-assigned this Oct 25, 2021
@albertvillanova
Member

albertvillanova commented Oct 25, 2021

Hi @severo, I guess this issue asks not only for access to the configuration name (via inspect.get_dataset_config_names) but also to the configuration itself (that is, you use the name to get the configuration afterwards, maybe via builder_cls.builder_configs). Is that right?

@severo
Collaborator Author

severo commented Oct 25, 2021

Yes, maybe the issue could be reformulated. As a user, I want to avoid having to manage special cases:

  • I want to be able to get the names of a dataset's configs and use them in the rest of the API (get the data, get the split names, etc.).
  • I don't want to have to manage datasets with named configs (glue) differently from datasets without named configs (acronym_identification, Check/region_1).
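The two points above can be sketched as client code that never branches on whether a dataset has named configs. This is an illustration under stated assumptions: `list_splits_per_config` and the two `fake_*` stubs are hypothetical, and the stub data is made up so the example runs without network access.

```python
from typing import Callable, Dict, List


def list_splits_per_config(
    dataset: str,
    get_config_names: Callable[[str], List[str]],
    get_split_names: Callable[[str, str], List[str]],
) -> Dict[str, List[str]]:
    """Map every config name of a dataset to its split names.

    Relies on get_config_names never returning an empty list, so no
    special case is needed for datasets without named configs.
    """
    return {
        config: get_split_names(dataset, config)
        for config in get_config_names(dataset)
    }


# Hypothetical stand-ins for the real API, with invented data.
_FAKE_CONFIGS = {
    "glue": ["cola", "sst2"],
    "acronym_identification": ["default"],
}


def fake_get_config_names(dataset: str) -> List[str]:
    return _FAKE_CONFIGS[dataset]


def fake_get_split_names(dataset: str, config: str) -> List[str]:
    return ["train", "validation"]
```

In real use, the two callables could be `datasets.get_dataset_config_names` and `datasets.get_dataset_split_names`, provided the former returns at least one name for every dataset, which is what this issue requests.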
