-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support hfh 0.10 implicit auth #5031
Conversation
The documentation is not available anymore as the PR was closed or merged. |
@lhoestq it is now released so you can move forward with it :) |
src/datasets/utils/_hf_hub_fixes.py
Outdated
""" | ||
Get info on one specific dataset on huggingface.co. | ||
Dataset can be private if you pass an acceptable token. | ||
Args: | ||
hf_api (`huggingface_hub.HfApi`): Hub client | ||
repo_id (`str`): | ||
A namespace (user or an organization) and a repo name separated | ||
by a `/`. | ||
revision (`str`, *optional*): | ||
The revision of the dataset repository from which to get the | ||
information. | ||
timeout (`float`, *optional*): | ||
Whether to set a timeout for the request to the Hub. | ||
use_auth_token (`bool` or `str`, *optional*): | ||
Whether to use the `auth_token` provided from the | ||
`huggingface_hub` cli. If not logged in, a valid `auth_token` | ||
can be passed in as a string. | ||
Returns: | ||
[`hf_api.DatasetInfo`]: The dataset repository information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is a copy-paste from the huggingface_hub
docstring ? I would mention it and refer to https://huggingface.co/docs/huggingface_hub/v0.10.0/en/package_reference/hf_api#huggingface_hub.HfApi.dataset_info in the description to that a dataset
user knows.
def get_datasets_user_agent(user_agent: Optional[Union[str, dict]] = None) -> str: | ||
ua = f"datasets/{__version__}; python/{config.PY_VERSION}" | ||
ua = f"datasets/{__version__}" | ||
ua += f"; python/{config.PY_VERSION}" | ||
ua += f"; huggingface_hub/{huggingface_hub.__version__}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would love to hear if you have ideas on how to make this easy to configure. We have the same logic in huggingface_hub
and transformers
as well but nothing very convenient.
(not a suggestion to change something here, more as a general question)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's only a few lines and pretty easy to modify it, I don't think we can make something significantly better than that
I took your comments into account @Wauplin :) cc @albertvillanova WDYT ? I edited the CI job to also check for our minimum supported version of hfh at the same time as the minimum pyarrow version |
@lhoestq great, thanks ! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tweaks. Just a suggestion and a comment below.
def skip_if_metric_requires_transformers(test_case): | ||
@wraps(test_case) | ||
def wrapper(self, metric_name): | ||
if not _has_transformers and metric_name in REQUIRE_TRANSFORMERS: | ||
self.skipTest('"test requires transformers"') | ||
else: | ||
test_case(self, metric_name) | ||
|
||
return wrapper | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have you added this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had an env without transformers and I had failing tests x) so I did the same as for other optional deps: excluding the tests when it's not installed
@@ -1112,14 +1110,11 @@ def dataset_module_factory( | |||
_raise_if_offline_mode_is_enabled() | |||
hf_api = HfApi(config.HF_ENDPOINT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove this line because hf_api
is no longer used in this method?
hf_api = HfApi(config.HF_ENDPOINT)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still used in hf_api_dataset_info(hf_api, ...) a few lines later ;)
* support hfh 0.10 implicit auth * update tests * Bump minimum hfh to 0.2.0 and test minimum version * style * fix test * fix tests * again * lucain's comment * fix ci
In huggingface-hub 0.10 the
token
parameter is deprecated for dataset_info and list_repo_files in favor of use_auth_token.Moreover if use_auth_token=None then the user's token is used implicitly.
I took those two changes into account
Close #4990
TODO:
We should wait hfh 0.10 to be relased first to make sure it works correctly before merging