Commit

Fix HfFileSystem.exists() for deleted repos and update documentation (#2643)

* Fix fs exists + add and fix documentation

* Update documentation

* Add test and fix documentation
hanouticelina authored Nov 4, 2024
1 parent 8a99deb commit 1da4018
Showing 4 changed files with 329 additions and 61 deletions.
8 changes: 8 additions & 0 deletions docs/source/en/guides/hf_file_system.md
@@ -6,6 +6,14 @@ rendered properly in your Markdown viewer.

 In addition to the [`HfApi`], the `huggingface_hub` library provides [`HfFileSystem`], a pythonic [fsspec-compatible](https://filesystem-spec.readthedocs.io/en/latest/) file interface to the Hugging Face Hub. The [`HfFileSystem`] builds on top of the [`HfApi`] and offers typical filesystem style operations like `cp`, `mv`, `ls`, `du`, `glob`, `get_file`, and `put_file`.

+<Tip warning={true}>
+
+[`HfFileSystem`] provides fsspec compatibility, which is useful for libraries that require it (e.g., reading
+Hugging Face datasets directly with `pandas`). However, it introduces additional overhead due to this compatibility
+layer. For better performance and reliability, it's recommended to use [`HfApi`] methods when possible.
+
+</Tip>
+
 ## Usage

 ```python
56 changes: 30 additions & 26 deletions src/huggingface_hub/hf_api.py
@@ -1549,6 +1549,36 @@ def _inner(self, *args, **kwargs):


 class HfApi:
+    """
+    Client to interact with the Hugging Face Hub via HTTP.
+    The client is initialized with some high-level settings used in all requests
+    made to the Hub (HF endpoint, authentication, user agents...). Using the `HfApi`
+    client is preferred but not mandatory as all of its public methods are exposed
+    directly at the root of `huggingface_hub`.
+    Args:
+        endpoint (`str`, *optional*):
+            Endpoint of the Hub. Defaults to <https://huggingface.co>.
+        token (Union[bool, str, None], optional):
+            A valid user access token (string). Defaults to the locally saved
+            token, which is the recommended method for authentication (see
+            https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
+            To disable authentication, pass `False`.
+        library_name (`str`, *optional*):
+            The name of the library that is making the HTTP request. Will be added to
+            the user-agent header. Example: `"transformers"`.
+        library_version (`str`, *optional*):
+            The version of the library that is making the HTTP request. Will be added
+            to the user-agent header. Example: `"4.24.0"`.
+        user_agent (`str`, `dict`, *optional*):
+            The user agent info in the form of a dictionary or a single string. It will
+            be completed with information about the installed packages.
+        headers (`dict`, *optional*):
+            Additional headers to be sent with each request. Example: `{"X-My-Header": "value"}`.
+            Headers passed here are taking precedence over the default headers.
+    """
+
     def __init__(
         self,
         endpoint: Optional[str] = None,
@@ -1558,32 +1588,6 @@ def __init__(
         user_agent: Union[Dict, str, None] = None,
         headers: Optional[Dict[str, str]] = None,
     ) -> None:
-        """Create a HF client to interact with the Hub via HTTP.
-        The client is initialized with some high-level settings used in all requests
-        made to the Hub (HF endpoint, authentication, user agents...). Using the `HfApi`
-        client is preferred but not mandatory as all of its public methods are exposed
-        directly at the root of `huggingface_hub`.
-        Args:
-            token (Union[bool, str, None], optional):
-                A valid user access token (string). Defaults to the locally saved
-                token, which is the recommended method for authentication (see
-                https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
-                To disable authentication, pass `False`.
-            library_name (`str`, *optional*):
-                The name of the library that is making the HTTP request. Will be added to
-                the user-agent header. Example: `"transformers"`.
-            library_version (`str`, *optional*):
-                The version of the library that is making the HTTP request. Will be added
-                to the user-agent header. Example: `"4.24.0"`.
-            user_agent (`str`, `dict`, *optional*):
-                The user agent info in the form of a dictionary or a single string. It will
-                be completed with information about the installed packages.
-            headers (`dict`, *optional*):
-                Additional headers to be sent with each request. Example: `{"X-My-Header": "value"}`.
-                Headers passed here are taking precedence over the default headers.
-        """
         self.endpoint = endpoint if endpoint is not None else constants.ENDPOINT
         self.token = token
         self.library_name = library_name
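
Reviewer note: the docstring moved in this hunk documents two defaults, the `endpoint` fallback and the precedence of user-supplied `headers`. The sketch below illustrates both rules in isolation. It is not the library's implementation: `DEFAULT_ENDPOINT`, `resolve_endpoint`, and `merge_headers` are hypothetical names, and the plain dict merge is an assumption about how "taking precedence" could be realized.

```python
from typing import Dict, Optional

# Hypothetical stand-in for `constants.ENDPOINT`; the value is the default
# stated in the docstring.
DEFAULT_ENDPOINT = "https://huggingface.co"


def resolve_endpoint(endpoint: Optional[str] = None) -> str:
    """Same expression as in `__init__`: fall back only when `endpoint` is None."""
    return endpoint if endpoint is not None else DEFAULT_ENDPOINT


def merge_headers(
    default_headers: Dict[str, str],
    user_headers: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Assumed precedence rule: user-supplied headers override defaults."""
    merged = dict(default_headers)  # copy so the defaults are not mutated
    merged.update(user_headers or {})
    return merged


defaults = {"user-agent": "example/0.0", "X-My-Header": "default"}
merged = merge_headers(defaults, {"X-My-Header": "value"})
```

Because the check is `is not None` rather than truthiness, an explicitly passed empty string is kept as the endpoint and does not trigger the fallback.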