[Feature] HF_HUB_ENABLE_HF_TRANSFER=0? Extremely slow Model downloads from Hugging Face #1375

Open
lucasmelogithub opened this issue Jan 9, 2025 · 5 comments
Labels
feature New feature or request

Comments

@lucasmelogithub
Contributor

lucasmelogithub commented Jan 9, 2025

Priority

Undecided

OS type

Ubuntu

Hardware type

Xeon-GNR

Running nodes

Single Node

Description

Is there a technical reason for setting HF_HUB_ENABLE_HF_TRANSFER=0 in the compose.yaml files?
I did not run into download issues early last year, but since December, Hugging Face model downloads (e.g., Intel/neural-chat-7b-v3-3) have been taking 24 minutes.

After I updated the Xeon ChatQnA compose.yaml file to HF_HUB_ENABLE_HF_TRANSFER=1, the download took only 15 seconds.

If there are no concerns, I can submit a PR to update the compose.yaml files.

Per docs https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hfhubenablehftransfer

HF_HUB_ENABLE_HF_TRANSFER
Set to True for faster uploads and downloads from the Hub
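
For scripts that download models directly rather than through compose.yaml, a minimal Python sketch of opting in might look like the following. It assumes the `hf_transfer` package is installed alongside `huggingface_hub`. The helper names `hf_transfer_requested` and `download_model` are hypothetical, and the set of accepted truthy strings is an assumption for illustration, not a copy of the library's own parser. The variable is set before the import because `huggingface_hub` appears to read it when the module is first loaded.

```python
import os

# Assumed truthy spellings, for illustration only.
_TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def hf_transfer_requested(env=None):
    """Return True if HF_HUB_ENABLE_HF_TRANSFER is set to a truthy value."""
    env = os.environ if env is None else env
    value = env.get("HF_HUB_ENABLE_HF_TRANSFER", "0")
    return value.strip().upper() in _TRUE_VALUES


def download_model(repo_id, enable_hf_transfer=True):
    """Download a full model snapshot and return its local path.

    The environment variable must be set before huggingface_hub is first
    imported in this process; setting it afterwards may have no effect.
    """
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1" if enable_hf_transfer else "0"
    # Imported lazily so the setting above is visible at import time.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling `download_model("Intel/neural-chat-7b-v3-3")` would fetch the model mentioned in this issue. Passing `enable_hf_transfer=False` keeps the standard download path.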

lucasmelogithub added the feature label on Jan 9, 2025
@xiguiw
Collaborator

xiguiw commented Jan 13, 2025

Some more info about HF_HUB_ENABLE_HF_TRANSFER
hf_transfer is an experimental feature, so it may not be enabled by default in all Hugging Face libraries. Make sure your huggingface_hub library is up to date to use this feature.

HF_HUB_ENABLE_HF_TRANSFER=1
What it does: Enables the hf_transfer library, which is a custom, high-performance file transfer mechanism developed by Hugging Face.

Purpose: It is designed to speed up file transfers with the Hugging Face Hub, especially for large files on high-bandwidth connections.

How it works: hf_transfer uses optimizations like parallel downloads and better connection handling to improve download speeds.

When to use: Set this to 1 if you want faster downloads and are working with large datasets or models from the Hugging Face Hub.
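
The "parallel downloads" idea can be illustrated with a short sketch. This is not hf_transfer's actual implementation (which is written in Rust); it only shows the general technique of splitting a file into byte ranges and fetching them concurrently. `fetch_range` is a hypothetical callable standing in for something like an HTTP Range request, and the chunk size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte ranges covering total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def fetch_parallel(fetch_range, total_size, chunk_size, max_workers=4):
    """Fetch every range concurrently and reassemble the parts in order.

    fetch_range(start, end) is a hypothetical callable that returns the
    bytes in the inclusive range [start, end].
    """
    ranges = split_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so the join is correct
        # even when chunks finish out of order.
        parts = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(parts)
```

With an in-memory stand-in such as `fetch_parallel(lambda s, e: data[s:e + 1], len(data), 100)`, the result equals `data`, which is a quick way to check the range arithmetic.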

HF_HUB_ENABLE_HF_TRANSFER=0
What it does: Disables the hf_transfer library and falls back to the default file transfer mechanism (usually Python's requests library or similar).

Purpose: This is the default behavior if hf_transfer is not explicitly enabled.

How it works: Downloads files using standard HTTP requests, which may be slower for large files or in suboptimal network conditions.

When to use: Set this to 0 if you encounter issues with hf_transfer or prefer to use the standard download mechanism.

@lucasmelogithub
Contributor Author

We have automated the deployment on Xeon end-to-end. https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA#-automated-terraform-deployment-using-intel-optimized-cloud-modules-for-terraform

HF_HUB_ENABLE_HF_TRANSFER=0 is "unusable" because of how long the model downloads take (~25 minutes).

HF_HUB_ENABLE_HF_TRANSFER=1 reduced that time to seconds.
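
To reproduce a comparison like this on another machine, a small timing harness can help. The sketch below is an assumption about how one might measure it, not the method used for the numbers in this thread. `time_call` and `speedup` are hypothetical helper names, and the download function to be timed is supplied by the caller.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def speedup(slow_seconds, fast_seconds):
    """Return how many times faster the fast run was than the slow run."""
    if fast_seconds <= 0:
        raise ValueError("fast_seconds must be positive")
    return slow_seconds / fast_seconds


# Timings reported earlier in this issue: 24 minutes versus 15 seconds.
reported = speedup(24 * 60, 15)  # 96.0
```

Each setting should be timed in a fresh process with an empty cache, since a cached model would make the second run look faster regardless of the transfer mechanism.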

@eero-t
Contributor

eero-t commented Jan 20, 2025

@lianhao Please check above. Helm charts default to HF_HUB_ENABLE_HF_TRANSFER=0, but now that they've been switched to using a specific version of hf-downloader, maybe it would make sense to change that value?

@lianhao
Collaborator

lianhao commented Jan 22, 2025

> @lianhao Please check above. Helm charts default to HF_HUB_ENABLE_HF_TRANSFER=0, but now that they've been switched to using a specific version of hf-downloader, maybe it would make sense to change that value?

Issue opea-project/GenAIInfra#744 created for tracking purposes.

@eero-t
Contributor

eero-t commented Jan 22, 2025

Per docs https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hfhubenablehftransfer

I just noticed this in that doc:

hf_transfer lacks several user-friendly features such as resumable downloads and proxies

=> Lack of proxy support would make this a no-go. The lack of resumable downloads is a pretty significant downside too.
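
One way to reconcile the speed benefit with the proxy limitation would be to choose the value at deploy time instead of hard-coding it. The sketch below is only an illustration of that idea: `choose_hf_transfer_value` is a hypothetical helper, the list of proxy variables is an assumption about what a deployment might set, and it does not address the lack of resumable downloads.

```python
import importlib.util
import os

# Proxy variables checked here are an assumption; adjust for your environment.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def proxy_configured(env):
    """Return True if any proxy variable (upper or lower case) is non-empty."""
    return any(env.get(name) or env.get(name.lower()) for name in PROXY_VARS)


def choose_hf_transfer_value(env=None, hf_transfer_installed=None):
    """Return "1" only when no proxy is set and hf_transfer is importable."""
    env = os.environ if env is None else env
    if hf_transfer_installed is None:
        # find_spec returns None when the top-level package is not installed.
        hf_transfer_installed = importlib.util.find_spec("hf_transfer") is not None
    if proxy_configured(env) or not hf_transfer_installed:
        return "0"
    return "1"
```

The returned string could then be exported as HF_HUB_ENABLE_HF_TRANSFER before the containers start, so that proxied environments keep the standard download path.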
