[Feature] HF_HUB_ENABLE_HF_TRANSFER=0? Extremely slow Model downloads from Hugging Face #1375
Comments
Some more info about HF_HUB_ENABLE_HF_TRANSFER
We have automated the deployment on Xeon end-to-end: https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA#-automated-terraform-deployment-using-intel-optimized-cloud-modules-for-terraform. With HF_HUB_ENABLE_HF_TRANSFER=0 the deployment is "unusable" because the models take so long to download (~25 minutes). HF_HUB_ENABLE_HF_TRANSFER=1 reduced that time to seconds.
@lianhao Please check above. Helm charts default to |
Issue opea-project/GenAIInfra#744 created for tracking purposes.
I just noticed this in that doc:
=> Lack of proxy support would make this a no-go. Lack of resumable downloads is a pretty significant downside too.
Priority: Undecided
OS type: Ubuntu
Hardware type: Xeon-GNR
Running nodes: Single Node
Description
Is there a technical reason for HF_HUB_ENABLE_HF_TRANSFER=0 in the compose.yaml files?
I did not run into download issues early last year. But starting in December, Hugging Face model downloads (e.g. Intel/neural-chat-7b-v3-3) have been taking 24 minutes.
I updated the Xeon ChatQnA compose.yaml file to HF_HUB_ENABLE_HF_TRANSFER=1 and the download then took only 15 seconds.
If there are no concerns, I can submit a PR to update the compose.yaml files.
Per the docs (https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hfhubenablehftransfer):
HF_HUB_ENABLE_HF_TRANSFER: Set to True for faster uploads and downloads from the Hub
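For anyone who wants to reproduce the timing outside of compose, here is a rough Python sketch of the same toggle. It assumes the huggingface_hub and hf_transfer packages are installed and that the installed huggingface_hub version still honors this variable. The helper names enable_hf_transfer and timed_snapshot_download are made up for illustration and are not part of any library.

```python
import os
import time


def enable_hf_transfer(enabled: bool = True) -> None:
    """Set HF_HUB_ENABLE_HF_TRANSFER for this process (hypothetical helper)."""
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1" if enabled else "0"


def timed_snapshot_download(repo_id: str) -> float:
    """Download a model repo and return the elapsed wall-clock seconds."""
    # Imported lazily on purpose: huggingface_hub reads the environment
    # variable when it is first imported, so the variable has to be set
    # before this import runs. If huggingface_hub was already imported
    # elsewhere in the process, changing the variable has no effect.
    from huggingface_hub import snapshot_download

    start = time.perf_counter()
    snapshot_download(repo_id=repo_id)
    return time.perf_counter() - start
```

Calling enable_hf_transfer(True) and then timed_snapshot_download("Intel/neural-chat-7b-v3-3") in a fresh process, and repeating with enable_hf_transfer(False) in another fresh process with an empty cache, should give a comparison similar to the 24 minutes vs. 15 seconds above. Actual numbers will depend on network and cache state.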