Use revision when downloading the quantization config file #2697

Pernekhan · 2024-02-01T01:56:55Z

Problem

When quantization is enabled, the quantization config file is not downloaded from the model revision passed in the arguments.
It is fetched from the latest main branch instead.
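
A minimal sketch of the failure mode and the fix, not the actual vLLM code. The `downloader` argument is a hypothetical stand-in for a Hub download function (its keyword names `repo_id`, `revision`, `cache_dir` and `allow_patterns` mirror `huggingface_hub.snapshot_download`), and `find_quant_config_dir` and `fake_downloader` are illustrative names. The point is only that the revision has to be forwarded; if it is left out, the downloader falls back to its default branch.

```python
from typing import Callable, Optional


def find_quant_config_dir(
    model: str,
    revision: Optional[str],
    download_dir: Optional[str],
    downloader: Callable[..., str],
) -> str:
    """Download only the JSON config files for `model` at `revision`.

    `downloader` is a hypothetical stand-in for a Hub snapshot function;
    it must accept repo_id, revision, cache_dir and allow_patterns.
    """
    return downloader(
        repo_id=model,
        revision=revision,  # forward the user's revision instead of dropping it
        cache_dir=download_dir,
        allow_patterns="*.json",
    )


def fake_downloader(repo_id, revision=None, cache_dir=None, allow_patterns=None):
    """Toy downloader for illustration: report which revision would be fetched."""
    resolved = revision if revision is not None else "main"
    return f"{cache_dir}/{repo_id}@{resolved}"
```

With the toy downloader, passing a revision yields a path tagged with that revision, while passing `None` yields one tagged with main, which is the behavior described above.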

Testing

Consider these two revisions for testing:

Has quantize_config.json file: https://huggingface.co/notstoic/pygmalion-13b-4bit-128g/tree/5f7a136b834d0cfdd6906e43dfe2182ce165e65c
Doesn't have quantize_config.json file: https://huggingface.co/notstoic/pygmalion-13b-4bit-128g/tree/main

When you pass --model notstoic/pygmalion-13b-4bit-128g --revision 5f7a136b834d0cfdd6906e43dfe2182ce165e65c --download-dir=/data --quantization gptq, it now works, because the config is fetched from the requested revision rather than from the latest main.
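
To check by hand which of the two revisions carries the config file, a small helper can build the direct file URL for each one. This is a sketch that assumes the Hub's public `resolve` URL layout (`/<repo>/resolve/<revision>/<filename>`); `quant_config_url` is an illustrative name, and no request is made.

```python
from urllib.parse import quote

HUB = "https://huggingface.co"


def quant_config_url(
    repo_id: str,
    revision: str = "main",
    filename: str = "quantize_config.json",
) -> str:
    """Build the direct URL of `filename` at `revision` (assumed URL layout)."""
    # Percent-encode the revision so branch names containing "/" stay one path segment.
    return f"{HUB}/{repo_id}/resolve/{quote(revision, safe='')}/{filename}"


repo = "notstoic/pygmalion-13b-4bit-128g"
pinned = quant_config_url(repo, "5f7a136b834d0cfdd6906e43dfe2182ce165e65c")
latest = quant_config_url(repo)  # defaults to main
```

Opening `pinned` and `latest` in a browser, or fetching them with any HTTP client, shows whether each revision has the file.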

zhuohan123

LGTM! Thanks for your contribution!

…ect#2697) Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>

Use revision when downloading the quantization config file

723b61a

Pernekhan force-pushed the quant-config-use-revision branch from 5b2e064 to 723b61a on February 1, 2024 01:59

Wrap the lines

0793c0e

zhuohan123 approved these changes Feb 1, 2024

zhuohan123 merged commit c410f5d into vllm-project:main Feb 1, 2024
17 checks passed

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

Use revision when downloading the quantization config file (vllm-proj…

d1d6e89

…ect#2697) Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>

alexm-redhat pushed a commit to neuralmagic/nm-vllm that referenced this pull request Feb 13, 2024

Use revision when downloading the quantization config file (vllm-proj…

3b1644e

…ect#2697) Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>

xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 20, 2024

Use revision when downloading the quantization config file (vllm-proj…

adfd9b9

…ect#2697) Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>

xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 22, 2024

Use revision when downloading the quantization config file (vllm-proj…

9217016

…ect#2697) Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>

andy-neuma mentioned this pull request Feb 23, 2024

andy/bump main to v0.3.2 neuralmagic/nm-vllm#49

Closed

xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 4, 2024

Use revision when downloading the quantization config file (vllm-proj…

b1e138c

…ect#2697) Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>
