Fail to load fixie-ai/ultravox-v0_4_1-llama-3_1-70b with device_map 'auto' #166

MatthewCYM · 2024-12-12T16:13:02Z

Hi,

When I load the model across 4 GPUs with model parallelism:

transformers.pipeline(model='fixie-ai/ultravox-v0_4_1-llama-3_1-70b', trust_remote_code=True, device_map='auto')

It fails with the following error:

ValueError: weight is on the meta device, we need a `value` to put in on 0.

Madoshakalaka · 2024-12-19T11:49:29Z

Facing the same problem here with two 48GB A6000 GPUs.

farzadab · 2024-12-19T17:45:58Z

Hi there,

I've looked into this before and couldn't get good performance (if any at all) out of it, so for 70B inference we currently use vLLM instead.

Can I ask why you want to do inference with the 70B model? For example, do you care about performance, or is it just to test the model out?
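
For anyone trying the vLLM route mentioned above, here is a minimal sketch of how the serving command might be assembled. It assumes vLLM is installed and provides the `vllm serve` CLI with its `--tensor-parallel-size` and `--trust-remote-code` flags. The helper name `build_vllm_serve_command` is hypothetical, the GPU count of 4 is taken from the original report, and whether the 70B model actually fits depends on your GPU memory.

```python
import shlex

# Model ID from the issue title.
MODEL_ID = "fixie-ai/ultravox-v0_4_1-llama-3_1-70b"


def build_vllm_serve_command(model_id: str, num_gpus: int) -> list[str]:
    """Hypothetical helper: build the argv for `vllm serve`.

    Shards the model across `num_gpus` GPUs via tensor parallelism.
    `--trust-remote-code` mirrors `trust_remote_code=True` from the
    pipeline call in the report (assumed to be needed here as well).
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "vllm", "serve", model_id,
        "--tensor-parallel-size", str(num_gpus),
        "--trust-remote-code",
    ]


# Print the command so it can be copied into a shell.
command = build_vllm_serve_command(MODEL_ID, num_gpus=4)
print(shlex.join(command))
```

The printed command can be run in a shell on the GPU host. Set `num_gpus` to the number of devices you want the weights sharded across.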

Madoshakalaka · 2024-12-20T07:36:13Z

Thanks! We are just poking around, checking whether the 70B model provides better output.
After some testing, it seems that the 8B one is already very capable, though.
In our application, a timely response from the LLM is crucial, so if the 70B model is slower in that regard, it's indeed a worse choice.
Thanks for helping out!

zkoch assigned farzadab and zqhuang211 Dec 19, 2024
