Additional option for alternative model Paths to load different model formats #335
Replies: 3 comments
-
@John42506176Linux Your request is not realistic. Besides the model weights, you also need to load e.g. the config files and tokenizer, and infer the model class from the config. I would suggest packing your custom reranker into a Hugging Face repo. I am also confused: to load ONNX models you need to set --engine optimum, which you are not doing.
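A minimal sketch of that workaround: load the ONNX weights via the optimum engine from a Hugging Face repo that already ships them. The model id here is taken from this thread; whether that repo actually contains ONNX weights is an assumption.

```shell
port=7997
# assumption: this repo contains ONNX weights the optimum engine can discover
model=mixedbread-ai/mxbai-rerank-xsmall-v1
volume=$PWD/data

sudo docker run -it --gpus all \
  -v "$volume":/app/.cache \
  -p "$port":"$port" \
  michaelf34/infinity:latest \
  v2 \
  --model-id "$model" \
  --engine optimum \
  --batch-size 32 \
  --port "$port"
```

This avoids needing a separate path flag, since the engine resolves the ONNX files from the repo layout itself.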
-
Thanks for the information. I didn't know whether this was an option, so I wanted to double-check. I didn't include the --engine optimum option in the contribution, but it was used in testing.
-
Thanks, I'll move it to discussions.
-
Feature request
Additional option for the Docker CLI to load models from alternative paths
port=7997
mid_rerank_model=mixedbread-ai/mxbai-rerank-xsmall-v1
volume=$PWD/data

sudo docker run -it --gpus all \
  -v $volume:/app/.cache \
  -p $port:$port \
  michaelf34/infinity:latest \
  v2 \
  --batch-size 32 \
  --model-id $mid_rerank_model \
  --alternative_path onnx/model_quantized.onnx \
  --port $port
Motivation
I'm trying to load the quantized ONNX version of a reranker model I'm using, and I currently can't see an easy way to do this with the Docker CLI.
Your contribution
Help test or fix a few simple bugs.