Running models on separate devices #485
-
Hi, I was wondering if it is possible to run two models on separate devices. For example, if I want to run all-MiniLM-L6-v2 on CPU with the torch engine, is that possible with the CLI? If so, what would the command look like: infinity_emb v2 ...
-
@Sens612 There is a section in the README on CLI usage, e.g.:
infinity_emb v2 \
  --model-id all-MiniLM-L6-v2 --engine optimum --device cpu --batch-size 8 \
  --model-id cross-encoder/ms-marco-MiniLM-L-6-v2 --engine torch --device cuda --batch-size 16
Each --model-id starts a new model group, and the flags that follow it (engine, device, batch size) apply only to that model.
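Once both models are serving, each request picks a model by name. As a hedged sketch (the endpoint path and default port 7997 are assumptions based on infinity's OpenAI-compatible API; check the README for the exact values), the request body for the embedding model might be built like this:

```python
import json

# Build the JSON body for an OpenAI-style /embeddings request.
# The "model" field selects one of the co-hosted models by its id;
# the endpoint/port (e.g. http://localhost:7997/embeddings) are
# assumptions here -- verify against the server's docs.
def embeddings_payload(model_id: str, texts: list[str]) -> str:
    """Serialize a request body targeting one specific model."""
    return json.dumps({"model": model_id, "input": texts})

# Target the CPU-hosted embedding model from the CLI example above.
cpu_req = embeddings_payload("all-MiniLM-L6-v2", ["hello world"])
```

Note that the second model in the example is a cross-encoder, which would typically be queried through a rerank-style endpoint rather than /embeddings.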
-
Aha, I had no clue you could do it that way; I thought the majority of the arguments were forced on both models. Looking at the README, I don't think this is shown clearly with an example. Either way, thank you, and great work on the repo.