
Make embedding generation go through inference #606

Merged
merged 4 commits into main on Dec 12, 2024

Conversation

@dineshyv (Contributor) commented Dec 11, 2024

This PR does the following:

  1. Adds the ability to generate embeddings in all supported inference providers.
  2. Moves all the memory providers to use the inference API, and improves the memory tests to set up the inference stack correctly and use the embedding models.

This PR merges #589 and #598.
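To illustrate the second point, the routing change can be sketched roughly as follows. This is a minimal, self-contained mock and not the actual Llama Stack interfaces: the `InferenceAPI` and `MemoryProvider` classes and the hash-based fake embedding function below are hypothetical stand-ins; the point is only that a memory provider delegates embedding generation to the inference layer instead of computing embeddings itself.

```python
import hashlib


class InferenceAPI:
    """Stand-in for an inference provider that can produce embeddings."""

    def __init__(self, embedding_dimension: int = 384):
        self.embedding_dimension = embedding_dimension

    def embeddings(self, model: str, contents: list[str]) -> list[list[float]]:
        # Fake deterministic embeddings derived from a hash of the text;
        # a real provider would run its embedding model here.
        out = []
        for text in contents:
            digest = hashlib.sha256(text.encode()).digest()  # 32 bytes
            repeated = list(digest) * (self.embedding_dimension // len(digest) + 1)
            out.append([b / 255.0 for b in repeated[: self.embedding_dimension]])
        return out


class MemoryProvider:
    """Memory provider that delegates embedding generation to inference."""

    def __init__(self, inference: InferenceAPI, embedding_model: str):
        self.inference = inference
        self.embedding_model = embedding_model

    def insert_documents(self, docs: list[str]) -> list[list[float]]:
        # Embeddings now go through the inference API rather than being
        # computed locally by each memory provider.
        return self.inference.embeddings(self.embedding_model, docs)


provider = MemoryProvider(
    InferenceAPI(embedding_dimension=384),
    "sentence-transformers/all-MiniLM-L6-v2",
)
vectors = provider.insert_documents(["hello", "world"])
print(len(vectors), len(vectors[0]))  # 2 384
```

This is also why the tests below pass an `EMBEDDING_DIMENSION` env var per provider: the memory tests need to know what vector size the configured embedding model produces.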


@ashwinb (Contributor) left a comment


lol what's going on flurry of reverts :D

@dineshyv (Contributor, Author) replied:

Merged the first PR by mistake. I want all 3 PRs to go in at once.

@facebook-github-bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Dec 11, 2024
This PR adds the ability to generate embeddings in all supported
inference providers.

```
pytest -v -s llama_stack/providers/tests/inference/test_embeddings.py -k "bedrock" --inference-model="amazon.titan-embed-text-v2:0" --env EMBEDDING_DIMENSION=1024

pytest -v -s -k "vllm" --inference-model="intfloat/e5-mistral-7b-instruct" llama_stack/providers/tests/inference/test_embeddings.py --env EMBEDDING_DIMENSION=4096 --env VLLM_URL="http://localhost:9798/v1"

pytest -v -s --inference-model="nomic-ai/nomic-embed-text-v1.5" llama_stack/providers/tests/inference/test_embeddings.py -k "fireworks" --env FIREWORKS_API_KEY=<API_KEY> --env EMBEDDING_DIMENSION=128

pytest -v -s --inference-model="togethercomputer/m2-bert-80M-2k-retrieval" llama_stack/providers/tests/inference/test_embeddings.py -k "together" --env TOGETHER_API_KEY=<API_KEY> --env EMBEDDING_DIMENSION=768

pytest -v -s -k "ollama" --inference-model="all-minilm:v8" llama_stack/providers/tests/inference/test_embeddings.py --env EMBEDDING_DIMENSION=384

torchrun $CONDA_PREFIX/bin/pytest -v -s -k "meta_reference" --inference-model="sentence-transformers/all-MiniLM-L6-v2" llama_stack/providers/tests/inference/test_embeddings.py --env EMBEDDING_DIMENSION=384
```
@dineshyv force-pushed the revert-605-revert-588-add-model-type branch from 9d082d9 to d362d2d on December 12, 2024 19:30
@dineshyv changed the title from Revert "Revert "add model type to APIs"" to Make embedding generation go through inference on Dec 12, 2024
@dineshyv merged commit 96e158e into main on Dec 12, 2024
2 checks passed
@dineshyv deleted the revert-605-revert-588-add-model-type branch on December 12, 2024 19:47
3 participants