
Make embedding generation go through inference #606

Merged
merged 4 commits into main on Dec 12, 2024

Conversation

@dineshyv (Contributor) commented Dec 11, 2024

This PR does the following:

  1. Adds the ability to generate embeddings in all supported inference providers.
  2. Moves all the memory providers to use the inference API, and improves the memory tests to set up the inference stack correctly and use the embedding models.

This PR merges #589 and #598.
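To illustrate the second point, the routing change can be sketched roughly as follows. This is a minimal, self-contained mock and not the actual Llama Stack interfaces: the `InferenceAPI` and `MemoryProvider` classes and the hash-based fake embedding function below are hypothetical stand-ins; the point is only that a memory provider delegates embedding generation to the inference layer instead of computing embeddings itself.

```python
import hashlib


class InferenceAPI:
    """Stand-in for an inference provider that can produce embeddings."""

    def __init__(self, embedding_dimension: int = 384):
        self.embedding_dimension = embedding_dimension

    def embeddings(self, model: str, contents: list[str]) -> list[list[float]]:
        # Fake deterministic embeddings derived from a hash of the text;
        # a real provider would run its embedding model here.
        out = []
        for text in contents:
            digest = hashlib.sha256(text.encode()).digest()  # 32 bytes
            repeated = list(digest) * (self.embedding_dimension // len(digest) + 1)
            out.append([b / 255.0 for b in repeated[: self.embedding_dimension]])
        return out


class MemoryProvider:
    """Memory provider that delegates embedding generation to inference."""

    def __init__(self, inference: InferenceAPI, embedding_model: str):
        self.inference = inference
        self.embedding_model = embedding_model

    def insert_documents(self, docs: list[str]) -> list[list[float]]:
        # Embeddings now go through the inference API rather than being
        # computed locally by each memory provider.
        return self.inference.embeddings(self.embedding_model, docs)


provider = MemoryProvider(
    InferenceAPI(embedding_dimension=384),
    "sentence-transformers/all-MiniLM-L6-v2",
)
vectors = provider.insert_documents(["hello", "world"])
print(len(vectors), len(vectors[0]))  # 2 384
```

This is also why the tests below pass an `EMBEDDING_DIMENSION` env var per provider: the memory tests need to know what vector size the configured embedding model produces.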


@ashwinb (Contributor) left a comment


lol what's going on flurry of reverts :D

@dineshyv (Contributor, Author) replied:

Merged the first PR by mistake. I want all 3 PRs to go in at once.

@facebook-github-bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Dec 11, 2024
This PR adds the ability to generate embeddings in all supported
inference providers.

```
pytest -v -s llama_stack/providers/tests/inference/test_embeddings.py -k "bedrock" --inference-model="amazon.titan-embed-text-v2:0" --env EMBEDDING_DIMENSION=1024

pytest -v -s -k "vllm" --inference-model="intfloat/e5-mistral-7b-instruct" llama_stack/providers/tests/inference/test_embeddings.py --env EMBEDDING_DIMENSION=4096 --env VLLM_URL="http://localhost:9798/v1"

pytest -v -s --inference-model="nomic-ai/nomic-embed-text-v1.5" llama_stack/providers/tests/inference/test_embeddings.py -k "fireworks" --env FIREWORKS_API_KEY=<API_KEY> --env EMBEDDING_DIMENSION=128

pytest -v -s --inference-model="togethercomputer/m2-bert-80M-2k-retrieval" llama_stack/providers/tests/inference/test_embeddings.py -k "together" --env TOGETHER_API_KEY=<API_KEY> --env EMBEDDING_DIMENSION=768

pytest -v -s -k "ollama" --inference-model="all-minilm:v8" llama_stack/providers/tests/inference/test_embeddings.py --env EMBEDDING_DIMENSION=384

torchrun $CONDA_PREFIX/bin/pytest -v -s -k "meta_reference" --inference-model="sentence-transformers/all-MiniLM-L6-v2" llama_stack/providers/tests/inference/test_embeddings.py --env EMBEDDING_DIMENSION=384
```
@dineshyv force-pushed the revert-605-revert-588-add-model-type branch from 9d082d9 to d362d2d on December 12, 2024 19:30
@dineshyv changed the title from Revert "Revert "add model type to APIs"" to Make embedding generation go through inference on Dec 12, 2024
@dineshyv merged commit 96e158e into main on Dec 12, 2024
2 checks passed
@dineshyv deleted the revert-605-revert-588-add-model-type branch on December 12, 2024 19:47
3 participants