Add ElasticsearchEmbeddings class for generating embeddings using Elasticsearch models #3401
Merged: dev2049 merged 22 commits into langchain-ai:master from jeffvestal:elasticsearch_embeddings on May 23, 2023
…using 'text_field' as the default
@jeffvestal would it be possible to add an example notebook to
@dev2049 Definitely. It might be a couple days as I'm traveling this week but I'll get something in there.
dev2049 added the lgtm label and removed the needs documentation label on May 22, 2023
vowelparrot pushed a commit that referenced this pull request on May 24, 2023:
…sticsearch models (#3401)

This PR introduces a new module, `elasticsearch_embeddings.py`, which provides a wrapper around Elasticsearch embedding models. The new ElasticsearchEmbeddings class allows users to generate embeddings for documents and query texts using a [model deployed in an Elasticsearch cluster](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding).

### Main features:

1. The ElasticsearchEmbeddings class initializes with an Elasticsearch connection object and a model_id, providing an interface to interact with the Elasticsearch ML client through [infer_trained_model](https://elasticsearch-py.readthedocs.io/en/v8.7.0/api.html?highlight=trained%20model%20infer#elasticsearch.client.MlClient.infer_trained_model).
2. The `embed_documents()` method generates embeddings for a list of documents, and the `embed_query()` method generates an embedding for a single query text.
3. The class supports custom input text field names in case the deployed model expects a different field name than the default `text_field`.
4. The implementation is compatible with any model deployed in Elasticsearch that generates embeddings as output.

### Benefits:

1. Simplifies the process of generating embeddings using Elasticsearch models.
2. Provides a clean and intuitive interface to interact with the Elasticsearch ML client.
3. Allows users to easily integrate Elasticsearch-generated embeddings.

Related issue #3400

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
dev2049 pushed a commit that referenced this pull request on May 24, 2023:
Adding example usage for elasticsearch knn embeddings [per](#3401 (comment)) https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/elasticsearch.py
Undertone0809 pushed a commit to Undertone0809/langchain that referenced this pull request on Jun 19, 2023:
Adding example usage for elasticsearch knn embeddings [per](langchain-ai#3401 (comment)) https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/elasticsearch.py
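This PR introduces a new module, `elasticsearch_embeddings.py`, which provides a wrapper around Elasticsearch embedding models. The new ElasticsearchEmbeddings class allows users to generate embeddings for documents and query texts using a [model deployed in an Elasticsearch cluster](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding).

### Main features:

1. The ElasticsearchEmbeddings class initializes with an Elasticsearch connection object and a model_id, providing an interface to interact with the Elasticsearch ML client through [infer_trained_model](https://elasticsearch-py.readthedocs.io/en/v8.7.0/api.html?highlight=trained%20model%20infer#elasticsearch.client.MlClient.infer_trained_model).
2. The `embed_documents()` method generates embeddings for a list of documents, and the `embed_query()` method generates an embedding for a single query text.
3. The class supports custom input text field names in case the deployed model expects a different field name than the default `text_field`.
4. The implementation is compatible with any model deployed in Elasticsearch that generates embeddings as output.

### Benefits:

1. Simplifies the process of generating embeddings using Elasticsearch models.
2. Provides a clean and intuitive interface to interact with the Elasticsearch ML client.
3. Allows users to easily integrate Elasticsearch-generated embeddings.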
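For readers who want to see the shape of the interface described in this PR, here is a minimal sketch rather than the merged implementation. `SketchElasticsearchEmbeddings` and `FakeMlClient` are hypothetical names made up for illustration, and the fake client stands in for a real cluster so the snippet runs on its own. The response layout (`inference_results` entries carrying a `predicted_value`) is an assumption about what `infer_trained_model` returns for a text embedding model.

```python
from typing import Any, Dict, List


class FakeMlClient:
    """Hypothetical stand-in for the Elasticsearch ML client (no cluster needed)."""

    def infer_trained_model(
        self, *, model_id: str, docs: List[Dict[str, str]]
    ) -> Dict[str, Any]:
        # Toy "embedding": [text length, vowel count]. A deployed model would
        # return a dense vector instead. The response shape is an assumption.
        results = []
        for doc in docs:
            text = next(iter(doc.values()))
            vowels = sum(ch in "aeiou" for ch in text.lower())
            results.append({"predicted_value": [float(len(text)), float(vowels)]})
        return {"inference_results": results}


class SketchElasticsearchEmbeddings:
    """Illustrative wrapper: an ML client plus a model_id, as the PR describes."""

    def __init__(self, client: Any, model_id: str, input_field: str = "text_field"):
        self.client = client
        self.model_id = model_id
        # Field name the deployed model expects; `text_field` is the default.
        self.input_field = input_field

    def _embed(self, texts: List[str]) -> List[List[float]]:
        # One inference call for the whole batch, one doc per input text.
        response = self.client.infer_trained_model(
            model_id=self.model_id,
            docs=[{self.input_field: text} for text in texts],
        )
        return [item["predicted_value"] for item in response["inference_results"]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed a list of documents."""
        return self._embed(texts)

    def embed_query(self, text: str) -> List[float]:
        """Embed a single query text."""
        return self._embed([text])[0]


embeddings = SketchElasticsearchEmbeddings(FakeMlClient(), model_id="my-model")
doc_vectors = embeddings.embed_documents(["hello", "elastic"])
query_vector = embeddings.embed_query("hi")
```

Against a real cluster, the fake client would presumably be replaced by the `ml` attribute of an `elasticsearch.Elasticsearch` connection, and `model_id` by the ID of a deployed text embedding model; both are left out here so the sketch stays self-contained.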
This is my first PR for this project.
I created an integration test file; however, I could use some guidance on how to set it up, since it needs an Elasticsearch cluster running an embedding model.
Let me know if there are any structural changes needed or anything missing.
Related issue #3400