-
Notifications
You must be signed in to change notification settings - Fork 15.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes issue #5072 - adds additional support to Weaviate #5085
Conversation
Implementation is similar to search_distance and where_filter
@@ -201,6 +201,8 @@ def similarity_search_by_text( | |||
query_obj = self._client.query.get(self._index_name, self._query_attrs) | |||
if kwargs.get("where_filter"): | |||
query_obj = query_obj.with_where(kwargs.get("where_filter")) | |||
if kwargs.get("additional"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this apply to other search methods as well? (like similarity_search_by_vector)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can use the same mechanism in the function similarity_search_by_vector. They have different values that you can use in the additional field. But that is specified in the Weaviate documentation. Not sure if we should replicate that here as well.
I would not use it in the functions similarity_search_with_score and max_marginal_relevance_search_by_vector. They allready use the mechanism with the additional field. They already return a score.
Do you want me to add it to the similarity_search_by_vector method as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that'd be great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Add the additional part also to the vector based query
output = docsearch.similarity_search( | ||
"foo", | ||
k=2, | ||
additional=["certainty"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is certainty supposed to be in the output? maybe im misunderstanding what with_additional does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right. I am trying to run the integration tests myself. But I cannot make it work. Now that you mention it, I think the output check should be like this:
assert output == [Document(page_content="foo", metadata={"page": 0, "_additional": {"certainty": 0}})]
I do not know how to get the value for certainty in the test. Of course I can make the change and commit, but I cannot test it myself at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as in you can't make the integration test pass or you can't make it run at all? to run you can do
pytest tests/integration_tests/vectorstores/test_weaviate.py
you'll want to spin up local weaviate container, instructions for that at top of integration test file
cd tests/integration_tests/vectorstores/docker-compose
docker compose -f weaviate.yml up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, learning about poetry at the same time. I managed to get the integration tests running with your tips. Now it keeps telling me he is missing the OpenAI key:
E ValueError: Error during query: [{'locations': [{'column': 6, 'line': 1}], 'message': 'explorer: get class: vectorize params: vectorize params: vectorize params: vectorize keywords: remote client vectorize: OpenAI API Key: no api key found neither in request header: X-OpenAI-Api-Key nor in environment variable under OPENAI_APIKEY', 'path': ['Get', 'LangChain_47555ec10c224f01abb7b82f7458c0b6']}]
Trying to figure out what to do here. I added the .env file in the integration_tests folder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the integration test requires OpenAI model. if you have an openai api key just set that in the terminal you're running this in with
export OPENAI_API_KEY="<api_key>"
if you don't have openai key i can just run the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I really tried everything I could, but I cannot fix it. If you can try the test, then I will try to fix it later for my environment.
Implementation is similar to search_distance and where_filter # adds 'additional' support to Weaviate queries Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Implementation is similar to search_distance and where_filter
adds 'additional' support to Weaviate queries
Fixes #5072
Before submitting
Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
@dev2049