[FEATURE] add support for more than one kNN query on nested vectors with multiple inner hits and filter #1768
Comments
@heemin32 would it be possible to triage this enhancement and identify a timeline for delivery? We are currently blocked without this work and need to understand if and when this feature will be available.
Hi @konstadin. Is this functionality provided for text fields? I don't think there is a way to run two queries and get an inner_hits result for each query, even for text fields. Could you also check whether hybrid search could be used for your use case? https://opensearch.org/docs/latest/search-plugins/hybrid-search/
Hi @heemin32. I'm not aware of this functionality being provided for text fields. However, it is available for multiple k-NN search in these forms: searching multiple k-NN fields; nested k-NN search with 1 inner hit; nested k-NN search with multiple inner hits; and filtered k-NN search, applied as a pre-filter. The request is to provide parity for the above: to search multiple k-NN fields on nested embeddings and return more than 1 inner hit, with a filter.
@konstadin you can use a bool query with a should/must clause to search on multiple k-NN fields (nested or non-nested doesn't matter). A k-NN query clause in OpenSearch is just like any other OpenSearch query clause; it doesn't require any special treatment like the one Elastic has given it. So you can search on multiple k-NN fields the same way you would search on multiple text fields.
The same goes for a nested field.
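A minimal sketch of the bool/should approach described above, written as a Python dict that could be sent as a search request body. The field names title_vector and body_vector, the vectors, and the helper names are hypothetical and only illustrate the shape of the query; they are not taken from this issue.

```python
import json


def knn_clause(field, vector, k):
    """Build one k-NN query clause for a single vector field."""
    return {"knn": {field: {"vector": vector, "k": k}}}


def multi_knn_query(clauses, size=10):
    """Combine several k-NN clauses under bool/should, like any other clauses."""
    return {"size": size, "query": {"bool": {"should": clauses}}}


# Hypothetical field names and vectors, for illustration only.
body = multi_knn_query(
    [
        knn_clause("title_vector", [0.1, 0.2, 0.3], k=2),
        knn_clause("body_vector", [0.4, 0.5, 0.6], k=2),
    ]
)

print(json.dumps(body, indent=2))
```

Swapping "should" for "must" would require every k-NN clause to match instead of any of them.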
Thanks @navneet1v @heemin32, I will take a look. What are the plans to provide a feature that returns multiple inner hits?
@heemin32 was this feature added in the 2.15 release of OpenSearch?
It is not. This issue is somewhat related to #1743.
Closing in favor of #2113.
Is your feature request related to a problem?
Yes. I want to create a document with more than one nested vector in a single document (nested / nested vector), query the document with multiple k-NN queries, and gather more than one inner_hit for each query when searching nested k-NN. This feature is available in Elasticsearch, and we need parity in OpenSearch.
What solution would you like?
Expand support for kNN search with nested fields to allow for multiple knn queries.
This solution builds on Enhanced multi-vector support for OpenSearch k-NN search with nested fields.
Instead of one k-NN search with nested fields on a doc, the solution supports:
The response returns:
What alternatives have you considered?
Storing the documents as nested vectors (instead of nested / nested vectors) and using a boolean query with multiple k-NN queries and an aggregation. However, the mapping of which field matched which k-NN query is lost in the aggregation, as are the inner hits. It is also unclear whether the _score ranking would be calculated the same way.
Do you have any additional context?
Consider the example of storing lines for each paragraph, for each chapter, in a book. Attached is an example mapping, where the lines are stored as nested embeddings in vector and paragraphs are nested in embeddings. Essentially, each document stores the paragraphs of a chapter of a book; a document is a collection of paragraphs for a chapter.
We want to find the chapters that have the closest matches to n lines of text, where each line of text represents a k-NN search (query_1, query_2) that will target the nested embeddings in vector.
We should have the ability to filter for a specific book by book_id, in this example 1234. This will filter out any unrelated books and be applied as a pre-filter in the k-NN search, not as a post-filter.
Sample response included below that returns the top 2 documents, with k=2 for each k-NN search.
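A rough sketch of the request shape this example describes, built as a Python dict. It assumes a nested path named embeddings with the vector field at embeddings.vector (the actual mapping is not reproduced here), uses made-up query vectors, and places the book_id filter inside each k-NN clause so that it acts as a pre-filter. Whether OpenSearch returns per-query inner hits for a request like this is exactly what this issue asks for, so treat it as an illustration of the request, not of supported behavior.

```python
import json


def nested_knn(name, vector, k, book_id, path="embeddings", field="vector"):
    """One nested k-NN clause with its own named inner_hits and a filter.

    `path` and `field` are assumptions about the mapping, not confirmed names.
    """
    return {
        "nested": {
            "path": path,
            "query": {
                "knn": {
                    f"{path}.{field}": {
                        "vector": vector,
                        "k": k,
                        # Filter placed inside the k-NN clause (pre-filter).
                        "filter": {"term": {"book_id": book_id}},
                    }
                }
            },
            # A distinct name per clause keeps the inner hits separable per query.
            "inner_hits": {"name": name, "size": k},
        }
    }


# Hypothetical vectors standing in for the two lines of text.
body = {
    "size": 2,  # top 2 documents
    "query": {
        "bool": {
            "should": [
                nested_knn("query_1", [0.1, 0.2, 0.3], k=2, book_id=1234),
                nested_knn("query_2", [0.4, 0.5, 0.6], k=2, book_id=1234),
            ]
        }
    },
}

print(json.dumps(body, indent=2))
```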