Customize Distance Calculations for KNN Vectors #14025

Open
jed326 opened this issue Nov 27, 2024 · 0 comments

Comments

jed326 commented Nov 27, 2024

Description

Today I see 2 ways to provide the distance calculations when using the HNSW vectors in Lucene:

  1. The existing VectorSimilarityFunction, which is encoded into the segment file itself.
  2. Via a custom scorer through a custom KnnVectorsFormat.

IMO this is not a great experience: to provide my own scorer I need to implement at least 2 new classes, yet most of the code in those classes would be boilerplate or duplicated. The only genuinely novel code would be in RandomVectorScorer#score. I do see that we're a little bit stuck here, because the existing VectorSimilarityFunction is implemented as an enum, so we can't extend it (or really make any changes to it).
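To make the "only novel code" point concrete, here is a minimal sketch of what the custom part might look like for bit vectors. It does not use Lucene at all: `OrdinalScorerSketch` is a hypothetical stand-in that only mirrors the general shape of a per-ordinal `score` method, and the in-memory `byte[][]` storage and the distance-to-score normalization are assumptions made for illustration.

```java
// Hypothetical stand-in for a per-ordinal scorer; NOT the Lucene interface.
interface OrdinalScorerSketch {
    float score(int node);
}

// Scores stored bit vectors against a query using Hamming distance.
final class HammingScorerSketch implements OrdinalScorerSketch {
    private final byte[][] vectors; // packed bit vectors, indexed by ordinal (assumed in-memory)
    private final byte[] query;     // packed query bits, same length as each stored vector

    HammingScorerSketch(byte[][] vectors, byte[] query) {
        this.vectors = vectors;
        this.query = query;
    }

    // Counts the differing bits between two equal-length packed bit vectors.
    static int hammingDistance(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        int distance = 0;
        for (int i = 0; i < a.length; i++) {
            // Mask to 8 bits so sign extension of the byte does not add spurious bits.
            distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return distance;
    }

    // The only part that is specific to the custom distance.
    @Override
    public float score(int node) {
        int totalBits = query.length * Byte.SIZE;
        int distance = hammingDistance(query, vectors[node]);
        // Assumed normalization: map distance to [0, 1], higher meaning more similar.
        return (totalBits - distance) / (float) totalBits;
    }
}
```

Everything outside `score` and `hammingDistance` here is plumbing, which is roughly the proportion I would expect in a real format/scorer pair as well.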

I see that adding bit/binary vector support (#13505) is also currently blocked on resolving this, so I wanted to ask:

  1. What's the remaining gap to officially supporting bit vectors in Lucene? Naively it looks as simple as moving the new HnswBitVectorsFormat class, introduced in "Add BitVectors format and make flat vectors format easier to extend" (#13288), into the lucene101 package.
  2. Broadly speaking, what is the vision here for allowing users to customize the distance calculations? For example, is the current approach of implementing a custom format/scorer the longer-term strategy, or is the long-term plan something like replacing VectorSimilarityFunction with an extensible interface?
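
For the second question, a rough sketch of what "an extensible interface" could mean. All type names below are hypothetical and none of them exist in Lucene. The dot product is a raw, unnormalized one kept simple for illustration, and the Manhattan-based score is just an example of a user-supplied function.

```java
// Hypothetical extensible replacement for an enum-only similarity; not a Lucene type.
interface VectorSimilaritySketch {
    float compare(float[] v1, float[] v2);
}

// Built-in functions could stay an enum that implements the interface...
enum BuiltInSimilaritySketch implements VectorSimilaritySketch {
    DOT_PRODUCT {
        @Override
        public float compare(float[] v1, float[] v2) {
            float sum = 0f;
            for (int i = 0; i < v1.length; i++) {
                sum += v1[i] * v2[i]; // raw dot product, no score normalization
            }
            return sum;
        }
    }
}

// ...while a user adds a new function without modifying the enum.
final class ManhattanSimilaritySketch implements VectorSimilaritySketch {
    @Override
    public float compare(float[] v1, float[] v2) {
        float distance = 0f;
        for (int i = 0; i < v1.length; i++) {
            distance += Math.abs(v1[i] - v2[i]);
        }
        // Assumed mapping from distance to similarity: 1 for identical vectors, decaying toward 0.
        return 1f / (1f + distance);
    }
}
```

Since the existing function is encoded into the segment file, a design like this would presumably also need some way to record which implementation a segment was written with. That part is not sketched here.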