Elasticsearch 8 compatible high-level client integration #1047
...with latest Elasticsearch 8 docker images
Configure ES8 client only when connecting to ES 8.* clusters
commons/com.b2international.index.es8/src/com/b2international/index/es8/Es8Client.java
commons/com.b2international.index/src/com/b2international/index/Searcher.java
// TODO consider adding support for double primitive lists
FloatIterator it = knn.getQueryVector().iterator();
final List<Double> queryVector = new ArrayList<>(knn.getQueryVector().size());
while (it.hasNext()) {
    queryVector.add((double) it.next());
}
This is a bit strange on the ES Java Client's part, as the internal representation is certainly limited to 32 bits in precision, as mentioned in Dense vector field type and Functions for vector fields:
The dense_vector field type stores dense vectors of float values.
doc[<field>].vectorValue – returns a vector’s value as an array of floats
Doubles are most likely used to overcome some JSON serialization limitations. I don't think our Knn search vector should adopt this type and let users think that higher precision can be used than what is actually available.
Yes, that's why I have used FloatList instead of introducing doubles. In our API we will index an array of float values and will support searching using an array of float values, no doubles.
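To illustrate the point made in this thread, here is a minimal sketch in plain Java. It assumes a `float[]` in place of the `FloatList` used in the PR, and the class and method names (`QueryVectorWidening`, `toDoubleList`, `toFloatArray`) are hypothetical, not part of the codebase. It shows that widening a float to a double is lossless, so the conversion is safe, but also that the resulting double carries no precision beyond the original 32 bits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QueryVectorWidening {

    // Widens each 32-bit float to a 64-bit double, as a client API
    // that accepts List<Double> would require.
    public static List<Double> toDoubleList(float[] queryVector) {
        List<Double> result = new ArrayList<>(queryVector.length);
        for (float value : queryVector) {
            result.add((double) value);
        }
        return result;
    }

    // Narrows the doubles back to floats, mimicking float-based storage.
    public static float[] toFloatArray(List<Double> values) {
        float[] result = new float[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i).floatValue();
        }
        return result;
    }

    public static void main(String[] args) {
        float[] original = {0.1f, 0.25f, -3.5f};
        List<Double> widened = toDoubleList(original);

        // The round trip is exact: widening never loses information.
        System.out.println(Arrays.equals(original, toFloatArray(widened))); // true

        // The widened value is not the double literal 0.1: no precision was gained.
        System.out.println(widened.get(0) == 0.1d); // false
    }
}
```

This is why keeping floats in the public search API, and converting only at the client boundary, does not change what the cluster can actually represent.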
commons/com.b2international.index/src/com/b2international/index/es/query/Es8QueryBuilder.java
…x/Searcher.java Co-authored-by: András Péteri <apeteri@b2international.com>
Codecov Report
@@ Coverage Diff @@
## 8.x #1047 +/- ##
============================================
- Coverage 65.01% 64.65% -0.36%
- Complexity 12268 12274 +6
============================================
Files 1701 1703 +2
Lines 56563 56895 +332
Branches 5231 5286 +55
============================================
+ Hits 36773 36786 +13
- Misses 17543 17861 +318
- Partials 2247 2248 +1
Continue to review full report at Codecov.
Review comments are resolved! 👍
This PR integrates the Elasticsearch 8 compatible high-level client into the index layer. Alongside the existing, now obsolete TCP and HTTP clients, the new client is available via the IndexAdmin interface (IndexAdmin#es8Client()). This client is only available when the connected cluster is version 8.0 or higher. The currently supported client version is 8.3.2.
Related to https://snowowl.atlassian.net/browse/SO-4997
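The version gate described above could look roughly like the following sketch. Everything in it is an assumption for illustration: the PR does not show how the cluster version is detected or what `IndexAdmin#es8Client()` returns when the client is unavailable, so `Es8ClientGate`, `FakeEs8Client`, `majorVersion` and the `Optional` return type are all hypothetical stand-ins.

```java
import java.util.Optional;

public class Es8ClientGate {

    // Hypothetical stand-in for the real ES 8 client; not the actual Es8Client class.
    public static final class FakeEs8Client {
        public final String clusterVersion;

        FakeEs8Client(String clusterVersion) {
            this.clusterVersion = clusterVersion;
        }
    }

    // Extracts the major version from a version string such as "8.3.2".
    public static int majorVersion(String version) {
        int dot = version.indexOf('.');
        String major = dot < 0 ? version : version.substring(0, dot);
        return Integer.parseInt(major);
    }

    // Returns a client only when the connected cluster reports version 8.0 or higher.
    public static Optional<FakeEs8Client> es8Client(String clusterVersion) {
        if (majorVersion(clusterVersion) >= 8) {
            return Optional.of(new FakeEs8Client(clusterVersion));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(es8Client("8.3.2").isPresent());  // true
        System.out.println(es8Client("7.17.1").isPresent()); // false
    }
}
```

Returning an empty `Optional` for older clusters is one possible design; the actual implementation may instead return null or throw.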