Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elasticsearch 8 compatible high-level client integration #1047

Merged
merged 14 commits into from
Jul 26, 2022

Conversation

cmark
Copy link
Member

@cmark cmark commented Jul 25, 2022

This PR integrates the Elasticsearch 8 compatible high-level client in the index layer. Next to the current obsolete TCP and HTTP clients the new client is available via the IndexAdmin interface (IndexAdmin#es8Client()). This client is only available when the connected cluster is version 8.0 or higher. The currently supported client version is 8.3.2.

Related to https://snowowl.atlassian.net/browse/SO-4997

@cmark cmark added the feature label Jul 25, 2022
@cmark cmark requested a review from apeteri July 25, 2022 11:50
@cmark cmark self-assigned this Jul 25, 2022
Comment on lines +557 to +562
// TODO consider adding support for double primitive lists
FloatIterator it = knn.getQueryVector().iterator();
final List<Double> queryVector = new ArrayList<>(knn.getQueryVector().size());
while (it.hasNext()) {
queryVector.add((double) it.next());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange on the ES Java Client's part, as the internal representation is certainly limited to 32 bits in precision, as mentioned in Dense vector field type and Functions for vector fields:

The dense_vector field type stores dense vectors of float values.

doc[<field>].vectorValue – returns a vector’s value as an array of floats

Doubles are most likely used to overcome some JSON serialization limitations. I don't think our Knn search vector should adopt this type and let users think that higher precision can be used than what is actually available.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's why I have used FloatList instead of introducing doubles. In our API we will index an array of float values and will support searching using an array of float values, no doubles.

…x/Searcher.java

Co-authored-by: András Péteri <apeteri@b2international.com>
@codecov-commenter
Copy link

codecov-commenter commented Jul 25, 2022

Codecov Report

Merging #1047 (e237884) into 8.x (f13da26) will decrease coverage by 0.35%.
The diff coverage is 4.77%.

@@             Coverage Diff              @@
##                8.x    #1047      +/-   ##
============================================
- Coverage     65.01%   64.65%   -0.36%     
- Complexity    12268    12274       +6     
============================================
  Files          1701     1703       +2     
  Lines         56563    56895     +332     
  Branches       5231     5286      +55     
============================================
+ Hits          36773    36786      +13     
- Misses        17543    17861     +318     
- Partials       2247     2248       +1     
Impacted Files Coverage Δ
....index/src/com/b2international/index/Searcher.java 50.00% <0.00%> (-50.00%) ⬇️
...m/b2international/index/es/EsDocumentSearcher.java 73.74% <0.00%> (-6.66%) ⬇️
.../com/b2international/index/es/client/EsClient.java 61.53% <ø> (ø)
...2international/index/es/query/Es8QueryBuilder.java 0.00% <0.00%> (ø)
...index/src/com/b2international/index/query/Knn.java 0.00% <0.00%> (ø)
...ternational/index/revision/RevisionIndexAdmin.java 66.66% <0.00%> (-2.30%) ⬇️
...snowowl/core/request/RevisionIndexReadRequest.java 70.83% <0.00%> (-3.08%) ⬇️
...tional/index/revision/DefaultRevisionSearcher.java 61.53% <37.50%> (-6.11%) ⬇️
...m/b2international/index/es/admin/EsIndexAdmin.java 81.37% <57.14%> (-0.49%) ⬇️
...b2international/index/es/EsIndexClientFactory.java 69.69% <60.00%> (-2.31%) ⬇️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f13da26...e237884. Read the comment docs.

Copy link
Member

@apeteri apeteri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review comments are resolved! 👍

@cmark cmark merged commit 9e53f42 into 8.x Jul 26, 2022
@cmark cmark deleted the feature/elasticsearch-8-client branch July 26, 2022 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants