Add support for knn_vector property type (#524)
* feat: add support for knn_vector property

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* feat: add support for knn index setting

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* fix(IndexSettings.java): add missing alias with index prefix to knn setting

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* feat: add support for knn.algo_param.ef_search setting

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* docs: add Changelog entry for knn_vector property

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* style: fix style check violations

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* 🔥 refactor(KnnVectorMethod.java, KnnVectorProperty.java): remove license headers

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* refactor: remove dense_vector property

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* refactor: remove remaining dense_vector references

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* docs(USER_GUIDE.md): add instructions to create an index with custom settings and mappings

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* docs(USER_GUIDE.md): add knn search examples for script_score and scripting_score with painless extension

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* docs(USER_GUIDE.md): add k-NN to table of contents

Signed-off-by: Malte Hedderich <github@hedderich.pro>

* docs(USER_GUIDE.md): fix position of k-NN search to table of contents

Signed-off-by: Malte Hedderich <github@hedderich.pro>

---------

Signed-off-by: Malte Hedderich <github@hedderich.pro>
maltehedderich authored Jun 20, 2023
1 parent 68e2b6f commit 7d22a7d
Showing 12 changed files with 810 additions and 471 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -27,6 +27,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
## [Unreleased 2.x]

### Added
- Add support for knn_vector field type ([#524](https://github.com/opensearch-project/opensearch-java/pull/524))

### Dependencies

198 changes: 195 additions & 3 deletions USER_GUIDE.md
@@ -6,11 +6,16 @@
- [Create a client](#create-a-client)
- [Create a client using `RestClientTransport`](#create-a-client-using-restclienttransport)
- [Create a client using `ApacheHttpClient5Transport`](#create-a-client-using-apachehttpclient5transport)
- [Create an index](#create-an-index)
- [Create an index with default settings](#create-an-index-with-default-settings)
- [Create an index with custom settings and mappings](#create-an-index-with-custom-settings-and-mappings)
- [Index data](#index-data)
- [Search for the documents](#search-for-the-documents)
- [Get raw JSON results](#get-raw-json-results)
- [Search documents using a match query](#search-documents-using-a-match-query)
- [Search documents using k-NN](#search-documents-using-k-nn)
- [Exact k-NN with scoring script](#exact-k-nn-with-scoring-script)
- [Exact k-NN with painless scripting extension](#exact-k-nn-with-painless-scripting-extension)
- [Search documents using suggesters](#search-documents-using-suggesters)
- [App Data class](#app-data-class)
- [Using completion suggester](#using-completion-suggester)
@@ -84,7 +89,7 @@ There are multiple low level transports which `OpenSearchClient` could be config
import org.apache.hc.core5.http.HttpHost;

final HttpHost[] hosts = new HttpHost[] {
    new HttpHost("http", "localhost", 9200)
};

// Initialize the client with SSL and TLS enabled
@@ -112,7 +117,7 @@ Upcoming OpenSearch `3.0.0` release brings HTTP/2 support and as such, the `Rest
import org.apache.hc.core5.http.HttpHost;

final HttpHost[] hosts = new HttpHost[] {
    new HttpHost("http", "localhost", 9200)
};

final OpenSearchTransport transport = ApacheHttpClient5TransportBuilder
@@ -140,12 +145,33 @@ OpenSearchClient client = new OpenSearchClient(transport);

## Create an index

### Create an index with default settings

```java
String index = "sample-index";
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder().index(index).build();
client.indices().create(createIndexRequest);
```

### Create an index with custom settings and mappings

```java
String index = "sample-index";
IndexSettings settings = new IndexSettings.Builder()
    .numberOfShards("2")
    .numberOfReplicas("1")
    .build();
TypeMapping mapping = new TypeMapping.Builder()
    .properties("age", new Property.Builder().integer(new IntegerNumberProperty.Builder().build()).build())
    .build();
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
    .index(index)
    .settings(settings)
    .mappings(mapping)
    .build();
client.indices().create(createIndexRequest);
```

## Index data

```java
@@ -191,6 +217,172 @@ for (int i = 0; i < searchResponse.hits().hits().size(); i++) {
}
```

## Search documents using k-NN

### Exact k-NN with scoring script

1. Create index with custom mapping

```java
String index = "my-knn-index-1";
TypeMapping mapping = new TypeMapping.Builder()
    .properties("my_vector", new Property.Builder()
        .knnVector(new KnnVectorProperty.Builder()
            .dimension(4)
            .build())
        .build())
    .build();
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
    .index(index)
    .mappings(mapping)
    .build();
client.indices().create(createIndexRequest);
```

2. Index documents

```java
JsonObject doc1 = Json.createObjectBuilder()
    .add("my_vector", Json.createArrayBuilder().add(1.5).add(5.5).add(4.5).add(6.4).build())
    .add("price", 10.3)
    .build();
JsonObject doc2 = Json.createObjectBuilder()
    .add("my_vector", Json.createArrayBuilder().add(2.5).add(3.5).add(5.6).add(6.7).build())
    .add("price", 5.5)
    .build();
JsonObject doc3 = Json.createObjectBuilder()
    .add("my_vector", Json.createArrayBuilder().add(4.5).add(5.5).add(6.7).add(3.7).build())
    .add("price", 4.4)
    .build();

ArrayList<BulkOperation> operations = new ArrayList<>();
operations.add(new BulkOperation.Builder().index(
    IndexOperation.of(io -> io.index(index).id("1").document(doc1))
).build());
operations.add(new BulkOperation.Builder().index(
    IndexOperation.of(io -> io.index(index).id("2").document(doc2))
).build());
operations.add(new BulkOperation.Builder().index(
    IndexOperation.of(io -> io.index(index).id("3").document(doc3))
).build());

BulkRequest bulkRequest = new BulkRequest.Builder()
    .index(index)
    .operations(operations)
    .build();
client.bulk(bulkRequest);
```

3. Search documents using k-NN script score (_This implementation utilizes `com.fasterxml.jackson.databind.JsonNode` as the target document class, which is not part of the OpenSearch Java library. However, any document class that matches the searched data can be used instead._)

```java
InlineScript inlineScript = new InlineScript.Builder()
    .source("knn_score")
    .lang("knn")
    .params(Map.of(
        "field", JsonData.of("my_vector"),
        "query_value", JsonData.of(List.of(1.5, 5.5, 4.5, 6.4)),
        "space_type", JsonData.of("cosinesimil")
    ))
    .build();
Query query = new Query.Builder()
    .scriptScore(new ScriptScoreQuery.Builder()
        .query(new Query.Builder()
            .matchAll(new MatchAllQuery.Builder().build())
            .build())
        .script(new Script.Builder()
            .inline(inlineScript)
            .build())
        .build())
    .build();
SearchRequest searchRequest = new SearchRequest.Builder()
    .index(index)
    .query(query)
    .build();
SearchResponse<JsonNode> searchResponse = client.search(searchRequest, JsonNode.class);
```
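
To illustrate what "exact" k-NN means here, the following plain-Java sketch scores every document against the query vector and sorts by cosine similarity, using the three sample vectors from step 2. It is only a conceptual model and does not call the OpenSearch client or the k-NN plugin; the class `ExactKnnSketch` and its helper methods are hypothetical names, and the plugin's actual score transformation for `cosinesimil` may differ from the raw similarity used here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class, not part of opensearch-java.
class ExactKnnSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Brute force: score every document, then return ids ordered best-first.
    static List<String> rank(Map<String, double[]> docs, double[] query) {
        List<String> ids = new ArrayList<>(docs.keySet());
        ids.sort(Comparator
            .comparingDouble((String id) -> cosineSimilarity(docs.get(id), query))
            .reversed());
        return ids;
    }

    // Sample vectors copied from the indexing step above, keyed by document id.
    static Map<String, double[]> sampleDocs() {
        Map<String, double[]> docs = new LinkedHashMap<>();
        docs.put("1", new double[] {1.5, 5.5, 4.5, 6.4});
        docs.put("2", new double[] {2.5, 3.5, 5.6, 6.7});
        docs.put("3", new double[] {4.5, 5.5, 6.7, 3.7});
        return docs;
    }

    public static void main(String[] args) {
        double[] query = {1.5, 5.5, 4.5, 6.4};
        System.out.println(rank(sampleDocs(), query));
    }
}
```

Because every matched document is scored, the cost grows linearly with the number of documents that the inner `match_all` query selects; narrowing that inner query is the usual way to keep exact k-NN affordable.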

### Exact k-NN with painless scripting extension

1. Create index with custom mapping

```java
String index = "my-knn-index-1";
TypeMapping mapping = new TypeMapping.Builder()
    .properties("my_vector", new Property.Builder()
        .knnVector(new KnnVectorProperty.Builder()
            .dimension(4)
            .build())
        .build())
    .build();
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
    .index(index)
    .mappings(mapping)
    .build();
client.indices().create(createIndexRequest);
```

2. Index documents

```java
JsonObject doc1 = Json.createObjectBuilder()
    .add("my_vector", Json.createArrayBuilder().add(1.5).add(5.5).add(4.5).add(6.4).build())
    .add("price", 10.3)
    .build();
JsonObject doc2 = Json.createObjectBuilder()
    .add("my_vector", Json.createArrayBuilder().add(2.5).add(3.5).add(5.6).add(6.7).build())
    .add("price", 5.5)
    .build();
JsonObject doc3 = Json.createObjectBuilder()
    .add("my_vector", Json.createArrayBuilder().add(4.5).add(5.5).add(6.7).add(3.7).build())
    .add("price", 4.4)
    .build();

ArrayList<BulkOperation> operations = new ArrayList<>();
operations.add(new BulkOperation.Builder().index(
    IndexOperation.of(io -> io.index(index).id("1").document(doc1))
).build());
operations.add(new BulkOperation.Builder().index(
    IndexOperation.of(io -> io.index(index).id("2").document(doc2))
).build());
operations.add(new BulkOperation.Builder().index(
    IndexOperation.of(io -> io.index(index).id("3").document(doc3))
).build());

BulkRequest bulkRequest = new BulkRequest.Builder()
    .index(index)
    .operations(operations)
    .build();
client.bulk(bulkRequest);
```

3. Search documents using k-NN with painless scripting extension (_This implementation utilizes `com.fasterxml.jackson.databind.JsonNode` as the target document class, which is not part of the OpenSearch Java library. However, any document class that matches the searched data can be used instead._)

```java
InlineScript inlineScript = new InlineScript.Builder()
    .source("1.0 + cosineSimilarity(params.query_value, doc[params.field])")
    .params(Map.of(
        "field", JsonData.of("my_vector"),
        "query_value", JsonData.of(List.of(1.5, 5.5, 4.5, 6.4))
    ))
    .build();
Query query = new Query.Builder()
    .scriptScore(new ScriptScoreQuery.Builder()
        .query(new Query.Builder()
            .matchAll(new MatchAllQuery.Builder().build())
            .build())
        .script(new Script.Builder()
            .inline(inlineScript)
            .build())
        .build())
    .build();
SearchRequest searchRequest = new SearchRequest.Builder()
    .index(index)
    .query(query)
    .build();
SearchResponse<JsonNode> searchResponse = client.search(searchRequest, JsonNode.class);
```
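
The `1.0 +` offset in the script source above is worth a short illustration. Cosine similarity ranges from -1 to 1, and, as far as we are aware, `script_score` rejects negative scores, so the offset shifts the result into the range 0 to 2. The plain-Java sketch below mirrors that arithmetic only; `PainlessScoreSketch` and its methods are hypothetical names, and the helper is a stand-in for the plugin's `cosineSimilarity` function rather than its actual implementation.

```java
// Hypothetical illustration class, not part of opensearch-java or the k-NN plugin.
class PainlessScoreSketch {

    // Stand-in for the painless extension function: dot(a, b) / (|a| * |b|).
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Mirrors the script source: 1.0 + cosineSimilarity(params.query_value, doc[params.field]).
    static double score(double[] queryValue, double[] docVector) {
        return 1.0 + cosineSimilarity(queryValue, docVector);
    }

    public static void main(String[] args) {
        double[] query = {1.0, 2.0};
        // Same direction, orthogonal, and opposite direction relative to the query.
        System.out.println(score(query, new double[] {2.0, 4.0}));
        System.out.println(score(query, new double[] {-2.0, 1.0}));
        System.out.println(score(query, new double[] {-1.0, -2.0}));
    }
}
```

Vectors pointing the same way as the query score close to 2, orthogonal vectors score 1, and opposite vectors score close to 0, so the relative ordering by cosine similarity is preserved.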

## Search documents using suggesters

### App Data class