⬆️ add read replica and rotator documents (#189)
* ⬆️ add read replica and rotator documents

Signed-off-by: Kosuke Morimoto <ksk@vdaas.org>

* 🤖 auto update stage ⬆️

Signed-off-by: vdaas-ci <ci@vdaas.org>

* fix

Signed-off-by: Kosuke Morimoto <ksk@vdaas.org>

* 🤖 auto update stage ⬆️

Signed-off-by: vdaas-ci <ci@vdaas.org>

* fix typo

Signed-off-by: Kosuke Morimoto <ksk@vdaas.org>

* 🤖 auto update stage ⬆️

Signed-off-by: vdaas-ci <ci@vdaas.org>

---------

Signed-off-by: Kosuke Morimoto <ksk@vdaas.org>
Signed-off-by: vdaas-ci <ci@vdaas.org>
Co-authored-by: vdaas-ci <ci@vdaas.org>
kmrmt and vdaas-ci authored May 21, 2024
1 parent 5fd0d70 commit 93ea65d
Showing 14 changed files with 388 additions and 35 deletions.
2 changes: 1 addition & 1 deletion VERSIONS/GO_VERSION
@@ -1 +1 @@
1.22.2
1.22.3
13 changes: 12 additions & 1 deletion content/docs/api/filter-gateway.md
@@ -1,6 +1,6 @@
---
title: "Filter Gateway_api"
date: 2024-04-25T20:33:03+09:00
date: 2024-05-21T15:09:37+09:00
draft: false
weight: 800
description: How to use CRUD API with filter gateway
@@ -1488,6 +1488,17 @@ service Filter {
| id | string | | the vector ID |
| distance | float | | the distance between result vector and request vector |

### Status Code

| code | desc. |
| :--: | :---------------- |
| 0 | OK |
| 1 | CANCELLED |
| 3 | INVALID_ARGUMENT |
| 4 | DEADLINE_EXCEEDED |
| 5 | NOT_FOUND |
| 13 | INTERNAL |
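
The codes in this table are standard gRPC status codes. As a minimal sketch of how a client might turn them into readable names, the lookup below covers only the codes listed above; `statusNames` and `describeStatus` are hypothetical names for illustration and are not part of the Vald API.

```go
package main

import "fmt"

// statusNames maps the numeric status codes listed in the table above
// to their names. Only the codes from the table are included; this is
// not the full gRPC code set.
var statusNames = map[uint32]string{
	0:  "OK",
	1:  "CANCELLED",
	3:  "INVALID_ARGUMENT",
	4:  "DEADLINE_EXCEEDED",
	5:  "NOT_FOUND",
	13: "INTERNAL",
}

// describeStatus returns the name for a code, or a fallback label for
// codes that the table does not list.
func describeStatus(code uint32) string {
	if name, ok := statusNames[code]; ok {
		return name
	}
	return fmt.Sprintf("UNLISTED(%d)", code)
}

func main() {
	for _, c := range []uint32{0, 5, 13, 14} {
		fmt.Printf("%d -> %s\n", c, describeStatus(c))
	}
}
```

In a real client, the status package of the gRPC library would normally be used to extract the code from a returned error; the map here only mirrors the table.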

## MultiSearch RPC

MultiSearch RPC is the method to search objects with multiple objects in **1** request.
4 changes: 2 additions & 2 deletions content/docs/api/remove.md
@@ -1,6 +1,6 @@
---
title: "Remove_api"
date: 2024-04-25T20:33:02+09:00
date: 2024-05-21T15:09:37+09:00
draft: false
weight: 500
description: Remove indexes from the Vald cluster
@@ -380,7 +380,7 @@ gRPC has a message size limitation.<br>
Please make sure that the size of the request does not exceed the limit.
</div>

## Input
### Input

- the scheme of `payload.v1.Remove.MultiRequest`

6 changes: 3 additions & 3 deletions content/docs/api/search.md
@@ -1,6 +1,6 @@
---
title: "Search_api"
date: 2024-04-25T20:33:02+09:00
date: 2024-05-21T15:09:37+09:00
draft: false
weight: 400
description: Search ANN vectors from the Vald cluster
@@ -711,7 +711,7 @@ Here are some common reasons and how to resolve each error.
| NOT_FOUND | Search result is empty or insufficient to request result length. | Send a request with another vector or set min_num to a smaller value. |
| INTERNAL | Target Vald cluster or network route has some critical error. | Check target Vald cluster first and check network route including ingress as second. |

### MultiSearchByID RPC
## MultiSearchByID RPC

MultiSearchByID RPC is the method to search vectors with multiple IDs in **1** request.

@@ -1496,7 +1496,7 @@ Here are some common reasons and how to resolve each error.
| NOT_FOUND | Search result is empty or insufficient to request result length. | Send a request with another vector or set min_num to a smaller value. |
| INTERNAL | Target Vald cluster or network route has some critical error. | Check target Vald cluster first and check network route including ingress as second. |

### MultiLinearSearchByID RPC
## MultiLinearSearchByID RPC

MultiLinearSearchByID RPC is the method to linear search vectors with multiple IDs in **1** request.

254 changes: 238 additions & 16 deletions content/docs/performance/continuous-benchmark.md

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions content/docs/tutorial/get-started.md
@@ -1,6 +1,6 @@
---
title: "Get Started_tutorial"
date: 2024-04-25T20:33:02+09:00
date: 2024-05-21T13:22:43+09:00
draft: false
weight: 100
description: Running Vald cluster with NGT Agent on Kubernetes and execute client codes
@@ -46,7 +46,7 @@ If Helm or HDF5 is not installed, please install [Helm](https://helm.sh/docs/int
<details><summary>Installation command for Helm</summary><br>

```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```

</details>
@@ -454,8 +454,8 @@ If you are interested, please refer to [SDKs](/docs/user-guides/sdks).<br>
```go
_, err := client.Flush(ctx, &payload.Flush_Request{})
if err != nil {
glg.Fatal(err)
}
glg.Fatal(err)
}
```
</details>
4 changes: 2 additions & 2 deletions content/docs/tutorial/vald-agent-standalone-on-k8s.md
@@ -1,6 +1,6 @@
---
title: "Vald Agent Standalone on K8s_tutorial"
date: 2024-02-15T17:10:12+09:00
date: 2024-05-20T15:12:59+09:00
draft: false
weight: 300
description: Running only Vald Agent on Kubernetes and execute client codes
@@ -47,7 +47,7 @@ If Helm or HDF5 is not installed, please install [Helm](https://helm.sh/docs/int
<details><summary>Installation command for Helm</summary><br>

```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```

</details>
4 changes: 2 additions & 2 deletions content/docs/tutorial/vald-multicluster-on-k8s.md
@@ -1,6 +1,6 @@
---
title: "Vald Multicluster on K8s_tutorial"
date: 2024-02-15T17:10:13+09:00
date: 2024-05-20T15:12:59+09:00
draft: false
weight: 500
description: Running Multi Vald Clusters with Mirror Gateway on Kubernetes and execute client codes
@@ -40,7 +40,7 @@ If Helm or HDF5 is not installed, please install [Helm](https://helm.sh/docs/int
<details><summary>Installation command for Helm</summary><br>

```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```

</details>
4 changes: 2 additions & 2 deletions content/docs/user-guides/deployment.md
@@ -1,6 +1,6 @@
---
title: "Deployment_user Guides"
date: 2024-02-15T17:10:14+09:00
date: 2024-05-20T15:12:58+09:00
draft: false
weight: 600
description: How to launch Vald cluster on your Kubernetes cluster
@@ -38,7 +38,7 @@ If Helm is not installed, please install [Helm](https://helm.sh/docs/intro/insta
<details><summary>Installation command for Helm</summary><br>

```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```

</details>
116 changes: 116 additions & 0 deletions content/docs/user-guides/read-replica-and-rotator.md
@@ -0,0 +1,116 @@
---
title: "Read Replica and Rotator_user Guides"
date: 2024-05-21T13:22:43+09:00
draft: false
weight: 1500
description: How to improve search request speed
menu:
  userguides:
    parent: User Guides
---

# Read Replica and Rotator

Read replica enhances the search QPS (Queries Per Second) of the Vald cluster by deploying read-only agents in addition to the regular agents and distributing the requests among them. Read replicas are deployed as Kubernetes Deployments, and with N replicas, QPS increases by approximately 1.7 to 1.8 times \* N.

<div class="notice">
The increase in QPS is possible with sufficient infrastructure (see <a href="#important-notes">Important notes</a>).
</div>
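
As a rough worked example of the scaling statement above, the sketch below reads "1.7 to 1.8 times \* N" literally as a multiplier on the QPS measured without read replicas. Both that reading and the baseline value are assumptions for illustration; `estimateQPS` is a hypothetical helper, not part of Vald, and the result only holds when the infrastructure is sufficient.

```go
package main

import "fmt"

// estimateQPS returns the low and high ends of the expected search QPS
// for n read replicas, using the approximate 1.7x to 1.8x per-replica
// factor quoted in the text. baseline is the QPS measured without read
// replicas (an assumed input, not a measured Vald figure).
func estimateQPS(baseline float64, n int) (low, high float64) {
	return baseline * 1.7 * float64(n), baseline * 1.8 * float64(n)
}

func main() {
	// Hypothetical baseline of 1000 QPS with 1 to 3 read replicas.
	for n := 1; n <= 3; n++ {
		low, high := estimateQPS(1000, n)
		fmt.Printf("N=%d: about %.0f to %.0f QPS\n", n, low, high)
	}
}
```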

## How to deploy read replica

The read replica is managed with a separate chart from the Vald cluster and is deployed as an addon to the Vald cluster. The Vald cluster should be deployed first, followed by the deployment of the read replica.

> Vald and `vald-readreplica` are in separate charts to avoid conflicts between the read replica's restart and the Helm operator's processes when Vald is managed by the Helm operator. Therefore, the read replica is always deployed using Helm commands.

### When you deploy Vald with Helm command

1. Edit `values.yaml` as shown below (please refer to [deployment](/docs/user-guides/deployment) for other fields)

```yaml
agent:
  ngt:
    export_index_info_to_k8s: true
  readreplica:
    enabled: true
    minReplicas: 1 # if you don't use hpa, this will be the replicas of the Deployment
    maxReplicas: 3
    hpa:
      enabled: true # if you prefer to use hpa
      targetCPUUtilizationPercentage: 80
manager:
  index:
    operator:
      enabled: true
      rotation_job_concurrency: 2
```

1. Deploy Vald cluster

```bash
helm install vald vald/vald --values values.yaml
```

1. Deploy `vald-readreplica` with the same `values.yaml`

```bash
helm install vald-readreplica vald/vald-readreplica --values values.yaml
```

### When you deploy Vald cluster with `vald-helm-operator`

1. Edit `valdrelease.yaml` with the same fields as above

1. Deploy Vald cluster

```bash
helm install vald-helm-operator-release vald/vald-helm-operator
kubectl apply -f valdrelease.yaml
```

1. Deploy `vald-readreplica`

```bash
helm install vald-readreplica vald/vald-readreplica --values <YOUR VALUES YAML FILE PATH>
```

## Architecture

Read replica mainly consists of the following four parts.

<img src="/images/guides/read-replica-and-rotator/architecture.png" alt="Read Replica Architecture" />

### Read replica deployment

This is the Deployment that creates the Pods where the actual read replica processing takes place. Each read replica Pod accepts read (search) requests and reads the index from the read replica PVC.

### Read replica PVC

The read replica PVC is the volume from which read replica Pods read the index. It is generated from the latest snapshot of the regular agent's PVC. Unlike the agent PVC, it is created with the ReadOnlyMany (ROX) access mode, so multiple Pods can read it at the same time.

### Index operator

The operator handles the following processes:

1. Monitoring the time when the agent saved the index to the PVC and when the read replica performed index rotation
1. Generating [Read replica rotator](#read-replica-rotator) job when an index save occurs after the most recent rotation

> The Index operator also manages the timing of index create/save operations other than those mentioned above. Please refer to another document for details.
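
The rotation trigger described above can be sketched as a simple timestamp comparison. This is a minimal illustration under stated assumptions, not the operator's actual code: `needsRotation` and the `at` helper are hypothetical names, and how the operator obtains the two timestamps is not shown here.

```go
package main

import (
	"fmt"
	"time"
)

// at builds a timestamp from Unix seconds; it exists only to keep the
// example short.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// needsRotation reports whether an index save happened after the most
// recent read replica rotation, which is the condition described above
// for creating a rotator job.
func needsRotation(lastSaved, lastRotated time.Time) bool {
	return lastSaved.After(lastRotated)
}

func main() {
	// The agent saved the index after the last rotation: rotate.
	fmt.Println(needsRotation(at(200), at(100))) // true
	// The last rotation is newer than the last save: nothing to do.
	fmt.Println(needsRotation(at(100), at(200))) // false
}
```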

### Read replica rotator

The read replica rotator is a Kubernetes Job that handles the following processes:

1. Creating a snapshot from the agent's PVC
1. Generating a PVC for read replica from the snapshot
1. Rolling update of the read replica deployment to launch a group of read replica pods with the latest index
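
The ordering of the three steps above can be sketched as a small sequential pipeline. The `step` type, the `rotate` function, and the stop-at-first-failure behavior are assumptions made for illustration; the source does not describe how the real rotator handles errors, and the step bodies here are placeholders rather than Kubernetes API calls.

```go
package main

import (
	"errors"
	"fmt"
)

// errSnapshot is a placeholder error used to simulate a failed step.
var errSnapshot = errors.New("snapshot failed")

// step is one stage of the rotation; name is used for reporting only.
type step struct {
	name string
	run  func() error
}

// rotate runs the steps in order and stops at the first failure
// (an assumed policy), returning the names of the completed steps.
func rotate(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	steps := []step{
		{"create snapshot from agent PVC", ok},
		{"create read replica PVC from snapshot", ok},
		{"rolling update of read replica Deployment", ok},
	}
	done, err := rotate(steps)
	fmt.Println(len(done), err)

	// Simulate a failed snapshot: later steps are not attempted.
	steps[0].run = func() error { return errSnapshot }
	done, err = rotate(steps)
	fmt.Println(len(done), err)
}
```

Running the steps strictly in sequence reflects the dependency in the list: the PVC needs the snapshot, and the rolling update needs the PVC.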

## Important notes

- Result consistency is not guaranteed

There is a time lag between index insertion, agent save, and the completion of read replica rotation. During this time, there may be inconsistencies between the index in the agent itself and the index in the read replica.

- Sufficient infrastructure is required for QPS scaling

Even if read replicas are deployed, QPS will not scale if sufficient resources are not available in the Kubernetes cluster. Specifically, agent resources and read replica resources should be deployed on separate nodes. Vald sets `podAntiAffinity` to ensure that agent resources and read replica resources are deployed on separate nodes as much as possible.
6 changes: 5 additions & 1 deletion description.json
@@ -223,6 +223,10 @@
"sdks": {
"weight": 1400,
"description": "Sends request and receives response from the Vald cluster"
},
"read-replica-and-rotator": {
"weight": 1500,
"description": "How to improve search request speed"
}
}
}
}
2 changes: 1 addition & 1 deletion preview
Submodule preview updated 73 files
+1 −1 categories/index.html
+1 −1 docs/api/build_proto/index.html
+3 −3 docs/api/filter-gateway/index.html
+1 −1 docs/api/flush/index.html
+1 −1 docs/api/index.html
+1 −1 docs/api/insert/index.html
+1 −1 docs/api/mirror-gateway/index.html
+1 −1 docs/api/object/index.html
+3 −3 docs/api/remove/index.html
+4 −4 docs/api/search/index.html
+1 −1 docs/api/status/index.html
+1 −1 docs/api/update/index.html
+1 −1 docs/api/upsert/index.html
+1 −1 docs/contributing/coding-style/index.html
+1 −1 docs/contributing/contributing-guide/index.html
+1 −1 docs/contributing/development/index.html
+1 −1 docs/contributing/index.html
+1 −1 docs/contributing/unit-test-guideline/index.html
+1 −1 docs/index.html
+1 −1 docs/overview/about-vald/index.html
+1 −1 docs/overview/architecture/index.html
+1 −1 docs/overview/component/agent/index.html
+1 −1 docs/overview/component/discoverer/index.html
+1 −1 docs/overview/component/filter-gateway/index.html
+1 −1 docs/overview/component/index-manager/index.html
+1 −1 docs/overview/component/index.html
+1 −1 docs/overview/component/lb-gateway/index.html
+1 −1 docs/overview/component/mirror-gateway/index.html
+1 −1 docs/overview/data-flow/index.html
+1 −1 docs/overview/index.html
+1 −1 docs/performance/benchmark/index.html
+218 −14 docs/performance/continuous-benchmark/index.html
+1 −1 docs/performance/index.html
+1 −1 docs/performance/loadtest/index.html
+1 −1 docs/performance/tuning-search-performance/index.html
+1 −1 docs/release/changelog/index.html
+1 −1 docs/release/index.html
+1 −1 docs/support/contacts/index.html
+1 −1 docs/support/faq/index.html
+1 −1 docs/support/index.html
+1 −1 docs/troubleshooting/client-side/index.html
+1 −1 docs/troubleshooting/index.html
+1 −1 docs/troubleshooting/mirror-gateway/index.html
+1 −1 docs/troubleshooting/provisioning/index.html
+1 −1 docs/tutorial/get-started-with-faiss-agent/index.html
+4 −4 docs/tutorial/get-started/index.html
+1 −1 docs/tutorial/index.html
+1 −1 docs/tutorial/vald-agent-standalone-on-docker/index.html
+2 −2 docs/tutorial/vald-agent-standalone-on-k8s/index.html
+1 −1 docs/tutorial/vald-multicluster-on-k8s/index.html
+1 −1 docs/usecase/index.html
+1 −1 docs/usecase/usage-example/index.html
+1 −1 docs/user-guides/backup-configuration/index.html
+1 −1 docs/user-guides/capacity-planning/index.html
+1 −1 docs/user-guides/client-api-config/index.html
+1 −1 docs/user-guides/cluster-role-binding/index.html
+1 −1 docs/user-guides/configuration/index.html
+2 −2 docs/user-guides/deployment/index.html
+1 −1 docs/user-guides/filtering-configuration/index.html
+1 −1 docs/user-guides/index-correction/index.html
+1 −1 docs/user-guides/index.html
+1 −1 docs/user-guides/mirroring-configuration/index.html
+1 −1 docs/user-guides/network-policy/index.html
+1 −1 docs/user-guides/observability-configuration/index.html
+1 −1 docs/user-guides/operations/index.html
+31 −0 docs/user-guides/read-replica-and-rotator/index.html
+1 −1 docs/user-guides/sdks/index.html
+1 −1 docs/user-guides/upgrade-cluster/index.html
+ images/guides/read-replica-and-rotator/architecture.png
+ images/performance/benchmark-grafana.png
+1 −1 index.html
+1 −1 sitemap.xml
+1 −1 tags/index.html
Binary file added static/images/performance/benchmark-grafana.png