Adding new aoss vector search param fields target_throughput, repetitions and time_period #334

Merged: 3 commits merged on Jul 3, 2024


aseghehey
Contributor

Description

Describe what this change achieves.

As requested in issue 317, AOSS would like granular control over load during benchmarking: the targeted TPS for the workload and the duration of the load.

This PR adds the fields target_throughput, time_period, and repetitions to AOSS' vector search procedure. I tested the change locally and validated that it works.
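
As a rough illustration of the three new fields, the sketch below checks that any of them present in a params dictionary is a positive number. The function name and the validation rules are assumptions made for this example, not something the workload is known to enforce. Missing fields are skipped because the commit history mentions default values for repetitions and time_period.

```python
# Field names come from this PR; the rules applied to them are assumed.
LOAD_CONTROL_FIELDS = ("target_throughput", "time_period", "repetitions")


def check_load_control_params(params: dict) -> list:
    """Return a list of problems found; an empty list means nothing to report."""
    problems = []
    for name in LOAD_CONTROL_FIELDS:
        if name not in params:
            # Assumed to fall back to a workload default, so not an error here.
            continue
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number, got {value!r}")
        elif value <= 0:
            problems.append(f"{name} must be positive, got {value!r}")
    return problems


print(check_load_control_params(
    {"target_throughput": 15, "time_period": 1500, "repetitions": 2}
))  # prints []
```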

Local test details

Tested with the following params:

{
    "target_index_name": "target_index",
    "target_field_name": "target_field",
    "target_index_body": "indices/nmslib-index.json",
    "target_index_dimension": 768,
    "target_index_space_type": "innerproduct",
    "id_field_name": "id",

    "target_index_bulk_size": 100,
    "target_index_bulk_index_data_set_format": "hdf5",
    "target_index_bulk_index_data_set_corpus": "cohere-100k",
    "target_index_bulk_indexing_clients": 10,
    "repetitions": 2,

    "hnsw_ef_search": 256,
    "hnsw_ef_construction": 256,

    "query_k": 10,
    "query_body": {
         "docvalue_fields" : ["id"],
         "stored_fields" : "_none_"
    },

    "query_data_set_format": "hdf5",
    "query_data_set_corpus":"cohere-100k",
    "neighbors_data_set_corpus":"cohere-100k",
    "neighbors_data_set_format":"hdf5",
    "query_count": 10000,
    "target_throughput": 15,
    "time_period": 1500
}
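
A back-of-the-envelope estimate of the run time can be derived from these params. The sketch assumes that the search task issues query_count queries per repetition and that the client sustains exactly target_throughput. Neither assumption is confirmed by this PR, so treat the result as a sanity check only. The helper name is hypothetical.

```python
def estimated_duration_seconds(query_count: int, repetitions: int,
                               target_throughput: float) -> float:
    """Seconds needed to issue all queries at a fixed rate (assumed model)."""
    if target_throughput <= 0:
        raise ValueError("target_throughput must be positive")
    return query_count * repetitions / target_throughput


# Values taken from the params above.
estimate = estimated_duration_seconds(query_count=10000, repetitions=2,
                                      target_throughput=15)
time_period = 1500  # seconds, from the params above

print(round(estimate, 1))       # prints 1333.3
print(estimate <= time_period)  # prints True
```

Under these assumptions the estimate is about 1333 seconds, which is close to the 1347 seconds reported by the run in this description and below the 1500-second time_period.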

Ran command:

opensearch-benchmark execute-test \
--target-hosts=$ENDPOINT \
--pipeline=benchmark-only \
--workload=vectorsearch \
--workload-repository aoss/benchmarking/opensearch-benchmark-workloads \
--workload-params=aoss/benchmarking/opensearch-benchmark-workloads/vectorsearch/params/aoss/custom_params.json \
--client-options=timeout:120,amazon_aws_log_in:client_option,aws_access_key_id:$ACCESS_KEY,aws_secret_access_key:$SECRET_ACCESS_KEY,region:$region,service:aoss \
--distribution-version=2.1.1 \
--test-procedure=search-only \
--kill-running-processes

Output. The max throughput did not exceed 15 ops/s, and the test did not last longer than 1500 s:

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[INFO] [Test Execution ID]: f1adadb3-0e3d-4385-ac87-8d0c0bed6151
[INFO] Executing test with workload [vectorsearch], test_procedure [search-only] and provision_config_instance ['external'] with version [None].

Running prod-queries                                                           [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                          Metric |         Task |   Value |   Unit |
|--------------------------------:|-------------:|--------:|-------:|
|         Total Young Gen GC time |              |       0 |      s |
|        Total Young Gen GC count |              |       0 |        |
|           Total Old Gen GC time |              |       0 |      s |
|          Total Old Gen GC count |              |       0 |        |
|                  Min Throughput | prod-queries |   13.36 |  ops/s |
|                 Mean Throughput | prod-queries |   14.99 |  ops/s |
|               Median Throughput | prod-queries |      15 |  ops/s |
|                  Max Throughput | prod-queries |      15 |  ops/s |
|         50th percentile latency | prod-queries | 33.3555 |     ms |
|         90th percentile latency | prod-queries | 38.4205 |     ms |
|         99th percentile latency | prod-queries | 59.1448 |     ms |
|       99.9th percentile latency | prod-queries | 213.361 |     ms |
|      99.99th percentile latency | prod-queries | 278.935 |     ms |
|        100th percentile latency | prod-queries | 281.683 |     ms |
|    50th percentile service time | prod-queries | 31.2444 |     ms |
|    90th percentile service time | prod-queries | 36.1528 |     ms |
|    99th percentile service time | prod-queries | 47.7218 |     ms |
|  99.9th percentile service time | prod-queries | 105.628 |     ms |
| 99.99th percentile service time | prod-queries | 276.917 |     ms |
|   100th percentile service time | prod-queries | 279.344 |     ms |
|                      error rate | prod-queries |       0 |      % |


----------------------------------
[INFO] SUCCESS (took 1347 seconds)
----------------------------------
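
The two observations about this output (max throughput at or below the target, duration at or below time_period) can be checked mechanically. The sketch below is a minimal parser for the summary format shown above. The function names are hypothetical, and the sample text is a shortened copy of the output in this description.

```python
import re
from typing import Optional


def parse_results_table(text: str) -> dict:
    """Map (metric, task) to the numeric value of each data row."""
    results = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        metric, task, value, _unit = cells
        try:
            results[(metric, task)] = float(value)
        except ValueError:
            # Header and separator rows carry no numeric value; skip them.
            continue
    return results


def parse_duration_seconds(text: str) -> Optional[int]:
    """Extract N from a 'SUCCESS (took N seconds)' line, if present."""
    match = re.search(r"SUCCESS \(took (\d+) seconds\)", text)
    return int(match.group(1)) if match else None


SAMPLE = """\
|                          Metric |         Task |   Value |   Unit |
|--------------------------------:|-------------:|--------:|-------:|
|                  Min Throughput | prod-queries |   13.36 |  ops/s |
|                  Max Throughput | prod-queries |      15 |  ops/s |
[INFO] SUCCESS (took 1347 seconds)
"""

TARGET_THROUGHPUT = 15  # ops/s, from the params used in the local test
TIME_PERIOD = 1500      # seconds, from the params used in the local test

table = parse_results_table(SAMPLE)
max_tps = table[("Max Throughput", "prod-queries")]
duration = parse_duration_seconds(SAMPLE)

print(max_tps <= TARGET_THROUGHPUT)  # prints True
print(duration <= TIME_PERIOD)       # prints True
```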

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Emanuel Aseghehey added 2 commits July 2, 2024 14:31
…-only test procedure

Signed-off-by: Emanuel Aseghehey <aseghey@amazon.com>
…-only test procedure

Signed-off-by: Emanuel Aseghehey <aseghey@amazon.com>
Signed-off-by: Emanuel Aseghehey <emanuelaseghey9@gmail.com>
@VijayanB added the backport 2 (Backport to the "2" branch), backport 1, and backport 3 (Backport to the "3" branch) labels Jul 3, 2024
@VijayanB merged commit f4a830e into opensearch-project:main Jul 3, 2024
5 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 3, 2024
…titions` and `time_period` (#334)

* Adding target_throughput, repetitions and time_period to aoss' search-only test procedure

Signed-off-by: Emanuel Aseghehey <aseghey@amazon.com>

* Adding target_throughput, repetitions and time_period to aoss' search-only test procedure

Signed-off-by: Emanuel Aseghehey <aseghey@amazon.com>

* Adding default values to repetitions and time_period

Signed-off-by: Emanuel Aseghehey <emanuelaseghey9@gmail.com>

---------

Signed-off-by: Emanuel Aseghehey <aseghey@amazon.com>
Signed-off-by: Emanuel Aseghehey <emanuelaseghey9@gmail.com>
Co-authored-by: Emanuel Aseghehey <aseghey@amazon.com>
(cherry picked from commit f4a830e)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@opensearch-trigger-bot

The backport to 3 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-3 3
# Navigate to the new working tree
pushd ../.worktrees/backport-3
# Create a new branch
git switch --create backport/backport-334-to-3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f4a830eb31451e4df76e56016684641dc373793c
# Push it to GitHub
git push --set-upstream origin backport/backport-334-to-3
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-3

Then, create a pull request where the base branch is 3 and the compare/head branch is backport/backport-334-to-3.

VijayanB pushed a commit that referenced this pull request Jul 3, 2024
…titions` and `time_period` (#334) (#335)

* Adding target_throughput, repetitions and time_period to aoss' search-only test procedure

* Adding target_throughput, repetitions and time_period to aoss' search-only test procedure

* Adding default values to repetitions and time_period

---------

(cherry picked from commit f4a830e)

Signed-off-by: Emanuel Aseghehey <aseghey@amazon.com>
Signed-off-by: Emanuel Aseghehey <emanuelaseghey9@gmail.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Emanuel Aseghehey <aseghey@amazon.com>
harshavamsi pushed a commit to harshavamsi/opensearch-benchmark-workloads that referenced this pull request Jul 16, 2024
…titions` and `time_period` (opensearch-project#334)

Labels
backport 2 Backport to the "2" branch backport 3 Backport to the "3" branch
Development

Successfully merging this pull request may close these issues.

[FEATURE] VectorSearch - Ability to control target-throughput and time-period