[Create-Workload Enhancements] Rearchitect Create-Workload Feature #609

Merged
merged 20 commits into from
Aug 9, 2024

Conversation

IanHoang
Collaborator

@IanHoang IanHoang commented Aug 5, 2024

This PR is the same as the original PR, plus changes based on feedback from Govind. It has been rebased onto the latest main so that it picks up the latest IT updates.
Original PR: #586

Description

This PR refactors create-workload to make it extensible and to improve the development experience. The core functionality and logic of create-workload remain the same, but the code structure has been reorganized.

I tested this with various indices and compared the results of the original version with the results of the refactored version to confirm that there are no breaking changes.

Since this is reorganizing the foundation, the PR is quite lengthy. Going forward, changes will be smaller and incremental.

Issues Resolved

First steps in bridging the gaps laid out in RFC #395. Specifically, this addresses the RFC's first step, "Redesign the Create Workload feature".
#587

Testing

  • End-to-end testing: created a workload from various indices in my personal cluster and ran the workloads afterwards
  • Compared the workloads produced by the original create-workload with those produced by the refactored create-workload and verified that they match
  • Restructured the unit tests to match the unit tests of other modules
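
The comparison between the original and refactored output could be automated along the following lines. This is only a minimal sketch, not the procedure actually used for this PR: the helper name `compare_workload_dirs` and the idea of comparing two output directories byte for byte are assumptions made for illustration.

```python
import filecmp
from pathlib import Path


def compare_workload_dirs(original: Path, refactored: Path) -> dict:
    """Compare two generated workload directories file by file.

    Hypothetical helper: `original` and `refactored` are assumed to be the
    --output-path directories produced by the two create-workload versions.
    """
    original_files = {p.relative_to(original) for p in original.rglob("*") if p.is_file()}
    refactored_files = {p.relative_to(refactored) for p in refactored.rglob("*") if p.is_file()}

    # Files present on both sides are compared by content, not just by metadata.
    differing = sorted(
        rel.as_posix()
        for rel in original_files & refactored_files
        if not filecmp.cmp(original / rel, refactored / rel, shallow=False)
    )
    return {
        "only_in_original": sorted(p.as_posix() for p in original_files - refactored_files),
        "only_in_refactored": sorted(p.as_posix() for p in refactored_files - original_files),
        "differing": differing,
    }
```

If all three lists in the result are empty, the two directories contain the same files with the same contents. A byte-for-byte check is stricter than necessary if the generated JSON differs only in whitespace or key order; in that case the files would need to be parsed before comparing.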

End to End Testing

Created a workload and ran a test with the workload

$ hoangia@3c22fbd0d988 opensearch-benchmark % opensearch-benchmark create-workload --workload=subscriber-profiles-test-v2-6  --indices=subscriber-profiles  --output-path=/Users/hoangia/Desktop/subscriber-profiles-demo/subscriber-profiles-test --target-hosts="https://asdbfjasdfasdjkf.com" --client-options="basic_auth_user:'asdflkjass',basic_auth_password:'dsfjkasdlf'"

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Connected to OpenSearch cluster [fa0152ec255f2264cd9dbb4bb7f74fdc] version [2.5.0].

Extracting documents for index [subscriber-profiles] ...     1000/1000 docs [100.0% done]
Extracting documents for index [subscriber-profiles]...       9000/9000 docs [100.0% done]

[INFO] Workload subscriber-profiles-test-v2-6 has been created. Run it with: opensearch-benchmark --workload-path=/Users/hoangia/Desktop/subscriber-profiles-demo/subscriber-profiles-test
$ hoangia@3c22fbd0d988 opensearch-benchmark % opensearch-benchmark execute-test --workload-path=/Users/hoangia/Desktop/subscriber-profiles-demo/subscriber-profiles-test --target-hosts="https://asdbfjasdfasdjkf.com" --client-options="basic_auth_user:'asdflkjass',basic_auth_password:'dsfjkasdlf'" --test-mode

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[INFO] [Test Execution ID]: dcfa886c-0825-4f2d-a2eb-df2986dae0bc
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Preparing file offset table for [/Users/hoangia/Desktop/test-big5-v2/subscriber-profiles-test-v2-5/subscriber-profiles-documents-1k.json] ... [OK]
[INFO] Executing test with workload [subscriber-profiles-test-v2-5], test_procedure [default-test-procedure] and provision_config_instance ['external'] with version [2.5.0].

[WARNING] merges_total_time is 3192538 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] merges_total_throttled_time is 1425171 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 7699105 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 959035 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 155017 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running cluster-health                                                         [100% done]
Running index-append                                                           [100% done]
Running refresh-after-index                                                    [100% done]
Running force-merge                                                            [100% done]
Running refresh-after-force-merge                                              [100% done]
Running wait-until-merges-finish                                               [100% done]
Running match-all                                                              [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                         Metric |                     Task |       Value |   Unit |
|---------------------------------------------------------------:|-------------------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |                          |     128.305 |    min |
|             Min cumulative indexing time across primary shards |                          |           0 |    min |
|          Median cumulative indexing time across primary shards |                          |  0.00131667 |    min |
|             Max cumulative indexing time across primary shards |                          |     92.7309 |    min |
|            Cumulative indexing throttle time of primary shards |                          |           0 |    min |
|    Min cumulative indexing throttle time across primary shards |                          |           0 |    min |
| Median cumulative indexing throttle time across primary shards |                          |           0 |    min |
|    Max cumulative indexing throttle time across primary shards |                          |           0 |    min |
|                        Cumulative merge time of primary shards |                          |      53.209 |    min |
|                       Cumulative merge count of primary shards |                          |          53 |        |
|                Min cumulative merge time across primary shards |                          |           0 |    min |
|             Median cumulative merge time across primary shards |                          |           0 |    min |
|                Max cumulative merge time across primary shards |                          |      52.735 |    min |
|               Cumulative merge throttle time of primary shards |                          |     23.7529 |    min |
|       Min cumulative merge throttle time across primary shards |                          |           0 |    min |
|    Median cumulative merge throttle time across primary shards |                          |           0 |    min |
|       Max cumulative merge throttle time across primary shards |                          |     23.7529 |    min |
|                      Cumulative refresh time of primary shards |                          |     15.9812 |    min |
|                     Cumulative refresh count of primary shards |                          |        1952 |        |
|              Min cumulative refresh time across primary shards |                          |           0 |    min |
|           Median cumulative refresh time across primary shards |                          | 0.000541667 |    min |
|              Max cumulative refresh time across primary shards |                          |     9.97653 |    min |
|                        Cumulative flush time of primary shards |                          |     2.58362 |    min |
|                       Cumulative flush count of primary shards |                          |          64 |        |
|                Min cumulative flush time across primary shards |                          |           0 |    min |
|             Median cumulative flush time across primary shards |                          |           0 |    min |
|                Max cumulative flush time across primary shards |                          |     0.66665 |    min |
|                                        Total Young Gen GC time |                          |           0 |      s |
|                                       Total Young Gen GC count |                          |           0 |        |
|                                          Total Old Gen GC time |                          |           0 |      s |
|                                         Total Old Gen GC count |                          |           0 |        |
|                                                     Store size |                          |     23.2084 |     GB |
|                                                  Translog size |                          | 4.45638e-06 |     GB |
|                                         Heap used for segments |                          |           0 |     MB |
|                                       Heap used for doc values |                          |           0 |     MB |
|                                            Heap used for terms |                          |           0 |     MB |
|                                            Heap used for norms |                          |           0 |     MB |
|                                           Heap used for points |                          |           0 |     MB |
|                                    Heap used for stored fields |                          |           0 |     MB |
|                                                  Segment count |                          |         225 |        |
|                                                 Min Throughput |             index-append |     2799.07 | docs/s |
|                                                Mean Throughput |             index-append |     2799.07 | docs/s |
|                                              Median Throughput |             index-append |     2799.07 | docs/s |
|                                                 Max Throughput |             index-append |     2799.07 | docs/s |
|                                        50th percentile latency |             index-append |     324.545 |     ms |
|                                       100th percentile latency |             index-append |     357.841 |     ms |
|                                   50th percentile service time |             index-append |     324.545 |     ms |
|                                  100th percentile service time |             index-append |     357.841 |     ms |
|                                                     error rate |             index-append |           0 |      % |
|                                                 Min Throughput | wait-until-merges-finish |        4.66 |  ops/s |
|                                                Mean Throughput | wait-until-merges-finish |        4.66 |  ops/s |
|                                              Median Throughput | wait-until-merges-finish |        4.66 |  ops/s |
|                                                 Max Throughput | wait-until-merges-finish |        4.66 |  ops/s |
|                                       100th percentile latency | wait-until-merges-finish |     181.306 |     ms |
|                                  100th percentile service time | wait-until-merges-finish |     181.306 |     ms |
|                                                     error rate | wait-until-merges-finish |           0 |      % |
|                                                 Min Throughput |                match-all |        4.73 |  ops/s |
|                                                Mean Throughput |                match-all |        4.73 |  ops/s |
|                                              Median Throughput |                match-all |        4.73 |  ops/s |
|                                                 Max Throughput |                match-all |        4.73 |  ops/s |
|                                       100th percentile latency |                match-all |     382.712 |     ms |
|                                  100th percentile service time |                match-all |     170.999 |     ms |
|                                                     error rate |                match-all |           0 |      % |


--------------------------------
[INFO] SUCCESS (took 11 seconds)
--------------------------------
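
To check a run like this one without reading the table by eye, the pipe-delimited summary can be parsed into rows. The following is a rough sketch that assumes the four-column layout shown above (Metric, Task, Value, Unit); the function names `parse_summary_table` and `tasks_with_errors` are made up for illustration and are not part of opensearch-benchmark.

```python
def parse_summary_table(text: str) -> list:
    """Parse a pipe-delimited summary table into a list of row dicts.

    Assumes four columns (Metric, Task, Value, Unit), as in the output above.
    Lines that are not table rows, the header, and the separator are skipped.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4:
            continue
        metric, task, value, unit = cells
        # Skip the header row and the |---:|---:| separator row.
        if metric == "Metric" or set(metric) <= set("-:"):
            continue
        try:
            number = float(value)
        except ValueError:
            continue  # not a numeric value; ignore the row
        rows.append({"metric": metric, "task": task, "value": number, "unit": unit})
    return rows


def tasks_with_errors(rows: list) -> list:
    """Return the names of tasks whose reported error rate is above zero."""
    return [r["task"] for r in rows if r["metric"] == "error rate" and r["value"] > 0]
```

For the table above, `tasks_with_errors` would return an empty list, since every task reports an error rate of 0.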

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.



Ian Hoang added 19 commits August 5, 2024 10:33
Signed-off-by: Ian Hoang <hoangia@amazon.com>
…ousCorpusExtractor

…more apt

@IanHoang IanHoang changed the title [Create-Workload] Rearchitect Create-Workload Feature [Create-Workload Enhancements] Rearchitect Create-Workload Feature Aug 5, 2024
Two review threads on osbenchmark/workload_generator/extractors.py (outdated, resolved)
…ment based on feedback

Signed-off-by: Ian Hoang <hoangia@amazon.com>
@IanHoang IanHoang merged commit e79599f into opensearch-project:main Aug 9, 2024
10 checks passed