OCPVE-353: chore: operf refactoring #451

Merged
1 commit merged on Oct 17, 2023

Conversation

jakobmoellerdev
Contributor

@jakobmoellerdev jakobmoellerdev commented Oct 16, 2023

Performance test refactoring to prepare for running in CI.

Supports two modes: long-term observation for idle tests against a cluster, or stress tests that automatically create pods/PVCs in the cluster and observe that window.

Collected aggregates:
The 99th, 95th, and 90th percentiles of CPU and memory usage, taken over either the stress period or, if run-stress=false, the duration given by the long-term-observation-window flag.

Writes a quantile report as a .toml file to --output-directory, or to the current working directory if the flag is not set.

go run ./test/performance --help
This script retrieves cpu and memory usage metrics for all the workloads created by the logical volume manager storage (lvms).

It creates <instances> of "busybox" pods using PVCs provisioned via the Storage class provided, when in stress test mode, or only collects metrics in idle.

A report is written as "metrics-<start_unix_ts>-<end_unix_ts>.toml" in <output-directory> or current working directory if not set.

Quantile Units in report:
CPU: mcores
Memory: MiB

Usage:
  go run ./test/performance [flags]

Examples:
Stress: go run ./test/performance -t $(oc whoami -t) -s lvms-vg1 -i 64
Idle: go run ./test/performance -t $(oc whoami -t) --run-stress=false --long-term-observation-window=5m

Flags:
  -h, --help                                    help for performance
  -i, --instances int                           Number of Workloads/Pvcs (each Workload uses a different PVC) to create in the test (default 4)
  -w, --long-term-observation-window duration   only used when not running stress test, defines observation windows as duration into the past, e.g. 5m means the last 5 minutes (default 5m0s)
  -n, --namespace string                        Namespace where operator is deployed and the PVCs and test pods will be deployed/undeployed (default "openshift-storage")
  -o, --output-directory string                 output directory for the metrics report, working directry by default
  -p, --pattern string                          Pattern used to build the PVCs/Workloads names (default "operf")
  -r, --run-stress                              defines if stress tests should be used, if false uses long-term observation and does not attempt to create stress resources (default true)
  -s, --storage-class string                    Name of the topolvm storage class that will be used in the PVCs (default "lvms-vg1")
  -t, --token string                            authentication token needed to connect with the Openshift cluster.

How to test:

  1. Create a cluster and log in via user/pass (ideally kubeadmin).
  2. Run one of the Makefile commands, or alternatively create the LVMCluster manually and then run the tool manually.
  3. Inspect the findings in the .toml files written to the working directory.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 16, 2023

openshift-ci-robot commented Oct 16, 2023

@jakobmoellerdev: This pull request references OCPVE-353 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 16, 2023
@codecov-commenter

Codecov Report

Merging #451 (d6d45f0) into main (58e7c6f) will decrease coverage by 2.53%.
Report is 16 commits behind head on main.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #451      +/-   ##
==========================================
- Coverage   59.41%   56.88%   -2.53%     
==========================================
  Files          21       22       +1     
  Lines        1614     1735     +121     
==========================================
+ Hits          959      987      +28     
- Misses        539      626      +87     
- Partials      116      122       +6     

see 8 files with indirect coverage changes

Contributor

openshift-ci bot commented Oct 16, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jakobmoellerdev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 16, 2023
Contributor

@suleymanakbas91 suleymanakbas91 left a comment

Looking great in general! It just needs a bit of documentation.

Makefile (resolved review thread)
Signed-off-by: Jakob Möller <jmoller@redhat.com>
Contributor

openshift-ci bot commented Oct 17, 2023

@jakobmoellerdev: all tests passed!

Full PR test history. Your PR dashboard.


@suleymanakbas91
Contributor

Great work!
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2023
@openshift-ci openshift-ci bot merged commit 3f13cc4 into openshift:main Oct 17, 2023
7 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
4 participants