Automated test framework can run scripts on launched clusters. Add offline stake operations test case and script. #8510

Merged
solana-grimes merged 10 commits into solana-labs:master from the offline_stake_ops branch
Mar 18, 2020

Conversation

danpaul000
Contributor

@danpaul000 danpaul000 commented Feb 27, 2020

Problem

  • Cluster framework needs greater flexibility as we add more flavors of test cases.
  • We don't have test/example scripts for the various staking operations, particularly the offline signing maneuvers.

Summary of Changes

  1. Refactor the automation framework so it can run arbitrary scripts on a configurable testnet after the cluster has launched. (system-test/stake-operations-testcases/offline_stake_colo.yml is an example of a testcase that calls a CUSTOM_SCRIPT after a successful cluster launch.)

  2. Distinguish between testcases that sleep for a fixed duration, run the partition looping logic, or run a script.

  3. Create test script (system-test/stake-operations-testcases/offline_stake_operations.sh) that points at a running cluster and uses a nonce account and the offline signing workflow to do all staking operations. This can be run from the automation framework or manually without any environment dependencies, other than having solana and solana-keygen in your PATH.

  4. Add a buildkite-readable testcase to wrap up the creation of a colo or GCE testnet and run the stake operations against it.

  5. Add a SKIP_PERF_RESULTS flag to buildkite testcases that skips printing TPS, confirmation time, and slot rate for non-performance cases (i.e., staking ops) where we don't care about those values. This reduces output clutter.
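
Items 1, 2, and 5 amount to a dispatch on test type once the cluster is up. The following is a minimal sketch of that control flow, not the code in this PR. `CUSTOM_SCRIPT`, `TEST_DURATION_SECONDS`, `TEST_TYPE`, and `SKIP_PERF_RESULTS` are names that appear in this conversation; the `TEST_TYPE` values and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the TEST_TYPE values and function names below are
# illustrative assumptions, not taken from the framework itself.

run_post_launch_step() {
  case "${TEST_TYPE:-}" in
    fixed_duration)
      # Let the cluster run for the configured number of seconds.
      sleep "${TEST_DURATION_SECONDS:-0}"
      ;;
    partition)
      # Placeholder for the partition looping logic.
      echo "partition loop would run here"
      ;;
    script)
      # Run the arbitrary script named by the testcase.
      if [ -z "${CUSTOM_SCRIPT:-}" ]; then
        echo "CUSTOM_SCRIPT must be set when TEST_TYPE=script" >&2
        return 1
      fi
      bash "$CUSTOM_SCRIPT"
      ;;
    *)
      echo "Unsupported TEST_TYPE: ${TEST_TYPE:-<unset>}" >&2
      return 1
      ;;
  esac
}

report_results() {
  # Non-performance cases (e.g. staking ops) can opt out of perf output.
  if [ "${SKIP_PERF_RESULTS:-false}" = "true" ]; then
    echo "Perf results skipped"
  else
    echo "TPS, confirmation time and slot rate would be printed here"
  fi
}
```

Failing fast on an unknown test type or a missing script keeps a misconfigured testcase from silently passing after an expensive cluster launch.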

Fixes #6194
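
Since item 3 says the script's only environment dependency is having solana and solana-keygen in your PATH, a manual run could start with a guard like the one below. This is a sketch under that assumption; `require_tools` is a hypothetical helper, not something the PR is known to define.

```shell
#!/usr/bin/env bash
# Sketch only: require_tools is a hypothetical helper name.

require_tools() {
  missing=0
  for tool in "$@"; do
    # command -v succeeds only if the tool resolves in the current PATH.
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "Required tool not found in PATH: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use at the top of a script:
#   require_tools solana solana-keygen || exit 1
```

Checking every tool before returning reports all missing dependencies in one pass rather than one per run.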

@danpaul000
Contributor Author

@garious I bashed this problem pretty hard. This could go in as written as a functional sanity test, but I think it could serve external parties better if broken up into some smaller sample scripts, or turned into how-to documentation. wdyt?

garious
garious previously approved these changes Feb 27, 2020
Contributor

@garious garious left a comment

Nice! Got a way to ensure this doesn't bitrot? If it executes super fast, it'd be good to see this in CI. Otherwise, part of a nightly run.

@stale

stale bot commented Mar 5, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Mar 5, 2020
@stale stale bot removed the stale label Mar 11, 2020
@mergify mergify bot dismissed garious’s stale review March 11, 2020 00:53

Pull request has been modified.

@danpaul000
Contributor Author

Depends on: #8780

Working this stake integration script into a nightly run as part of a larger test framework refactor.

@danpaul000 danpaul000 force-pushed the offline_stake_ops branch 6 times, most recently from 0964933 to 43e6151, March 12, 2020 00:48
@danpaul000 danpaul000 force-pushed the offline_stake_ops branch 5 times, most recently from eb3b131 to 6541003, March 17, 2020 16:21
@danpaul000 danpaul000 force-pushed the offline_stake_ops branch 2 times, most recently from 4666792 to 9eac83c, March 17, 2020 20:02
@danpaul000 danpaul000 force-pushed the offline_stake_ops branch 2 times, most recently from 30855b8 to b8ec80d, March 17, 2020 21:26
@danpaul000 danpaul000 changed the title from "Add cluster test script with offline stake operations" to "Automated test framework can run scripts on launched clusters. Add offline stake operations test case and script." Mar 17, 2020
@danpaul000
Contributor Author

Well this ballooned a little bit, but is in good shape now. FYI @mvines @t-nelson on summary point #1.

Contributor

@t-nelson t-nelson left a comment

Lookin' good! Just a couple Qs

system-test/automation_utils.sh (resolved conversation)
@@ -6,11 +6,12 @@ steps:
CLOUD_PROVIDER: "colo"
TESTNET_TAG: "colo-perf-cpu-only"
ENABLE_GPU: "false"
-TEST_DURATION_SECONDS: 30
+TEST_DURATION_SECONDS: 60
Contributor

Are the config changes besides the TEST_TYPE additions intentional or debug artifacts?

Contributor Author

Debug artifacts mostly. I just threw the "sanity-testcases" dir in there to run short tests that exercise the functionality of the testing framework, rather than the limits of the cluster. So I can point a buildkite job on a PR against this file rather than an expensive/long nightly-style testcase.

@danpaul000 danpaul000 added the automerge label Mar 18, 2020
@solana-grimes solana-grimes merged commit 90c9462 into solana-labs:master Mar 18, 2020
Labels
automerge Merge this Pull Request automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stake Redelegation Integration Test
4 participants