Merge pull request #10464 from EOSIO/zach-stability-testing

Support Stability Testing for Tests + CI Bug Fixes
EOSIO · Jun 28, 2021 · 90919fd
2 parents 3a5afb9 + 98fb612
commit 90919fd
Showing 5 changed files with 127 additions and 21 deletions.
diff --git a/.cicd/README.md b/.cicd/README.md
@@ -94,12 +94,14 @@ Pipeline | Details
 [eosio-lrt](https://buildkite.com/EOSIO/eosio-lrt) | runs tests that need more time on merge commits
 [eosio-resume-from-state](https://buildkite.com/EOSIO/eosio-resume-from-state) | loads the current version of `nodeos` from state files generated by specific previous versions of `nodeos` in each [eosio](https://buildkite.com/EOSIO/eosio) build ([Documentation](https://github.com/EOSIO/auto-eks-sync-nodes/blob/master/pipelines/eosio-resume-from-state/README.md))
 [eosio-sync-from-genesis](https://buildkite.com/EOSIO/eosio-sync-from-genesis) | sync the current version of `nodeos` past genesis from peers on common public chains as a smoke test, for each [eosio](https://buildkite.com/EOSIO/eosio) build
+[eosio-test-stability](https://buildkite.com/EOSIO/eosio-test-stability) | prove or disprove test stability by running a test thousands of times
 
 ## See Also
 - Buildkite
   - [DevDocs](https://github.com/EOSIO/devdocs/wiki/Buildkite)
   - [eosio-resume-from-state Documentation](https://github.com/EOSIO/auto-eks-sync-nodes/blob/master/pipelines/eosio-resume-from-state/README.md)
   - [Run Your First Build](https://buildkite.com/docs/tutorials/getting-started#run-your-first-build)
+  - [Stability Testing](https://github.com/EOSIO/eos/blob/HEAD/.cicd/eosio-test-stability.md)
 - [#help-automation](https://blockone.slack.com/archives/CMTAZ9L4D) Slack Channel
 
 </details>
diff --git a/.cicd/eosio-test-stability.md b/.cicd/eosio-test-stability.md
@@ -0,0 +1,81 @@
+# Stability Testing
+Stability testing of EOSIO unit and integration tests is done in the [eosio-test-stability](https://buildkite.com/EOSIO/eosio-test-stability) pipeline. It will take thousands of runs of any given test to identify it as "stable" or "unstable". Runs should be split evenly across "pinned" (fixed dependency version) and "unpinned" (default dependency version) builds because, sometimes, test instability is only expressed in one of these environments. Finally, stability testing should be performed on the Linux fleet first because this fleet is effectively infinite. Once stability is demonstrated on Linux, testing can be performed on the finite macOS Anka fleet.
+
+<details>
+<summary>See More</summary>
+
+## Index
+1. [Configuration](eosio-test-stability.md#configuration)
+   1. [Variables](eosio-test-stability.md#variables)
+   1. [Runs](eosio-test-stability.md#runs)
+   1. [Examples](eosio-test-stability.md#examples)
+1. [See Also](eosio-test-stability.md#see-also)
+
+## Configuration
+The [eosio-test-stability](https://buildkite.com/EOSIO/eosio-test-stability) pipeline uses the same pipeline upload script as [eosio](https://buildkite.com/EOSIO/eosio), [eosio-build-unpinned](https://buildkite.com/EOSIO/eosio-build-unpinned), and [eosio-lrt](https://buildkite.com/EOSIO/eosio-lrt), so all variables from the [pipeline documentation](README.md) apply.
+
+### Variables
+There are six primary environment variables relevant to stability testing:
+```bash
+PINNED='true|false'    # whether to perform the test with pinned dependencies, or default dependencies
+ROUNDS='ℕ'             # natural number defining the number of gated rounds of tests to generate
+ROUND_SIZE='ℕ'         # number of test steps to generate per operating system, per round
+SKIP_MAC='true|false'  # conserve finite macOS Anka agents by excluding them from your testing
+TEST='name'            # PCRE expression defining the tests to run, preceded by '^' and followed by '$'
+TIMEOUT='ℕ'            # set timeout in minutes for all Buildkite steps
+```
+The `TEST` variable is parsed as a [Perl-compatible regular expression](https://www.debuggex.com/cheatsheet/regex/pcre) where the expression in `TEST` is preceded by `^` and followed by `$`. To specify one test, set `TEST` equal to the test name (e.g. `TEST='read_only_query'`). Specify two tests as `TEST='(nodeos_short_fork_take_over_lr_test|read_only_query)'`. Or, perhaps, you want all of the `restart_scenarios` tests. Then, you could define `TEST='restart-scenario-test-.*'` and Buildkite will generate `ROUND_SIZE` steps each round for each operating system for all three restart scenarios tests.
+
+### Runs
+The number of total test runs will be:
+```bash
+RUNS = ROUNDS * ROUND_SIZE * OS_COUNT * TEST_COUNT # where:
+OS_COUNT   = 'ℕ' # the number of supported operating systems
+TEST_COUNT = 'ℕ' # the number of tests matching the PCRE filter in TEST
+```
+
+### Examples
+We recommend stability testing one test per build with two builds per test, on Linux at first. Kick off one pinned build on Linux...
+```bash
+PINNED='true'
+ROUNDS='42'
+ROUND_SIZE='5'
+SKIP_MAC='true'
+TEST='read_only_query'
+```
+...and one unpinned build on Linux:
+```bash
+PINNED='false'
+ROUNDS='42'
+ROUND_SIZE='5'
+SKIP_MAC='true'
+TEST='read_only_query'
+```
+Once the Linux runs have proven stable, and if instability was observed on macOS, kick off two equivalent builds on macOS instead of Linux. One pinned build on macOS...
+```bash
+PINNED='true'
+ROUNDS='42'
+ROUND_SIZE='5'
+SKIP_LINUX='true'
+SKIP_MAC='false'
+TEST='read_only_query'
+```
+...and one unpinned build on macOS:
+```bash
+PINNED='false'
+ROUNDS='42'
+ROUND_SIZE='5'
+SKIP_LINUX='true'
+SKIP_MAC='false'
+TEST='read_only_query'
+```
+If these runs are against `eos:develop` and `develop` has five supported operating systems, this pattern would consist of 2,100 runs per test across all four builds. If the runs are against `eos:release/2.1.x` which, at the time of this writing, supports eight operating systems, this pattern would consist of 3,360 runs per test across all four builds. This gives you and your team strong confidence that any test instability occurs less than 1% of the time.
+
+## See Also
+- Buildkite
+  - [DevDocs](https://github.com/EOSIO/devdocs/wiki/Buildkite)
+  - [EOSIO Pipelines](https://github.com/EOSIO/eos/blob/HEAD/.cicd/README.md)
+  - [Run Your First Build](https://buildkite.com/docs/tutorials/getting-started#run-your-first-build)
+- [#help-automation](https://blockone.slack.com/archives/CMTAZ9L4D) Slack Channel
+
+</details>
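Reviewer note: the run totals quoted in the new document can be checked with a few lines of shell arithmetic. This is a minimal sketch, not part of the commit. `total_runs` is a hypothetical helper that evaluates the `RUNS` formula from the document. It assumes each supported operating system is exercised by exactly one pinned and one unpinned build across the four-build pattern, which is why the single-pass count is doubled.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): evaluates
# RUNS = ROUNDS * ROUND_SIZE * OS_COUNT * TEST_COUNT
total_runs() {
    local rounds="$1" round_size="$2" os_count="$3" test_count="$4"
    echo $(( rounds * round_size * os_count * test_count ))
}

# Assumption: every supported OS appears in one pinned and one unpinned build,
# so the four-build pattern is twice the count for a single pass over all OSes.
develop=$(( 2 * $(total_runs 42 5 5 1) ))   # five supported operating systems
release=$(( 2 * $(total_runs 42 5 8 1) ))   # eight supported operating systems

echo "eos:develop       -> $develop runs per test"   # 2100
echo "eos:release/2.1.x -> $release runs per test"   # 3360
```

The totals match the 2,100 and 3,360 figures in the document, regardless of how the operating systems are split between the Linux and macOS fleets.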
diff --git a/.cicd/generate-pipeline.sh b/.cicd/generate-pipeline.sh
@@ -5,15 +5,36 @@ set -eo pipefail
 export MOJAVE_ANKA_TAG_BASE=${MOJAVE_ANKA_TAG_BASE:-'clean::cicd::git-ssh::nas::brew::buildkite-agent'}
 export MOJAVE_ANKA_TEMPLATE_NAME=${MOJAVE_ANKA_TEMPLATE_NAME:-'10.14.6_6C_14G_80G'}
 export PLATFORMS_JSON_ARRAY='[]'
+[[ -z "$ROUNDS" ]] && export ROUNDS='1'
+[[ -z "$ROUND_SIZE" ]] && export ROUND_SIZE='1'
 BUILDKITE_BUILD_AGENT_QUEUE='automation-eks-eos-builder-fleet'
 BUILDKITE_TEST_AGENT_QUEUE='automation-eks-eos-tester-fleet'
-[[ -z "$ROUNDS" ]] && export ROUNDS='1'
 # attach pipeline documentation
-export DOCS_URL="https://github.com/EOSIO/eos/blob/${BUILDKITE_COMMIT:-master}/.cicd/README.md"
-export RETRY="$(buildkite-agent meta-data get pipeline-upload-retries --default '0')"
+export DOCS_URL="https://github.com/EOSIO/eos/blob/$(git rev-parse HEAD)/.cicd"
+export RETRY="$([[ "$BUILDKITE" == 'true' ]] && buildkite-agent meta-data get pipeline-upload-retries --default '0' || echo "${RETRY:-0}")"
 if [[ "$BUILDKITE" == 'true' && "$RETRY" == '0' ]]; then
-    echo "This documentation is also available on [GitHub]($DOCS_URL)." | buildkite-agent annotate --append --style 'info' --context 'documentation'
+    echo "This documentation is also available on [GitHub]($DOCS_URL/README.md)." | buildkite-agent annotate --append --style 'info' --context 'documentation'
     cat .cicd/README.md | buildkite-agent annotate --append --style 'info' --context 'documentation'
+    if [[ "$BUILDKITE_PIPELINE_SLUG" == 'eosio-test-stability' ]]; then
+        echo "This documentation is also available on [GitHub]($DOCS_URL/eosio-test-stability.md)." | buildkite-agent annotate --append --style 'info' --context 'test-stability'
+        cat .cicd/eosio-test-stability.md | buildkite-agent annotate --append --style 'info' --context 'test-stability'
+    fi
+fi
+[[ "$BUILDKITE" == 'true' ]] && buildkite-agent meta-data set pipeline-upload-retries "$(( $RETRY + 1 ))"
+# guard against accidentally spawning too many jobs
+if (( $ROUNDS > 1 || $ROUND_SIZE > 1 )) && [[ -z "$TEST" ]]; then
+    echo '+++ :no_entry: WARNING: Your parameters will spawn a very large number of jobs!' 1>&2
+    echo "Setting ROUNDS='$ROUNDS' and/or ROUND_SIZE='$ROUND_SIZE' in the environment without also setting TEST to a specific test will cause ALL tests to be run $(( $ROUNDS * $ROUND_SIZE )) times, which will consume a large number of agents! We recommend doing stability testing on ONE test at a time. If you're certain you want to do this, set TEST='.*' to run all tests $(( $ROUNDS * $ROUND_SIZE )) times." 1>&2
+    [[ "$BUILDKITE" == 'true' ]] && cat | buildkite-agent annotate --append --style 'error' --context 'no-TEST' <<-MD
+    Your build was cancelled because you set \`ROUNDS\` and/or \`ROUND_SIZE\` without also setting \`TEST\` in your build environment. This would cause each test to be run $(( $ROUNDS * $ROUND_SIZE )) times, which will consume a lot of Buildkite agents.
+
+    We recommend stability testing one test at a time by setting \`TEST\` equal to the test name. Alternatively, you can specify a set of test names using [Perl-Compatible Regular Expressions](https://www.debuggex.com/cheatsheet/regex/pcre), where \`TEST\` is parsed as \`^${TEST}$\`.
+
+    If you _really_ meant to run every test $(( $ROUNDS * $ROUND_SIZE )) times, set \`TEST='.*'\`.
+MD
+    exit 255
+elif [[ "$TEST" = '.*' ]]; then # if they want to run every test, just spawn the jobs like normal
+    unset TEST
 fi
 # Determine if it's a forked PR and make sure to add git fetch so we don't have to git clone the forked repo's url
 if [[ $BUILDKITE_BRANCH =~ ^pull/[0-9]+/head: ]]; then
@@ -170,7 +191,7 @@ EOF
 EOF
     fi
 done
-cat <<EOF
+[[ -z "$TEST" ]] && cat <<EOF
 
   - label: ":docker: Docker - Build and Install"
     command: "./.cicd/installation-build.sh"
@@ -193,7 +214,7 @@ if [[ "$DCMAKE_BUILD_TYPE" != 'Debug' ]]; then
         echo "    # round $ROUND of $ROUNDS"
         # parallel tests
         echo '    # parallel tests'
-        echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
+        [[ -z "$TEST" ]] && echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
             if [[ ! "$(echo "$PLATFORM_JSON" | jq -r .FILE_NAME)" =~ 'macos' ]]; then
                 cat <<EOF
   - label: "$(echo "$PLATFORM_JSON" | jq -r .ICON) $(echo "$PLATFORM_JSON" | jq -r .PLATFORM_NAME_FULL) - Unit Tests"
@@ -247,11 +268,10 @@ EOF
 
 EOF
             fi
-        echo
         done
         # wasm spec tests
         echo '    # wasm spec tests'
-        echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
+        [[ -z "$TEST" ]] && echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
             if [[ ! "$(echo "$PLATFORM_JSON" | jq -r .FILE_NAME)" =~ 'macos' ]]; then
                 cat <<EOF
   - label: "$(echo "$PLATFORM_JSON" | jq -r .ICON) $(echo "$PLATFORM_JSON" | jq -r .PLATFORM_NAME_FULL) - WASM Spec Tests"
@@ -302,13 +322,16 @@ EOF
 
 EOF
             fi
-        echo
         done
         # serial tests
         echo '    # serial tests'
         echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
             IFS=$oIFS
-            SERIAL_TESTS="$(cat tests/CMakeLists.txt | grep nonparallelizable_tests | grep -v "^#" | awk -F" " '{ print $2 }')"
+            if [[ -z "$TEST" ]]; then
+                SERIAL_TESTS="$(cat tests/CMakeLists.txt | grep nonparallelizable_tests | grep -v "^#" | awk -F ' ' '{ print $2 }' | sort | uniq)"
+            else
+                SERIAL_TESTS="$(cat tests/CMakeLists.txt | grep -v "^#" | awk -F ' ' '{ print $2 }' | sort | uniq | grep -P "^$TEST$" | awk "{while(i++<$ROUND_SIZE)print;i=0}")"
+            fi
             for TEST_NAME in $SERIAL_TESTS; do
                 if [[ ! "$(echo "$PLATFORM_JSON" | jq -r .FILE_NAME)" =~ 'macos' ]]; then
                     cat <<EOF
@@ -360,15 +383,14 @@ EOF
 
 EOF
                 fi
-                echo
             done
             IFS=$nIFS
         done
         # long-running tests
         echo '    # long-running tests'
-        echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
+        [[ -z "$TEST" ]] && echo $PLATFORMS_JSON_ARRAY | jq -cr '.[]' | while read -r PLATFORM_JSON; do
             IFS=$oIFS
-            LR_TESTS="$(cat tests/CMakeLists.txt | grep long_running_tests | grep -v "^#" | awk -F" " '{ print $2 }')"
+            LR_TESTS="$(cat tests/CMakeLists.txt | grep long_running_tests | grep -v "^#" | awk -F" " '{ print $2 }' | sort | uniq)"
             for TEST_NAME in $LR_TESTS; do
                 if [[ ! "$(echo "$PLATFORM_JSON" | jq -r .FILE_NAME)" =~ 'macos' ]]; then
                     cat <<EOF
@@ -420,7 +442,6 @@ EOF
 
 EOF
                 fi
-                echo
             done
             IFS=$nIFS
         done
@@ -431,7 +452,7 @@ EOF
         fi
     done
     # Execute multiversion test
-    if [[ ! "$PINNED" == 'false' || "$SKIP_MULTIVERSION_TEST" == 'false' ]]; then
+    if [[ -z "$TEST" && ( ! "$PINNED" == 'false' || "$SKIP_MULTIVERSION_TEST" == 'false' ) ]]; then
         cat <<EOF
   - label: ":pipeline: Multiversion Test"
     command: 
@@ -448,7 +469,7 @@ EOF
 EOF
     fi
     # trigger eosio-lrt post pr
-    if [[ -z $BUILDKITE_TRIGGERED_FROM_BUILD_ID && $TRIGGER_JOB == "true" ]]; then
+    if [[ -z "$TEST" && -z $BUILDKITE_TRIGGERED_FROM_BUILD_ID && $TRIGGER_JOB == "true" ]]; then
         if ( [[ ! $PINNED == false ]] ); then
             cat <<EOF
   - label: ":pipeline: Trigger Long Running Tests"
@@ -471,7 +492,7 @@ EOF
         fi
     fi
     # trigger eosio-sync-from-genesis for every build
-    if [[ "$BUILDKITE_PIPELINE_SLUG" == 'eosio' && -z "${SKIP_INSTALL}${SKIP_LINUX}${SKIP_DOCKER}${SKIP_SYNC_TESTS}" ]]; then
+    if [[ -z "$TEST" && "$BUILDKITE_PIPELINE_SLUG" == 'eosio' && -z "${SKIP_INSTALL}${SKIP_LINUX}${SKIP_DOCKER}${SKIP_SYNC_TESTS}" ]]; then
         cat <<EOF
   - label: ":chains: Sync from Genesis Test"
     trigger: "eosio-sync-from-genesis"
@@ -491,7 +512,7 @@ EOF
 EOF
     fi
     # trigger eosio-resume-from-state for every build
-    if [[ "$BUILDKITE_PIPELINE_SLUG" == 'eosio' && -z "${SKIP_INSTALL}${SKIP_LINUX}${SKIP_DOCKER}${SKIP_SYNC_TESTS}" ]]; then
+    if [[ -z "$TEST" && "$BUILDKITE_PIPELINE_SLUG" == 'eosio' && -z "${SKIP_INSTALL}${SKIP_LINUX}${SKIP_DOCKER}${SKIP_SYNC_TESTS}" ]]; then
         cat <<EOF
   - label: ":outbox_tray: Resume from State Test"
     trigger: "eosio-resume-from-state"
@@ -527,6 +548,8 @@ cat <<EOF
     timeout: ${TIMEOUT:-10}
     soft_fail: true
 
+EOF
+[[ -z "$TEST" ]] && cat <<EOF
   - wait
 
     # packaging
@@ -677,7 +700,7 @@ cat <<EOF
       buildkite-agent artifact upload eosio.rb
     agents:
       queue: "automation-basic-builder-fleet"
-    timeout: "${TIMEOUT:-5}"
+    timeout: ${TIMEOUT:-5}
     skip: ${SKIP_PACKAGE_BUILDER}${SKIP_MAC}${SKIP_MACOS_10_14}
 
   - label: ":docker: :ubuntu: Docker - Build 18.04 Docker Image"
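Reviewer note: the new `SERIAL_TESTS` branch filters test names with the anchored `TEST` expression and then repeats each match `ROUND_SIZE` times. The sketch below isolates that behavior so it can be tried locally. It is an illustration under stated assumptions, not the script's code: `expand_tests` is a hypothetical helper, the sample test names are assumed rather than read from `tests/CMakeLists.txt`, and it uses `grep -E` instead of the script's `grep -P` so that it also runs where PCRE support is unavailable (the patterns shown behave the same in both).

```bash
#!/usr/bin/env bash
# Hypothetical helper: read candidate test names on stdin (one per line),
# keep those matching ^TEST$, and print each match ROUND_SIZE times.
expand_tests() {
    local test_regex="$1" round_size="$2"
    sort -u \
        | grep -E "^${test_regex}\$" \
        | awk -v n="$round_size" '{ for (i = 0; i < n; i++) print }'
}

# Assumed sample names; the real list is derived from tests/CMakeLists.txt.
names='read_only_query
restart-scenario-test-resync
restart-scenario-test-hard_replay
restart-scenario-test-none
nodeos_short_fork_take_over_lr_test'

# One test, two steps per round
printf '%s\n' "$names" | expand_tests 'read_only_query' 2
# Every restart scenario test, one step per round
printf '%s\n' "$names" | expand_tests 'restart-scenario-test-.*' 1
# Anchoring means a partial name matches nothing
printf '%s\n' "$names" | expand_tests 'read_only' 5
```

Because the expression is wrapped in `^` and `$`, a prefix such as `read_only` selects no tests; a trailing `.*` is needed to match a family of names.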

diff --git a/scripts/long-running-test.sh b/scripts/long-running-test.sh
@@ -3,7 +3,7 @@ set -eo pipefail
 # variables
 echo "--- $([[ "$BUILDKITE" == 'true' ]] && echo ':evergreen_tree: ')Configuring Environment"
 GIT_ROOT="$(dirname $BASH_SOURCE[0])/.."
-[[ -z "$TEST" ]] && export TEST="$1"
+[[ -z "$1" ]] || export TEST="$1"
 if [[ "$(uname)" == 'Linux' ]]; then
     . /etc/os-release
     if [[ "$ID" == 'centos' ]]; then

diff --git a/scripts/serial-test.sh b/scripts/serial-test.sh
@@ -3,7 +3,7 @@ set -eo pipefail
 # variables
 echo "--- $([[ "$BUILDKITE" == 'true' ]] && echo ':evergreen_tree: ')Configuring Environment"
 GIT_ROOT="$(dirname $BASH_SOURCE[0])/.."
-[[ -z "$TEST" ]] && export TEST="$1"
+[[ -z "$1" ]] || export TEST="$1"
 if [[ "$(uname)" == 'Linux' ]]; then
     . /etc/os-release
     if [[ "$ID" == 'centos' ]]; then
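Reviewer note: the one-line change in `scripts/long-running-test.sh` and `scripts/serial-test.sh` reverses the precedence between the `TEST` environment variable and the first positional argument. The sketch below contrasts the two forms. `resolve_old` and `resolve_new` are hypothetical wrappers that run each line in a subshell and print the resulting `TEST`. The motivation is an assumption on my part: presumably a generated step passes a concrete test name as `$1` while the build environment still carries the `TEST` expression, and the argument should win.

```bash
#!/usr/bin/env bash
# Old form: the environment wins; $1 is used only when TEST is empty or unset.
resolve_old() ( [[ -z "$TEST" ]] && export TEST="$1"; echo "$TEST" )
# New form: $1 wins whenever it is non-empty; otherwise TEST is left untouched.
resolve_new() ( [[ -z "$1" ]] || export TEST="$1"; echo "$TEST" )

# TEST set in the environment and a different name passed as the argument
TEST='(nodeos_short_fork_take_over_lr_test|read_only_query)' resolve_old 'read_only_query'
TEST='(nodeos_short_fork_take_over_lr_test|read_only_query)' resolve_new 'read_only_query'

# No argument given: the new form keeps whatever the environment provides
TEST='read_only_query' resolve_new ''
```

With the old form the first call prints the whole expression, so the test script would receive a pattern rather than a single test name; with the new form it prints `read_only_query`.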