
Internal: Improve use case testing #685

Closed
8 of 21 tasks
georgemccabe opened this issue Nov 3, 2020 · 7 comments · Fixed by #1692 or #1992
Assignees
Labels
component: CI/CD (Continuous integration and deployment issues), component: testing (Software testing issue), priority: blocker (Blocker), requestor: NCAR (National Center for Atmospheric Research), type: new feature (Make it do something new)
Milestone

Comments

@georgemccabe
Collaborator

We have discussed a few ideas to improve the use case tests, which are very time consuming to run. We should write a testing strategy document that outlines what improvements can be made and what we want to have implemented. Once we have a solid plan, create sub-issues.

Describe the New Feature

Initial brainstorming ideas include:

  • Testing that the commands and environments created by METplus are ‘correct’ without running the MET tools, so these tests will run extremely quickly
  • Generating a set of empty files that mimic the data required to run use cases so that the commands can be built (see the sketch after this list)
  • In Travis, only running the actual use cases for certain branches and/or pull requests. For other builds, generating commands using the dummy data to quickly see if the commands are generated correctly.
  • Creating Docker data volume(s) to store the dummy files, along with text files containing the expected commands and associated environment variables, to compare against the results. Set up a process to update the "reference" output when intended differences are found, similar to how MET performs regression testing.
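
As a rough sketch of the empty-file idea above (the directory structure and file names here are invented purely for illustration; the real list would come from each use case's configuration), the placeholder data could be generated with something like:

from pathlib import Path

# Hypothetical list of relative paths that mimic the input data a use case expects
DUMMY_FILES = [
    "model_applications/example/fcst_2020110312.nc",
    "model_applications/example/obs_2020110312.nc",
]

def create_dummy_input(top_dir):
    """Create empty placeholder files so commands can be built without real data."""
    for rel_path in DUMMY_FILES:
        path = Path(top_dir) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()  # a zero-byte file stands in for the real input file

create_dummy_input("/tmp/metplus_dummy_data")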

Acceptance Testing

List input data types and sources.
Describe tests required for new functionality.

Time Estimate

Many hours - sub-issues are needed

Sub-Issues

Consider breaking the new feature down into sub-issues.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

None

Funding Source

None

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones or add "alert:NEED PROJECT ASSIGNMENT" label
  • Select milestone to next major version milestone or "Future Versions"

Define Related Issue(s)

The testing strategy document could be used as a guide to set up testing for other projects.
Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s), Project(s), Milestone, and Linked issues
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@georgemccabe added the component: testing, priority: blocker, type: new feature, requestor: NCAR, and component: CI/CD labels on Nov 3, 2020
@georgemccabe added this to the METplus-4.0 milestone on Nov 3, 2020
@georgemccabe
Collaborator Author

It would be overwhelming to have a use case in the repository for every possible configuration of the MET tools, so we should consider creating tests that cover more cases than the use cases do. For example, we could ensure that the wrappers are able to generate the commands run by the MET unit tests: https://github.com/dtcenter/MET/blob/main_v9.1/test/xml/unit_python.xml

@georgemccabe
Collaborator Author

Update on this issue:

  • Logic was added to the automation to only run new use cases on push events unless a ci keyword is added to the last commit message. Use case groups can be marked as "NEW" and will run for all push events. Pull requests still run all of the use cases and difference tests.
  • Many wrappers now have pytests that do not run MET but build the commands and check that the config settings produce the expected commands and set the appropriate environment variables (see the sketch after this list). There are many iterations of config settings, so these new tests have added a lot of time to the pytest automation job. We may want to consider breaking up the pytest runs so that they can run in parallel. Currently we run all pytests found in the internal_tests/pytests directory in a single run.
  • Implementing sub-issue Improve approach to obtain additional python packages needed for some use cases #839 will reduce the time needed to set up the different environments for the use cases by creating the environments ahead of time and saving them in Docker images instead of installing the packages on the fly before running each use case.
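
A minimal sketch of that style of test, assuming a hypothetical build_command() helper (the real METplus wrappers assemble their commands internally; the application path, config keys, and argument order here are invented for illustration):

import pytest

# Hypothetical command builder standing in for a wrapper's command assembly
def build_command(app_path, config):
    return " ".join([app_path, config["fcst_file"], config["obs_file"], config["config_file"]])

@pytest.mark.parametrize(
    "config, expected",
    [
        (
            {"fcst_file": "fcst.nc", "obs_file": "obs.nc", "config_file": "GridStatConfig"},
            "/path/to/grid_stat fcst.nc obs.nc GridStatConfig",
        ),
    ],
)
def test_command_built_without_running_met(config, expected):
    # the MET binary is never executed, so the test runs in milliseconds
    assert build_command("/path/to/grid_stat", config) == expected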

@georgemccabe
Collaborator Author

This issue is very general, and testing could be improved endlessly. I think the remaining task from this issue that would be useful is to split up the pytests so that they can run in parallel. There may be a way to group tests through pytest so it is easier to run them in separate jobs in GHA. Once that is completed and this bottleneck in the testing workflow is removed, this issue could be closed.

georgemccabe added a commit that referenced this issue Jul 7, 2022
per #685, added custom pytest markers for many tests so we can run groups of tests in automation to speed things up. removed unused imports from tests and removed parenthesis from assert statements when they are not needed
georgemccabe added a commit that referenced this issue Jul 7, 2022
per #685, change logic to check if test category is not equal to 'pytest' to does not start with 'pytest' to allow groups of pytests
@georgemccabe
Collaborator Author

Update on improving testing by making pytests run faster.
There are a few options that allow running subsets of pytests:

  • Install the pytest-xdist package, which provides the -n <num> option to split tests across <num> processes. This approach is quick to implement but has some drawbacks that prevent it from being a viable solution with the current tests. Some tests are expected to run in a certain order because they either rely on other tests to generate the output they need (e.g. the StatAnalysis plotting tests) or they remove output files if they exist before running to ensure a clean run (e.g. the MTD file list files). Since the grouping of the tests is done behind the scenes, the success of the tests may not be consistent.
  • The -k <keyword> option can be passed to pytest to only run test functions with <keyword> in the name. We don't have a set naming convention for the test names, so we would have to rename the tests to conform to some standard. This is tedious and would require strict naming rules for new tests.
  • A directory can be passed to pytest to only run tests inside that directory. This would require moving files around to adjust the groups that would be run. Not ideal.
  • Custom markers can be defined by adding @pytest.mark.<marker-name> before each test and adding -m <marker-name> to the pytest command. All tests that do not have a given marker can be run with -m "not <marker-name>", and multiple markers can be combined with -m "<marker-name1> or <marker-name2>". This seems like the most maintainable approach, as tests can live anywhere and be marked to run as part of a group (see the sketch after this list). One limitation is that I have not found a way to run only the tests that have no custom marker at all, so new tests without a custom marker could be missed.
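
A minimal sketch of the marker approach (the util and wrapper names mirror markers mentioned later in this thread; the test bodies are placeholders):

import pytest

# Markers should also be registered, e.g. in pytest.ini under "markers =",
# so pytest does not warn about unknown marks.

@pytest.mark.util
def test_time_string_parsing():
    # placeholder assertion standing in for a real utility test
    assert "2020110312"[0:8] == "20201103"

@pytest.mark.wrapper
def test_wrapper_builds_expected_command():
    # placeholder assertion standing in for a real wrapper test
    assert "grid_stat" in "/path/to/grid_stat fcst.nc obs.nc"

With this in place, pytest -m util runs only the first test, pytest -m "not util" runs everything else, and pytest -m "util or wrapper" runs both.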

@georgemccabe
Collaborator Author

Another interesting note is that the pytests appear to run much slower when they are called within a large group of tests. For example, the grid_stat tests take about 10 seconds to run when only those tests are run, but they take more than twice as long when run with all of the tests.

@georgemccabe
Collaborator Author

Due to the overhead of setting up each job in the current implementation of the test suite, it appears that splitting the pytests into separate jobs does not provide the desired benefit. It also takes away from the number of jobs that can be started when the full suite of use case tests is also run, which would likely slow down the total time.

It does appear that splitting the pytests into groups via markers and running each group serially in a single job speeds up execution considerably. The 'pytests' job previously took ~28m to run; splitting the tests into 3 groups takes <11m. By comparison, when running the pytests in 3 separate jobs, the longest job in my test took just under 10m. It appears that our effort is better spent splitting the pytests into smaller groups that run within a single job.

In case we do decide we need to split the pytests into different jobs, the following code snippets can be used to recreate this functionality:

In .github/jobs/get_use_cases_to_run.sh, add the following if $run_unit_tests is true:

pytests_groups_filepath=.github/parm/pytest_groups.txt
pytests=""
for x in `cat $pytests_groups_filepath`; do
  pytests=$pytests"\"pytests_$x\","
done

In .github/actions/run_tests/entrypoint.sh, instead of looping over the pytest_groups.txt file to add commands for each marker, parse out the marker info and call pytest once. For example, a category of pytests_wrapper_or_util would produce the marker string "wrapper or util":

  # strip off pytests_ from marker string
  marker="$( cut -d '_' -f 2- <<< "$INPUT_CATEGORIES" )"
  # remove underscore after 'not' and around 'or'
  marker="${marker//_or_/ or }"
  marker="${marker//not_/not }"
  echo Running Pytests marker=$marker

The checks using the 'status' variable are no longer needed in this case, so that can be removed as well.

georgemccabe added a commit that referenced this issue Jul 7, 2022
@georgemccabe linked a pull request on Jul 7, 2022 that will close this issue
georgemccabe added a commit that referenced this issue Jul 14, 2022
* per #685, added custom pytest markers for many tests so we can run groups of tests in automation to speed things up. removed unused imports from tests and removed parenthesis from assert statements when they are not needed

* per #685, change logic to check if test category is not equal to 'pytest' to does not start with 'pytest' to allow groups of pytests

* per #685, run pytests with markers to subset tests into groups

* fixed check if string starts with pytests

* added missing pytest marker name

* added logic to support running all pytests that do not match a given marker with the 'not <marker>' syntax

* change pytest group to wrapper because the test expects another test to have run prior to running

* fix 'not' logic by adding quotation marks around value

* another approach to fixing not functionality for tests

* added util marker to more tests

* fixed typo in not logic

* added util marker to more tests again

* fixed logic to split string

* marked rest of util tests with util marker

* fixed another typo in string splitting logic

* tested change that should properly split strings

* moved wrapper tests into wrapper directory

* changed marker for plotting tests

* added plotting marker

* improved logic for removing underscore after 'not' and around 'or' to specify more complex marker strings

* test running group of 3 markers

* fixed path that broke when a test file was moved into a lower directory

* Changed StatAnalysis tests to use plotting marker because some of the tests involve plotting but the other StatAnalysis tests produce output that are used in the plotting tests

* changed some tests from marker 'wrapper' to 'wrapper_a' to split up some of these tests into separate runs

* test to see if running pytests in single job but split into groups by marker will improve the timing enough

* fixed typos in new logic

* removed code that is no longer needed (added comment in issue #685 if this logic is desired in the future)

* per #685, divided pytests into smaller groups

* added a test that will fail to test that the entire pytest job will fail as expected

* add error message if any pytests failed to help reviewer search for failed tests

* removed failing test after confirming that entire pytest job properly reports an error if any test fails

* turn on single use case group to make sure logic to build matrix of test jobs to run still works as expected

* turn off use case after confirming tests were created properly

* added documentation to contributor's guide to describe changes to unit test logic

* added note about adding new pytest markers
@georgemccabe linked a pull request on Dec 16, 2022 that will close this issue
@georgemccabe
Collaborator Author

The pytests are now run in groups and PR #1992 includes updates to log formatting so that the GitHub Actions log output is easier to navigate. Closing this issue.

@georgemccabe changed the title from "Improve use case testing" to "Internal: Improve use case testing" on Apr 10, 2023