
Internal: Improve use case testing #685

Closed
8 of 21 tasks
georgemccabe opened this issue Nov 3, 2020 · 7 comments · Fixed by #1692 or #1992
Assignees
Labels
component: CI/CD (Continuous integration and deployment issues), component: testing (Software testing issue), priority: blocker (Blocker), requestor: NCAR (National Center for Atmospheric Research), type: new feature (Make it do something new)
Milestone

Comments

@georgemccabe
Collaborator

We have discussed a few ideas to improve the use case tests, which are very time consuming to run. We should write a testing strategy document that outlines what improvements can be made and what we want to have implemented. Once we have a solid plan, create sub-issues.

Describe the New Feature

Initial brainstorming ideas include:

  • Testing that the commands and environments created by METplus are ‘correct’ without running the MET tools, so these tests will run extremely quickly
  • Generating a set of empty files that mimic the data required to run use cases so that the commands can be built (see the sketch after this list)
  • In Travis, only running the actual use cases for certain branches and/or pull requests. For other builds, generating commands using the dummy data to quickly see if the commands are generated correctly.
  • Creating Docker data volume(s) to store the dummy files, along with text files containing the expected commands and associated environment variables, to compare against the results. Set up a process to update the "reference" output when intended differences are found, similar to how MET performs regression testing.
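
As a rough sketch of the empty-file idea above (the directory structure and file names here are invented purely for illustration; the real list would come from each use case's configuration), the placeholder data could be generated with something like:

from pathlib import Path

# Hypothetical list of relative paths that mimic the input data a use case expects
DUMMY_FILES = [
    "model_applications/example/fcst_2020110312.nc",
    "model_applications/example/obs_2020110312.nc",
]

def create_dummy_input(top_dir):
    """Create empty placeholder files so commands can be built without real data."""
    for rel_path in DUMMY_FILES:
        path = Path(top_dir) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()  # a zero-byte file stands in for the real input file

create_dummy_input("/tmp/metplus_dummy_data")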

Acceptance Testing

List input data types and sources.
Describe tests required for new functionality.

Time Estimate

Many hours - sub-issues are needed

Sub-Issues

Consider breaking the new feature down into sub-issues.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

None

Funding Source

None

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones or add "alert:NEED PROJECT ASSIGNMENT" label
  • Select milestone to next major version milestone or "Future Versions"

Define Related Issue(s)

The testing strategy document could be used as a guide to set up testing for other projects.
Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s), Project(s), Milestone, and Linked issues
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@georgemccabe added the component: testing, priority: blocker, type: new feature, requestor: NCAR, and component: CI/CD labels on Nov 3, 2020
@georgemccabe added this to the METplus-4.0 milestone on Nov 3, 2020
@georgemccabe
Collaborator Author

It would be overwhelming to have a use case in the repository for every possible configuration of the MET tools, so we should consider creating tests that cover more cases than the use cases do. For example, we could ensure that the wrappers are able to generate the commands run by the MET unit tests: https://github.com/dtcenter/MET/blob/main_v9.1/test/xml/unit_python.xml

@georgemccabe
Collaborator Author

Update on this issue:

  • Logic was added to the automation to only run new use cases on push events unless a ci keyword is added to the last commit message. Use case groups can be marked as "NEW" and will run for all push events. Pull requests still run all of the use cases and difference tests.
  • Many wrappers now have pytests that do not run MET but build the commands and check that the config settings produce the expected commands and set the appropriate environment variables (see the sketch after this list). There are many iterations of config settings, so these new tests have added a lot of time to the pytest automation job. We may want to consider breaking up the pytest runs so that they can run in parallel. Currently we run all pytests found in the internal_tests/pytests directory in a single run.
  • Implementing sub-issue Improve approach to obtain additional python packages needed for some use cases #839 will reduce the time needed to set up the different environments for the use cases by creating the environments ahead of time and saving them in Docker images instead of installing the packages on the fly before running each use case.
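
A minimal sketch of that style of test, assuming a hypothetical build_command() helper (the real METplus wrappers assemble their commands internally; the application path, config keys, and argument order here are invented for illustration):

import pytest

# Hypothetical command builder standing in for a wrapper's command assembly
def build_command(app_path, config):
    return " ".join([app_path, config["fcst_file"], config["obs_file"], config["config_file"]])

@pytest.mark.parametrize(
    "config, expected",
    [
        (
            {"fcst_file": "fcst.nc", "obs_file": "obs.nc", "config_file": "GridStatConfig"},
            "/path/to/grid_stat fcst.nc obs.nc GridStatConfig",
        ),
    ],
)
def test_command_built_without_running_met(config, expected):
    # the MET binary is never executed, so the test runs in milliseconds
    assert build_command("/path/to/grid_stat", config) == expected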

@georgemccabe
Collaborator Author

This issue is very general, and testing could be improved endlessly. I think the remaining task from this issue that would be useful is to split up the pytests so that they can run in parallel. There may be a way to group tests through pytest so it is easier to run them in separate jobs in GHA. Once that is completed and this bottleneck in the testing workflow is removed, this issue could be closed.

georgemccabe added a commit that referenced this issue Jul 7, 2022
per #685, added custom pytest markers for many tests so we can run groups of tests in automation to speed things up. removed unused imports from tests and removed parenthesis from assert statements when they are not needed
georgemccabe added a commit that referenced this issue Jul 7, 2022
per #685, change logic to check if test category is not equal to 'pytest' to does not start with 'pytest' to allow groups of pytests
@georgemccabe
Collaborator Author

Update on improving testing by making pytests run faster.
There are a few options that allow running subsets of pytests:

  • Install the pytest-xdist package, which provides the -n <num> option to split tests across <num> processes. This approach is quick to implement but has some drawbacks that prevent it from being a viable solution with the current tests. Some tests are expected to run in a certain order because they either rely on other tests to generate the output they need (e.g. the StatAnalysis plotting tests) or they remove output files if they exist before running to ensure a clean run (e.g. the MTD file list files). Since the grouping of the tests is done behind the scenes, the success of the tests may not be consistent.
  • The -k <keyword> option can be passed to pytest to only run test functions with <keyword> in the name. We don't have a set naming convention for the test names, so we would have to rename the tests to conform to some standard. This is tedious and would require strict naming rules for new tests.
  • A directory can be passed to pytest to only run tests inside that directory. This would require moving files around to adjust the groups that would be run. Not ideal.
  • Custom markers can be defined by adding @pytest.mark.<marker-name> before each test and adding -m <marker-name> to the pytest command. All tests that do not have a given marker can be run with -m "not <marker-name>", and multiple markers can be combined with -m "<marker-name1> or <marker-name2>". This seems like the most maintainable approach, as tests can live anywhere and be marked to run as part of a group (see the sketch after this list). One limitation is that I have not found a way to run only the tests that have no custom marker at all, so new tests without a custom marker could be missed.
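
A minimal sketch of the marker approach (the util and wrapper names mirror markers mentioned later in this thread; the test bodies are placeholders):

import pytest

# Markers should also be registered, e.g. in pytest.ini under "markers =",
# so pytest does not warn about unknown marks.

@pytest.mark.util
def test_time_string_parsing():
    # placeholder assertion standing in for a real utility test
    assert "2020110312"[0:8] == "20201103"

@pytest.mark.wrapper
def test_wrapper_builds_expected_command():
    # placeholder assertion standing in for a real wrapper test
    assert "grid_stat" in "/path/to/grid_stat fcst.nc obs.nc"

With this in place, pytest -m util runs only the first test, pytest -m "not util" runs everything else, and pytest -m "util or wrapper" runs both.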

@georgemccabe
Collaborator Author

Another interesting note is that the pytests appear to run much slower when they are called within a large group of tests. For example, the grid_stat tests take about 10 seconds to run when only those tests are run, but they take more than twice as long when run with all of the tests.

@georgemccabe
Collaborator Author

Due to the overhead of setting up each job in the current implementation of the test suite, it appears that splitting the pytests into separate jobs does not provide the desired benefit. It also takes away from the number of jobs that can be started when the full suite of use case tests is also run, which would likely slow down the total time.

It does appear that splitting the pytests into groups via markers and running each group serially in a single job speeds up execution considerably. The 'pytests' job previously took ~28m to run; splitting the tests into 3 groups takes <11m. By comparison, when running the pytests in 3 separate jobs, the longest job in my test took just under 10m. It appears that our effort is better spent splitting the pytests into smaller groups that run within a single job.

In case we do decide we need to split the pytests into different jobs, the following code snippets can be used to recreate this functionality:

In .github/jobs/get_use_cases_to_run.sh, add the following if $run_unit_tests is true:

pytests_groups_filepath=.github/parm/pytest_groups.txt
pytests=""
for x in `cat $pytests_groups_filepath`; do
  pytests=$pytests"\"pytests_$x\","
done

In .github/actions/run_tests/entrypoint.sh, instead of looping over the pytest_groups.txt file to add commands for each marker, parse out the marker info and call pytest once. For example, a category of pytests_wrapper_or_util would produce the marker string "wrapper or util":

  # strip off pytests_ from marker string
  marker="$( cut -d '_' -f 2- <<< "$INPUT_CATEGORIES" )"
  # remove underscore after 'not' and around 'or'
  marker="${marker//_or_/ or }"
  marker="${marker//not_/not }"
  echo Running Pytests marker=$marker

The checks using the 'status' variable are no longer needed in this case, so that can be removed as well.

georgemccabe added a commit that referenced this issue Jul 7, 2022
@georgemccabe linked a pull request on Jul 7, 2022 that will close this issue
georgemccabe added a commit that referenced this issue Jul 14, 2022
* per #685, added custom pytest markers for many tests so we can run groups of tests in automation to speed things up. removed unused imports from tests and removed parenthesis from assert statements when they are not needed

* per #685, change logic to check if test category is not equal to 'pytest' to does not start with 'pytest' to allow groups of pytests

* per #685, run pytests with markers to subset tests into groups

* fixed check if string starts with pytests

* added missing pytest marker name

* added logic to support running all pytests that do not match a given marker with the 'not <marker>' syntax

* change pytest group to wrapper because the test expects another test to have run prior to running

* fix 'not' logic by adding quotation marks around value

* another approach to fixing not functionality for tests

* added util marker to more tests

* fixed typo in not logic

* added util marker to more tests again

* fixed logic to split string

* marked rest of util tests with util marker

* fixed another typo in string splitting logic

* tested change that should properly split strings

* moved wrapper tests into wrapper directory

* changed marker for plotting tests

* added plotting marker

* improved logic for removing underscore after 'not' and around 'or' to specify more complex marker strings

* test running group of 3 markers

* fixed path that broke when a test file was moved into a lower directory

* Changed StatAnalysis tests to use plotting marker because some of the tests involve plotting but the other StatAnalysis tests produce output that are used in the plotting tests

* changed some tests from marker 'wrapper' to 'wrapper_a' to split up some of these tests into separate runs

* test to see if running pytests in single job but split into groups by marker will improve the timing enough

* fixed typos in new logic

* removed code that is no longer needed (added comment in issue #685 if this logic is desired in the future)

* per #685, divided pytests into smaller groups

* added a test that will fail to test that the entire pytest job will fail as expected

* add error message if any pytests failed to help reviewer search for failed tests

* removed failing test after confirming that entire pytest job properly reports an error if any test fails

* turn on single use case group to make sure logic to build matrix of test jobs to run still works as expected

* turn off use case after confirming tests were created properly

* added documentation to contributor's guide to describe changes to unit test logic

* added note about adding new pytest markers
@georgemccabe linked a pull request on Dec 16, 2022 that will close this issue
@georgemccabe
Collaborator Author

The pytests are now run in groups and PR #1992 includes updates to log formatting so that the GitHub Actions log output is easier to navigate. Closing this issue.

@georgemccabe changed the title from "Improve use case testing" to "Internal: Improve use case testing" on Apr 10, 2023