Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Continuous Integration with GitHub Actions in MET #1546

Closed
1 of 22 tasks
fentoad72 opened this issue Nov 3, 2020 · 5 comments · Fixed by #2029
Closed
1 of 22 tasks

Implement Continuous Integration with GitHub Actions in MET #1546

fentoad72 opened this issue Nov 3, 2020 · 5 comments · Fixed by #2029
Assignees
Labels
component: CI/CD Continuous integration and deployment issues component: testing Software testing issue priority: medium Medium Priority requestor: NCAR National Center for Atmospheric Research type: enhancement Improve something that it is currently doing type: new feature Make it do something new
Milestone

Comments

@fentoad72
Copy link

fentoad72 commented Nov 3, 2020

Replace italics below with details for this issue.

Describe the New Feature

Implement Travis CI into MET

  • error checking on git push
  • DockerHub/Docker image integration
  • what tests to run?

Acceptance Testing

John -- what tests should be run each build

Time Estimate

Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.

Sub-Issues

  • Write a travis.yml file
  • Integration/conflict with Travis on METplus

Relevant Deadlines

MET future versions

Funding Source

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones or add "alert:NEED PROJECT ASSIGNMENT" label
  • Select milestone to next major version milestone or "Future Versions"

Define Related Issue(s)

Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s), Project(s), Milestone, and Linked issues
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@fentoad72 fentoad72 added type: enhancement Improve something that it is currently doing component: testing Software testing issue type: new feature Make it do something new priority: medium Medium Priority requestor: NCAR National Center for Atmospheric Research component: CI/CD Continuous integration and deployment issues labels Nov 3, 2020
@fentoad72
Copy link
Author

Outline basic step we want to do --

  • branch rules (which branches kick off a travis build)
  • branch rules (add logic on what tests to run based on build (develop, feature_, ...)
  • building the Docker image
  • pull the develop and build using that as cache-from, it will only build what needs to be updated
  • run make test for develop & feature branches
  • run regression tests and save a docker data volume of output for reference image with output saved
  • use docker data volume to compare each run (docker data vol is "truth"), don't need to run reference each time
  • how to evaluate a different output from reference, is this something that makes sense -- if so, update reference (evaluating the differences would probably be done by hand)
  • how to behave with feature branches (later)

@ksearight
Copy link
Contributor

This issue needs to be retitled for GitHub Actions instead of Travis (or closed).

@JohnHalleyGotway JohnHalleyGotway changed the title Implement Travis in MET Implement Continuous Integration with GH-Actions in MET Feb 13, 2021
@JohnHalleyGotway
Copy link
Collaborator

JohnHalleyGotway commented Jan 19, 2022

On 1/19/22, @georgemccabe and @JohnHalleyGotway met to discuss these details.

Goals for this test automation.

  • Should restructure the MET Dockerfile define a base image from which the MET images inherits.

    • Recommend having 3 base images:
      • Minimal image for compiling MET source code.
      • Minimal + packages for running unit and regression tests.
      • Minimal + useful runtime tools (e.g. ncview, wgrib)
  • For each push to each feature or bugfix branch:

    • Run "make install test" to confirm the code compiles. Only for changes to the "met/src" subdirectory. Disable using "ci-skip-compile". Test using "docker build".
  • For all PR's into the develop or main_v* branch.

    • ...

georgemccabe added a commit that referenced this issue Jan 19, 2022
… MET and tools needed to run unit tests into seperate Dockerfiles
georgemccabe added a commit that referenced this issue Jan 20, 2022
…rom a Dockerfile so that it can be called from Dockerfile and Dockerfile.copy
georgemccabe added a commit that referenced this issue Jan 20, 2022
…cker image instead of cloning the repository
georgemccabe added a commit that referenced this issue Jan 20, 2022
georgemccabe added a commit that referenced this issue Jan 20, 2022
…. This was not needed in Dockerfile because the commands were chained together with &&, but in a script execution will continue after a failed command
georgemccabe added a commit that referenced this issue Jan 20, 2022
georgemccabe added a commit that referenced this issue Jan 20, 2022
…local source code, set environment variables that set docker build args in workflow yml
georgemccabe added a commit that referenced this issue Jan 20, 2022
…hat should trigger compilation test and make change that should break compilation
georgemccabe added a commit that referenced this issue Jan 20, 2022
JohnHalleyGotway added a commit that referenced this issue Jan 24, 2022
georgemccabe added a commit that referenced this issue Jan 25, 2022
…fact to download in dependent step, download all output artifacts and run diff at end of workflow
georgemccabe added a commit that referenced this issue Jan 25, 2022
…because that is an invalid character for GHA artifacts
georgemccabe added a commit that referenced this issue Jan 25, 2022
…tables since this was changed in PR #2001, ci-run-unit
georgemccabe added a commit that referenced this issue Jan 25, 2022
…1a, create output directory before copying unit test artifact files for diff job, ci-run-unit
@JohnHalleyGotway
Copy link
Collaborator

JohnHalleyGotway commented Jan 25, 2022

@georgemccabe thanks for making lots of good progress on this issue. As discussed in Slack, we have many options for reducing the overall runtime of the unit tests in GHA (which are currently a little less than an hour). Here's a list of tasks to consider (add more as you see fit):

  • Break the input test data down into small pieces so that each test pulls smaller data volumes overall. It takes ~3 minutes for each job to pull the full set of input data).
    • Break the inputs down into some logical subsets.
  • Remove independent tasks from the longest chain.
    • Group 1a includes unit_ascii2nc.xml which takes ~12 minutes to run. Move any ascii2nc tasks whose output is not used downstream into a separate unit XML file which can run as an independent task. Be sure to write output to a directory which matches the unit_{NAME}.xml convention. That will likely make ascii2nc run faster than pb2nc in group 1a.
      • unit_ascii2nc.xml runs the tool 27 times but only 6 output files are used downstream:
        dup_test.nc
        edr_hourly.20130827.nc
        gauge_2012041012_24hr.nc
        gdas1.20120409.t12z.prepbufr.ascii_name.nc
        surfrad.nc
        trmm_2012040912_3hr.nc
    • If that makes ascii2nc run faster than pb2nc (~7 minutes), then do the same thing for unit_pb2nc.xml.
      • unit_pb2nc.xml runs the tool 10 times but only 3 output files are used downstream:
        gdas1.20120409.t12z.prepbufr.nc
        ndas.20120409.t12z.prepbufr.tm00.nc
        ndas.20120410.t12z.prepbufr.tm00.nc
  • Break down long running serial tasks into ones that can run concurrently:
    • unit_ref_config.xml runs gen_vx_mask once, and then pb2nc/pcp_combine/point_stat/grid_stat for 5 different lead times, and then stat_analysis to sum them all up. That takes 30 minutes. We could break that down into 5 different unit_ref_config_f{NN}.xml files for NN = 00, 12, 24, 36, 48 followed by unit_ref_config.xml to run series-analysis on the output. Rather than running gen_vx_mask once as a pre-req, running it 5 times (at the beginning of each unit_ref_config_f{NN}.xml file) would likely be faster overall.

georgemccabe added a commit that referenced this issue Jan 25, 2022
JohnHalleyGotway added a commit that referenced this issue Jan 26, 2022
…at we can run more tasks concurrently and complete the unit tests faster via GitHub actions.
JohnHalleyGotway added a commit that referenced this issue Jan 26, 2022
… 1a and ref_config jobs to consume fewer machines, hoping that provides resources for the 2a and 2b jobs to start sooner.
JohnHalleyGotway added a commit that referenced this issue Jan 26, 2022
…12, 24, 36, and 48 run 8, 11, 11, 14, and 11 tests, respectively. So do lead 36 by itself as a 3rd job.
JohnHalleyGotway added a commit that referenced this issue Jan 26, 2022
…h tighter dependencies. Elimate Unit Tests group 3a and have each stat_analysis XML run in group 2a once it dependent tests complete.
JohnHalleyGotway added a commit that referenced this issue Jan 26, 2022
@georgemccabe georgemccabe linked a pull request Jan 27, 2022 that will close this issue
14 tasks
georgemccabe added a commit that referenced this issue Jan 27, 2022
… so that an error will occur if MET_TEST_MET_PYTHON_EXE is set when running a test that needs it to be set
@JohnHalleyGotway JohnHalleyGotway changed the title Implement Continuous Integration with GH-Actions in MET Implement Continuous Integration with GitHub Actions in MET Mar 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: CI/CD Continuous integration and deployment issues component: testing Software testing issue priority: medium Medium Priority requestor: NCAR National Center for Atmospheric Research type: enhancement Improve something that it is currently doing type: new feature Make it do something new
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants