-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[develop] Update Jenkinsfile to run post-build functional tests #847
[develop] Update Jenkinsfile to run post-build functional tests #847
Conversation
@BruceKropp-Raytheon A quick question regarding the |
@BruceKropp-Raytheon I had to kill the CI tests because they weren't using your updated Jenkinsfile. I have created a temporary sandbox for testing this PR and the progress can be seen here:
I'll let you know if I encounter any failures with the new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BruceKropp-Raytheon I'm finding that the new Functional WorkflowTaskTests
are failing with the following message:
+ bash --login 'EXPTDIR=/glade/scratch/epicufsrt/jenkins/workspace/_SRW_App_Jenkinsfile_test_PR-847/expt_dirs/test_community /glade/scratch/epicufsrt/jenkins/workspace/_SRW_App_Jenkinsfile_test_PR-847/.cicd/scripts/srw_ftest.sh'
bash: EXPTDIR=/glade/scratch/epicufsrt/jenkins/workspace/_SRW_App_Jenkinsfile_test_PR-847/expt_dirs/test_community /glade/scratch/epicufsrt/jenkins/workspace/_SRW_App_Jenkinsfile_test_PR-847/.cicd/scripts/srw_ftest.sh: No such file or directory
script returned exit code 127
I think that you will need to remove the EXPTDIR
entry from the Jenkinsfile
. This is apparently causing issues and this variable is already being set in srw_ftest.sh
itself.
@BruceKropp-Raytheon The new Functional WorkflowTaskTests have passed on Cheyenne Intel and Cheyenne GNU. Will continue to monitor Hera Intel, Hera GNU, and Jet results. |
@BruceKropp-Raytheon The new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BruceKropp-Raytheon The srw_ftest.sh
successfully completes on Hera, Jet, and Orion (though the test is skipped on Hera due to issues with ulimit -s unlimited
) using the Jenkins pipeline.
Approving this PR now.
@BruceKropp-Raytheon @MichaelLueken |
@natalie-perlin I think that would work. from: PRE_TASK_CMDS: '{ ulimit -s unlimited; ulimit -a; }' to: PRE_TASK_CMDS: '{ ulimit -S -s unlimited; ulimit -a; }' |
@BruceKropp-Raytheon @natalie-perlin The
For whatever reason, it looks like Hera tests will fail if you attempt to change the stack size via ulimit outside of the WE2E tests. |
okay. So lets revert to the previous, cancel this last commit? |
@BruceKropp-Raytheon Sure, if you use the following command, you will undo the last commit:
Please note that Git might not allow you to push the update (it will complain about your local version being behind what is in the repository and to update it using |
Successfully tested after the following changes to the ./.cicd/scripts/srw_ftest.sh (lines 128-130)
and the following in ./scripts/exregional_make_grid.sh:
By some reason, removing curly brackets used with eval command made a difference. The log file for make_grid task, run_make_grid-log.txt, showed the following:
In other words, the script successfully changed the stack size to max allowed (soft) limit. Execution stopped only because it was looking for a binary executable under ./install_intel/exec/, whereas in my build it was located under ./exec/ |
@natalie-perlin @BruceKropp-Raytheon Removing the {} from I'd like to include this test for all machines, but I'd like to know why there is an issue only with the |
.cicd/scripts/srw_ftest.sh
Outdated
|
||
# Limit to machines that are fully ready | ||
deny_machines=( gaea hera ) | ||
#sed "s|ulimit -s unlimited;|ulimit -S -s unlimited;|" -i ${workspace}/ush/machine/hera.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change lines 129-130 to include hera:
deny_machines=( gaea )
sed "s|ulimit -s unlimited;|ulimit -S -s unlimited;|" -i ${workspace}/ush/machine/hera.yaml
and line 63 in ./scripts/exregional_make_grid.sh to the following (remove curly brackets):
eval $PRE_TASK_CMDS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@natalie-perlin and @BruceKropp-Raytheon -
I've found another way to make the srw_ftest.sh
script run on Hera that doesn't require removing the curly brackets from ${PRE_TASK_CMDS}
in scripts/exregional_make_grid.sh
.
When I moved line 130:
sed "s|ulimit -s unlimited;|ulimit -S -s unlimited;|" -i ${workspace}/ush/machine/hera.yaml
to line 89 (this line can be moved to any line before the .generate_FV3LAM_wflow.py
line), the wrapper script successfully ran on Hera.
@@ -86,6 +86,8 @@ set -e -u
export PYTHONPATH=${workspace}/ush/python_utils/workflow-tools:${workspace}/ush/python_utils/workflow-tools/src
+sed "s|ulimit -s unlimited;|ulimit -S -s unlimited;|" -i ${workspace}/ush/machine/hera.yaml
+
cd ${workspace}/ush
# Consistency check ...
#./config_utils.py -c ./config.yaml -v ./config_defaults.yaml -k "(\!rocoto\b)"
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: .cicd/scripts/srw_ftest.sh
Untracked files:
(use "git add <file>..." to include in what will be committed)
build_intel/
tests/build_test244350.out
no changes added to commit (use "git add" and/or "git commit -a")
# Try hera with the first few simple SRW tasks ...
./run_make_grid.sh ... COMPLETE
./run_get_ics.sh ... COMPLETE
./run_get_lbcs.sh ... COMPLETE
./run_make_orog.sh ... COMPLETE
I think that this modification would be better than removing the curly brackets in an ex-script. If you could try making this modification and double-checking that it will run manually, I would be greatly appreciative. Thanks!
@BruceKropp-Raytheon @natalie-perlin The srw_ftest.sh script is failing on Hera with the following error message:
This is seen both in the pipeline and while manually testing the script. The traceback for the error is:
It should be noted that I'm able to make the srw_test.sh test works without issue, which needs to run It's not currently clear to me what is causing the issue. I'll work some on this tomorrow and hopefully figure something else (unless you are able to figure out the issue sooner). |
@BruceKropp-Raytheon @natalie-perlin - Looking more closely at the traceback, I'm seeing something that I don't understand - The @christinaholtNOAA Is this the right behavior, or should the work stay in I'll do some more digging to see why the |
@MichaelLueken I have gone back and rerun manually from a previous commit that we believed was working correctly last week (July 12), 4bcfa8e . This commit builds/tests fine on Orion, but is now failing on Jet, as you have pointed out above. It seems unrelated to the latest changes in this PR, so perhaps there was an environment setting change on Jet (and Hera)? |
You are correct, I don't think that this PR is causing the issue, at least directly. The issue is that the |
okay. We currently have PYTHONPATH defined in the script. Does this need adjusting?
|
The |
Understood. The motivation for this PR was to exercise the User Documentation described in 5.4.2 Running the Workflow Using Stand-Alone Scripts . With the setup instructions coming from 5.3 Generate the Forecast Experiment . Obviously we have had to make some adjustments in the script to produce success for this PR. Ultimately the User documentation may need adjusting to reflect the correct way to perform these tasks on any supported tier1 platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BruceKropp-Raytheon Appending the following path to the PYTHONPATH
entry will allow the srw_ftest.sh
to run on Jet:
${workspace}/ush/python_utils/workflow-tools/src
I will be able to resubmit the Jet and Orion Jenkins tests after this change has been made to ensure that the pipeline is looking good, then I can run the Hera tests tomorrow once the machine returns.
…tils/workflow-tools/src to help Jet and Hera
@BruceKropp-Raytheon @natalie-perlin The |
Automated Jenkins tests have completed for Hera. All tests on Hera GNU successfully passed. For Hera Intel,
The automated Jenkins tests have been submitted on Jet. Once again, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent news! Thank you for testing the recent changes. Approving now
DESCRIPTION OF CHANGES:
This is a new CI gate to ensure that code changes still allow E2E tests to run on supported platforms.
It uses the existing CI functional test script, .cicd/scripts/srw_ftest.sh , to setup and try the first few workflow tasks.
If these succeed, we can be confident that E2E tests can still be launched as usual.
Type of change
TESTS CONDUCTED:
The CI script, srw_ftest.sh, was run on each of the Teir1 on-prem platforms, and succeeds on those that ALREADY succeed in running E2E tests via CI. Namely Cheyenne and Jet. Other platforms have open issues preventing E2E tests from running to completion vi CI.
DEPENDENCIES:
DOCUMENTATION:
ISSUE:
EPIC PLATFORM-635
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):