[production/AQM.v7] Add ecFlow option to production branch for AQM #821
@lgannoaa, I got the following error in the 'aqm_manager' task:
Please let me know how to fix this.
@lgannoaa, @JianpingHuang-NOAA, as described above, you can find the quick start guide here (Ch. 1.3, p. 15): https://drive.google.com/file/d/1zyCvHmuL_LWx0Yto6B1GXF04m-n6Kbs2/view?usp=drive_link
I compared the results of 'dyn/phy_006' in the first cycle ('00') by rocoto and ecFlow. They are bit-to-bit identical.
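A bit-for-bit check of this kind can be sketched with `cmp`. This is only an illustration, not the procedure used in this PR: the file names below are hypothetical stand-ins for the rocoto and ecFlow outputs, and the demo creates its own temporary files so it can run anywhere.

```shell
#!/bin/bash
# Return 0 when two files are byte-for-byte identical, non-zero otherwise.
files_identical() {
  cmp -s "$1" "$2"
}

# Hypothetical stand-ins for the two workflow outputs (not real AQM files).
tmpdir="$(mktemp -d)"
printf 'sample model output\n' > "${tmpdir}/rocoto_out"
printf 'sample model output\n' > "${tmpdir}/ecflow_out"

if files_identical "${tmpdir}/rocoto_out" "${tmpdir}/ecflow_out"; then
  result="identical"
else
  result="different"
fi
echo "rocoto vs ecFlow: ${result}"

# Clean up the temporary demo files.
rm -rf "${tmpdir}"
```

`cmp -s` compares raw bytes and reports only through its exit status, which suits a yes/no identity check on binary model output.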
I recall management's guidance is for @chan-hoo to learn how to test AQM ecflow in order to merge PR #812. Please see the recommended action below:
This cycle, 20230605/00, is the starting cycle of the run. The aqm_manager job should not be run for this cycle. Please enter the following CMD:
@lgannoaa, (1) As mentioned above, PR 812 ignores the structure of the current UFS SRW App workflow. It just put the ecFlow scripts into the existing (pre-generated) workflow. Whenever a new workflow is generated for a different experiment, the scripts must be modified manually. (2) The COM directory name is based on the default value of "RUN" in the configuration file (config.yaml). If you change it in the configuration file, the directory name will change accordingly. As can be seen in Table 1 (p. 4) of the NCO standards, COMIN=$COMROOT/$NET/$model_ver/$RUN.$PDY/$cyc. In the sample script, RUN="aqm_nco_aqmna13km". (3) I agree. You are the developer of the 'aqm_manager' task script, so I am asking you to take a look at the error message.
(1-1) It just put the ecFlow scripts into the existing (pre-generated) workflow.
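As a rough sketch of how the COMIN path in point (2) is composed, the snippet below builds it from its parts. Only RUN (from the sample script) and the cycle 20230605/00 come from this thread; COMROOT, NET, and model_ver are placeholder values assumed for illustration.

```shell
#!/bin/bash
# Placeholder values assumed for illustration; real values come from the
# job cards and config.yaml.
COMROOT="/path/to/com"
NET="aqm"
model_ver="v7.0"

# Values taken from this thread.
RUN="aqm_nco_aqmna13km"   # default RUN in the sample script
PDY="20230605"            # cycle date
cyc="00"                  # cycle hour

# NCO-style layout: $COMROOT/$NET/$model_ver/$RUN.$PDY/$cyc
COMIN="${COMROOT}/${NET}/${model_ver}/${RUN}.${PDY}/${cyc}"
echo "${COMIN}"
```

Changing RUN in the configuration changes only the `$RUN.$PDY` component of the path; the rest of the layout stays the same.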
@lgannoaa, I've fixed the issue on the |
@lgannoaa, I am sorry if you feel like this PR ignores your PR. It does not at all. Your PR was really helpful for me to understand the structure of ecFlow and how it works. My concern is that your PR is too case-specific and requires a user to modify the scripts manually for his/her own run. I think that the other reviewers of the UFS SRW App will not approve your approach when I add it to the develop branch. The next versions of AQM will continue to use the UFS SRW App. Therefore, we should follow the structure of the UFS SRW App even for ecFlow. I believe this PR follows the NCO standards as well. Please test this PR on your end and let me know if you find any issues.
@chan-hoo, PR #812 represents our teamwork. @JianpingHuang-NOAA, Kai Wang, and I took time to develop and fix issues in the AQM package. It was tested on both the rocoto workflow from Jianping and the ecflow workflow from me. Jianping reviewed and approved PR 812. Management requested that you freeze code changes to the AQM package until the ecflow change (PR 812) is merged. You reported that you were not familiar with the ecflow workflow. Therefore, management asked me to teach you the ecflow workflow and give you guidance for testing PR 812 in ecflow in order to merge PR 812. If you want the package to generate an ecflow experiment control file automatically, you may work on adding this function to the Python utility. This utility currently generates experiment control files only for the rocoto workflow. Therefore, it is in line with the design to include the ecflow function here. This isolated work can be done after merging ecflow PR 812. The work should not impact the ecflow workflow in PR 812. In addition, this kind of approach will not work with the NCO implementation because the experiment control files in production can be configured by NCO in static form. It is against NCO production policy to remove or modify them automatically. Management has indicated in the ecflow merge meeting that any new change or enhancement can be done after PR 812 has been merged. Management also indicated that all changes should be reviewed and approved by Jianping before merging for this implementation. Any change request should be handled through a GitHub issue. Please proceed to merge PR 812. We have a few issue requests waiting to be submitted until the merge of PR 812 is completed. When can you finish the merge work?
@JianpingHuang-NOAA
Chan-Hoo introduced a new PR 821 by copying the ecflow workflow source code from PR 812 and modifying the workflow control mechanism. He did not proceed with merging PR 812. I spot-checked PR 821 using Chan-Hoo's testing run location indicated below and found it: HOMEaqm: /lfs/h2/emc/lam/save/chan-hoo.jeon/prod_ecflow_test/ufs-srweather-appjob PR 821 review:
@lgannoaa, @arunchawla-NOAA, @MatthewPyle-NOAA, @aerorahul, I'd like to hear your opinions on this PR.
(0) @JianpingHuang-NOAA has already approved PR 812. Both Jianping and Kai have already tested PR 812. Please merge PR 812 first.
If this PR is able to run with ecflow and does not break the SRW App structure, then please proceed to merge this PR to ensure the ecflow capability is available, and move on to the other tasks related to EE2 and science requirements. I do not want this to drag on any longer. Sorry I could not comment earlier.
@arunchawla-NOAA, I am testing the ecflow option again now. Once it is complete (in approx. 2 hours), I'll merge this PR.
@arunchawla-NOAA, @MatthewPyle-NOAA, @aerorahul, @BenjaminBlake-NOAA, @RatkoVasic-NOAA, @lgannoaa, @JianpingHuang-NOAA, I need at least one approval to merge this PR. Please approve this.
@chan-hoo in nexus/*.ecf, shouldn't wallclock values also be parameters?
I can give you an approval for merging, but it should come from the right person. I am not that.
@RatkoVasic-NOAA, You're right. They can be. I didn't have enough time to parameterize all the configuration variables for this production branch. I had to make this change in a week. As a quick solution, I did it only for the parameters critical to running the scripts. I'll parameterize all variables necessary for the ecf scripts in the develop branch later.
@aerorahul @RatkoVasic-NOAA, thank you for your approvals!
The results of ecflow are bit-to-bit identical to those of rocoto. Merging now. |
DESCRIPTION OF CHANGES:
- Variables in `config_default.yaml` are renamed with the suffix `_dfv` (default value) and these variables are defined in job cards (`/include/envir-p1.h`).
- `envir-p1.h` and they will replace the default values defined in `ush/machine/wcoss2`.
- `KEEPDATA`, which is defined in `/include/head.h`, is changed to a `TRUE/FALSE` flag in `envir-p1.h` to meet the standard in the current workflow.
- `/ush/load_modules_run_task.sh` is used to load the necessary modules for each task, as the rocoto workflow does, so that the module list is managed in one place (`/modulefiles/tasks/wcoss2`).
- `parm/ecflow` to the home directory to follow the structure in `/include/head.h` when running `python3 generate_FV3LAM_wflow.py` (Step 7 in my document below).
Type of change
TESTS CONDUCTED:
ISSUE:
Fixes issue mentioned in #797
CONTRIBUTORS:
@lgannoaa