Cads for andrew #616
…d sensor flags (iasi, hirs, airs, etc.) to the qc_irsnd subroutine.
… cloud detection routine.
…d_detect and statistical_cloud_detect
… subroutine. If a channel used by the CO2_slicing routine has a missing value, reject the profile.
… subroutine. If a channel used by the CO2_slicing routine is bad/missing, reject the profile.
…ult. These flags include airs_co2, cris_co2, iasi_co2, hirs_co2 and goessndr_co2. When the flag is true the subroutine co2_cloud_detect will be used to determine cloud layer.
…IIRS cloud information within the CrIS bufr. Added namelist flags to determine which cloud detection routine to use. If cris_co2, airs_co2, iasi_co2, hirs_co2 and/or goessnder_co2 are true, use the co2_cloud_detect subroutine. The original cloud detection subroutine (statistical_cloud_detect) is the default if any flags are missing or set to false.
…bset. Also added logic so that, if CO2_cloud_detect is used, specific channels must be available and pass minimum quality control.
…SI subset. CO2 required channels must also pass minimal quality control or profile will be rejected.
…ubset. Also added logic so that, if CO2_cloud_detect is used, specific channels must be available and pass minimum quality control. AIRS is no longer an operational data set, so these changes were never properly tested.
…e available and pass minimum quality control. These are channels 3 - 7 which are the basic CO2 sounding channels of this instrument.
…ch channel pair only tested a specific layer. All channel pairs should test from the tropopause to their predetermined level. Starting from the tropopause with each channel pair finds considerably more cirrus. A CrIS channel was changed in the 3rd pair and cloud thresholds were adjusted lower. Other cosmetic changes include the radiative transfer integration and a new subroutine name for the emc_legacy cloud test.
…loud_and_aerosol_detection software. Specifically you will see the variable chan_level. Other variables were added (radiance_overcast, radiance_ratio) to compute chan_level.
… software. These include cris_cads, iasi_cads, and airs_cads. These variables need to be added to the script exglobal_atmos_analysis.sh to call this routine.
…routine) requires chan_level to be added in this routine. chan_level is NOT used in this routine and should NOT change the value of any variable going out of this subroutine.
…outine qc_irsnd. This variable is used in qc_irsnd when determining the clear/cloudy channels for the IR sensors AIRS, IASI, and CrIS.
…tware. This module contains the code (subroutines) developed by ECMWF and available on the NWP SAF.
…contains the setup and call routines for the cloud_and_aerosol_detection software. There are several code additions, deletions, and reorganizations in this push.
…_aerosol_detection software. This code is available from the NWP SAF and is specifically version 3. The only code changes made to these subroutines are to be compatible with the GSI. Logic changes were kept to a minimum.
…F90. Added an 11 - 12 micron test to qcmod to remove potential low level clouds.
…S) for use in CADS. These are NOT complete yet.
…in the fix_gsi directory. In this case the IASI_CLDDET.NL was modified to NOT use the AVHRR cluster information as it is not ready yet. AIRS_CLDDET.NL CRIS_CLDDET.NL IASI_CLDDET.NL IASING_CLDDET.NL IRS_CLDDET.NL
Fixed conflict in gsimod.f90 and removed exglobal_atmos_analysis.sh
Conflicts:
scripts/exglobal_atmos_analysis.sh
src/gsi/gsimod.F90
…d CrIS into the CADS
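The flag behavior described in the commit messages above can be sketched as follows. This is a minimal Python illustration, not the Fortran in the PR: the function names and the dictionary-based namelist are hypothetical, the flag spellings follow the first commit message that lists them, and the channel numbers 3 to 7 are used only because one commit message mentions them.

```python
# Namelist flag per sensor, as listed in the commit messages
# (the spelling "goessndr_co2" follows the first message that names it).
CO2_FLAG_BY_SENSOR = {
    "airs": "airs_co2",
    "cris": "cris_co2",
    "iasi": "iasi_co2",
    "hirs": "hirs_co2",
    "goessndr": "goessndr_co2",
}


def select_cloud_routine(sensor: str, namelist: dict) -> str:
    """Return the name of the cloud detection routine to use for `sensor`.

    A flag that is missing or set to False falls back to the default,
    statistical_cloud_detect, as the commit messages describe.
    """
    flag = CO2_FLAG_BY_SENSOR.get(sensor)
    if flag is not None and namelist.get(flag, False):
        return "co2_cloud_detect"
    return "statistical_cloud_detect"


def profile_usable(required_channels, channel_ok: dict) -> bool:
    """Reject the profile if any channel needed by CO2 slicing is bad or missing.

    `channel_ok` maps channel number -> True when the channel passed
    minimal quality control (hypothetical representation).
    """
    return all(channel_ok.get(ch, False) for ch in required_channels)


namelist = {"cris_co2": True, "airs_co2": False}
print(select_cloud_routine("cris", namelist))  # co2_cloud_detect
print(select_cloud_routine("iasi", namelist))  # statistical_cloud_detect
print(profile_usable([3, 4, 5, 6, 7], {3: True, 4: True, 5: False, 6: True, 7: True}))  # False
```

The sketch does not cover the cris_cads, iasi_cads, and airs_cads flags, since the commit messages do not state how they interact with the CO2 flags.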
WCOSS2 (Cactus)
Hera
Orion
The rrfs_3denvar_glbens failure is due to the run time check. This is not a fatal failure.
The hafs_4denvar_glbens failure is due to the run time check. This is not a fatal failure. |
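The run time check behind these non-fatal failures can be illustrated with the Orion wall times reported later in this thread. The sketch below is not the regression script itself. It assumes the threshold is the control run's wall time scaled by a tolerance factor, and the factor 1.1 is inferred only from the reported values (276.499343 × 1.1 ≈ 304.149277). The function and parameter names are hypothetical.

```python
def exceeds_time_threshold(updat_seconds: float,
                           contrl_seconds: float,
                           tolerance: float = 1.1) -> bool:
    """Return True when the update run is slower than the allowed threshold.

    Assumption: threshold = control wall time * tolerance. The 1.1 default
    is inferred from the values quoted in this thread, not read from the
    regression scripts.
    """
    threshold = contrl_seconds * tolerance
    return updat_seconds > threshold


# (updat, contrl) wall times in seconds, quoted in the Orion ctest report.
orion_runs = {
    "global_4denvar_hiproc": (307.642959, 276.499343),
    "global_4denvar_loproc": (390.099795, 373.393028),
    "hafs_4denvar_glbens_hiproc": (300.890503, 266.684530),
    "hafs_4denvar_glbens_loproc": (433.160648, 429.190717),
}

for name, (updat, contrl) in orion_runs.items():
    status = "exceeds" if exceeds_time_threshold(updat, contrl) else "is within"
    print(f"{name}: updat {updat:.2f} s {status} threshold {contrl * 1.1:.2f} s")
```

Under this assumption only the two hiproc runs exceed their thresholds, which matches the two reported failures. A few seconds of run-to-run variability is enough to trip the check.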
Approve.
Looks good, thanks @wx20jjung!
Approve.
On Thu, Nov 30, 2023 at 12:01 PM RussTreadon-NOAA wrote:
@wx20jjung , GSI PR #624 has been merged into develop. This PR updates the GSI build to spack-stack on non-production machines. We should bring this update into wx20jjung:CADS_for_Andrew.
I can rerun ctests on WCOSS2 after you update your branch. PR #624 should not impact this PR but it's best to confirm by rerunning the ctests. Would you please run ctests on Hera after you update your branch? Do you ever run on Orion? |
I merged the latest develop branch with CADS_for_Andrew on S4 and the gsi now fails. I suspect this is an S4 issue but I need to back out of it, at least on S4. I will merge my CADS_for_Andrew copy on Hera and try the ctests there.
I have an Orion account but do not have allocations similar to WCOSS.
|
The S4 failure is unexpected. @DavidHuber-NOAA , PR #624 was tested on S4, right? I merged the current head of
The
The loproc_updat task 0 memory usage is indeed noticeably higher than the contrl
|
@RussTreadon-NOAA Yes, #624 was tested in the global workflow on S4 at C96/C48 deterministic/ensemble resolutions. |
Thank you @DavidHuber-NOAA for the confirmation. Jim's failure is puzzling. |
On Thu, Nov 30, 2023 at 2:10 PM David Huber wrote:
@wx20jjung @RussTreadon-NOAA The issue on S4 is that the wrong modules are being loaded at runtime. Since the job is running within the global-workflow, it is still loading hpc-stack modules (e.g. hdf5/1.10.6), which is causing the crash. You are welcome to try and merge in ***@***.***:DavidHuber-NOAA/global-workflow -b feature/spack-stack OR just copy over the module_base.s4.lua and versions/run.s4.ver files. |
Are these files in gw_ss?
|
@wx20jjung Yes, though I forgot that there are a few version files you will need to copy. You may be better off just copying over the contents of |
I am not able to run my internal CADS cycle tests as my global-workflow is not compatible (yet) with these changes. I did run the ctests on Hera. Here are the results.
1/7 Test #4: [=[netcdf_fv3_regional]=] ........ Passed 726.39 sec
100% tests passed, 0 tests failed out of 7
Total Test time (real) = 1852.19 sec |
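When comparing ctest results across machines, the per-test lines can be pulled apart with a small parser. The following is a sketch only: the regular expression is written against the line shapes that appear in this thread (including the `[=[`…`]=]` wrapper seen in the Hera output and the `***Failed` marker seen in the Orion output), and the function name is hypothetical.

```python
import re

# Matches per-test result lines such as
#   "6/7 Test #1: global_4denvar ...................***Failed 1502.96 sec"
CTEST_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(?P<number>\d+):\s+(?P<name>\S+)\s+\.+"
    r"\s*(?:\*{3})?(?P<status>Passed|Failed)\s+(?P<seconds>[\d.]+)\s+sec"
)


def parse_ctest_line(line: str):
    """Return (test name, status, seconds) or None for non-result lines."""
    match = CTEST_LINE.match(line)
    if match is None:
        return None
    name = match.group("name")
    # Remove the "[=[" ... "]=]" wrapper that surrounds names in some outputs.
    if name.startswith("[=[") and name.endswith("]=]"):
        name = name[3:-3]
    return name, match.group("status"), float(match.group("seconds"))


lines = [
    "1/7 Test #4: [=[netcdf_fv3_regional]=] ........ Passed 726.39 sec",
    "6/7 Test #1: global_4denvar ...................***Failed 1502.96 sec",
    "100% tests passed, 0 tests failed out of 7",
]
for line in lines:
    print(parse_ctest_line(line))
```

Summary lines such as the pass percentage do not match the pattern and come back as `None`, so they can be filtered out before tabulating results.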
On Fri, Dec 1, 2023 at 1:15 PM RussTreadon-NOAA wrote:
Orion ctests
Manually merged the head of develop into wx20jjung:CADS_for_Andrew. Built the updated working copy on Orion and ran ctests with the following results
Test project /work2/noaa/da/rtreadon/git/gsi/pr616_update/build
Start 1: global_4denvar
Start 2: rtma
Start 3: rrfs_3denvar_glbens
Start 4: netcdf_fv3_regional
Start 5: hafs_4denvar_glbens
Start 6: hafs_3denvar_hybens
Start 7: global_enkf
1/7 Test #4: netcdf_fv3_regional .............. Passed 484.67 sec
2/7 Test #7: global_enkf ...................... Passed 489.37 sec
3/7 Test #3: rrfs_3denvar_glbens .............. Passed 607.09 sec
4/7 Test #2: rtma ............................. Passed 970.90 sec
5/7 Test #6: hafs_3denvar_hybens .............. Passed 1396.86 sec
6/7 Test #1: global_4denvar ...................***Failed 1502.96 sec
7/7 Test #5: hafs_4denvar_glbens ..............***Failed 1637.40 sec
71% tests passed, 2 tests failed out of 7
Total Test time (real) = 1637.42 sec
The following tests FAILED:
1 - global_4denvar (Failed)
5 - hafs_4denvar_glbens (Failed)
The global_4denvar test failed due to the runtime check
The runtime for global_4denvar_hiproc_updat is 307.642959 seconds. This has exceeded maximum allowable threshold time of 304.149277 seconds,
resulting in Failure of timethresh2 the regression test.
The wall times do not look anomalous given high run to run variability on Orion, especially when running in the /work fileset.
global_4denvar_hiproc_contrl/stdout:The total amount of wall time = 276.499343
global_4denvar_hiproc_updat/stdout:The total amount of wall time = 307.642959
global_4denvar_loproc_contrl/stdout:The total amount of wall time = 373.393028
global_4denvar_loproc_updat/stdout:The total amount of wall time = 390.099795
The hafs_4denvar_glbens test failed for the same reason
The runtime for hafs_4denvar_glbens_hiproc_updat is 300.890503 seconds. This has exceeded maximum allowable threshold time of 293.352983 seconds,
resulting in Failure of timethresh2 the regression test.
Again, the wall times do not look anomalous given high run to run variability on Orion
hafs_4denvar_glbens_hiproc_contrl/stdout:The total amount of wall time = 266.684530
hafs_4denvar_glbens_hiproc_updat/stdout:The total amount of wall time = 300.890503
hafs_4denvar_glbens_loproc_contrl/stdout:The total amount of wall time = 429.190717
hafs_4denvar_glbens_loproc_updat/stdout:The total amount of wall time = 433.160648
@wx20jjung , would you please update wx20jjung:CADS_for_Andrew with the current head of develop? |
I am getting the following error. Can anyone help to resolve this?
***@***.*** GSI_CADS]$ git push origin CADS_for_Andrew
Counting objects: 56, done.
Delta compression using up to 48 threads.
Compressing objects: 100% (29/29), done.
Writing objects: 100% (30/30), 8.76 KiB | 0 bytes/s, done.
Total 30 (delta 19), reused 0 (delta 0)
remote: Resolving deltas: 100% (19/19), completed with 14 local objects.
To https://github.com/wx20jjung/GSI.git
! [remote rejected] CADS_for_Andrew -> CADS_for_Andrew (refusing to allow a Personal Access Token to create or update workflow `.github/workflows/gcc.yml` without `workflow` scope)
error: failed to push some refs to 'https://github.com/wx20jjung/GSI.git'
|
On Fri, Dec 1, 2023 at 1:59 PM RussTreadon-NOAA wrote:
@wx20jjung , not sure what the problem is. Try the following
1. create a new directory. mkdir update
2. cd into the new directory. cd update
3. git clone --recursive https://github.com/wx20jjung/GSI.git .
4. git checkout CADS_for_Andrew
5. git submodule sync
6. git submodule update
7. git remote add upstream https://github.com/NOAA-EMC/GSI
8. git remote -v ... make sure you see something like what's on the GSI User Information wiki <https://github.com/NOAA-EMC/GSI/wiki/GSI-User-Information> under "Updating your Fork when the Official Repository is Updated"
9. git remote update If you see a fix submodule error, ignore it.
10. git merge upstream/develop This merges the authoritative develop into the working copy of your branch. The merge command opens an editor. Accept the provided commit log message. Exit the editor. You should see something like the below
(gdasapp) Orion-login-3:/work/noaa/da/rtreadon/git/gsi/update$ git merge upstream/develop
hint: Waiting for your editor to close the file... PuTTY X11 proxy: unable to connect to forwarded X server: Network error: Connection refused
Display localhost:27.0 unavailable, simulating -nw
Merge made by the 'recursive' strategy.
.github/workflows/gcc.yml | 31 +++++++++++----------
.github/workflows/intel.yml | 47 +++++++++++++++++++-------------
ci/spack.yaml | 4 +--
modulefiles/gsi_cheyenne.gnu.lua | 36 ++++++++++++------------
modulefiles/gsi_cheyenne.intel.lua | 32 +++++++++++++---------
modulefiles/gsi_common.lua | 16 ++++++-----
modulefiles/gsi_gaea.lua | 24 ++++++++++------
modulefiles/gsi_hera.gnu.lua | 22 +++++++--------
modulefiles/gsi_hera.intel.lua | 21 ++++++--------
modulefiles/gsi_hercules.lua | 26 ++++++++++++++++++
modulefiles/gsi_jet.lua | 22 ++++++---------
modulefiles/gsi_orion.lua | 21 ++++++--------
modulefiles/gsi_s4.lua | 23 +++++++---------
modulefiles/gsi_wcoss2.lua | 28 ++++++++++++++++++-
regression/regression_param.sh | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------
regression/regression_var.sh | 19 +++++++++----
ush/detect_machine.sh | 2 ++
ush/module-setup.sh | 7 +++++
ush/sub_hercules | 170 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ush/sub_orion | 2 ++
20 files changed, 485 insertions(+), 205 deletions(-)
create mode 100644 modulefiles/gsi_hercules.lua
create mode 100755 ush/sub_hercules
(gdasapp) Orion-login-3:/work/noaa/da/rtreadon/git/gsi/update$
Your local copy of CADS_for_Andrew is now up to date with develop. Push your updated working copy back to your repo.
11. git push origin CADS_for_Andrew |
I did this both on Hera and S4. I continue to get the same error.
|
@wx20jjung It looks like you are having trouble with your authentication. You can try changing to ssh authentication:
This assumes you have an SSH key in your GitHub profile. If this fails, you can follow this guide to get set up. |
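The command posted with this comment did not survive in the page text. As a generic illustration only (not necessarily what was suggested), switching a remote from HTTPS to SSH means replacing the `https://host/owner/repo.git` URL with the `git@host:owner/repo.git` form. The helper below is a hypothetical sketch that derives the SSH form from the HTTPS URL shown in the push error earlier in the thread.

```python
from urllib.parse import urlparse


def https_to_ssh_remote(https_url: str) -> str:
    """Convert an HTTPS Git remote URL to the scp-like SSH form.

    Hypothetical helper for illustration; it only rewrites the string
    and does not touch any repository.
    """
    parsed = urlparse(https_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"not an HTTPS remote URL: {https_url!r}")
    path = parsed.path.lstrip("/")
    return f"git@{parsed.netloc}:{path}"


print(https_to_ssh_remote("https://github.com/wx20jjung/GSI.git"))
# prints: git@github.com:wx20jjung/GSI.git
```

The resulting URL is what one would pass to `git remote set-url origin <url>`. Pushing over SSH authenticates with the SSH key instead of the Personal Access Token, so the token's missing `workflow` scope no longer applies.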
Your branch is now up to date with |
Please update the working copy of your branch and ensure that everything looks correct. |
@wx20jjung , we can schedule this PR for merger into
|
Description
Fixes #428
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Checklist
DUE DATE for this PR is 10/12/2023. If this PR is not merged into develop by this date, the PR will be closed and returned to the developer.