-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRTM-fix differs from WCOSS2, CRTM produces copious errors in the GSI #963
Comments
These coefficient files should come from I checked the coefficient files in these github and tarred versions and found that in both cases the metop-c coefficient files are small (lacking the antenna correction information). So the WCOSS2 coefficient files appear inconsistent with the official release, but are in fact the correct ones to use. @Hang-Lei-NOAA worked with NCO to install the coefficient files on the WCOSS machines, so maybe he can explain the difference. My guess is that as this was supposed to only be an incremental change (only six files needed to be added) only these files were updated on WCOSS2 (which is definitely the safest way to manage this change). Ideally the CRTM github tag should be updated to ensure the correct files are present. Adding @BenjaminTJohnson who is the CRTM manager. |
There could be a naming issue here. On the CRTM side, we named the
coefficients that have ACCoeff information (antenna correction) as `*_v2*` --
when NCO took and installed these, they renamed them to without the v2,
leading to frequent confusion.
so `amsua_metop-c_v2.SpcCoeff.bin` from CRTM tarballs became
`amsua_metop-c.SpcCoeff.bin` in the NCO installations.
Also, it should be pointed out that `amsua_metop-c_v2.SpcCoeff.bin` (ACCoeff
with larger file size) from the above tarball is erroneously located in the
`Little_Endian` directory, it should be `Big_Endian`.
So let me know what you want the files to be named and I will fix these
issues for all future EMC releases.
…-Ben
|
@BenjaminTJohnson Thanks for your reply. So in the tarball and in the tag we have two
The As you say, on WCOSS, the file with the same cksum has been renamed to
The reason for So does The same argument applies to Tagging @emilyhcliu for information. |
@AlexanderRichert-NOAA @BenjaminTJohnson I believe I see the issue. In the spack recipe, the "little-endian"
Is it possible to replace this file in the current installations? |
As David pointed out, that should be the issue. The spack-stack should used the right CRTM tag for installation as we previously discussed with Alex. The little_endian file should be replaced by Big_endian file. |
@BenjaminTJohnson @Hang-Lei-NOAA @AlexanderRichert-NOAA @DavidHuber-NOAA This appears to be the solution. I am however very confused about why this renaming is being done rather than correcting the github tag/tarball itself. @BenjaminTJohnson Can you confirm what |
@ADCollard "v2" was just that version of that coefficient file -- it nothing to do with the internal antenna correction version. "v2" was used to separate from the non-corrected version. I'm more than happy to rename the files, just let me know what your preference is for the ACCoeff versions and the non-ACCoeff versions (presently |
Thanks, @BenjaminTJohnson, for clarifying the meaning of v2 here. It should be noted that this is different to the files for metop-a where v2 does mean version 2 of the AC coefficients. Is there any need to retain versions of the SpcCoeff files that have no AC coeffs? I would simply rename: Again, this should not be done for metop-a, where the v2 file actually reflects a second version of the AC coefficients that are being used at a handful of RARS stations. Tagging @emilyhcliu for awareness |
@ADCollard @emilyhcliu sure, no problem. Which prior release of CRTM do you want that change to be made? Right now I plan to update tag Also, any changes would be incorporated into the upcoming |
@BenjaminTJohnson If we are going to be creating new tags rather than just updating the current one (yes I know this is the correct practice) I think should not do this at this time to avoid further confusion. But for any future tags we should rename the files as described above. But I defer to @Hang-Lei-NOAA on how best to proceed from the EMC perspective. |
@climbfuji @AlexanderRichert-NOAA The easiest way to resolve this in each installation is to rename the existing 1.6.0
I have verified that the If this could be done on Orion/intel first, I can verify it works for the GSI before the other systems' intel and gnu installations are updated. |
@gmao-yzhu I see you are not part of this repository so will close this |
Thank you, Ben (Ruston).
Also, thanks Ben (Johnson) for generating the Little Endian data file for us.
Much appreciated!
Yanqiu
From: Benjamin Ruston ***@***.***>
Reply-To: JCSDA/spack-stack ***@***.***>
Date: Thursday, January 25, 2024 at 3:41 PM
To: JCSDA/spack-stack ***@***.***>
Cc: "Zhu, Yanqiu (GSFC-6101)" ***@***.***>, Mention ***@***.***>
Subject: [EXTERNAL] [BULK] Re: [JCSDA/spack-stack] CRTM-fix differs from WCOSS2, CRTM produces copious errors in the GSI (Issue #963)
CAUTION: This email originated from outside of NASA. Please take care when clicking links or opening attachments. Use the "Report Message" button to report suspicious messages to the NASA SOC.
@gmao-yzhu<https://github.com/gmao-yzhu> I see you are not part of this repository
but this will make the following issue a duplicate: JCSDA/crtm#46<JCSDA/crtm#46>
so will close this
—
Reply to this email directly, view it on GitHub<#963 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ARUMEBRWCQXRIV2AH2N5BETYQK7PFAVCNFSM6AAAAABCHQHE4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJQHE3DINZXGM>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
@AlexanderRichert-NOAA @climbfuji @ulmononian Just wanted to check in on this and see what you thought of my proposed solution. Also wondering when I can start testing on Orion. |
@DavidHuber-NOAA I just made the manual change on Orion in /work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.6.0/envs/unified-env/install/intel/2022.0.2/crtm-fix-2.4.0.1_emc-ezbyzji/fix only. |
@climbfuji Excellent, thank you! I will give that a try today. |
@climbfuji The regression tests were successful! Could the same fix be applied across all systems? I can provide specific commands for each system if needed. |
@DavidHuber-NOAA: @RatkoVasic-NOAA and i can handle this on the remainder of the rdhpcs intel/gnu installations. identical process on each machine, i.e., are the commands the same for each system (of course, with different paths to |
@ulmononian @RatkoVasic-NOAA Great, thank you. Yes, the same process will work on each system. |
Let's add a list of machines to the description of this issue. We need to do this for ALL environments and ALL compilers on each of the machines. It would also be good to backport the bug fix for spack to the release branch. |
when you say "all envs", do you mean this should extend to envs beyond unified env, e.g., gsi-addon? |
@DavidHuber-NOAA Do we need to apply this bug fix for |
@climbfuji You are correct. We do not need to do this in |
@DavidHuber-NOAA We are nearly done here. Someone needs to apply the fix for spack-stack-1.6.0 on the EPIC ParallelWorks systems (@ulmononian FYI), and once Nautilus is coming out of maintenance I will finish that last system on my end. |
Nautilus done - just need EPIC to finish ParallelWorks on their end to close this |
@climbfuji Fantastic, thank you. I have tested this on all GSI-RT-supported platforms (Hera, Jet, Orion, Hercules) and results are as expected. |
Any plan for S4? |
It's already there, see list at the top of this issue. |
@DavidHuber-NOAA @ulmononian @AlexanderRichert-NOAA I ported the bug fix in the PR for develop back to the release/1.6.0 branch of our spack fork, retagged, then updated the submodule pointer in our spack-stack release branch and retagged. I then decided to close out this issue - @ulmononian or someone else from EPIC is going to make the remaining changes in the EPIC ParallelWorks installations next week and tick the box in the issue description above. |
LIST OF FIXED INSTALLATIONS
FYI @climbfuji @RatkoVasic-NOAA @ulmononian @AlexanderRichert-NOAA
AWS Single-Node AMI RedHat 8SKIP (used by JCSDA only)JCSDA CI containersSKIP (used by JCSDA only)Describe the bug
The CRTM fix files seem to be incomplete for amsua_metop-c. When comparing the fix files against those on WCOSS2, the following size difference was noted:
Hera:
1252 Jan 4 21:17 amsua_metop-c.SpcCoeff.bin
WCOSS2:
12196 Nov 13 15:03 amsua_metop-c.SpcCoeff.bin
Possibly related, the GSI produces ~60K lines of errors when running the CRTM of
Remove_AntCorr(FAILURE) : Input iFOV inconsistent with AC data
Expected behavior
The fix files would match between WCOSS2 and the research systems.
System:
All RDHPCS machines and S4.
Additional context
Found while testing the GSI: NOAA-EMC/GSI#684.
The text was updated successfully, but these errors were encountered: