GSI fails to run using newer Intel compilers (i.e., Intel 2022) #447
Comments
This crash was replicated with the intel 2022.3.0 compiler after compiling a fresh hpc-stack. Enabling various debug options (e.g. Running this test with valgrind (with options Conditional jump or move locations:
Invalid read/write locations:
Uninitialized value created by a stack allocation:
Crash backtrace:
|
@NOAA-EMC/gsi-admins |
Thank you, @aerorahul , for reminding us of this known issue. Since we no longer have a GSI code manager I do not know who will work on this issue. Adding @dtkleist for awareness. |
@aerorahul @RussTreadon-NOAA I have reached out to Intel via Google to help look into this. Depending on suggestions from Intel, it may be beneficial to work with a developer. I'll keep you posted. |
Hi @DavidHuber-NOAA, @aerorahul and @RussTreadon-NOAA. We are trying to get a developer to work on this. Can we get a short summary of (1) which compilers it has been tested with, (2) which platform to test on (I would think Hera, but want to confirm), and (3) which test case to run? Would the GSI regression tests be enough for this? |
If the developer is interested, I have an hpc-stack build devoted to this purpose (built with 2022.3.0) located here on Hera: |
Some headway has been made on this issue. Building the hpc-stack suite with ifort and icc version 2022.3.0 and the GSI with icx and ifx version 2023.0.0 allowed some of the regression tests to run to completion, while those that still fail now do so in a different location. Attached are the logs for the global_3dvar_hiproc_updat (fails) and global_3dvar_loproc_updat (passes) tests. Note that icx and ifx version 2022.3.0 produce an internal compiler error (ICE) when compiling the GSI. |
As it stands, this issue is blocking the move to a unified spack environment and creating a lot of unnecessary work for the spack-stack team. Can we just lower the optimization for that one particular file for now, or put CPP pragmas around the code that prevent optimizing this section? |
@DavidHuber-NOAA @RussTreadon-NOAA Can you check if #533 fixes the problem? |
Tests on Orion and Hera with
global_3dvar ctest results were not altered by this change. More extensive testing is necessary to ensure no adverse impacts with other configurations. |
So this is solved? @RussTreadon-NOAA do you have a fork that we can test with the workflow? This looks like great news !! |
@arunchawla-NOAA , additional testing is necessary to ensure that (a) the change really works and (b) it does not alter analysis results. Ideally, one would build NOAA-EMC/GSI I shared my finding in hopes that it is useful to those actively working this issue. |
do you have a fork that can be tested ? |
@arunchawla-NOAA @RussTreadon-NOAA This looks great! I will give this a try with Intel 2022.3.0 and 2023.0.0 and run a full test suite. Since 2023 is desired, I should note that ifort and icc will soon be deprecated, so Intel support has suggested using ifx and icx instead. Using these compilers (version 2023.0.0) did allow some regression tests to pass (like global_3dvar_loproc_updat), but others are still failing due to MPI errors (global_3dvar_hiproc_updat). Attached is the output from the global_3dvar_hiproc_updat test. The lines producing the errors are stpjo.f90:303-309. Given that, I will try out Russ's fix with both mpiifort and ifx to see how they both perform. |
Intel's current position is to use icx+icpx+ifort for production, and icx+icpx+ifx for testing and debugging only. ifx is not yet ready for production. |
Noted, thanks @climbfuji |
@arunchawla-NOAA @RussTreadon-NOAA Using Intel version 2022.3.0, this workaround did prevent the crash in setuprad.f90, but unfortunately, I am still getting the crash in read_files.f90:
|
Not surprising. Thank you @DavidHuber-NOAA for running an intel/2022.3 test. |
Running with icx and mpiifort version 2023.0.0 with Russ's fix completes the global_3dvar_loproc_updat test, but crashes during the global_3dvar_hiproc_updat test as before. |
I have now tried running the RTs with -O0 optimization for all modules using Intel 2023.0.0, which still fails on the hiproc 3dvar test case. I believe this is an MPI bug and will report it to Intel as such. |
@DavidHuber-NOAA Here are a few ideas:
|
I spent some time moving/porting these tests (with restricted data replaced) over to S4 where I have substantially larger allocations. After doing so, I was able to fix an uninitialized variable bug in general_read_fv3atm.f90 and general_read_gfsatm.f90.
|
This is a mystery! Thanks for all your debugging efforts. |
I've run the RTs with a number of
After encountering this, I disabled |
I agree we can't merge this until we have a supported intel/2022 stack on
Cactus/Dogwood. We should make it clear we are stalled until and unless
this happens.
…On Mon, Aug 21, 2023 at 11:33 AM RussTreadon-NOAA ***@***.***> wrote:
@arunchawla-NOAA <https://github.com/arunchawla-NOAA> , the 91ca898
<hu5970@91ca898>
commit to hu5970:intel2022 is *not sufficient* for this PR to be merged into
develop.
We can build and run hu5970:intel2022 on Acorn using Mark's spack-stack
for either intel/19 or intel/2021. 91ca898
<hu5970@91ca898>
is the intel/19 build. Neither this nor the intel/2021 build results in all
9 ctests reproducing develop at bit level. Initial differences are found
at what appears to be round off level.
I don't expect the intel/2021 build to reproduce develop at bit level. I
would expect the intel/19 spack-stack to reproduce develop. Maybe the
intel/19 spack-stack library builds differ in terms of compiler options
with respect to hpc-stack. Another avenue to explore is that the Acorn
spack-stack build does not specify module versions for several libraries.
The spack-stack defaults may differ from the versions used in the develop
build.
The 91ca898
<hu5970@91ca898>
commit to hu5970:intel2022 is an important step forward but it is not the
final step. We can't merge this PR into develop until we can build and
run on Dogwood and Cactus.
--
George W Vandenberghe
*Lynker Technologies at * NOAA/NWS/NCEP/EMC
5830 University Research Ct., Rm. 2141
College Park, MD 20740
***@***.***
301-683-3769(work) 3017751547(cell)
|
Commit 06ed4d7 to
|
Acorn spack-stack test @arunchawla-NOAA requested that we test Since we have built and run First build attempt failed because Most ctests fail with netcdf error. For example,
Notice that build uses production netcdf
This is odd since
Further investigation is needed. |
@AlexanderRichert-NOAA can you please take a look here ?Thanks |
Acorn spack-stack test (continued) Comment out initial
in
Rerun
in build log. Seems there are some dependencies among loaded modules which force the load of Given successful build with new spack-stack, rerun ctests. All tests run to completion but 6 out of 9 tests fail
Examine failed cases. For all failed cases except one the initial total radiance penalties differ in the 12th, 13th or 14th digit depending on the test. The one exception is The radiance penalty differences are at round off level. The Bottom line: it is possible to build and run |
Thank you for all your help here Russ !!!! So my take home message is that if we build this spack stack on Dogwood and Cactus we will be ok to merge this PR back and start using spack stack. @AlexanderRichert-NOAA how different is this spack stack from the unified environment that is being built across all the other RDHPCS platforms ? Can we assume that this will work everywhere ? |
There are now multiple hpc-stacks available on acorn, cactus, and dogwood.
i=intel, ic=intel-classic, io=intel-oneapi. The io stack is still missing
hdf5 et al. (I haven't had time to investigate why yet)
/apps/test/hpc-stack/build/i-19.1.3.304__m-8.1.12__h-1.10.6__n-4.7.4__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/i-19.1.3.304__m-8.1.12__h-1.10.6__n-4.9.2__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/i-19.1.3.304__m-8.1.12__h-1.14.0__n-4.9.2__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/ic-2022.2.0.262__m-8.1.12__h-1.10.6__n-4.7.4__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/ic-2022.2.0.262__m-8.1.12__h-1.10.6__n-4.9.2__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/ic-2022.2.0.262__m-8.1.12__h-1.14.0__n-4.9.2__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/io-2022.2.0.262__m-8.1.12__h-1.10.6__n-4.7.4__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/io-2022.2.0.262__m-8.1.12__h-1.10.6__n-4.9.2__p-2.5.10__e-8.4.2/
/apps/test/hpc-stack/build/io-2022.2.0.262__m-8.1.12__h-1.14.0__n-4.9.2__p-2.5.10__e-8.4.2/
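For scripting against these stacks, the directory names can be split into their component fields. The following is a minimal Python sketch, not part of the hpc-stack tooling; `parse_stack_name` and `compiler_of` are hypothetical helper names. Only the compiler prefixes (`i`, `ic`, `io`) are interpreted, since those are the only keys spelled out above; the remaining keys are returned uninterpreted.

```python
# Compiler prefixes as described above; other field keys are left uninterpreted.
COMPILER_KEYS = {"i": "intel", "ic": "intel-classic", "io": "intel-oneapi"}


def parse_stack_name(path):
    """Split a stack directory name into a {key: version} dict."""
    # Keep only the last path component, ignoring any trailing slash.
    name = path.strip("/").split("/")[-1]
    fields = {}
    for field in name.split("__"):
        # Each field looks like "<key>-<version>", e.g. "h-1.10.6".
        key, _, version = field.partition("-")
        fields[key] = version
    return fields


def compiler_of(path):
    """Return (compiler family, version) for a stack directory name."""
    fields = parse_stack_name(path)
    for key, family in COMPILER_KEYS.items():
        if key in fields:
            return family, fields[key]
    raise ValueError(f"no known compiler prefix in {path!r}")
```

For example, `compiler_of` applied to the fourth path in the list would return the `intel-classic` family together with version `2022.2.0.262`.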
GSI has been built and run successfully with 'i' and 'ic' using both
hdf5/1.10.6 and hdf5/1.14.0 as follows:
Building with intel-classic/2022.2.0.262 using hdf5/1.10.6
Removing prior build/test from hu5970intel2022_ic_h-1.10.6
PULLING GSI BRANCH intel2022 from https://github.com/hu5970/GSI.git into
hu5970intel2022_ic_h-1.10.6 -- output redirected to git.output
checking out commit 06ed4d7
Using stack:
/apps/test/hpc-stack/ic-2022.2.0.262__m-8.1.12__h-1.10.6__n-4.9.2__p-2.5.10__e-8.4.2
STARTING BUILD OF GSI in hu5970intel2022_ic_h-1.10.6 -- output redirected
to build.output and build.output.overflow
Executable found. Good to go.
Cheating by copying the just-made executables into the check path.
RUNNING TESTS IN
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_ic_h-1.10.6/build
-- output dumped here for viewing
Test project
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_ic_h-1.10.6/build
1/9 Test #8: netcdf_fv3_regional .............. Passed 602.28 sec
2/9 Test #7: rrfs_3denvar_glbens .............. Passed 603.97 sec
3/9 Test #9: global_enkf ...................... Passed 666.78 sec
4/9 Test #2: global_4dvar ..................... Passed 1681.60 sec
5/9 Test #3: global_4denvar ...................***Failed 1741.74 sec
6/9 Test #4: hwrf_nmm_d2 ...................... Passed 1745.20 sec
7/9 Test #5: hwrf_nmm_d3 ...................... Passed 1869.95 sec
8/9 Test #1: global_3dvar .....................***Failed 2041.62 sec
9/9 Test #6: rtma ............................. Passed 2107.06 sec
Total Test time (real) = 2107.07 sec
Checking for segfaults in output files:No segfault found.
Building with intel-classic/2022.2.0.262 using hdf5/1.14.0
Removing prior build/test from hu5970intel2022_ic_h-1.14.0
PULLING GSI BRANCH intel2022 from https://github.com/hu5970/GSI.git into
hu5970intel2022_ic_h-1.14.0 -- output redirected to git.output
checking out commit 06ed4d7
Using stack:
/apps/test/hpc-stack/ic-2022.2.0.262__m-8.1.12__h-1.14.0__n-4.9.2__p-2.5.10__e-8.4.2
STARTING BUILD OF GSI in hu5970intel2022_ic_h-1.14.0 -- output redirected
to build.output and build.output.overflow
Executable found. Good to go.
Cheating by copying the just-made executables into the check path.
RUNNING TESTS IN
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_ic_h-1.14.0/build
-- output dumped here for viewing
Test project
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_ic_h-1.14.0/build
1/9 Test #7: rrfs_3denvar_glbens .............. Passed 603.98 sec
2/9 Test #8: netcdf_fv3_regional .............. Passed 662.27 sec
3/9 Test #9: global_enkf ...................... Passed 667.00 sec
4/9 Test #4: hwrf_nmm_d2 ...................... Passed 1445.17 sec
5/9 Test #6: rtma ............................. Passed 1687.17 sec
6/9 Test #5: hwrf_nmm_d3 ...................... Passed 1810.16 sec
7/9 Test #2: global_4dvar ..................... Passed 1981.72 sec
8/9 Test #3: global_4denvar ...................***Failed 2041.64 sec
9/9 Test #1: global_3dvar .....................***Failed 2101.69 sec
Total Test time (real) = 2101.69 sec
Checking for segfaults in output files:No segfault found.
Building with intel/19.1.3.304 using hdf5/1.10.6
Removing prior build/test from hu5970intel2022_i_h-1.10.6
PULLING GSI BRANCH intel2022 from https://github.com/hu5970/GSI.git into
hu5970intel2022_i_h-1.10.6 -- output redirected to git.output
checking out commit 06ed4d7
Using stack:
/apps/test/hpc-stack/i-19.1.3.304__m-8.1.12__h-1.10.6__n-4.9.2__p-2.5.10__e-8.4.2
STARTING BUILD OF GSI in hu5970intel2022_i_h-1.10.6 -- output redirected to
build.output and build.output.overflow
Executable found. Good to go.
Cheating by copying the just-made executables into the check path.
RUNNING TESTS IN
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_i_h-1.10.6/build
-- output dumped here for viewing
Test project
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_i_h-1.10.6/build
1/9 Test #8: netcdf_fv3_regional .............. Passed 602.25 sec
2/9 Test #7: rrfs_3denvar_glbens .............. Passed 603.97 sec
3/9 Test #9: global_enkf ...................... Passed 727.10 sec
4/9 Test #2: global_4dvar ..................... Passed 1681.47 sec
5/9 Test #3: global_4denvar ...................***Failed 1741.72 sec
6/9 Test #4: hwrf_nmm_d2 ...................... Passed 1745.41 sec
7/9 Test #5: hwrf_nmm_d3 ...................... Passed 1870.15 sec
8/9 Test #1: global_3dvar .....................***Failed 2041.72 sec
9/9 Test #6: rtma ............................. Passed 2107.03 sec
Total Test time (real) = 2107.04 sec
Checking for segfaults in output files:No segfault found.
Building with intel/19.1.3.304 using hdf5/1.14.0
Removing prior build/test from hu5970intel2022_i_h-1.14.0
PULLING GSI BRANCH intel2022 from https://github.com/hu5970/GSI.git into
hu5970intel2022_i_h-1.14.0 -- output redirected to git.output
checking out commit 06ed4d7
Using stack:
/apps/test/hpc-stack/i-19.1.3.304__m-8.1.12__h-1.14.0__n-4.9.2__p-2.5.10__e-8.4.2
STARTING BUILD OF GSI in hu5970intel2022_i_h-1.14.0 -- output redirected to
build.output and build.output.overflow
Executable found. Good to go.
Cheating by copying the just-made executables into the check path.
RUNNING TESTS IN
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_i_h-1.14.0/build
-- output dumped here for viewing
Test project
/lfs/h1/hpc/cstaff/Steven.Bongiovanni/gsi/testing/hu5970intel2022_i_h-1.14.0/build
1/9 Test #8: netcdf_fv3_regional .............. Passed 602.19 sec
2/9 Test #7: rrfs_3denvar_glbens .............. Passed 604.03 sec
3/9 Test #9: global_enkf ...................... Passed 667.29 sec
4/9 Test #3: global_4denvar ...................***Failed 1681.71 sec
5/9 Test #1: global_3dvar .....................***Failed 1741.56 sec
6/9 Test #5: hwrf_nmm_d3 ...................... Passed 1810.10 sec
7/9 Test #2: global_4dvar ..................... Passed 1921.55 sec
8/9 Test #4: hwrf_nmm_d2 ...................... Passed 1985.62 sec
9/9 Test #6: rtma ............................. Passed 2167.13 sec
Total Test time (real) = 2167.14 sec
Checking for segfaults in output files:No segfault found.
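The four runs above can be compared mechanically. What follows is a minimal Python sketch, assuming ctest result lines in the format shown in these logs; `summarize_ctest` and `failed_tests` are hypothetical helpers and not part of ctest or the GSI test scripts.

```python
import re

# Matches lines such as:
#   5/9 Test #3: global_4denvar ...................***Failed 1741.74 sec
CTEST_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#\d+:\s+(?P<name>\S+)\s+\.+\s*"
    r"(?P<status>\*\*\*Failed|Passed)\s+(?P<secs>[\d.]+)\s+sec"
)


def summarize_ctest(log_text):
    """Return {test name: True if passed, False if failed} from a ctest log."""
    results = {}
    for line in log_text.splitlines():
        match = CTEST_LINE.match(line)
        if match:
            results[match.group("name")] = match.group("status") == "Passed"
    return results


def failed_tests(log_text):
    """Return the sorted names of tests reported as failed."""
    return sorted(name for name, ok in summarize_ctest(log_text).items() if not ok)
```

Applied to each of the four logs above, this would report the same two failures, global_3dvar and global_4denvar, for every compiler and hdf5 combination.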
*Many thanks to Russ* for cleaning up the module files for wcoss as it
allowed these builds to run clean with just the following changes:
diff --git a/modulefiles/gsi_common_wcoss2.lua b/modulefiles/gsi_common_wcoss2.lua
index 1b2a88db..4c16df78 100644
--- a/modulefiles/gsi_common_wcoss2.lua
+++ b/modulefiles/gsi_common_wcoss2.lua
@@ -12,14 +12,13 @@
-local nemsio_ver=os.getenv("nemsio_ver") or "2.5.2"
+local nemsio_ver=os.getenv("nemsio_ver") or "2.5.4"
…-load(pathJoin("netcdf-c", netcdf_c_ver))
-load(pathJoin("netcdf-fortran", netcdf_fortran_ver))
+load(pathJoin("netcdf", netcdf_c_ver))
@@ -33,5 +32,5 @@
-load(pathJoin("gsi-ncdiag",ncdiag_ver))
+load(pathJoin("ncdiag",ncdiag_ver))
diff --git a/modulefiles/gsi_wcoss2.lua b/modulefiles/gsi_wcoss2.lua
index a225514e..63bc3cb4 100644
--- a/modulefiles/gsi_wcoss2.lua
+++ b/modulefiles/gsi_wcoss2.lua
@@ -3,14 +3,16 @@ help([[
load("PrgEnv-intel")
+load("envvar")
load("intel")
-prepend_path("MODULEPATH", "/lfs/h1/emc/nceplibs/noscrub/Mark.Potts/spack-stack/spack-stack-1.4.1/envs/unified-dev-19/install/modulefiles/Core")
-load("stack-intel")
-load("stack-cray-mpich")
+prepend_path("MODULEPATH", "/apps/test/hpc-stack/i-19.1.3.304__m-8.1.12__h-1.14.0__n-4.9.2__p-2.5.10__e-8.4.2/modulefiles/stack")
+load("hpc")
+load("hpc-intel")
+load("hpc-cray-mpich")
On Wed, Aug 23, 2023 at 9:28 AM RussTreadon-NOAA ***@***.***> wrote:
*Acorn spack-stack test (continued)*
Comment out initial
load("netcdf-c")
load("netcdf-fortran")
in gsi_common.lua and add the following to the end of the file
unload("netcdf/4.7.4")
load("netcdf-c")
load("netcdf-fortran")
Rerun build.sh. This time see
-- FindNetCDF defines targets:
-- - NetCDF_VERSION [4.9.2]
-- - NetCDF_PARALLEL [TRUE]
-- - NetCDF_C_CONFIG_EXECUTABLE [/lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/intel/2022.0.2.262/netcdf-c-4.9.2-lnnrgek/bin/nc-config]
-- - NetCDF::NetCDF_C [SHARED] [Root: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/intel/2022.0.2.262/netcdf-c-4.9.2-lnnrgek] Lib: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/intel/2022.0.2.262/netcdf-c-4.9.2-lnnrgek/lib/libnetcdf.so
-- - NetCDF_Fortran_CONFIG_EXECUTABLE [/lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/intel/2022.0.2.262/netcdf-fortran-4.6.0-hzledc6/bin/nf-config]
-- - NetCDF::NetCDF_Fortran [SHARED] [Root: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/intel/2022.0.2.262/netcdf-fortran-4.6.0-hzledc6] Lib: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/intel/2022.0.2.262/netcdf-fortran-4.6.0-hzledc6/lib/libnetcdff.so
in build log. Seems there are some dependencies among loaded modules which
force the load of netcdf/4.7.4. The gsi_common.lua used to build with
Mark's spack-stack does not specify versions for most modules. Adding
version numbers back to gsi_common.lua could help sort out what's going
on.
Given successful build with new spack-stack, rerun ctests. All tests run
to completion but 6 out of 9 tests fail
***@***.***:/lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr571_alex/build> ctest -j 9
Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr571_alex/build
Start 1: global_3dvar
Start 2: global_4dvar
Start 3: global_4denvar
Start 4: hwrf_nmm_d2
Start 5: hwrf_nmm_d3
Start 6: rtma
Start 7: rrfs_3denvar_glbens
Start 8: netcdf_fv3_regional
Start 9: global_enkf
1/9 Test #8: netcdf_fv3_regional ..............***Failed 542.23 sec
2/9 Test #7: rrfs_3denvar_glbens .............. Passed 664.38 sec
3/9 Test #9: global_enkf ...................... Passed 1087.75 sec
4/9 Test #5: hwrf_nmm_d3 ......................***Failed 1511.27 sec
5/9 Test #2: global_4dvar .....................***Failed 1861.54 sec
6/9 Test #1: global_3dvar .....................***Failed 1861.66 sec
7/9 Test #4: hwrf_nmm_d2 ......................***Failed 2045.72 sec
8/9 Test #3: global_4denvar ...................***Failed 2221.85 sec
9/9 Test #6: rtma ............................. Passed 2288.61 sec
33% tests passed, 6 tests failed out of 9
Total Test time (real) = 2288.62 sec
The following tests FAILED:
1 - global_3dvar (Failed)
2 - global_4dvar (Failed)
3 - global_4denvar (Failed)
4 - hwrf_nmm_d2 (Failed)
5 - hwrf_nmm_d3 (Failed)
8 - netcdf_fv3_regional (Failed)
Errors while running CTest
Examine failed cases. For all failed cases except one the initial total
radiance penalties differ in the 12th, 13th or 14th digit depending on the
test. The one exception is hwrf_nmm_d3. This test does not assimilate
radiances. The total initial penalties are identical between the control
and update. Differences in the hwrf_nmm_d3 test first show up in the
initial gradients. The initial gradients differ in the 8th digit.
The radiance penalty differences are at round off level. The hwrf_nmm_d3
gradient differences are larger but may still reflect numerical round off
differences. The control executables are built with intel19. The update
executables are built with intel2021.
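To make "differ in the Nth digit" concrete when comparing control and update values, the comparison can be done on the printed decimal digits. This is a minimal Python sketch under stated assumptions: `first_differing_digit` is a hypothetical helper, the values are compared as double-precision numbers printed to 15 significant digits, and nothing here is taken from the GSI regression scripts.

```python
def first_differing_digit(a, b, digits=15):
    """Return the 1-based position of the first significant decimal digit
    at which a and b differ, or None if they agree to `digits` digits."""
    fmt = f".{digits - 1}e"
    mant_a, exp_a = format(a, fmt).split("e")
    mant_b, exp_b = format(b, fmt).split("e")
    # A different sign or decimal exponent means the printed values
    # already differ at the leading digit.
    if exp_a != exp_b or mant_a.startswith("-") != mant_b.startswith("-"):
        return 1
    digits_a = mant_a.lstrip("-").replace(".", "")
    digits_b = mant_b.lstrip("-").replace(".", "")
    for position, (da, db) in enumerate(zip(digits_a, digits_b), start=1):
        if da != db:
            return position
    return None
```

Because this is a textual comparison, two values that straddle a digit boundary (for example 1.9999 and 2.0000) report an early position even though they are numerically close, so the result is a guide rather than an error bound.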
*Bottom line:* it is possible to build and run hu5970:intel2022 using
modules from
/lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.4.1/envs/unified-env-intel22/install/modulefiles
|
@bongi-NOAA thanks for this test!! If there are 2022 hpc-stacks available on Dogwood / Cactus then we should just update the modules to use one of the 2022 stacks and proceed with this PR. We can switch to spack-stack once it is installed by NCO |
@arunchawla-NOAA re: the spack-stack unified environment, I would expect it to be the same as 1.4.1 on all our other platforms (same package versions/build options). |
CRTM coefficient problem The crtm/2.4.0 coefficients associated with The spack-stack The production The spack-stack CRTM_FIX contains 1706 files. The production CRTM_FIX contains 1410 files. A diff of the two directories finds 445 $sensor.TauCoeff.bin files differ. The ctests reported above bypassed this problem by explicitly pointing at CRTM coefficients in We need to get the correct CRTM coefficients in the spack-stack CRTM_FIX. |
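A comparison of this kind can be reproduced with a short script. The following is a minimal Python sketch, assuming each CRTM_FIX location is a flat directory of files; `compare_fix_dirs` is a hypothetical helper and the directory arguments stand in for the two CRTM_FIX paths, which are not reproduced here.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_fix_dirs(dir_a, dir_b):
    """Compare two flat directories by file name and content."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    names_a = {p.name for p in dir_a.iterdir() if p.is_file()}
    names_b = {p.name for p in dir_b.iterdir() if p.is_file()}
    # Files present in both directories whose contents differ.
    differing = sorted(
        name
        for name in names_a & names_b
        if file_digest(dir_a / name) != file_digest(dir_b / name)
    )
    return {
        "only_in_a": sorted(names_a - names_b),
        "only_in_b": sorted(names_b - names_a),
        "differing": differing,
    }
```

Filtering the `differing` list for names ending in `.TauCoeff.bin` would give the kind of count quoted above, and the `only_in_*` lists would account for the difference in file totals.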
Noted. I'll look into this. Created JCSDA/spack-stack#735 |
@RussTreadon-NOAA it appears that the file ftp://ftp.ssec.wisc.edu/pub/s4/CRTM/fix_REL-2.4.0_emc.tgz, which is where spack-stack and hpc-stack get the fix files, has changed a couple of times in the last year or so. I don't know why they changed, but in any case I'm fairly certain that if the hpc-stack version were reinstalled, it would look like the current spack-stack one. |
@RussTreadon-NOAA wow that was a great catch! Who is managing these fix files ? |
I @'d the CRTM manager (Ben Johnson) in JCSDA/spack-stack#735 |
Acorn spack-stack questions While the spack-stack in
after
Why are these lines added?
|
I can speak to some of these items: 1- My guess here is that it's to avoid production packages, but that's just a guess. |
I'm fine with I don't know what modifications to make to |
I have asked NCO to create a spack-stack for us in an experimental space. That will work while we get issues ironed out |
…s for updat and contrl ctest jobs (NOAA-EMC#447)
The GSI and EnKF compile without issue using Intel 2022 and the hpc-stack-built Intel 2022 libraries. However, jobs that run the executable built with Intel 2022 fail:
Looking at the output:
From the traceback:
Line 555 in src/gsi/setuprad.f90 is:
GSI/src/gsi/setuprad.f90
Line 555 in e23204e
With either the code compiled at -O0 (no optimization) or iuse_rad and predx written out in setuprad, the same test runs to completion without a segmentation fault.
To compile using Intel 2022:
local hpc_intel_ver=os.getenv("hpc_intel_ver") or "18.0.5.274"
local hpc_intel_ver=os.getenv("hpc_intel_ver") or "2022.1.2"
local hpc_impi_ver=os.getenv("hpc_impi_ver") or "2018.0.4"
local hpc_impi_ver=os.getenv("hpc_impi_ver") or "2022.1.2"
local w3nco_ver=os.getenv("w3nco_ver") or "2.4.1"
load(pathJoin("w3nco", w3nco_ver))