Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support IntelLLVM compiler #2224

Merged
merged 49 commits into from
Oct 1, 2024

Conversation

DusanJovic-NOAA
Copy link
Collaborator

@DusanJovic-NOAA DusanJovic-NOAA commented Apr 3, 2024

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

Add preliminary support for Intel LLVM compiler. Added new top-level cmake/IntelLLVM.cmake configuration file. Update CMakeLists.txt in various submodules to recognize IntelLLVM as supported compiler and add Intel LLVM compiler flags.

Useful porting guide: https://www.intel.com/content/www/us/en/developer/articles/guide/porting-guide-for-ifort-to-ifx.html

At this moment I only tried to build the model on Hercules.

In order to use new compiler you need to set the following three environment variables:

export I_MPI_CC=icx
export I_MPI_CXX=icpx
export I_MPI_F90=ifx

Commit Message:

* UFSWM -  Add support for Intel LLVM compiler
  * CICE -  Add support for Intel LLVM compiler
  * FV3 -  Add support for Intel LLVM compiler
    * ccpp-physics -  Add support for Intel LLVM compiler
    * atmos_cubed_sphere -  Add support for Intel LLVM compiler 
    * upp -  Add support for Intel LLVM compiler 
  * WW3 - Add support for Intel LLVM compiler

Priority:

  • Normal

Git Tracking

UFSWM:

Sub component Pull Requests:

UFSWM Blocking Dependencies:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@DusanJovic-NOAA DusanJovic-NOAA marked this pull request as draft April 3, 2024 15:37
@BrianCurtis-NOAA
Copy link
Collaborator

Is this something for only Hercules, or will other systems be implementing IntelLLVM ?

@DusanJovic-NOAA
Copy link
Collaborator Author

DusanJovic-NOAA commented Apr 3, 2024

Is this something for only Hercules, or will other systems be implementing IntelLLVM ?

Eventually we must switch to IntelLLVM on all platforms. When, I don't know. But classic Intel compilers are now deprecated, see:
https://www.intel.com/content/www/us/en/developer/articles/release-notes/oneapi-fortran-compiler-release-notes.html

NOTE: Intel® Fortran Compiler Classic (ifort) is now deprecated and will be discontinued in late 2024. Intel recommends that customers transition now to using the LLVM-based Intel® Fortran Compiler (ifx) for continued Windows* and Linux* support, new language support, new language features, and optimizations. For more information on ifx, see the Intel® Fortran Compiler Developer Guide and Reference and the Porting Guide for ifort Users to ifx.

@DusanJovic-NOAA DusanJovic-NOAA changed the title [DRAFT for testing] Support IntelLLMV compiler [DRAFT for testing] Support IntelLLVM compiler Apr 4, 2024
@DusanJovic-NOAA
Copy link
Collaborator Author

Compilation with Intel LLVM on Hera fails with this error:

/scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/ufs/support_intelllvm/ufs-weather-model/MOM6-interface/
MOM6/src/framework/MOM_unit_testing.F90(128): error #5533: Feature found on this line is not 
yet supported in ifx                                                                         
  test%proc => proc
--^
compilation aborted for /scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/ufs/support_intelllvm/ufs-weath
er-model/MOM6-interface/MOM6/src/framework/MOM_unit_testing.F90 (code 3)
make[2]: *** [MOM6-interface/CMakeFiles/mom6_obj.dir/build.make:1765: MOM6-interface/CMakeFil
es/mom6_obj.dir/MOM6/src/framework/MOM_unit_testing.F90.o] Error 3

@DusanJovic-NOAA
Copy link
Collaborator Author

All coupled debug tests fail on Hercules with this error:

179: forrtl: severe (408): fort: (3): Subscript #3 of the array UVEL has value 0 which is less than the lower bound of 1
179: 
179: Image              PC                Routine            Line        Source             
179: ufs_model          000000000B5A1262  accum_hist               2318  ice_history.F90
179: ufs_model          000000000B4BFB6F  initializerealize         835  ice_comp_nuopc.F90
179: ufs_model          0000000000B94024  Unknown               Unknown  Unknown
179: ufs_model          0000000000B97F9F  Unknown               Unknown  Unknown
179: ufs_model          0000000000D0EBC7  Unknown               Unknown  Unknown
179: ufs_model          0000000000CFB022  Unknown               Unknown  Unknown

This is looks like a bug in the ifx.

@DeniseWorthen
Copy link
Collaborator

@NickSzapiro-NOAA Would you please take a look at what is happening here w/ CICE? You'll need to get from @DusanJovic-NOAA build instructions and maybe a sandbox run directory.

@NickSzapiro-NOAA
Copy link
Collaborator

Of course, @DeniseWorthen @DusanJovic-NOAA. Please let me know if there are any special instructions needed to reproduce. On first pass, it's curious that the ice speed may be the first call to accum_hist with an array that was not allocated:
https://github.com/NOAA-EMC/CICE/blob/emc/develop/cicecore/cicedyn/analysis/ice_history.F90#L2317-L2320

@DusanJovic-NOAA
Copy link
Collaborator Author

I was able to create a small test program that reproduces an error identical to the one we see in CICE. Dom (@climbfuji) confirmed that it works with the latest Intel LLVM compiler (ifx (IFX) 2024.1.0 20240308), so it is probably just a bug in the older versions.

Version on Hercules is (ifx (IFX) 2023.1.0 20230320). On Hera it's even older (ifx (IFORT) 2022.0.0 20211123), and that version can not even compile the code with MOM6. See above error.

We need to find a machine (NOAA R&D) on which we can get the latest version of Intel LLVM compiler (2024.1.0) installed.

@climbfuji
Copy link
Collaborator

I can install it on Hercules in the EPIC spack-stack space, but we still need to manage building all dependencies with it.

@junwang-noaa
Copy link
Collaborator

If it is confirmed that the latest Intel LLVM compiler fixes the issue, let's ask sysadmin to install the latest version.

@DusanJovic-NOAA
Copy link
Collaborator Author

I can install it on Hercules in the EPIC spack-stack space, but we still need to manage building all dependencies with it.

Eventually yes, but for now I think we can still build all libraries (spack-stack) with ifort (but the new version 2024) and only the model with ifx. Object files and module files should be binary compatible between ifort and ifx.

According to Intel:
... Binaries and libraries generated with ifort can be linked with binaries and libraries built with ifx, and .mod files generated with one compiler can be used by the other (64-bit targets only). Both compilers use the the same runtime libraries. ...

@climbfuji
Copy link
Collaborator

Except that there's no more icc anc icpc, so at least for those we have to switch to icx and icpx. It should be fairly straightforward to build just the dependencies for the ufs-weather-model with ifort and/or ifx and get all the other packages compiled in a second step.

@NickSzapiro-NOAA
Copy link
Collaborator

@DusanJovic-NOAA
Copy link
Collaborator Author

DusanJovic-NOAA commented Apr 5, 2024

@DusanJovic-NOAA Does filling worka first (https://github.com/NOAA-EMC/CICE/blob/emc/develop/cicecore/cicedyn/analysis/ice_history.F90#L2321-L2331) give a temporary fix?

I think it should. Looks like having an array expression (that depends on iblk) as an actual subroutine argument is what fails.

@DusanJovic-NOAA
Copy link
Collaborator Author

DusanJovic-NOAA commented Jul 19, 2024

Currently debug builds fail (on Hera) with the following error:

[100%] Linking Fortran executable ufs_model
ld: cannot find -lpioc
ld: attempted static link of dynamic object `/apps/oneapi/compiler/2023.2.0/linux/compiler/lib/intel64_lin/libiomp5.so'
make[2]: *** [CMakeFiles/ufs_model.dir/build.make:121: ufs_model] Error 1
make[1]: *** [CMakeFiles/Makefile2:328: CMakeFiles/ufs_model.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

That's why all debug tests are currently disabled in rt_intelllvm.conf.

All non-debug tests successfully compile, execute and reproduce it's own outputs, ie:

./rt.sh -l rt_intelllvm.conf -e -c
./rt.sh -l rt_intelllvm.conf -e -m  

is successful.

@NickSzapiro-NOAA
Copy link
Collaborator

The CICE #ifndef LLVM PR was merged at CICE-Consortium. Here is a new EMC/CICE branch (https://github.com/NickSzapiro-NOAA/CICE/tree/cice_llvm) and PR (NOAA-EMC/CICE#93) for use in this PR.

It is the same lines of code for us in CICE and cleaner git history. If can use, there is no need to repeat testing.

@jkbk2004
Copy link
Collaborator

jkbk2004 commented Sep 30, 2024

@NickSzapiro-NOAA @BrianCurtis-NOAA @zach1221 @FernandoAndrade-NOAA We will start merging process for this pr.

@zach1221
Copy link
Collaborator

@NickSzapiro-NOAA @BrianCurtis-NOAA @zach1221 @FernandoAndrade-NOAA We will start merging process for this pr.

Should we wait for ORTs or no? Fernando and I have been trying to run on Hera, but the jobs have sat in pending.

@BrianCurtis-NOAA
Copy link
Collaborator

@NickSzapiro-NOAA @BrianCurtis-NOAA @zach1221 @FernandoAndrade-NOAA We will start merging process for this pr.

Should we wait for ORTs or no? Fernando and I have been trying to run on Hera, but the jobs have sat in pending.

I think it's OK to skip those here.

@NickSzapiro-NOAA
Copy link
Collaborator

For CICE, is it ok to change to this PR (NOAA-EMC/CICE#93) and branch (https://github.com/NickSzapiro-NOAA/CICE/tree/cice_llvm) instead?

@jkbk2004
Copy link
Collaborator

For CICE, is it ok to change to this PR (NOAA-EMC/CICE#93) and branch (https://github.com/NickSzapiro-NOAA/CICE/tree/cice_llvm) instead?

@NickSzapiro-NOAA I don't want to keep pushing code changes to this pr since all tests are done. You can bring up changes to next pr #2437.

@grantfirl
Copy link
Collaborator

@NickSzapiro-NOAA @BrianCurtis-NOAA @zach1221 @FernandoAndrade-NOAA We will start merging process for this pr.

@jkbk2004 Shall I go ahead and merge ufs-community/ccpp-physics#220 to start?

@jkbk2004
Copy link
Collaborator

@NickSzapiro-NOAA @BrianCurtis-NOAA @zach1221 @FernandoAndrade-NOAA We will start merging process for this pr.

@jkbk2004 Shall I go ahead and merge ufs-community/ccpp-physics#220 to start?

@grantfirl give us about two hours more. @NickSzapiro-NOAA wants to switch around cice feature branches. but no actual code base changes.

@NickSzapiro-NOAA
Copy link
Collaborator

NickSzapiro-NOAA commented Sep 30, 2024

Apologies for the last minute changes but commit of cice_llvm branch to EMC/CICE instead seems preferable in the interests of cleaner git history. Just to be explicit, we're talking about the same 4 narrow code changes in cicecore/cicedyn/analysis/ice_history.F90 and cicecore/drivers/nuopc/cmeps/ice_prescribed_mod.F90 in
NOAA-EMC/CICE@develop...NickSzapiro-NOAA:CICE:sync_cice
vs
NOAA-EMC/CICE@develop...NickSzapiro-NOAA:CICE:cice_llvm

I just tried to make a sanity check of cpld_control_gfsv17 on hercules with cice_llvm branch. Compile is successful but Disk quota is full today so run stops. @jkbk2004 @BrianCurtis-NOAA I don't want to delay commit queue but not sure what to do. It is the same lines of code that have already been tested...

If switch is ok, PR closes CICE issue NOAA-EMC/CICE#93 instead

Update: cpld_control_gfsv17 passed b4b on hercules with cice_llvm branch @jkbk2004 @BrianCurtis-NOAA

@jkbk2004
Copy link
Collaborator

@NickSzapiro-NOAA Let's go with NOAA-EMC/CICE#93. Please, go ahead to merge the CICE pr. @grantfirl Please, go ahead to merge ufs-community/ccpp-physics#220

@jkbk2004
Copy link
Collaborator

@FernandoAndrade-NOAA @gspetro-NOAA will you merge NOAA-EMC/UPP#1001?

@grantfirl
Copy link
Collaborator

@NickSzapiro-NOAA Let's go with NOAA-EMC/CICE#93. Please, go ahead to merge the CICE pr. @grantfirl Please, go ahead to merge ufs-community/ccpp-physics#220

@jkbk2004 Done

@NickSzapiro-NOAA
Copy link
Collaborator

CICE is merged too

@MatthewMasarik-NOAA
Copy link
Collaborator

@jkbk2004, WW3 is merged.

@WenMeng-NOAA
Copy link
Contributor

@FernandoAndrade-NOAA @gspetro-NOAA will you merge NOAA-EMC/UPP#1001?

@jkbk2004 @DusanJovic-NOAA UPP is merged.

@jkbk2004 jkbk2004 merged commit bad50ef into ufs-community:develop Oct 1, 2024
3 checks passed
@DusanJovic-NOAA DusanJovic-NOAA deleted the support_intelllvm branch October 8, 2024 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
No Baseline Change No Baseline Change Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrade to Intel LLVM Compilers