{2023.06}[foss/2021a] WRF-dmpar V4.3 #290

TopRichard
Collaborator

No description provided.

@eessi-bot

eessi-bot bot commented Jun 29, 2023

Instance eessi-bot-citc-aws is configured to build:

  • arch x86_64/generic for repo eessi-2021.12
  • arch x86_64/generic for repo eessi-2023.06-compat
  • arch x86_64/generic for repo eessi-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-2021.12
  • arch x86_64/intel/haswell for repo eessi-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-2021.12
  • arch x86_64/intel/skylake_avx512 for repo eessi-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-2021.12
  • arch x86_64/amd/zen2 for repo eessi-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-2021.12
  • arch x86_64/amd/zen3 for repo eessi-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-2023.06-software
  • arch aarch64/generic for repo eessi-2021.12
  • arch aarch64/generic for repo eessi-2023.06-compat
  • arch aarch64/generic for repo eessi-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-2021.12
  • arch aarch64/neoverse_n1 for repo eessi-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-2021.12
  • arch aarch64/neoverse_v1 for repo eessi-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-2023.06-software

@TopRichard
Collaborator Author

bot: build repo:eessi-2023.06-software arch:aarch64/generic

@eessi-bot

eessi-bot bot commented Jun 29, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:aarch64/generic from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/generic
  • handling command build repository:eessi-2023.06-software architecture:aarch64/generic resulted in:

@eessi-bot

eessi-bot bot commented Jun 29, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.06/pr_290/5666

date job status comment
Jun 29 07:00:14 UTC 2023 submitted job id 5666 awaits release by job manager
Jun 29 07:01:10 UTC 2023 released job awaits launch by Slurm scheduler
Jun 29 07:05:12 UTC 2023 running job 5666 is running
Jun 29 07:49:59 UTC 2023 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-5666.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-generic-1688024937.tar.gz
size: 10 MiB (11156920 bytes)
entries: 759
modules under 2023.06/software/linux/aarch64/generic/modules/all
Bison/3.7.6-GCCcore-10.3.0.lua
Doxygen/1.9.1-GCCcore-10.3.0.lua
JasPer/2.0.28-GCCcore-10.3.0.lua
libiconv/1.16-GCCcore-10.3.0.lua
netCDF/4.8.0-gompi-2021a.lua
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
software under 2023.06/software/linux/aarch64/generic/software
Bison/3.7.6-GCCcore-10.3.0
Doxygen/1.9.1-GCCcore-10.3.0
JasPer/2.0.28-GCCcore-10.3.0
libiconv/1.16-GCCcore-10.3.0
netCDF/4.8.0-gompi-2021a
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
other under 2023.06/software/linux/aarch64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp
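
The checklist in the bot report above reads like a plain pattern scan over the job output file. A minimal sketch of that logic, assuming a job counts as successful only when none of the three error patterns occur and both expected messages are present. The function and constant names are hypothetical, and this is not the bot's actual implementation:

```python
# Patterns copied from the checklist in the bot report.
# Assumption: the first group must be absent, the second group must be present.
MUST_BE_ABSENT = ["ERROR:", "FAILED:", "required modules missing:"]
MUST_BE_PRESENT = ["No missing installations", ".tar.gz created!"]


def check_job_output(text):
    """Return (status, details) for the text of a job output file.

    status is "SUCCESS" or "FAILURE"; details is a list of
    (pattern, found, ok) tuples, one per pattern checked.
    """
    details = []
    for pattern in MUST_BE_ABSENT:
        found = pattern in text
        details.append((pattern, found, not found))
    for pattern in MUST_BE_PRESENT:
        found = pattern in text
        details.append((pattern, found, found))
    status = "SUCCESS" if all(ok for _, _, ok in details) else "FAILURE"
    return status, details


if __name__ == "__main__":
    sample = (
        "== FAILED: Installation ended unsuccessfully\n"
        "No missing installations\n"
        "example.tar.gz created!\n"
    )
    status, details = check_job_output(sample)
    print(status)
    for pattern, found, ok in details:
        print("ok " if ok else "BAD", repr(pattern), "found" if found else "not found")
```

With a scan like this, a single `FAILED:` line is enough to flag the job even though a tarball was still created, which matches the mixed checklist in the report for job 5666.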

@boegel
Contributor

boegel commented Jul 8, 2023

Installation failed in configure step, details below.

Looks like our pre-configure hook for WRF needs work?

== configuring...
== Running pre-configure hook...
== Using custom preconfigopts for WRF:  sed -i 's/Linux x86_64 ppc64le, gfortran/Linux x86_64 aarch64 ppc64le, gfortran/g' arch/configure_new.defaults &&
  >> running interactive command:
	[started at: 2023-06-29 07:47:46]
	[working dir: /cvmfs/pilot.eessi-hpc.org/versions/2023.06/software/linux/aarch64/generic/software/WRF/4.3-foss-2021a-dmpar/WRF-4.3]
	[output logged in /tmp/eb-dxromz72/eb-a8u9yx8o/eb-yad5i647/eb-vayxs19v/eb-p6cdsroq/eb-hoeqxajd/eb-b4whiekr/eb-8z9yjdda/easybuild-run_cmd_qa-3x6bwut9.log]
	sed -i 's/Linux x86_64 ppc64le, gfortran/Linux x86_64 aarch64 ppc64le, gfortran/g' arch/configure_new.defaults &&   ./configure
  >> interactive command completed: exit 2, ran in 00h00m01s
== ... (took 1 secs)
== FAILED: Installation ended unsuccessfully (build directory: /cvmfs/pilot.eessi-hpc.org/versions/2023.06/software/linux/aarch64/generic/software/WRF/4.3-foss-2021a-dmpar): build failed (first 300 chars): cmd " sed -i 's/Linux x86_64 ppc64le, gfortran/Linux x86_64 aarch64 ppc64le, gfortran/g' arch/configure_new.defaults &&   ./configure " exited with exit code 2 and output:
sed: can't read arch/configure_new.defaults: No such file or directory
 (took 36 secs)

@bedroge
Collaborator

bedroge commented Jul 10, 2023

Just had a quick look at the contents of the source tarball: apparently, version 3.9.x had arch/configure_{old,new}.defaults, while version 4.3 only has an arch/configure.defaults. The latter already includes the following:

#ARCH    Linux i486 i586 i686 armv7l aarch64, gfortran compiler with gcc #serial smpar dmpar dm+sm

So I guess the hook is no longer required for newer versions, and we should add a version check to it.

@bedroge
Collaborator

bedroge commented Jul 10, 2023

Checked some more versions:

  • 3.9 has these old/new files
  • 4.0 up to and including 4.2.1 just have configure.defaults, but no support for aarch64 yet
  • 4.2.2 and newer do have aarch64 listed in configure.defaults

So the hook would have to check for these three version ranges and do the corresponding action.
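
A minimal sketch of such a three-way version check, written as a standalone helper rather than as the actual hook in eb_hooks.py. The helper names are hypothetical. The `sed` substitution is the one shown in the build log above, and applying the same substitution to `arch/configure.defaults` for 4.0 through 4.2.1 is an assumption that would need to be checked against those tarballs:

```python
from typing import Optional, Tuple

# Substitution copied from the preconfigopts shown in the build log.
PATTERN = "Linux x86_64 ppc64le, gfortran"
REPLACEMENT = "Linux x86_64 aarch64 ppc64le, gfortran"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.2.1' into (4, 2, 1); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def wrf_aarch64_preconfigopts(version: str) -> Optional[str]:
    """Return a preconfigopts prefix for WRF on aarch64, or None if no patch is needed."""
    v = parse_version(version)
    if (3, 9) <= v < (4, 0):
        # 3.9.x ships arch/configure_{old,new}.defaults
        target = "arch/configure_new.defaults"
    elif (4, 0) <= v <= (4, 2, 1):
        # 4.0 up to 4.2.1: single configure.defaults without aarch64.
        # Assumption: the same substitution applies to this file.
        target = "arch/configure.defaults"
    else:
        # 4.2.2 and newer already list aarch64. Versions older than 3.9
        # were not checked in this thread and are left untouched here.
        return None
    return "sed -i 's/%s/%s/g' %s && " % (PATTERN, REPLACEMENT, target)


if __name__ == "__main__":
    for version in ["3.9.1.1", "4.0", "4.2.1", "4.2.2", "4.3"]:
        print(version, "->", wrf_aarch64_preconfigopts(version))
```

In a real hook the returned string would only be prepended to `preconfigopts` when building on aarch64. Plain tuple comparison is used here to keep the sketch self-contained.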

@TopRichard TopRichard force-pushed the eessi-2023.06-WRF-dmpar/4.3-foss/2021a branch from 59af092 to 3ab61d3 Compare September 18, 2023 14:01
@TopRichard
Collaborator Author

bot: build repo:eessi-2023.06-software arch:aarch64/generic

@eessi-bot

eessi-bot bot commented Sep 18, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:aarch64/generic from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/generic
  • handling command build repository:eessi-2023.06-software architecture:aarch64/generic resulted in:

@eessi-bot

eessi-bot bot commented Sep 18, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7402

date job status comment
Sep 18 14:05:03 UTC 2023 submitted job id 7402 awaits release by job manager
Sep 18 14:05:46 UTC 2023 released job awaits launch by Slurm scheduler
Sep 18 14:09:49 UTC 2023 running job 7402 is running
Sep 18 17:13:43 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7402.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-generic-1695057099.tar.gz
size: 640 MiB (671663018 bytes)
entries: 7432
modules under 2023.06/software/linux/aarch64/generic/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/aarch64/generic/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/aarch64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eb_hooks.py: outdated review thread (resolved)
@TopRichard TopRichard force-pushed the eessi-2023.06-WRF-dmpar/4.3-foss/2021a branch from f2d8bc6 to 7cc4a1b Compare September 18, 2023 16:12
@TopRichard
Collaborator Author

bot: build repo:eessi-2023.06-software arch:x86_64/generic
bot: build repo:eessi-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi-2023.06-software arch:x86_64/intel/skylake_avx512
bot: build repo:eessi-2023.06-software arch:x86_64/amd/zen3
bot: build repo:eessi-2023.06-software arch:x86_64/amd/zen2
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_v1
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_n1
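
Each of these lines uses the short `repo:` and `arch:` keys, which the bot's replies in this thread echo back in an "expanded format" with `repository:` and `architecture:`. A minimal sketch of that expansion, assuming only the two aliases visible in this thread; the function name is hypothetical and this is not the bot's own parser:

```python
# Short key -> long key, as read off the bot's "expanded format" lines.
# Assumption: only these two aliases exist.
KEY_ALIASES = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(line):
    """Expand 'bot: build repo:X arch:Y' to 'build repository:X architecture:Y'.

    Returns None if the line does not start with the 'bot:' prefix.
    """
    prefix = "bot:"
    if not line.startswith(prefix):
        return None
    expanded = []
    for word in line[len(prefix):].split():
        key, sep, value = word.partition(":")
        if sep and key in KEY_ALIASES:
            word = KEY_ALIASES[key] + sep + value
        expanded.append(word)
    return " ".join(expanded)


if __name__ == "__main__":
    print(expand_bot_command("bot: build repo:eessi-2023.06-software arch:x86_64/amd/zen2"))
```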

@eessi-bot

eessi-bot bot commented Sep 19, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/generic from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/generic
  • received bot command build repo:eessi-2023.06-software arch:x86_64/intel/haswell from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/intel/haswell
  • received bot command build repo:eessi-2023.06-software arch:x86_64/intel/skylake_avx512 from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/intel/skylake_avx512
  • received bot command build repo:eessi-2023.06-software arch:x86_64/amd/zen3 from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/amd/zen3
  • received bot command build repo:eessi-2023.06-software arch:x86_64/amd/zen2 from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/amd/zen2
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_v1 from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_n1 from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_n1
  • handling command build repository:eessi-2023.06-software architecture:x86_64/generic resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/intel/haswell resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/intel/skylake_avx512 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/amd/zen3 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/amd/zen2 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7404

date job status comment
Sep 19 05:50:04 UTC 2023 submitted job id 7404 awaits release by job manager
Sep 19 05:50:50 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:54:01 UTC 2023 running job 7404 is running
Sep 19 08:35:10 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7404.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1695112404.tar.gz
size: 630 MiB (661251339 bytes)
entries: 7432
modules under 2023.06/software/linux/x86_64/generic/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/x86_64/generic/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/x86_64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-intel-haswell for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7405

date job status comment
Sep 19 05:50:11 UTC 2023 submitted job id 7405 awaits release by job manager
Sep 19 05:50:48 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:53:59 UTC 2023 running job 7405 is running
Sep 19 08:35:08 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7405.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1695112382.tar.gz
size: 630 MiB (661249478 bytes)
entries: 7432
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/x86_64/intel/haswell/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/x86_64/intel/haswell
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7406

date job status comment
Sep 19 05:50:17 UTC 2023 submitted job id 7406 awaits release by job manager
Sep 19 05:50:46 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:53:57 UTC 2023 running job 7406 is running
Sep 19 08:12:36 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7406.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1695111013.tar.gz
size: 630 MiB (661453130 bytes)
entries: 7432
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-amd-zen3 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7407

date job status comment
Sep 19 05:50:23 UTC 2023 submitted job id 7407 awaits release by job manager
Sep 19 05:50:44 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:55:07 UTC 2023 running job 7407 is running
Sep 19 07:40:41 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7407.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen3-1695109164.tar.gz
size: 630 MiB (661349012 bytes)
entries: 7432
modules under 2023.06/software/linux/x86_64/amd/zen3/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/x86_64/amd/zen3/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/x86_64/amd/zen3
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-amd-zen2 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7408

date job status comment
Sep 19 05:50:30 UTC 2023 submitted job id 7408 awaits release by job manager
Sep 19 05:50:42 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:53:55 UTC 2023 running job 7408 is running
Sep 19 07:54:00 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7408.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1695109951.tar.gz
size: 630 MiB (661370349 bytes)
entries: 7432
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/x86_64/amd/zen2/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/x86_64/amd/zen2
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_v1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7409

date job status comment
Sep 19 05:50:36 UTC 2023 submitted job id 7409 awaits release by job manager
Sep 19 05:50:40 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:55:04 UTC 2023 running job 7409 is running
Sep 20 05:54:28 UTC 2023 finished
🤷 UNKNOWN (click triangle for detailed information)
  • Job results file _bot_job7409.result does not exist in job directory or reading it failed.
  • No artefacts were found/reported.

@eessi-bot

eessi-bot bot commented Sep 19, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_n1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7410

date job status comment
Sep 19 05:50:43 UTC 2023 submitted job id 7410 awaits release by job manager
Sep 19 05:51:52 UTC 2023 released job awaits launch by Slurm scheduler
Sep 19 05:55:11 UTC 2023 running job 7410 is running
Sep 19 08:43:30 UTC 2023 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-7410.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1695112915.tar.gz
size: 640 MiB (671591312 bytes)
entries: 7432
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
netCDF-Fortran/4.5.3-gompi-2021a.lua
tcsh/6.22.04-GCCcore-10.3.0.lua
time/1.9-GCCcore-10.3.0.lua
WRF/4.3-foss-2021a-dmpar.lua
software under 2023.06/software/linux/aarch64/neoverse_n1/software
netCDF-Fortran/4.5.3-gompi-2021a
tcsh/6.22.04-GCCcore-10.3.0
time/1.9-GCCcore-10.3.0
WRF/4.3-foss-2021a-dmpar
other under 2023.06/software/linux/aarch64/neoverse_n1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@boegel
Contributor

boegel commented Sep 20, 2023

bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_v1

@eessi-bot

eessi-bot bot commented Sep 20, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_v1 from boegel

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1
  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1 resulted in:

@eessi-bot

eessi-bot bot commented Sep 20, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_v1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7448

date job status comment
Sep 20 07:51:29 UTC 2023 submitted job id 7448 awaits release by job manager
Sep 20 09:17:13 UTC 2023 released job awaits launch by Slurm scheduler
Sep 20 09:20:47 UTC 2023 running job 7448 is running
Sep 21 09:21:40 UTC 2023 finished
🤷 UNKNOWN (click triangle for detailed information)
  • Job results file _bot_job7448.result does not exist in job directory or reading it failed.
  • No artefacts were found/reported.

@boegel
Contributor

boegel commented Sep 20, 2023

Hmm, looks like the WRF tests are hanging on aarch64/neoverse_v1?
This is ~6h into the build... Previous attempt was cancelled after 24h.

bot         8884  0.0  0.0   4160  1472 ?        S    09:20   0:00          |       |   |   \_ /bin/bash ./run_in_compat_layer_env.sh ./EESSI-pilot-install-software.sh --build-logs-dir /mnt/shared/bot-build-logs
bot         8935  0.0  0.0   7872  1856 ?        S    09:20   0:00          |       |   |       \_ /cvmfs/pilot.eessi-hpc.org/versions/2023.06/compat/linux/aarch64/bin/bash /cvmfs/pilot.eessi-hpc.org/versions/2023.06/compat/linux/aarch64/startprefix
bot         8936  0.0  0.0   4992  3584 ?        S    09:20   0:00          |       |   |           \_ /cvmfs/pilot.eessi-hpc.org/versions/2023.06/compat/linux/aarch64/bin/bash -l
bot         8950  0.0  0.0   4352  3328 ?        S    09:21   0:00          |       |   |               \_ /bin/bash ./EESSI-pilot-install-software.sh --build-logs-dir /mnt/shared/bot-build-logs
bot         9086  0.0  0.2 162304 78656 ?        Sl   09:23   0:06          |       |   |                   \_ /cvmfs/pilot.eessi-hpc.org/versions/2023.06/compat/linux/aarch64/usr/lib/python-exec/python3.11/python -m easybuild.main --easystack /eessi_bot_job/eessi-2023.06
bot        62992  0.0  0.0 168320 27648 ?        Sl   09:45   0:00          |       |   |                       \_ mpirun -n 4 ./wrf.exe
bot        63003 99.9  0.4 623680 140160 ?       Rl   09:45 356:50          |       |   |                           \_ ./wrf.exe
bot        63004 99.9  0.3 601920 113728 ?       Rl   09:45 356:50          |       |   |                           \_ ./wrf.exe
bot        63005 99.9  0.3 601984 112960 ?       Rl   09:45 356:50          |       |   |                           \_ ./wrf.exe
bot        63006 99.9  0.3 602432 113344 ?       Rl   09:45 356:50          |       |   |                           \_ ./wrf.exe

edit: some more info:

  • the hang occurs when the em_heldsuarez test case is being run, a grep in the EasyBuild log file reveals:
    == 2023-09-20 09:44:56,372 wrf.py:376 DEBUG Building and running test em_fire
    == 2023-09-20 09:45:04,211 wrf.py:376 DEBUG Building and running test em_heldsuar
    
  • the test run is done in the installation directory, which is probably not a good idea (it may facilitate the hang), so we should look into avoiding that, and running the tests in a temporary directory instead?
    == 2023-09-20 09:45:10,901 run.py:217 DEBUG run_cmd: running cmd ulimit -s unlimited && mpirun -n 1 ./ideal.exe && mpirun -n 4 ./wrf.exe (in /cvmfs/pilot.eessi-hpc.org/versions/2023.06/software/linux/aarch64/neoverse_v1/software/WRF/4.3-foss-2021a-dmpar/WRF-4.3/run)
    == 2023-09-20 09:45:10,901 run.py:236 INFO running cmd: ulimit -s unlimited && mpirun -n 1 ./ideal.exe && mpirun -n 4 ./wrf.exe
    
    Care must be taken when copying the run directory to a local temporary directory though, since there are some symbolic links in there:
    $ ls -lrt /tmp/bot/EESSI/eessi.xtSIZPx2Fr/overlay-upper/versions/2023.06/software/linux/aarch64/neoverse_v1/software/WRF/4.3-foss-2021a-dmpar/WRF-4.3/run | grep '\->'
    lrwxrwxrwx. 1 bot users        14 Sep 20 09:44 tc.exe -> ../main/tc.exe
    lrwxrwxrwx. 1 bot users        16 Sep 20 09:44 real.exe -> ../main/real.exe
    lrwxrwxrwx. 1 bot users        17 Sep 20 09:44 ndown.exe -> ../main/ndown.exe
    lrwxrwxrwx. 1 bot users        27 Sep 20 09:44 input_jet -> ../test/em_b_wave/input_jet
    lrwxrwxrwx. 1 bot users        30 Sep 20 09:45 input_sounding -> ../test/em_fire/input_sounding
    lrwxrwxrwx. 1 bot users        15 Sep 20 09:45 wrf.exe -> ../main/wrf.exe
    lrwxrwxrwx. 1 bot users        17 Sep 20 09:45 ideal.exe -> ../main/ideal.exe
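
The symlink concern above can be handled by copying the link targets rather than the links themselves. A minimal sketch, assuming the tests would then be run from the returned directory; the function name and the `wrf-test-` prefix are hypothetical, and this is not what the WRF easyblock does today:

```python
import os
import shutil
import tempfile


def copy_run_dir(run_dir):
    """Copy run_dir into a fresh temporary directory and return the new path.

    Symbolic links are followed, so relative links such as
    wrf.exe -> ../main/wrf.exe end up as regular files in the copy
    instead of dangling links.
    """
    tmp_root = tempfile.mkdtemp(prefix="wrf-test-")
    dest = os.path.join(tmp_root, os.path.basename(os.path.normpath(run_dir)))
    # symlinks=False is the default for copytree; it is spelled out here
    # because it is the behaviour this sketch relies on.
    shutil.copytree(run_dir, dest, symlinks=False)
    return dest


if __name__ == "__main__":
    # Small self-contained demo with a fake main/ and run/ layout.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "main"))
    os.makedirs(os.path.join(root, "run"))
    with open(os.path.join(root, "main", "wrf.exe"), "w") as handle:
        handle.write("fake binary")
    os.symlink("../main/wrf.exe", os.path.join(root, "run", "wrf.exe"))
    copied = os.path.join(copy_run_dir(os.path.join(root, "run")), "wrf.exe")
    print(copied, "is a link:", os.path.islink(copied))
```

Following links makes the copy larger, since the executables are duplicated. Note that `shutil.copytree` raises an error on dangling links unless `ignore_dangling_symlinks=True` is passed.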
    

@boegel
Contributor

boegel commented Sep 20, 2023

@TopRichard Maybe we should check if we're observing the same problem with WRF-4.4.1-foss-2022b-dmpar.eb (in another PR)?

@boegel
Contributor

boegel commented Sep 21, 2023

@TopRichard Can you re-try the build for aarch64/neoverse_v1 using the updated WRF easyblock from easybuilders/easybuild-easyblocks#3006, just to see if that helps at all? May not make a difference...

@TopRichard
Collaborator Author

bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_v1

@eessi-bot

eessi-bot bot commented Sep 21, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_v1 from TopRichard

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1
  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1 resulted in:

@eessi-bot

eessi-bot bot commented Sep 21, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_v1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_290/7456

date job status comment
Sep 21 10:47:37 UTC 2023 submitted job id 7456 awaits release by job manager
Sep 21 10:48:27 UTC 2023 released job awaits launch by Slurm scheduler
Sep 21 10:52:31 UTC 2023 running job 7456 is running
Sep 21 11:12:58 UTC 2023 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-7456.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1695294750.tar.gz
size: 0 MiB (695285 bytes)
entries: 62
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
tcsh/6.22.04-GCCcore-10.3.0.lua
software under 2023.06/software/linux/aarch64/neoverse_v1/software
tcsh/6.22.04-GCCcore-10.3.0
other under 2023.06/software/linux/aarch64/neoverse_v1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@boegel boegel changed the base branch from 2023.06 to pilot.eessi-hpc.org-2023.06 November 21, 2023 21:18
@TopRichard TopRichard closed this Dec 20, 2023
@TopRichard TopRichard deleted the eessi-2023.06-WRF-dmpar/4.3-foss/2021a branch December 20, 2023 08:39
trz42 pushed a commit to trz42/software-layer that referenced this pull request Apr 16, 2024
{2023.06}[foss/2023a] R-bundle-CRAN 2023.12