bug: WRF tutorial : gen_retro_icbc.csh has set paramfile twice #295
Comments
Another note on this. Note: shell_scripts/init_ensemble_var.csh (lines 59-82)

echo " QUEUEING ENSEMBLE MEMBER $n at `date`"

mkdir -p ${RUN_DIR}/advance_temp${n}

# TJH why does the run_dir/*/input.nml come from the template_dir and not the rundir?
# TJH furthermore, template_dir/input.nml.template and rundir/input.nml are identical. SIMPLIFY.

${LINK} ${RUN_DIR}/WRF_RUN/* ${RUN_DIR}/advance_temp${n}/.
${LINK} ${TEMPLATE_DIR}/input.nml.template ${RUN_DIR}/advance_temp${n}/input.nml

${COPY} ${OUTPUT_DIR}/${initial_date}/wrfinput_d01_${gdate[1]}_${gdate[2]}_mean \
        ${RUN_DIR}/advance_temp${n}/wrfvar_output.nc
sleep 3
${COPY} ${RUN_DIR}/add_bank_perts.ncl ${RUN_DIR}/advance_temp${n}/.

set cmd3 = "ncl 'MEM_NUM=${n}' 'PERTS_DIR="\""${PERTS_DIR}"\""' ${RUN_DIR}/advance_temp${n}/add_bank_perts.ncl"
${REMOVE} ${RUN_DIR}/advance_temp${n}/nclrun3.out
cat >! ${RUN_DIR}/advance_temp${n}/nclrun3.out << EOF
$cmd3
EOF
echo $cmd3 >! ${RUN_DIR}/advance_temp${n}/nclrun3.out.tim # TJH replace cat above

cat >! ${RUN_DIR}/rt_assim_init_${n}.csh << EOF
note 2: There are a couple of places where everything in the WRF_RUN directory gets linked:
init_ensemble_var.csh
new_advance_model.csh
A user asked about rsl.out.0000 and rsl.error.0000 getting linked to WRF_RUN for every ensemble member, so all ensemble members would be writing to WRF_RUN/rsl.out.0000. Is the script expecting that you never run wrf.exe in WRF_RUN? I think the scripts expect WRF_RUN to contain only the files needed to run wrf, not any output files.
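One way to avoid the shared rsl files described in note 2 is to skip them when linking. The following is only a sketch, written in POSIX sh rather than the csh used by the tutorial scripts, and `link_wrf_run` is a hypothetical helper name, not something in DART. It assumes the source directory is given as an absolute path so the symlinks resolve.

```sh
#!/bin/sh
# Sketch only: POSIX sh, not the csh of init_ensemble_var.csh.
# link_wrf_run SRC DEST (hypothetical helper):
#   symlink every entry of SRC into DEST, except rsl.out.* and rsl.error.*,
#   so members never share WRF log files through the links.
# SRC should be an absolute path so the links resolve from DEST.
link_wrf_run() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*; do
        [ -e "$f" ] || continue          # glob matched nothing (empty directory)
        case "${f##*/}" in
            rsl.out.*|rsl.error.*) continue ;;   # skip WRF log files
        esac
        ln -sf "$f" "$dest/"
    done
}
```

For example, `link_wrf_run "$RUN_DIR/WRF_RUN" "$RUN_DIR/advance_temp$n"` would take the place of the blanket link of `WRF_RUN/*`, under the assumption that `RUN_DIR` and `n` are set as in the snippet from note 1.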
note 3: prep_ic.csh

if ( $#argv > 0 ) then
   set n = ${1} # pass in the ensemble member number
   set datep = ${2} # needed for correct path to file
   set dn = ${3}
   set paramfile = ${4}
else # values come from environment variables #TJH If these are not set ....
   set n = $mem_num
   set datep = $date
   set dn = $domain
   set paramfile = $paramf
endif
source $paramfile
-echo "prep_ic.csh using n=$n datep=$datep dn=$dn paramfile=$paramf" ! paramfile might be ${4} not $paramf
+echo "prep_ic.csh using n=$n datep=$datep dn=$dn paramfile=$paramfile"
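The TJH comment in note 3 ("If these are not set ....") suggests checking the environment fallback before using it. The sketch below shows that check in POSIX sh, not csh; in csh the equivalent test would use `$?name`. The function name `resolve_prep_ic_args` and the error messages are hypothetical. Only the variable names (`mem_num`, `date`, `domain`, `paramf`) come from the snippet above.

```sh
#!/bin/sh
# Sketch only: POSIX sh, not the csh of prep_ic.csh.
# resolve_prep_ic_args (hypothetical helper):
#   take n, datep, dn, paramfile from the arguments if any are given,
#   otherwise from the environment, failing loudly when something is missing.
resolve_prep_ic_args() {
    if [ "$#" -gt 0 ]; then
        if [ "$#" -lt 4 ]; then
            echo "usage: prep_ic n datep dn paramfile" >&2
            return 1
        fi
        n=$1; datep=$2; dn=$3; paramfile=$4
    else
        # Fall back to the environment, but refuse to continue with unset values.
        for name in mem_num date domain paramf; do
            eval "value=\${$name:-}"
            if [ -z "$value" ]; then
                echo "prep_ic: environment variable $name is not set" >&2
                return 1
            fi
        done
        n=$mem_num; datep=$date; dn=$domain; paramfile=$paramf
    fi
    # Report the resolved paramfile, whichever branch set it.
    echo "prep_ic using n=$n datep=$datep dn=$dn paramfile=$paramfile"
}
```

Echoing `$paramfile` rather than `$paramf` matches the corrected line in the diff above: it reports the value actually sourced in both branches.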
@hkershaw-brown Just curious about the status of this bug. It seems the paramfile being set twice is the source of the bug, and the other comments are related to general improvements of the WRF-DART tutorial scripting?
@braczka I have not worked on the wrf tutorial scripts. I've helped several users work through the tutorial, and I would recommend that users who are familiar with scripting write their own scripts. The tutorial states "You will need to edit these scripts, perhaps extensively, to run them within your particular computing environment." This is an understatement. I'd like to rewrite the wrf tutorial, but this issue has been hanging out there because we haven't had the manpower/resources to commit to it. I would use a smaller wrf case (the run takes an hour on Cheyenne, which is too long to be debugging scripts efficiently).
Thanks @hkershaw-brown, it also makes sense to keep this issue open for reference for now with your additional notes as I become more familiar with WRF.
Will do, I'll leave this issue open.
Forgot to make note of another linkage error in WRF-DART tutorial independent from set param.csh which was already fixed. When executing
Existing link command fails:
because
or
Will keep this open just a bit longer to see if current users uncover any other easy fixes.
When I run the basic WRF tutorial and run
An example of the error is the following:
My module environment while running the job is:
Currently Loaded Modules:
  1) ncarenv/1.3   3) ncarcompilers/0.5.0   5) ncl/6.6.2   7) diffuse/0.4.8
  2) intel/19.0.5   4) netcdf/4.7.4   6) nco/5.0.3   8) mpt/2.22
However, when executing the script there is an automatic update to the mpt version as:
I searched for a similar bug related to MPI_LAUNCH_TIMEOUT on the WRF forum and found something similar here
They recommended running the job in serial and not in parallel, thus I removed the MPI command altogether in favor of:
This worked, but I am unsure if this is something worth correcting in the WRF Tutorial scripting or just a result of how the
@braczka, you are running into this issue because all my WRF executables are compiled with openmpi. I am not a fan of mpt. So, I'd replace mpiexec_mpt with mpirun. Hopefully, this will fix your issue.
I gotcha. I tried the mpirun command before with openmpi and it was failing, but I see now that's because the modules were automatically replacing openmpi with mpt while running gen_retro_icbc.csh and I didn't catch it.
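Since the launcher has to match the MPI library the executables were built with, one option is to make it configurable and check that it exists before running. This is only a sketch in POSIX sh, not csh. `MPI_LAUNCHER` and `pick_mpi_launcher` are assumed names for illustration and are not defined by the tutorial scripts. It checks only that the launcher is on the PATH, not that it matches the build.

```sh
#!/bin/sh
# Sketch only: POSIX sh, not the csh of gen_retro_icbc.csh.
# MPI_LAUNCHER is an assumed, user-set variable; it defaults to mpirun here.
# pick_mpi_launcher (hypothetical helper) prints the launcher name if it is
# on the PATH and fails otherwise, so a module swap is noticed early.
pick_mpi_launcher() {
    launcher=${MPI_LAUNCHER:-mpirun}
    if ! command -v "$launcher" >/dev/null 2>&1; then
        echo "MPI launcher '$launcher' not found on PATH; check the loaded modules" >&2
        return 1
    fi
    echo "$launcher"
}
```

A script could then call `launcher=$(pick_mpi_launcher) || exit 1` before starting the executable, so a missing launcher stops the job immediately instead of surfacing later as a timeout.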
Thanks for feedback @mgharamti and @hkershaw-brown. When submitting the
The easiest fix is to edit the WRF tutorial instructions as:
I will update.
Quick note on gen_retro_icbc.csh
Will fill in details when I run this.
Describe the bug
We have had a couple of users last week hit problems in the wrf tutorial where
input.nml templates were not found.
Error Message
Please provide any error messages.
Which model(s) are you working with?
WRF
Version of DART
Which version of DART are you using?
v9.11.11
Have you modified the DART code?
No
Build information
I think this is any machine