Dear experts,

I recently moved my simulation from 2.0 to 2.1 on Cheyenne, and I now get the following error:

FATAL from PE 2: mpp_domains_define.inc: not all the pe_end are in the pelist

Based on previous posts, I suspect this is related to the number of CPUs. However, I have checked the layout_x and layout_y of my custom predefined domain using get_layout.sh (suggested layout_x=6, layout_y=8), and I ran with 48 CPUs.

My simulation is in /glade/work/htan2013/UFS_2.1/expt_dirs/HRRR_3km_20200306. I have attached my config.yaml, predef_grid_params.yaml, and the logs for all steps, including FCST.

I would greatly appreciate any assistance in this matter.

Best regards,

config.yaml.txt
Replies: 1 comment 1 reply
@htan2013 It looks like you are over-subscribing your MPI tasks. You have the following settings per your log file:
With QUILTING=TRUE, the write component is activated and assigned MPI tasks based on the number of write tasks and groups. With your settings, you have 6×8=48 CPUs assigned to model integration and 1×24=24 CPUs assigned to the write component (which creates the output files), for a total of 72 CPUs. That is more than the 48 specified in your mpirun command.

While you could set QUILTING=FALSE, this option is not extensively tested, so I would recommend changing your layout_x and layout_y to compensate for t…
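
To make the arithmetic above easy to re-check for other layouts, here is a minimal sketch in Python. It only reproduces the count described in this reply (compute tasks plus write-component tasks when quilting is on). The function name and its parameter names are hypothetical and chosen for illustration; they are not part of the workflow scripts.

```python
def required_mpi_tasks(layout_x, layout_y, quilting,
                       write_groups, write_tasks_per_group):
    """Total MPI tasks the forecast needs (illustrative helper, not a workflow API).

    Compute tasks are layout_x * layout_y. When quilting is enabled, the
    write component adds write_groups * write_tasks_per_group tasks.
    """
    compute_tasks = layout_x * layout_y
    write_tasks = write_groups * write_tasks_per_group if quilting else 0
    return compute_tasks + write_tasks


# Values taken from this thread: 6x8 layout, 1 write group of 24 tasks,
# and 48 tasks requested in the mpirun command.
requested = 48
needed = required_mpi_tasks(6, 8, True, 1, 24)
print(f"needed={needed}, requested={requested}")  # needed=72, requested=48

if needed != requested:
    print("Task count mismatch: adjust the layout, the write tasks, "
          "or the number of tasks requested.")
```

With the numbers from this thread the helper returns 72, so the 48 requested tasks cannot cover both model integration and the write component.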