application called MPI_Abort(MPI_COMM_WORLD, 0) - process 73 #446
Comments
Thanks for writing @InterstellarPenguin. Could you also post the You can also schedule extra debug information in the logging.yml file as described in our ReadTheDocs:
@InterstellarPenguin, please note that we do not recommend using GCHP with coarse resolution meteorology. I do not think using the 2x2.5 fields is causing the problem, but it will cause less accurate results.
If C24 works but C48 does not, I recommend trying to run with more cores. Also try explicitly requesting all memory per node with SBATCH.
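For readers following along, here is a minimal sketch of what such a request could look like in a SLURM job script header. The node and task counts are placeholder assumptions and are not taken from the reporter's gchp.job; `--mem=0` is the standard SLURM way to ask for all of the memory on each allocated node.

```shell
#!/bin/bash
# Hypothetical resource request; adjust the counts to your own cluster.
#SBATCH --nodes=2               # assumed node count, not from the original gchp.job
#SBATCH --ntasks-per-node=48    # assumed cores per node
#SBATCH --mem=0                 # request all available memory on each node

# Report what the scheduler actually granted (prints "unset" outside SLURM).
echo "Tasks: ${SLURM_NTASKS:-unset}, nodes: ${SLURM_NNODES:-unset}"
```

If more cores are requested here, the core counts in the GCHP run directory configuration need to be updated to match.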
Thanks @lizziel @yantosca. I've checked allPEs.log, and there are errors related to some ExtData entries in the image below: BTW, in GCHP.rc, I'm not sure whether 'GCHPchem_INTERNAL_CHECKPOINT_FILE: Restarts/gcchem_internal_checkpoin' is correct.
If a previous run generated Restarts/gcchem_internal_checkpoint and the run script did not rename or delete it, the model will crash when you try to run again. Do you still have this file after your run crashes? What run script are you using? The run scripts are designed to avoid this issue, so if the one you are using is not catching it, we would definitely like to know.

Generally the O-server is only needed by certain systems when you run with more than 1000 cores. Try running again with the O-server off, with gcchem_internal_checkpoint deleted if it is present, and with GCHPchem_INTERNAL_CHECKPOINT_FILE set back to the original setting.

Please note that we do not recommend using the carbon simulation with version 14.4. Fixes are coming in 14.5.1. See GitHub issues:
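As an illustration of the checkpoint cleanup described in this comment, here is a minimal sketch that moves a leftover file aside before a new run. The function name `clear_stale_checkpoint` and the `.stale.<timestamp>` backup suffix are hypothetical and are not part of the official GCHP run scripts; only the path Restarts/gcchem_internal_checkpoint comes from the thread.

```shell
#!/bin/bash
# Hypothetical helper, not part of the official GCHP run scripts.
# Moves aside a leftover checkpoint file so that the next run does not
# crash when it tries to write Restarts/gcchem_internal_checkpoint again.
clear_stale_checkpoint() {
    local rundir="${1:-.}"    # run directory; defaults to the current one
    local ckpt="${rundir}/Restarts/gcchem_internal_checkpoint"

    if [ -e "${ckpt}" ]; then
        # Rename rather than delete, so the file can still be inspected.
        local backup
        backup="${ckpt}.stale.$(date +%Y%m%d_%H%M%S)"
        mv "${ckpt}" "${backup}"
        echo "Moved leftover checkpoint to ${backup}"
    else
        echo "No leftover checkpoint found in ${rundir}/Restarts"
    fi
}
```

Calling `clear_stale_checkpoint .` from the run directory just before launching the model would cover the case where an earlier crashed run left the file behind.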
Hi @InterstellarPenguin, please open a new issue for questions outside the scope of the original MPI error. Thanks!
Your name
Linyang Guo
Your affiliation
UCAS
What happened? What did you expect to happen?
Hi, all! I'm running a C48 simulation, but it crashed with the following error:
I'm not sure whether the error is related to the settings or MPI.
What are the steps to reproduce the bug?
setCommonSettings.sh:
gchp.job:
ExtData.rc:
Please attach any relevant configuration and log files.
setCommonSettings.txt.txt
ExtData.txt
BTW, MetDir has been changed to point to my own ExtData, and when I run GCHP with C24 instead of C48, it completes successfully.
What GCHP version were you using?
14.4.3
What environment were you running GCHP on?
Local cluster
What compiler and version were you using?
ifort 2021.3.0
What MPI library and version were you using?
Intel MPI 2021.3.0
Will you be addressing this bug yourself?
Yes
Additional information
No response