Spin-up simulation immediately stops after time stepping begins #2383

Closed
alphasue12 opened this issue Jul 17, 2024 · 8 comments
Assignees
Labels
category: Debug Help Request for assistance debugging GEOS-Chem topic: Runtime Error Related to runtime issues (e.g. simulation stopped w/ error) topic: User Environment Relating to libraries, containers, AMIs, etc.

Comments

@alphasue12

Your name

SU Yi

Your affiliation

Fudan University

What happened? What did you expect to happen?

The spin-up simulation stopped almost immediately after the log printed "* B e g i n T i m e S t e p p i n g !! *" and "TPCORE_FVDAS (based on GMI) Tracer Transport Module successfully initialized" (please refer to log.txt). No restart files were generated.
I also enabled debug mode to investigate the source of the error. It seems the error occurs in "tpcore_fvdas_mod.F90" (please refer to debug.txt in the attachment), but I am not sure how to fix it.
debug.txt

What are the steps to reproduce the bug?

I am running a global spin-up simulation at 2x2.5 resolution with GCClassic (13.3.4). The simulation is not in nested-grid mode.

Please attach any relevant configuration and log files.

build_summary.txt
HEMCO.log.txt
HEMCO_Config.txt
HEMCO_Diagn.rc.txt
HISTORY.rc.txt
input.geos.txt
debug.txt
log.txt

What GEOS-Chem version were you using?

13.3.4

What environment were you running GEOS-Chem on?

Local cluster

What compiler and version were you using?

gcc 7.5.0 on Ubuntu 18.04

Will you be addressing this bug yourself?

Yes

In what configuration were you running GEOS-Chem?

GCClassic

What simulation were you running?

Full chemistry

At what resolution were you running GEOS-Chem?

2x2.5

What meteorology fields did you use?

MERRA-2

Additional information

I am using this old version because I want to reproduce the results of a research paper that used the same version of GCClassic. When I run the spin-up simulation with the same configuration except for the resolution (4x5 instead of 2x2.5), the simulation succeeds and produces the restart file.

@alphasue12 alphasue12 added the category: Bug Something isn't working label Jul 17, 2024
@yantosca yantosca self-assigned this Jul 17, 2024
@yantosca yantosca added topic: Runtime Error Related to runtime issues (e.g. simulation stopped w/ error) category: Debug Help Request for assistance debugging GEOS-Chem and removed category: Bug Something isn't working labels Jul 17, 2024
@yantosca
Contributor

Thanks for writing @alphasue12. I think you may not have maxed out your stack memory limits in your ~/.bashrc file. Please see this entry on ReadTheDocs:

@yantosca yantosca added the topic: User Environment Relating to libraries, containers, AMIs, etc. label Jul 17, 2024
@alphasue12
Author

alphasue12 commented Jul 18, 2024

Thank you for your suggestion. I just tried it, but I still get the same "Program received signal SIGSEGV: Segmentation fault" after the log prints "NASA-GSFC Tracer Transport Module successfully initialized". I made sure that `ulimit -s unlimited` and `export OMP_STACKSIZE=500m` have been added to ~/.bashrc and were executed before running the `./gcclassic` command. What else could cause the problem?
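For reference, the two settings mentioned in this comment can be applied and then checked in the same shell that launches the run. This is a minimal sketch, not an official GEOS-Chem script; the value 500m is the one quoted in this thread, and the fallback message is only an illustration of what to look for if the system's hard limit blocks the change.

```shell
# Lines assumed to be in ~/.bashrc (values taken from this thread).
# If the hard limit is lower than "unlimited", the first command fails,
# so report the hard limit instead of failing silently.
ulimit -s unlimited 2>/dev/null || echo "could not raise stack limit (hard limit: $(ulimit -H -s))"
export OMP_STACKSIZE=500m

# Check what the current shell actually has before launching ./gcclassic.
echo "stack limit:   $(ulimit -s)"
echo "OMP_STACKSIZE: ${OMP_STACKSIZE:-unset}"
```

If the reported stack limit is a number rather than "unlimited", the setting did not take effect in that shell.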

@yantosca
Contributor

Thanks for the feedback @alphasue12. If you are running the GEOS-Chem job through a scheduler like SLURM, you might want to add `source ~/.bashrc` to the run script. That will make sure that the environment variables you define in ~/.bashrc also get defined in the environment where the job runs.
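A run script along these lines is one way to do that. It is a sketch under stated assumptions: the `#SBATCH` values are placeholders to adjust for your cluster, and the log file name simply mirrors the log.txt attached to this issue. Some default ~/.bashrc files (Ubuntu's, for example) return early in non-interactive shells, so the sketch also sets the limits directly in the script.

```shell
#!/bin/bash
#SBATCH --cpus-per-task=8     # placeholder value
#SBATCH --mem=16G             # placeholder value
#SBATCH --time=12:00:00       # placeholder value

# Load the settings from ~/.bashrc into the job's environment, if present.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi

# Set the limits again here in case ~/.bashrc returned early.
ulimit -s unlimited 2>/dev/null || echo "could not raise stack limit (hard limit: $(ulimit -H -s))"
export OMP_STACKSIZE=500m

# Use the number of cores SLURM allocated (falls back to 1 outside SLURM).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Run the model only if the executable is in the current directory.
if [ -x ./gcclassic ]; then
    ./gcclassic > log.txt 2>&1
else
    echo "gcclassic executable not found in $(pwd)"
fi
```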

@yantosca
Contributor

@alphasue12: I'm not sure how your system is set up, but if you are trying to run GEOS-Chem Classic on a login node, you might not have enough memory there. On our system, when you log in, you are placed on a login node, and from there you can schedule interactive or batch jobs on computational nodes. Our login nodes only allow 4GB of memory, so if you have a similar setup on your cluster, this is what may be causing your jobs to die. You can ask your sysadmin for more info.
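A few standard Linux commands can show what the current shell is allowed to use. This is only a quick diagnostic sketch; it assumes a Linux system with /proc/meminfo, and it does not tell you whether your site enforces additional limits on login nodes.

```shell
# Which machine is this shell running on (login node or compute node)?
echo "host: $(uname -n)"

# Total and currently available physical memory on this machine.
if [ -r /proc/meminfo ]; then
    grep -E '^(MemTotal|MemAvailable):' /proc/meminfo
fi

# Per-process limits in this shell: virtual memory and stack size (kB).
echo "max virtual memory: $(ulimit -v)"
echo "stack size:         $(ulimit -s)"
```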

@alphasue12
Author

@yantosca Thanks for your detailed and kind reply. I am currently using an Ubuntu server with 64 processors and 82 GB of memory. It seems that this problem may be related to the memory setup of my server. I plan to migrate my GCClassic setup to a computing platform that uses the SLURM scheduler and try again.

@yantosca
Contributor

@alphasue12: Were you able to fix your issue?

@yantosca
Contributor

yantosca commented Oct 7, 2024

Closing out this issue.

@yantosca yantosca closed this as completed Oct 7, 2024
@gopikrishnangs44

@alphasue12 Did you fix it?
