diff --git a/doc/source/chapter02_beginner-tutorial/quick-start.rst b/doc/source/chapter02_beginner-tutorial/quick-start.rst index 5d02d7f..f60c322 100644 --- a/doc/source/chapter02_beginner-tutorial/quick-start.rst +++ b/doc/source/chapter02_beginner-tutorial/quick-start.rst @@ -21,7 +21,7 @@ After entering some basic information, you will be required to enter your credit Now you should have an AWS account! It's time to run the model in cloud. (You can skip Step 1 for the next time, of course) -Step 2: Launch a server with GEOS-Chem pre-installed +Step 2: Launch a server with GEOS-Chem pre-installed ---------------------------------------------------- Log in to AWS console, and click on EC2 (Elastic Compute Cloud), which is the most basic cloud computing service. @@ -74,10 +74,10 @@ Select your instance, click on the "Connect" button near the blue "Launch Instan .. figure:: img/connect_instruction.png :width: 500 px -- On Mac or Linux, copy the ``ssh -i "xx.pem" root@xxx.com`` command under "Example". - Before using that command to ssh to your server, do some minor stuff: - - (1) ``cd`` to the directory where store your Key Pair (preferably ``$HOME/.ssh``) +- On Mac or Linux, copy the ``ssh -i "xx.pem" root@xxx.com`` command under "Example". + Before using that command to ssh to your server, do the following: + + (1) ``cd`` to the directory where you store your key pair (preferably ``$HOME/.ssh``) (2) Use ``chmod 400 xx.pem`` to change the key pair's permission (also mentioned in the above figure; only need to do this at the first time). (3) Change the user name in that command from ``root`` to ``ubuntu``. (You'll be asked to use ``ubuntu`` if you keep ``root``). - On Windows, please refer to the guide for `MobaXterm `_ and `Putty `_ (Your life would probably be easier with MobaXterm). @@ -93,11 +93,11 @@ That's a system with GEOS-Chem already built! 
**Trouble shooting**: if you have trouble ``ssh`` to the server, please :doc:`make sure you don't mess-up the "security group" configuration `. Go to the pre-generated run directory:: - + $ cd ~/tutorial/geosfp_4x5_standard Just run the pre-compiled the model by:: - + $ ./geos.mp Or you can re-compile the model on your own:: @@ -134,8 +134,7 @@ If you wait for the simulation to finish (takes 5~10 min), it will produce `NetC time:calendar = "gregorian" ; time:axis = "T" ; -`Anaconda Python `_ and `xarray `_ are already installed on the server for analyzing all kinds of NetCDF files. If you are not familiar with Python and xarray, checkout my tutorial on -`xarray for GEOS-Chem `_. +`Anaconda Python `_ and `xarray `_ are already installed on the server for analyzing all kinds of NetCDF files. If you are not familiar with Python and xarray, check out my `Python/xarray tutorial for GEOS-Chem users `_. Activate the pre-installed `geoscientific Python environment `_ by ``source activate geo`` (it is generally a bad idea to directly install things into the root Python environment), and then start ``ipython`` from the command line:: @@ -163,9 +162,9 @@ Activate the pre-installed `geoscientific Python environment `_. If you have been using Jupyter on your local machine, the user experience on the cloud would be exactly the same. To use Jupyter on remote servers, re-login to the server with port-forwarding option ``-L 8999:localhost:8999``:: - + $ ssh -i "xx.pem" ubuntu@xxx.com -L 8999:localhost:8999 - + Then simply run ``jupyter notebook --NotebookApp.token='' --no-browser --port=8999``:: $ jupyter notebook --NotebookApp.token='' --no-browser --port=8999 @@ -190,7 +189,7 @@ We encourage users to try the new NetCDF diagnostics, but you can still use the Also, you could indeed download the output data and use old tools like IDL & MATLAB to analyze them, but we highly recommend the open-source Python/Jupyter/xarray ecosystem. 
It will vastly improve user experience and working efficiency, and also help open science and reproducible research. -Step 5: Shut down the server (Very important!!) +Step 5: Shut down the server (Very important!!) ----------------------------------------------- Right-click on the instance in your console to get this menu: @@ -199,10 +198,10 @@ Right-click on the instance in your console to get this menu: There are two different ways to stop being charged: -- "Stop" will make the system inactive, so that you'll not be charged by the CPU time, +- "Stop" will make the system inactive, so that you will not be charged for CPU time, and only be charged by the negligible disk storage fee. You can re-start the server at any time and all files will be preserved. - "Terminate" will completely remove that virtual server so you won't be charged at all after that. - Unless you save your system as an AMI or transfer the data to other storage services, + Unless you save your system as an AMI or transfer the data to other storage services, you will lose all your data and software. You will learn how to save your data and configurations persistently in the next tutorials. You might also want to :doc:`simplify your ssh login command <../chapter06_appendix/ssh-config>`. \ No newline at end of file diff --git a/doc/source/chapter03_advanced-tutorial/gchp.rst b/doc/source/chapter03_advanced-tutorial/gchp.rst new file mode 100644 index 0000000..5aea3b3 --- /dev/null +++ b/doc/source/chapter03_advanced-tutorial/gchp.rst @@ -0,0 +1,60 @@ +GEOS-Chem High-Performance version (GCHP) (experimental) +======================================================== + +We have successfully gotten GCHP running on the cloud. **It is functioning correctly**, but several issues remain to be resolved: + +- GCHP compiles with gfortran, but the overall performance is ~20% slower than with ifort. Most of the slowdown comes from the new advection module (GFDL-FV3). 
+- The initial I/O takes a long time (this does not affect long-term simulations). +- The data analysis pipeline is not fully documented. We do have some `preliminary scripts `_ to process and regrid cubed-sphere data, though. +- It is only set up to run on a single node, with at most 72 CPUs (c5.18xlarge). + +For now it is well suited to learning and to small experiments. We will make a major update after the formal release of v11-02. + +GCHP inside Singularity container +--------------------------------- + +We will be using :doc:`containers <./container>` to run GCHP. This allows you to set up GCHP quickly on almost **any machine**, not just on the Amazon cloud. You can adapt this guide for your own server. + +The Singularity image for GCHP can be obtained from `Singularity Hub `_, with the command:: + + $ singularity pull --name GCHP.simg shub://JiaweiZhuang/Singularity_GC + +Launch server +------------- + +Launch from the AMI ID ``ami-21f37a5e``. This AMI only provides sample input data and a pre-configured run directory. The software libraries are provided by the Singularity container. + +The minimum `hardware requirement `_ is ``r4.2xlarge`` with 8 CPUs and 60 GB memory. The minimum number of MPI processes for GCHP is 6 (one for each cubed-sphere panel). You can still start a GCHP simulation on an instance with fewer than 6 CPUs, but the program is likely to crash partway through. + +Test run +-------- + +After launching the instance and logging in (username is ``ubuntu``), you should see:: + + $ ls + gcdata GCHP GCHP.simg miniconda singularity + +Run the container interactively with:: + + $ singularity shell GCHP.simg + +If you execute the container directly with ``./GCHP.simg``, it will print some instructions. 
+ +Go to the run directory and run the pre-compiled executable:: + + $ cd ~/GCHP/gchp_standard + $ mpirun -np 6 ./geos + +Test compile +------------ + +For now, to re-compile the model you need to specify the code directory when starting the container:: + + $ SINGULARITYENV_GC_CODE_DIR=~/GCHP/Code.v11-02_gchp singularity shell GCHP.simg + +Then re-compile the model in the run directory:: + + $ cd ~/GCHP/gchp_standard + $ make compile_clean + +For more information, please see `the official tutorial on the GCHP wiki `_. \ No newline at end of file diff --git a/doc/source/chapter03_advanced-tutorial/index.rst b/doc/source/chapter03_advanced-tutorial/index.rst index 10ef94d..f5f4637 100644 --- a/doc/source/chapter03_advanced-tutorial/index.rst +++ b/doc/source/chapter03_advanced-tutorial/index.rst @@ -9,4 +9,5 @@ This chapter provides advanced tutorials to improve your research workflow. Make iam-role advanced-awscli container - hpc-overview \ No newline at end of file + hpc-overview + gchp \ No newline at end of file diff --git a/doc/source/chapter06_appendix/aws-resources-for-gc.rst b/doc/source/chapter06_appendix/aws-resources-for-gc.rst index c5ce2ba..ea14393 100644 --- a/doc/source/chapter06_appendix/aws-resources-for-gc.rst +++ b/doc/source/chapter06_appendix/aws-resources-for-gc.rst @@ -1,4 +1,4 @@ -List of public AWS resources for GEOS-Chem +List of public AWS resources for GEOS-Chem ========================================== Currently all resources are in us-east-1 (N. 
Virginia). | | | | environment | | | | | 7. Sample gcdata directory | +-------------------+------------------------+----------+----------------------------------+ +|| GCHP | ami-21f37a5e | 100 GB | 1. Pre-configured GCHP rundir | +|| experimental | | | 2. Sample GCHP input data | +|| AMI | | | 3. GCHP container environment | ++-------------------+------------------------+----------+----------------------------------+ || S3 bucket for | s3://gcgrid | ~30 TB | All current GEOS-Chem input data | || all GC data | (requester-pay) | | | +-------------------+------------------------+----------+----------------------------------+ diff --git a/doc/source/index.rst b/doc/source/index.rst index 63b37c9..b0ce03c 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -1,7 +1,7 @@ GEOS-Chem on cloud computing platforms ====================================== -`GEOSChem-on-cloud `_ project aims to build a cloud computing capability for `GEOS-Chem `_ that can be easily accessed by researchers worldwide. +`GEOSChem-on-cloud `_ project aims to build a cloud computing capability for `GEOS-Chem `_ that can be easily accessed by researchers worldwide. See :ref:`motivation-label` for the motivation of this project. See :ref:`quick-start-label` to start your first GEOS-Chem simulation on the `Amazon Web Services (AWS) `_ cloud within 10 minutes (and within seconds for the next time). @@ -15,9 +15,9 @@ How to use this documentation **For GEOS-Chem users**, this website contains everything you need in order to use GEOS-Chem on the cloud. You will be able to finish a complete research workflow, from model simulations to output data analysis and management. **If it is your first time trying GEOS-Chem, this project is perhaps your best starting point**, because :ref:`you don't need to do any initial setup ` and the model is guaranteed to work correctly (see :ref:`quick start guide `). Note that this website is not a user guide on the GEOS-Chem model itself. 
Please refer to our comprehensive `user guide `_ and `wiki `_ for all details about GEOS-Chem. To run on the cloud, we only support versions newer than `v11-02a `_ for `GNU-Fortran compatibility `_. -**For non-GEOS-Chem-users**, this documentation can be used as an introduction to AWS for scientific computing, especially for **Earth science model simulations**. Since all Earth science models are highly similar from a software perspective, it should be quite easy to adapt this guide for you specific use case. More than 90% of this website is about general AWS concepts and tutorials, which doesn't require GEOS-Chem-specific knowledge. Please get a feeling of cloud computing workflow by exploring :doc:`beginner tutorials <../chapter02_beginner-tutorial/index>` and then refer to the :doc:`developer guide <../chapter04_developer-guide/index>` to build your own model. Although cloud computing has a lot of potential in Earth science, it is still significantly under-utilized due to :doc:`the lack of accessible tutorials <./chapter01_overview/external-resources>` for Earth science researchers. This project tries to fill this gap. +**For non-GEOS-Chem-users**, this documentation can be used as an introduction to AWS for scientific computing, especially for **Earth science model simulations**. Since all Earth science models are highly similar from a software perspective, it should be quite easy to adapt this guide for your specific use case. More than 90% of this website is about general AWS concepts and tutorials, which don't require GEOS-Chem-specific knowledge. Please get a feel for the cloud computing workflow by exploring :doc:`beginner tutorials <./chapter02_beginner-tutorial/index>` and then refer to the :doc:`developer guide <./chapter04_developer-guide/index>` to build your own model. 
Although cloud computing has a lot of potential in Earth science, it is still significantly under-utilized due to :doc:`the lack of accessible tutorials <./chapter01_overview/external-resources>` for Earth science researchers. This project tries to fill this gap. -For general reference, GEOS-Chem is a `Chemical Transport Model `_ for simulating atmospheric chemical compositions. It has been developed over 20 years and is used by `more than 100 research groups worldwide `_. The program is mainly written in Fortran 90. `All model source code `_ is `distributed freely under the MIT license `_. Input and output data formats are mostly NetCDF, which can be analyzed easily by most languages such as Python, R and MATLAB. IDL (Interactive Data Language) has historically been the major data analysis tool but now we embrace open-source tools especially Python, `Jupyter `_ and `xarray `_. The classic version of GEOS-Chem uses OpenMP parallelization (shared-memory, multi-threading). `The MPI version of GEOS-Chem `_ has also been developed and we are working on making it available on the cloud. +For general reference, GEOS-Chem is a `Chemical Transport Model `_ for simulating atmospheric chemical compositions. It has been developed for over 20 years and is used by `more than 100 research groups worldwide `_. The program is mainly written in Fortran 90. `All model source code `_ is `distributed freely under the MIT license `_. Input and output data formats are mostly NetCDF, which can be analyzed easily by most languages such as Python, R and MATLAB. IDL (Interactive Data Language) has historically been the major data analysis tool, but now we embrace open-source tools, especially Python, `Jupyter `_ and `xarray `_. The classic version of GEOS-Chem uses OpenMP parallelization (shared-memory, multi-threading). `The MPI version of GEOS-Chem `_ has also been developed and we have :doc:`an experimental version that runs on the cloud <./chapter03_advanced-tutorial/gchp>`. 
Table of Contents