Merge remote-tracking branch 'upstream/development' into AllocateFewerGuardCells
NeilZaim committed Aug 5, 2022
2 parents 1953392 + 4f88c5b commit eb23f1e
Showing 266 changed files with 5,527 additions and 4,226 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/cuda.yml
@@ -106,7 +106,7 @@ jobs:
which nvcc || echo "nvcc not in PATH!"
git clone https://github.com/AMReX-Codes/amrex.git ../amrex
cd amrex && git checkout --detach 2d931f63cb4d611d0d23d694726889647f8a482d && cd -
cd amrex && git checkout --detach 22.08 && cd -
make COMP=gcc QED=FALSE USE_MPI=TRUE USE_GPU=TRUE USE_OMP=FALSE USE_PSATD=TRUE USE_CCACHE=TRUE -j 2
build_nvhpc21-11-nvcc:
7 changes: 3 additions & 4 deletions .github/workflows/insitu.yml
@@ -16,17 +16,16 @@ jobs:
CC: clang
CXXFLAGS: "-Werror -Wshadow -Woverloaded-virtual -Wunreachable-code -Wno-error=pass-failed"
CMAKE_GENERATOR: Ninja
CMAKE_PREFIX_PATH: /root/install/sensei/develop/lib/cmake
CMAKE_PREFIX_PATH: /root/install/sensei/v4.0.0/lib64/cmake
container:
image: ryankrattiger/sensei:fedora33-vtk-mpi-20210616
image: senseiinsitu/ci:fedora35-amrex-20220613
steps:
- uses: actions/checkout@v2
- name: Configure
run: |
cmake -S . -B build \
-DWarpX_SENSEI=ON \
-DWarpX_COMPUTE=NOACC \
-DCMAKE_CXX_STANDARD=14
-DWarpX_COMPUTE=NOACC
- name: Build
run: |
cmake --build build -j 2
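For reference, the updated job amounts to configuring WarpX against SENSEI v4.0.0 without the explicit C++14 flag. A minimal local sketch of the same configure step (the SENSEI install prefix below is a placeholder; in the CI container it is provided via ``CMAKE_PREFIX_PATH``):

.. code-block:: bash

   # Point CMake at a SENSEI v4.0.0 install (placeholder path) and
   # configure WarpX with the in-situ backend, no accelerator offload.
   export CMAKE_PREFIX_PATH=/path/to/sensei/v4.0.0/lib64/cmake
   cmake -S . -B build \
       -DWarpX_SENSEI=ON \
       -DWarpX_COMPUTE=NOACC
   cmake --build build -j 2
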
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -48,7 +48,7 @@ repos:

# Changes tabs to spaces
- repo: https://github.com/Lucas-C/pre-commit-hooks
rev: v1.2.0
rev: v1.3.0
hooks:
- id: remove-tabs
exclude: 'Make.WarpX|Make.package|Makefile|GNUmake'
@@ -67,7 +67,7 @@ repos:

# Autoremoves unused Python imports
- repo: https://github.com/hadialqattan/pycln
rev: v1.3.3
rev: v2.1.1
hooks:
- id: pycln
name: pycln (python)
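The two hook bumps above (remove-tabs v1.3.0, pycln v2.1.1) can be exercised locally with the standard pre-commit workflow; a short sketch, assuming ``pre-commit`` is installed via pip:

.. code-block:: bash

   python3 -m pip install pre-commit
   # install the git hook and run all configured hooks once over the whole tree
   pre-commit install
   pre-commit run --all-files
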
11 changes: 8 additions & 3 deletions CMakeLists.txt
@@ -1,7 +1,7 @@
# Preamble ####################################################################
#
cmake_minimum_required(VERSION 3.18.0)
project(WarpX VERSION 22.06)
cmake_minimum_required(VERSION 3.20.0)
project(WarpX VERSION 22.08)

include(${WarpX_SOURCE_DIR}/cmake/WarpXFunctions.cmake)

@@ -447,8 +447,13 @@ if(WarpX_LIB)
)

# this will also upgrade/downgrade dependencies, e.g., when the version of picmistandard changes
if(WarpX_MPI)
set(pyWarpX_REQUIREMENT_FILE "requirements_mpi.txt")
else()
set(pyWarpX_REQUIREMENT_FILE "requirements.txt")
endif()
add_custom_target(${WarpX_CUSTOM_TARGET_PREFIX}pip_install_requirements
python3 -m pip install ${PYINSTALLOPTIONS} -r ${WarpX_SOURCE_DIR}/requirements.txt
python3 -m pip install ${PYINSTALLOPTIONS} -r "${WarpX_SOURCE_DIR}/${pyWarpX_REQUIREMENT_FILE}"
WORKING_DIRECTORY
${WarpX_BINARY_DIR}
)
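With the change above, the pip bootstrap target installs from ``requirements_mpi.txt`` only when WarpX is configured with MPI, and from ``requirements.txt`` otherwise. A minimal sketch of driving that target for a non-MPI Python build (assuming the default, empty ``WarpX_CUSTOM_TARGET_PREFIX``, so the target is plainly named ``pip_install_requirements``):

.. code-block:: bash

   # Configure the library/Python build without MPI, then let the custom
   # target install the Python requirements (requirements.txt in this case).
   cmake -S . -B build -DWarpX_LIB=ON -DWarpX_MPI=OFF
   cmake --build build --target pip_install_requirements
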
4 changes: 2 additions & 2 deletions Docs/source/acknowledge_us.rst
@@ -58,8 +58,8 @@ If your project uses the specific algorithms, please consider citing the respect
`DOI:10.1088/1367-2630/ac4ef1 <https://doi.org/10.1088/1367-2630/ac4ef1>`__

- Zoni E, Lehe R, Shapoval O, Belkin D, Zaim N, Fedeli L, Vincenti H, Vay JL.
**A Hybrid Nodal-Staggered Pseudo-Spectral Electromagnetic Particle-In-Cell Method with Finite-Order Centering**. under review, 2022.
`arXiv:2106.12919 <https://arxiv.org/abs/2106.12919>`__
**A hybrid nodal-staggered pseudo-spectral electromagnetic particle-in-cell method with finite-order centering**. *Computer Physics Communications* **279**, 2022.
`DOI:10.1016/j.cpc.2022.108457 <https://doi.org/10.1016/j.cpc.2022.108457>`__

- Shapoval O, Lehe R, Thevenet M, Zoni E, Zhao Y, Vay JL.
**Overcoming timestep limitations in boosted-frame Particle-In-Cell simulations of plasma-based acceleration**.
4 changes: 2 additions & 2 deletions Docs/source/conf.py
@@ -71,9 +71,9 @@
# built documents.
#
# The short X.Y version.
version = u'22.06'
version = u'22.08'
# The full version, including alpha/beta/rc tags.
release = u'22.06'
release = u'22.08'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
2 changes: 1 addition & 1 deletion Docs/source/install/dependencies.rst
@@ -7,7 +7,7 @@ WarpX depends on the following popular third party software.
Please see installation instructions below.

- a mature `C++17 <https://en.wikipedia.org/wiki/C%2B%2B17>`__ compiler, e.g., GCC 7, Clang 7, NVCC 11.0, MSVC 19.15 or newer
- `CMake 3.18.0+ <https://cmake.org>`__
- `CMake 3.20.0+ <https://cmake.org>`__
- `Git 2.18+ <https://git-scm.com>`__
- `AMReX <https://amrex-codes.github.io>`__: we automatically download and compile a copy of AMReX
- `PICSAR <https://github.com/ECP-WarpX/picsar>`__: we automatically download and compile a copy of PICSAR
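If the system CMake predates the new 3.20 minimum, one common route is the PyPI build of CMake; a sketch, not the project's official instructions:

.. code-block:: bash

   cmake --version                                  # must report >= 3.20
   python3 -m pip install --upgrade "cmake>=3.20"   # one way to obtain a newer CMake
   hash -r && cmake --version                       # pip's cmake must come first in PATH
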
1 change: 1 addition & 0 deletions Docs/source/install/hpc.rst
@@ -29,6 +29,7 @@ HPC Systems
hpc/summit
hpc/spock
hpc/crusher
hpc/frontier
hpc/juwels
hpc/lassen
hpc/quartz
10 changes: 7 additions & 3 deletions Docs/source/install/hpc/cori.rst
@@ -5,7 +5,11 @@ Cori (NERSC)

The `Cori cluster <https://docs.nersc.gov/systems/cori/>`_ is located at NERSC.

If you are new to this system, please see the following resources:

Introduction
------------

If you are new to this system, **please see the following resources**:

* `GPU nodes <https://docs-dev.nersc.gov/cgpu/access>`__

@@ -14,8 +18,8 @@ If you are new to this system, please see the following resources:
* `Jupyter service <https://docs.nersc.gov/services/jupyter/>`__
* `Production directories <https://www.nersc.gov/users/storage-and-file-systems/>`__:

* ``$SCRATCH``: per-user production directory (20TB)
* ``/global/cscratch1/sd/m3239``: shared production directory for users in the project ``m3239`` (50TB)
* ``$SCRATCH``: per-user production directory, purged every 30 days (20TB)
* ``/global/cscratch1/sd/m3239``: shared production directory for users in the project ``m3239``, purged every 30 days (50TB)
* ``/global/cfs/cdirs/m3239/``: community file system for users in the project ``m3239`` (100TB)

Installation
16 changes: 10 additions & 6 deletions Docs/source/install/hpc/crusher.rst
@@ -7,15 +7,19 @@ The `Crusher cluster <https://docs.olcf.ornl.gov/systems/crusher_quick_start_gui
Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node.
You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).

If you are new to this system, please see the following resources:

Introduction
------------

If you are new to this system, **please see the following resources**:

* `Crusher user guide <https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html>`_
* Batch system: `Slurm <https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#running-jobs>`_
* `Production directories <https://docs.olcf.ornl.gov/data/storage_overview.html>`_:
* `Production directories <https://docs.olcf.ornl.gov/data/index.html#data-storage-and-transfers>`_:

* ``$PROJWORK/$proj/``: shared with all members of a project (recommended)
* ``$MEMBERWORK/$proj/``: single user (usually smaller quota)
* ``$WORLDWORK/$proj/``: shared with all users
* ``$PROJWORK/$proj/``: shared with all members of a project, purged every 90 days (recommended)
* ``$MEMBERWORK/$proj/``: single user, purged every 90 days (usually smaller quota)
* ``$WORLDWORK/$proj/``: shared with all users, purged every 90 days
* Note that the ``$HOME`` directory is mounted as read-only on compute nodes.
That means you cannot run in your ``$HOME``.

@@ -97,7 +101,7 @@ Known System Issues
May 16th, 2022 (OLCFHELP-6888):
There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on Crusher on more than 1 node.

As a work-around, please export the following environment variable in your job scripts unti the issue is fixed:
As a work-around, please export the following environment variable in your job scripts until the issue is fixed:

.. code-block:: bash
119 changes: 119 additions & 0 deletions Docs/source/install/hpc/frontier.rst
@@ -0,0 +1,119 @@
.. _building-frontier:

Frontier (OLCF)
===============

The `Frontier cluster (see: Crusher) <https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html>`_ is located at OLCF.
Each node contains 4 AMD MI250X GPUs, each with 2 Graphics Compute Dies (GCDs) for a total of 8 GCDs per node.
You can think of the 8 GCDs as 8 separate GPUs, each having 64 GB of high-bandwidth memory (HBM2E).


Introduction
------------

If you are new to this system, **please see the following resources**:

* `Crusher user guide <https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html>`_
* Batch system: `Slurm <https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#running-jobs>`_
* `Production directories <https://docs.olcf.ornl.gov/data/index.html#data-storage-and-transfers>`_:

* ``$PROJWORK/$proj/``: shared with all members of a project, purged every 90 days (recommended)
* ``$MEMBERWORK/$proj/``: single user, purged every 90 days (usually smaller quota)
* ``$WORLDWORK/$proj/``: shared with all users, purged every 90 days
* Note that the ``$HOME`` directory is mounted as read-only on compute nodes.
That means you cannot run in your ``$HOME``.


Installation
------------

Use the following commands to download the WarpX source code and switch to the correct branch.
**You have to do this on Summit/OLCF Home/etc. since Frontier cannot connect directly to the internet**:

.. code-block:: bash

   git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
   git clone https://github.com/AMReX-Codes/amrex.git $HOME/src/amrex
   git clone https://github.com/ECP-WarpX/picsar.git $HOME/src/picsar
   git clone -b 0.14.5 https://github.com/openPMD/openPMD-api.git $HOME/src/openPMD-api

To enable HDF5, work around the broken ``HDF5_VERSION`` variable (empty) in the Cray PE by commenting out the following lines in ``$HOME/src/openPMD-api/CMakeLists.txt``:
https://github.com/openPMD/openPMD-api/blob/0.14.5/CMakeLists.txt#L216-L220

We use the following modules and environments on the system (``$HOME/frontier_warpx.profile``).

.. literalinclude:: ../../../../Tools/machines/frontier-olcf/frontier_warpx.profile.example
:language: bash
:caption: You can copy this file from ``Tools/machines/frontier-olcf/frontier_warpx.profile.example``.

We recommend storing the above lines in a file, such as ``$HOME/frontier_warpx.profile``, and loading it into your shell after a login:

.. code-block:: bash

   source $HOME/frontier_warpx.profile

Then, ``cd`` into the directory ``$HOME/src/warpx`` and use the following commands to compile:

.. code-block:: bash

   cd $HOME/src/warpx
   rm -rf build
   cmake -S . -B build \
       -DWarpX_COMPUTE=HIP \
       -DWarpX_amrex_src=$HOME/src/amrex \
       -DWarpX_picsar_src=$HOME/src/picsar \
       -DWarpX_openpmd_src=$HOME/src/openPMD-api
   cmake --build build -j 32

The general :ref:`cmake compile-time options <building-cmake>` apply as usual.


.. _running-cpp-frontier:

Running
-------

.. _running-cpp-frontier-MI100-GPUs:

MI250X GPUs (2x64 GB)
^^^^^^^^^^^^^^^^^^^^^

After requesting an interactive node with the ``getNode`` alias above, run a simulation like this, here using 8 MPI ranks and a single node:

.. code-block:: bash

   runNode ./warpx inputs

Or in non-interactive runs:

.. literalinclude:: ../../../../Tools/machines/frontier-olcf/submit.sh
:language: bash
:caption: You can copy this file from ``Tools/machines/frontier-olcf/submit.sh``.


.. _post-processing-frontier:

Post-Processing
---------------

For post-processing, most users use Python via OLCF's `Jupyter service <https://jupyter.olcf.ornl.gov>`__ (`Docs <https://docs.olcf.ornl.gov/services_and_applications/jupyter/index.html>`__).

Please follow the same guidance as for :ref:`OLCF Summit post-processing <post-processing-summit>`.

.. _known-frontier-issues:

Known System Issues
-------------------

.. warning::

May 16th, 2022 (OLCFHELP-6888):
There is a caching bug in Libfabric that causes WarpX simulations to occasionally hang on Frontier on more than 1 node.

As a work-around, please export the following environment variable in your job scripts until the issue is fixed:

.. code-block:: bash

   export FI_MR_CACHE_MAX_COUNT=0 # libfabric disable caching
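
For orientation only, a hypothetical batch script for the node layout described above (one MPI rank per GCD, eight per node) with the libfabric work-around applied; the project ID, time limit, and GPU options are placeholders and are not taken from ``Tools/machines/frontier-olcf/submit.sh``:

.. code-block:: bash

   #!/usr/bin/env bash
   #SBATCH -A <project>          # placeholder OLCF project ID
   #SBATCH -t 00:10:00
   #SBATCH -N 1
   #SBATCH --ntasks-per-node=8   # one MPI rank per GCD (8 GCDs per node)
   #SBATCH --gpus-per-node=8

   export FI_MR_CACHE_MAX_COUNT=0  # libfabric disable caching (OLCFHELP-6888)

   srun ./warpx inputs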
7 changes: 7 additions & 0 deletions Docs/source/install/hpc/juwels.rst
@@ -10,6 +10,12 @@ Juwels (JSC)

The `Juwels supercomputer <https://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUWELS/JUWELS_node.html>`_ is located at JSC.


Introduction
------------

If you are new to this system, **please see the following resources**:

See `this page <https://apps.fz-juelich.de/jsc/hps/juwels/quickintro.html>`_ for a quick introduction.
(Full `user guide <http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUWELS/UserInfo/UserInfo_node.html>`__).

@@ -20,6 +26,7 @@ See `this page <https://apps.fz-juelich.de/jsc/hps/juwels/quickintro.html>`_ for
* ``$FASTDATA/``: Storage location for large data (backed up)
* Note that the ``$HOME`` directory is not designed for simulation runs and producing output there will impact performance.


Installation
------------

6 changes: 5 additions & 1 deletion Docs/source/install/hpc/lassen.rst
@@ -5,7 +5,11 @@ Lassen (LLNL)

The `Lassen V100 GPU cluster <https://hpc.llnl.gov/hardware/platforms/lassen>`_ is located at LLNL.

If you are new to this system, please see the following resources:

Introduction
------------

If you are new to this system, **please see the following resources**:

* `LLNL user account <https://lc.llnl.gov/lorenz/mylc/mylc.cgi>`_
* `Lassen user guide <https://hpc.llnl.gov/training/tutorials/using-lcs-sierra-system>`_
6 changes: 5 additions & 1 deletion Docs/source/install/hpc/lawrencium.rst
@@ -5,7 +5,11 @@ Lawrencium (LBNL)

The `Lawrencium cluster <http://scs.lbl.gov/Systems>`_ is located at LBNL.

If you are new to this system, please see the following resources:

Introduction
------------

If you are new to this system, **please see the following resources**:

* `Lawrencium user guide <https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/lbnl-supercluster/lawrencium>`_
* Batch system: `Slurm <https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/slurm-usage-instructions>`_
7 changes: 7 additions & 0 deletions Docs/source/install/hpc/lxplus.rst
@@ -5,6 +5,12 @@ LXPLUS (CERN)

The LXPLUS cluster is located at CERN.


Introduction
------------

If you are new to this system, **please see the following resources**:

* `Lxplus documentation <https://lxplusdoc.web.cern.ch>`__
* Batch system: `HTCondor <https://batchdocs.web.cern.ch/index.html>`__
* Filesystem locations:
@@ -14,6 +20,7 @@ The LXPLUS cluster is located at CERN.

Through LXPLUS we have access to CPU and GPU nodes (the latter equipped with NVIDIA V100 and T4 GPUs).


Installation
------------
Only very little software is pre-installed on LXPLUS so we show how to install from scratch all the dependencies using `Spack <https://spack.io>`__.
6 changes: 5 additions & 1 deletion Docs/source/install/hpc/ookami.rst
@@ -5,7 +5,11 @@ Ookami (Stony Brook)

The `Ookami cluster <https://www.stonybrook.edu/ookami/>`__ is located at Stony Brook University.

If you are new to this system, please see the following resources:

Introduction
------------

If you are new to this system, **please see the following resources**:

* `Ookami documentation <https://www.stonybrook.edu/commcms/ookami/support/index_links_and_docs.php>`__
* Batch system: `Slurm <https://www.stonybrook.edu/commcms/ookami/support/faq/example-slurm-script>`__ (see `available queues <https://www.stonybrook.edu/commcms/ookami/support/faq/queues_on_ookami>`__)
10 changes: 7 additions & 3 deletions Docs/source/install/hpc/perlmutter.rst
@@ -10,15 +10,19 @@ Perlmutter (NERSC)

The `Perlmutter cluster <https://docs.nersc.gov/systems/perlmutter/>`_ is located at NERSC.

If you are new to this system, please see the following resources:

Introduction
------------

If you are new to this system, **please see the following resources**:

* `NERSC user guide <https://docs.nersc.gov/>`__
* Batch system: `Slurm <https://docs.nersc.gov/systems/perlmutter/#running-jobs>`__
* `Jupyter service <https://docs.nersc.gov/services/jupyter/>`__
* `Production directories <https://docs.nersc.gov/filesystems/perlmutter-scratch/>`__:

* ``$PSCRATCH``: per-user production directory (<TBD>TB)
* ``/global/cscratch1/sd/m3239``: shared production directory for users in the project ``m3239`` (50TB)
* ``$PSCRATCH``: per-user production directory, purged every 30 days (<TBD>TB)
* ``/global/cscratch1/sd/m3239``: shared production directory for users in the project ``m3239``, purged every 30 days (50TB)
* ``/global/cfs/cdirs/m3239/``: community file system for users in the project ``m3239`` (100TB)

