Building Analogs Ensemble on HPC #86

Open
Weiming-Hu opened this issue Feb 26, 2020 · 8 comments
Labels: tutorial (should be changed and posted on the website as a tutorial)

Comments

Weiming-Hu commented Feb 26, 2020

Building with GCC on XSEDE Stampede2

# I have already installed eccodes at ~/packages/release
# You can use this script to install eccodes locally
#
# https://github.com/Weiming-Hu/AnalogsEnsemble/blob/master/CAnEnIO/cmake/install_eccodes.sh
# 

# Load modules
module purge
module load gcc/7.1.0 boost/1.64 netcdf/4.6.2 cmake/3.16.1

# Download the source files (~10 MB)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Enter the extracted source directory
cd AnalogsEnsemble-master/

# Out-of-tree build: keep all intermediate files in a separate folder
mkdir build && cd build
CC=gcc CXX=g++ cmake -DCMAKE_INSTALL_PREFIX="~/packages/release" -DCMAKE_PREFIX_PATH="~/packages/release;$TACC_NETCDF_DIR" -DCMAKE_INSTALL_RPATH="`echo ~`/packages/release/lib;$TACC_NETCDF_LIB;$TACC_BOOST_LIB" ..

# Build
make -j 4

# Install
make install
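
CMake expects list-valued options such as CMAKE_PREFIX_PATH and CMAKE_INSTALL_RPATH to be separated by semicolons, which is easy to get wrong when a module variable turns out to be empty. A minimal sketch of a helper that assembles such a list; join_cmake_list is a hypothetical name, not part of AnalogsEnsemble or CMake:

```shell
# join_cmake_list (hypothetical helper): join arguments with ';', skipping empty ones
join_cmake_list() {
    local out="" item
    for item in "$@"; do
        [ -n "$item" ] || continue      # skip unset or empty module variables
        out="${out:+$out;}$item"
    done
    printf '%s\n' "$out"
}

# Usage sketch, with the same variables as the recipe above:
#   PREFIX_PATH=$(join_cmake_list "$HOME/packages/release" "$TACC_NETCDF_DIR")
#   cmake -DCMAKE_PREFIX_PATH="$PREFIX_PATH" ..
```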
@Weiming-Hu Weiming-Hu self-assigned this Feb 26, 2020

Weiming-Hu commented Feb 26, 2020

Building with Intel on XSEDE Stampede2

# I have already installed eccodes at ~/packages/release
# You can use this script to install eccodes locally
#
# https://github.com/Weiming-Hu/AnalogsEnsemble/blob/master/CAnEnIO/cmake/install_eccodes.sh
# 

# Load modules
module load boost netcdf

# Download the source files (~10 MB)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Enter the extracted source directory
cd AnalogsEnsemble-master/

# Out-of-tree build: keep all intermediate files in a separate folder
mkdir build && cd build
CC=icc CXX=icpc cmake -DCMAKE_INSTALL_PREFIX="~/packages/release" -DCMAKE_PREFIX_PATH="~/packages/release;$TACC_NETCDF_DIR" -DCMAKE_INSTALL_RPATH="`echo ~`/packages/release/lib;$TACC_NETCDF_LIB;$TACC_BOOST_LIB" ..

# Build
make -j 4

# Install
make install
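
The GCC and Intel recipes above differ only in the modules loaded and in the CC/CXX pair. A small sketch of a selector that makes the compiler choice explicit; compilers_for is a hypothetical helper and the toolchain labels are illustrative:

```shell
# compilers_for (hypothetical helper): print the CC/CXX assignments for a toolchain
compilers_for() {
    case "$1" in
        gcc|gnu) echo "CC=gcc CXX=g++" ;;
        intel)   echo "CC=icc CXX=icpc" ;;
        *)       echo "unknown toolchain: $1" >&2; return 1 ;;
    esac
}

# Usage sketch: pass the assignments to cmake through env
#   env $(compilers_for intel) cmake ..
```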

@Weiming-Hu Weiming-Hu added good first issue Good for newcomers tutorial should be changed and posted on the website as a tutorial and removed good first issue Good for newcomers labels Feb 26, 2020

Weiming-Hu commented Mar 10, 2020

Building with Intel and MPI on XSEDE Stampede2

# I have already installed eccodes at ~/packages/release
# You can use this script to install eccodes locally
#
# https://github.com/Weiming-Hu/AnalogsEnsemble/blob/master/CAnEnIO/cmake/install_eccodes.sh
# 

# Load modules
module load boost netcdf impi

# Download the source files (~10 MB)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Enter the extracted source directory
cd AnalogsEnsemble-master/

# Out-of-tree build: keep all intermediate files in a separate folder
mkdir build && cd build

# Enable MPI and disable OpenMP
#
# Why no multi-threading? Please read.
# https://weiming-hu.github.io/AnalogsEnsemble/doc#mpi-and-openmp
#
# TL;DR
# MPI should be fast enough for most cases.
#
CC=icc CXX=icpc cmake -DENABLE_MPI=ON -DENABLE_OPENMP=OFF -DCMAKE_INSTALL_PREFIX="~/packages/release" -DCMAKE_PREFIX_PATH="~/packages/release;$TACC_NETCDF_DIR" -DCMAKE_INSTALL_RPATH="`echo ~`/packages/release/lib;$TACC_NETCDF_LIB;$TACC_BOOST_LIB" ..

# Build
make -j 4

# Install
make install
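
After configuring, the option values CMake recorded can be read back from CMakeCache.txt in the build directory, where each entry has the form NAME:TYPE=VALUE. A minimal sketch for checking the MPI and OpenMP switches; cmake_cache_value is a hypothetical helper, not something AnalogsEnsemble ships:

```shell
# cmake_cache_value (hypothetical helper): print the cached value of one CMake variable
cmake_cache_value() {
    # $1: variable name, $2: path to CMakeCache.txt (defaults to ./CMakeCache.txt)
    local name="$1" cache="${2:-CMakeCache.txt}"
    # Strip the leading "NAME:TYPE=" and print what is left
    sed -n "s/^${name}:[A-Z]*=//p" "$cache"
}

# Usage sketch, run inside the build directory:
#   cmake_cache_value ENABLE_MPI
#   cmake_cache_value ENABLE_OPENMP
```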

@Weiming-Hu Weiming-Hu changed the title Building CAnEn on XSEDE Stampede2 Building CAnEn on HPC Mar 12, 2020

Weiming-Hu commented Mar 12, 2020

Building with Intel and MPI on NCAR Cheyenne

# Load modules
module load cmake eccodes

# Download the source files (~10 MB)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Enter the extracted source directory
cd AnalogsEnsemble-master/

# Out-of-tree build: keep all intermediate files in a separate folder
mkdir build && cd build
CC=icc CXX=icpc cmake -DBUILD_BOOST=ON -DENABLE_MPI=ON -DCMAKE_PREFIX_PATH="$NCAR_ROOT_ECCODES;$NETCDF" ..

# Build
make -j 4

# Install. Unfortunately, there are no install rules when you are building Boost,
# so you need to use the executables in the build tree
#
# make install
ls apps/anen_grib
ls apps/grib_convert
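
Since the executables stay in the build tree here, one option is to put their directories on PATH. A sketch, assuming each program lives in its own subdirectory of apps/ (as the listing above suggests); add_apps_to_path is a hypothetical helper:

```shell
# add_apps_to_path (hypothetical helper): prepend every apps/<name>/ directory to PATH
add_apps_to_path() {
    # $1: build directory (defaults to the current directory)
    local build_dir="${1:-.}" dir
    for dir in "$build_dir"/apps/*/; do
        [ -d "$dir" ] || continue
        dir=$(cd "$dir" && pwd)        # absolute path, no trailing slash
        case ":$PATH:" in
            *":$dir:"*) ;;             # already on PATH
            *) PATH="$dir:$PATH" ;;
        esac
    done
    export PATH
}

# Usage sketch, run from the build directory:
#   add_apps_to_path .
#   anen_grib -h
```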

@Weiming-Hu Weiming-Hu pinned this issue Mar 25, 2020
@Weiming-Hu Weiming-Hu changed the title Building CAnEn on HPC Building Analogs Ensemble on HPC Mar 27, 2020

Weiming-Hu commented Jul 22, 2020

Building with Intel and MPI on NREL Eagle

# I have already installed eccodes at ~/packages/release
# You can use this script to install eccodes locally
#
# https://github.com/Weiming-Hu/AnalogsEnsemble/blob/master/CAnEnIO/cmake/install_eccodes.sh
# 

# Load modules
module load intel-mpi/2019.6 boost/1.69.0/intel-18.0.3 netcdf-c/4.6.2/intel-18.0.3-mpi cmake/3.12.3 hdf5/1.10.4/intel1803-impi

# NetCDF C++ Extensions
#
# Install the NetCDF C++ extensions because the NetCDF module on
# Eagle doesn't have the C++ extensions
#
wget https://github.com/Unidata/netcdf-cxx4/archive/v4.3.1.zip
unzip v4.3.1.zip && cd netcdf-cxx4-4.3.1/
mkdir build && cd build
CC=icc CXX=icpc cmake -DCMAKE_INSTALL_PREFIX="~/packages" -DCMAKE_C_FLAGS="-I$HDF5_INCLUDE -I$I_MPI_ROOT/intel64/include" ..
make -j 12
ctest
make install
cd ../..
rm -rf v4.3.1.zip netcdf-cxx4-4.3.1

# PAnEn
#
# Download the source files (~10 MB)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Enter the extracted source directory
cd AnalogsEnsemble-master/

# Out-of-tree build: keep all intermediate files in a separate folder
mkdir build && cd build

# Generate build tree
CC=icc CXX=icpc cmake -DENABLE_MPI=ON -DCMAKE_INSTALL_PREFIX="~/packages" -DCMAKE_PREFIX_PATH="~/packages;$NETCDF_ROOT_DIR" -DCMAKE_INSTALL_RPATH="`echo ~`/packages/lib64;`echo ~`/packages/lib;$NETCDF_ROOT_DIR" ..

# Build
make -j 12

# Install
make install

# Delete files
cd ../..
rm -rf AnalogsEnsemble-master/ master.zip 
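
A wrong entry in CMAKE_INSTALL_RPATH tends to surface only when the program is run, so after make install it can help to check that every directory in the list exists. A minimal sketch; missing_rpath_dirs is a hypothetical helper that takes the same semicolon-separated string that was passed to cmake:

```shell
# missing_rpath_dirs (hypothetical helper): print RPATH entries that are not directories
missing_rpath_dirs() {
    # $1: semicolon-separated list, as given to -DCMAKE_INSTALL_RPATH
    local list="$1" entry status=0
    local IFS=';'                      # split the list on semicolons
    for entry in $list; do
        [ -n "$entry" ] || continue
        if [ ! -d "$entry" ]; then
            echo "$entry"
            status=1
        fi
    done
    return $status
}

# Usage sketch:
#   missing_rpath_dirs "$HOME/packages/lib64;$HOME/packages/lib" || echo "check the entries above"
```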

Weiming-Hu commented Oct 15, 2020

Building with GNU and MPI on NCAR Cheyenne (with PyTorch)

This set of instructions first builds libTorch on Cheyenne, which is very time-consuming, so prepare some papers to read while it builds 😄

I couldn't use the pre-built version from the official library because it was built against a newer glibc than the one on Cheyenne. I was able to link the libraries explicitly, but the program didn't run correctly, so I chose to build libTorch from source instead.
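
To see which glibc a machine has before trying a pre-built library, getconf GNU_LIBC_VERSION reports it on glibc-based systems. The sketch below compares it against a required version using sort -V from GNU coreutils; version_ge and glibc_version are hypothetical helpers, and REQUIRED in the usage comment is a placeholder, not the actual requirement of any libTorch release:

```shell
# version_ge (hypothetical helper): succeed when version $1 is at least version $2
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# glibc_version (hypothetical helper): print the version number only, e.g. "2.17"
glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

# Usage sketch (REQUIRED is a placeholder value):
#   REQUIRED=2.27
#   version_ge "$(glibc_version)" "$REQUIRED" || echo "glibc is older than $REQUIRED"
```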

# Load modules
module load gnu cmake eccodes git python/3.7.5

Sanity check!

wuh20@cheyenne3:~/github/AnalogsEnsemble/build> ml

Currently Loaded Modules:
  1) ncarenv/1.3   2) gnu/9.1.0   3) cmake/3.18.2   4) eccodes/2.12.5   5) git/2.22.0   6) python/3.7.5   7) ncarcompilers/0.5.0   8) netcdf/4.7.4   9) mpt/2.22

Looks good. Let's continue.
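
The same sanity check can be scripted. Environment Modules and Lmod should both export the loaded modules in the colon-separated LOADEDMODULES variable, so a sketch like the following reports anything that is not loaded; missing_modules is a hypothetical helper:

```shell
# missing_modules (hypothetical helper): print requested modules absent from $LOADEDMODULES
missing_modules() {
    local want entry found status=0
    local IFS=':'                      # LOADEDMODULES is colon-separated
    for want in "$@"; do
        found=0
        for entry in ${LOADEDMODULES:-}; do
            case "$entry" in
                "$want"|"$want"/*) found=1 ;;   # matches "gnu" and "gnu/9.1.0"
            esac
        done
        if [ "$found" -eq 0 ]; then
            echo "$want"
            status=1
        fi
    done
    return $status
}

# Usage sketch:
#   missing_modules gnu cmake eccodes git python || echo "load the modules listed above"
```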

#
# I assume you already have a folder named packages in your home directory.
# Everything will be installed under that directory.
#

##################
# Build libTorch #
##################
#
# Referenced from https://github.com/pytorch/pytorch/blob/master/docs/libtorch.rst
#

# Download the source code
cd ~/packages/
git clone --recursive https://github.com/pytorch/pytorch.git
cd pytorch/

# Out-of-tree build
mkdir build_libtorch && cd build_libtorch

# Prepare a virtual environment
virtualenv -p python3 venv
source venv/bin/activate

# Install dependencies
pip install pyyaml

# Call the build script
export MAX_JOBS=8
python ../tools/build_libtorch.py

# Install
cd build
make install

##############
# Build AnEn #
##############

# Download the source files (~10 MB)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Enter the extracted source directory
cd AnalogsEnsemble-master/

# Out-of-tree build: keep all intermediate files in a separate folder
mkdir build && cd build
CC=gcc CXX=g++ cmake -DBUILD_BOOST=ON -DENABLE_MPI=ON -DENABLE_AI=ON -DCMAKE_PREFIX_PATH="$NCAR_ROOT_ECCODES;$NETCDF;$HOME/packages/pytorch/torch" ..

# Build
make -j 8

# Install. Unfortunately, there are no install rules when you are building Boost,
# so you need to use the executables in the build tree
#
# make install
ls apps/anen_grib
ls apps/grib_convert
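
MAX_JOBS and make -j above are hard-coded to 8. A sketch of deriving the job count from the number of cores that are online, capped at a chosen maximum; pick_jobs is a hypothetical helper:

```shell
# pick_jobs (hypothetical helper): print min(online cores, cap)
pick_jobs() {
    # $1: upper bound on parallel jobs (defaults to 8, the value used above)
    local cap="${1:-8}" cores
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$cores" -lt "$cap" ]; then
        echo "$cores"
    else
        echo "$cap"
    fi
}

# Usage sketch:
#   export MAX_JOBS=$(pick_jobs 8)
#   make -j "$(pick_jobs 8)"
```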

Weiming-Hu commented Dec 17, 2021

Building with GNU on CW3E Feather/Skyriver

First, you need to install eccodes. The instructions are in the install_eccodes.sh script (https://github.com/Weiming-Hu/AnalogsEnsemble/blob/master/CAnEnIO/cmake/install_eccodes.sh). You may also need to install CMake because the system version is too old. Instructions can be found here.

Then, we are going to install an older version of the NetCDF C++4 extension because the system version of NetCDF is not up to date. Please see the netcdf-cxx4 release page for the version requirements.

# Verify system NetCDF version
nc-config --version # netCDF 4.3.3.1

wget https://github.com/Unidata/netcdf-cxx4/archive/refs/tags/v4.3.0.zip
unzip v4.3.0.zip
cd netcdf-cxx4-4.3.0/ && mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX="~/packages/netcdf-cxx4" ..
make -j 12
ctest
make install
cd ../..
rm -rf v4.3.0.zip netcdf-cxx4-4.3.0

Finally, let's build AnEn.

wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip
unzip master.zip

rm master.zip

cd AnalogsEnsemble-master/
mkdir build && cd build

# Generate build tree
# Note that I have included both the NetCDF C++4 and the eccodes paths.
# You need to change the eccodes path accordingly based on your installation.
#
cmake -DCMAKE_PREFIX_PATH="~/packages/netcdf-cxx4;~/packages/eccodes" -DBUILD_BOOST=ON ..

make -j 16

# Find the executable below
file apps/anen_netcdf/anen_netcdf
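
Because the executable is used straight from the build tree, it is worth confirming that its shared libraries resolve. ldd marks unresolved ones as "not found"; the sketch below filters its output, with unresolved_libs as a hypothetical helper:

```shell
# unresolved_libs (hypothetical helper): read ldd output on stdin and
# print the names of the libraries reported as "not found"
unresolved_libs() {
    awk '/=> not found/ {print $1}'
}

# Usage sketch, run from the build directory:
#   ldd apps/anen_netcdf/anen_netcdf | unresolved_libs
```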

Weiming-Hu commented Jan 12, 2022

Build with Intel and MPI on Comet


PERFORMANCE TIPS

  1. Putting your source code in a project folder like /cw3e/mead/projects/ speeds up file I/O.
  2. Running commands on a compute node helps performance.

To start off, load modules

module load netcdf hdf5 intelmpi

# module list
# Currently Loaded Modulefiles:
# 1) intel/2018.1.163      2) mvapich2_ib/2.3.2     3) hdf5/1.10.3           4) netcdf/4.6.1          5) intelmpi/2018.1.163

We need to install cmake, netcdf-c++4, and eccodes. Instructions can be found in previous comments.

When running cmake for netcdf-c++4, use the following command:

CC=icc CXX=icpc cmake -DCMAKE_INSTALL_PREFIX="~/packages/netcdf-cxx4-4.3.1/release" -DCMAKE_PREFIX_PATH="/opt/netcdf/4.6.1/intel/mvapich2_ib;/opt/hdf5/1.10.3/intel/mvapich2_ib;/opt/intel/2018.1.163/compilers_and_libraries_2018.1.163/linux/mpi/intel64" -DCMAKE_C_FLAGS="-I/opt/hdf5/1.10.3/intel/mvapich2_ib/include/ -I/opt/intel/2018.1.163/compilers_and_libraries_2018.1.163/linux/mpi/intel64/include" -DCMAKE_CXX_FLAGS="-I/opt/hdf5/1.10.3/intel/mvapich2_ib/include -I/opt/intel/2018.1.163/compilers_and_libraries_2018.1.163/linux/mpi/intel64/include" ..

Otherwise, the build didn't work for me. It complained that it could not find mpi.h and hdf5.h.
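
The -I flags can also be assembled by checking which candidate directories really contain the missing headers. A sketch, with include_flags as a hypothetical helper; HDF5_INC and MPI_INC in the usage comment are placeholder variables, not something the Comet modules are known to set:

```shell
# include_flags (hypothetical helper): print -I flags for the directories that hold a header
include_flags() {
    # $1: header file name; remaining arguments: candidate include directories
    local header="$1" dir flags=""
    shift
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            flags="${flags:+$flags }-I$dir"
        fi
    done
    printf '%s\n' "$flags"
}

# Usage sketch (HDF5_INC and MPI_INC are placeholders for the real include paths):
#   FLAGS="$(include_flags hdf5.h "$HDF5_INC") $(include_flags mpi.h "$MPI_INC")"
#   cmake -DCMAKE_C_FLAGS="$FLAGS" -DCMAKE_CXX_FLAGS="$FLAGS" ..
```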

When running cmake for PAnEn, use the following command:

CC=icc CXX=icpc cmake -DENABLE_MPI=ON -DENABLE_OPENMP=ON -DCMAKE_PREFIX_PATH="~/packages/eccodes-2.24.1-Source/release/;~/packages/netcdf-cxx4-4.3.1/release/;/opt/netcdf/4.6.1/intel/mvapich2_ib" -DBUILD_BOOST=ON ..

The rest of the steps are pretty standard. Please refer to previous comments.

Weiming-Hu commented Feb 22, 2022

Build with GCC, MPI, and PyTorch on Comet

  • Get a compute node to improve performance: srun --partition=shared --pty --nodes=1 --ntasks-per-node=24 -t 12:00:00 --wait=0 --export=ALL /bin/bash
  • Load modules: module purge && module load gnu openmpi_ib netcdf
  • Compile boost binaries: I have tested with boost_1_71_0.tar.bz2.
  • Create a Python virtual environment and install PyYAML and typing_extensions
  • Compile PyTorch from source
cd ~/packages/
wget https://github.com/pytorch/pytorch/releases/download/v1.10.2/pytorch-v1.10.2.tar.gz
tar xvf pytorch-v1.10.2.tar.gz
cd pytorch-v1.10.2/
mkdir build && cd build
cmake -DBUILD_SHARED_LIBS:BOOL=ON -DCMAKE_BUILD_TYPE:STRING=Release -DPYTHON_EXECUTABLE:PATH=`which python3` -DCMAKE_INSTALL_PREFIX:PATH=../release ../
make install -j 24
  • Specify environment variables:
# A python environment where you are going to use the AnEnGrid package.
# This could be just a newly created environment from conda or pip
#
export EXE_PYTHON=$HOME/large/venv_deepanalogs/bin/python3.9

# The directory where you install your compiled boost
export DIR_BOOST=$HOME/large/packages/boost_1_71_0/release/

# The directory where you install Eccodes
export DIR_EC=$HOME/large/packages/eccodes-2.24.1-Source/release/

# The directory where you install NetCDF C++4
export DIR_NC=$HOME/large/packages/netcdf-cxx4-4.3.1/release/

# The directory where you installed libTorch (the CMAKE_INSTALL_PREFIX used above)
export DIR_TORCH=$HOME/large/packages/pytorch-v1.10.2/release
  • Download PAnEn source code and extract
  • Out-of-tree build and run cmake (read previous instructions if you don't know what out-of-tree build means):
CC=gcc CXX=g++ cmake \
    -DENABLE_AI=ON -DENABLE_MPI=ON -DENABLE_OPENMP=ON \
    -DCMAKE_PREFIX_PATH="$DIR_BOOST;$DIR_EC;$NETCDFHOME;$DIR_NC;$MPIHOME;$DIR_TORCH" \
    -DBUILD_PYGRID=ON -DBUILD_SHARED_LIBS=ON \
    -DPYTHON_EXECUTABLE=$EXE_PYTHON ..
  • Build: make -j 24
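
Before running cmake, it can save a failed configure to check that each directory variable above is set and points to an existing directory. A minimal sketch; unset_or_missing_dirs is a hypothetical helper:

```shell
# unset_or_missing_dirs (hypothetical helper): print the names of variables that are
# empty, unset, or do not point to an existing directory
unset_or_missing_dirs() {
    local name value status=0
    for name in "$@"; do
        eval "value=\${$name:-}"       # look up the variable by name
        if [ -z "$value" ] || [ ! -d "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return $status
}

# Usage sketch:
#   unset_or_missing_dirs DIR_BOOST DIR_EC DIR_NC DIR_TORCH || echo "fix the variables listed above"
```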

To test loading AnEnGrid:

# Assume you have AnEnGrid.cpython-xxx.so under the following directory
export PYTHONPATH=$HOME/github/AnalogsEnsemble/build/CGrid:$PYTHONPATH

# Test load
python -c "from AnEnGrid import AnEnGrid"
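
If PYTHONPATH is empty, the export above leaves a trailing colon. A sketch of a helper that avoids this; prepend_to_var is a hypothetical name:

```shell
# prepend_to_var (hypothetical helper): put a directory first in a colon-separated variable
prepend_to_var() {
    # $1: variable name (e.g. PYTHONPATH), $2: directory to put first
    local name="$1" dir="$2" current
    eval "current=\${$name:-}"
    # Only add the separator when the variable already had a value
    eval "$name=\"\$dir\${current:+:\$current}\""
    export "$name"
}

# Usage sketch:
#   prepend_to_var PYTHONPATH "$HOME/github/AnalogsEnsemble/build/CGrid"
```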

To see the AI extension from anen_netcdf:

$HOME/github/AnalogsEnsemble/build/apps/anen_netcdf/anen_netcdf -h
