Execution spaces: support for memory backends and execution policies #543

Merged on Dec 19, 2022 (56 commits)

Commits
f90c365
added memory backends spec
cnpetra Aug 31, 2022
56510d6
mem backend in RAJA vec
cnpetra Sep 1, 2022
71f7e10
misc
cnpetra Sep 1, 2022
2254da4
added Transfer Implementation for Umpire memory backend
cnpetra Sep 2, 2022
d9bd0a2
instrumented RAJA vector class to use the new backend
cnpetra Sep 2, 2022
90a2b01
removed unnecessary const from some testing methods
cnpetra Sep 2, 2022
0dcb824
implementation of memory backends
cnpetra Sep 2, 2022
1aa3025
raja vector fully ported to the new abstract backends
cnpetra Sep 2, 2022
f1cc9bc
allocators for Cpp and Cuda memory
cnpetra Sep 2, 2022
c77b988
Hardware backend in Cuda Csr
cnpetra Sep 5, 2022
5e1c150
reorg: new directory and names for execution backends
cnpetra Sep 7, 2022
713554d
renamed HWBackend to ExecSpace
cnpetra Sep 7, 2022
0d056da
added cmake files for ExecBackends
cnpetra Sep 7, 2022
463a089
moved ExecSpaceInfo to ExecSpace.hpp
cnpetra Sep 7, 2022
17c9ae6
removed asserts
cnpetra Sep 7, 2022
a35796d
split raja vector into impl .h and cuda .cpp
cnpetra Sep 19, 2022
228a75d
close to final design of exec spaces
cnpetra Sep 19, 2022
bff3d05
blended in mem backends with exec policies backends
cnpetra Sep 19, 2022
e44c03c
renamed CUDA mem backend header
cnpetra Sep 19, 2022
c4dd061
renamed file for umpire memory backend
cnpetra Sep 19, 2022
746c228
renamed C++ mem backend
cnpetra Sep 19, 2022
2c93868
RAJA Exec Policies in separate impl .hpp files
cnpetra Sep 19, 2022
794b1d4
added Hip and Cuda RAJA Impl files
cnpetra Sep 19, 2022
c87e5da
removed hiop_raja_defs.hpp
cnpetra Sep 19, 2022
4adb714
renamed exec space member variable, added explicit template instantia…
cnpetra Sep 26, 2022
87e2a8c
Merge branch 'develop' into hwbackends_memory_dev
cnpetra Oct 25, 2022
a2bb874
test for cuda condensed linear algebra temporary disabled when RAJA i…
cnpetra Oct 25, 2022
63ead2f
fixed compilation errors (CUDA without RAJA)
cnpetra Oct 25, 2022
48b3950
cuda execution policies: members for block sizes
cnpetra Oct 25, 2022
f0a0b0e
changed csr kernels to take block size as arguments (not the whole ex…
cnpetra Oct 26, 2022
f2e0e37
updated documentation and added draft of the new options (+documentat…
cnpetra Oct 26, 2022
6aecba4
fix of previous commit: added options for pridec instead of nlp solver
cnpetra Oct 26, 2022
94d412a
Merge branch 'develop' into hwbackends_memory_dev
cnpetra Oct 27, 2022
fc9e34f
adding openmp raja
cnpetra Oct 28, 2022
28d1789
added HIP implementation vector file
cnpetra Nov 4, 2022
c77c581
implemented memory backend for hip
cnpetra Nov 4, 2022
d9988b6
reviewer's comments: HIP and CUDA are exclusive
cnpetra Dec 1, 2022
5946695
removed hiopVectorRajaPar
cnpetra Dec 3, 2022
5c24791
addressing issues with OpenMP builds
cnpetra Dec 3, 2022
9fd3f27
fixed missing headers
cnpetra Dec 3, 2022
cb0d45d
fixing (last) OpenMP error
cnpetra Dec 3, 2022
44c31a0
fixed broken `elif`
cnpetra Dec 4, 2022
032ca51
reviewers' comments
cnpetra Dec 5, 2022
892fd5c
memory allocators supports array size types as template parameters
cnpetra Dec 5, 2022
721b1cc
cleaning up
cnpetra Dec 7, 2022
6aa3df3
cleaning up LinAlgFactory
cnpetra Dec 7, 2022
9cc4f5c
Merge branch 'develop' into hwbackends_memory_dev
cnpetra Dec 7, 2022
d4afc16
fixed merge-related errors
cnpetra Dec 7, 2022
9dbbfce
final fixes for reviewers' comments
cnpetra Dec 18, 2022
d7e4cbc
fixed compilation errors on hip platforms
cnpetra Dec 9, 2022
474f05c
implemented memory transfers between Hip and Umpire memory backends
cnpetra Dec 9, 2022
df6136e
added install headers for exe backend
cnpetra Dec 9, 2022
9c092c2
Merge branch 'develop' into hwbackends_memory_dev
cnpetra Dec 18, 2022
b5a65c6
fixed merge compilation errors
cnpetra Dec 18, 2022
7414469
fixed runtime issues due to merge
cnpetra Dec 19, 2022
4794a18
fixed assert
cnpetra Dec 19, 2022
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -388,6 +388,7 @@ configure_file(
include_directories(${CMAKE_BINARY_DIR})

include_directories(src/Interface)
include_directories(src/ExecBackends)
include_directories(src/Optimization)
include_directories(src/LinAlg)
include_directories(src/Utils)
1 change: 1 addition & 0 deletions src/CMakeLists.txt
@@ -1,4 +1,5 @@
add_subdirectory(Interface)
add_subdirectory(ExecBackends)
add_subdirectory(Optimization)
add_subdirectory(LinAlg)
add_subdirectory(Utils)
2 changes: 1 addition & 1 deletion src/Drivers/Dense/NlpDenseConsEx1.hpp
@@ -2,7 +2,7 @@
#define HIOP_EXAMPLE_DENSE_EX1

#include "hiopVector.hpp"
#include "hiopLinAlgFactory.hpp"
#include "LinAlgFactory.hpp"
#include "hiopNlpFormulation.hpp"
#include "hiopInterface.hpp"
#include "hiopAlgFilterIPM.hpp"
2 changes: 1 addition & 1 deletion src/Drivers/MDS/NlpMdsEx1.hpp
@@ -6,7 +6,7 @@
//this include is not needed in general
//we use hiopMatrixDense in this particular example for convenience
#include "hiopMatrixDenseRowMajor.hpp"
#include "hiopLinAlgFactory.hpp"
#include "LinAlgFactory.hpp"

#ifdef HIOP_USE_MPI
#include "mpi.h"
2 changes: 1 addition & 1 deletion src/Drivers/MDS/NlpMdsEx2.hpp
@@ -6,7 +6,7 @@
//this include is not needed in general
//we use hiopMatrixDense in this particular example for convenience
#include "hiopMatrixDense.hpp"
#include "hiopLinAlgFactory.hpp"
#include "LinAlgFactory.hpp"

#ifdef HIOP_USE_MPI
#include "mpi.h"
25 changes: 22 additions & 3 deletions src/Drivers/MDS/NlpMdsRajaEx1.cpp
@@ -65,9 +65,28 @@
#include <hiopMatrixDenseRowMajor.hpp>
#include <hiopMatrixRajaDense.hpp>

#include <hiop_raja_defs.hpp>
using ex1_raja_exec = hiop::hiop_raja_exec;
using ex1_raja_reduce = hiop::hiop_raja_reduce;
//TODO: Consider not using the internal HiOp RAJA policies here and, instead, giving self-contained
// definitions of the policies, so that the user gets a better grasp of the concept and does not
// rely on the internals of HiOp. For example:
// #define RAJA_LAMBDA [=] __device__
// using ex1_raja_exec = RAJA::cuda_exec<128>;
// more defs here


#if defined(HIOP_USE_CUDA)
#include "ExecPoliciesRajaCudaImpl.hpp"
using ex1_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaCuda>::hiop_raja_exec;
using ex1_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaCuda>::hiop_raja_reduce;
#elif defined(HIOP_USE_HIP)
#include <ExecPoliciesRajaHipImpl.hpp>
using ex1_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaHip>::hiop_raja_exec;
using ex1_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaHip>::hiop_raja_reduce;
#else
//#if !defined(HIOP_USE_CUDA) && !defined(HIOP_USE_HIP)
#include <ExecPoliciesRajaOmpImpl.hpp>
using ex1_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaOmp>::hiop_raja_exec;
using ex1_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaOmp>::hiop_raja_reduce;
#endif

using namespace hiop;
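
The backend selection above follows a common pattern: a primary template is specialized once per backend tag, and the preprocessor picks one tag so the rest of the driver only sees an alias. The following is a minimal, self-contained sketch of that pattern, not HiOp's actual code. All `demo_*` and `Demo*` names and the `DEMO_USE_CUDA` macro are hypothetical stand-ins chosen so the snippet compiles without RAJA; a real driver would put RAJA policy types such as `RAJA::cuda_exec<128>` in their place.

```cpp
#include <cstddef>

// Hypothetical stand-ins for RAJA execution policy types (assumption:
// real code would use RAJA policies here instead).
template<std::size_t BlockSize>
struct demo_gpu_exec { static constexpr std::size_t block_size = BlockSize; };
struct demo_omp_exec { static constexpr std::size_t block_size = 1; };

// Tag types that name a backend, in the spirit of ExecPolicyRajaCuda
// and ExecPolicyRajaOmp in the diff.
struct DemoPolicyCuda {};
struct DemoPolicyOmp {};

// The primary template is only declared; using a tag without a
// specialization is a compile-time error.
template<class Policy>
struct DemoPoliciesBackend;

template<>
struct DemoPoliciesBackend<DemoPolicyCuda>
{
  using exec = demo_gpu_exec<128>;
};

template<>
struct DemoPoliciesBackend<DemoPolicyOmp>
{
  using exec = demo_omp_exec;
};

// The preprocessor chooses the tag once; everything below uses the alias.
#if defined(DEMO_USE_CUDA)
using demo_exec = DemoPoliciesBackend<DemoPolicyCuda>::exec;
#else
using demo_exec = DemoPoliciesBackend<DemoPolicyOmp>::exec;
#endif

// Both specializations can be checked at compile time, whichever is selected.
static_assert(DemoPoliciesBackend<DemoPolicyCuda>::exec::block_size == 128,
              "GPU stand-in carries its block size");

// Returns the block size of whichever backend the preprocessor selected.
std::size_t selected_block_size() { return demo_exec::block_size; }
```

Because the choice is made entirely at compile time, a driver written against the alias needs no changes when another backend is added; only a new specialization and a new preprocessor branch are required.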

2 changes: 1 addition & 1 deletion src/Drivers/MDS/NlpMdsRajaEx1.hpp
@@ -66,7 +66,7 @@
//this include is not needed in general
//we use hiopMatrixDense in this particular example for convenience
#include <hiopMatrixDense.hpp>
#include <hiopLinAlgFactory.hpp>
#include <LinAlgFactory.hpp>

#ifdef HIOP_USE_MPI
#include "mpi.h"
44 changes: 26 additions & 18 deletions src/Drivers/PriDec/NlpPriDecEx2SparseRaja.cpp
@@ -4,8 +4,21 @@

#include <RAJA/RAJA.hpp>

using ex9_raja_exec = hiop::hiop_raja_exec;
using ex9_raja_reduce = hiop::hiop_raja_reduce;
#if defined(HIOP_USE_CUDA)
#include <ExecPoliciesRajaCudaImpl.hpp>
using ex9_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaCuda>::hiop_raja_exec;
using ex9_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaCuda>::hiop_raja_reduce;
#elif defined(HIOP_USE_HIP)
#include <ExecPoliciesRajaHipImpl.hpp>
using ex9_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaHip>::hiop_raja_exec;
using ex9_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaHip>::hiop_raja_reduce;
#else
//#if !defined(HIOP_USE_CUDA) && !defined(HIOP_USE_HIP)
#include <ExecPoliciesRajaOmpImpl.hpp>
using ex9_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaOmp>::hiop_raja_exec;
using ex9_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaOmp>::hiop_raja_reduce;
#endif

using namespace hiop;

PriDecMasterProbleEx2Sparse::
@@ -109,12 +122,12 @@ bool PriDecMasterProbleEx2Sparse::eval_f_rterm(size_t idx, const int& n, const d
{
assert(nx_==n);
rval=-1e+20;
hiopSolveStatus status;
double* xi;

#ifdef HIOP_USE_MPI
double t3 = MPI_Wtime();
double t4 = 0.;
//to monitor contingency compute time
//double t3 = MPI_Wtime();
//double t4 = 0.;
#endif

// xi can be set below
@@ -128,13 +141,6 @@ bool PriDecMasterProbleEx2Sparse::eval_f_rterm(size_t idx, const int& n, const d

ex9_recourse = new PriDecRecourseProbleEx2Sparse(nc_, nS_, S_, x, xi, mem_space_);

// set a few contingencies to have different sparse structures to create unbalanced load
/*
if(idx%30==0) {
ex9_recourse->set_sparse(0.3);
}
*/

hiopNlpSparse nlp(*ex9_recourse);
nlp.options->SetStringValue("duals_update_type", "linear");
//nlp.options->SetStringValue("dualsInitialization", "zero");
@@ -152,25 +158,27 @@ bool PriDecMasterProbleEx2Sparse::eval_f_rterm(size_t idx, const int& n, const d

hiopAlgFilterIPMNewton solver(&nlp);


//assert("for debugging" && false); //for debugging purpose
status = solver.run();
hiopSolveStatus status = solver.run();
assert(status==Solve_Success ||
status==Solve_Success_RelTol ||
status==Solve_Acceptable_Level);

rval = solver.getObjective();
if(y_==nullptr) {
y_ = new double[ny_];
}
solver.getSolution(y_);

#ifdef HIOP_USE_MPI
#ifdef HIOP_USE_MPI
// uncomment if want to monitor contingency computing time
/* t4 = MPI_Wtime();
/*
t4 = MPI_Wtime();
if(idx==0||idx==1) {
printf( "Elapsed time for contingency %d is %f\n",idx, t4 - t3 );
printf(" Objective for idx %d value %18.12e, xi %18.12e\n",idx,rval,xi[0]);
}
*/
#endif
#endif

delete[] xi;
delete ex9_recourse;
18 changes: 15 additions & 3 deletions src/Drivers/PriDec/NlpPriDecEx2UserRecourseSparseRaja.hpp
@@ -10,9 +10,21 @@
using size_type = hiop::size_type;
using index_type = hiop::index_type;

#include <hiop_raja_defs.hpp>
using ex9_raja_exec = hiop::hiop_raja_exec;
using ex9_raja_reduce = hiop::hiop_raja_reduce;
#if defined(HIOP_USE_CUDA)
#include <ExecPoliciesRajaCudaImpl.hpp>
using ex9_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaCuda>::hiop_raja_exec;
using ex9_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaCuda>::hiop_raja_reduce;
#elif defined(HIOP_USE_HIP)
#include <ExecPoliciesRajaHipImpl.hpp>
using ex9_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaHip>::hiop_raja_exec;
using ex9_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaHip>::hiop_raja_reduce;
#else
//#if !defined(HIOP_USE_CUDA) && !defined(HIOP_USE_HIP)
#include <ExecPoliciesRajaOmpImpl.hpp>
using ex9_raja_exec = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaOmp>::hiop_raja_exec;
using ex9_raja_reduce = hiop::ExecRajaPoliciesBackend<hiop::ExecPolicyRajaOmp>::hiop_raja_reduce;
#endif

using namespace hiop;

/** This class provides an example of what a user of hiop::hiopInterfacePriDecProblem
5 changes: 4 additions & 1 deletion src/Drivers/Sparse/CMakeLists.txt
@@ -41,7 +41,10 @@ endif(HIOP_USE_GINKGO)
add_test(NAME NlpSparse2_1 COMMAND ${RUNCMD} "$<TARGET_FILE:NlpSparseEx2.exe>" "500" "-selfcheck")
add_test(NAME NlpSparse2_2 COMMAND ${RUNCMD} "$<TARGET_FILE:NlpSparseEx2.exe>" "500" "-inertiafree" "-selfcheck")
if(HIOP_USE_CUDA)
add_test(NAME NlpSparse2_3 COMMAND ${RUNCMD} "$<TARGET_FILE:NlpSparseEx2.exe>" "500" "-cusolver" "-inertiafree" "-selfcheck")
#disable cuda condensed linear algebra test when raja is not present (temporary)
if(HIOP_USE_RAJA)
add_test(NAME NlpSparse2_3 COMMAND ${RUNCMD} "$<TARGET_FILE:NlpSparseEx2.exe>" "500" "-cusolver" "-inertiafree" "-selfcheck")
endif(HIOP_USE_RAJA)
endif(HIOP_USE_CUDA)
if(HIOP_USE_GINKGO)
add_test(NAME NlpSparse2_4 COMMAND ${RUNCMD} "$<TARGET_FILE:NlpSparseEx2.exe>" "500" "-ginkgo" "-inertiafree" "-selfcheck")
2 changes: 1 addition & 1 deletion src/Drivers/Sparse/NlpSparseEx2.hpp
@@ -6,7 +6,7 @@
//this include is not needed in general
//we use hiopMatrixSparse in this particular example for convenience
#include "hiopMatrixSparse.hpp"
#include "hiopLinAlgFactory.hpp"
#include "LinAlgFactory.hpp"

#ifdef HIOP_USE_MPI
#include "mpi.h"
13 changes: 13 additions & 0 deletions src/ExecBackends/CMakeLists.txt
@@ -0,0 +1,13 @@
# Set headers to be installed as part of the hiop interface
set(hiopExecBackends_INTERFACE_HEADERS
ExecSpace.hpp
ExecPoliciesRajaCudaImpl.hpp
ExecPoliciesRajaHipImpl.hpp
ExecPoliciesRajaOmpImpl.hpp
MemBackendCppImpl.hpp
MemBackendCudaImpl.hpp
MemBackendHipImpl.hpp
MemBackendUmpireImpl.hpp
)

install(FILES ${hiopExecBackends_INTERFACE_HEADERS} DESTINATION include)
104 changes: 104 additions & 0 deletions src/ExecBackends/ExecPoliciesRajaCudaImpl.hpp
@@ -0,0 +1,104 @@
// Copyright (c) 2022, Lawrence Livermore National Security, LLC.
// Produced at the Lawrence Livermore National Laboratory (LLNL).
// LLNL-CODE-742473. All rights reserved.
//
// This file is part of HiOp. For details, see https://github.com/LLNL/hiop. HiOp
// is released under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause).
// Please also read "Additional BSD Notice" below.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
// i. Redistributions of source code must retain the above copyright notice, this list
// of conditions and the disclaimer below.
// ii. Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the disclaimer (as noted below) in the documentation and/or
// other materials provided with the distribution.
// iii. Neither the name of the LLNS/LLNL nor the names of its contributors may be used to
// endorse or promote products derived from this software without specific prior written
// permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
// OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
// SHALL LAWRENCE LIVERMORE NATIONAL SECURITY, LLC, THE U.S. DEPARTMENT OF ENERGY OR
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
// OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
// AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
// EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Additional BSD Notice
// 1. This notice is required to be provided under our contract with the U.S. Department
// of Energy (DOE). This work was produced at Lawrence Livermore National Laboratory under
// Contract No. DE-AC52-07NA27344 with the DOE.
// 2. Neither the United States Government nor Lawrence Livermore National Security, LLC
// nor any of their employees, makes any warranty, express or implied, or assumes any
// liability or responsibility for the accuracy, completeness, or usefulness of any
// information, apparatus, product, or process disclosed, or represents that its use would
// not infringe privately-owned rights.
// 3. Also, reference herein to any specific commercial products, process, or services by
// trade name, trademark, manufacturer or otherwise does not necessarily constitute or
// imply its endorsement, recommendation, or favoring by the United States Government or
// Lawrence Livermore National Security, LLC. The views and opinions of authors expressed
// herein do not necessarily state or reflect those of the United States Government or
// Lawrence Livermore National Security, LLC, and shall not be used for advertising or
// product endorsement purposes.

/**
* @file ExecPoliciesRajaCudaImpl.hpp
*
* @author Cosmin G. Petra <petra1@llnl.gov>, LLNL
* @author Slaven Peles <peless@ornl.gov>, ORNL
* @author Nai-Yuan Chiang <chiang7@llnl.gov>, LLNL
*/

/**
* This file contains the CUDA RAJA policies. It should generally be included only in CUDA
* compilation units.
*/

#ifndef HIOP_EXEC_POL_RAJA_CUDA
#define HIOP_EXEC_POL_RAJA_CUDA

#if defined(HIOP_USE_RAJA) && defined(HIOP_USE_CUDA)

#include "ExecSpace.hpp"

#include <cuda.h>
#include <RAJA/RAJA.hpp>

namespace hiop
{
#define RAJA_LAMBDA [=] __device__

template<>
struct ExecRajaPoliciesBackend<ExecPolicyRajaCuda>
{
static constexpr unsigned short int HIOP_RAJA_GPU_BLOCK_SIZE = 128;

using hiop_raja_exec = RAJA::cuda_exec<HIOP_RAJA_GPU_BLOCK_SIZE>;
using hiop_raja_reduce = RAJA::cuda_reduce;
using hiop_raja_atomic = RAJA::cuda_atomic;

// The following are primarily for _matrix_exec_
using hiop_block_x_loop = RAJA::cuda_block_x_loop;
using hiop_thread_x_loop = RAJA::cuda_thread_x_loop;
template<typename T>
using hiop_kernel = RAJA::statement::CudaKernel<T>;

using matrix_exec =
RAJA::KernelPolicy<
hiop_kernel<
RAJA::statement::For<1, hiop_block_x_loop,
RAJA::statement::For<0, hiop_thread_x_loop,
RAJA::statement::Lambda<0>
>
>
>
>;

};
} //end of namespace
#endif //defined(HIOP_USE_RAJA) && defined(HIOP_USE_CUDA)
#endif
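
To make the nesting in `matrix_exec` easier to read, here is a sequential sketch of the iteration order that a two-level policy of the form `For<1, ..., For<0, ..., Lambda<0>>>` describes: segment 1 drives the outer loop, segment 0 drives the inner loop, and the body receives the indices in segment order. The names `demo_kernel_2d` and `demo_fill` are hypothetical, and the row/column assignment is an assumption made for illustration. On a GPU the outer loop would be distributed over blocks and the inner loop over threads rather than run in order.

```cpp
#include <cstddef>
#include <vector>

// Sequential stand-in for a kernel policy shaped like
//   For<1, outer_policy, For<0, inner_policy, Lambda<0>>>
// n0 and n1 are the lengths of segment 0 and segment 1.
template<class Body>
void demo_kernel_2d(std::size_t n0, std::size_t n1, Body body)
{
  for(std::size_t i1 = 0; i1 < n1; ++i1) {    // For<1, ...>: outer level
    for(std::size_t i0 = 0; i0 < n0; ++i0) {  // For<0, ...>: inner level
      body(i0, i1);                           // Lambda<0>, indices in segment order
    }
  }
}

// Illustrative use: fill a row-major n_rows x n_cols matrix so that
// entry (row, col) holds 10*row + col. Columns are taken as segment 0
// and rows as segment 1 (an assumption for this sketch).
std::vector<double> demo_fill(std::size_t n_rows, std::size_t n_cols)
{
  std::vector<double> M(n_rows * n_cols, 0.0);
  double* data = M.data();
  demo_kernel_2d(n_cols, n_rows, [=](std::size_t col, std::size_t row) {
    data[row * n_cols + col] = static_cast<double>(10 * row + col);
  });
  return M;
}
```

The lambda captures by value and writes through a raw pointer, the same style a `[=] __device__` lambda requires, since device code cannot capture host objects by reference.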