Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Const to SID Neighbor Table Element Type #1808

Merged
merged 97 commits into from
Oct 30, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
97 commits
Select commit Hold shift + click to select a range
eb86cbd
Renamed BlockSizes → ThreadBlockSizes
fthaler Jun 12, 2024
f10f9ad
Add loop blocking to fn GPU backend
fthaler Jun 17, 2024
c616749
Cleanup/refactor
fthaler Jun 18, 2024
44f13fd
Treat compile-time block sizes as compile-time
fthaler Jun 18, 2024
5da7ed2
Merge remote-tracking branch 'origin/master' into loop-blocking
fthaler Jun 18, 2024
b1577c1
Merge remote-tracking branch 'origin/master' into loop-blocking
fthaler Jun 18, 2024
9a5edf1
Minor cleanup
fthaler Jun 18, 2024
90e3a50
Use uint3 instead of dim3
fthaler Jun 18, 2024
f70b885
Silent warning
fthaler Jun 18, 2024
6e77d53
Add __launch_bounds__
fthaler Jun 18, 2024
508852d
Fixed comment
fthaler Jun 18, 2024
bade455
Slightly simpler meta:: calculation
fthaler Jun 18, 2024
1641040
Maybe fix for Clang-CUDA compilation
fthaler Jun 18, 2024
85dbec8
Merge remote-tracking branch 'origin/master' into loop-blocking
fthaler Jun 18, 2024
990555e
Added another ::template
fthaler Jun 18, 2024
ea9e3c9
Make clang-14-cuda-11 happy
fthaler Jun 18, 2024
d3d5228
Fix HIP compilation
fthaler Jun 19, 2024
bba6c78
Some cleanup
fthaler Jun 19, 2024
1b547d0
Fix formatting
fthaler Jun 19, 2024
6bc7bcf
Check types of ThreadBlockSizes and LoopBlockSizes
fthaler Jun 19, 2024
cf22abe
Merge branch 'master' into loop-blocking
fthaler Jun 24, 2024
afb3201
Merge branch 'master' into loop-blocking
fthaler Jun 26, 2024
21e27ea
Faster loop blocking WIP state
fthaler Jul 22, 2024
6be319a
Add const_host_view to const data store
fthaler Jul 22, 2024
ade9e8f
Added sid::make_unrolled_loop
fthaler Jul 22, 2024
0cfff11
Use explicitly unrolled loops in GPU fn backend
fthaler Jul 22, 2024
980933c
Re-enable verification
fthaler Jul 22, 2024
64098e2
Revert some irrelevant changes
fthaler Jul 22, 2024
8a021dc
Revert irrelevant changes to neighbor tables
fthaler Jul 22, 2024
8ab05d1
Revert more irrelevant changes
fthaler Jul 22, 2024
5db3605
Revert compile-time dimensions
fthaler Jul 22, 2024
05361de
Revert irrelevant changes in unstructured backend
fthaler Jul 22, 2024
d8b7b43
Formatting
fthaler Jul 22, 2024
23d2e65
Cleanup nabla stencils
fthaler Jul 22, 2024
2977c39
Fix NVCC warning
fthaler Jul 22, 2024
e2706ad
Re-enable nabla tests
fthaler Jul 22, 2024
2628178
Use vertex_field_id where applicable
fthaler Jul 22, 2024
e43a8b6
Cleanup storage SID adaptor
fthaler Jul 22, 2024
981b30a
Revert irrelevant changes
fthaler Jul 22, 2024
b9d4d34
Increased vertical block size
fthaler Jul 22, 2024
7ea497b
Revert temporary build changes
fthaler Jul 22, 2024
6c4c4f1
Fix missing ;
fthaler Jul 22, 2024
c5b3ad9
Use reasonable unroll factors in loop unrolling tests
fthaler Jul 22, 2024
34fd7e7
Fix missing include
fthaler Jul 23, 2024
e103e15
Merge remote-tracking branch 'upstream/master' into fast-loop-blockin…
fthaler Jul 23, 2024
702fa7c
Only include cuda_type_traits.hpp when required
fthaler Jul 23, 2024
68a2bd9
Enable fast-math on GPUs
fthaler Jul 23, 2024
19af796
Fix capturing of variable
fthaler Jul 23, 2024
8474c66
Disable k-blocking for now
fthaler Jul 23, 2024
69e13e6
Fix accidental use of single precision floats in fn nabla test
fthaler Jul 23, 2024
eb9096c
Updated references
fthaler Jul 23, 2024
e0fcc95
Merge remote-tracking branch 'origin/fix-nabla-float_t' into fast-loo…
fthaler Jul 23, 2024
0996c15
Possible workaround against compiler crash
fthaler Jul 24, 2024
adce6f2
Merge remote-tracking branch 'upstream/master' into fast-loop-blockin…
fthaler Jul 24, 2024
0a29491
Revert "Possible workaround against compiler crash"
fthaler Jul 24, 2024
2e6e3e0
Re-add accidentally lost index check
fthaler Jul 24, 2024
d2a6769
Selectively enable k-blocking
fthaler Jul 24, 2024
bf0ba6c
Merge remote-tracking branch 'upstream/master' into fast-loop-blockin…
fthaler Jul 24, 2024
0b0a306
Fix use of k_blocked_backend_t
fthaler Jul 24, 2024
c591f79
Explicitly use __ldg on pointer derefs
fthaler Jul 24, 2024
7699cb5
Merge remote-tracking branch 'upstream/master' into fast-loop-blockin…
fthaler Sep 2, 2024
07cae69
Fix compilation after merge
fthaler Sep 2, 2024
2a0914f
Revert unrelated changes to master
fthaler Sep 2, 2024
82c35e2
Merge remote-tracking branch 'upstream/master' into loop-blocking
fthaler Sep 25, 2024
c3edf6f
Merge remote-tracking branch 'upstream/master' into fast-loop-blockin…
fthaler Sep 25, 2024
ef4fed7
Merge branch 'fast-loop-blocking-tmp' into loop-blocking
fthaler Sep 25, 2024
ffd070e
Use sid::make_loop in sid::make_unrolled_loop if unroll factor is 1
fthaler Sep 25, 2024
6e4bcbe
Remove leftover --use_fast_math
fthaler Sep 25, 2024
b3bd8e3
Cleanup mp_find
fthaler Sep 25, 2024
304878d
Added a few comments
fthaler Sep 25, 2024
a83027f
Address review comment
fthaler Sep 25, 2024
c896ea1
Unconditional lookup in neighbor table
fthaler Jul 24, 2024
d16c1ea
Merge remote-tracking branch 'upstream/master' into loop-blocking
fthaler Oct 7, 2024
4e115a5
Revert "Unconditional lookup in neighbor table"
fthaler Oct 7, 2024
3e39f82
Revert accidental changes to Daint envs
fthaler Oct 7, 2024
494f8de
Use default loop block sizes in fn_select
fthaler Oct 7, 2024
27fc972
Disable loop unrolling for now
fthaler Oct 22, 2024
05206ac
Updated perftest references
fthaler Oct 22, 2024
071992a
Only 8 loads for neighbors and passing tests
iomaganaris Oct 15, 2024
ac48f24
Reduces loads but errors with "operation not supported on global/shar…
iomaganaris Oct 17, 2024
d69fd86
Only read neighbor tables twice
iomaganaris Oct 17, 2024
b39c679
Fix pointing to pointer of temporary array and m_index == -1
iomaganaris Oct 17, 2024
7884d4d
Keep only the necessary changes (const table ptr)
iomaganaris Oct 17, 2024
2d2a97f
Add const in sid in sid_neighbor_table
iomaganaris Oct 22, 2024
0642864
Create as_const_sid for nanobind bindings to pass input fields as con…
iomaganaris Oct 22, 2024
3aede67
Add const in as_neighbor_table changes
iomaganaris Oct 22, 2024
0a08822
Apply suggestions from code review
iomaganaris Oct 23, 2024
45fcc27
Fix as_sid and as_const_sid variable names
iomaganaris Oct 23, 2024
f341795
Use ndarray.ReadOnly to add_const if necessary in as_sid and remove a…
iomaganaris Oct 23, 2024
c1b8543
Merge remote-tracking branch 'upstream/master' into const-sid-neighbo…
fthaler Oct 29, 2024
cc97916
Fix minor style inconsistency
fthaler Oct 29, 2024
5732d10
Exclude nanobind adapter changes
fthaler Oct 29, 2024
3913959
Fix sid::as_const for C arrays
fthaler Oct 29, 2024
6cd93db
Allow reference return values of sid::as_const
fthaler Oct 29, 2024
9e13ddf
Don’t use ldg_ptr in simple_ptr_holder as it breaks sid::as_const on …
fthaler Oct 30, 2024
4e7309b
Merge remote-tracking branch 'upstream/master' into const-sid-neighbo…
fthaler Oct 30, 2024
fc5a002
Workaround for Clang-CUDA bug
fthaler Oct 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions include/gridtools/fn/sid_neighbor_table.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
#include "../common/array.hpp"
#include "../common/ldg_ptr.hpp"
#include "../fn/unstructured.hpp"
#include "../sid/as_const.hpp"
#include "../sid/concept.hpp"

namespace gridtools::fn::sid_neighbor_table {
Expand Down Expand Up @@ -61,14 +62,15 @@ namespace gridtools::fn::sid_neighbor_table {
static_assert(!std::is_same_v<IndexDimension, NeighborDimension>,
"The index dimension and the neighbor dimension must be different.");

const auto origin = sid::get_origin(sid);
const auto strides = sid::get_strides(sid);
decltype(auto) const_sid = sid::as_const(std::forward<Sid>(sid));
const auto origin = sid::get_origin(const_sid);
const auto strides = sid::get_strides(const_sid);

return sid_neighbor_table<IndexDimension,
NeighborDimension,
MaxNumNeighbors,
sid::ptr_holder_type<Sid>,
sid::strides_type<Sid>>{
decltype(origin),
decltype(strides)>{
origin, strides}; // Note: putting the return type into the function signature will crash nvcc 12.0
}
} // namespace sid_neighbor_table_impl_
Expand Down
4 changes: 2 additions & 2 deletions include/gridtools/sid/as_const.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -51,14 +51,14 @@ namespace gridtools {
* probably might we need the `host` and `device` variations as well
*/
template <class Src,
class Ptr = sid::ptr_type<std::decay_t<Src>>,
class Ptr = sid::ptr_type<std::remove_cv_t<std::remove_reference_t<Src>>>,
std::enable_if_t<std::is_pointer_v<Ptr> && !std::is_const_v<std::remove_pointer_t<Ptr>>, int> = 0>
as_const_impl_::const_adapter<Src> as_const(Src &&src) {
return {std::forward<Src>(src)};
}

template <class Src,
class Ptr = sid::ptr_type<std::decay_t<Src>>,
class Ptr = sid::ptr_type<std::remove_cv_t<std::remove_reference_t<Src>>>,
std::enable_if_t<!std::is_pointer_v<Ptr> || std::is_const_v<std::remove_pointer_t<Ptr>>, int> = 0>
decltype(auto) as_const(Src &&src) {
return std::forward<Src>(src);
Expand Down
3 changes: 1 addition & 2 deletions include/gridtools/sid/simple_ptr_holder.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,6 @@

#include "../common/defs.hpp"
#include "../common/host_device.hpp"
#include "../common/ldg_ptr.hpp"

#define GT_FILENAME <gridtools/sid/simple_ptr_holder.hpp>
#include GT_ITERATE_ON_TARGETS()
Expand All @@ -39,7 +38,7 @@ namespace gridtools {
simple_ptr_holder() = default;
GT_TARGET GT_FORCE_INLINE constexpr simple_ptr_holder(T const &ptr) : m_val{ptr} {}
#endif
GT_TARGET GT_FORCE_INLINE constexpr decltype(auto) operator()() const { return as_ldg_ptr(m_val); }
GT_TARGET GT_FORCE_INLINE constexpr T const &operator()() const { return m_val; }
};

template <class T>
Expand Down
5 changes: 4 additions & 1 deletion tests/unit_tests/fn/test_fn_sid_neighbor_table.cu
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,10 @@ namespace gridtools::fn {
using dim_hymap_t = hymap::keys<edge_dim_t, edge_to_cell_dim_t>;
auto contents = sid::synthetic()
.set<sid::property::origin>(sid::host_device::simple_ptr_holder(device_data.get()))
.set<sid::property::strides>(dim_hymap_t::make_values(num_neighbors, 1));
.set<sid::property::strides>(dim_hymap_t::make_values(num_neighbors, 1))
// for whatever reason, setting strides_kind is required
// by Clang-CUDA (tested Clang 17 + CUDA 12.4)
.set<sid::property::strides_kind, sid::unknown_kind>();

const auto table = as_neighbor_table<edge_dim_t, edge_to_cell_dim_t, num_neighbors>(contents);
using table_t = std::decay_t<decltype(table)>;
Expand Down
10 changes: 10 additions & 0 deletions tests/unit_tests/sid/test_sid_as_const.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -32,5 +32,15 @@ namespace gridtools {
static_assert(std::is_same_v<sid::ptr_type<testee_t>, double const *>);
EXPECT_EQ(sid::get_origin(src)(), sid::get_origin(testee)());
}

TEST(as_const, c_array) {
int src[3][2] = {{0, 1}, {10, 11}, {20, 21}};
auto testee = sid::as_const(src);
using testee_t = decltype(testee);

static_assert(is_sid<testee_t>());
static_assert(std::is_same_v<sid::ptr_type<testee_t>, int const *>);
EXPECT_EQ(sid::get_origin(src)(), sid::get_origin(testee)());
}
} // namespace
} // namespace gridtools
Loading