This repository has been archived by the owner on Jan 4, 2022. It is now read-only.
@spdomin I'd like to start by identifying a set of canonical tests with high aspect ratios and benchmarking their performance. Ideally, the tests wouldn't be too large, so that some analysis might be possible.
…in#60)
The version of the element-algorithm that uses std::vectors for
scratch arrays (TestElemAlgorithmWithVectors) cannot execute correctly
in true multi-threaded mode, since the vectors can't feasibly be
allocated on a per-thread basis.
Thus, this version of the element-algorithm is now restricted to
explicitly *not* run multi-threaded.
The version of the element-algorithm that uses Kokkos::views for
scratch arrays (TestElemAlgorithmWithViews) now executes correctly
in true multi-threaded mode (OpenMP). It uses the SharedMemView
construct that Christian had prototyped. Unfortunately, this means
that it cannot use the loop-encapsulation mechanism outlined in
recent slides.
A new version of the element-algorithm has now been added
(TestElementAlgorithmWithTemplate) which uses a templated kernel
function for the inner-loop body, meaning that scratch arrays are
automatic arrays whose sizes are fixed at compile time. Thus no resize
operations are needed. This works in multi-threaded mode, and is the
fastest of the three approaches, by a small margin.
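
To illustrate the templated-kernel idea described above, here is a minimal sketch in plain C++. It is not the code from this commit: the names element_kernel and run_elements, the NodesPerElem parameter, and the trivial "scale and sum" loop body are all hypothetical, and std::array stands in for whatever scratch storage the real kernel uses. The point it shows is only that a scratch array sized by a template parameter lives on the stack of each call, so every thread gets its own copy and no resize is needed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical inner-loop body. NodesPerElem is a compile-time constant,
// so the scratch array is an automatic (stack) array: no heap allocation,
// no resize, and each call (hence each thread) has a private copy.
template <std::size_t NodesPerElem>
double element_kernel(const double* nodal_values)
{
  std::array<double, NodesPerElem> scratch{};
  for (std::size_t n = 0; n < NodesPerElem; ++n) {
    scratch[n] = 2.0 * nodal_values[n];  // placeholder per-node work
  }
  double sum = 0.0;
  for (double v : scratch) {
    sum += v;
  }
  return sum;
}

// Hypothetical driver: applies the kernel to each element. Iterations are
// independent, so the loop may run in parallel when OpenMP is enabled;
// without OpenMP the same code runs serially.
template <std::size_t NodesPerElem>
std::vector<double> run_elements(const std::vector<double>& values)
{
  assert(values.size() % NodesPerElem == 0);
  const std::ptrdiff_t num_elems =
      static_cast<std::ptrdiff_t>(values.size() / NodesPerElem);
  std::vector<double> result(static_cast<std::size_t>(num_elems));
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (std::ptrdiff_t e = 0; e < num_elems; ++e) {
    const std::size_t idx = static_cast<std::size_t>(e);
    result[idx] = element_kernel<NodesPerElem>(&values[idx * NodesPerElem]);
  }
  return result;
}
```

Calling run_elements<4> on the eight values 1 through 8 treats them as two four-node elements. By contrast, a single std::vector scratch buffer shared across iterations would be written by several threads at once, which is why the vector-based version is restricted to serial execution.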