This repository has been archived by the owner on Jan 4, 2022. It is now read-only.

ECP 6: Establish and improve strong scaling of linear solves on high aspect ratio meshes #60

Open
spdomin opened this issue Sep 26, 2016 · 2 comments

Comments

@spdomin
Owner

spdomin commented Sep 26, 2016

Activities:

  1. Evaluate current solver technologies for high aspect ratio meshes.
  2. Develop improved coarsening and preconditioning techniques.
  3. Evaluate possible algorithmic discretization approach improvements.
  4. Explore the effect of approximate LHS contributions on linear and nonlinear solver convergence.
@jhux2

jhux2 commented Oct 3, 2016

@spdomin I'd like to start by identifying a set of canonical tests with high aspect ratios and benchmarking their performance. Ideally, the tests wouldn't be too large, so that some analysis might be possible.

@spdomin
Owner Author

spdomin commented Oct 4, 2016

Sounds great. The heatedWaterChannel is one that you used before. I can find more next week.

crtrott pushed a commit to crtrott/Nalu that referenced this issue Feb 6, 2017
…in#60)

The version of the element-algorithm that uses std::vectors for
scratch arrays (TestElemAlgorithmWithVectors) cannot execute correctly
in true multi-threaded mode since the vectors can't feasibly be
allocated on a per-thread basis.
Thus, this version of the element-algorithm is now restricted to
explicitly *not* run multi-threaded.

The version of the element-algorithm that uses Kokkos::views for
scratch arrays (TestElemAlgorithmWithViews) now executes correctly
in true multi-threaded mode (OpenMP). It uses the SharedMemView
construct that Christian had prototyped. Unfortunately this means
that it cannot use the loop-encapsulation mechanism outlined in
recent slides.

A new version of the element-algorithm has now been added
(TestElementAlgorithmWithTemplate) which uses a templated kernel
function for the inner-loop body, meaning that scratch arrays are
automatic arrays sized at compile time. Thus no resize operations
are needed. This works in multi-threaded mode, and is the
fastest of the three approaches, by a small margin.
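
As a rough illustration of the third approach in the commit message above, here is a minimal sketch of a templated kernel whose scratch array has a compile-time size. It is not the Nalu implementation: the names `element_kernel` and `assemble`, the doubling computation, and the use of a plain OpenMP pragma in place of Kokkos are all assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel (not Nalu code). NodesPerElem is a template parameter,
// so the scratch array's size is fixed at compile time.
template <int NodesPerElem>
double element_kernel(const double* nodal_values) {
  // Automatic (stack) array: each thread that calls this function gets its
  // own copy, so there is nothing to resize and nothing shared.
  std::array<double, NodesPerElem> scratch{};
  for (int n = 0; n < NodesPerElem; ++n) {
    scratch[n] = 2.0 * nodal_values[n];  // placeholder per-node work
  }
  double sum = 0.0;
  for (int n = 0; n < NodesPerElem; ++n) {
    sum += scratch[n];
  }
  return sum;
}

// Hypothetical driver: loops over elements and accumulates kernel results.
template <int NodesPerElem>
double assemble(const std::vector<double>& values, int num_elems) {
  assert(static_cast<std::size_t>(num_elems) * NodesPerElem <= values.size());
  double total = 0.0;
  // With OpenMP enabled this loop runs multi-threaded; without it the
  // pragma is ignored and the loop runs serially with the same result here.
  #pragma omp parallel for reduction(+ : total)
  for (int e = 0; e < num_elems; ++e) {
    const std::size_t offset = static_cast<std::size_t>(e) * NodesPerElem;
    total += element_kernel<NodesPerElem>(&values[offset]);
  }
  return total;
}
```

Calling `assemble<8>(values, num_elems)` instantiates the kernel for 8-node elements. The trade-off is that each supported element topology needs its own instantiation, in exchange for scratch storage that is thread-private by construction.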