Fixes for assigning the execution units (NUMA/CPUs, GPUs) to the subtrees. #13

Merged 66 commits on May 31, 2017

@venovako commented Apr 13, 2017

Reads and writes of some task-shared variables in the LDL^T and Cholesky C++ kernels, namely those that explicitly control the behaviour of tasks and their status, have been made atomic.
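For illustration only, a minimal sketch of what an atomic task-status variable can look like. It assumes `std::atomic` from C++11; the kernels in this PR may well use a different mechanism (for example OpenMP atomic directives). The names `TaskStatus`, `Task`, `try_claim`, `finish` and `is_done` are hypothetical and do not come from the code base.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical task states; not taken from the actual kernels.
enum class TaskStatus : int { Pending = 0, Running = 1, Done = 2 };

// Hypothetical task record: the status is shared between tasks/threads,
// so every read and write of it goes through std::atomic.
struct Task {
    std::atomic<TaskStatus> status{TaskStatus::Pending};
    int result = 0;  // plain data, published by the release store in finish()
};

// Try to move the task from Pending to Running.
// If several threads call this concurrently, exactly one of them succeeds.
bool try_claim(Task& t) {
    TaskStatus expected = TaskStatus::Pending;
    return t.status.compare_exchange_strong(expected, TaskStatus::Running,
                                            std::memory_order_acq_rel);
}

// Store the result, then mark the task Done. The release store makes the
// write to `result` visible to any thread that later observes Done.
void finish(Task& t, int value) {
    t.result = value;
    t.status.store(TaskStatus::Done, std::memory_order_release);
}

// Acquire load: pairs with the release store in finish().
bool is_done(const Task& t) {
    return t.status.load(std::memory_order_acquire) == TaskStatus::Done;
}
```

A second call to `try_claim` on the same task returns `false`, which is the property a non-atomic read-then-write of the status would not guarantee under concurrency.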

Partially FIXED the assignment of execution units (NUMA/CPUs, GPUs) to the subtrees.
This includes a reorganisation of the mechanism for distributing tasks across the hardware topology.

NUMA binding is still based on OpenMP's proc_bind(spread), which may not be the most robust approach.
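As a rough sketch of the idea (not the code from this PR), the snippet below starts one outer OpenMP thread per region with `proc_bind(spread)` and records the place each thread was bound to. The helper names `outer_places` and `assign_subtrees`, and the round-robin mapping of subtrees to execution units, are assumptions made for illustration. Whether the places actually correspond to NUMA regions depends on the runtime and on settings such as `OMP_PLACES`, which is part of why this approach may not be robust.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: start one outer thread per region, spread across the
// available places, and record the place number each thread is bound to.
// An entry stays -1 if OpenMP is disabled, the thread is unbound, or fewer
// threads than requested were started.
std::vector<int> outer_places(int n_regions) {
    std::vector<int> place(n_regions, -1);
#ifdef _OPENMP
    #pragma omp parallel num_threads(n_regions) proc_bind(spread)
    {
        // Each thread writes only its own slot, so no synchronisation is needed.
        const int tid = omp_get_thread_num();
        place[tid] = omp_get_place_num();  // OpenMP 4.5; -1 if not bound
    }
#endif
    return place;
}

// Hypothetical round-robin assignment of subtrees to execution units.
// The PR's actual distribution mechanism is not shown in this description.
std::vector<int> assign_subtrees(int n_subtrees, int n_units) {
    std::vector<int> unit(n_subtrees, 0);
    for (int i = 0; i < n_subtrees; ++i) {
        unit[i] = i % n_units;
    }
    return unit;
}
```

Compiled with OpenMP enabled and run with, for example, `OMP_PLACES=sockets`, the recorded place numbers give a quick check of whether the outer threads really landed on distinct places.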

Small code-style fixes.

Added the volatile qualifier to more CUDA shared memory variables.

FIXME: Pivoting in the indefinite case in CUDA, along with uninitialised memory accesses (consult cuda-memcheck --tool initcheck).

Vedran Novakovic added 30 commits April 13, 2017 11:39
… distributing work to the execution units.
…that proc_bind(spread) will adequately cover the NUMA regions.
… reasonable second-level nested parallelism within a NUMA node.
@flipflapflop merged commit 1f6148b into ralna:master May 31, 2017
@venovako deleted the small_fixes branch June 5, 2017 10:45
@venovako restored the small_fixes branch June 16, 2017 16:21