Fix uncoalesced memory reads #1910
Comments
This is an attempt to summarize the information gathered from benchmarks and experimental development branches. Here is a high-level summary of the numbers.
indexing_and_static_ndranges.jl
thermo_bench_bw.jl
Offset benchmark
index_swapping.jl
Discussion / summary
I need to take a look at a new Nsight report, but this may have been fixed by #1969.
Broadcast operations on even our simplest expressions result in uncoalesced memory reads for VIJFH datalayouts, as observed via Nsight Compute. The issues are being analyzed and outlined in the following benchmark scripts:
benchmarks/scripts/index_swapping.jl
benchmarks/scripts/indexing_and_static_ndranges.jl
benchmarks/scripts/thermo_bench_bw.jl
benchmarks/scripts/benchmark_offset.jl
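For readers unfamiliar with the term, here is a minimal CPU-only sketch of what "uncoalesced" means in terms of linear offsets. It assumes the VIJFH parent array is indexed as (V, I, J, F, H) in Julia's column-major order, so V is the fastest-varying dimension. The sizes and the two thread-to-index mappings are hypothetical and chosen only for illustration. They are not taken from the benchmark scripts above, and this is not a diagnosis of the kernels in question.

```julia
# Hypothetical sizes, for illustration only (not from the benchmarks).
Nv, Nij, Nf, Nh = 4, 2, 1, 3

# Assumption: the parent array has shape (V, I, J, F, H), column-major.
dims = (Nv, Nij, Nij, Nf, Nh)
li = LinearIndices(dims)

# Mapping A: consecutive "threads" walk the V dimension.
offsets_v = [li[v, 1, 1, 1, 1] for v in 1:Nv]

# Mapping B: consecutive "threads" walk the I dimension.
offsets_i = [li[1, i, 1, 1, 1] for i in 1:Nij]

# Distance in elements between the addresses touched by neighboring threads.
stride_v = diff(offsets_v)  # unit stride: neighbors read adjacent elements
stride_i = diff(offsets_i)  # stride of Nv: neighbors read Nv elements apart

println("stride along V: ", stride_v)
println("stride along I: ", stride_i)
```

Under these assumptions, only mapping A has neighboring threads reading adjacent memory locations, which is the pattern the hardware can combine into fewer memory transactions. Whether a given broadcast kernel ends up with mapping A or something closer to mapping B depends on how thread indices are converted to (V, I, J, F, H) indices.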
Tasks