miniBUDE Timer Updates #5

pranav-sivaraman · 2024-02-02T01:56:24Z

Double check data movement
Add configurable warmup iterations
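
A minimal sketch of what configurable warmup iterations could look like. The helper name `runTimed` and its `warmup`/`iterations` parameters are hypothetical and not taken from the miniBUDE source; the point is only that warmup runs execute the kernel but are excluded from the recorded samples.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;
using TimePoint = Clock::time_point;

// Hypothetical helper: runs `kernel` `warmup` times without timing it,
// then `iterations` times while recording a (start, end) pair per run.
inline std::vector<std::pair<TimePoint, TimePoint>>
runTimed(const std::function<void()> &kernel, std::size_t warmup, std::size_t iterations) {
  // Warmup runs: executed, but never recorded.
  for (std::size_t i = 0; i < warmup; ++i) kernel();

  std::vector<std::pair<TimePoint, TimePoint>> kernelTimes;
  kernelTimes.reserve(iterations);
  for (std::size_t i = 0; i < iterations; ++i) {
    auto start = Clock::now();
    kernel();
    auto end = Clock::now();
    kernelTimes.emplace_back(start, end);
  }
  return kernelTimes;
}
```

With `warmup = 3` and `iterations = 5`, the kernel would run eight times but only five samples would be reported.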

Add scope guards to Kokkos views so they are destroyed before Kokkos finalize is called.
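
The ordering issue can be sketched without Kokkos itself. The snippet below uses stand-in `initializeRuntime`/`finalizeRuntime` functions and a stand-in `View` type (all hypothetical, not Kokkos API) to show how an inner block forces the view destructors to run before finalization. In real code the corresponding calls would be `Kokkos::initialize`, `Kokkos::finalize`, and `Kokkos::View`.

```cpp
#include <string>
#include <utility>
#include <vector>

// Records lifecycle events so the ordering can be inspected.
inline std::vector<std::string> &eventLog() {
  static std::vector<std::string> log;
  return log;
}

// Stand-ins for the runtime's initialize/finalize calls (not Kokkos API).
inline void initializeRuntime() { eventLog().push_back("initialize"); }
inline void finalizeRuntime() { eventLog().push_back("finalize"); }

// Stand-in for a device view whose destructor must run while the runtime is alive.
struct View {
  std::string name;
  explicit View(std::string n) : name(std::move(n)) { eventLog().push_back("create " + name); }
  ~View() { eventLog().push_back("destroy " + name); }
};

inline void runWithScopeGuard() {
  initializeRuntime();
  { // scope guard: everything declared here dies at the closing brace
    View poses("poses");
    View energies("energies");
    // ... kernels using the views would be launched here ...
  } // views are destroyed here, in reverse order of construction
  finalizeRuntime();
}
```

Without the inner braces, the views would live until the end of the enclosing function and be destroyed only after finalization.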

Spack injects its own compiler wrappers which rpath shared libraries for us and also any user defined flags. Therefore we do not want to set the compiler in this scenario.

jhdavis8

Looks good, two small changes to discuss

src/acc/fasten.hpp

src/omp/fasten.hpp

jhdavis8 · 2024-02-12T18:54:18Z

@joy-kitson would you be able to review this one as well?

joy-kitson

It looks like most of the models time allocation under the umbrella of host-to-device data movement, unless I'm missing something. It might be worth discussing whether this is the intended behavior for our timers; if it is, I think I need to make some changes in Cloverleaf (and, if memory serves, we may need to do so in some of the other apps as well).
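
To make the distinction concrete, here is a rough sketch that times allocation and the copy as two separate windows. It assumes an ordinary host `std::vector` as a stand-in for device memory, and the `UploadTiming` struct and `upload` function are hypothetical names, not part of miniBUDE.

```cpp
#include <algorithm>
#include <chrono>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;
using TimePoint = Clock::time_point;

// Hypothetical record keeping the two phases apart.
struct UploadTiming {
  std::pair<TimePoint, TimePoint> allocation;
  std::pair<TimePoint, TimePoint> hostToDevice;
};

// `device` is a plain host vector here, purely as a stand-in for a device buffer.
inline UploadTiming upload(const std::vector<float> &host, std::vector<float> &device) {
  UploadTiming timing;

  auto allocStart = Clock::now();
  device.resize(host.size()); // allocation (and zero-fill), outside the transfer window
  auto allocEnd = Clock::now();
  timing.allocation = {allocStart, allocEnd};

  auto copyStart = Clock::now();
  std::copy(host.begin(), host.end(), device.begin()); // the data movement itself
  auto copyEnd = Clock::now();
  timing.hostToDevice = {copyStart, copyEnd};

  return timing;
}
```

Whether allocation belongs inside the host-to-device window is the policy question raised above; the sketch only shows that the two can be measured independently.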

joy-kitson · 2024-02-08T16:44:26Z

src/acc/fasten.hpp

+    host(energies[:nposes])
+
+  auto deviceToHostEnd = now();
+  sample.deviceToHost = {deviceToHostStart, deviceToHostEnd};


Is there a reason we're saving the start and end times rather than just the time elapsed?

joy-kitson · 2024-02-08T18:24:54Z

src/bude.h

@@ -85,9 +86,10 @@ struct Sample {
  size_t ppwi, wgsize;
  std::vector<float> energies;
  std::vector<std::pair<TimePoint, TimePoint>> kernelTimes;
-  std::optional<std::pair<TimePoint, TimePoint>> contextTime;
+  std::optional<std::pair<TimePoint, TimePoint>> hostToDevice;
+  std::optional<std::pair<TimePoint, TimePoint>> deviceToHost;


I guess it makes sense to do it this way if that's what's in the existing code, but it still seems rather unorthodox, and it doesn't quite line up with what we do in the other codes.

pranav-sivaraman · 2024-02-12

The final output is just the elapsed time, so I think it shouldn't be a problem to copy the way they do it.

src/cuda/fasten.hpp

pranav-sivaraman added 18 commits January 24, 2024 14:49

Merge pull request #1 from hpcgroup/raja-update

9f26acb

RAJA Update

feat: add hostToDevice and deviceToHost timers

96dc803

feat: add timers to final output

2bafe8c

feat(cuda): use new hostToDevice and deviceToHost timers

a80a6db

feat(kokkos): add option to build with seperate installed kokkos

2162743

feat(kokkos): add kokkos finalize

b92b42a

Add scope guards to Kokkos views so they are destroyed before Kokkos finalize is called.

refactor(kokkos): create kokkos view without initializing

eb07b22

feat(openacc): use new hostToDevice and deviceToHost timers

9c71cca

feat(omp): use hostToDevice/deviceToHost timers

0f23ff9

feat(hip): use hostToDevice/deviceToHost timers

9f7f518

feat(sycl): use hostToDevice/deviceToHost timers

0e57e55

refactor(sycl): do not override CMAKE_CXX_COMPILER

bd7741e

Spack injects its own compiler wrappers which rpath shared libraries for us and also any user defined flags. Therefore we do not want to set the compiler in this scenario.

fix(kokkos): adding option to build with external kokkos

bea0523

feat(kokkos): use Kokkos unmanaged views to avoid extra copy

d5f58d5

feat(raja): add new host to device and device to host timers

d72fdb7

revert(kokkos): remove Kokkos::finalize otherwise we can't run the cartesian product

50a5bf5

Merge remote-tracking branch 'fork/v2' into timer-updates

feaee38

feat(sycl-usm): add new data movement timers

15d6bd0

pranav-sivaraman changed the title ~~miniBUDE Timer and Tuning Updates~~ miniBUDE Timer Updates Feb 7, 2024

feat: add configurable warmup iterations

cb42a71

pranav-sivaraman marked this pull request as ready for review February 7, 2024 22:50

pranav-sivaraman requested review from joy-kitson and jhdavis8 February 7, 2024 22:50

jhdavis8 requested changes Feb 8, 2024

fix: set energies to point to final vector data location

7975137

pranav-sivaraman requested a review from jhdavis8 February 8, 2024 16:54

jhdavis8 approved these changes Feb 8, 2024

joy-kitson reviewed Feb 12, 2024

refactor: add option to allow users to provide their own offload flags

e568be7

pranav-sivaraman added 3 commits February 13, 2024 18:44

refactor(sycl-usm): acpp to adaptivecpp

aa30f47

fix: remove result allocation from host to device timer

91e1617

feat: add csv build option

1c259f3

pranav-sivaraman merged commit 0aca8e0 into v2 Feb 16, 2024

pranav-sivaraman deleted the timer-updates branch March 2, 2024 23:56
