Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuse inner loop kernels in device CKF #695

Merged
merged 1 commit into from
Sep 21, 2024

Conversation

stephenswat
Copy link
Member

@stephenswat stephenswat commented Sep 3, 2024

The inner loop of the device CKF consists of five loops: material interaction application, measurement counting, candidate finding, hole writing, and propagation. I believe that the middle three can be easily merged into a single kernel, reducing the amount of work we have to do on the host and simplifying the code a lot. This commit makes that change.

@stephenswat stephenswat added refactor Change the structure of the code shared Changes related to shared code labels Sep 3, 2024
Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not understanding most of the changes in the logic, I like the general direction quite a bit. As long as the code still works after this in the same way as it did before, I'm very much on board. 😉

Copy link
Contributor

@beomki-yeo beomki-yeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reducing the number of kernels is definitely good. I think the CPU finding algorithm will also need to remove orig_param_id and skip_counter to keep the symmetry but this can be done later.

Have you checked the performance of this?

@stephenswat stephenswat force-pushed the refactor/fuse_ckf branch 3 times, most recently from 997ee7c to d60e5d7 Compare September 20, 2024 13:00
@stephenswat
Copy link
Member Author

PR updated.

Have you checked the performance of this?

It's essentially identical:

Before:

Reconstructed track parameters: 3061231
Time totals:
                  File reading  80234 ms
            Warm-up processing  984 ms
              Event processing  8227 ms
Throughput:
            Warm-up processing  98.4611 ms/event, 10.1563 events/s
              Event processing  82.2799 ms/event, 12.1536 events/s

After:

Reconstructed track parameters: 3063909
Time totals:
                  File reading  79725 ms
            Warm-up processing  27888 ms
              Event processing  8229 ms
Throughput:
            Warm-up processing  2788.83 ms/event, 0.358573 events/s
              Event processing  82.2956 ms/event, 12.1513 events/s

The inner loop of the device CKF consists of five loops: material
interaction application, measurement counting, candidate finding, hole
writing, and propagation. I believe that the middle three can be easily
merged into a single kernel, reducing the amount of work we have to do
on the host and simplifying thd code a lot. This commit makes that
change.
@stephenswat stephenswat merged commit 43358fc into acts-project:main Sep 21, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Change the structure of the code shared Changes related to shared code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants