Split the CUDA CKF into different TUs #742

Merged
merged 1 commit into acts-project:main from the refactor/ckf_tus branch
Oct 24, 2024

Conversation

stephenswat
Member

This commit splits the monstrously large CUDA track finding (and some CCL) translation unit up into smaller ones, one for each of the kernels. This should speed up compilation times and decrease memory usage.

Also groups the payloads for each of the functions into convenient structs, so we don't need to pass 20+ arguments for some of the kernel calls.

Does not change the functionality of the code.
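
To illustrate the two ideas in this description, here is a minimal sketch in plain C++ (so it builds without nvcc). It is not the code from this PR: `scale_values_payload`, `scale_values`, and the file names in the comments are hypothetical. The sketch shows a payload struct replacing a long argument list, and a function that is declared once and defined in its own translation unit. In a real CUDA build the body would be a `__global__` kernel plus a thin launch wrapper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical payload struct: groups what would otherwise be separate
// kernel arguments. None of these names come from the actual code base.
struct scale_values_payload {
    const float* in;   // input values (a device pointer in a CUDA build)
    float* out;        // output values
    std::size_t n;     // number of elements to process
    float factor;      // factor applied to every element
};

// --- would live in a small header, e.g. kernels/scale_values.hpp ---
// Callers only see this declaration, never the kernel body, so changing
// the body recompiles one translation unit instead of all of them.
void scale_values(const scale_values_payload& payload);

// --- would live in its own translation unit, e.g. kernels/scale_values.cpp ---
// Assumption: in CUDA this would be a __global__ kernel with a launch wrapper.
void scale_values(const scale_values_payload& payload) {
    assert(payload.in != nullptr && payload.out != nullptr);
    for (std::size_t i = 0; i < payload.n; ++i) {
        payload.out[i] = payload.in[i] * payload.factor;
    }
}
```

A caller fills in the struct once and passes it along, so adding a field later does not change the function signature at every call site.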

@stephenswat added the refactor (Change the structure of the code) and cuda (Changes related to CUDA) labels on Oct 16, 2024
@stephenswat stephenswat force-pushed the refactor/ckf_tus branch 2 times, most recently from bbeb4f7 to 30aebf1 Compare October 16, 2024 13:30
@stephenswat
Member Author

stephenswat commented Oct 16, 2024

My plan to finally get rid of those .ipp files has failed. 🫡

Anyway, updated.

@stephenswat stephenswat force-pushed the refactor/ckf_tus branch 4 times, most recently from 24a9c9e to 4ec0290 Compare October 23, 2024 08:48
Member

@krasznaa krasznaa left a comment

Just to make sure there's no hastiness on this one...

Member

@krasznaa krasznaa left a comment

Okay, let's go with this general setup for now. As we discussed in person, I have some further ideas for tweaking the code later on, but these changes do not make those future changes any more difficult. (The general direction of my idea is very similar.)

@stephenswat
Member Author

Great! I've rebased and incorporated the requested changes.

@stephenswat stephenswat requested a review from krasznaa October 23, 2024 14:16
Member

@krasznaa krasznaa left a comment

Just a little more to go...

Review comment on device/cuda/src/finding/kernels/build_tracks.cuh (outdated, resolved)
This commit splits the monstrously large CUDA track finding translation
unit up into smaller ones, one for each of the kernels. This should
speed up compilation times and decrease memory usage.

Also groups the payloads for each of the functions into convenient
structs, so we don't need to pass 20+ arguments for some of the kernel
calls.

Does not change the functionality of the code.

Member

@krasznaa krasznaa left a comment

Let's get it in finally...

@stephenswat stephenswat merged commit d4d6531 into acts-project:main Oct 24, 2024
23 checks passed
2 participants