Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid loops in s2fft.sampling.reindex functions to reduce compile and run times #245

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

matt-graham
Copy link
Collaborator

@matt-graham matt-graham commented Nov 19, 2024

The index conversion functions in s2fft.sampling.reindex currently use Python loops (over L iterations) to perform the reindexing operations slice by slice. As these loops are unrolled when JIT compiling this leads to long compile times as L gets bigger.

This PR refactors the reindexing functions to instead construct arrays of indices and use these to gather or scatter the relevant entries in single indexing operations, thus avoiding the loops. For the functions converting between HEALPix and 2D harmonic coefficient indexing, we can use the numpy.triu_indices function to construct the relevant index arrays. For the maps between 1D and 2D layouts we can use numpy.where on a boolean array constructed so that the relevant array elements are True. In both cases the explicitly constructed index arrays add some memory overhead - as they depend only L which is static they will be constants at compile time, and in all cases are I believe $\mathcal{O}(L^2)$ in size - however this is no larger than the memory requirements of the harmonic coefficient arrays themselves, and the computational graphs of the resulting compiled functions are much smaller (no longer scaling with L) so its unclear if overall the memory requirements are much larger.

I've done some benchmarking of the new implementations here compared to the current implementations and in all cases tested the compile times are significantly lower and the run times comparable or significantly better:

https://gist.github.com/matt-graham/7a2b5b77f51b5301b8910ced176d8a8a

For example for the flm_2d_to_hp_fast function, the following plot shows the compile times of the current implementation ("Current"), the proposed implementation ("triu_indices based") and two other alternatives I tried out ("Slice based" and "Nested fori_loop") for various L values

flm_2d_to_hp compile time comparison

and the corresponding plot for run times of the compiled functions

flm_2d_to_hp run time comparison

On both compile and run times the proposed "triu_indices based" implementation is significantly quicker than both the current implementation and the other alternatives considered.

EDIT: There was a slight bug in the flm_hp_to_2d_fast reimplementation now hopefully fixed. The run times for the new function are now a little slower but the compile time is still much better than current implementation and importantly not quickly growing with L.

Copy link

codecov bot commented Nov 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.06%. Comparing base (909e6f1) to head (d2c8f54).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #245      +/-   ##
==========================================
- Coverage   96.07%   96.06%   -0.02%     
==========================================
  Files          31       31              
  Lines        3567     3555      -12     
==========================================
- Hits         3427     3415      -12     
  Misses        140      140              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@jasonmcewen jasonmcewen self-requested a review November 26, 2024 09:53
Copy link
Contributor

@jasonmcewen jasonmcewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@CosmoMatt
Copy link
Collaborator

@matt-graham nice, this was a temporary loop that I never got around to removing!

Copy link
Collaborator

@CosmoMatt CosmoMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants