Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flop count perf job back in #205

Merged
merged 1 commit into from
Feb 28, 2024
Merged

Add flop count perf job back in #205

merged 1 commit into from
Feb 28, 2024

Conversation

charleskawczynski
Copy link
Member

@charleskawczynski charleskawczynski commented Feb 27, 2024

This PR adds the flop counting perf job back into CI.

I was a bit too impatient waiting for triscale-innov/GFlops.jl#49 to be fixed. I tried messaging the author on slack, it's been months since the issue was opened, and users still can't update to Julia 1.10. So, I made CountFlops.jl.

I think @dennisYatunin developed his own version, too. We could combine / extend it.

I think this is a very useful tool for simple, pointwise functions, like Thermodynamics, CloudMicrophysics, and SurfaceFluxes.

@charleskawczynski charleskawczynski changed the title Add flop count operf job back in Add flop count perf job back in Feb 27, 2024
Copy link
Member

@nefrathenrici nefrathenrici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! It would be good to add similar tests to microphysics.

Copy link
Member

@dennisYatunin dennisYatunin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall. The tool I'm developing will likely end up looking more similar to AllocCheck than to GFlops (maybe also with a few pieces from Cthulhu), but there might be some room for combining it with CountFlops because its overall scope is related. That being said, it would probably be easier to just merge your PR into GFlops, instead of maintaining a copy/fork. We can try using CountFlops for now, though, since that PR has already been open for a while.

As you've mentioned before, the main downside of the pure Cassette-based implementation in GFlops/CountFlops is that it does not account for compiler optimizations like constant propagation, which means that it can only give an upper bound on the number of times some function gets called. This is why I've been looking into how AllocCheck analyzes the IR with GPUCompiler, since we probably need to account for the compiler's IR transformations to estimate quantities like the register pressure of a kernel. Something akin to CountFlops.@count_ops should be okay for testing the simple functions in Thermodynamics, but it might not be very helpful for the more complex kernels in ClimaAtmos.

If CountFlops.@count_ops can be turned into a test for Float32/Float64 type stability, we could potentially use that test instead of the restrictions on FT throughout this repo. The resulting code would be more generic and readable, and we wouldn't need to sacrifice any type safety. Such a test should work despite the aforementioned optimization issue, since it does not need to depend on the exact number of Float32/Float64 operations.

@charleskawczynski charleskawczynski merged commit d3b90e5 into main Feb 28, 2024
17 of 18 checks passed
@charleskawczynski charleskawczynski deleted the ck/countflops branch June 10, 2024 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants