Add more inbounds and inlining #1189
bors try
Interestingly, this doesn’t seem to have much of an impact. Any differences could easily be run-to-run variation.
The CPU vector dss is the slowest of the spectral kernels, and using IntrospectionTools along with

    using Revise; using ClimaCore
    include(joinpath(pkgdir(ClimaCore), "test", "Operators", "spectralelement", "benchmark_utils.jl"))
    include(joinpath(pkgdir(ClimaCore), "test", "Operators", "spectralelement", "benchmark_kernels.jl"))
    args = setup_kernel_args(["--device", "CPU"]);
    kernel_vector_dss!(args)
    using IntrospectionTools
    @code_summary kernel_vector_dss!(args)

shows that we're currently generating the same native code as if we used
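For readers without the benchmark setup, here is a rough sketch of the same kind of comparison using only the standard library. The two kernels and the `native_text` helper are hypothetical toy code, not ClimaCore functions, and `code_native` from `InteractiveUtils` stands in for `@code_summary`. For a simple loop like this the compiler can often remove the bounds checks by itself, so the two listings may come out essentially the same apart from the function name.

```julia
using InteractiveUtils  # provides code_native

# Hypothetical toy kernel with default bounds checking (not ClimaCore code).
function sum_checked(x::Vector{Float64})
    s = 0.0
    for i in 1:length(x)
        s += x[i]
    end
    return s
end

# Same loop with bounds checks explicitly disabled.
function sum_inbounds(x::Vector{Float64})
    s = 0.0
    @inbounds for i in 1:length(x)
        s += x[i]
    end
    return s
end

# Hypothetical helper: capture the native code of `f` for argument types `types`.
function native_text(f, types)
    io = IOBuffer()
    code_native(io, f, types; debuginfo = :none)
    return String(take!(io))
end

checked = native_text(sum_checked, (Vector{Float64},))
inbounds = native_text(sum_inbounds, (Vector{Float64},))

# Line counts are a crude proxy; diff the two strings for a closer look.
println("checked:  ", count(==('\n'), checked), " lines")
println("inbounds: ", count(==('\n'), inbounds), " lines")
```

If the two listings differ only in labels, adding `@inbounds` at that site is unlikely to change runtime.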
Actually, the allocation job shows that this fixes some allocation sites. So perhaps we should reconsider.
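As a minimal sketch of how an allocation check on a kernel can be done locally, the snippet below measures the bytes allocated by one call after a warm-up call. The `scale`, `axpy!`, and `measure_allocs` names are hypothetical illustrations, not part of ClimaCore or its allocation job, and the number reported may vary with the Julia version.

```julia
# Hypothetical toy helper; @inline asks the compiler to inline it at call sites.
@inline scale(a, x) = a * x

# Hypothetical toy kernel: y .+= a .* x with bounds checks disabled.
function axpy!(y, a, x)
    @inbounds for i in eachindex(y, x)
        y[i] += scale(a, x[i])
    end
    return y
end

# Measure inside a function so global-scope dispatch is not counted.
measure_allocs(y, a, x) = @allocated axpy!(y, a, x)

x = ones(1000)
y = zeros(1000)
measure_allocs(y, 2.0, x)            # warm-up call, includes compilation
allocs = measure_allocs(y, 2.0, x)   # bytes allocated by a compiled call
println("bytes allocated: ", allocs)
```

Running the same measurement before and after a change to `@inbounds` or `@inline` annotations shows whether an allocation site was removed.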
b2610fb to 07c3a3c
Superseded by #1338
This PR applies more `@inbounds` and inlining to