Add support for MultiBroadcastFusion #1641

Merged
merged 1 commit into from
May 3, 2024

@charleskawczynski commented Mar 11, 2024

This PR adds support for the use of MultiBroadcastFusion.jl, in order to allow users to fuse multiple broadcast expressions into a single kernel launch.

We'll be able to decorate multiple broadcasts with @fused_direct, e.g.:

@fused_direct begin
    @. y1 = x1+x2+x3
    @. y2 = x1*x2*x3
end

This allows the compiler to hoist global memory reads that are shared between the expressions, improving performance.
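
As a rough illustration of what hoisting the reads means, here is a hand-written sketch with plain arrays. It is not the code that MultiBroadcastFusion.jl generates, and the function names `unfused!` and `fused_by_hand!` are made up for this example. The two separate broadcasts each traverse x1, x2 and x3, whereas a fused loop loads each input element once and uses it for both outputs:

```julia
# Two separate broadcasts: each one traverses x1, x2, x3 on its own,
# so every input element is loaded twice in total.
function unfused!(y1, y2, x1, x2, x3)
    @. y1 = x1 + x2 + x3
    @. y2 = x1 * x2 * x3
    return nothing
end

# Hand-fused equivalent (illustrative only): one traversal, with each
# input element loaded once and reused for both outputs.
function fused_by_hand!(y1, y2, x1, x2, x3)
    @inbounds for i in eachindex(y1, y2, x1, x2, x3)
        a, b, c = x1[i], x2[i], x3[i]
        y1[i] = a + b + c
        y2[i] = a * b * c
    end
    return nothing
end

x1, x2, x3 = rand(8), rand(8), rand(8)
y1, y2 = similar(x1), similar(x1)
z1, z2 = similar(x1), similar(x1)
unfused!(y1, y2, x1, x2, x3)
fused_by_hand!(z1, z2, x1, x2, x3)
```

Both versions compute the same values; the difference is only in how many times the inputs are read from memory.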

I'll open an issue on expanding this for a few cases:

  • Support for more memory layouts
  • Support for more types of broadcast styles (right now, only point-wise kernels, with compatible broadcast types)

But I'd like to start with this simple case first, since there should be some low-hanging fruit that we can leverage (and we can also see how things work in production).

Once JuliaRegistries/General#102559 is merged, I'll update the PR to add the correct dependency.

This is a step towards CliMA/ClimaAtmos.jl#2632.

Right now, the following limitations apply:

  • Only similar field types can be fused, and no combinations of non-similar types are allowed in the left or right-hand side of the expression (they should error).
  • Vertical interpolation and finite difference operators are not (yet) supported.
  • Horizontal operators are not (yet) supported.

Some examples summarizing this are below:

The following will error:

@fused_direct begin
  @. center_field = 1
  @. face_field = 1
end

The following will error:

@fused_direct begin
  @. center_sphere = 1
  @. center_column = 1
end

The following will error:

@fused_direct begin
  @. center_field_1 = 1 * ᶜinterp(face_field)
  @. center_field_2 = 2 * ᶜinterp(face_field)
end

The following will work:

@fused_direct begin
  @. center_field_1 = 2
  @. center_field_2 = 1
end

Any pointwise function should work, and it's advantageous to use @fused_direct when there is at least one variable that is shared across multiple broadcast expressions in any way. For example:

converting

    @. ᶜ∇²uₕʲs.:($$j) = C12(ᶜ∇²uʲs.:($$j))
    @. ᶜ∇²uᵥʲs.:($$j) = C3(ᶜ∇²uʲs.:($$j))

to

@fused_direct begin
    @. ᶜ∇²uₕʲs.:($$j) = C12(ᶜ∇²uʲs.:($$j))
    @. ᶜ∇²uᵥʲs.:($$j) = C3(ᶜ∇²uʲs.:($$j))
end

will save one memory read, because ᶜ∇²uʲs.:($$j) appears on the right-hand side of both expressions. In addition, converting

    @. ᶜts = ts_gs(thermo_args..., ᶜspecific, ᶜK, ᶜΦ, Y.c.ρ)
    @. ᶜp = TD.air_pressure(thermo_params, ᶜts)

to

@fused_direct begin
    @. ᶜts = ts_gs(thermo_args..., ᶜspecific, ᶜK, ᶜΦ, Y.c.ρ)
    @. ᶜp = TD.air_pressure(thermo_params, ᶜts)
end

will reuse ᶜts and save one read, since the value computed in the first line can be used directly in the second line.
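
A minimal hand-written sketch of this reuse, again with plain arrays rather than ClimaCore fields. The functions `make_state` and `pressure` are hypothetical stand-ins for ts_gs and TD.air_pressure, and the constants in them are arbitrary placeholders:

```julia
# Hypothetical pointwise functions standing in for ts_gs and TD.air_pressure.
# The constants are arbitrary placeholders, not values from Thermodynamics.jl.
make_state(ρ, e) = (ρ = ρ, T = e / 718)
pressure(ts) = ts.ρ * 287 * ts.T

# Hand-fused loop (illustrative only): the state is computed once per point,
# stored, and then reused from a local variable for the pressure.
function state_and_pressure!(ts_out, p_out, ρ, e)
    @inbounds for i in eachindex(ts_out, p_out, ρ, e)
        ts = make_state(ρ[i], e[i])  # computed once, kept in a local
        ts_out[i] = ts               # written out for later use
        p_out[i] = pressure(ts)      # uses the local, no re-read of ts_out[i]
    end
    return nothing
end

ρ = fill(1.2, 6)
e = fill(215_400.0, 6)
ts_out = Vector{typeof(make_state(1.2, 1.0))}(undef, 6)
p_out = similar(ρ)
state_and_pressure!(ts_out, p_out, ρ, e)
```

Without fusion, the second broadcast would have to load every element of the state array back from memory before computing the pressure.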

@Sbozzolo

Can you please add a short usage guide for developers? (i.e., when should we use @fused?)

@charleskawczynski

I've renamed @fused to @fused_direct, so that the name also specifies how the expressions are collected. This will help distinguish it from other collection methods when we add them.

@dennisYatunin, I think that this is in good enough shape to merge. Can you take a look when you have a chance?

@dennisYatunin left a comment

This is a good start! Looking forward to seeing this expanded to mismatched center/face/surface spaces and to non-pointwise operators.
