Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A GPU compatible version for non-orographic gravity wave parameterization #3267

Merged
merged 1 commit into from
Sep 14, 2024

Conversation

Xiaoyang-Xie
Copy link
Contributor

Purpose

Let Nogw parameterization work on GPU

To-do

Speed it up (Now this version now runs 50% slower than the previous version of the code.)

Content

Replace the for-end loop structure with unrolled functions


  • I have read and checked the items on the review checklist.

@Sbozzolo
Copy link
Member

This code is really complex, has little to no comments that explain what is happening, uses advanced features and introduces a dependency on UnrolledUtilities, a package that is not documented in such a way that people can understand what it does and when/how to use it without in-person guidance. I am worried that almost no one will be able to extend and maintain this piece of code.

@dennisYatunin can you help reducing complexity and improving clarity?

If you really want to use UnrolledUtilities, can you first add documentation so that people can understand what the package does and how to use it? If you want to use UnrolledUtilities in ClimaAtmos, the target demographics for your documentation should be graduate students / research scientists.

Comment on lines 1 to 4
dt_save_state_to_disk: "400secs"
initial_condition: "IsothermalProfile"
t_end: "1500secs"
hyperdiff: false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dt_save_state_to_disk: "400secs"
initial_condition: "IsothermalProfile"
t_end: "1500secs"
hyperdiff: false
t_end: "1500secs"

@Xiaoyang-Xie Xiaoyang-Xie force-pushed the xxy/gw_GPUversion branch 3 times, most recently from 6987d2a to 8f9fd3c Compare September 11, 2024 17:18
Comment on lines 10 to 14
Operators.placeholder_space(current_space, parent_space) =
current_space isa Spaces.AbstractSpace &&
parent_space isa Spaces.AbstractSpace &&
Spaces.issubspace(current_space, parent_space) ?
Operators.PlaceholderSpace() : current_space
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overrides a method in ClimaCore. If this is needed, it needs to be put into ClimaCore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is needed. Otherwise the code could not run on GPU. Dennis tells me he will open a pull request in ClimaCore soon.

function calc_intermitency(ρ_source_level, source_ampl, nk, Bsum)
return (source_ampl / ρ_source_level / nk) / Bsum
end

function gw_average!(wave_forcing, wave_forcing_m1)
L1 = Operators.LeftBiasedC2F(; bottom = Operators.SetValue(0.0))
L1 = Operators.LeftBiasedC2F(; bottom = Operators.SetValue(0.0f0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not hard-code Float32 here? (and below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I will change it.

@. uforcing = uforcing + u_waveforcing
@. vforcing = vforcing + v_waveforcing

end
return nothing
end

# calculate the intermittency factor eps -> assuming constant Δc.
# Using column_accumulate function, calculate the gravity wave forcing at each point.
function waveforcing_column_accumulate!(
Copy link
Member

@charleskawczynski charleskawczynski Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not keen on nested anonymous functions like this. Can we somehow break the contents of this giant function into smaller function calls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored part of the wave source calculation into smaller function calls. Does this approach work for you? Breaking down the unrolled_reduce function, which is a big part of the whole function, isn’t easy because it directly uses many externally passed parameters. If we avoid using anonymous functions, we’ll have to keep passing those parameters in the reduce operation, which might make the function less elegant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wave_source looks great, thanks for updating that. It would be nice if it's possible to do something similar with the block inside the unrolled_reduce(Val(nc), (mask, FT1(0.0))) do (mask, fm), (n) block, but I understand if it's not possible.

@Xiaoyang-Xie Xiaoyang-Xie added this pull request to the merge queue Sep 14, 2024
Merged via the queue into main with commit b1c50ea Sep 14, 2024
13 of 16 checks passed
@Xiaoyang-Xie Xiaoyang-Xie deleted the xxy/gw_GPUversion branch September 14, 2024 02:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants