Use ClimaDiagnostics #2866

Merged
merged 1 commit into from
Apr 22, 2024

Member

@Sbozzolo Sbozzolo commented Apr 2, 2024

This PR moves the logic underlying the diagnostic module to a separate package, ClimaDiagnostics.jl. In the process, everything was documented and tests were added.

The reason for this change is twofold:

  • allow other packages (mainly ClimaLand and ClimaCoupler) to use this infrastructure,
  • allow for faster development by moving the module to a place where we can iterate faster

Diagnostics in ClimaDiagnostics are more flexible: they are no longer tied to a specific number of iterations and can be triggered by arbitrary conditions (e.g., output a variable if it is NaN). In the current form, they also allocate 15 times as much.
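
To make "triggered by arbitrary conditions" concrete, here is a minimal sketch in plain Julia. It assumes a schedule is just a function of the integrator that returns true or false; `SketchIntegrator`, `every_n_steps`, `any_nan`, and `run!` are hypothetical names for illustration and are not the ClimaDiagnostics API.

```julia
# Hypothetical minimal integrator: a step counter and a state vector.
mutable struct SketchIntegrator
    step::Int
    u::Vector{Float64}
end

# A schedule is assumed to be any callable: integrator -> Bool.
every_n_steps(n) = integrator -> integrator.step % n == 0
any_nan = integrator -> any(isnan, integrator.u)

# Advance nsteps and record the steps at which the schedule fires.
# `poison_at` (hypothetical) injects a NaN at that step to exercise `any_nan`.
function run!(integrator, schedule, nsteps; poison_at = nothing)
    triggered = Int[]
    for _ in 1:nsteps
        integrator.step += 1
        integrator.step == poison_at && (integrator.u[1] = NaN)
        schedule(integrator) && push!(triggered, integrator.step)
    end
    return triggered
end

periodic = run!(SketchIntegrator(0, [1.0]), every_n_steps(3), 7)
on_nan = run!(SketchIntegrator(0, [1.0]), any_nan, 5; poison_at = 4)
```

With this shape, an iteration-count schedule and a state-dependent one are interchangeable, which is the flexibility the description refers to.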

Closes: #2623, #2598, #2837
Closes #2910

@Sbozzolo Sbozzolo force-pushed the gb/use_climadiagnostics branch 14 times, most recently from 512a258 to 9e48c9b Compare April 8, 2024 21:32
@Sbozzolo Sbozzolo force-pushed the gb/use_climadiagnostics branch 2 times, most recently from 5ecaefc to 0ca5847 Compare April 10, 2024 01:12
Member

@charleskawczynski charleskawczynski left a comment

Before continuing with this PR, I insist that we fuse our diagnostic callbacks, so that we ensure that there is a path to something performant, before shipping this code off to ClimaDiagnostics. I raised this issue when this code was first added, and moving it to a separate repository will only make this problem more difficult to fix.

@Sbozzolo
Member Author

One of the reasons to move to a separate repo is to allow us to update it more easily. As you can see from the diff, everything is now encapsulated in ClimaDiagnostics and we can change how things are done without touching ClimaAtmos. ClimaDiagnostics already has a full integration test that runs in 20 seconds, greatly speeding up the pace of development. If we need optimization work, it will be faster to do it there.

Also, ClimaLand and ClimaCoupler are going to use ClimaDiagnostics soon. They will probably have slightly different needs that will require some additional work in ClimaDiagnostics. I think that it is more important to bring the code to the end user sooner and figure out what needs to be done to support their needs instead of adding additional complexity to improve performance.

@charleskawczynski
Member

> Also, ClimaLand and ClimaCoupler are going to use ClimaDiagnostics soon.

If you'd prefer to fuse the callbacks in ClimaDiagnostics first, that's fine, but do that first. I don't want multiple repositories depending on a design pattern that does not scale. Fusing the callbacks changes the interface and the way it's called, so it's a breaking change.

@Sbozzolo
Member Author

> > Also, ClimaLand and ClimaCoupler are going to use ClimaDiagnostics soon.
>
> If you'd prefer to fuse the callbacks in ClimaDiagnostics first, that's fine, but do that first.

Sounds good. Can you define what you mean with fusing callbacks exactly?

> Fusing the callbacks changes the interface and the way it's called, so it's a breaking change.

With ClimaDiagnostics, the interface is one line, integrator = ClimaDiagnostics.IntegratorWithDiagnostics(integrator, scheduled_diagnostics), plus the definition of what a ScheduledDiagnostic is. It might need to change, but one of the design goals of ClimaDiagnostics is to handle as much as possible internally, so I hope it won't.

@charleskawczynski
Member

charleskawczynski commented Apr 10, 2024

> Sounds good. Can you define what you mean with fusing callbacks exactly?

Yes, all broadcast expressions need to be in the same function scope, and we can't have function calls between them. So, for example, one of these two forms would suffice:

    # Destructure each (out, args) pair; `do out, args` would expect two collections.
    foreach(input_output_pairs) do (out, args)
        out .= compute_diagnostics.(args, more_args...)
    end

    foreach(keys(diagnostics)) do key
        out[key] .= compute_diagnostics.(diagnostics[key], args...)
    end

Basically, we need to be able to use the @fused feature.
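
As a rough illustration of what fusing buys, the sketch below contrasts separate broadcasts with a single hand-written loop that writes every output in one pass. It is plain Julia and does not use or reproduce the `@fused` feature mentioned above; `kinetic`, `speed`, and the two `compute_*!` functions are hypothetical stand-ins for real diagnostics.

```julia
# Hypothetical diagnostics computed from the same two input fields.
kinetic(u, v) = (u^2 + v^2) / 2
speed(u, v) = sqrt(u^2 + v^2)

# Unfused: each diagnostic is its own broadcast, so the inputs are
# traversed once per diagnostic.
function compute_unfused!(ke, sp, u, v)
    ke .= kinetic.(u, v)
    sp .= speed.(u, v)
    return nothing
end

# Fused by hand: a single loop reads each input element once and
# writes all outputs for that element.
function compute_fused!(ke, sp, u, v)
    @inbounds for i in eachindex(ke, sp, u, v)
        ke[i] = kinetic(u[i], v[i])
        sp[i] = speed(u[i], v[i])
    end
    return nothing
end

u = [3.0, 0.0, 1.0]
v = [4.0, 2.0, 1.0]
ke1, sp1 = similar(u), similar(u)
ke2, sp2 = similar(u), similar(u)
compute_unfused!(ke1, sp1, u, v)
compute_fused!(ke2, sp2, u, v)
```

Both versions produce the same values; the difference is the number of passes over the data, which is why the broadcast expressions need to be visible together in one function scope.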

> With ClimaDiagnostics, the interface is one line, integrator = ClimaDiagnostics.IntegratorWithDiagnostics(integrator, scheduled_diagnostics), plus the definition of what a ScheduledDiagnostic is. It might need to change, but one of the design goals of ClimaDiagnostics is to handle as much as possible internally, so I hope it won't.

Why would diagnostics have anything to do with the integrator? Re-creating the integrator violates the DRY principle-- we should probably make the integrator once, and I would prefer that happens in ClimaAtmos.

I think ClimaDiagnostics should basically provide an interface for specifying a diagnostics callback. Using a ScheduledDiagnostic type seems fine.

@Sbozzolo
Member Author

> > Sounds good. Can you define what you mean with fusing callbacks exactly?
>
> Yes, all broadcast expressions need to be in the same function scope, and we can't have function calls between them. So, for example, one of these two forms would suffice:
>
>     foreach(input_output_pairs) do (out, args)
>         out .= compute_diagnostics.(args, more_args...)
>     end
>
>     foreach(keys(diagnostics)) do key
>         out[key] .= compute_diagnostics.(diagnostics[key], args...)
>     end
>
> Basically, we need to be able to use the @fused feature.

Let's chat offline on how to accomplish this.

> > With ClimaDiagnostics, the interface is one line, integrator = ClimaDiagnostics.IntegratorWithDiagnostics(integrator, scheduled_diagnostics), plus the definition of what a ScheduledDiagnostic is. It might need to change, but one of the design goals of ClimaDiagnostics is to handle as much as possible internally, so I hope it won't.
>
> Who reviewed this interface? Why would diagnostics have anything to do with the integrator? Re-creating the integrator violates the DRY principle-- we should make the integrator once.
>
> I think ClimaDiagnostics should basically provide an interface for specifying a diagnostics callback. Using a ScheduledDiagnostic type seems fine.

ClimaDiagnostics provides that option too: you can get a DiagnosticsCallback to use in your integrator. In that case, you are responsible for ensuring that everything else is initialized before you construct the callback. If you tie the diagnostics to the integrator, you can ensure that the diagnostics are called as the last callback and that everything is properly initialized (e.g., you want to compute the diagnostics after the radiation callback is called). Initialization means preparing the storage spaces for the diagnostics and the accumulators/counters. Initialization is also a way to catch malformed user-provided diagnostics and error out before the simulation starts. Initialization could be delayed: prepare the storage and allocators the first time you call the diagnostic. I didn't go that route because it introduces a bunch of type gymnastics or type instability (e.g., non-concrete structs).
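
The trade-off described here can be sketched in a few lines of plain Julia. The example assumes that initialization evaluates each diagnostic once up front, both to size the storage with concrete types and to surface malformed diagnostics before time stepping. `SketchDiagnostic`, `SketchStorage`, `initialize`, and `accumulate!` are hypothetical names, not the ClimaDiagnostics implementation.

```julia
# Hypothetical diagnostic: a name and a function state -> array.
struct SketchDiagnostic{F}
    name::String
    compute::F
end

# Concretely typed storage, allocated once before the simulation starts.
struct SketchStorage{A}
    values::A                       # latest computed field
    accumulator::A                  # running sum for time reductions
    counter::Base.RefValue{Int}     # number of accumulated samples
end

function initialize(diag::SketchDiagnostic, state)
    # Evaluate once up front: a malformed diagnostic throws here,
    # before any time stepping happens.
    first_value = try
        diag.compute(state)
    catch err
        error("Diagnostic $(diag.name) failed during initialization: $err")
    end
    first_value isa AbstractArray ||
        error("Diagnostic $(diag.name) must return an array")
    return SketchStorage(copy(first_value), zero(first_value), Ref(0))
end

function accumulate!(storage::SketchStorage, diag::SketchDiagnostic, state)
    storage.values .= diag.compute(state)
    storage.accumulator .+= storage.values
    storage.counter[] += 1
    return storage
end

state = (; T = [280.0, 290.0])
good = SketchDiagnostic("ta", s -> s.T)
storage = initialize(good, state)
accumulate!(storage, good, state)
accumulate!(storage, good, state)

# A diagnostic that refers to a field the state does not have.
bad = SketchDiagnostic("oops", s -> s.missing_field)
```

Because the storage type is known after `initialize`, the struct holding it stays concrete. Delaying allocation to the first call would require a placeholder field type, which is the type instability mentioned above.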

@charleskawczynski
Member

> Let's chat offline on how to accomplish this.

Sounds good.

> ClimaDiagnostics provides that option too: you can get a DiagnosticsCallback to use in your integrator. In that case, you are responsible for ensuring that everything else is initialized before you construct the callback.

Sounds good, I think ensuring that everything else is initialized before constructing the callback is not a problem.

> If you tie the diagnostics to the integrator, you can ensure that the diagnostics are called as the last callback and that everything is properly initialized (e.g., you want to compute the diagnostics after the radiation callback is called).

I don't think that that logic should be in ClimaDiagnostics.

> Initialization means preparing the storage spaces for the diagnostics and the accumulators/counters. Initialization is also a way to catch malformed user-provided diagnostics and error out before the simulation starts.

That's fine with me.

> Initialization could be delayed: prepare the storage and allocators the first time you call the diagnostic. I didn't go that route because it introduces a bunch of type gymnastics or type instability (e.g., non-concrete structs).

I personally like the idea of having some sort of initialization function, or at least some way to compile the diagnostics.

@Sbozzolo Sbozzolo force-pushed the gb/use_climadiagnostics branch 6 times, most recently from 73f64e8 to 8c9d4b6 Compare April 19, 2024 01:20
@Sbozzolo Sbozzolo force-pushed the gb/use_climadiagnostics branch 5 times, most recently from 7c58647 to 3f5daf6 Compare April 20, 2024 00:16
@Sbozzolo Sbozzolo marked this pull request as ready for review April 20, 2024 00:42
                return x
            end
        end
        error("Callback not found in $(affect!)")
        return nothing
Member

Should we remove this error? It might otherwise lead to confusing errors?

Member Author

Good question.

The function forced every callback to be either an AtmosCallback or a specific type of DiffEqCallbacks callback (one with the SavedValues property); the function atmos_callbacks then filtered the result, dropping the DiffEqCallbacks callbacks. I flipped the conditional around so that atmos_callbacks filters for AtmosCallbacks (which is really the intent here: we want AtmosCallbacks so that downstream functions can compute lcm and cycle). Therefore, there is no longer a reason to reject callbacks that are neither AtmosCallbacks nor DiffEqCallbacks.

The diagnostic callback is a SciMLBase.DiscreteCallback; hence, it was hitting the error branch here. Note that there is no reason for the diagnostic callback to be an AtmosCallback, as it doesn't have a strong notion of periodicity (e.g., calendar months have different numbers of timesteps).

    callback_from_affect(x::AtmosCallback) = x
    function callback_from_affect(affect!)
        for p in propertynames(affect!)
            x = getproperty(affect!, p)
            if x isa AtmosCallback
                return x
            elseif x isa DECB.SavedValues
                return x
            end
        end
        error("Callback not found in $(affect!)")
    end
    function atmos_callbacks(cbs::SciMLBase.CallbackSet)
        all_cbs = [cbs.continuous_callbacks..., cbs.discrete_callbacks...]
        callback_objs = map(cb -> callback_from_affect(cb.affect!), all_cbs)
        filter!(x -> !(x isa DECB.SavedValues), callback_objs)
        return callback_objs
    end
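
For comparison, the flipped conditional described in this reply could look roughly like the sketch below. It is a self-contained approximation and not the merged code. The types are mocks defined here (including `AtmosCallback`, whose field `n` is an assumption), and `atmos_callbacks` takes a plain list of affects instead of a SciMLBase.CallbackSet so that the example runs without dependencies.

```julia
# Mock types standing in for the real ones; definitions are assumptions.
struct AtmosCallback
    n::Int                  # hypothetical: call every n steps
end
struct OtherAffect          # e.g. the affect! of a plain discrete callback
    label::String
end
struct WrappedAffect        # an affect! that stores an AtmosCallback
    inner::AtmosCallback
end

callback_from_affect(x::AtmosCallback) = x
function callback_from_affect(affect!)
    for p in propertynames(affect!)
        x = getproperty(affect!, p)
        x isa AtmosCallback && return x
    end
    # No AtmosCallback inside: return nothing instead of raising an error.
    return nothing
end

# Keep only AtmosCallbacks; everything else is dropped.
function atmos_callbacks(affects)
    callback_objs = Any[callback_from_affect(a) for a in affects]
    filter!(x -> x isa AtmosCallback, callback_objs)
    return callback_objs
end

affects = [
    AtmosCallback(2),
    WrappedAffect(AtmosCallback(3)),
    OtherAffect("diagnostics"),
]
found = atmos_callbacks(affects)
periods = [cb.n for cb in found]
common_period = reduce(lcm, periods)
```

Filtering for the wanted type, instead of filtering out one unwanted type, lets callbacks of any other kind pass through without needing an error branch.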

Member

@charleskawczynski charleskawczynski left a comment

This looks great, thank you!

@Sbozzolo Sbozzolo added this pull request to the merge queue Apr 22, 2024
Merged via the queue into main with commit c360b64 Apr 22, 2024
9 of 11 checks passed
@Sbozzolo Sbozzolo deleted the gb/use_climadiagnostics branch April 22, 2024 18:53
Development

Successfully merging this pull request may close these issues.

Move diagnostics to ClimaDiagnostics
2 participants