
feat req: Memory scalable static data in DART - both static across the ensemble and per-ensemble member static data #744

Open · hkershaw-brown opened this issue Sep 30, 2024 · 4 comments

hkershaw-brown commented Sep 30, 2024

There are several model_mods and core DART modules that have a fixed-size memory requirement on each processor. The total memory usage is static_mem * num_procs (it does not scale down as you add processors), and this is a hard limit on the model size in DART.

Goal:

memory usage per core = static_mem / num_procs
total memory usage = static_mem

Rather than the current:

memory usage per core = static_mem 
total memory usage = static_mem * num_procs

Note that the code may need to be sensible about which static data is tiny (fine to keep on every core) versus large (worth distributing).
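
To make the scaling concrete, here is a back-of-the-envelope sketch in Python. The grid size and task count are made up for illustration and do not come from any particular DART experiment:

```python
# Illustrative arithmetic only: the grid dimensions and process count are
# hypothetical, not taken from a real DART case.
bytes_per_value = 8                      # r8 reals
nx, ny, nz = 500, 500, 50                # assumed 3D grid (a PHB-like field)
static_mem = nx * ny * nz * bytes_per_value

num_procs = 1024

replicated_per_core = static_mem                  # current: full copy on every task
replicated_total = static_mem * num_procs

distributed_per_core = static_mem / num_procs     # goal: one share per task
distributed_total = static_mem                    # total independent of task count

print(f"replicated:  {replicated_per_core / 2**20:8.1f} MiB/core, "
      f"{replicated_total / 2**30:8.1f} GiB total")
print(f"distributed: {distributed_per_core / 2**20:8.1f} MiB/core, "
      f"{distributed_total / 2**30:8.1f} GiB total")
```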

Static data in DART:

  • static data, same across the ensemble:

    • WRF PHB (3D, variable-sized static data). A WRF model_mod version with distributed PHB was written in 2014-2016 but never released.
    • Mesh structures (e.g. MPAS)
    • quad_interp utilities data structures (particularly for the MOM6 CESM3 workhorse 2/3-degree grid)
    • POP interpolation data structures
    • get_close data structures
  • Per-ensemble-member static data:
    This is currently put into the state, so it is inflated (it arguably should not be). An example (I think) is the CLM fields that are 'no-update'; see bug: inflation files when using 'no copy back' variables #276

In addition (to be tracked as a separate issue), there are the observation sequence files, which are held on every core (a particular problem for external forward operators, whose values are stored in the obs sequence).
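
For the "same across the ensemble" items above, the bookkeeping needed to spread a static field over tasks is mostly index arithmetic, e.g. a round-robin layout. A minimal sketch in Python with hypothetical function names (DART's actual distributed code is Fortran and is not reproduced here):

```python
# A minimal sketch of the "counting" needed to distribute a static field
# across MPI tasks with a round-robin layout. Function names are
# hypothetical and do not correspond to DART's modules.

def num_elements_on_task(nelems: int, task: int, ntasks: int) -> int:
    """How many elements of a field of size nelems land on this task."""
    return nelems // ntasks + (1 if task < nelems % ntasks else 0)

def owner_of_element(global_index: int, ntasks: int) -> int:
    """Which task stores a given global element index (0-based)."""
    return global_index % ntasks

def local_index(global_index: int, ntasks: int) -> int:
    """Position of a global element within its owner's local array."""
    return global_index // ntasks

# Example: a field with 1,000,003 elements spread over 256 tasks.
nelems, ntasks = 1_000_003, 256
assert sum(num_elements_on_task(nelems, t, ntasks) for t in range(ntasks)) == nelems
print(owner_of_element(42, ntasks), local_index(42, ntasks))   # task 42, local slot 0
```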

hkershaw-brown self-assigned this Sep 30, 2024
hkershaw-brown (Member Author) commented

WRF PHB is read from a wrfinput template file, but is PHB in every WRF file?
If so, it is "per-ensemble-member static data" that is equal for every ensemble member.


kdraeder commented Nov 14, 2024

Here's a question that might influence our choices:
is it reasonably easy to store some kinds of data distributed across a single node,
that is, across the tasks we request from each node?
This would cut down on memory usage without increasing internode communication.

Here's a framework for thinking about names for the kinds of data filter needs to store,
and some possibilities to consider.
Short and common usually wins over longer and more meaningful
(except when trying to sound impressive: "intercomparison", "irregardless", ...).
I tried to think of short and meaningful descriptions.
Combinations of two simple words can be useful.

   First dimension: time varying?
      no = metadata about grids, including surfaces and boundaries.
         "static" in my/most vocabularies
      yes = "evolving", "time varying" (pairs with member varying, below)
         due directly to assimilation:
            "assimilated", "updated"
         due indirectly to assimilation, through the model forecast (some of this is currently called "no-copy-back"):
            "not updated", "carried", "passive", "baggage"

   Second dimension: varying within an ensemble?
      not varying among members (Helen has called this "static"; it could be made specific with "ensemble static"):
         "no spread", "ensemble constant", "member independent"
      varying between members:
         "updated"? (implies time varying too)
         "member varying", "member dependent"

   Third dimension: size. This mostly determines the importance of distributing the data.
      1D, 2D, 3D

I prefer leaving "prognostic" and "diagnostic" for classifying variables in models.
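
One way to make the three dimensions above concrete is a small classification record. A sketch in Python; every name here is a placeholder for discussion, not settled DART terminology:

```python
# A sketch of the three classification dimensions above; all names are
# placeholders for discussion, not settled DART terminology.
from dataclasses import dataclass
from enum import Enum, auto

class TimeVarying(Enum):
    STATIC = auto()        # grid metadata, surfaces, boundaries
    UPDATED = auto()       # changed directly by the assimilation
    CARRIED = auto()       # evolved only by the model forecast ("no-copy-back")

class MemberVarying(Enum):
    ENSEMBLE_CONSTANT = auto()   # identical for every member
    MEMBER_DEPENDENT = auto()    # differs between members

@dataclass
class DataKind:
    name: str
    time: TimeVarying
    member: MemberVarying
    rank: int                    # 1D, 2D, 3D: how much distributing it matters

phb = DataKind("WRF PHB", TimeVarying.STATIC, MemberVarying.ENSEMBLE_CONSTANT, rank=3)
```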

hkershaw-brown (Member Author) commented

> is it reasonably easy to store some kinds of data distributed across a single node,
> that is, across the tasks we request from each node?
> This would cut down on memory usage without increasing internode communication.

Yes, for sure it is "easy": it is just counting things.
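
One concrete mechanism for the node-level storage in the quoted question is an MPI shared-memory window, so that each node holds a single copy of a static field no matter how many tasks run on it. A minimal mpi4py sketch, with an assumed field size; this is not how DART currently stores such data:

```python
# Sketch: keep one copy of a static field per node using an MPI
# shared-memory window (mpi4py). The field size is hypothetical; error
# handling and the actual read of the data are omitted.
import numpy as np
from mpi4py import MPI

world = MPI.COMM_WORLD
# Group the tasks that share a node so they can share one allocation.
node = world.Split_type(MPI.COMM_TYPE_SHARED)

nelems = 500 * 500 * 50             # assumed size of a PHB-like 3D field
itemsize = MPI.DOUBLE.Get_size()

# Only node rank 0 allocates the memory; the other tasks attach to it.
nbytes = nelems * itemsize if node.rank == 0 else 0
win = MPI.Win.Allocate_shared(nbytes, itemsize, comm=node)
buf, _ = win.Shared_query(0)
field = np.ndarray(buffer=buf, dtype="d", shape=(nelems,))

if node.rank == 0:
    field[:] = 0.0                  # in practice: read the static data here
node.Barrier()                      # every task on the node now sees one copy
```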


braczka commented Nov 15, 2024

> WRF PHB is read from a wrfinput template file, but is PHB in every WRF file? If so, it is "per-ensemble-member static data" that is equal for every ensemble member.

As far as I know, PHB (base-state geopotential) is in every wrfinput file. It is static both across ensemble members and in time. It needs to be summed with PH (perturbation geopotential) to give the actual geopotential.
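
For reference, the relationship described above can be computed directly from a WRF file. A small sketch with netCDF4; the file name is a placeholder:

```python
# Sketch: the full geopotential in WRF is the sum of the base-state and
# perturbation parts. The file path is a placeholder.
from netCDF4 import Dataset

with Dataset("wrfinput_d01") as nc:
    phb = nc.variables["PHB"][0]   # base-state geopotential (m2 s-2), time 0
    ph = nc.variables["PH"][0]     # perturbation geopotential (m2 s-2), time 0
    geopotential = phb + ph        # full geopotential on staggered levels
    height = geopotential / 9.81   # approximate geopotential height in metres
```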
