Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: assuming all state variables have unlimited dimension #614

Closed
hkershaw-brown opened this issue Jan 5, 2024 · 4 comments
Closed

bug: assuming all state variables have unlimited dimension #614

hkershaw-brown opened this issue Jan 5, 2024 · 4 comments
Assignees
Labels
Bug Something isn't working IO IO refactoring notes and issues MARBL Marine Biogeochemistry Library

Comments

@hkershaw-brown
Copy link
Member

hkershaw-brown commented Jan 5, 2024

Making a separate issue for this since Robin has hit this while working on MOM6-MARBLE.
Comment originally from #519

Assumptions in direct_netcdf_mod.f90:

  • If there is an unlimited dimension, every variable has the unlimited dimension.
  • The unlimited dimension is time, and we are interested in the latest time slice in a file.

In the cesm 2.3 clm, there is an unlimited dimension cohorts which is not used by any of the state variables (and, kind of strangely, not used at all)

So this line:

counts(1:num_dims-1) = get_dim_lengths(domain, i) ! the state

num_dims - 1 = 0

Gustavo's compiler is picking up this 1:0 and erroring out.

intel 19.1.1.21 (running on Cheyenne) just merrily continues, but the counts are incorrect. The count is set to 1 rather than the length of the dimension.

For fill_inflation_restart it does not matter if the state is read in incorrectly, since we just fill the output with (mean and sd (e.g. 1.0, 0.6). Side note here is we don't need to be calling read_state in fill_inflation_restart.

For filter, which does need to read the state, I think the read does not happen correctly (count is not the size of the variable) .

On write, only 'time' (lower case) can be the unlimited dimension. (This is another assumption in dart. It is a weakness in how we deal with unlimited dimensions (#359 (comment)))
For dart created files, dart won't add an unlimited dimension unless its name is 'time'. So creating and writing a file with fill inflation restart gets you a file with the correct info.

All the assumptions above are a bit of a hack.

I put a fix on https://github.com/NCAR/DART/tree/fix-unlimited_dim-read which checks the variable for unlimited dimension before adjusting the counts reading and writing. It is a narrow solution (it really just fixes the case where there is an unused unlimited dimension).

I'm not sure if the cohorts dimension will be used in the state, if it is then improving the state read/write to cope with various (and multiple) unlimited dimensions gets bumped up the priority list.

Let me know if this makes sense or not.

Originally posted by @hkershaw-brown in #519 (comment)

@hkershaw-brown hkershaw-brown added IO IO refactoring notes and issues Bug Something isn't working labels Jan 5, 2024
@nancycollins
Copy link
Collaborator

we could add a couple of namelist settings to be used in the state i/o code.

one could list dimension names to be ignored - e.g. if a variable is 4d in netcdf but really only 3d are used, the 4th dim should be ignored.

the other could list dimension names which are unlimited, so when creating a netcdf file and reading/writing it does the right thing.

the default for both could be "time" to be backwards compatible, but overwriteable and extensible for issues like this.

we could add another flag to say what to do with unlimited dims. the default is read the last entry, but we could have an option to stack the unlimited dim as an additional dimension in the state array, or indicate which index in the unlimited dim should be used.

(i'm sure i made a comment about this before, but i can't find it to reference. maybe it was pre-github.)

@nancycollins
Copy link
Collaborator

For fill_inflation_restart it does not matter if the state is read in incorrectly, since we just fill the output with (mean and sd (e.g. 1.0, 0.6). Side note here is we don't need to be calling read_state in fill_inflation_restart.

i think you're right. it does need the model time - could it just call that instead? i see the ensemble_handle is already allocated - will all the member numbers be set up correctly by other routines?

@hkershaw-brown hkershaw-brown self-assigned this Jan 16, 2024
@hkershaw-brown
Copy link
Member Author

Robin's MARBLE MOM6

netcdf marbl_params {
dimensions:
        Layer = 65 ;
        Time = 1 ;
variables:
        int64 Time(Time) ;
        double autotroph_settings\(1\)%kDOP(Layer) ;
        double autotroph_settings\(1\)%kNH4(Layer) ;
}

@hkershaw-brown hkershaw-brown added the MARBL Marine Biogeochemistry Library label Mar 11, 2024
@hkershaw-brown
Copy link
Member Author

fixed in #726

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working IO IO refactoring notes and issues MARBL Marine Biogeochemistry Library
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants