Feature/coarsen on pressure #87

oliverwm1 · 2019-12-18T01:00:54Z

Introduces functions to coarsen restart files on constant pressure surfaces, as well as compute layer thicknesses assuming hydrostatic balance and a consistent surface height.
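For readers unfamiliar with the thickness calculation, here is a minimal NumPy sketch of a hydrostatic layer thickness, dz = (R_d T_v / g) * dlog(p). It is an illustration only, not the code in thermo.py: the function names, the constant values, the top-to-surface ordering of the interface pressures, and the positive sign convention are all assumptions.

```python
import numpy as np

# Assumed constants for illustration; the model's own values may differ.
RDGAS = 287.05  # gas constant of dry air, J kg^-1 K^-1
RVGAS = 461.5  # gas constant of water vapor, J kg^-1 K^-1
GRAVITY = 9.80665  # m s^-2


def virtual_temperature(temperature, specific_humidity):
    """Virtual temperature T_v = T * (1 + (Rv/Rd - 1) * q)."""
    return temperature * (1.0 + (RVGAS / RDGAS - 1.0) * specific_humidity)


def hydrostatic_thickness(p_interface, temperature, specific_humidity):
    """Layer thickness (m, positive) from hydrostatic balance.

    p_interface has nz + 1 values ordered from model top to surface (assumed);
    temperature and specific_humidity have nz layer-mean values.
    """
    tv = virtual_temperature(temperature, specific_humidity)
    dlogp = np.diff(np.log(p_interface))  # log(p_lower) - log(p_upper) > 0
    return RDGAS * tv / GRAVITY * dlogp


p_interface = np.array([50000.0, 100000.0])  # Pa
dz_dry = hydrostatic_thickness(p_interface, np.array([250.0]), np.array([0.0]))
dz_moist = hydrostatic_thickness(p_interface, np.array([250.0]), np.array([0.01]))
```

Given a surface height, interface heights would follow by accumulating these thicknesses upward from the surface.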

Interpolating cell-centered quantities to the interfaces fails when using a 1-based indexing system for the tiles. We chose this to align more closely with FV3's file naming conventions, but I think we should revert to 0-based indexing everywhere.

oliverwm1 · 2019-12-24T21:42:14Z

Not sure what's going on, but there are some strange artifacts in the coarsened data with the code as it stands :/ Will look into this in the new year. Happy holidays all!

spencerkclark · 2020-01-02T15:45:52Z

I did a little digging -- it's indeed a subtle issue. The root cause is the following pattern in remap_levels (da_no_coords is representative of p_out, which has no horizontal coordinates when it enters the function):

In [1]: import numpy as np; import xarray as xr

In [2]: da_no_coords = xr.DataArray(np.random.random((2, 2)), dims=['x', 'y'])

In [3]: da_no_coords
Out[3]:
<xarray.DataArray (x: 2, y: 2)>
array([[0.14846017, 0.39570592],
       [0.00748756, 0.36609841]])
Dimensions without coordinates: x, y

In [4]: da_no_coords_stacked_unstacked = da_no_coords.stack(column=('x', 'y')).unstack()

Note that stacking and unstacking leads to the addition of dummy coordinates to the previously unlabeled 'x' and 'y' dimensions; this was a design decision that dates back to the initial implementation of dimensions without coordinates in xarray (though there doesn't seem to be much discussion about it).

In [5]: da_no_coords_stacked_unstacked
Out[5]:
<xarray.DataArray (x: 2, y: 2)>
array([[0.14846017, 0.39570592],
       [0.00748756, 0.36609841]])
Coordinates:
  * x        (x) int64 0 1
  * y        (y) int64 0 1

If we have a reference dataset that has labeled 'x' and 'y' dimensions (like the restart files), and then try to insert this DataArray into it, xarray will try to align things using the dummy coordinates created via stacking. This leads to NaNs on the outer edges when you do ds_remap[var] = remap_levels(...).

Details

In [6]: da_coords = xr.DataArray(np.random.random((2, 2)), coords=[range(1, 3), range(1, 3)], dims=
   ...: ['x', 'y'])

In [7]: da_coords
Out[7]:
<xarray.DataArray (x: 2, y: 2)>
array([[0.72792859, 0.34686449],
       [0.4581061 , 0.85930211]])
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2

In [8]: ds_coords = da_coords.rename('foo').to_dataset()

In [9]: reference = xr.zeros_like(ds_coords)

In [10]: reference['a'] = da_no_coords_stacked_unstacked

In [11]: reference
Out[11]:
<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2
Data variables:
    foo      (x, y) float64 0.0 0.0 0.0 0.0
    a        (x, y) float64 0.3661 nan nan nan

The upshot is, we should probably restore the horizontal coordinates on delp_coarse_on_fine in _remap_given_delp before passing things down to remap_levels. I think you might be able to take advantage of something like this function I wrote for coarse-graining the surface data:
https://github.com/VulcanClimateModeling/fv3net/blob/0439cdb57c7b71731a3a11bd220aee337c7775d0/external/vcm/vcm/complex_sfc_data_coarsening.py#L420-L420
but I think you might need to modify it some to make it more robust to staggered vs. unstaggered dimensions and differing orders of downsampled vs. reference dimensions. A nice thing about that function is that it also restores chunk sizes in addition to coordinates.
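As a rough illustration of the "restore the horizontal coordinates" idea, here is a minimal sketch. It is not the linked function: restore_coords_like is a hypothetical helper, it does not handle chunk sizes or staggered dimensions, and it assumes the array and the reference have the same length and positional ordering along each dimension.

```python
import numpy as np
import xarray as xr


def restore_coords_like(da, reference, dims=("x", "y")):
    """Hypothetical helper: overwrite coordinates of `da` along `dims`
    with the values from `reference`, matching purely by position."""
    new_coords = {
        dim: reference[dim].values for dim in dims if dim in reference.coords
    }
    return da.assign_coords(new_coords)


# Reference dataset with 1-based horizontal coordinates, like the restart files.
reference = xr.Dataset(
    {"foo": (("x", "y"), np.zeros((2, 2)))},
    coords={"x": [1, 2], "y": [1, 2]},
)

# An array without coordinates picks up 0-based dummy coordinates here.
da_no_coords = xr.DataArray(np.arange(4.0).reshape(2, 2), dims=["x", "y"])
roundtrip = da_no_coords.stack(column=("x", "y")).unstack()

# With the coordinates restored, assignment aligns cleanly and adds no NaNs.
reference["a"] = restore_coords_like(roundtrip, reference)
```

Overwriting by position is only safe because the stack/unstack round trip preserves the element order; if the data had been reordered or subset, label-based alignment would be the safer choice.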

oliverwm1 · 2020-01-02T16:53:17Z

Thanks for looking into this Spencer. Indeed a subtle issue, nice job tracking it down! This would have been introduced when I switched f_out to be initialized from p_out instead of f_in. I’ll take a look at your suggestion and work on implementing it today.

external/vcm/vcm/cubedsphere/remapz.py

spencerkclark

Nice work @oliverwm1; I'm pleased with where this is now. Thanks for all the changes.

nbren12

This is looking pretty good, but it seems like there is too much code for dealing with metadata and coordinate info. I think it would be cleaner to assume all the restart data is one Dataset with corrected metadata as returned by open_restarts. Then this PR would introduce a single function

def coarse_grain_on_pressure(restart_data: xr.Dataset) -> xr.Dataset:
   ...

I am working on another PR #97, which will make it easier to use the dimension renaming code on its own. Finally, we would need to add a function to write the single combined restart Dataset to the separate files, but that shouldn't be too hard.

external/vcm/vcm/calc/thermo.py

external/vcm/vcm/cubedsphere/remapz.py

external/vcm/vcm/coarsen.py

external/vcm/vcm/calc/thermo.py

oliverwm1 · 2020-01-06T20:10:18Z

I think I've addressed all your comments, @nbren12. I went back to not outputting a vertical coordinate when calculating pressure or height on layer interfaces, which reduces some of the dimension/coordinate complications.

I agree the code would be simpler if we could assume all the restart data is in one xarray Dataset, but I think it is outside of the scope of this PR to build out that framework. For this PR, I took the approach of replicating the current model-level methods for coarsening, which operate separately-ish on each "restart category" (coarsening fv_tracer does depend on fv_core).

nbren12

Feel free to merge, but I think we should open an issue to make this code work on the combined restart data (as returned by open_restarts). I think this will make it easier to implement with dataflow.

cc @frodre

oliverwm1 · 2020-01-15T01:10:42Z

Sounds good, I made an issue.

In the original implementation (#87), the default method for coarse-graining 3D fields in the restart files was to first vertically remap them to a common set of pressure levels within each coarse grid cell, and then take a masked mass-weighted average. The masking accounts for the fact that some fine-grid cells have no air at the pressures we are interpolating to, so they are ignored in the average. There, the masked mass was computed as the product of the masked fine-grid cell area and the original fine-grid cell pressure thickness. However, since the fine-grid fields have been interpolated to constant pressure surfaces, the pressure thickness should be the updated pressure thickness field (i.e. "delp_coarse_on_fine"), or better yet, no pressure thickness at all: it is constant within a coarse grid box and so contributes nothing meaningful to the weighting. This PR addresses this by using just the masked area as weights when pressure-level coarse-graining restart files.
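To make the weighting concrete, here is a small NumPy sketch of a masked area-weighted block average on a single pressure level. It is an illustration under assumptions, not the vcm implementation: the function name, the 2D (ny, nx) layout, and the boolean mask argument are all hypothetical.

```python
import numpy as np


def masked_area_weighted_block_mean(field, area, mask, factor):
    """Coarsen a 2D field by `factor`, weighting by area where `mask` is True.

    field, area: arrays of shape (ny, nx) on the fine grid.
    mask: True where the fine-grid cell has air at this pressure level.
    Coarse cells with no valid fine-grid cells are returned as NaN.
    """
    ny, nx = field.shape
    weights = np.where(mask, area, 0.0)
    weighted = np.where(mask, field, 0.0) * weights

    def block_sum(a):
        # Sum over each factor x factor block of fine-grid cells.
        return a.reshape(ny // factor, factor, nx // factor, factor).sum(axis=(1, 3))

    total_weight = block_sum(weights)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = block_sum(weighted) / total_weight
    return np.where(total_weight > 0, mean, np.nan)


field = np.array([[1.0, 3.0], [np.nan, 5.0]])
area = np.array([[1.0, 1.0], [2.0, 2.0]])
mask = ~np.isnan(field)
coarse = masked_area_weighted_block_mean(field, area, mask, factor=2)
```

Multiplying the weights by any factor that is constant within a coarse cell, such as a pressure thickness on constant pressure surfaces, leaves the result unchanged because it cancels between numerator and denominator.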

Oliver Watt-Meyer and others added 30 commits December 17, 2019 16:06

Add calc/thermo.py

6a5ea2d

Add test_calc_thermo.py

80b5f54

Add remapz.py

2f4b812

Add coarse_grain_fv_core_on_pressure to coarsen.py

0c15978

Rename some functions in thermo.py

5a2494d

Rename functions in test_calc_thermo.py

779132b

Add fortran remapping code mappm.f90

53c8a0f

Refactor vertical remapping routines

f9b903a

Fix first argument passed to create_fv3_grid

fc54c94

Merge branch 'master' into feature/coarsen-on-pressure

100fc56

Rename hydrostatic_dz_with_logp to hydrostatic_dz

afbe2d0

Fix block_upsample import

7529eb2

Add fv_tracer remap on pressure

c00f677

Specify x and y dimension names in args

7889c3f

Add x and y dimension names to function calls

763553c

Add placeholder for coarse_grain_phis function

31159fe

Add impose_hydrostatic_balance and format

1437b59

Add docstrings

c5f7ca9

lint

a5eb26c

Fix create_fv3_grid import statement

b2d9ed4

Update tests for thermo.py

c60057d

Pass argument dimension in thermo.py

6aadbb0

Move mappm.f90 to external and add setup.py

a900292

Add mappm installation to circleci config

b8c888a

Revert mappm install on config.yml, make import optional

1d85dda

Fix assertion in remap_levels

f6d6264

Make delp a ds before passing to create_fv3_grid

0f989ba

Bugfix/xgcm (#91)

b9c0e04

Interpolating cell-centered quantities to the interfaces fails when using a 1-based indexing system for the tiles. We chose this to align more closely with FV3's file naming conventions, but I think we should revert to 0-based indexing everywhere.

Various fixes to remapz.py

3b89cee

Rename sphum yaxis for hydrostatic DZ calc

777c2b0

Fix thermo tests

f83595e

Oliver Watt-Meyer added 4 commits January 2, 2020 11:02

Move _block_upsample_like to coarsen.py and make public

c380d39

Generalize block_upsample_like to handle staggered dims

d19b426

Use block_upsample_like in remapz.py

2011bc4

Remove unused import statement

f0b5494

spencerkclark reviewed Jan 2, 2020

external/vcm/vcm/cubedsphere/remapz.py (outdated, resolved)

Oliver Watt-Meyer added 9 commits January 2, 2020 21:27

Ensure chunks are set using right dimension order

717eb4a

Add coordinate for staggered delp

b395f84

Remove dtype specification for staggered delp coord

3cad648

Revert virtual temperature calc to approximate version

5c976de

Use lower coordinate in dlogp calculation output

c914fbd

Use exact virtual tempearture

43f61f8

Fix midpoint pressure calc and add docstrings

8d334c9

Make virtual t calc consistent with fv3gfs

51e5af7

Add tests for _add_coords

31af7a3

spencerkclark approved these changes Jan 3, 2020

nbren12 suggested changes Jan 3, 2020

nbren12 reviewed Jan 3, 2020

external/vcm/vcm/calc/thermo.py (outdated, resolved)

Address Noah comments

840596b

nbren12 approved these changes Jan 14, 2020

Merge branch 'master' into feature/coarsen-on-pressure

23d0813

oliverwm1 merged commit f91209b into master Jan 15, 2020

oliverwm1 deleted the feature/coarsen-on-pressure branch January 15, 2020 16:04

spencerkclark mentioned this pull request Jun 24, 2022

Use masked area weights when pressure-coarsening 3D restart files #1895

Merged
