Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate / document CDAT-xCDAT spatial differences (again) #500

Closed
pochedls opened this issue Jun 8, 2023 · 7 comments · Fixed by #495
Closed

Investigate / document CDAT-xCDAT spatial differences (again) #500

pochedls opened this issue Jun 8, 2023 · 7 comments · Fixed by #495

Comments

@pochedls
Copy link
Collaborator

pochedls commented Jun 8, 2023

A recent issue prompted me to look through (and update) the spatial averaging routine.

I validated this PR by looking at monthly CMIP5/6 historical tas files (1 per model for 117 climate models) and 21 averaging domains (I skipped NAFM due to this discussion).

In validating the proposed changes, I have (again) come across CDAT-xCDAT differences. In 28 cases of 2415 cases (1.1%) the spatial averages are different (as judged by np.allclose()). The monthly differences are mostly below 0.01 K, but exceed 0.1 (in rare cases). These results apply to both the current release v0.5.0 and the proposed PR.

I plan to summarize the reasons for the differences in this issue.

@pochedls
Copy link
Collaborator Author

pochedls commented Jun 8, 2023

For a subset of models and domains, the CDAT-xCDAT differences were traced to CDAT dropping the model's latitude bounds (and then replacing the bounds with its own calculated bounds). This resulted in different weights and small differences in the area averages (see table at bottom).

First – an example of incorrect bounds:

Excerpt of lat_bnds from ncdump -v lat_bnds /p/css03/cmip5_css02/data/cmip5/output1/BNU/BNU-ESM/historical/mon/atmos/Amon/r1i1p1/tas/1/tas_Amon_BNU-ESM_historical_r1i1p1_185001-200512.nc:

lat_bnds =
-87.8638000488281, -86.4801635742188,
-86.4801635742188, -83.704719543457,
-83.704719543457, -80.9192581176758,
-80.9192581176758, -78.1312561035156,
-78.1312561035156, -75.3422088623047,
-75.3422088623047, -72.5526351928711,

These appear to match the bounds in cdscan. But they are different from those loaded using cdms2:

fn = '/p/user_pub/xclim/CMIP5/CMIP/historical/atmos/mon/tas/CMIP5.CMIP.historical.BNU.BNU-ESM.r1i1p1.mon.tas.atmos.glb-z1-gu.1.0000000.0.xml'
f = cdms2.open(fn)
tas = f('tas')
print(tas.getLatitude().getBounds())

[[-89.24743652 -86.48016358]
[-86.48016358 -83.70471955]
[-83.70471955 -80.91925812]
[-80.91925812 -78.13125229]
[-78.13125229 -75.34220887]
[-75.34220887 -72.55263519]

If I drop the lat_bnds in xcdat and re-calculate them using .add_missing_bounds() the spatial average differences are order 10-10. This resolves 15 / 28 CDAT-xCDAT differences. I think CDAT probably drops the latitude bounds systematically, but it only matters in these cases. xCDAT does not drop the model latitude bounds (and I don't think it should – we should trust the model data and the user should decide to drop this data).

model domain max absolute difference
BNU-ESM global 0.0337137471188953
BNU-ESM NAM 0.03921949346317888
BNU-ESM SAM 0.057204666153268136
FIO-ESM global 0.033644891283586276
FIO-ESM NAM 0.03955337948826809
FIO-ESM SAM 0.0555397851591124
bcc-csm1-1 global 0.03277778145468346
bcc-csm1-1 NAM 0.04099915332000137
bcc-csm1-1 SAM 0.056736843962255534
bcc-csm1-1-m global 0.005671941710033934
bcc-csm1-1-m NAM 0.006484684673694119
bcc-csm1-1-m SAM 0.009858142884070276
fio-esm global 0.033644891283586276
fio-esm NAM 0.03955337948826809
fio-esm SAM 0.0555397851591124

@pochedls
Copy link
Collaborator Author

pochedls commented Jun 8, 2023

In another small set of cases (see table below), small differences seem to be introduced by the precision of the axis information.

For example: ncdump -h /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r10i1p1f1/Amon/tas/gr/v20180803/tas_Amon_IPSL-CM6A-LR_historical_r10i1p1f1_gr_185001-201412.nc | grep lat

The lat axis is stored as a float (i.e., float32 not double). CDAT (type(tas.getLatitude().getBounds()[0, 0])) reads this as numpy.float64 whereas xcdat reads (type(ds.lat_bnds.values[0, 0])) this (properly) as numpy.float32. These datasets do not have bounds (so they are calculated by both CDAT and xCDAT). The difference in precision introduces numerical roundoff (10-11 to 10-8) differences in the bounds, then the weights (10-11 0 10-8), then the spatial averages.

If I cast the lat and lon axes to float64 in the xcdat dataset, recompute the bounds, and the spatial average, then xcdat and CDAT match. This resolves 6 of the 28 differences. This means 21 of 28 of the differences are resolved.

In the past (and repeated here), this issue had been resolved with my own averaging calculations. We could not figure out why my calculations differed from xarray, though we did realize that modifying the precision mattered. I think the custom spatial averager likely worked because it was probably casting the arrays to the same type which led to np.allclose resulting in True. I haven't tried to figure out exactly where the difference happens (it seems to stem from CDAT reading in the bounds with the wrong precision, which affects downstream calculations).

model domain max absolute difference
IPSL-CM6A-LR SAM 0.003096939779936747
IPSL-CM6A-LR AllM 0.007789411802434643
IPSL-CM6A-LR-INCA global 0.0037057772779576226
IPSL-CM6A-LR-INCA SAM 0.003101609951784212
IPSL-CM6A-LR-INCA AllM 0.007794648910362412
IPSL-CM6A-LR global 0.0037208329281384067

@pochedls
Copy link
Collaborator Author

pochedls commented Jun 8, 2023

One set of cases that I don't fully understand is with CESM2-WACCM (see differences). The bounds and weights appear to be identical in CDAT and xCDAT. The differences are "all close" if I either a) manually calculate the spatial average (10-5 differences), b) reverse the axis order when averaging in xcdat (10-3 differences), or drop and recompute the latitude bounds (and weights) (10-7 differences).

Unlike with IPSL data (above) the axes seem to be stored (and opened) with double precision. The bounds are stored as float in the netCDF file. Both xcdat and CDAT bounds/weights seem to open as float32. But if I drop the lat_bnds and re-create them in xcdat, they are float64 and I get very close to the CDAT spatial averages (10-7 differences). So I think this is another precision/roundoff type difference.

If we accept this as an explanation, this resolves 4/28 differences (3/28 left to figure out).

model domain max absolute difference
CESM2-WACCM tropics 0.004078462585937359
CESM2-WACCM global 0.006396168779986056
CESM2-WACCM SAM 0.0029629751547304295
CESM2-WACCM AllM 0.008556697618530507

@pochedls
Copy link
Collaborator Author

pochedls commented Jun 8, 2023

The final set of differences (see table below) appear to be related to incorrect longitude weights assigned by CDAT in rare instances.

For example, for the Sahel domain (/p/css03/esgf_publish/CMIP6/CMIP/NUIST/NESM3/historical/r1i1p1f1/Amon/tas/gn/v20190630/) which has the bounds 13 - 18oN and -10 to 10oE, CDAT gives the following relative weights:

lon_bnd relative weight
[-9.375 -7.5 ] 1.875
[-7.5 -5.625] 1.875
[-5.625 -3.75 ] 1.875
[-3.75 -1.875] 1.875
[-1.875 0. ] 1.875
[0. 1.875] 1.875
[1.875 3.75 ] 1.875
[3.75 5.625] 1.875
[5.625 7.5 ] 1.875
[7.5 9.375] 1.875
[ 9.375 10. ] 0.625

It correctly adjusts the bound of the lat grid cell (from 9.375 : 11.25 to 9.375 : 10.), but it drops weight from the grid cell spanning (-11.25 : -9.375) [it isn't listed]. xCDAT does weight this grid cell:

lon_bnd relative weight
[-11.25 -9.375] 0.625
[-9.375 -7.5 ] 1.875
[-7.5 -5.625] 1.875
[-5.625 -3.75 ] 1.875
[-3.75 -1.875] 1.875
[-1.875 0. ] 1.875
[0. 1.875] 1.875
[1.875 3.75 ] 1.875
[3.75 5.625] 1.875
[5.625 7.5 ] 1.875
[7.5 9.375] 1.875
[ 9.375 11.25 ] 0.625

Note the partial weights in the first and last grid cells. I don't know why CDAT doesn't provide the correct weights in these rare instances. This resolves the final 3 CDAT / xCDAT discrepancies.

model domain max absolute difference
NESM3 Sahel 0.15666759125781482
NESM3 GoG 0.036151854691411245
NESM3 SAmo 0.10544204167973703

@pochedls
Copy link
Collaborator Author

pochedls commented Jun 8, 2023

The issue that prompted the PR has to do with an E3SM dataset with unusual bounds.

The first set of bounds span 359.6484375 to 0.3515625. This appears to be inconsistent with CF-conventions (see, e.g., here) which states that bounds should be increasing. We don't have examples of bounds like this in the CMIP archive. If bounds wrap around the prime meridian they are stored in increasing order, e.g., [ -0.9375, 0.9375] for /p/css03/esgf_publish/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/.

The weird E3SM bounds are not handled correctly by the spatial averager in v0.5.0. This issue is caught and fixed in the current PR.

@tomvothecoder
Copy link
Collaborator

Thanks for documenting the 28 cases where you found differences in the spatial averaging results between CDAT and xCDAT. I was able to follow your findings for these cases and am satisfied with the explanations.

One set of cases that I don't fully understand is with CESM2-WACCM (see differences). The bounds and weights appear
to be identical in CDAT and xCDAT. The differences are "all close" if I either a) manually calculate the spatial average (10-5 differences), b) reverse the axis order when averaging in xcdat (10-3 differences), or drop and recompute the latitude bounds (and weights) (10-7 differences).

Unlike with IPSL data (above) the axes seem to be stored (and opened) with double precision. The bounds are stored as float in the netCDF file. Both xcdat and CDAT bounds/weights seem to open as float32. But if I drop the lat_bnds and re-create them in xcdat, they are float64 and I get very close to the CDAT spatial averages (10-7 differences). So I think this is another precision/roundoff type difference.

If we accept this as an explanation, this resolves 4/28 differences (3/28 left to figure out).

As mentioned in this comment, we found that CDAT incorrectly type casts weights to np.float64 even though bounds in the input dataset are np.float32 (source notebook). There might be some subtle type casting going on resulting in precision differences. We probably shouldn't be worried here.

@pochedls
Copy link
Collaborator Author

I am also happy with where things are. I think this can be closed with #495.

pochedls added a commit that referenced this issue Jun 13, 2023
…on bounds span prime meridian (#495)

* Fix for #494 (and tested for robustness in #500)
* Updates spatial averaging logic to handle datasets with non-monotonic longitude bounds

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants