[Bug]: .spatial.average(lat_bnds=(a,b)) does not appear to be working properly #494
Comments
@AaronDonahue – I don't have permissions to see these files – what does
@pochedls - They were files I had created, so I've just updated the permissions so you should be able to see them. They're regridded files, so they are on a rectilinear grid.
Okay – not a grid issue. I think I see the problem. @AaronDonahue is supplying weights and lat_bounds together.

If weights are not provided, xcdat basically makes weights that conform to your region (so zero weight outside your region of interest, partial weight for grid cells partially in the domain, and full weight inside the region of interest). I am guessing you are providing your own spatial weights (or some other kind of weight), but you would like the regional selection applied as well. Can you just let xcdat calculate the weights (or are they not spatial weights)? Another alternative approach would be to subset the domain(s) and then average (though in that case I think you would basically just be using xarray's weighted mean functionality and there would be no special treatment for grid cells partially in your averaging domain).

@crterai – I'm still getting the following:
I think you need to use something like
One other option would be to use the xcdat-generated weights combined with your own:

```python
area_weights = ds.spatial.get_weights(lat_bounds=(-10, 10))
my_weights = ...
weights = area_weights * my_weights
ds.spatial.average(varId, axis=["X", "Y"], weights=weights)
```
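To see why multiplying the two weight arrays behaves as intended, here is a small self-contained numpy sketch (illustrative only; the data and weight values are made up):

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])

# regional weights: zero outside the domain of interest
area_weights = np.array([0.0, 1.0, 1.0, 0.0])
# user-supplied weights, e.g. per-cell quality weights
my_weights = np.array([1.0, 3.0, 1.0, 1.0])

# multiplying zeroes out cells outside the region and
# scales the remaining cells by the custom weights
combined = area_weights * my_weights
avg = (data * combined).sum() / combined.sum()
# avg == 2.25: only the middle two cells contribute, weighted 3:1
```

Cells with zero area weight drop out of both the numerator and the denominator, so the result is a weighted mean over the region only.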
Thanks for the suggestion @pochedls. Okay, so weights and lat_bounds don't work well together. I can try one of your suggested fixes. And hmm, that's odd. Maybe need executable access for the files as well...
@pochedls, thanks for the help. I tried letting xcdat calculate the weights but ran into a separate issue:

provides the error

I get the same error if I use
Update, I think I've got it. When loading the data I'm using

fails, but using a different data set,
I actually think the above may be a red herring. If I remove
I get

I'm happy to look at this, but if you could create a complete, minimal example (import, open dataset, do something, error message) with data that I can access (e.g., "give" me a file on the LC or something), that would be helpful.
@crterai, could you give @pochedls permission to the file above. @pochedls, I'll see what I can do about getting a simpler reproducer. I'm at a loss because I don't understand what the difference between the Jan and Oct datasets is that causes the error, except that the Oct dataset has fewer timesnaps per file (4) than Jan (96).
Okay, it seems like there's a directory somewhere in that tree that doesn't allow access to @pochedls, so I've copied the files over to a scratch directory.
@crterai shared the data. The initial issue was that both

The next issue was that the longitude bounds wrap around the prime meridian in this dataset. This creates problems, because weights are determined by the difference between the longitude bound values (and a prime meridian crossing introduces a discontinuity, e.g.,

I also noticed that the

I've addressed the multifile dataset issue and the prime meridian issues in PR #495, which will hopefully be included in our next release.
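To illustrate the discontinuity described above (a toy sketch, not the actual code in PR #495): if a cell's longitude bounds wrap the prime meridian, e.g. [359, 1], a naive difference gives a large negative width, while taking the difference modulo 360 recovers the true 2-degree width.

```python
import numpy as np

# hypothetical 2-degree-wide cells; the last one wraps the prime meridian
lon_bnds = np.array([[355.0, 357.0],
                     [357.0, 359.0],
                     [359.0, 1.0]])

naive_width = lon_bnds[:, 1] - lon_bnds[:, 0]  # last cell: -358.0 (wrong)
wrapped_width = naive_width % 360.0            # last cell: 2.0 (right)
```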
Thanks for looking into this, @pochedls. Until the next release, what would you recommend we do for regional averages with the dataset that we have? Should we not use the
I think you could subset your dataset and then average:

```python
import xarray as xr

fn = 'output.scream.SurfVars.INSTANT.nmins_x15.2020-01-31-00900.nc'
ds = xr.open_dataset(fn)
ds = ds.sel(lat=slice(-10, 10))
area = ds.area
tas = ds.T_2m
tasw = tas.weighted(area)
tas_avg = tasw.mean(("lat", "lon"))
```

This should be fine for large spatial averages, but if you look at small domains I don't know exactly how it handles the (partial) weights for grid cells straddling the domain of interest.
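For contrast, the partial weighting described earlier in the thread (zero weight outside the region, partial weight for straddling cells, full weight inside) can be sketched in plain numpy by clipping cell bounds to the region before computing weights. This is a hypothetical illustration, not xcdat's implementation:

```python
import numpy as np

def clipped_lat_weights(lat_bnds, south, north):
    """Latitude weights for cells clipped to [south, north].

    Cells fully outside get zero weight, cells straddling a boundary
    get partial weight, and interior cells get full weight.
    """
    lower = np.clip(lat_bnds[:, 0], south, north)
    upper = np.clip(lat_bnds[:, 1], south, north)
    # sin(lat) differences are proportional to cell area on a sphere
    return np.sin(np.radians(upper)) - np.sin(np.radians(lower))

bnds = np.array([[-90.0, -15.0], [-15.0, 15.0], [15.0, 90.0]])
w = clipped_lat_weights(bnds, -10.0, 10.0)
# only the middle cell contributes, and only its [-10, 10] portion
```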
Excellent, thanks for this Stephen. I was able to use your snippet above to write my own mini spatial averaging function, which I can use for now.
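A mini spatial averaging helper along these lines might look like the following (a hypothetical numpy sketch, not the actual function Aaron wrote; it selects whole cells by their center latitude, so cells straddling the region boundary are included or excluded wholesale):

```python
import numpy as np

def regional_mean(data, area, lat, south, north):
    """Area-weighted mean of data over cells with center latitude in [south, north].

    data, area: 2-D (lat, lon) arrays; lat: 1-D cell-center latitudes.
    """
    mask = (lat >= south) & (lat <= north)
    w = area[mask, :]
    return float((data[mask, :] * w).sum() / w.sum())

# toy example: 3 latitude rows, 2 longitude columns, uniform cell areas
lat = np.array([-20.0, 0.0, 20.0])
data = np.arange(6.0).reshape(3, 2)
area = np.ones_like(data)
# only the lat == 0 row falls inside (-10, 10): values 2.0 and 3.0
```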
@AaronDonahue / @crterai / @chengzhuzhang / @tomvothecoder I spent some time re-evaluating spatial averaging in light of this issue (see #500). This was useful to understand why xCDAT sometimes gets different spatial average values compared to CDAT (in short, differences are small and I think xcdat is doing the right thing when differences do exist). The initial error that @AaronDonahue got

This seems to be an uncommon issue:
In general, I think this tends to be rare because bounds are supposed to increase (see Fig. 7.1). My PR catches and addresses this situation when it occurs. I wanted to raise this and explain the bug that I found (and the corresponding PR); if you think that other users may have hit this issue or have datasets that I could test with this PR, let me know.
@pochedls thank you very much for your effort documenting this uncommon case and for providing a fix. The provided file from SCREAM (/pscratch/sd/t/terai/EAMxx/shareWpochedls/output.scream.SurfVars.INSTANT.nmins_x15.2020-*nc) was processed with

I suspect this could affect some use cases of regridding as well.
I'm trying to figure out if/what is being asked of me. My understanding is that a
Hi Charlie, Thanks for following up!
Yes to both
Based on the cf-convention doc, Fig. 7.1, the bounds are supposed to be increasing. Although the submitted PR resolves this particular issue, I think it is still worthwhile to correct the negative/positive longitude convention for lon_bounds created by
Thanks Jill for the pointer. I did not realize there was a CF convention exactly about this issue. Last question: where exactly is the dataset that
@taylor13 – I wanted to see if you could review one aspect of this issue (@czender / @chengzhuzhang – FYI – I thought it would be useful for @taylor13 to weigh in here). If I have a longitude boundary that spans across 0° longitude, e.g.,

All of the CMIP data I have looked at (looping over historical tas data for all models) writes the bounds as the latter

I think this ultimately doesn't matter for xcdat (we now handle this case), but should data generally be written as
CF requires coordinates to be monotonically increasing or decreasing, and the bounds on a 1-D coordinate should be consistent with this (so if the coordinate increases, the 2nd bound should be greater than the first bound). This makes it simple to calculate the cell width. So in your example,

In the special case of a single cell located at 0 degE spanning
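As a worked example of the convention described above (second bound minus first gives the cell width, with a modulo fallback for a cell that wraps 0 degE; the bound values here are hypothetical):

```python
# hypothetical 5-degree cells; the last one straddles 0 degE
bounds = [(-2.5, 2.5), (2.5, 7.5), (357.5, 2.5)]

# second bound minus first, wrapped into [0, 360)
widths = [(hi - lo) % 360.0 for lo, hi in bounds]
# every cell comes out 5 degrees wide, including the wrapping one
```

Writing the wrapping cell as (-2.5, 2.5) instead of (357.5, 2.5) keeps the bounds increasing and makes the plain difference correct without any modulo handling.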
@crterai / @AaronDonahue – we produced a bug fix for this issue that will be incorporated into the next version of xcdat (not yet released). Could you check whether this issue is resolved in this candidate/pre-release (see this discussion)?
@pochedls - I can give it a try. |
Thanks @crterai! I tagged you and @AaronDonahue on the related test case in our v0.6.0rc1 testing plan, which Steve also mentioned in his comment above. |
What happened?
I am a new user to xCDAT so this could definitely be user error. I have a data set that I am loading on perlmutter with this code,

and then I attempt to take a spatial average with latitude bounds. As a sanity check I took the average for wildly different areas of the globe, expecting them to be different, but they end up matching exactly.
I noticed this when plotting the average for more realistic cases and noticing the plots matched.
What did you expect to happen? Are there any possible answers you came across?
I expected the two different lat bounds to produce different answers.
Minimal Complete Verifiable Example (MVCE)
produces:
Relevant log output
No response
Anything else we need to know?
No response
Environment
TBH I don't know how to do this.
xcdat.show_versions() throws an error. I am using the e3sm_unified environment on NERSC - perlmutter, which I loaded with

xarray.show_versions produces: