A lightweight CF model #10
Comments
Thanks for the ping. 😄
Pyke is not core to Iris at all. It just happens to be used to translate CF-netCDF files into Cubes, but it's the Cube which embodies CF in a Python object.
I'm keen to explore a core + optional extras model with Iris (e.g. https://github.com/SciTools/iris-extras/issues/7 and SciTools/iris#1789). The improving package/dependency management tools make it more feasible for us to pull capabilities out of the core Iris package and into extension packages. In the logical conclusion of that model the "CF-python object" is the Cube. I'm guessing you don't see things in quite the same way though, so I'm eager to understand the difference. Speaking of which...
How would a "CF xray.Dataset" differ from your "CF-python object"?
I think you once said something roughly equivalent to "I use xray by default and iris when I need CF compliance". I'd love to know more about what makes you reach for Iris. |
I had a long talk with @kwilcox about this yesterday. The CF model itself actually consists of functionality that can be separated: unit conversion, vertical coordinate calculation, standard_name manipulation, and handling of the different common data model featureTypes (Grid, Point, TimeSeries, TimeSeriesProfile, Profile, Trajectory, TrajectoryProfile). Grid handles only data which is colocated with coordinate values. To handle many of the newer oceanographic, atmospheric and hydrologic models, we also need support for grids where the data is not colocated with the coordinate data (staggered grid) and data which is on a non-rectangular mesh (unstructured grid). This was the motivation behind the UGRID and SGRID conventions, and the "pyugrid" and "pysgrid" packages.
We were thinking that if these packages could provide standard methods for these regular grid, ugrid or sgrid objects (e.g. subsetting and regridding methods that return specific featureTypes), then they could be passed into functions that would do things like return a vertical transect along a specified path, regardless of the type of object. And folks who come up with some other type of model feature type (possibly a spectral representation for FEM models like the Imperial College ICOM model) could create their own package, as long as they provided the appropriate methods. Could Iris be the package that orchestrates this functionality? I don't see why not.
The main things that keep me from using Iris more are: (1) awkward slicing on coordinate values (e.g. compared to Xray); (2) the long time it takes to open and inspect a dataset; (3) the lack of a dataset concept; (4) the monolithic structure. Yet (1) is probably easily overcome, (2) may be just a question of learning how to inspect a dataset with Iris (using raw over strict), (3) may not be a real problem as long as cube lists don't actually duplicate coordinate data, and (4) is being worked on. |
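A rough sketch of the "standard methods" idea above, using only the standard library. Nothing here is taken from pyugrid, pysgrid or Iris: the `Feature` protocol, the `sample_column` method and both toy classes are hypothetical names made up for illustration, and nearest-neighbour lookup stands in for whatever interpolation a real package would use.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple


class Feature(Protocol):
    """Hypothetical interface that each grid package would implement."""

    def sample_column(self, lon: float, lat: float) -> List[float]:
        """Return the vertical column of values nearest to (lon, lat)."""
        ...


@dataclass
class RegularGrid:
    """Toy rectilinear grid with values[k][j][i] on (depth, lat, lon)."""
    lons: Sequence[float]
    lats: Sequence[float]
    values: Sequence[Sequence[Sequence[float]]]

    def sample_column(self, lon: float, lat: float) -> List[float]:
        # Nearest index along each horizontal axis.
        i = min(range(len(self.lons)), key=lambda n: abs(self.lons[n] - lon))
        j = min(range(len(self.lats)), key=lambda n: abs(self.lats[n] - lat))
        return [level[j][i] for level in self.values]


@dataclass
class UnstructuredMesh:
    """Toy unstructured mesh with one column of values per node."""
    nodes: Sequence[Tuple[float, float]]  # (lon, lat) of each node
    values: Sequence[Sequence[float]]     # values[node][k]

    def sample_column(self, lon: float, lat: float) -> List[float]:
        # Nearest node by squared distance in lon/lat space.
        n = min(
            range(len(self.nodes)),
            key=lambda m: (self.nodes[m][0] - lon) ** 2
            + (self.nodes[m][1] - lat) ** 2,
        )
        return list(self.values[n])


def vertical_transect(feature: Feature, path: Sequence[Tuple[float, float]]):
    """Return one vertical column per path point, whatever the grid type."""
    return [feature.sample_column(lon, lat) for lon, lat in path]


path = [(0.1, 10.0), (0.9, 11.2)]
grid = RegularGrid(
    lons=[0.0, 1.0],
    lats=[10.0, 11.0],
    values=[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
)
mesh = UnstructuredMesh(nodes=[(0.0, 10.0), (1.0, 11.0)], values=[[1, 5], [4, 8]])
grid_transect = vertical_transect(grid, path)
mesh_transect = vertical_transect(mesh, path)
```

The point of the sketch is that `vertical_transect` never checks which kind of object it was given. A package for a new feature type would only need to supply the agreed methods.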
I did not say "core of iris." But bear in mind that 99.99% of the time our data is in the netCDF format. That means pyke, for us, is the CF parser in iris.
What we imagine is an object one step behind the cube. Maybe just a new netCDF object with some CF modifications and checked for compliance, or a dict of dicts mapping nc.variables and nc.dimensions to CF definitions. I must sound like an 8-year-old wishing for a dirt bike with a rocket 🚲 + 🚀
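To make the "dict of dicts" option a bit more concrete, here is a minimal sketch. It is an assumption-laden illustration, not how pyke or iris work: `build_cf_index` is a hypothetical helper, plain dicts stand in for nc.variables and nc.dimensions, and only two CF rules are applied (a coordinate variable is a 1-D variable named after its dimension, and the `coordinates` attribute lists auxiliary coordinate variables).

```python
def build_cf_index(variables, dimensions):
    """Map variables and dimensions to a few CF roles.

    variables:  name -> {"dims": tuple of dimension names, "attrs": dict}
    dimensions: name -> size
    (Both are stand-ins for nc.variables / nc.dimensions.)
    """
    # Every name listed in some variable's `coordinates` attribute is an
    # auxiliary coordinate variable.
    aux_names = set()
    for var in variables.values():
        aux_names.update(var["attrs"].get("coordinates", "").split())

    index = {"dimensions": {}, "variables": {}}

    for name, size in dimensions.items():
        # A dimension has a coordinate variable if a 1-D variable shares its name.
        has_coord = name in variables and variables[name]["dims"] == (name,)
        index["dimensions"][name] = {
            "size": size,
            "coordinate_variable": name if has_coord else None,
        }

    for name, var in variables.items():
        if var["dims"] == (name,):
            role = "coordinate"
        elif name in aux_names:
            role = "auxiliary_coordinate"
        else:
            role = "data"
        index["variables"][name] = {
            "role": role,
            "dims": var["dims"],
            "standard_name": var["attrs"].get("standard_name"),
            "units": var["attrs"].get("units"),
        }
    return index


# Made-up file contents, for illustration only.
variables = {
    "time": {
        "dims": ("time",),
        "attrs": {"standard_name": "time", "units": "days since 2000-01-01"},
    },
    "lon": {
        "dims": ("station",),
        "attrs": {"standard_name": "longitude", "units": "degrees_east"},
    },
    "temp": {
        "dims": ("time", "station"),
        "attrs": {
            "standard_name": "sea_water_temperature",
            "units": "Celsius",
            "coordinates": "lon",
        },
    },
}
dimensions = {"time": 4, "station": 2}
index = build_cf_index(variables, dimensions)
```

A structure like `index` carries no data and no cube machinery, so in principle either a cube or a Dataset could be built from it. Compliance checking, units parsing and formula_terms handling are left out of this sketch.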
I guess that the grid support, like
The cube is more than the CF-object, and that is the main problem. My imaginary CF-object would be a lighter cube-like constructor behind the cube. Here are some examples of why we want something like this:
There is no "CF xray.Dataset" yet, but the CF-python object would help create it. One could add a vertical coordinate to the Dataset using the information parsed by the CF-python object. If someone wants to do this in Maybe these two examples will help:
If we could have an intermediate object maybe we could do this:
The
I am writing a blog post about this. Can you wait for it? 😜 |
@lesserwhirls and @dopplershift, I'm bringing you guys into this discussion too, because it would be great if we could all be working toward harmonization of access in Python to the common data model. |
Depends how long I need to wait... 😜 |
Oops. My laptop died with that post on it, and I never configured the new one for the blog... Sorry. |
Are you planning to create a new post? Either way, I'd still love to know more about what helps/hinders your usage of Iris. |
Yes. As soon as I have some free time to restore my old HDD.
In a nutshell, the post will be about how the CF model in iris helps our workflow. PS: The hindrances are mostly the slicing (the reason why |
Super! Thank you! 😄
I'm trying to get a shared plan together for that: https://github.com/SciTools/iris/wiki/IEP-1 |
Awesome! I made a few comments here: https://via.hypothes.is/https://github.com/SciTools/iris/wiki/IEP-1 I guess that |
@rhattersley Here's an example that shows the kind of thing that hinders usage of Iris. In this notebook, the user just wants to do something very simple and common: extract time series data in a specified date range and plot it. Not only is the xarray syntax a lot simpler, but it's also a lot faster. The speeds are listed in the notebook, but I'm summarizing them here:
Xarray is 60 times faster! |
@ocefpaf - Chrome was the only browser that showed the overlay widgets, but even with Chrome I couldn't see any comments.
@rsignell-usgs - thanks! 👍 |
Weird, I lost the comments too. I guess it is because the wiki was modified.
|
👍 We can move any further discussion to SciTools/iris#1988. |
Since the first time I saw iris I fell in love with its interpretation of the CF-conventions*. It is not simple metadata bookkeeping like column/index labels in pandas or xray, nor a "bag" dictionary holding all the metadata. It is a full-fledged CF-convention parser that creates a Python object: propagating units, checking for compliance, etc.
I am completely unfamiliar with the tool iris uses to do this (pyke) and I never looked into the details of the implementation. However, it would be extremely useful if we could take the approach used in cf_units and create a standalone module that generates a CF-python object. This object could be used by iris to create the cube. And it would also be possible to use it to create other objects, maybe even a CF xray.Dataset.
Note that there is a cf_python module out there, but I never checked whether it fits our needs. (Well... We have to define our needs first, don't we?)
I believe we do not have the manpower to do this right now, but I wanted to open the issue here to keep this idea alive and start a discussion.
Pinging @pelson and @rhattersley. Are you 👍 or 👎? Do you think this is possible? Do you think this is useful? Or do you think this is a wild-goose chase?
* The truth is that this is a love-and-hate relationship. The CF interpretation is so good that it brings all of CF's shortcomings to the cube 😜