CMUG Sea Surface Salinity dataset and diagnostic #1832

Merged
merged 24 commits into from
Dec 7, 2021
Contributor

@jvegreg jvegreg commented Sep 30, 2020

Add ESACCI-SEA-SURFACE-SALINITY dataset (v1 and v2) and a diagnostic showing how to

Requires ESMValGroup/ESMValCore#764 and ESMValGroup/ESMValCore#798

Tasks

  • Create an issue to discuss what you are going to do, if you haven't done so already (and add the link at the bottom)
  • Give this pull request a descriptive title that can be used as a one line summary in a changelog
  • Make sure your code is composed of functions of no more than 50 lines and uses meaningful names for variables
  • Circle/CI tests pass. Status can be seen below your pull request. If the tests are failing, click the link to find out why.
  • Preferably Codacy code quality checks pass, however a few remaining hard to solve Codacy issues are still acceptable. Status can be seen below your pull request. If there is an error, click the link to find out why. If you suspect Codacy may be wrong, please ask by commenting.
  • Please use yamllint to check that your YAML files do not contain mistakes
  • (Only if really necessary) Add any additional dependencies needed for the diagnostic script to setup.py, esmvaltool/install/R/r_requirements.txt or esmvaltool/install/Julia/Project.toml (depending on the language of your script) and also to package/meta.yaml for conda dependencies (includes Python and others, but not R/Julia)
  • If new dependencies are introduced, check that the license is compatible with Apache2.0

New recipe/diagnostic

  • Add documentation for the recipe to the doc/sphinx/source/recipes folder and add a new entry to index.rst
  • Add provenance information

New data reformatting script

  • Test the CMORized data using recipes/example/recipe_check_obs.yml, to make sure the CMOR checks pass without errors
  • Add the new dataset to the table in the documentation
  • Tag @remi-kazeroni in this pull request, so that the new dataset can be added to the OBS data pool at DKRZ and synchronized with CEDA-Jasmin

Contributor

@axel-lauer axel-lauer left a comment

I tested this new diagnostic. It worked and the output looks OK but I would recommend some more documentation (see comments) and I did not like the absolute path used for the shapefile in the recipe.


Recipes are stored in recipes/

* recipe_sea_surface_salinity
Contributor

Suggested change
* recipe_sea_surface_salinity
* recipe_sea_surface_salinity.yml

Variables
---------

* sos (ocean, monthly mean, time depth_id)
Contributor

Suggested change
* sos (ocean, monthly mean, time depth_id)
* sos (ocean, monthly, longitude, latitude, time)

Contributor Author

Should this refer to the original data, or to the dimensions expected by the diagnostic after the preprocessor?

Contributor

So far, this has usually referred to the dimensions expected by the recipe (the recipe as provided in the official release). I guess we should think about how to handle this in the future and make it clearer what this actually refers to. Or maybe simply get rid of this section? What do people think?

@@ -0,0 +1,86 @@
.. _recipes_sea_surface_salinity:

Sea Surface Salinity Evaluation
Contributor

I think it would be good to have a section in the documentation on:

(1) how to obtain the shapefile needed
(2) in which directory to put the shapefile
(3) available regions (or at least examples) for analysis
(4) required / recommended preprocessor settings (extract_shape)

@jvegreg
Contributor Author

jvegreg commented Oct 20, 2020

I think I addressed all doc-related comments. Anyway, this cannot be merged yet because a PR in the core is still pending.

@axel-lauer
Contributor

Both ESMValGroup/ESMValCore#764 and ESMValGroup/ESMValCore#798 have been merged. Is there anything else missing to move forward with this PR?

@axel-lauer
Contributor

I just tried to test this recipe again with the latest ESMValCore (master) but had no success. The first problem was with the preprocessor area_statistics:

Traceback (most recent call last):
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_main.py", line 100, in process_recipe
    recipe = read_recipe_file(recipe_file, config_user)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 55, in read_recipe_file
    return Recipe(raw_recipe,
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 963, in __init__
    self.tasks = self.initialize_tasks() if initialize_tasks else None
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 1338, in initialize_tasks
    task = _get_preprocessor_task(
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 927, in _get_preprocessor_task
    task = _get_single_preprocessor_task(
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 784, in _get_single_preprocessor_task
    products = _get_preprocessor_products(
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 718, in _get_preprocessor_products
    _update_preproc_functions(settings, config_user, variable, variables,
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 754, in _update_preproc_functions
    _update_fx_settings(settings=settings,
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 489, in _update_fx_settings
    _update_fx_files(step_name, settings, variable, config_user,
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 436, in _update_fx_files
    fx_info.update({'mip': None})
AttributeError: 'str' object has no attribute 'update'

After changing

    area_statistics:
      operator: mean
      fx_variables:
        areacello: areacello

to

    area_statistics:
      operator: mean
      fx_variables: [areacello]

the recipe starts but then crashes again with

2021-05-19 09:25:05,037 UTC [3246] ERROR   Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_main.py", line 104, in process_recipe
    recipe.run()
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_recipe.py", line 1409, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_task.py", line 674, in run
    self._run_sequential()
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_task.py", line 685, in _run_sequential
    task.run()
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_task.py", line 248, in run
    input_files.extend(task.run())
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/preprocessor/__init__.py", line 481, in _run
    product.apply(step, self.debug)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/preprocessor/__init__.py", line 350, in apply
    self.cubes = preprocess(self.cubes, step, **self.settings[step])
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/preprocessor/__init__.py", line 297, in preprocess
    result.append(_run_preproc_function(function, item, settings))
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/preprocessor/__init__.py", line 280, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/preprocessor/_area.py", line 537, in extract_shape
    return _mask_cube(cube, selections)
  File "/mnt/lustre02/work/bd0854/b380103/ESMValCore/esmvalcore/preprocessor/_area.py", line 549, in _mask_cube
    return fix_coordinate_ordering(cubelist.merge_cube())
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/iris/cube.py", line 405, in merge_cube
    (merged_cube,) = proto_cube.merge()
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/iris/_merge.py", line 1325, in merge
    merged_cube = self._get_cube(merged_data)
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/iris/_merge.py", line 1605, in _get_cube
    cube = iris.cube.Cube(
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/iris/cube.py", line 902, in __init__
    self.add_cell_measure(cell_measure, dims)
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/iris/cube.py", line 1132, in add_cell_measure
    data_dims = self._check_multi_dim_metadata(cell_measure, data_dims)
  File "/mnt/lustre02/work/bd0854/b380103/miniconda3/envs/esm22/lib/python3.9/site-packages/iris/cube.py", line 1065, in _check_multi_dim_metadata
    raise ValueError(
ValueError: Unequal lengths. Cube dimension 0 => 9; metadata 'cell_area' dimension 0 => 780.

@jvegasbsc Could you please take a look? Are there other things that still need to be done / addressed before merging? It would be good to get things moving soon so we can meet our CMUG deliverables.
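
The first traceback above can be reproduced in isolation. The sketch below is not ESMValCore code: `add_mip_placeholder` is a hypothetical stand-in that only mimics the `fx_info.update({'mip': None})` call visible in the traceback. It illustrates why a plain string value under `fx_variables` fails with `AttributeError`, while a dict value does not. How ESMValCore handles the list form `[areacello]` internally is not shown here.

```python
def add_mip_placeholder(fx_variables):
    """Hypothetical stand-in for the per-variable update step in the traceback."""
    for fx_name, fx_info in fx_variables.items():
        # Same call as in the traceback; it only works if fx_info is a dict.
        fx_info.update({'mip': None})
    return fx_variables


# Mapping the fx variable to a dict of options works.
ok = add_mip_placeholder({'areacello': {}})

# Mapping it to a plain string, as in the original recipe, does not.
try:
    add_mip_placeholder({'areacello': 'areacello'})
except AttributeError as err:
    message = str(err)
```

Here `message` holds the same text as the last line of the first traceback.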

@axel-lauer
Contributor

The preprocessor defined in recipe_sea_surface_salinity.yml creates time series by extracting some given shapes and calculating area means. Each of these two operations works individually but not when combined. As the diagnostic was working before, the error is probably related to changes in the preprocessor that have been introduced during the last year. I could not figure out what might go wrong. @valeriupredoi could you maybe take a look? We need to get this merged soon to meet our CMUG project deliverables and I would greatly appreciate some help as @jvegasbsc does not seem to be around any more...

@valeriupredoi
Contributor

hi @axel-lauer could you please merge the latest main (and sort out the possible conflicts)? I'll have a look afterwards; if it's a specific run issue, then please post the log file here 🍺

@axel-lauer
Contributor

Thanks @valeriupredoi ! I just merged the latest main into this branch. Here is the log file from running the recipe: main_log_debug.txt

@valeriupredoi
Contributor

cheers @axel-lauer - am having a looksee now 👍

@valeriupredoi
Contributor

valeriupredoi commented Nov 16, 2021

@axel-lauer here's what I found:

  • the test is failing because the private function _get_overlap() is no longer in _multimodel.py (it has not been there for a long while now, I believe since v2.2) - the functionality of that missing function needs to be implemented in the diag
  • the issue with the shapefile uncovers a bug in the extract_shape function: the mask builder is trying to merge a CubeList that, in your case, is made up of the actual time-variant sea surface salinity variable and a number of cell measures, see below:
0: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
1: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
2: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
3: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
4: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
5: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
6: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
7: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
8: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)
9: sea_surface_salinity / (0.001)      (time: 780; cell index along second dimension: 404; cell index along first dimension: 802)

-> the reason why there are 9 is that there are 9 regions to be selected and merged into one big map, but the problem is that they are not time-dependent, which suggests to me that those are the cell measure cubes rather than the actual masked data cube. Here is the issue: ESMValGroup/ESMValCore#1394
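
One possible reading of the `ValueError` in the second traceback can be illustrated with shapes alone. The sketch below is a toy model, not iris or ESMValCore code: `check_metadata_dims` is a hypothetical helper, and it assumes that the `cell_area` measure has the same shape as the data and that its dimension mapping is not shifted when the per-region cubes are merged along a new leading axis. Neither assumption is confirmed in this thread.

```python
def check_metadata_dims(cube_shape, meta_shape, meta_dims):
    """Toy dimension-length check between a data array and attached metadata."""
    for meta_axis, cube_axis in enumerate(meta_dims):
        if cube_shape[cube_axis] != meta_shape[meta_axis]:
            raise ValueError(
                f"Unequal lengths. Cube dimension {cube_axis} => "
                f"{cube_shape[cube_axis]}; metadata dimension {meta_axis} => "
                f"{meta_shape[meta_axis]}."
            )


data_shape = (780, 404, 802)      # (time, j, i), taken from the listing above
measure_shape = (780, 404, 802)   # assumption: cell_area matches the data shape
n_regions = 9

# Before merging, the measure lines up with data axes 0, 1, 2.
check_metadata_dims(data_shape, measure_shape, (0, 1, 2))

# Merging one cube per region adds a new leading axis of length 9.
merged_shape = (n_regions,) + data_shape

# A mapping that still points at axes 0, 1, 2 now compares 9 with 780.
try:
    check_metadata_dims(merged_shape, measure_shape, (0, 1, 2))
except ValueError as err:
    stale_error = str(err)

# Shifting the mapping by one axis makes the lengths agree again.
check_metadata_dims(merged_shape, measure_shape, (1, 2, 3))
```

The actual fix is the one in ESMValGroup/ESMValCore#1403 and may work differently.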

@axel-lauer
Copy link
Contributor

Big thanks to @sloosvel for opening PR ESMValGroup/ESMValCore#1403 This seems to fix the problem with extract_shape, i.e. the preprocessor finishes now successfully! That's really cool!
Now the actual diagnostic sea_surface_salinity/compare_salinity.py crashes because the import of _get_overlap and _slice_cube from esmvalcore.preprocessor._multimodel fails. What happened to those functions and what can be used to replace them?
Once this is solved, I can compare the original output and the output with the fix from @sloosvel. If that still matches, green light for this PR and for ESMValGroup/ESMValCore#1403!
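
As a rough idea of what a replacement inside the diagnostic might do, here is a minimal sketch that finds the period common to two time axes and trims a series to it. The names `find_overlap` and `slice_to_period` are hypothetical, time points are plain numbers (for example days since a common reference date), and nothing here claims to match the signature or behaviour of the removed `_get_overlap` and `_slice_cube`.

```python
def find_overlap(reference_times, dataset_times):
    """Return (start, end) of the period covered by both series, or None."""
    start = max(min(reference_times), min(dataset_times))
    end = min(max(reference_times), max(dataset_times))
    if start > end:
        return None  # the two series do not overlap
    return start, end


def slice_to_period(times, values, period):
    """Keep only the (time, value) pairs that fall inside the period."""
    start, end = period
    return [(t, v) for t, v in zip(times, values) if start <= t <= end]


period = find_overlap([0, 1, 2, 3], [2, 3, 4])
trimmed = slice_to_period([0, 1, 2, 3], [10, 11, 12, 13], period)
```

With real cubes the same idea would be applied to the time coordinate, after both datasets share the same time units and calendar.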

@axel-lauer axel-lauer self-requested a review December 6, 2021 13:30
Contributor

@axel-lauer axel-lauer left a comment

The script and the cmorizer run fine now and the results look as expected. This is now ready to be merged.

@valeriupredoi
Contributor

cheers @axel-lauer 🍺 Lemme have a looksee in a very short bit (10min) and will approve and merge 👍

Contributor

@valeriupredoi valeriupredoi left a comment

Looking good! A few comments from me; at the very least, please fix the typos 🍺

reference_dataset = variables.pop(ref_alias)[0]
reference = iris.load_cube(reference_dataset[n.FILENAME])
reference_ancestor = reference_dataset[n.FILENAME]
logger.debug(reference)
Contributor

please add a verbose message to the debug log, not just a cube

Contributor

Added more verbose message...
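
A more descriptive debug message along the lines requested in this thread could look like the sketch below. It uses only the standard `logging` module; the logger name and the helper `log_loaded` are illustrative assumptions, not the code that was actually committed.

```python
import logging

logger = logging.getLogger("compare_salinity_sketch")


def log_loaded(role, filename, cube):
    """Log which file was loaded and for what purpose, then the cube summary."""
    # Lazy %-style arguments avoid building the string when DEBUG is disabled.
    logger.debug("Loaded %s dataset from %s:\n%s", role, filename, cube)
```

In the diagnostic this would be called as, for example, `log_loaded("reference", reference_dataset[n.FILENAME], reference)`.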

time_coord.units.name,
calendar='gregorian',
)
unify_time_units((reference, dataset))
Contributor

will this always succeed? Probably best to catch an error with a try/except statement

Contributor

Not sure why @jvegasbsc implemented this. I would leave it as is since it seems to be working fine...

Contributor

sure, not a biggie
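
For reference, the try/except suggested in this thread might be sketched as follows. The wrapper `unify_or_warn` is hypothetical. The unifying function is passed in as an argument so that the sketch runs without iris; in the diagnostic it would be `unify_time_units`. Catching `ValueError` is an assumption, since the thread does not say which exceptions can occur.

```python
import logging

logger = logging.getLogger("compare_salinity_sketch")


def unify_or_warn(cubes, unify):
    """Call unify(cubes); log a warning and return False if it fails."""
    try:
        unify(cubes)
    except ValueError as err:  # assumption: the failure mode is a ValueError
        logger.warning("Could not unify time units: %s", err)
        return False
    return True
```

Returning a flag leaves it to the caller to decide whether to skip the dataset or stop the diagnostic.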

calendar='gregorian',
)
unify_time_units((reference, dataset))
logger.debug(dataset)
Contributor

here too please add a user-readable message

Contributor

Added more verbose message...

'caption': caption,
'domains': ['global', ],
'authors': ['vegas-regidor_javier'],
'references': ['acknow_author'],
Contributor

this needs a value

Contributor

I believe there is none. @jvegasbsc is that right or would you have a suggestion?



if __name__ == "__main__":
    main()
Contributor

OK - I am a bit confused why you've created a couple of very hefty objects instead of just using functions - it's not like the diagnostic is a module that gets used many times

Contributor

Not sure, but I would guess @jvegasbsc had a good reason...

Contributor

@valeriupredoi valeriupredoi left a comment

cheers @axel-lauer 🍺 It still needs a bit of polishing but I believe you guys need this in asap, so I'm not gonna be the proverbial cork in the bottle, especially since the bits that are still needed are minor 😁

@valeriupredoi
Contributor

looks good, merge at will @axel-lauer 👍

@axel-lauer
Contributor

Big thank you @valeriupredoi for your quick help!! I owe you a 🍺 next time we can meet in person!

@axel-lauer axel-lauer merged commit 05f2241 into main Dec 7, 2021
@axel-lauer axel-lauer deleted the cmug_sos branch December 7, 2021 08:32
@valeriupredoi
Contributor

@axel-lauer glad I could help! I won't say no to the 🍺 tho 😁

4 participants