Longer run time of recipes with 3D regridding in ESMValTool v2.10.0 compared to v2.5.0 #3590

Closed
k-a-webb opened this issue May 10, 2024 · 3 comments · Fixed by ESMValGroup/ESMValCore#2418

For a single CanESM5 dataset of 30 years, it takes ~4min wall time to run a simple recipe in ESMValTool v2.5.0, but >1h in v2.10.0 and v2.11 (main branch). There is a significant increase in run time in the regridding step.

The test recipe involves the following preprocessors (and no diagnostic scripts):

preprocessors:
  time_ocean_zonal_mean:
    custom_order: true
    climate_statistics:
      operator: mean
      period: full
    extract_levels:
      levels: [ 0,  10, 20, 50, 100, 200, 300, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250, 2500, 2750, 3000, 3250, 3500, 3750, 4000, 4250, 4500, 4750, 5000, 5250, 5500, 5750]
      scheme: linear_extrapolate
      coordinate: depth
    regrid:
      target_grid: 1x1
      scheme: nearest
    zonal_statistics:
      operator: mean

In this case, the regridding is done via esmpy_regrid (an alias of _regrid_esmpy.regrid), because it passes the _attempt_irregular_regridding check.

Both steps take longer: building the regridder (regridder = build_regridder(src_rep, dst_rep, method)) and regridding the data (res = map_slices(src, regridder, src_rep, dst_rep)).

main_log_debug.txt files for the various runs of the same recipe in different environments, each with a different installation of ESMValTool:
main_log_debug-esmvaltool.txt
main_log_debug--ESMValToolv2.5.0.txt
main_log_debug-EVTmaindev.txt (same as main_log_debug-EVTmain.txt, as expected)

(Note: ESMValTool install main branch fails with missing author message -- despite inclusion of authors.)


Installation details:

ESMValTool v2.10.0 was installed via

mamba create --name esmvaltool -c conda-forge esmvaltool
conda activate esmvaltool
ESMValCore: 2.10.0
ESMValTool: 2.10.0

ESMValTool main branch installed via

mamba create --name a4d_env_EVTmain

conda activate a4d_env_EVTmain

cd ~/code/esmvaltool/
git clone https://github.com/ESMValGroup/ESMValTool.git -b main  ESMValTool_main
cd ESMValTool_main
mamba env update --file environment.yml -n a4d_env_EVTmain
> esmvaltool version
Running esmvaltool executable from ESMValCore. No other command line utilities are available until ESMValTool is installed.
ESMValCore: 2.10.0

as well as the development version,


mamba create --name a4d_env_EVTmaindev

conda activate a4d_env_EVTmaindev

cd ~/code/esmvaltool/
cd ESMValTool_main
mamba env update --file environment.yml -n a4d_env_EVTmaindev
pip install --editable '.[develop]'

cd ~/code/esmvalcore/
git clone https://github.com/ESMValGroup/ESMValCore.git -b main  ESMValCore_main
cd ESMValCore_main
mamba env update --file environment.yml -n a4d_env_EVTmain
pip install --editable '.[develop]'
> esmvaltool version
/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_EVTmaindev/lib/python3.11/site-packages/pyproj/__init__.py:89: UserWarning: pyproj unable to set database path.
  _pyproj_global_context_initialize()
ESMValCore: 2.11.0.dev100+ga782af8e3.d20240510
ESMValTool: 2.11.0.dev72+gcb582bd01.d20240510

Note: To install ESMValTool v2.5.0, the following modifications to the install instructions were required:

  • specify a Python version compatible with the esmpy/ESMF v8.2.0 needed for ESMValTool v2.5.0 (I used Python 3.9.7)
  • in environment.yml, pin esmpy to 8.2.0 and esmvalcore to 2.5.0 (no specific versions were originally specified)
mamba create --name a4d_env_EVTv2.5r python==3.9.7
conda activate a4d_env_EVTv2.5r

cd ~/code/esmvaltool/
git clone https://github.com/ESMValGroup/ESMValTool.git -b v2.5.0  ESMValTool_v2.5.0
cd ESMValTool_v2.5.0
nano environment.yml # esmpy==8.2.0, esmvalcore==2.5.0
mamba env update --file environment.yml -n a4d_env_EVTv2.5r
ESMValCore: 2.5.0
ESMValTool: 2.5.0

Following the basic instructions for installing ESMValTool without the above modifications led to package version issues with both shapely and esmpy/ESMF.


Environment listings (conda list > environment_<env>.yml) are also attached.
environment__EVTmain.txt
environment__esmvaltool.txt
environment__EVTv2.5r.txt
environment__EVTmaindev.txt

@k-a-webb

@malininae

@bouweandela

bouweandela commented May 14, 2024

Thanks for reporting the issue! This happens because the climate_statistics and extract_levels preprocessor functions are now lazy, but the ESMPy-based regridding preprocessor is not. As a result, the regridding step was loading the data from disk and recomputing its own input multiple times. It should be fixed by ESMValGroup/ESMValCore#2418.
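
To illustrate the mechanism described above, here is a toy sketch in plain Python. It does not use dask, iris, or ESMValCore: `LazyArray`, `load_from_disk`, and `eager_regrid_slice` are hypothetical stand-ins invented for this example. It shows how an eager step that pulls from a lazy input once per slice re-runs the whole upstream chain each time, and how realising the input once avoids that. This is one common remedy and not necessarily what ESMValGroup/ESMValCore#2418 does.

```python
class LazyArray:
    """Toy stand-in for a lazy array: nothing runs until compute() is called."""

    def __init__(self, func, *parents):
        self.func = func
        self.parents = parents
        self.n_computes = 0  # how often this node has been evaluated

    def compute(self):
        # No caching: every call re-evaluates all upstream nodes.
        self.n_computes += 1
        inputs = [parent.compute() for parent in self.parents]
        return self.func(*inputs)


def load_from_disk():
    # Hypothetical "file read": 3 depth levels x 2 grid points.
    return [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]


def eager_regrid_slice(values):
    # Hypothetical eager per-slice operation (a stand-in for regridding).
    return sum(values) / len(values)


source = LazyArray(load_from_disk)
# Lazy upstream step (a stand-in for climate_statistics / extract_levels).
upstream = LazyArray(lambda rows: [[2 * v for v in row] for row in rows], source)
n_levels = 3

# Pattern 1: the eager step asks the lazy input for its data once per slice,
# so the "file" is read and the upstream step recomputed for every level.
slow = [eager_regrid_slice(upstream.compute()[k]) for k in range(n_levels)]
loads_without_realising = source.n_computes

# Pattern 2: realise the lazy input once, then slice the in-memory result.
source.n_computes = 0
realised = upstream.compute()
fast = [eager_regrid_slice(realised[k]) for k in range(n_levels)]
loads_after_realising = source.n_computes

print(loads_without_realising, loads_after_realising)
```

Both patterns give the same result, but the first reads the source once per level while the second reads it once in total. With 29 target levels and 30 years of 3D ocean data, that kind of repeated work could plausibly account for a large slowdown.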

(Note: ESMValTool install main branch fails with missing author message -- despite inclusion of authors.)

It looks like you did not install ESMValTool, but only created the conda environment with its dependencies. If you run pip install -e . in the directory where you checked out ESMValTool it should work as expected.

@k-a-webb

Excellent! Thanks for sorting this out :D
