
Speed up functions that use time dimension #1713

Merged (7 commits, Oct 5, 2022)

Conversation

@bouweandela (Member) commented Sep 3, 2022

Description

This speeds up various time-related functions. For recipes with many datasets and/or many time points, the speedup can be considerable.


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@bouweandela (Member Author)

@valeriupredoi Could you help me out with the failing test tests/integration/preprocessor/_io/test_concatenate.py::TestConcatenate::test_concatenate_with_iris_exception? I do not really understand why concatenation should raise an exception in that case and why it does not do so anymore with these changes.

@valeriupredoi (Contributor)

sure thing @bouweandela - I'll have a looksee later today - cheers for starting this! 🍺

@valeriupredoi (Contributor) commented Sep 5, 2022

OK @bouweandela - I have a diagnosis for that issue - boy, that wasn't straightforward, not even a bit 😁 - the problem comes from our _io module: in your implementation here, the concatenation at _io.py line 508 succeeds because:

Attempting concatenatenation of %s with %s Cube1:  sample / (unknown)                  (time: 1)
    Dimension coordinates:
        time                             x 
 Cube2:  sample / (unknown)                  (time: 3)
    Dimension coordinates:
        time                             x

so it's two normal vector cubes that get concatenated correctly;

The same bit of code fails on main because the first cube is scalar:

Attempting concatenatenation of %s with %s Cube1:  sample / (unknown)                  (scalar cube)
    Scalar coordinates:
        time                        1950-01-02 00:00:00 
 Cube2:  sample / (unknown)                  (time: 3)
    Dimension coordinates:
        time

and the comparison in iris' _concatenate._CubeSignature.match() method fails:

THIS metadata []
OTHER metadata [_CoordMetaData(defn=DimCoordMetadata(standard_name='time', long_name=None, var_name='time', units=Unit('days since 1950-01-01', calendar='gregorian'), attributes={}, coord_system=None, climatological=False, circular=False), dims=(0,), points_dtype=dtype('float64'), bounds_dtype=None, kwargs={'scalar': False, 'circular': False, 'order': 1})]

where THIS metadata is self.dim_metadata of the scalar cube, which is empty because the iris folk don't assign dim metadata to scalar cubes. I think this is a combination of bugs. To start with, in our concatenation mechanism Cube1 is a single-point cube, and that code was written way before scalar cubes became a thing, so a cube with a single point automatically gets translated to a scalar cube. Then there is the bug in the actual iris concatenation - or, better put, a missing feature: they should check for one of the cubes being scalar and raise the alarm way before any concatenation is attempted. Alas, I don't understand why your branch here preserves that single-point cube as a vector cube - any ideas? I'll try to patch our _io with a check whether the cube is scalar and convert it to a vector cube, but someone should really tell iris not to mix scalar and vector cubes in their concatenation process

@bouweandela (Member Author) commented Sep 5, 2022

I don't understand why your branch here preserves that single point cube as a vector cube?

I suspect that is the difference between selecting a range vs selecting an element from the cube i.e. cube[10:11] vs cube[10].

I see that the concatenate function uses extract_time. In the very similar function clip_timerange, there was a bugfix recently (#1497) which seems to add the time dimension back on.

However, coming back to this specific test, why should concatenation fail here? Wouldn't it be fine to just concatenate those two example cubes?
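As an aside, the cube[10:11] vs cube[10] difference mirrors plain NumPy indexing, which iris cube slicing follows: an integer index drops the axis (the iris analogue is a scalar cube), while a length-one slice keeps it. A minimal numpy-only illustration:

```python
import numpy as np

a = np.arange(20.0)

# Integer indexing drops the axis; in iris this yields a scalar cube
# whose 'time' becomes a scalar coordinate.
point = a[10]
print(point.ndim)    # 0

# A length-one slice keeps the axis; the analogous cube keeps a time
# dimension of length 1 and can still be concatenated.
window = a[10:11]
print(window.shape)  # (1,)
```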

@valeriupredoi (Contributor)

the basic iris concatenate_cube() fails in that test (with two normal vector cubes, as defined in the test case), but our super duper _io.concatenate should not fail. I too am wondering why the test was set to fail - it was written before the scalar cubes debacle (yes, I really don't like that concept from the iris folk!), so that try call at line 508 should have gone through no problem. Cheers for the pointer to that fix, I'll fix it along those lines. BTW, do you know how to convert a scalar cube to a vector cube without adding data to it?

@codecov bot commented Sep 6, 2022

Codecov Report

Merging #1713 (2933673) into main (62b5c30) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1713      +/-   ##
==========================================
+ Coverage   91.47%   91.53%   +0.06%     
==========================================
  Files         206      206              
  Lines       11204    11205       +1     
==========================================
+ Hits        10249    10257       +8     
+ Misses        955      948       -7     
Impacted Files Coverage Δ
esmvalcore/cmor/check.py 97.76% <100.00%> (+0.01%) ⬆️
esmvalcore/preprocessor/_time.py 97.68% <100.00%> (+2.02%) ⬆️
esmvalcore/preprocessor/_io.py 84.67% <0.00%> (-0.41%) ⬇️


@bouweandela bouweandela marked this pull request as ready for review September 6, 2022 13:58
@bouweandela (Member Author)

I also fixed some code quality issues reported by prospector in the files I was editing.

@bouweandela bouweandela added the enhancement New feature or request label Sep 6, 2022
@valeriupredoi (Contributor) left a comment

very nice and straightforward PR, bud! One question though - having recently been bitten by iris' scalar cubes and their lack of properties and methods - do we risk returning scalar cubes anywhere here? I wonder if cube.subset() does indeed always return normal (vector) cubes

@schlunma (Contributor) left a comment

Nice!

Code looks good. I just tested this with the clip_timerange and extract_time preprocessors (I don't have a recipe to quickly test the others) and both worked just as expected. I just have one comment on the error message.

It also looks like subset never returns a scalar cube (tested this with a timerange of one point) 👍

esmvalcore/preprocessor/_time.py (review thread, outdated, resolved)
@bouweandela (Member Author) commented Sep 6, 2022

Here are some timings:

In [1]: import iris

In [2]: import esmvalcore.preprocessor

In [3]: from esmvalcore.cmor.check import CheckLevels

In [4]: cube = iris.load_cube("/home/bandela/climate_data/cmip5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r6i1p1/v20160829/pr_Amon_CCSM4_rcp85_r6i1p1_200601-210012.nc")

In [5]: cube.coord('time').shape
Out[5]: (1140,)

In [6]: %timeit esmvalcore.preprocessor.cmor_check_metadata(cube, cmor_table='CMIP5', mip='Amon', short_name='pr', frequency='mon', check_level=CheckLevels.IGNORE)
2.67 ms ± 322 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [7]: %timeit esmvalcore.preprocessor.fix_metadata([cube], project='CMIP5', mip='Amon', short_name='pr', dataset='CCSM4', frequency='mon', check_levels=CheckLevels.IGNORE)
5.77 ms ± 205 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [8]: %timeit esmvalcore.preprocessor.clip_timerange(cube, '2040/2060')
5.01 ms ± 494 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [9]: %timeit esmvalcore.preprocessor.extract_time(cube, start_year=2040, end_year=2060, start_month=1, end_month=1, start_day=1, end_day=1)
5.27 ms ± 761 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [10]: %timeit esmvalcore.preprocessor.resample_time(cube, month=1)
4.33 ms ± 374 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Timings with main branch:

In [1]: import iris

In [2]: import esmvalcore.preprocessor

In [3]: from esmvalcore.cmor.check import CheckLevels

In [4]: cube = iris.load_cube("/home/bandela/climate_data/cmip5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r6i1p1/v20160829/pr_Amon_CCSM4_rcp85_r6i1p1_200601-210012.nc")

In [5]: %timeit esmvalcore.preprocessor.cmor_check_metadata(cube, cmor_table='CMIP5', mip='Amon', short_name='pr', frequency='mon', check_level=CheckLevels.IGNORE)
662 ms ± 42.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit esmvalcore.preprocessor.fix_metadata([cube], project='CMIP5', mip='Amon', short_name='pr', dataset='CCSM4', frequency='mon', check_level=CheckLevels.IGNORE)
676 ms ± 19.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit esmvalcore.preprocessor.clip_timerange(cube, '2040/2060')
339 ms ± 22.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]: %timeit esmvalcore.preprocessor.extract_time(cube, start_year=2040, end_year=2060, start_month=1, end_month=1, start_day=1, end_day=1)
357 ms ± 21.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [9]: %timeit esmvalcore.preprocessor.resample_time(cube, month=1)
328 ms ± 7.03 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Because the first 3 functions are always run for every dataset, this will save about 1.7 seconds per dataset load, so if you're loading 100 datasets this will save close to 3 minutes of runtime. That is, assuming the dataset is similar to the example (about 1000 time points) and your computer is similar to my 5-year-old laptop.

@bouweandela (Member Author)

Note that I had to replace Cube.subset with a custom slicing function, because it was unfortunately just as slow as the previous implementation.
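The gist of such a custom slicing function is to build one boolean mask over the time points and index the data with it directly, rather than having Cube.subset compare coordinate cells one at a time. A numpy-only sketch of the idea (names and details are illustrative, not the actual implementation):

```python
import numpy as np

def select_timeslice(data, select, time_dim=0):
    """Return the part of `data` where the boolean mask `select` is
    True along `time_dim`, or None if nothing is selected."""
    if not select.any():
        return None
    # Apply the mask only on the time axis; take the other axes whole.
    indexer = tuple(select if i == time_dim else slice(None)
                    for i in range(data.ndim))
    return data[indexer]

data = np.arange(24.0).reshape(6, 4)  # (time, space)
time_points = np.arange(6.0)          # e.g. days since a reference date
select = (time_points >= 2) & (time_points < 5)
subset = select_timeslice(data, select)
print(subset.shape)  # (3, 4)
```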

@schlunma (Contributor) left a comment

Impressive speedup! The error message looks fine now. I have two small comments on scalar time coordinates.

I also wonder why this speedup is so large (vs. extract and also vs. subset). Would it make sense to move this kind of change to iris? Your function looks like it would do exactly what subset does.

select = np.zeros(len(dates), dtype=bool)
for hour in hours:
    select |= dates == hour
cube = _select_timeslice(cube, select)
Contributor:
I think _select_timeslice fails if time is a scalar coordinate due to this line:

time_dim = cube.coord_dims(coord)[0]

Do we need to consider this here? In _extract_datetime you considered this case explicitly.

Member Author:

I can make it a bit nicer still, but using this function on a scalar cube really doesn't make sense: at best it will return the original cube.

Contributor:

I am not saying we need to support this, but it would probably be nice to check if the coordinate is scalar and return a nice error message. Otherwise the line time_dim = cube.coord_dims(coord)[0] will probably produce an ugly IndexError.

dates = time.units.num2date(time.points)
requested = PartialDateTime(month=month, day=day, hour=hour)
select = dates == requested
cube = _select_timeslice(cube, select)
Contributor:
Same comment about the scalar time coordinate.

@valeriupredoi (Contributor)

totally with @schlunma on this one - a factor-of-100 speed improvement is no joke, and it's consistent - that shows there is a serious slowdown in what was there before, and if it is indeed cube_select then that badly needs a fix, which you may have just found, and should probably be reported to the iris folk. Is memory consumption affected by your speedup in any way?

@valeriupredoi (Contributor)

another interesting experiment would be to repeat the same reasoning you introduce here and construct a slicer that doesn't only work along the time axis, but along a generic dim, and see how that compares to cube_select - maybe they've optimized that function to perform just OK along any dim, so it's not exactly a Ferrari along any of the dims, just a VW Polo, but it does the job for everything

@bouweandela (Member Author)

I also wonder why this speedup is so large (vs. extract and also vs. subset). Would it make sense to move this kind of change to iris? Your function looks like it would do exactly what subset does.

Good suggestion! I looked into the iris code a bit to find out and opened an issue to ask if this is something that could be changed: SciTools/iris#4957.

@schlunma (Contributor)

Awesome, thanks @bouweandela!

From your issue it also looks like only time coordinates are affected (through a suboptimal use of num2date), so I don't think we need to implement changes for any other extraction preprocessor.
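The num2date pattern referred to above is expensive because it builds a Python datetime object for every time point just to compare against a requested range; the cheap direction is to convert the request to numeric time once (e.g. with cf_units.Unit.date2num) and compare plain float arrays. A numeric-only sketch, with the converted bounds hard-coded for illustration:

```python
import numpy as np

# Raw time points as stored in the file, e.g. 'days since 1950-01-01'.
points = np.arange(1140.0)

# Slow pattern: time.units.num2date(points) creates ~1140 datetime
# objects and compares each one with the requested datetimes.

# Fast pattern: convert only the two requested datetimes to numbers
# (one date2num call each) and compare plain float arrays.
start_num, end_num = 100.0, 200.0  # illustrative converted bounds
select = (points >= start_num) & (points <= end_num)
print(select.sum())  # 101 points selected
```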

@sloosvel (Contributor)

Code looks good. I just tested this with the clip_timerange and extract_time preprocessors (I don't have a recipe to quickly test the others) and both worked just as expected.

I can work on testing the other preprocessors then!

@sloosvel (Contributor) left a comment

I tested the generation of bounds in _check and preprocessors resample_time and resample_hours and got the same results as when using main.

Regarding @schlunma's comments about possible issues in these two preprocessors, I think it's quite unlikely that anybody would use these preprocessors on cubes with scalar time coordinates, but if you have time to just return the cube with a message that the time coordinate was not resampled (instead of an error?), then go for it.

I would say the code looks good as well and improvements in execution time are always nice to have.

esmvalcore/preprocessor/_time.py (review thread, resolved)
tests/unit/preprocessor/_time/test_time.py (review thread, resolved)
@bouweandela bouweandela mentioned this pull request Sep 23, 2022
@valeriupredoi (Contributor)

this is a nice inclusion for 2.7 - since the freeze deadline is today, I'd very much argue to have it merged; if there are no major objections, I will merge it by the end of the day today and propose it for the Highlights section too. Please yell here otherwise, and make it fast, as Captain Picard says 👍

@valeriupredoi valeriupredoi added this to the v2.7.0 milestone Oct 3, 2022
@valeriupredoi (Contributor)

@schlunma you OK to have this merged, bud? 🍺

@schlunma (Contributor) commented Oct 4, 2022

This should definitely go into v2.7, but I still think that my comment above (nicer error message in case a scalar time coordinate is present) should be addressed.

@valeriupredoi (Contributor)

cheers, Manu! The floor is yours, @bouweandela 😁

@bouweandela (Member Author) commented Oct 4, 2022

Let me run a few tests with the updates I proposed to iris (SciTools/iris#4955, SciTools/iris#4969) to see if this is still relevant. If yes, I will address the review comments and we can get this included.

@bouweandela (Member Author)

Here are the new timings (i.e. what can be expected with the next release of iris):

In [1]: import iris
   ...: import esmvalcore.preprocessor
   ...: from esmvalcore.cmor.check import CheckLevels
   ...: cube = iris.load_cube("/home/bandela/climate_data/cmip5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r6i1p1/v20160829/pr_Amon_CCSM4_rcp85_r6i1p1_200601-210012.nc")

In [2]: %timeit esmvalcore.preprocessor.cmor_check_metadata(cube, cmor_table='CMIP5', mip='Amon', short_name='pr', frequency='mon', check_level=CheckLevels.IGNORE)
2.32 ms ± 67.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: %timeit esmvalcore.preprocessor.fix_metadata([cube], project='CMIP5', mip='Amon', short_name='pr', dataset='CCSM4', frequency='mon', check_level=CheckLevels.IGNORE)
4.88 ms ± 77.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit esmvalcore.preprocessor.clip_timerange(cube, '2040/2060')
4.24 ms ± 63.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: %timeit esmvalcore.preprocessor.extract_time(cube, start_year=2040, end_year=2060, start_month=1, end_month=1, start_day=1, end_day=1)
4.2 ms ± 43.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: %timeit esmvalcore.preprocessor.resample_time(cube, month=1)
3.8 ms ± 164 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

With the main branch:

In [1]: import iris
   ...: import esmvalcore.preprocessor
   ...: from esmvalcore.cmor.check import CheckLevels
   ...: cube = iris.load_cube("/home/bandela/climate_data/cmip5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r6i1p1/v20160829/pr_Amon_CCSM4_rcp85_r6i1p1_200601-210012.nc")

In [2]: %timeit esmvalcore.preprocessor.cmor_check_metadata(cube, cmor_table='CMIP5', mip='Amon', short_name='pr', frequency='mon', check_level=CheckLevels.IGNORE)
622 ms ± 51 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]:  %timeit esmvalcore.preprocessor.fix_metadata([cube], project='CMIP5', mip='Amon', short_name='pr', dataset='CCSM4', frequency='mon', check_level=CheckLevels.IGNORE)
603 ms ± 39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit esmvalcore.preprocessor.clip_timerange(cube, '2040/2060')
8.88 ms ± 147 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: %timeit esmvalcore.preprocessor.extract_time(cube, start_year=2040, end_year=2060, start_month=1, end_month=1, start_day=1, end_day=1)
9.23 ms ± 434 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [6]: %timeit esmvalcore.preprocessor.resample_time(cube, month=1)
7.83 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

so it looks like the changes here still offer a considerable speedup, though it's mostly the CMOR checker/fixer that benefits. For the other functions there is a factor-of-2 speedup, but they didn't take that long to begin with, so overall program runtime is unlikely to benefit much.

@valeriupredoi (Contributor)

holy caboose! That's mega, @bouweandela - one thing we need to work on is how we showcase these results - big bold numbers with 👀 all over 😁 Very cool work! OK, so let's get this one in, and then we should talk to the iris folk to see if they want to do something about it in their code. Could I be a bother and ask you to format a few of those results in a nice table so I can link it from the Highlights bit of the release notes? Something that jumps out rather than being laden with a lot of technical parameters 🍺 What say you about Manu's asking for a nicer error message?

@bouweandela (Member Author)

What say you about Manu's asking for a nicer error message?

I think it's a very good idea, but I'm afraid I'm out of time for today and tomorrow is my day off. I would also be fine with postponing this until the next release.

@valeriupredoi (Contributor)

au contraire, mon chum - I say we merge this now even if @schlunma will not be completely happy with that error message, and I'll swiftly open an issue documenting the need for an improved error message. It's this or we have the conda-lock bot PRs as the highlight 😁

@valeriupredoi (Contributor)

a friendly ping, @schlunma - what say you about this proposition: #1713 (comment) 🍺

@schlunma (Contributor) left a comment

Yeah, sounds good to me! I hope this will not become one of the many issues that will never be solved 😅

@valeriupredoi (Contributor)

wonderful, many thanks, Manu! 🍺

@valeriupredoi (Contributor) left a comment

da RM approves too 👑
