Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance #24

Merged
merged 3 commits into from
Sep 12, 2023
Merged

Performance #24

merged 3 commits into from
Sep 12, 2023

Conversation

taldcroft
Copy link
Member

@taldcroft taldcroft commented Jan 2, 2023

Description

This is part of a set of performance changes to kadi, chandra_aca, ska_sun, Quaternion, and Chandra.Maneuver to make kadi command states faster. The changes are independent.

This PR includes changes to:

  • Use numba - including a numba-ized version of the radec2eci from Ska.quatutil
  • Add support for sun_ra, sun_dec kwargs to the pitch and off_nominal_roll functions to allow upstream applications (with values for those) to run faster.
  • Use numba caching

Interface impacts

The numba cache uses the NUMBA_CACHE_DIR already configured in ska after sot/ska_helpers#29 .
The pitch and off_nominal_roll functions now support sun_ra and sun_dec kwargs.

Testing

Unit tests

Independent check of unit tests by Jean

  • Linux

Functional tests

Also checking numba caching behavior in ska3-flight 2023.3. This demonstrates the creation of numba cache files in the expected user directory.

(ska3) ➜  ska_sun git:(performance) rm -rf ~/.ska3             
(ska3) ➜  ska_sun git:(performance) pytest ska_sun
============================= test session starts ==============================
platform darwin -- Python 3.10.8, pytest-7.2.1, pluggy-1.0.0
rootdir: /Users/aldcroft/git, configfile: pytest.ini
plugins: timeout-2.1.0, anyio-3.6.2
collected 12 items                                                             

ska_sun/tests/test_sun.py ............                                   [100%]

============================== 12 passed in 6.95s ==============================
(ska3) ➜  ska_sun git:(performance) find ~/.ska3     
/Users/aldcroft/.ska3
/Users/aldcroft/.ska3/cache
/Users/aldcroft/.ska3/cache/numba
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun.position_at_jd-135.py310.1.nbc
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun._nominal_roll-307.py310.nbi
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun.sph_dist-213.py310.nbi
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun._radec2eci-266.py310.nbi
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun._nominal_roll-307.py310.1.nbc
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun.position_at_jd-135.py310.nbi
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun._radec2eci-266.py310.1.nbc
/Users/aldcroft/.ska3/cache/numba/ska_sun_28343af752654d47d93a0d0bcdf944cc27778384/sun.sph_dist-213.py310.1.nbc

With regard to performance, on linux (HEAD fido) a comparison with ska_sun 3.10.1 shows the expected result that the numba caching can be slower on the first run of a function (compared to flight / 3.10.1) and then has advantages.

flight:

In [1]: import ska_sun

In [2]: time = '2023:001'

In [3]: %time ska_sun.nominal_roll(10, 20, time=time)
CPU times: user 4.83 ms, sys: 0 ns, total: 4.83 ms
Wall time: 6.41 ms
Out[3]: 292.1368992773556

# This is the version without performance improvements but it still shows some speedup when called again (NFS caching?)
In [4]: %time ska_sun.nominal_roll(10, 20, time=time)
CPU times: user 883 µs, sys: 0 ns, total: 883 µs. 
Wall time: 812 µs                      
Out[4]: 292.1368992773556

In [6]: %timeit ska_sun.nominal_roll(10, 20, time=time)
417 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [7]: ska_sun.__version__
Out[7]: '3.10.1'

This PR

In [1]: import ska_sun

In [2]: time = '2023:001'

In [3]: %time ska_sun.nominal_roll(10, 20, time=time)
CPU times: user 1.92 s, sys: 43.8 ms, total: 1.96 s
Wall time: 2.31 s
Out[3]: -67.86310072264433

In [4]: %time ska_sun.nominal_roll(10, 20, time=time)
CPU times: user 123 µs, sys: 10 µs, total: 133 µs
Wall time: 136 µs
Out[4]: -67.86310072264433

In [5]: %timeit ska_sun.nominal_roll(10, 20, time=time)
35.3 µs ± 1.86 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

And when set with debug printing on the numba caching, the caching saves seem reasonable

ska3-jeanconn-fido> export NUMBA_DEBUG_CACHE=1
ska3-jeanconn-fido> ipython
Python 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.8.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import ska_sun

In [2]: time = '2023:001'

In [3]: %time ska_sun.nominal_roll(10, 20, time=time)
[cache] index saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun.position_at_jd-135.py310.nbi'
[cache] data saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun.position_at_jd-135.py310.1.nbc'
[cache] index saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._radec2eci-266.py310.nbi'
[cache] data saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._radec2eci-266.py310.1.nbc'
[cache] index loaded from '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._radec2eci-266.py310.nbi'
[cache] index loaded from '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._radec2eci-266.py310.nbi'
[cache] index saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._radec2eci-266.py310.nbi'
[cache] data saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._radec2eci-266.py310.2.nbc'
[cache] index saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._nominal_roll-307.py310.nbi'
[cache] data saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun._nominal_roll-307.py310.1.nbc'
CPU times: user 1.91 s, sys: 40.1 ms, total: 1.95 s
Wall time: 2.3 s
Out[3]: -67.86310072264433

In [4]: %time ska_sun.nominal_roll(10, 20, time=time)
CPU times: user 169 µs, sys: 16 µs, total: 185 µs
Wall time: 189 µs
Out[4]: -67.86310072264433

In [5]: %time ska_sun.nominal_roll(10, 20, time=time)
CPU times: user 159 µs, sys: 16 µs, total: 175 µs
Wall time: 178 µs
Out[5]: -67.86310072264433

In [6]: %time ska_sun.pitch(10, 20, time=time)
[cache] index saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun.sph_dist-213.py310.nbi'
[cache] data saved to '/home/jeanconn/.ska3/cache/numba/ska_sun_cdeca1941b46e08ad13896c2df292287d50c752d/sun.sph_dist-213.py310.1.nbc'
CPU times: user 84.7 ms, sys: 136 µs, total: 84.8 ms
Wall time: 93.8 ms
Out[6]: 96.6598242368135

In [7]: %time ska_sun.pitch(10, 20, time=time)
CPU times: user 119 µs, sys: 11 µs, total: 130 µs
Wall time: 133 µs
Out[7]: 96.6598242368135

And as mentioned, this PR provides the option to supply sun_ra and sun_dec from the calling code which can provide significant savings.

In [8]: %timeit ska_sun.nominal_roll(10, 20, sun_ra=sun_ra, sun_dec=sun_dec)
1.21 µs ± 0.62 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

@taldcroft taldcroft changed the title WIP Performance Performance Sep 11, 2023
@jeanconn
Copy link
Contributor

I think this would need rebase for tests to pass at this point. @taldcroft Do you want me to go ahead and do that to test this independently from #25?

@jeanconn
Copy link
Contributor

Rebased and force-pushed.

@taldcroft
Copy link
Member Author

@jeanconn - I re-ran unit tests and added some functional test information in the description.

@taldcroft taldcroft merged commit d7f965e into master Sep 12, 2023
@taldcroft taldcroft deleted the performance branch September 12, 2023 10:21
@javierggt javierggt mentioned this pull request Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants