Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Much faster parsing in CxoTime and add maude format #15

Merged
merged 18 commits into from
Sep 30, 2020
Merged

Conversation

taldcroft
Copy link
Member

@taldcroft taldcroft commented Sep 24, 2020

Description

This effectively does a backport (or sideways port) of astropy/astropy#10360 to speed up parsing of date and greta format strings by a factor of about 20 and 50, respectively.

For large arrays, the CxoTime object creation time on my Mac is around 300 ns per time.

This also adds a new maude format that is basically greta as an int64 without the .. It was easiest/fastest to toss this into the parsing PR.

Some timings:

In [15]: dates = ['2001:001:01:23:45.666'] * 1_000_000                                                                   

In [16]: from Chandra.Time import date2secs                                                                              

In [17]: time x = date2secs(dates)                                                                                       
CPU times: user 3.49 s, sys: 22.6 ms, total: 3.52 s
Wall time: 3.52 s

In [18]: time x = DateTime(vals, format='date').secs                                                                     
CPU times: user 8.51 s, sys: 20.5 ms, total: 8.53 s
Wall time: 8.53 s

In [19]: time x = CxoTime(vals, format='date').secs                                                                      
CPU times: user 481 ms, sys: 18.4 ms, total: 500 ms
Wall time: 499 ms

In [20]: gretas = ['2001001.012345666'] * 1_000_000                                                                      

In [21]: time x = DateTime(gretas, format='greta').secs                                                                  
CPU times: user 12.8 s, sys: 15.8 ms, total: 12.8 s
Wall time: 12.8 s

In [22]: time x = CxoTime(gretas, format='greta').secs                                                                   
CPU times: user 468 ms, sys: 27 ms, total: 495 ms
Wall time: 495 ms

In [23]: time x = CxoTime(gretas, format='greta').date                                                                   
CPU times: user 4.97 s, sys: 33.6 ms, total: 5 s
Wall time: 5 s

In [24]: time x = DateTime(gretas, format='greta').date                                                                  
CPU times: user 12.9 s, sys: 27.7 ms, total: 12.9 s
Wall time: 12.9 s

Testing

  • Passes unit tests on MacOS, Windows
  • Functional testing (new tests)

@jeanconn
Copy link
Contributor

jeanconn commented Sep 24, 2020

Cool! Though the status of the astropy code isn't clear to me. Are you sideways-porting to get this in faster here because it will take longer in astropy? Or because the two types need more special code for handling here? I'm just trying to keep an eye on the things that we might want to pull back out of cxotime as you make advances in astropy.time .

@taldcroft
Copy link
Member Author

The astropy 4.2 release may well get dragged out beyond the New Year, and I'd like to get these speed enhancements into CxoTime for shiny. These changes override the base methods in astropy.time so I don't expect any problems, but for sure we'll want to mostly revert this PR when we get 4.2 into Ska3. This PR has been structured with the mixin class to make that process pretty straightforward when we do it.

@taldcroft taldcroft changed the title Implement much faster parsing of date and greta formats in CxoTime Implement much faster parsing in CxoTime and add maude format Sep 25, 2020
@taldcroft taldcroft changed the title Implement much faster parsing in CxoTime and add maude format Much faster parsing in CxoTime and add maude format Sep 25, 2020
@arthegost
Copy link

Thanks for adding the maude format. It will be helpful in the future when many others are using it in there daily tasks. I wanted to move my comments from the Slack conversations to here.

The maude is returned as a numerical value from the server itself, so it looks like all timestamps will need to be converted before ingested into cxotime.

Also, as mentioned, the timestamps will always come back with the exact same amount of digits, which is 16 (2^4).

@arthegost
Copy link

arthegost commented Sep 25, 2020

Thank you for the explanation, Tom. It seems that strings and numbers are acceptable for the maude format, as long as format='maude' is used.

@taldcroft taldcroft merged commit 5fd7153 into master Sep 30, 2020
@taldcroft taldcroft deleted the faster-parsing branch September 30, 2020 16:13
@javierggt javierggt mentioned this pull request Dec 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants