Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parser for TESS DVT files #164

Merged
merged 13 commits into from
Jan 14, 2025
Merged

Conversation

rosteen
Copy link
Contributor

@rosteen rosteen commented Dec 31, 2024

By default, this will load each TCE extension into the data collection as a separate object and display them all. Specifying extname will skip adding the deselected extensions into viewers. Opening as draft since I need to fix the auto-generated data label. I probably also need to think a little harder about what to do with the ephemeris in the case of multiple TCE extensions that are all added to the viewer.

Screenshot 2024-12-31 at 12 49 10 PM

@rosteen rosteen requested a review from kecnry December 31, 2024 17:50
Copy link

codecov bot commented Dec 31, 2024

Codecov Report

Attention: Patch coverage is 32.50000% with 27 lines in your changes missing coverage. Please review.

Project coverage is 90.12%. Comparing base (7b78df5) to head (febc67b).
Report is 51 commits behind head on main.

Files with missing lines Patch % Lines
lcviz/parsers.py 17.85% 23 Missing ⚠️
lcviz/components/components.py 60.00% 2 Missing ⚠️
lcviz/helper.py 71.42% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #164      +/-   ##
==========================================
- Coverage   93.92%   90.12%   -3.81%     
==========================================
  Files          37       41       +4     
  Lines        1598     2187     +589     
==========================================
+ Hits         1501     1971     +470     
- Misses         97      216     +119     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rosteen rosteen added this to the 1.1.0 milestone Dec 31, 2024
@rosteen rosteen added io enhancement New feature or request labels Dec 31, 2024
@rosteen rosteen marked this pull request as ready for review December 31, 2024 19:18
@rosteen
Copy link
Contributor Author

rosteen commented Dec 31, 2024

I think this is ready for testing/opinions on how it's implemented, I imagine I'll need to iterate a little.

flux_err=data['LC_INIT_ERR'])
lc.meta = hdulist[0].header
lc.meta['MISSION'] = 'TESS DVT'
lc.meta['FLUX_ORIGIN'] = "LC_INIT"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the (near?) future, we should either add flux column validation or work to ensure that non-standard flux columns are selectable.

By default, all of the table columns will be options in the Flux Column plugin. So far, we haven't written any intentional support for plotting models, but there are pre-computed transit light curve models in the DVT files, stored under the "MODEL_INIT" and "MODEL_WHITE" columns. This causes a hiccup if you try to set "MODEL_WHITE" as the flux column, since we look for a corresponding _err column, and none exists for the model:

self.app._jdaviz_helper._set_data_component(dc_item, 'flux', dc_item[self.selected])
self.app._jdaviz_helper._set_data_component(dc_item, 'flux_err', dc_item[self.selected+"_err"]) # noqa

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using .get instead might fix L225, but we should check that's the only issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Down the road, it'll be great to change the plot options defaults for "MODEL_*" columns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fixed this in components.py but it needs more testing to make sure something downstream isn't broken by an all-nan uncertainty array.

"""
# Determine if we're loading a DVT file, which has a separate parser
if isinstance(data, str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be just as easy to support HDUList too, which would be helpful for writing tests and for files we make/edit on the fly. Can we add that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but handling HDULists in general is probably a separate ticket/PR.

lcviz/parsers.py Outdated
Comment on lines 35 to 37
lc = lightkurve.LightCurve(data=data,
flux=data['LC_INIT'],
flux_err=data['LC_INIT_ERR'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have some work to do on the units here. We're currently expecting the time unit to be in JD (I think?), but these files have the units:

TTYPE1  = 'TIME    '           / column title: data time stamps                 
TFORM1  = 'D       '           / column format: 64-bit floating point   
TUNIT1  = 'BJD - 2457000, days' / column units: Barycenter corrected TESS Julian

but the offset of 2457000 isn't being reapplied in this parser yet, so the time axis is prehistoric:

Screenshot 2025-01-02 at 10 05 51

We might also need to intentionally not load the DVT's phase column, or mark its caveats. The phase units are not the (0, 1) orbital phase that we usually use:

TTYPE4  = 'PHASE   '           / column title: Phase using period and epoch     
TFORM4  = 'E       '           / column format: 32-bit floating point           
TUNIT4  = 'days    '           / column units: [-0.25*period, 0.75*period]     

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, we probably shouldn't load the phase column but could load the ephemeris and let lcviz automatically phase-fold (if that's easy - otherwise can be a follow-up ticket)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, we probably shouldn't load the phase column but could load the ephemeris and let lcviz automatically phase-fold (if that's easy - otherwise can be a follow-up ticket)

I already made a pass at doing that at L53-54.

@rosteen
Copy link
Contributor Author

rosteen commented Jan 2, 2025

@bmorris3 I fixed the time handling, but I'm surprised that using the epoch and period from the DVT header doesn't seem to result in a nice light curve in the time vs phase viewer:

Screenshot 2025-01-02 at 1 50 58 PM

@rosteen
Copy link
Contributor Author

rosteen commented Jan 2, 2025

Ah ha, I switched to the model and realized that the period is good, it's just not centered at phase 0.5. Here's the folded plot using the header values, zoomed around the transit :

Screenshot 2025-01-02 at 2 31 03 PM

lcviz/parsers.py Outdated
# add ephemeris information from the DVT extension
if ephem_plugin is not None and show_ext_in_viewer:
ephem_plugin.period = header['TPERIOD']
ephem_plugin.t0 = header['TEPOCH'] + time_offset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The midtransit time doesn't align with this epoch because of the time conventions – when we get it right, the midtransit time should be phase zero.

You're correctly applying the BJD offset to TEPOCH, but then you need to subtract the LC's coordinate reference time like this:

dvt_file = "https://archive.stsci.edu/missions/tess/tid/s0002/0000/0001/0010/0827/tess2018235142541-s0002-s0002-0000000100100827-00109_dvt.fits"

with fits.open(dvt_file, mode='readonly') as hdulist:
    print(
        "t0 for ephemeris plugin:", 
        (hdulist[1].header['TEPOCH'] + 2457000) - 
        lcviz.app.data_collection[0].coords.reference_time.jd
    )

That puts the transit at phase zero.

lcviz/parsers.py Outdated Show resolved Hide resolved
Comment on lines 230 to 232
nan_errs = np.empty(dc_item['flux'].shape)
nan_errs[:] = np.nan
self.app._jdaviz_helper._set_data_component(dc_item, 'flux_err', nan_errs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we explicitly add support for light curves without flux uncertainties, rather than adding nan uncertainties? If/when we add fitting features, we'll have to undo the nans.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What currently requires uncertainties? I can't think of anywhere that we currently use them (although that really should be bumped up near the top of our priority list) 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we can't change the flux_err component to be None, because you can't change the array shape of a component. We could remove the flux_err component entirely in this case and add it back when needed, does that sound better than setting the array to NaNs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, so the problem is if some flux entries in the same data-entry have uncertainties but others do not (if none have uncertainties, then we could just not call _set_data_component ever and that would be ok, right?)?

I think maybe just a comment here to explain why we have to do this (or removing/re-adding) so that when we get to using the uncertainties we can adjust as needed... its hard for me to say which is the better/worse choice at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, there's LC_INIT with LC_INIT_ERR, but for example LC_WHITE doesn't have an LC_WHITE_ERR column. We'll need a check either way when fitting is added, either for whether the error exists or if it's all NaN, just a matter of which to check. Sounds like @kecnry and I don't have a strong preference so I can defer to @bmorris3 and remove the component instead of doing all NaN.

Copy link
Member

@kecnry kecnry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small comment, otherwise looks good to me. Are there any necessary follow-ups or does this at least achieve everything requested by MAST?

lcviz/parsers.py Outdated

# Loop through the TCEs in the file. If we only want one (specified by
# `extname` keyword) then only load that one into the viewers and ephemeris.
for i in range(1, len(hdulist)-1):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just iterate over hdulist directly (for hdulitem in hdulist[1:]) or is the iterator i needed anywhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at some point I was going to accept an integer rather than string name for extension, I can change this to what you suggested.

@rosteen
Copy link
Contributor Author

rosteen commented Jan 14, 2025

Just one small comment, otherwise looks good to me. Are there any necessary follow-ups or does this at least achieve everything requested by MAST?

I think that this covers everything MAST needs, if they decide they want to provide an integer instead of extname for TCE event it would be a small change. I think we can get it into their hands like this for testing on their end.

@kecnry
Copy link
Member

kecnry commented Jan 14, 2025

dev failures seem unrelated - will follow-up on that separately. Thanks!

@kecnry kecnry merged commit bade521 into spacetelescope:main Jan 14, 2025
10 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants