Add parser for TESS DVT files #164

rosteen · 2024-12-31T17:50:22Z

By default, this will load each TCE extension into the data collection as a separate object and display them all. Specifying extname will skip adding the deselected extensions into viewers. Opening as draft since I need to fix the auto-generated data label. I probably also need to think a little harder about what to do with the ephemeris in the case of multiple TCE extensions that are all added to the viewer.

codecov · 2024-12-31T17:53:23Z

Codecov Report

Attention: Patch coverage is 32.50000% with 27 lines in your changes missing coverage. Please review.

Project coverage is 90.12%. Comparing base (7b78df5) to head (febc67b).
Report is 51 commits behind head on main.

Files with missing lines	Patch %	Lines
lcviz/parsers.py	17.85%	23 Missing ⚠️
lcviz/components/components.py	60.00%	2 Missing ⚠️
lcviz/helper.py	71.42%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #164      +/-   ##
==========================================
- Coverage   93.92%   90.12%   -3.81%     
==========================================
  Files          37       41       +4     
  Lines        1598     2187     +589     
==========================================
+ Hits         1501     1971     +470     
- Misses         97      216     +119

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

rosteen · 2024-12-31T19:19:33Z

I think this is ready for testing/opinions on how it's implemented, I imagine I'll need to iterate a little.

bmorris3 · 2025-01-02T14:51:46Z

lcviz/parsers.py

+                                   flux_err=data['LC_INIT_ERR'])
+        lc.meta = hdulist[0].header
+        lc.meta['MISSION'] = 'TESS DVT'
+        lc.meta['FLUX_ORIGIN'] = "LC_INIT"


In the (near?) future, we should either add flux column validation or work to ensure that non-standard flux columns are selectable.

By default, all of the table columns will be options in the Flux Column plugin. So far, we haven't written any intentional support for plotting models, but there are pre-computed transit light curve models in the DVT files, stored under the "MODEL_INIT" and "MODEL_WHITE" columns. This causes a hiccup if you try to set "MODEL_WHITE" as the flux column, since we look for a corresponding _err column, and none exists for the model:

lcviz/lcviz/components/components.py

Lines 224 to 225 in 0f18bec

self.app._jdaviz_helper._set_data_component(dc_item, 'flux', dc_item[self.selected])

self.app._jdaviz_helper._set_data_component(dc_item, 'flux_err', dc_item[self.selected+"_err"]) # noqa

I think using .get instead might fix L225, but we should check that's the only issue.

🤔 Down the road, it'll be great to change the plot options defaults for "MODEL_*" columns.

I think I fixed this in components.py but it needs more testing to make sure something downstream isn't broken by an all-nan uncertainty array.

bmorris3 · 2025-01-02T14:54:35Z

lcviz/helper.py

        """
+        # Determine if we're loading a DVT file, which has a separate parser
+        if isinstance(data, str):


It could be just as easy to support HDUList too, which would be helpful for writing tests and for files we make/edit on the fly. Can we add that?

I agree, but handling HDULists in general is probably a separate ticket/PR.

bmorris3 · 2025-01-02T15:07:22Z

lcviz/parsers.py

+        lc = lightkurve.LightCurve(data=data,
+                                   flux=data['LC_INIT'],
+                                   flux_err=data['LC_INIT_ERR'])


We have some work to do on the units here. We're currently expecting the time unit to be in JD (I think?), but these files have the units:

TTYPE1 = 'TIME ' / column title: data time stamps TFORM1 = 'D ' / column format: 64-bit floating point TUNIT1 = 'BJD - 2457000, days' / column units: Barycenter corrected TESS Julian

but the offset of 2457000 isn't being reapplied in this parser yet, so the time axis is prehistoric:

We might also need to intentionally not load the DVT's phase column, or mark its caveats. The phase units are not the (0, 1) orbital phase that we usually use:

TTYPE4 = 'PHASE ' / column title: Phase using period and epoch TFORM4 = 'E ' / column format: 32-bit floating point TUNIT4 = 'days ' / column units: [-0.25*period, 0.75*period]

agreed, we probably shouldn't load the phase column but could load the ephemeris and let lcviz automatically phase-fold (if that's easy - otherwise can be a follow-up ticket)

agreed, we probably shouldn't load the phase column but could load the ephemeris and let lcviz automatically phase-fold (if that's easy - otherwise can be a follow-up ticket)

I already made a pass at doing that at L53-54.

… column

rosteen · 2025-01-02T18:52:47Z

@bmorris3 I fixed the time handling, but I'm surprised that using the epoch and period from the DVT header doesn't seem to result in a nice light curve in the time vs phase viewer:

rosteen · 2025-01-02T19:32:07Z

Ah ha, I switched to the model and realized that the period is good, it's just not centered at phase 0.5. Here's the folded plot using the header values, zoomed around the transit :

bmorris3 · 2025-01-03T14:40:54Z

lcviz/parsers.py

+        # add ephemeris information from the DVT extension
+        if ephem_plugin is not None and show_ext_in_viewer:
+            ephem_plugin.period = header['TPERIOD']
+            ephem_plugin.t0 = header['TEPOCH'] + time_offset


The midtransit time doesn't align with this epoch because of the time conventions – when we get it right, the midtransit time should be phase zero.

You're correctly applying the BJD offset to TEPOCH, but then you need to subtract the LC's coordinate reference time like this:

dvt_file = "https://archive.stsci.edu/missions/tess/tid/s0002/0000/0001/0010/0827/tess2018235142541-s0002-s0002-0000000100100827-00109_dvt.fits" with fits.open(dvt_file, mode='readonly') as hdulist: print( "t0 for ephemeris plugin:", (hdulist[1].header['TEPOCH'] + 2457000) - lcviz.app.data_collection[0].coords.reference_time.jd )

That puts the transit at phase zero.

lcviz/parsers.py

bmorris3 · 2025-01-03T14:45:18Z

lcviz/components/components.py

+            nan_errs = np.empty(dc_item['flux'].shape)
+            nan_errs[:] = np.nan
+            self.app._jdaviz_helper._set_data_component(dc_item, 'flux_err', nan_errs)


Can we explicitly add support for light curves without flux uncertainties, rather than adding nan uncertainties? If/when we add fitting features, we'll have to undo the nans.

What currently requires uncertainties? I can't think of anywhere that we currently use them (although that really should be bumped up near the top of our priority list) 🤔

So, we can't change the flux_err component to be None, because you can't change the array shape of a component. We could remove the flux_err component entirely in this case and add it back when needed, does that sound better than setting the array to NaNs?

ah, so the problem is if some flux entries in the same data-entry have uncertainties but others do not (if none have uncertainties, then we could just not call _set_data_component ever and that would be ok, right?)?

I think maybe just a comment here to explain why we have to do this (or removing/re-adding) so that when we get to using the uncertainties we can adjust as needed... its hard for me to say which is the better/worse choice at this point.

Right, there's LC_INIT with LC_INIT_ERR, but for example LC_WHITE doesn't have an LC_WHITE_ERR column. We'll need a check either way when fitting is added, either for whether the error exists or if it's all NaN, just a matter of which to check. Sounds like @kecnry and I don't have a strong preference so I can defer to @bmorris3 and remove the component instead of doing all NaN.

Co-authored-by: Brett M. Morris <morrisbrettm@gmail.com>

kecnry

Just one small comment, otherwise looks good to me. Are there any necessary follow-ups or does this at least achieve everything requested by MAST?

kecnry · 2025-01-06T19:14:10Z

lcviz/parsers.py

+
+    # Loop through the TCEs in the file. If we only want one (specified by
+    # `extname` keyword) then only load that one into the viewers and ephemeris.
+    for i in range(1, len(hdulist)-1):


can we just iterate over hdulist directly (for hdulitem in hdulist[1:]) or is the iterator i needed anywhere else?

I think at some point I was going to accept an integer rather than string name for extension, I can change this to what you suggested.

rosteen · 2025-01-14T15:42:15Z

Just one small comment, otherwise looks good to me. Are there any necessary follow-ups or does this at least achieve everything requested by MAST?

I think that this covers everything MAST needs, if they decide they want to provide an integer instead of extname for TCE event it would be a small change. I think we can get it into their hands like this for testing on their end.

kecnry · 2025-01-14T15:45:23Z

dev failures seem unrelated - will follow-up on that separately. Thanks!

rosteen added 3 commits December 31, 2024 11:16

Initial pass at DVT file parsing

4356aa5

Check for DVT file from header, add separate parser

ae43506

Correct comment

d03863c

rosteen requested a review from kecnry December 31, 2024 17:50

rosteen added 2 commits December 31, 2024 14:06

Debugging

81a1ebf

Changelog

acc8bdf

rosteen added this to the 1.1.0 milestone Dec 31, 2024

rosteen added io enhancement New feature or request labels Dec 31, 2024

rosteen marked this pull request as ready for review December 31, 2024 19:18

bmorris3 reviewed Jan 2, 2025

View reviewed changes

rosteen added 2 commits January 2, 2025 13:06

Handle switching to flux column that doesn't have an associated error…

577b84e

… column

Handle time offset in DVT data

9ebdd19

Check against component_ids rather than doing our own list comprehension

0c0b470

bmorris3 reviewed Jan 3, 2025

View reviewed changes

rosteen and others added 4 commits January 3, 2025 10:25

Update lcviz/parsers.py

2dad851

Co-authored-by: Brett M. Morris <morrisbrettm@gmail.com>

Subtract reference time to correct t0

6522156

Remove flux_err component for components with no error

10999ce

Codestyle

ef0835a

bmorris3 approved these changes Jan 6, 2025

View reviewed changes

kecnry reviewed Jan 6, 2025

View reviewed changes

Address review comment

febc67b

kecnry merged commit bade521 into spacetelescope:main Jan 14, 2025
10 of 13 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add parser for TESS DVT files #164

Add parser for TESS DVT files #164

rosteen commented Dec 31, 2024

codecov bot commented Dec 31, 2024 •

edited

Loading

rosteen commented Dec 31, 2024

bmorris3 Jan 2, 2025

bmorris3 Jan 2, 2025

bmorris3 Jan 2, 2025

rosteen Jan 2, 2025

bmorris3 Jan 2, 2025

rosteen Jan 2, 2025

bmorris3 Jan 2, 2025

kecnry Jan 2, 2025

rosteen Jan 2, 2025

rosteen commented Jan 2, 2025

rosteen commented Jan 2, 2025 •

edited

Loading

bmorris3 Jan 3, 2025

bmorris3 Jan 3, 2025

kecnry Jan 3, 2025

rosteen Jan 3, 2025

kecnry Jan 3, 2025

rosteen Jan 3, 2025

kecnry left a comment

kecnry Jan 6, 2025

rosteen Jan 14, 2025

rosteen commented Jan 14, 2025

kecnry commented Jan 14, 2025

	self.app._jdaviz_helper._set_data_component(dc_item, 'flux', dc_item[self.selected])
	self.app._jdaviz_helper._set_data_component(dc_item, 'flux_err', dc_item[self.selected+"_err"]) # noqa

Add parser for TESS DVT files #164

Add parser for TESS DVT files #164

Conversation

rosteen commented Dec 31, 2024

codecov bot commented Dec 31, 2024 • edited Loading

Codecov Report

rosteen commented Dec 31, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rosteen commented Jan 2, 2025

rosteen commented Jan 2, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kecnry left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rosteen commented Jan 14, 2025

kecnry commented Jan 14, 2025

codecov bot commented Dec 31, 2024 •

edited

Loading

rosteen commented Jan 2, 2025 •

edited

Loading