-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto-configuration identification #2124
Conversation
Have you ever questioned the nature of your reality? (A special day message.) |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #2124 +/- ##
==========================================
- Coverage 91.96% 91.93% -0.04%
==========================================
Files 146 146
Lines 15911 15972 +61
==========================================
+ Hits 14633 14684 +51
- Misses 1278 1288 +10
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Planning ahead for eventual support of viz = jdaviz.open(filename/data)
... should this handle accepting the data directly as well (a specutils object, for example)? That's fine if its out-of-scope for now, I'm just wondering if that would work into this basic design or if it would require some refactoring/renaming?
Do we prefer the capitalized version of the config (Imviz vs imviz) to match the class names?
It very well could! There's only very minimal logic to extract the meaningful properties of the file in this snippet here. We could expand this to extract
Or even better, do we prefer to return cls ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this with a couple non-JWST files I have lying around from old testing, and it successfully identified Specviz for a 1D spectrum, but failed to identify any config for the MANGA cube that we used to use for testing Cubeviz. Might be worth taking a look at that and seeing if there's a simple way to get it to work that might cover a decent number of other mission formats. I can send you that file if you don't have it.
try: | ||
with asdf_in_fits.open(filename) as af: | ||
wcs = af.tree['meta']['wcs'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me vaguely unhappy, but I don't know of another way to check for this other than trying to open it, so...not a change request, more of a change wish 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this. I thought about something like
with asdf_in_fits.open(filename) as af:
meta = getattr(af.tree, 'meta', fits.getheader(filename))
but thought that was too obfuscated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually like that idea and since its a key and not an attribute, it might be slightly less ugly (although on second thought you will need two layers of dictionaries which makes it a bit ugly again 😬 ). This entire _get_wcs
function could be removed and
header = fits.getheader(filename, ext=ext)
data = fits.getdata(filename, ext=ext)
wcs = _get_wcs(filename, header)
replaced with:
data = fits.getdata(filename, ext=ext)
with asdf_in_fits.open(filename) as af:
meta = af.tree.get('meta', {}).get('wcs', WCS(fits.getheader(filename, ext=ext)))
which also avoids having to parse the header just to have on hand for the fallback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree @kecnry that this is a better way to do it, but even I have trouble reading it. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i couldn't find anything to nitpick so i will just leave a 'looks good to me!'
Let's consider re returning the classes themselves, that might be overhead for the mast use-case, and |
'wave', | ||
'flux' | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry about hard-coding these kinds of column names, but can't think of a more general way to do this. I suggested we could try to auto-identify wavelength or flux columns based on any attached units on the column. Maybe that's out of scope for this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of scope for this PR though.
I think so. Let's get v0.1 done and iterate.
recognized_spectrum_format.find('s3d') > -1): | ||
return 'cubeviz' | ||
elif (n_axes == 2 and | ||
recognized_spectrum_format.find('x1d') > -1): | ||
return 'specviz' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's worth generalizing these now to something non-JWST specific. For example, this function already fails with MaNGA cubes files, which is maybe a tad bittersweet since Jdaviz was originally developed against those files. :( And for MAST to switch to this function, we'd need this generalized before we can start any work on our end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that there's a better way to do this! Stepping away for a few days and seeing this comment makes it clearer.
This PR relies on astropy.io.registry
to auto-identify file types, and then when that fails, uses heuristics to make a guess. The astropy registry knows some JWST files without a problem, so I didn't need to use something else.
But there is already a registry which identifies JWST, MaNGA, and many others: the specutils registry. The easy solution to all this is to also try the specutils registry. I'll do this today! 🤦🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Support for MaNGA introduced in 7b8d713.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. The MaNGA cube and rss files do seem to now correctly identify as cubeviz and specviz, which is great. Since JWST formats are covered by the identify_data
function, it's not clear to me why we're still explicitly checking on the filename suffix at lines 241-246?
@kecnry – any thoughts on this option?
|
Unless it would introduce unnecessary overhead, I'd vote to stick with the string for now and then have |
7b8d713
to
2027a5a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve this. I think it's in good shape now but there are some refinements that can be made to make this more robust and general.
recognized_spectrum_format.find('s3d') > -1): | ||
return 'cubeviz' | ||
elif (n_axes == 2 and | ||
recognized_spectrum_format.find('x1d') > -1): | ||
return 'specviz' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. The MaNGA cube and rss files do seem to now correctly identify as cubeviz and specviz, which is great. Since JWST formats are covered by the identify_data
function, it's not clear to me why we're still explicitly checking on the filename suffix at lines 241-246?
Thanks all! I'll write up Jira tickets for the remaining improvements that we labeled out-of-scope for this PR. |
Description
Standalone and MAST interfaces would benefit from automatically launching the "best" jdaviz helper/configuration for a given data file. This PR is one step down that path, providing a new function for suggesting a helper based on the contents of the file.
The function does the following:
filename
ends inasdf
, give "imviz" (the only ASDF-supporting helper so far)astropy.io.registry
. If there's only one unique match, return the corresponding helper.fits.BinTableHDU
orfits.fitsrec.FITS_rec
, look for standard spectral columns and return "specviz" if you find themastropy.io.registry
, use the number of world dimensions in the WCS/gwcs to break the ties.This workflow isn't intended to work for any file. This PR supports FITS files from JWST and HST (with FITS WCS), ASDF-in-FITS from JWST, and files with paths ending in
.asdf
. Since there aren't perfectly-followed standards for all data products, support for any given data product from one of these missions is not guaranteed. That said, the tests added in this PR cover many of the files we are likely to handle, and all helpers except Mosviz.Change log entry
CHANGES.rst
? If you want to avoid merge conflicts,list the proposed change log here for review and add to
CHANGES.rst
before merge. If no, maintainershould add a
no-changelog-entry-needed
label.Checklist for package maintainer(s)
This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.
trivial
label.