expand fields to allow multiple values #4
Some examples from MAST. jw01282-c1010_t005_miri_f1550c-mask1550
hlsp_tasoc_tess_ffi_tic00166699853-s0004-cam2-ccd2-c1800_tess_v05
hlsp_ullyses_hst-fuse_fuse-stis_sk-71d8_uv
What does it mean if a plane has multiple bandpassName(s)? I recognize the form of F1550C as a filter around 1550nm... Does it mean multiple filters in the light path, so the Plane.energy.bounds would be the intersection of the two? In a DerivedObservation, one could stack inputs in different filters to make a wider Plane.energy.bounds, possibly with meaningful sub-samples. I know there are also some "white light" images like that... These two possibilities kind of reduce to "light in filter A and/or B" respectively (logical and/or)... would the model only need one interpretation or also a way to specify |
For multiple proposals, I'm not really sure what the usage means. I can imagine a queue-scheduled observatory optimising by taking one observation that two different proposals requested and wanting to "assign" it to both of them... (see below) If a proposal has an observing plan that includes making stacks, then DerivedObservation(s) with proposal information makes sense. That's why Proposal is common to all Observation, not just SimpleObservation. However, if a 3rd party were to decide to stack SimpleObservation(s) from different proposals, I would not want to assign any proposal information to the resulting DerivedObservation: the proposal indicates "this thing exists because of proposal xyz" and that's just not true of all DerivedObservation(s). They are more likely to fit "this thing exists because of project abc" (which would be in Plane.provenance.project). |
below: There was a radio use case where they collect data (SimpleObservation) of a large region of the sky and different parts of the data are destined for different proposals (and the implied different permissions). The solution there was that we renamed CompositeObservation to DerivedObservation and the plan would be to have the SimpleObservation (owned by the observatory) and create a DerivedObservation for each proposal. The extra catch there was that in radio one would actually want to extract different subsets of the data for each of those so it did not overlap with the "two observations include the same artifact issue". So, that was one way that one data acquisition was "assigned" to two proposals, but it is a special case where there really are new data created by some processing. |
on multiple telescopes: I assume the MAST use case here is "DerivedObservation made by combining data from multiple telescopes". In radio (interferometry) there are also VLBI observations that involve multiple telescopes, and there is the "multiple dishes in a single facility" case, where I don't know where the line between telescope -- collector -- detector -- correlator lies. I'm toying with the Telescope and Instrument classes and if that looks promising I might pull this out as a separate issue. As it stands, we could not change the cardinality of Telescope.name by itself because multiple telescopes are not at the same location. If we go ahead with removing WCS from CAOM for 2.5, the need for Telescope.geoLocation* would kind of go away (I think it's only useful for some spectral reference frame transforms).
I've actually seen both situations (and vs or). I think the most common are DerivedObservations that are stitched spectra across multiple bandpasses (eg, HASP, HLSP). So it's the wavelength coverage of each stacked together. Our implementation in Plane.energy.bounds is the UNION of all wavelength ranges covered in those cases. There are also a few examples where there is more than one optical element in place, for example a filter plus a grating. I believe in that situation we've ignored or glossed over any impact of the grating and just captured the wavelength range of the filter. In those cases, I think we've used something like I can imagine situations where observers may want to have two filters so as to have a much narrower bandpass (eg, suppressing red light from an inefficient blue filter). The bounds I would want to record for those are the INTERSECTION rather than the UNION of the filters. That isn't always possible though, and I don't have a good feel for how frequently it happens among the MAST datasets. I think what I would like to have is the multiple storing of bandpasses to support the first use case: each element in the bandpass list is something that will be UNION-ed together to form the final bounds range.
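To make the UNION vs INTERSECTION distinction concrete, here is a minimal sketch. It is not MAST or CAOM code: the function names and the wavelength values are made up for illustration, and each bandpass is assumed to be a simple (lower, upper) interval in a common unit.

```python
from typing import List, Optional, Tuple

# Assumption: one bandpass = one (lower, upper) wavelength interval, same unit throughout.
Interval = Tuple[float, float]


def union_bounds(intervals: List[Interval]) -> Interval:
    """Outer envelope of all bandpasses (stitched or stacked products)."""
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))


def union_samples(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping bandpasses into disjoint sub-samples inside the envelope."""
    merged: List[Interval] = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            # Overlaps (or touches) the previous sample: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def intersection_bounds(intervals: List[Interval]) -> Optional[Interval]:
    """Common range of filters placed together in one light path, or None if empty."""
    lo = max(lo for lo, _ in intervals)
    hi = min(hi for _, hi in intervals)
    return (lo, hi) if lo < hi else None


# Hypothetical filter ranges, for illustration only.
filters = [(400.0, 550.0), (500.0, 700.0)]
stacked = union_bounds(filters)          # "light in filter A or B"
in_series = intersection_bounds(filters)  # "light in filter A and B"
```

With a list-valued bandpassName interpreted as the union case, only `union_bounds` (and optionally `union_samples`) would be needed to populate Plane.energy.bounds; the intersection case would stay a single bandpass.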
For the TESS mission, it's as you initially describe: it's a survey-like situation where users can propose to extract select stars at a higher cadence than the full-frame images. The mission office selects these targets and while they all have the same PI (George Ricker, the PI of the mission itself), the targets get associated with every proposal that submitted them. Another example is the Hubble Advanced Products (HAP), specifically the multi-visit mosaics. These are DerivedObservations that are mosaics made from separate programs. The artifacts of these observations are new images of some area of the sky that are drizzled, rotated, and stacked together. The mission specifically asked us to capture every program that has been used in them so that users can search for multi-visit mosaics produced from specific programs. I believe for these we have Plane.proposal.project be HAP-MVM. That is, these observations exist because of the HAP-MVM project, but the original programs are indicated in Observation.prpID. This same situation will happen with the spectroscopic analogue, HASP, once they start stacking spectra across visits. |
This has been used for High-Level Science Products (HLSPs), where things like spectra from more than one telescope/instrument have been stitched together to have broader spectral coverage. I think HASP (Hubble Advanced Spectroscopic Products) currently only uses HST data, but they do combine multiple instruments and create new products from that.
bandpassName I think I agree that multiple bandpassName(s) should be the wider union usage you describe. User queries could still use The narrower multiple filter intersection thing belongs in a possible enhancement of the instrument model (eg to describe the path of the signal through components before reaching the detector. In that case, you could still construct a single bandpassName with two filter names in it using a different separator (eg In either case, Plane.energy.bounds would give the correct/representative numeric band limits. |
telescopes/instruments I have increased the scope of #11 to cover this topic.
proposalID I feel like This really is part of what provenance should cover and in principle one could write a query to extract the proposalID(s) of all the member observations (admittedly, it would be complicated because the join would bloat the query result and since 2.4 we allow members to be Observation so there's an arbitrary sequence of joins to collect all the initial proposalID values). Does that give essentially the same result? I'm thinking more along the lines an optional
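As a rough illustration of the "arbitrary sequence of joins" mentioned above, the traversal could look like the sketch below when done in client code rather than in a single query. The `Obs` class and its field names are hypothetical stand-ins, not the CAOM classes, and the sketch assumes the member observations have already been fetched.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Obs:
    """Hypothetical stand-in for an observation; not the CAOM class."""
    uri: str
    proposal_id: Optional[str] = None
    members: List["Obs"] = field(default_factory=list)


def collect_proposal_ids(obs: Obs, _seen: Optional[Set[str]] = None) -> Set[str]:
    """Walk members to arbitrary depth and gather the proposal IDs of the leaves."""
    seen = set() if _seen is None else _seen
    if obs.uri in seen:
        # Already visited via another path (shared member): contribute nothing new.
        return set()
    seen.add(obs.uri)
    if not obs.members:
        # Leaf observation: its own proposal ID is one of the "initial" values.
        return {obs.proposal_id} if obs.proposal_id else set()
    ids: Set[str] = set()
    for member in obs.members:
        ids |= collect_proposal_ids(member, seen)
    return ids


# Made-up identifiers, for illustration only.
a = Obs("ex:a", "P1")
b = Obs("ex:b", "P2")
stack = Obs("ex:stack", members=[a, b])
mosaic = Obs("ex:mosaic", members=[stack, Obs("ex:c", "P3"), a])
ids = collect_proposal_ids(mosaic)
```

This only works when members exist, so it would not recover proposal information for products that carry several proposal IDs but have no members.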
proposalID At MAST, our UI has a link to HST/JWST proposals for additional information. That's hard-coded into the UI (something along the lines of if HST, take observation.prpID and make it a link), I believe it might be broken for multi-proposal cases but that's a separate matter. The situation about extracting from member observations is that some use-cases don't have members. For example, the TESS lightcurves can have multiple proposal IDs but they have no members. As for an optional My biggest worry is that it may make things a little more complicated- some observations would have proposal information in one place, others in another. We may be forced at MAST to consolidate them so even if we add a place for |
I was thinking that It also does not solve the other issue where a SimpleObservation is part of (assigned to) multiple proposals; in this sense I'm thinking of proposalID as an indicator of access rights; that could be a PI querying to "find all my data"... those proposals that share an observation do have different PI, title, keywords and need to be distinct Proposal objects. So it's really the composition |
proposalID To change the cardinality here, we would have to change the model so that an Observation simply had Collection-Observation -> Observation w/ collection field Normalising Proposal would allow an observation to belong to 2+ proposals, but it is definitely a slippery slope because those proposals don't necessarily have the same target (objects), specific target positions, and very likely not the same requirements (eg an observation might meet requirements for one proposal but fail wrt. the other). So this would be quite a mess. There is a draft IVOA ProposalDM so any effort to normalise the Proposal class/model would likely have to take that into account in addition to the current MAST-ESAC proposal metadata details. The current denormalisation is clearly a "copy a few useful bits" and doesn't really conflict with whatever happens in that other work. For the use case of commensal observing (assign one observation to 2 proposals for implied or explicit access rights) the currently recommended approach of creating 1 SimpleObservation (no proposal) and 2 DerivedObservation with one Proposal each works. The algorithm.name would indicate that the derived are copies or it could indicate they are subsets of the single member (that's one of the radio use cases). That does mean there are 3 observations for potentially the same science data (same Artifact.uri). It would be plausible to not create planes and/or artifacts for the base SimpleObservation and thus more or less "hide it" from normal queries. Unless there is some subset operation, the planes and artifacts of the two derived observations would be identical in one extreme (down to same Artifact.uri), but could in principle differ in processing and thus have different Plane metadata and different files (Artifact.uri). So, this approach allows for some additional redundancy and would likely create two paths to the same file (see #2) that would need to be allowed in the degenerate case.
Although re-use of Artifact.uri allows one to share a file between two observations, it would probably be confusing to just make two SimpleObservation (different Observation.observationID, same Artifact.uri) because it would not be easy to distinguish that from a mistake. For the use case of tracking all uses of data from proposal X or all the proposals that contributed data to this (derived) observation, that is a provenance issue (navigating forward or backwards respectively). The model as it stands can support such queries but they are probably best tackled with navigation (drill down to details). There is an IVOA Provenance DM and the CAOM Provenance is a very simple one step provenance in that context. At this point, I don't think changing cardinality of proposalID is feasible. Of course, the field is an opaque string (with collection-specific meaning) so there are no rules about values in place, but I think it would be confusing, dangerous, and maybe short sighted to abuse it with multiple values. I think it is fine for DerivedObservation(s) created outside the scope of a Proposal to not have any proposal info at all: essentially no proposal info could mean unknown or multiple. |
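The "1 SimpleObservation (no proposal) plus one DerivedObservation per proposal" pattern described in this comment can be sketched as follows. This is only an illustration under stated assumptions: the classes, field names, the "copy" algorithm name, and the derived-ID scheme are all hypothetical and are not the CAOM library API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Proposal:
    id: str
    pi: str


@dataclass
class Observation:
    """Hypothetical flattened observation; a derived one lists its member IDs."""
    observation_id: str
    algorithm_name: str
    proposal: Optional[Proposal] = None
    members: List[str] = field(default_factory=list)
    artifact_uris: List[str] = field(default_factory=list)


def assign_to_proposals(base: Observation, proposals: List[Proposal],
                        algorithm_name: str = "copy") -> List[Observation]:
    """Create one derived observation per proposal, each pointing back at the base."""
    return [
        Observation(
            observation_id=f"{base.observation_id}-{p.id}",  # made-up naming scheme
            algorithm_name=algorithm_name,
            proposal=p,
            members=[base.observation_id],
            # Degenerate case: same files, so two paths lead to the same Artifact.uri.
            artifact_uris=list(base.artifact_uris),
        )
        for p in proposals
    ]


def find_my_data(observations: List[Observation], pi: str) -> List[str]:
    """The 'find all my data' query: match on the single proposal of each observation."""
    return [o.observation_id for o in observations
            if o.proposal is not None and o.proposal.pi == pi]


# Made-up identifiers, for illustration only.
base = Observation("obs1", "exposure", artifact_uris=["ex:file.fits"])
derived = assign_to_proposals(base, [Proposal("A", "alice"), Proposal("B", "bob")])
mine = find_my_data([base] + derived, "alice")
```

Because the base observation has no proposal, it never matches a PI query, which is roughly the "hide it from normal queries" effect discussed above; a subset-style algorithm name would replace "copy" in the radio use case.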
final result:
this comes from an archive partners slack discussion started by David Rodrigues
proposal.id
telescope.name
instrument.name
plane.energy.bandpassName
... maybe more