-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aperture photometry batch mode API #2401
Conversation
Codecov ReportPatch coverage is
📢 Thoughts on this report? Let us know!. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we aim to stay close to photutils API?
https://photutils.readthedocs.io/en/stable/aperture.html#background-subtraction
phot_table = aperture_photometry(data, apertures)
So in Imviz, it can be like:
phot_table = aper_phot_plugin.aperture_photometry(list_of_data_labels, list_of_subset_apertures, local_bkg=only_one_subset_background)
cc @eteq and @larrybradley
|
||
unpack_batch_options(dataset=['image1', 'image2'], | ||
subset=['Subset 1', 'Subset 2'], | ||
bg_subset=['Subset 3'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if bg_subset
has multiple elements, then it is unpacked too? Is that what people want though? Usually a given background is very specific to the chosen aperture?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this bypasses the UI checks, right? For example, now someone can define "Subset 1" to be both the aperture and background and we cannot stop them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if
bg_subset
has multiple elements, then it is unpacked too? Is that what people want though? Usually a given background is very specific to the chosen aperture?
Right. If you want specific combinations, then you call the other method (or call this first, and pass just the combinations you want).
Also, this bypasses the UI checks, right?
If there are checks that are only in the UI, then yes, those are bypassed. We should move (or at least copy) those to the python end. This PR doesn't yet validate the input values, so technically a batch could get halfway through before erroring out, in which case you'll get the results that completed before the error. We'll need to think of what we want to happen in a case like this. Full validation in advance would likely require a significant refactor of the plugin code and since computing isn't too expensive, I think this is probably okay for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I'll also mention that the current plan for the UI is to implement multiselect for the dataset and aperture (subset
) only (I just figured it was nice to keep the API that the UI will use as general as possible). Using batch mode on any other input parameter would require calling the API and I'm guessing most people would then choose to manually create their list of all inputs instead of generating one gigantic matrix with a bunch of nonsensical combinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
p.s. I thought maybe we can pass multiple apertures straight into photutils instead of looping through them, but looks like that is only possible if the aperture shape/size is identical and only differ in center positions, so I guess that is not very useful for us... See astropy/photutils#1615
so technically a batch could get halfway through before erroring out, in which case you'll get the results that completed before the error
That would be annoying. I prefer returning all successful results and a list of combos that failed, after the loop completes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work for me and is easier to implement than full input-validation before running anything. I'm always in favor or raising exceptions, but I can at least wait until all that can pass do so, and then raise the exception at the end so the user is aware that not all were successful.
EDIT: I added a commit which implements what I described above so that the loop does not quit at the first failure.
In my opinion, accepting multiple lists requires more validation on our-end (needing to require each list to be the same length, etc) and is more cumbersome to setup from the user-perspective. I think switching to that syntax would also make it less convenient to call the unpacking method and then just filtering out which you do/don't want to pass along. But it wouldn't be too difficult to change to that syntax if that is the consensus. We also accept a lot more inputs than just those few, so positional arguments is not ideal (but I think individual lists as kwargs would be reasonable still). |
dbac15b
to
456a310
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some high-level questions which might become concerns depending on the answers 😉 :
- I think I understood from the description that this is not meant to be public API yet, but would eventually become public once there's been time to implement it in the UI, is that right?
- This is pretty deep in the weeds such that I'm honestly not sure which use case this is supposed to address. Is it just the use case of "do aperture photometry on multiple images with multiple different subsets for each image"? And is that intended to be the only case this supports, or is this meant to be the starting point of a larger way of expressing batch operations?
Now a concern:
- I am afraid that this is becoming a sort of "mini-language" that expresses batch operations. I thing we don't want to go down that road because we should instead be telling users "if you want anything more complex than the simplest of operations, use the tools from Python, because that's going to be an easier and more flexible approach than any mini-language we come up with". Now if we want to say that this API is for internal use - e.g., this is a way for jdaviz (and potentially, future viz tools that are using jdaviz but are not in jdaviz itself) to structure how they do batch operations, I can get behind that. But I think we want to be very careful to make it clear that this is meant mainly as a format to communicate from one UI to the python libraries, not an API that's intended for users to actual do their science on. The most extreme version of that would be to never expose this as a public API - not sure I want to go that far, but it is an option.
A couple of comments on @eteq's comments after a discussion at tag up:
Here is a workflow that made sense to us during tag up:
Another workflow:
Does this make sense @eteq? Am I missing something @kecnry, @orifox, @pllim? |
Maybe related or not, but my remaining thoughts:
|
Sounds good to me. We can also continue/revisit the discussion about the list vs dictionary inputs anytime before we officially expose the plugin user API. I made a note that when we do implement the UI for the "matrix" mode, that we will want to be able to retrieve the inputs from those UI-selections, not only by passing lists manually 🐱. |
Thanks for the clarification @camipacifici , that makes a lot more sense. But that does lead to a few additional thoughts:
|
I think this falls under the follow-up UI work, I had added a note to the ticket there to make sure we do this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few comments after reviewing this:
-
I think that the 'unpack_batch_options' should check if the dataset/subset names are valid and loaded, this would make it a more helpful helper function. Otherwise, at least in my opinion, making these combinations would be something I would be inclined to just do myself in a few lines.
-
@pllim already commented on this but another concern I have with the 'unpack_batch_options' is that it assumes every aperture should be combined with every background subset, when I think most people would assume it should be 1:1. One solution I can think of for this is if background is a list, then it should be the same length as the photometric apertures and raise an error if not, and if its a single subset then that should be combined with each. That way, if people have 2 background apertures they want to apply to each data and each subset, they can just run this twice and combine the results.
-
Should this allow you to pass in background values rather than subsets to calculate the background in, like the GUI does?
-
I am trying to understand the intended workflow here, versus what I expected when I see 'batch more photometry'. It seems like the intent of this is to be able to reproduce/automate the process of loading data, making some clicks to create some subsets on sources and some subsets for background, and running the photometry tool on these one-by-one. I don't see how this implementation makes this process reproducible after doing it once and restarting your notebook- the batch option function just passes in references by name so if you restart the notebook, wouldn't you have to click to create these subsets again and the only utility these changes would have is you could avoid having to click the photometry tool for each one-by-one? Is there a way to preserve subsets so they can be re-generated in a new notebook instance?
I see now that the intent of this PR is not to replicate something that should be done in photutils like I orignially envisioned (e.g providing positions, aperture sizes, backgrounds, and getting back a table) but rather to aid in reproducing a workflow someone did by hand in the GUI. However, I think it makes sense to be able to go back and forth more seamlessly - for example if you do some preliminary analysis in photutils to select the brightest sources, maybe you should be able to load those locations in to create subsets in a list/table without clicking (Maybe this functionality is already available?).
To add to the fun, even without this PR, "straight to photutils batch call" is kinda possible already but it is not that convenient. For instance, you can already do the following workflow at a high level:
|
This can probably be added if we want - I originally didn't because (1) I wanted to keep this part of the logic general to potentially be shared with other plugins where we're planning to implement batch mode, (2) the error will be raised later anyways, (3) when using this from the UI all selections are guaranteed to be valid, and (4) perhaps a bit of laziness 😉 . Honestly, I don't expect anyone to call this method by hand, but it is needed for the current plan for the UI.... but maybe the cleanest solution is to hide this as a private method and instead only expose the matrix from what has been set by the traitlets/UI which already include these checks natively?
Agreed, when a user is doing batch mode, I expect them to use This expectation that an API call shouldn't create every combination is maybe another sign that this method should be hidden and only used by UI interactions.
Yes, this should work, as long as
Right, I think there are lots of different opinions on what "batch mode" should and should not entail. The idea here (which can be debated if that is what/all we want) is to allow you to run the existing plugin over several combinations in one click/call rather than having to tediously click a bunch or write your own for-loop. This PR in particularly aims to provide the API framework for a UI that will allow that multiselect capability for the apertures and backgrounds, while also providing more flexibility for custom combinations that a simple "matrix" operation between those multiselects would allow in the UI. I think anything beyond that (better reproducibility of entire workflows, different interaction with photutils) would motivate more planning about a redesign of the existing plugin and/or require saving state.
Maybe (as a separate effort) we should try to create convenience functions to export the info needed to pass along to photutils directly if this is a use-case people want? |
I fixed the dev job, so you should rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cami seems happy enough with this, so I am approving by proxy, though it is hard for me to see how this will play out, but only one way to find out. Thanks!
* loops over options, sets traitlets, calls existing method * see comments for refactoring that would be needed to NOT touch the traitlets
* in "other changes" for now since this technically isn't public facing until the UI is implemented or the user-API for aperture photometry is officially exposed
* any entry that is failed will be reported, pass full_exceptions to also report the individual exceptions
70fd682
to
68c975a
Compare
68c975a
to
2780e97
Compare
Description
This pull request implements the (not-yet public*) plugin API for batch processing of aperture photometry. The main method (
plugin.batch_aper_phot
) takes a list of dictionaries with traitlets names and values as inputs, loops over the list, sets the traitlets, and calls the existing internal method to compute aperture photometry. The results are then available in the results table (from the plugin UI or API, once its public) and the state of the plugin, including the plot, is left based on the last iteration from that list. Any value that is not provided in a dictionary will be adopted from the "current" state at the time (either the original plugin state or as it was left from the previous iteration). This is something we may want to reconsider - perhaps restoring to the original state after each iteration, so that an override in the first entry isn't then adopted in the second if not explicitly provided again.* none of these methods are made public via this API since there is no exposed user API for aperture photometry yet and the UI implementation is scheduled for follow-up work. When we do expose the user API, we will need to finalize some traitlet/attribute names to avoid confusion and add these two methods to the list of exposed methods.
For example:
This also implements a helper method (
plugin.unpack_batch_options
) which takes any number of keyword arguments, where each item can contain either a single value or a list. This then unpacks to create the "matrix" of all combinations that can then be used as input to the method described above. This will then become the hook-in point for the multi-select dropdowns to be implemented in the UI (which likely will only support this matrix mode for the input dataset and aperture subsets).For example:
returns:
which could then be passed directly (or modified) as input to
batch_aper_phot
.Change log entry
CHANGES.rst
? If you want to avoid merge conflicts,list the proposed change log here for review and add to
CHANGES.rst
before merge. If no, maintainershould add a
no-changelog-entry-needed
label.Checklist for package maintainer(s)
This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.
trivial
label.