Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RCAL-943: Add a step for creating multiband source catalogs #1485

Merged
merged 25 commits into from
Nov 14, 2024

Conversation

larrybradley
Copy link
Member

@larrybradley larrybradley commented Nov 1, 2024

Resolves RCAL-943

Closes #1486

This PR adds a new pipline step, MultibandCatalogStep, for creating multiband catalogs from a detection image, representing the combination of all bands. It also adds Kron photometry to SourceCatalogStep (incl. the multiband catalog).

I think this work falls under RCAL-873.

CC: @schlafly

Tasks

  • request a review from someone specific, to avoid making the maintainers review every PR
  • add a build milestone, i.e. 24Q4_B15 (use the latest build if not sure)
  • Does this PR change user-facing code / API? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly
news fragment change types...
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change
  • changes/<PR#>.docs.rst
  • changes/<PR#>.stpipe.rst
  • changes/<PR#>.associations.rst
  • changes/<PR#>.scripts.rst
  • changes/<PR#>.mosaic_pipeline.rst
  • changes/<PR#>.patch_match.rst

steps

  • changes/<PR#>.dq_init.rst
  • changes/<PR#>.saturation.rst
  • changes/<PR#>.refpix.rst
  • changes/<PR#>.linearity.rst
  • changes/<PR#>.dark_current.rst
  • changes/<PR#>.jump_detection.rst
  • changes/<PR#>.ramp_fitting.rst
  • changes/<PR#>.assign_wcs.rst
  • changes/<PR#>.flatfield.rst
  • changes/<PR#>.photom.rst
  • changes/<PR#>.flux.rst
  • changes/<PR#>.source_detection.rst
  • changes/<PR#>.tweakreg.rst
  • changes/<PR#>.skymatch.rst
  • changes/<PR#>.outlier_detection.rst
  • changes/<PR#>.resample.rst
  • changes/<PR#>.source_catalog.rst

@larrybradley larrybradley added this to the 25Q1_B16 milestone Nov 1, 2024
@larrybradley larrybradley requested a review from a team as a code owner November 1, 2024 16:17
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Nov 1, 2024
@larrybradley larrybradley force-pushed the multiband-catalog branch 2 times, most recently from 8341427 to 58926be Compare November 1, 2024 16:51
Copy link
Collaborator

@schlafly schlafly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Let's touch base later this afternoon.

# background subtracted, have the same shape, and are pixel
# aligned.
# TODO: Do we need a separate background subtraction step
# prior to this one?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incoming L3s have been sky matched but not background subtracted; we probably do want to follow source_catalog and subtract a background.

)
detection_var += convolve_fft(
wht**2 * model.var_rnoise, kernel, mask=coverage_mask
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For discussion later, I've been staring at code like this in a different package recently as well, and maybe I am missing something important? But in my head you have something like:

signal(i) = sum_j PSF(i, j) * image(j) * weight(j)
variance(i) = sum_j PSF(i, j)^2 * sigma^2(j) * weight^2(j)

this code has the sigma^2 and weight^2 terms but not the PSF^2 term.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw in your doc file the kernel**2 term, but I have questions about it. Since kernel is normalized to sum to 1, convolving the variance with kernel**2 does not conserve the variance. Convolution is a linear operation, so why would you want to square the kernel?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I claim that the detection significance image should be the local significance of the kernel at each location. This is a linear fit with a profile P. For a normal linear least squares problem, if the uncertainties were given by a covariance matrix C, the best fit fluxes would be:
x = (P^T C^-1 P)^-1 P^T C^-1 y
with variance
(dx)^2 = (P^T C^-1 P)^-1

The ratio of x / dx is the significance is x / (dx) = P^T C^-1 y / sqrt(P^T C^-1 P) . The term in the denominator with P^T P is the sum of the square of the PSF.

Similarly, I claim if you compute signal(i) = sum_j PSF(i, j) * image(j) * weight(j) and ask what the variance is, you likewise get a PSF^2 term that would correspond to convolving with the square of the kernel.


kernel_fwhm : float
The full-width at half-maximum (FWHM) in pixels of the 2D
Gaussian kernel used to smooth the detection image.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe not for right now, but remembering my thinking... for a PSF, we want to use the appropriate PSF for each band, so a different FWHM for each band. For an extended object, it's less important, but technically we would want a true source profile convolved with the appropriate PSF.

# TODO: SED weights to be defined in the asn file for each
# input filter image
try:
sed_weight = library.asn["products"][0]["members"][i]["sed_weight"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does a single SED for right now, which makes sense, though we probably want to extend that.


# NOTE: I'm assuming here that all of the input images have been
# background subtracted, have the same shape, and are pixel
# aligned.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same shape and pixel aligned seems good to me for the DR catalogs. We do want to do background subtraction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a separate background subtraction step that saves the background-subtracted images may make sense. These background subtraction files are used in a couple places and I'd rather not keep them all around in memory (the are all needed initially to create the detection image) or recompute them.

# source_catalog will ultimately get these filter-dependent
# values from a reference file based on EE values;
# do we want filter-dependent aperture parameters for the
# multiband catalog?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss more. We did recently get aperture reference file schemas into roman_datamodels, but we don't have those files in CRDS yet. And given the modest range of aperture sizes we may prefer a handful of fixed sizes anyway.

}

# TODO: do we want to save the det_img and det_err?
det_img, det_err = make_detection_image(library, self.kernel_fwhms)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have talked about adding these to the current segmentation image product.


# this is needed for the DAOFind sharpness and roundness
# properties; are these needed for the Roman source catalog?
star_kernel_fwhm = np.min(self.kernel_fwhms) # ??
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have said we will compute these.


# save the segmentation image and multiband catalog
# TODO: I noticed that the catalog is saved twice;
# once here and once when the step returns
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. I think we hit this with the original source catalog and worked around it somehow, e.g., by only saving the segementation image in sae_base_results.

star_kernel_fwhm,
self.fit_psf,
detection_cat=det_catobj,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just recording for my thinking---this does totally separate fits on each filter. e.g., for the PSFs, the centers will jump around a little from filter to filter, or for the krons, they will have separate shapes.

Copy link
Member Author

@larrybradley larrybradley Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the Kron shapes are fixed by the Kron radius (with some scaling parameterization) and shape parameters, which are calculated only from the detection image. The initial PSF centers will also be from the detection image centroids (as will the circular aperture centers).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also do forced PSF photometry with fixed positions (based on the detection image centroids).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good, I had misunderstood this. This is good behavior for now.

Copy link

codecov bot commented Nov 1, 2024

Codecov Report

Attention: Patch coverage is 95.60976% with 9 lines in your changes missing coverage. Please review.

Project coverage is 76.68%. Comparing base (1fe4cbc) to head (cc0202e).
Report is 26 commits behind head on main.

Files with missing lines Patch % Lines
...mancal/multiband_catalog/multiband_catalog_step.py 96.15% 3 Missing ⚠️
romancal/multiband_catalog/background.py 90.00% 2 Missing ⚠️
romancal/multiband_catalog/utils.py 91.30% 2 Missing ⚠️
romancal/multiband_catalog/detection_image.py 97.95% 1 Missing ⚠️
romancal/source_catalog/source_catalog.py 96.77% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1485      +/-   ##
==========================================
+ Coverage   76.21%   76.68%   +0.47%     
==========================================
  Files         115      120       +5     
  Lines        7639     7832     +193     
==========================================
+ Hits         5822     6006     +184     
- Misses       1817     1826       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@larrybradley larrybradley changed the title Add a step for creating multiband source catalogs RCAL-943: Add a step for creating multiband source catalogs Nov 4, 2024
@schlafly
Copy link
Collaborator

I added a test_multiband_catalog regression test. We should update that with the new files after we adjust the step defaults for the background subtraction size. But here is the resulting log messages:

2024-11-12 13:16:24,435 - stpipe.MultibandCatalogStep - INFO - Step MultibandCatalogStep running with args ('L3_skycell_mbcat_asn.json',).
2024-11-12 13:16:24,438 - stpipe.MultibandCatalogStep - INFO - Step MultibandCatalogStep parameters are:
  pre_hooks: []
  post_hooks: []
  output_file: None
  output_dir: None
  output_ext: .asdf
  output_use_model: False
  output_use_index: True
  save_results: True
  skip: False
  suffix: cat
  search_output_file: True
  input_dir: ''
  bkg_boxsize: 1000
  kernel_fwhms: None
  snr_threshold: 3.0
  npixels: 25
  deblend: True
  aperture_ee1: 30
  aperture_ee2: 50
  aperture_ee3: 70
  ci1_star_threshold: 2.0
  ci2_star_threshold: 1.8
  fit_psf: True
2024-11-12 13:16:28,277 - stpipe.MultibandCatalogStep - INFO - Making detection image
2024-11-12 13:16:28,278 - stpipe.MultibandCatalogStep - INFO - Making detection image with kernel FWHM=2.0
2024-11-12 13:16:28,279 - stpipe.MultibandCatalogStep - INFO - Processing model r0099101001001001001_r274dp63x31y81_prompt_F158_i2d.asdf: filter=F158, sed_weight=1.0
2024-11-12 13:16:40,349 - stpipe.MultibandCatalogStep - INFO - Making detection image with kernel FWHM=20.0
2024-11-12 13:16:40,349 - stpipe.MultibandCatalogStep - INFO - Processing model r0099101001001001001_r274dp63x31y81_prompt_F158_i2d.asdf: filter=F158, sed_weight=1.0
Deblending: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1084/1084 [00:03<00:00, 347.51it/s]
2024-11-12 13:17:01,134 - stpipe.MultibandCatalogStep - INFO - Detected 4921 sources
2024-11-12 13:17:10,314 - stpipe.MultibandCatalogStep - INFO - Constructing a gridded PSF model.
<cutting lots of PSF stuff>
2024-11-12 13:19:12,205 - stpipe.MultibandCatalogStep - INFO - Fitting a PSF model to sources for improved astrometric precision.
2024-11-12 13:20:11,845 - stpipe.MultibandCatalogStep - INFO - Saved model in r0099101001001001001_r274dp63x31y81_prompt_segm.asdf
2024-11-12 13:20:11,913 - stpipe.MultibandCatalogStep - INFO - Saved model in r0099101001001001001_r274dp63x31y81_prompt_cat.asdf
2024-11-12 13:20:12,150 - stpipe.MultibandCatalogStep - INFO - Saved model in L3_skycell_mbcat_asn_cat.asdf
2024-11-12 13:20:12,150 - stpipe.MultibandCatalogStep - INFO - Step MultibandCatalogStep done
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - MultibandCatalogStep instance created.
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - DMS391: successfully used multiple kernels to detect sources.
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - DMS393: successfully used deblending to separate blended sources.
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - DMS399: successfully tested that catalogs contain aperture fluxes and uncertainties.

The deblending and multiple kernel messages we will use for demonstrating the requirements.

@larrybradley larrybradley force-pushed the multiband-catalog branch 2 times, most recently from ff78cd7 to e86639e Compare November 12, 2024 18:54
Copy link
Collaborator

@schlafly schlafly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wait for regtests to finish to merge, but this looks good.

for kernel_fwhm in kernel_fwhms:
img, err = make_det_image(library, kernel_fwhm)
det_img = np.maximum(det_img, img)
det_err = np.maximum(det_err, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually use det_err right now, so let's not change anything, but FWIW: the way I was thinking about this is that we want the SNR image and take the maximum of that, so one maximum. Conceptually I think we want the highest significance points, so probably if we wanted two images it would be something like

m = img / err > det_img / det_err
det_img[m] = img[m]
det_err[m] = err[m]

but let's not do anything now.

@larrybradley
Copy link
Member Author

I'm going to wait to rebase this after #1457 is merge, but I suspect there could be conflicts.

@larrybradley larrybradley merged commit 14d5f82 into spacetelescope:main Nov 14, 2024
31 checks passed
@larrybradley larrybradley deleted the multiband-catalog branch November 14, 2024 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Step for multiband source catalogs
2 participants