RCAL-943: Add a step for creating multiband source catalogs #1485

larrybradley · 2024-11-01T16:17:35Z

This PR adds a new pipeline step, MultibandCatalogStep, for creating multiband catalogs from a detection image that represents the combination of all bands. It also adds Kron photometry to SourceCatalogStep (including the multiband catalog).

I think this work falls under RCAL-873.

CC: @schlafly

Tasks

news fragment change types...

changes/<PR#>.general.rst: infrastructure or miscellaneous change
changes/<PR#>.docs.rst
changes/<PR#>.stpipe.rst
changes/<PR#>.associations.rst
changes/<PR#>.scripts.rst
changes/<PR#>.mosaic_pipeline.rst
changes/<PR#>.patch_match.rst

steps

changes/<PR#>.dq_init.rst
changes/<PR#>.saturation.rst
changes/<PR#>.refpix.rst
changes/<PR#>.linearity.rst
changes/<PR#>.dark_current.rst
changes/<PR#>.jump_detection.rst
changes/<PR#>.ramp_fitting.rst
changes/<PR#>.assign_wcs.rst
changes/<PR#>.flatfield.rst
changes/<PR#>.photom.rst
changes/<PR#>.flux.rst
changes/<PR#>.source_detection.rst
changes/<PR#>.tweakreg.rst
changes/<PR#>.skymatch.rst
changes/<PR#>.outlier_detection.rst
changes/<PR#>.resample.rst
changes/<PR#>.source_catalog.rst

schlafly

This looks good. Let's touch base later this afternoon.

schlafly · 2024-11-01T16:32:28Z

romancal/multiband_catalog/multiband_catalog_step.py

+        # background subtracted, have the same shape, and are pixel
+        # aligned.
+        # TODO: Do we need a separate background subtraction step
+        # prior to this one?


The incoming L3s have been sky matched but not background subtracted; we probably do want to follow source_catalog and subtract a background.

schlafly · 2024-11-01T16:36:18Z

romancal/multiband_catalog/detection_image.py

+                )
+                detection_var += convolve_fft(
+                    wht**2 * model.var_rnoise, kernel, mask=coverage_mask
+                )


For discussion later, I've been staring at code like this in a different package recently as well, and maybe I am missing something important? But in my head you have something like:

signal(i) = sum_j PSF(i, j) * image(j) * weight(j)
variance(i) = sum_j PSF(i, j)^2 * sigma^2(j) * weight^2(j)

this code has the sigma^2 and weight^2 terms but not the PSF^2 term.

I saw in your doc file the kernel**2 term, but I have questions about it. Since kernel is normalized to sum to 1, convolving the variance with kernel**2 does not conserve the variance. Convolution is a linear operation, so why would you want to square the kernel?

I claim that the detection significance image should be the local significance of the kernel at each location. This is a linear fit with a profile P. For a normal linear least squares problem, if the uncertainties were given by a covariance matrix C, the best fit fluxes would be:
x = (P^T C^-1 P)^-1 P^T C^-1 y
with variance
(dx)^2 = (P^T C^-1 P)^-1

The ratio x / dx is the significance: x / dx = P^T C^-1 y / sqrt(P^T C^-1 P). The term in the denominator with P^T P is the sum of the square of the PSF.
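A minimal NumPy sketch of the least-squares expressions above may help make the P^T P term concrete. The function name `profile_fit_significance` and the toy profile are hypothetical and not part of romancal; the covariance is assumed symmetric and positive definite.

```python
import numpy as np


def profile_fit_significance(y, profile, cov):
    """Fit a single profile amplitude to data y with covariance cov.

    Returns the best-fit flux x, its uncertainty dx, and x / dx.
    """
    cinv_p = np.linalg.solve(cov, profile)  # C^-1 P
    norm = profile @ cinv_p                 # P^T C^-1 P
    flux = (cinv_p @ y) / norm              # (P^T C^-1 P)^-1 P^T C^-1 y
    flux_err = 1.0 / np.sqrt(norm)          # sqrt((P^T C^-1 P)^-1)
    return flux, flux_err, flux / flux_err


# Toy example: noiseless data with uncorrelated, uniform uncertainties.
profile = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
sigma = 0.5
cov = sigma**2 * np.eye(profile.size)
y = 5.0 * profile

flux, flux_err, snr = profile_fit_significance(y, profile, cov)
```

With a diagonal covariance of constant sigma, the significance reduces to sum(P * y) / (sigma * sqrt(sum(P^2))), which is where the sum of the squared profile enters.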

Similarly, I claim if you compute signal(i) = sum_j PSF(i, j) * image(j) * weight(j) and ask what the variance is, you likewise get a PSF^2 term that would correspond to convolving with the square of the kernel.
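The variance claim can be checked numerically in one dimension. The sketch below uses `np.convolve` rather than the `convolve_fft` call in the PR, assumes independent pixel noise, and uses made-up arrays; the helper name `smoothed_signal_and_variance` is hypothetical.

```python
import numpy as np


def smoothed_signal_and_variance(image, var, weight, kernel):
    """Weighted, kernel-smoothed signal and its propagated variance."""
    signal = np.convolve(weight * image, kernel, mode="same")
    # Each output pixel is a linear combination of independent inputs,
    # so the coefficients (kernel * weight) enter the variance squared.
    variance = np.convolve(weight**2 * var, kernel**2, mode="same")
    return signal, variance


rng = np.random.default_rng(0)
n = 32
image = rng.normal(size=n)
var = rng.uniform(0.5, 2.0, size=n)
weight = rng.uniform(0.5, 1.5, size=n)

x = np.arange(-4, 5)
kernel = np.exp(-0.5 * (x / 1.5) ** 2)
kernel /= kernel.sum()  # normalized to sum to 1

signal, variance = smoothed_signal_and_variance(image, var, weight, kernel)

# Brute-force check: build the linear operator A with signal = A @ image,
# then propagate the diagonal covariance as diag(A diag(var) A^T).
A = np.column_stack(
    [np.convolve(weight * unit, kernel, mode="same") for unit in np.eye(n)]
)
variance_exact = (A**2) @ var

# Convolving the variance with the unsquared kernel gives a different answer.
variance_unsquared = np.convolve(weight**2 * var, kernel, mode="same")
```

Here `variance` matches `variance_exact`, while `variance_unsquared` is larger everywhere because every element of a normalized kernel is below 1.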

schlafly · 2024-11-01T16:38:10Z

romancal/multiband_catalog/detection_image.py

+
+    kernel_fwhm : float
+        The full-width at half-maximum (FWHM) in pixels of the 2D
+        Gaussian kernel used to smooth the detection image.


And maybe not for right now, but remembering my thinking... for a PSF, we want to use the appropriate PSF for each band, so a different FWHM for each band. For an extended object, it's less important, but technically we would want a true source profile convolved with the appropriate PSF.
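As a rough illustration of per-band kernels, the sketch below builds a normalized 2D Gaussian kernel from a FWHM using plain NumPy. The band names and FWHM values are placeholders, not real Roman values, and `gaussian_kernel` is a hypothetical helper rather than the function used in this PR.

```python
import numpy as np

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma.
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gaussian_kernel(fwhm):
    """Return a normalized, odd-sized 2D Gaussian kernel for a FWHM in pixels."""
    sigma = fwhm * FWHM_TO_SIGMA
    half = int(np.ceil(4 * sigma))  # extend the kernel out to about 4 sigma
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = np.exp(-0.5 * (x**2 + y**2) / sigma**2)
    return kernel / kernel.sum()


# Placeholder per-band FWHMs in pixels (illustrative only).
placeholder_fwhms = {"band_a": 2.0, "band_b": 4.0}
kernels = {band: gaussian_kernel(fwhm) for band, fwhm in placeholder_fwhms.items()}
```

A true per-band treatment would use the actual PSF for each filter; the Gaussian here only stands in for that.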

schlafly · 2024-11-01T16:38:51Z

romancal/multiband_catalog/detection_image.py

+            # TODO: SED weights to be defined in the asn file for each
+            # input filter image
+            try:
+                sed_weight = library.asn["products"][0]["members"][i]["sed_weight"]


This does a single SED for right now, which makes sense, though we probably want to extend that.
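For reference, the lookup in the diff can be sketched with a plain dictionary standing in for the association. The fallback value of 1.0 and the choice to catch `KeyError` are assumptions (the diff only shows the `try` line), and the file names below are made up.

```python
def get_sed_weight(asn, member_index, default=1.0):
    """Return the SED weight of one association member, or a default."""
    member = asn["products"][0]["members"][member_index]
    try:
        return float(member["sed_weight"])
    except KeyError:
        # Assumed behavior: members without an explicit weight get the default.
        return default


# Hypothetical association with one weighted and one unweighted member.
example_asn = {
    "products": [
        {
            "members": [
                {"expname": "image_band_a.asdf", "sed_weight": 0.5},
                {"expname": "image_band_b.asdf"},
            ]
        }
    ]
}

weights = [get_sed_weight(example_asn, i) for i in range(2)]
```

Extending this to several SEDs would presumably mean one weight vector per SED instead of a single scalar per member.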

schlafly · 2024-11-01T16:39:40Z

romancal/multiband_catalog/multiband_catalog_step.py

+
+        # NOTE: I'm assuming here that all of the input images have been
+        # background subtracted, have the same shape, and are pixel
+        # aligned.


Same shape and pixel aligned seems good to me for the DR catalogs. We do want to do background subtraction.

I think a separate background subtraction step that saves the background-subtracted images may make sense. These background-subtracted files are used in a couple of places, and I'd rather not keep them all around in memory (they are all needed initially to create the detection image) or recompute them.
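To make the idea concrete, here is a deliberately simple box-median background subtraction in NumPy. It is a stand-in for illustration only, not the estimator this PR uses, and it assumes the image dimensions are exact multiples of the box size; `subtract_box_median_background` is a hypothetical name.

```python
import numpy as np


def subtract_box_median_background(data, box_size):
    """Subtract a piecewise-constant background of per-box medians."""
    ny, nx = data.shape
    if ny % box_size or nx % box_size:
        raise ValueError("image dimensions must be multiples of box_size")
    # Split the image into (box_size x box_size) tiles and take each median.
    boxes = data.reshape(ny // box_size, box_size, nx // box_size, box_size)
    coarse = np.median(boxes, axis=(1, 3))
    # Expand the coarse map back to the full image shape.
    background = np.repeat(np.repeat(coarse, box_size, axis=0), box_size, axis=1)
    return data - background, background


# Flat background of 10 with one bright pixel acting as a source.
data = np.full((8, 8), 10.0)
data[2, 3] = 110.0
subtracted, background = subtract_box_median_background(data, box_size=4)
```

Saving `subtracted` once per input image would let later steps reuse it without holding every image in memory or recomputing the background.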

schlafly · 2024-11-01T16:41:02Z

romancal/multiband_catalog/multiband_catalog_step.py

+        # source_catalog will ultimately get these filter-dependent
+        # values from a reference file based on EE values;
+        # do we want filter-dependent aperture parameters for the
+        # multiband catalog?


Let's discuss more. We did recently get aperture reference file schemas into roman_datamodels, but we don't have those files in CRDS yet. And given the modest range of aperture sizes we may prefer a handful of fixed sizes anyway.

schlafly · 2024-11-01T16:41:32Z

romancal/multiband_catalog/multiband_catalog_step.py

+        }
+
+        # TODO: do we want to save the det_img and det_err?
+        det_img, det_err = make_detection_image(library, self.kernel_fwhms)


We have talked about adding these to the current segmentation image product.

schlafly · 2024-11-01T16:42:35Z

romancal/multiband_catalog/multiband_catalog_step.py

+
+            # this is needed for the DAOFind sharpness and roundness
+            # properties; are these needed for the Roman source catalog?
+            star_kernel_fwhm = np.min(self.kernel_fwhms)  # ??


We have said we will compute these.

schlafly · 2024-11-01T16:46:40Z

romancal/multiband_catalog/multiband_catalog_step.py

+
+        # save the segmentation image and multiband catalog
+        # TODO: I noticed that the catalog is saved twice;
+        # once here and once when the step returns


Yes, good catch. I think we hit this with the original source catalog and worked around it somehow, e.g., by only saving the segmentation image in save_base_results.

schlafly · 2024-11-01T16:48:19Z

romancal/multiband_catalog/multiband_catalog_step.py

+                        star_kernel_fwhm,
+                        self.fit_psf,
+                        detection_cat=det_catobj,
+                    )


Just recording for my thinking---this does totally separate fits on each filter. e.g., for the PSFs, the centers will jump around a little from filter to filter, or for the krons, they will have separate shapes.

No, the Kron shapes are fixed by the Kron radius (with some scaling parameterization) and shape parameters, which are calculated only from the detection image. The initial PSF centers will also be from the detection image centroids (as will the circular aperture centers).

We can also do forced PSF photometry with fixed positions (based on the detection image centroids).

Thanks, good, I had misunderstood this. This is good behavior for now.

codecov · 2024-11-01T17:20:42Z

Codecov Report

Attention: Patch coverage is 95.60976% with 9 lines in your changes missing coverage. Please review.

Project coverage is 76.68%. Comparing base (1fe4cbc) to head (cc0202e).
Report is 26 commits behind head on main.

Files with missing lines	Patch %	Lines
...mancal/multiband_catalog/multiband_catalog_step.py	96.15%	3 Missing ⚠️
romancal/multiband_catalog/background.py	90.00%	2 Missing ⚠️
romancal/multiband_catalog/utils.py	91.30%	2 Missing ⚠️
romancal/multiband_catalog/detection_image.py	97.95%	1 Missing ⚠️
romancal/source_catalog/source_catalog.py	96.77%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1485      +/-   ##
==========================================
+ Coverage   76.21%   76.68%   +0.47%     
==========================================
  Files         115      120       +5     
  Lines        7639     7832     +193     
==========================================
+ Hits         5822     6006     +184     
- Misses       1817     1826       +9


schlafly · 2024-11-12T18:27:21Z

I added a test_multiband_catalog regression test. We should update that with the new files after we adjust the step defaults for the background subtraction size. Here are the resulting log messages:

2024-11-12 13:16:24,435 - stpipe.MultibandCatalogStep - INFO - Step MultibandCatalogStep running with args ('L3_skycell_mbcat_asn.json',).
2024-11-12 13:16:24,438 - stpipe.MultibandCatalogStep - INFO - Step MultibandCatalogStep parameters are:
  pre_hooks: []
  post_hooks: []
  output_file: None
  output_dir: None
  output_ext: .asdf
  output_use_model: False
  output_use_index: True
  save_results: True
  skip: False
  suffix: cat
  search_output_file: True
  input_dir: ''
  bkg_boxsize: 1000
  kernel_fwhms: None
  snr_threshold: 3.0
  npixels: 25
  deblend: True
  aperture_ee1: 30
  aperture_ee2: 50
  aperture_ee3: 70
  ci1_star_threshold: 2.0
  ci2_star_threshold: 1.8
  fit_psf: True
2024-11-12 13:16:28,277 - stpipe.MultibandCatalogStep - INFO - Making detection image
2024-11-12 13:16:28,278 - stpipe.MultibandCatalogStep - INFO - Making detection image with kernel FWHM=2.0
2024-11-12 13:16:28,279 - stpipe.MultibandCatalogStep - INFO - Processing model r0099101001001001001_r274dp63x31y81_prompt_F158_i2d.asdf: filter=F158, sed_weight=1.0
2024-11-12 13:16:40,349 - stpipe.MultibandCatalogStep - INFO - Making detection image with kernel FWHM=20.0
2024-11-12 13:16:40,349 - stpipe.MultibandCatalogStep - INFO - Processing model r0099101001001001001_r274dp63x31y81_prompt_F158_i2d.asdf: filter=F158, sed_weight=1.0
Deblending: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1084/1084 [00:03<00:00, 347.51it/s]
2024-11-12 13:17:01,134 - stpipe.MultibandCatalogStep - INFO - Detected 4921 sources
2024-11-12 13:17:10,314 - stpipe.MultibandCatalogStep - INFO - Constructing a gridded PSF model.
<cutting lots of PSF stuff>
2024-11-12 13:19:12,205 - stpipe.MultibandCatalogStep - INFO - Fitting a PSF model to sources for improved astrometric precision.
2024-11-12 13:20:11,845 - stpipe.MultibandCatalogStep - INFO - Saved model in r0099101001001001001_r274dp63x31y81_prompt_segm.asdf
2024-11-12 13:20:11,913 - stpipe.MultibandCatalogStep - INFO - Saved model in r0099101001001001001_r274dp63x31y81_prompt_cat.asdf
2024-11-12 13:20:12,150 - stpipe.MultibandCatalogStep - INFO - Saved model in L3_skycell_mbcat_asn_cat.asdf
2024-11-12 13:20:12,150 - stpipe.MultibandCatalogStep - INFO - Step MultibandCatalogStep done
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - MultibandCatalogStep instance created.
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - DMS391: successfully used multiple kernels to detect sources.
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - DMS393: successfully used deblending to separate blended sources.
2024-11-12 13:20:12,263 - stpipe.MultibandCatalogStep - INFO - DMS399: successfully tested that catalogs contain aperture fluxes and uncertainties.

We will use the deblending and multiple-kernel messages to demonstrate the requirements.

schlafly

Let's wait for regtests to finish to merge, but this looks good.

schlafly · 2024-11-14T14:51:53Z

romancal/multiband_catalog/detection_image.py

+    for kernel_fwhm in kernel_fwhms:
+        img, err = make_det_image(library, kernel_fwhm)
+        det_img = np.maximum(det_img, img)
+        det_err = np.maximum(det_err, err)


We don't actually use det_err right now, so let's not change anything, but FWIW: the way I was thinking about this is that we want the SNR image and take the maximum of that, so one maximum. Conceptually I think we want the highest significance points, so probably if we wanted two images it would be something like

m = img / err > det_img / det_err
det_img[m] = img[m]
det_err[m] = err[m]

but let's not do anything now.
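For the record, the SNR-based selection described above could look roughly like the following. This is a sketch with toy arrays, assuming all errors are strictly positive; `combine_by_max_snr` is a hypothetical helper, not code from this PR.

```python
import numpy as np


def combine_by_max_snr(images, errors):
    """Per pixel, keep the (image, error) pair with the highest SNR."""
    det_img = np.array(images[0], dtype=float)  # copies, inputs untouched
    det_err = np.array(errors[0], dtype=float)
    for img, err in zip(images[1:], errors[1:]):
        better = img / err > det_img / det_err
        det_img[better] = img[better]
        det_err[better] = err[better]
    return det_img, det_err


img1 = np.array([[1.0, 4.0], [2.0, 2.0]])
err1 = np.array([[1.0, 1.0], [1.0, 2.0]])
img2 = np.array([[3.0, 3.0], [1.0, 6.0]])
err2 = np.array([[1.0, 3.0], [1.0, 2.0]])

det_img, det_err = combine_by_max_snr([img1, img2], [err1, err2])

# Independent maxima, as in the current loop, can pair an image value
# with an error taken from a different kernel.
max_img = np.maximum(img1, img2)
max_err = np.maximum(err1, err2)
```

In this toy case the two approaches agree on the image but differ on the error at pixel [0, 1], where the independent maximum takes the error from the lower-significance kernel.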

larrybradley · 2024-11-14T17:56:09Z

I'm going to wait to rebase this until after #1457 is merged, but I suspect there could be conflicts.


larrybradley added the multiband_catalog label Nov 1, 2024

larrybradley added this to the 25Q1_B16 milestone Nov 1, 2024

larrybradley requested a review from a team as a code owner November 1, 2024 16:17

github-actions bot added the testing label Nov 1, 2024

larrybradley force-pushed the multiband-catalog branch from b7a2d26 to 28b33e5 Compare November 1, 2024 16:23

github-actions bot added the dependencies Pull requests that update a dependency file label Nov 1, 2024

larrybradley force-pushed the multiband-catalog branch 2 times, most recently from 8341427 to 58926be Compare November 1, 2024 16:51

schlafly reviewed Nov 1, 2024


larrybradley force-pushed the multiband-catalog branch from 58926be to f357b05 Compare November 1, 2024 16:53

larrybradley changed the title ~~Add a step for creating multiband source catalogs~~ RCAL-943: Add a step for creating multiband source catalogs Nov 4, 2024

github-actions bot added regression_testing stpipe labels Nov 12, 2024

larrybradley force-pushed the multiband-catalog branch 2 times, most recently from ff78cd7 to e86639e Compare November 12, 2024 18:54

schlafly approved these changes Nov 14, 2024


larrybradley added 5 commits November 14, 2024 13:31

Add detection_image module

1b9eb09

Add catalog utils

581dfd2

Clean up source_catalog_step

07f7eb3

Update RomanSourceCatalog for multiband catalog

4ec675b

Add multiband catalog step

2a1d6f4

larrybradley force-pushed the multiband-catalog branch from c23a1bd to 8b50067 Compare November 14, 2024 18:40

larrybradley and others added 5 commits November 14, 2024 13:45

Add multiband_catalog step to towncrier

3a40531

Add multibandcatalogstep suffix

3eb61f8

updates for B16 tests.

f91209c

Add background subtraction

f94148d

Remove columns with det_ prefix

d898f49

schlafly and others added 13 commits November 14, 2024 13:45

Actually add new regtests.

5e07605

[pre-commit.ci] auto fixes from pre-commit.com hooks

c6dd7c0

for more information, see https://pre-commit.ci

Increase number of ids.

ef70c97

[pre-commit.ci] auto fixes from pre-commit.com hooks

aa9e228

for more information, see https://pre-commit.ci

Update log messages and add multiband catalog to integration.py

4e3bcac

[pre-commit.ci] auto fixes from pre-commit.com hooks

ebcc88e

for more information, see https://pre-commit.ci

Improve column renaming/removal

4066ad2

Include err==0 in the background coverage mask

0e709a9

Explicitly save only the segmentation image

6511eff

Temporarily change bkg_boxsize default

e48c16a

Convolve variance with kernel**2

e128953

Add MultibandCatalogStep tests

52df742

Add changelog entry

e5ff765

larrybradley force-pushed the multiband-catalog branch from 8b50067 to e5ff765 Compare November 14, 2024 18:45

larrybradley and others added 2 commits November 14, 2024 14:22

Check if convolved_data is None

ed04e7c

Get output filename correct.

cc0202e

larrybradley merged commit 14d5f82 into spacetelescope:main Nov 14, 2024
31 checks passed

larrybradley deleted the multiband-catalog branch November 14, 2024 21:55

stscijgbot-rstdms mentioned this pull request Nov 18, 2024

Step for multiband source catalogs #1486

Closed
