Copy input datamodel in source_catalog step #1457

Merged
5 commits merged into spacetelescope:main from the bkgsub branch
Nov 14, 2024

larrybradley
Member

@larrybradley larrybradley commented Oct 15, 2024

With this PR the input datamodel to the source_catalog step is copied before being operated on so that it is left completely unaltered.

Tasks

  • request a review from someone specific, to avoid making the maintainers review every PR
  • add a build milestone, i.e. 24Q4_B15 (use the latest build if not sure)
  • Does this PR change user-facing code / API? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly
news fragment change types...
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change
  • changes/<PR#>.docs.rst
  • changes/<PR#>.stpipe.rst
  • changes/<PR#>.associations.rst
  • changes/<PR#>.scripts.rst
  • changes/<PR#>.mosaic_pipeline.rst
  • changes/<PR#>.patch_match.rst

steps

  • changes/<PR#>.dq_init.rst
  • changes/<PR#>.saturation.rst
  • changes/<PR#>.refpix.rst
  • changes/<PR#>.linearity.rst
  • changes/<PR#>.dark_current.rst
  • changes/<PR#>.jump_detection.rst
  • changes/<PR#>.ramp_fitting.rst
  • changes/<PR#>.assign_wcs.rst
  • changes/<PR#>.flatfield.rst
  • changes/<PR#>.photom.rst
  • changes/<PR#>.flux.rst
  • changes/<PR#>.source_detection.rst
  • changes/<PR#>.tweakreg.rst
  • changes/<PR#>.skymatch.rst
  • changes/<PR#>.outlier_detection.rst
  • changes/<PR#>.resample.rst
  • changes/<PR#>.source_catalog.rst

@larrybradley larrybradley added this to the 25Q1_B16 milestone Oct 15, 2024
@larrybradley larrybradley requested a review from schlafly October 15, 2024 16:32
@larrybradley larrybradley requested a review from a team as a code owner October 15, 2024 16:32
@schlafly
Collaborator

When I was thinking about this I imagined only copying model.data (to avoid copying all of the other image planes that aren't really relevant). But I don't know to what extent photutils might touch the input data, or whether only the model.data -= bkg.background operation is expected to be relevant.
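
For context on why the in-place `-=` matters, here is a minimal NumPy sketch. The array names and the two helper functions are illustrative assumptions, not romancal or photutils code. It shows that augmented assignment mutates the caller's array, while a plain subtraction allocates a new one:

```python
import numpy as np

# Illustrative stand-ins for model.data and bkg.background.
image = np.full((2, 2), 5.0, dtype=np.float32)
background = np.ones((2, 2), dtype=np.float32)


def subtract_in_place(data, bkg):
    """Augmented assignment writes into the array the caller passed in."""
    data -= bkg
    return data


def subtract_copy(data, bkg):
    """Plain subtraction allocates a new array and leaves the input alone."""
    return data - bkg


untouched = image.copy()

out_copy = subtract_copy(image, background)
copy_left_input_alone = np.array_equal(image, untouched)

out_inplace = subtract_in_place(image, background)
inplace_changed_input = not np.array_equal(image, untouched)
```

Only the second call changes `image`, which is the kind of side effect this PR is meant to remove.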

@larrybradley
Member Author

The model.err array is also modified in place in RomanSourceCatalog to convert from MJy/sr to uJy. But I can update this to copy only those 2 arrays.

@schlafly
Collaborator

If the source_catalog code is responsible for all of the modifications, so that we can safely save copies of those arrays in it and restore them at the end, let's do that. If photutils is "allowed" to modify the arrays we pass it in the model, then let's go ahead and do the full model copy.

Maybe let's also document in the docstring of some function that source_catalog is allowed to change the data and err fields by a scale factor (numerical precision differences only). Thanks!

@larrybradley
Member Author

The modifications happen before anything is passed to photutils. photutils isn't changing any input values.

codecov bot commented Oct 15, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 1 line in your changes missing coverage. Please review.

Project coverage is 76.21%. Comparing base (74f6ab8) to head (196696b).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
romancal/source_catalog/source_catalog_step.py 92.85% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1457      +/-   ##
==========================================
- Coverage   76.24%   76.21%   -0.04%     
==========================================
  Files         115      115              
  Lines        7650     7639      -11     
==========================================
- Hits         5833     5822      -11     
  Misses       1817     1817              


@schlafly
Collaborator

Great, then let's try to only save the planes we are explicitly modifying.

@larrybradley
Member Author

@schlafly I've updated this PR to copy only the data and err arrays (the ones that are modified in place in the step).
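
A minimal sketch of the copy-and-restore idea described above, not the actual romancal implementation. It assumes a stand-in model (a `SimpleNamespace` holding `data` and `err` arrays); the function name `run_catalog_like_step`, the background value, and the scale factor are all hypothetical:

```python
from types import SimpleNamespace

import numpy as np


def run_catalog_like_step(model, scale=1.0e3):
    """Hypothetical step: modify data/err in place, then restore the originals."""
    # Save copies of only the planes that are modified in place.
    original_data = model.data.copy()
    original_err = model.err.copy()
    try:
        # Stand-ins for the in-place background subtraction and unit conversion.
        model.data -= np.float32(0.1)
        model.data *= np.float32(scale)
        model.err *= np.float32(scale)
        result = float(model.data.sum())
    finally:
        # Put the saved arrays back so the input model is left unaltered,
        # even if the work above raised an exception.
        model.data = original_data
        model.err = original_err
    return result


rng = np.random.default_rng(0)
model = SimpleNamespace(
    data=rng.random((4, 4), dtype=np.float32),
    err=rng.random((4, 4), dtype=np.float32),
)
before_data = model.data.copy()
before_err = model.err.copy()

catalog_result = run_catalog_like_step(model)
```

Copying two arrays rather than the whole model avoids duplicating image planes the step never touches, which is the trade-off discussed earlier in this thread.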

Collaborator

@schlafly schlafly left a comment

This looks good to me, pending successful completion of the regression tests.

- assert_allclose(original_data, mosaic_model.data, atol=5.0e-5)
- assert_allclose(original_err, mosaic_model.err, atol=5.0e-5)
+ assert_equal(original_data, mosaic_model.data)
+ assert_equal(original_err, mosaic_model.err)
Collaborator

Thanks for this test!

Collaborator

@mairanteodoro mairanteodoro left a comment

Looks good to me.
I only have one concern about using assert_equal() to compare float32 arrays.

- assert_allclose(original_data, image_model.data, atol=5.0e-5)
- assert_allclose(original_err, image_model.err, atol=5.0e-5)
+ assert_equal(original_data, image_model.data)
+ assert_equal(original_err, image_model.err)
Collaborator

I think it would be safer to use assert_allclose() because both arrays are of type float32.

Member Author

With this PR the arrays really are identical, so assert_equal should always work.
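
To illustrate the distinction being discussed, here is a small sketch using made-up float32 data and a hypothetical scale factor, not the real test or unit conversion. Undoing an in-place scaling by dividing it back out is only guaranteed to agree to within rounding, whereas restoring a saved copy is bit-for-bit identical:

```python
import numpy as np
from numpy.testing import assert_allclose, assert_equal

rng = np.random.default_rng(1)
original = rng.random(1000, dtype=np.float32)
reference = original.copy()   # what a test would hold on to
scale = np.float32(3.0e5)     # hypothetical conversion factor

# Approach A: scale in place, then undo by dividing.
# Each operation rounds to float32, so only a tolerance check is safe.
roundtrip = original.copy()
roundtrip *= scale
roundtrip /= scale
assert_allclose(roundtrip, reference, rtol=1e-6)
n_changed = int(np.count_nonzero(roundtrip != reference))

# Approach B: save a copy, modify in place, then restore the copy.
# No arithmetic touches the restored values, so exact equality holds.
saved = original.copy()
original *= scale
original = saved
assert_equal(original, reference)

print(f"elements altered by the scale/unscale round trip: {n_changed}")
```

Exact comparison of float32 arrays is safe in Approach B because the restored array was never recomputed, only copied.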

@schlafly
Collaborator

This "fails" the regtests, but I think that just shows the change is working: the numerical noise is now fixed, so we should okify. The only failures are pipeline tests that run source cataloging, and they show small numerical differences.
https://github.com/spacetelescope/RegressionTests/actions/runs/11839507319/job/32991232674

@larrybradley
Member Author

This was out-of-sync with main, so I just rebased. I'll merge when CI passes again.

@larrybradley larrybradley merged commit 1fe4cbc into spacetelescope:main Nov 14, 2024
31 checks passed
@larrybradley larrybradley deleted the bkgsub branch November 14, 2024 18:26
3 participants