make missing data work with -999 #540

ChristineStawitz-NOAA · 2024-01-19T23:42:35Z

What is the feature?

Allows FIMS to handle missing years of data when the data value is coded as -999.

How have you implemented the solution?

Does the PR impact any other area of the project?

How to test this change

Developer pre-PR checklist

I relied on GitHub actions to 🧪 things for me while I sat on the 🛋️.

github-actions · 2024-01-19T23:42:48Z

Instructions for code reviewer

Hello reviewer, thanks for taking the time to review this PR!

Please use this checklist during your review, checking off items that you have verified are complete!
For PRs that don't make changes to code (e.g., changes to README.md or GitHub Actions workflows), feel free to skip over items on the checklist that are not relevant. Remember it is still important to do a thorough review.
Then, comment on the pull request with your review indicating where you have questions or changes need to be made before merging.
Remember to review every line of code you’ve been asked to review, look at the context, make sure you’re improving code health, and compliment developers on good things that they do.
PR reviews are a great way to learn, so feel free to share your tips and tricks. However, for optional changes (i.e., not required for merging), please include nit: (for nitpicking) before making the suggestion. For example, nit: I prefer using a data.frame() instead of a matrix because...
Engage with the developer when they respond to comments and check off additional boxes as they become complete so the PR can be merged in when all the tasks are fulfilled. Make it clear when this has been reached by commenting on the PR with something like This PR is now ready to be merged, no changes needed.

Checklist

The PR is requested to be merged into the appropriate branch (typically main)
The code is well-designed.
The functionality is good for the users of the code.
Any User Interface changes are sensible and look good.
The code isn’t more complex than it needs to be.
Code coverage remains high, indicating the new code is tested.
The developer used clear names for everything.
Comments are clear and useful, and mostly explain why instead of what.
Code is appropriately documented (doxygen and roxygen).

codecov · 2024-01-19T23:47:14Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (70c6cc9) 75.28% compared to head (346b023) 75.35%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #540      +/-   ##
==========================================
+ Coverage   75.28%   75.35%   +0.07%     
==========================================
  Files          39       39              
  Lines        2080     2086       +6     
  Branches      140      140              
==========================================
+ Hits         1566     1572       +6     
  Misses        471      471              
  Partials       43       43


k-doering-NOAA

@ChristineStawitz-NOAA I took a look and don't see any issues with this. I had a few questions, mostly for my understanding. They do not need to be resolved to merge, but answering them would help me :).

One thing that I'm not sure about is how much documentation this should have. Perhaps at least describing how users would use this -999 in the description of this PR would be helpful? That way, it will give us something to start from when writing more formal documentation!

k-doering-NOAA · 2024-02-06T23:15:30Z

inst/include/population_dynamics/fleet/fleet.hpp

-      dnorm.mean = fims_math::log(this->expected_index[i]);
-      dnorm.sd = fims_math::exp(this->log_obs_error[i]);
-      nll -= dnorm.evaluate(true);
+      if(this->observed_index_data->at(i) != this->observed_index_data->na_value){


It seems like if we add more data types besides an index and age comps, more if() statements would need to be added to each of the data types' negative log likelihood assignment statements to allow for NAs to work, correct?

ChristineStawitz-NOAA · 2024-02-08

Correct! While coding, we also discussed whether we need to handle the case where some ages are NA but others are not, even though some comp samples exist. For now, though, you can only specify a fully missing year. Are you thinking we should document this or add it to an issue?
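For readers following this thread, the guard discussed above can be sketched in isolation. This is a minimal, self-contained illustration and not the FIMS implementation: `kNaValue`, `normal_log_density`, and `index_nll` are hypothetical names, and a single shared `sd` stands in for the per-year observation error used in `fleet.hpp`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sentinel for this sketch; the PR codes missing values as -999.
constexpr double kNaValue = -999.0;

// Log density of a normal distribution with the given mean and standard deviation.
double normal_log_density(double x, double mean, double sd) {
  const double kPi = 3.14159265358979323846;
  const double z = (x - mean) / sd;
  return -0.5 * std::log(2.0 * kPi) - std::log(sd) - 0.5 * z * z;
}

// Negative log-likelihood of log(observed) around log(expected).
// Any year whose observation equals the sentinel is skipped entirely.
double index_nll(const std::vector<double>& observed,
                 const std::vector<double>& expected,
                 double sd) {
  assert(observed.size() == expected.size());
  double nll = 0.0;
  for (std::size_t i = 0; i < observed.size(); ++i) {
    if (observed[i] == kNaValue) {
      continue;  // missing year: contributes nothing to the likelihood
    }
    nll -= normal_log_density(std::log(observed[i]), std::log(expected[i]), sd);
  }
  return nll;
}
```

Under these assumptions, a series with one year coded as -999 yields the same value as the series with that year dropped. Every additional data type would need an equivalent guard in its own likelihood loop.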

k-doering-NOAA · 2024-02-06T23:20:50Z

tests/testthat/test-fims-estimation.R

+    testindex <- 2
+    na_value <- -999
+    if(i==4){
+      fishing_fleet_index$index_data[testindex] <- na_value


This is how R users would use it currently, correct? By assigning -999 to anything that should be an NA?

Maybe this should be added to the PR description at least for some minimal documentation on how this works.

ChristineStawitz-NOAA · 2024-02-08

Good point - I added something to the fims-demo vignette about how to specify missing data!

k-doering-NOAA · 2024-02-06T23:28:56Z

@ChristineStawitz-NOAA before you merge - I just realized that the missing data commit is in the proportion female PR (#543) as well. Perhaps it would be better to either 1) not merge this in and merge only #543, or 2) merge this in and do some rebasing magic on #543 before merging it.

Cole-Monnahan-NOAA · 2024-02-09T00:38:43Z

One observation from exploring this branch with missing data: when I put -999 in for a year, I also put -1 in for the input multinomial sample size, thinking it would be ignored because that year is skipped in the calculations. Instead, it caused some really unexpected behavior: the model ran but gave the wrong results. I could try to reproduce this at some point if necessary, but I think we need more error checking on inputs to catch problems like this early.

Otherwise I approve the review and closing it. Sorry I was late to this.
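As an illustration of the kind of early input check suggested here, the following is a minimal sketch under stated assumptions. `validate_observations` and `kNaValue` are hypothetical names, not part of FIMS, and the rule "every entry is either the missing-value code or a finite, non-negative number" is an assumption about what a valid input looks like.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sentinel for this sketch; the PR codes missing values as -999.
constexpr double kNaValue = -999.0;

// Throws std::invalid_argument for any entry that is neither the
// missing-value code nor a finite, non-negative number.
void validate_observations(const std::vector<double>& values,
                           const std::string& name) {
  for (std::size_t i = 0; i < values.size(); ++i) {
    const double v = values[i];
    if (v == kNaValue) {
      continue;  // explicitly coded as missing, so accepted
    }
    if (!std::isfinite(v) || v < 0.0) {
      throw std::invalid_argument(
          name + "[" + std::to_string(i) + "] = " + std::to_string(v) +
          " is neither a valid observation nor the missing-value code");
    }
  }
}
```

With a check like this, an input such as -1 would be rejected before the model runs instead of silently producing wrong results.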

ChristineStawitz-NOAA · 2024-02-09T16:31:58Z

Did you mean to close "Allow ability for proportion female to vary by age" (#543)? That's the PR I need approved to merge into main.

I'm confused about where you put the -1 in for multinomial sample size. The age_comp_data input is a vector of integers such that sample size is multiplied by the proportion in each age group. So did you have a vector of negative numbers in that year?


kellijohnson-NOAA · 2024-02-09T16:57:23Z

It seems like the original plan of the data team to handle missing values behind the scenes would make things more consistent than having people put -999 in their data input.

ChristineStawitz-NOAA · 2024-02-09T17:13:53Z

Agree - that is the long-term goal @kellijohnson-NOAA. This was an interim fix to get the case studies running with missing years of data prior to M2 development starting, with the eventual goal of populating the na_values internally within the R interface so the user doesn't have to do it.
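A rough sketch of what populating the missing-value code internally could look like, assuming missing observations reach the interface as NaN. That assumption, along with the names `recode_missing` and `kNaValue`, is hypothetical and not taken from the FIMS code base.

```cpp
#include <cmath>
#include <vector>

// Hypothetical sentinel for this sketch; the PR codes missing values as -999.
constexpr double kNaValue = -999.0;

// Returns a copy of `values` with every NaN replaced by the sentinel,
// so callers never have to type -999 themselves.
std::vector<double> recode_missing(std::vector<double> values) {
  for (double& v : values) {
    if (std::isnan(v)) {
      v = kNaValue;
    }
  }
  return values;
}
```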

kellijohnson-NOAA · 2024-02-09T17:15:30Z

Thanks @ChristineStawitz-NOAA for the clarification.

make missing data work with -999

346b023

ChristineStawitz-NOAA requested a review from Cole-Monnahan-NOAA January 19, 2024 23:42

ChristineStawitz-NOAA self-assigned this Jan 19, 2024

k-doering-NOAA self-requested a review February 6, 2024 23:14

k-doering-NOAA approved these changes Feb 6, 2024


Bai-Li-NOAA mentioned this pull request Feb 7, 2024

Allow ability for proportion female to vary by age #543

Merged

1 task

ChristineStawitz-NOAA closed this Feb 8, 2024

ChristineStawitz-NOAA deleted the isNA branch February 14, 2024 22:55
