make missing data work with -999 #540

Closed

Conversation

ChristineStawitz-NOAA
Contributor

What is the feature?

  • Allows FIMS to handle missing data years if the data value is coded = -999

How have you implemented the solution?

Does the PR impact any other area of the project?

How to test this change

Developer pre-PR checklist

  • I relied on GitHub actions to 🧪 things for me while I sat on the 🛋️.

Contributor

Instructions for code reviewer

Hello reviewer, thanks for taking the time to review this PR!

  • Please use this checklist during your review, checking off items that you have verified are complete!
  • For PRs that don't make changes to code (e.g., changes to README.md or GitHub Actions workflows), feel free to skip over items on the checklist that are not relevant. Remember that it is still important to do a thorough review.
  • Then, comment on the pull request with your review indicating where you have questions or changes need to be made before merging.
  • Remember to review every line of code you’ve been asked to review, look at the context, make sure you’re improving code health, and compliment developers on good things that they do.
  • PR reviews are a great way to learn, so feel free to share your tips and tricks. However, for optional changes (i.e., not required for merging), please include nit: (for nitpicking) before making the suggestion. For example, nit: I prefer using a data.frame() instead of a matrix because...
  • Engage with the developer when they respond to comments and check off additional boxes as they become complete so the PR can be merged in when all the tasks are fulfilled. Make it clear when this has been reached by commenting on the PR with something like This PR is now ready to be merged, no changes needed.

Checklist

  • The PR is requested to be merged into the appropriate branch (typically main)
  • The code is well-designed.
  • The functionality is good for the users of the code.
  • Any User Interface changes are sensible and look good.
  • The code isn’t more complex than it needs to be.
  • Code coverage remains high, indicating the new code is tested.
  • The developer used clear names for everything.
  • Comments are clear and useful, and mostly explain why instead of what.
  • Code is appropriately documented (doxygen and roxygen).

codecov bot commented Jan 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (70c6cc9) 75.28% compared to head (346b023) 75.35%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #540      +/-   ##
==========================================
+ Coverage   75.28%   75.35%   +0.07%     
==========================================
  Files          39       39              
  Lines        2080     2086       +6     
  Branches      140      140              
==========================================
+ Hits         1566     1572       +6     
  Misses        471      471              
  Partials       43       43              

@k-doering-NOAA k-doering-NOAA self-requested a review February 6, 2024 23:14
Member

@k-doering-NOAA k-doering-NOAA left a comment

@ChristineStawitz-NOAA I took a look and don't see any issues with this. I had a few questions, mostly for my understanding. They do not need to be resolved to merge, but answering them would help me :).

One thing that I'm not sure about is how much documentation this should have. Perhaps at least explaining in the description of this PR how users would use this -999 value would be helpful? That way, we will have something to start from when we write more formal documentation!

dnorm.mean = fims_math::log(this->expected_index[i]);
dnorm.sd = fims_math::exp(this->log_obs_error[i]);
nll -= dnorm.evaluate(true);
if(this->observed_index_data->at(i) != this->observed_index_data->na_value){
Member

It seems like if we add more data types besides an index and age comps, more if() statements would need to be added to each of the data types' negative log likelihood assignment statements to allow for NAs to work, correct?

Contributor Author

Correct! While coding, we also discussed whether we need to handle the case where some ages have NA values but others do not, even though some comp samples exist. For now, you can only specify a full missing year. Are you thinking we should document this or add it to an issue?
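
For reference, the guard discussed in this thread can be sketched outside of FIMS as follows. This is only a minimal illustration under stated assumptions, not the FIMS implementation: `index_nll`, `log_dnorm`, and `kNaValue` are hypothetical names, and the assumption that the observation enters the density on the log scale is mine (the reviewed snippet only shows how the mean and standard deviation are set).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sentinel for a missing observation, following the -999
// convention proposed in this PR.
constexpr double kNaValue = -999.0;

// Log density of a normal distribution with the given mean and standard deviation.
inline double log_dnorm(double x, double mean, double sd) {
  const double kPi = 3.14159265358979323846;
  const double z = (x - mean) / sd;
  return -0.5 * std::log(2.0 * kPi) - std::log(sd) - 0.5 * z * z;
}

// Negative log likelihood for an index series. Entries equal to `na_value`
// contribute nothing, which mirrors the `if (... != na_value)` guard in the
// reviewed snippet. ASSUMPTION: the observation is compared on the log scale.
// The three vectors are expected to have the same length.
double index_nll(const std::vector<double>& observed,
                 const std::vector<double>& expected,
                 const std::vector<double>& log_obs_error,
                 double na_value = kNaValue) {
  double nll = 0.0;
  for (std::size_t i = 0; i < observed.size(); ++i) {
    if (observed[i] != na_value) {
      nll -= log_dnorm(std::log(observed[i]),
                       std::log(expected[i]),
                       std::exp(log_obs_error[i]));
    }
  }
  return nll;
}
```

As noted above, each additional data type would need its own guard of this kind in its likelihood loop.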

testindex <- 2
na_value <- -999
if(i==4){
fishing_fleet_index$index_data[testindex] <- na_value
Member

This is how R users would use it currently, correct? Assigning -999 to anything that should be an NA?

Maybe this should be added to the PR description at least for some minimal documentation on how this works.

Contributor Author

Good point - I added something to the fims-demo vignette about how to specify missing data!

@k-doering-NOAA
Member

@ChristineStawitz-NOAA before you merge: I just realized that the missing data commit is in the proportion female PR (#543) as well. Perhaps it would be better to either 1) not merge this in and only merge in #543, or 2) merge this in and do some rebasing magic on #543 before merging this in.

@Cole-Monnahan-NOAA
Contributor

One observation from exploring this branch and putting missing data in. When I put -999 in for a year, I also put -1 in for input multinomial sample size, thinking that it would ignore this value since it's skipped in the calculations. But it caused some really unexpected behavior. The model would run but had the wrong results. I could try to reproduce this if necessary at some point. But I think we need more error checking for inputs to catch some of this early on.
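
As a rough idea of what such an early check could look like, here is a minimal sketch. It is not part of this PR or of FIMS: `invalid_sample_size_years` is a hypothetical helper, and the rule that a sample size must be finite and non-negative is an assumption. Whether a year flagged as missing should be exempt from the check is left open.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the zero-based positions of the years whose
// input sample size is not usable. ASSUMPTION: a usable sample size is a
// finite, non-negative number, so a placeholder such as -1 is reported
// instead of being passed silently into the likelihood.
std::vector<std::size_t> invalid_sample_size_years(
    const std::vector<double>& sample_size) {
  std::vector<std::size_t> bad_years;
  for (std::size_t y = 0; y < sample_size.size(); ++y) {
    if (!std::isfinite(sample_size[y]) || sample_size[y] < 0.0) {
      bad_years.push_back(y);
    }
  }
  return bad_years;
}
```

A caller could stop with an informative error whenever the returned vector is not empty, before any model fitting starts.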

Otherwise, I approve the review and am fine with closing it. Sorry I was late to this.

@ChristineStawitz-NOAA
Contributor Author

ChristineStawitz-NOAA commented Feb 9, 2024 via email

@kellijohnson-NOAA
Contributor

It seems like the original plan of the data team to handle missing values behind the scenes would make things more consistent than having people put -999 in their data input.
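
One way "behind the scenes" handling could work, sketched purely as an illustration: the input layer translates missing values into the sentinel, so users never type -999 themselves. The function name `encode_missing` is hypothetical, and the assumption that missing values arrive as NaN is mine; the data team's actual plan is not described in this thread.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: replaces every NaN in the input with the sentinel the
// likelihood code checks for. ASSUMPTION: missing values reach this layer as
// NaN, and -999 is the sentinel used downstream.
std::vector<double> encode_missing(std::vector<double> values,
                                   double na_value = -999.0) {
  for (double& v : values) {
    if (std::isnan(v)) {
      v = na_value;
    }
  }
  return values;
}
```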

@ChristineStawitz-NOAA
Contributor Author

ChristineStawitz-NOAA commented Feb 9, 2024 via email

@kellijohnson-NOAA
Contributor

Thanks @ChristineStawitz-NOAA for the clarification.
