Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added validation function for serodata and fit_seromodel. Fixes #148 #154

Merged
merged 9 commits into from
Feb 13, 2024

Conversation

jpavlich
Copy link
Member

Please check that the validations I do for each column in serodata are correct.

@jpavlich jpavlich linked an issue Jan 26, 2024 that may be closed by this pull request
R/modelling.R Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Feb 2, 2024

Codecov Report

Attention: 12 lines in your changes are missing coverage. Please review.

Comparison is base (df2322a) 68.26% compared to head (e0d7d1b) 68.63%.

❗ Current head e0d7d1b differs from pull request most recent head 2a5820a. Consider uploading reports for the commit 2a5820a to get more accurate results

Files Patch % Lines
R/modelling.R 84.61% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #154      +/-   ##
==========================================
+ Coverage   68.26%   68.63%   +0.37%     
==========================================
  Files          10       10              
  Lines        1736     1779      +43     
==========================================
+ Hits         1185     1221      +36     
- Misses        551      558       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jpavlich
Copy link
Member Author

jpavlich commented Feb 3, 2024

@ntorresd I just added type checking for serodata. Do not merge just yet. As soon as you confirm me the missing checks and the type checks are correct, I will write tests and method documentations, then we can merge this PR.

@ntorresd
Copy link
Member

ntorresd commented Feb 6, 2024

@ntorresd I just added type checking for serodata. Do not merge just yet. As soon as you confirm me the missing checks and the type checks are correct, I will write tests and method documentations, then we can merge this PR.

It seems like all the necessary checks are in place, thanks @jpavlich!

@jpavlich jpavlich force-pushed the 148-input-validation-for-fit_seromodel branch from 4977060 to ef879d7 Compare February 7, 2024 21:27
`validate_prepared_serodata` also calls `validate_serodata`
@jpavlich
Copy link
Member Author

jpavlich commented Feb 7, 2024

@ntorresd Please let me know if validate_prepared_serodata is ok. After merging into main I will open another branch to add documentation and tests

R/modelling.R Outdated
missing <- optional_cols[which(!(optional_cols %in% colnames(serodata)))]
warning(
"The following optional columns in `serodata` are missing.",
"Consider including them to get more information from this analysis",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently printing the following when country, test and antibody are missing:
The following optional columns in serodata are missing.Consider including them to get more information from this analysiscountry, test, antibody
Please add a line break at the end of line 56.

R/modelling.R Outdated
) {
missing <- must_have_cols[which(!(must_have_cols %in% colnames(serodata)))]
stop(
"The following mandatory columns in `serodata` are missing.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a line break at the end of this message too.

R/modelling.R Outdated

# Check that the serodata has the necessary columns to fully
# identify the age groups
stopifnot(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about the implementation of the age-varying models and the additional time-varying models I realized that this kind of validation may not be necessary given that we will need age_min and age_max anyway to correctly specify the chunks on which the FoI values are estimated, meaning that both of these should be validated in validate_serodata as mandatory columns, whereas age_mean_f should be validated on validate_prepared_serodata. This way the data validation will be simpler.



validate_serodata <- function(serodata) {
col_types <- list(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add both age_min and age_max to this list

}

validate_prepared_serodata <- function(serodata) {
col_types <- list(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total, counts and tsur are already validated by validate_serodata(serodata) in line 113. Here we should just make sure that both age_mean_f and birth_year had been added to the data.

For the time being we can also add prev_obs, prev_obs_lower and prev_obs_upper (which should be numeric) for consistency with the current version of prepare_serodata. Although they're not needed for modelling, they're currently used for plotting purposes, so to simplify data validation for those functions I think it's worth adding them here. In the future we may refactor prepare_serodata for it just to prepare the data for modelling and compute the prevalence with its binomial confidence interval internally in the plotting functions (if we decide to keep the visualization module in the package), but we can decide this later.

@ntorresd ntorresd merged commit a790a1f into main Feb 13, 2024
7 checks passed
@ntorresd ntorresd deleted the 148-input-validation-for-fit_seromodel branch February 13, 2024 00:06
@ntorresd ntorresd mentioned this pull request Mar 8, 2024
@jpavlich
Copy link
Member Author

Fixes #152 since group_sim_data calls prepare_serodata. The latter performs the required validations that are implemented in this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Input validation for fit_seromodel
3 participants