Added validation function for `serodata` and `fit_seromodel`. Fixes #148 #154
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #154      +/-   ##
==========================================
+ Coverage   68.26%   68.63%   +0.37%
==========================================
  Files          10       10
  Lines        1736     1779      +43
==========================================
+ Hits         1185     1221      +36
- Misses        551      558       +7
@ntorresd I just added type checking for …
…anged `validate_serodata` to `validate_prepared_serodata` within fit_seromodel
… `prepare_serodata`
4977060 to ef879d7
`validate_prepared_serodata` also calls `validate_serodata`
@ntorresd Please let me know if …
…alidate_prepared_serodata`
…alidate_prepared_serodata`
R/modelling.R
Outdated
missing <- optional_cols[which(!(optional_cols %in% colnames(serodata)))]
warning(
  "The following optional columns in `serodata` are missing.",
  "Consider including them to get more information from this analysis",
This is currently printing the following when `country`, `test` and `antibody` are missing:
The following optional columns in `serodata` are missing.Consider including them to get more information from this analysiscountry, test, antibody
Please add a line break at the end of line 56.
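The run-together output above follows from how `warning()` (and `stop()`) build their message: the arguments are pasted together with no separator. A minimal sketch of one possible fix is shown below; `format_missing_cols_msg` is a hypothetical helper, not a function from this PR, and the exact wording of the message is an assumption.

```r
# `warning()` pastes its arguments together with no separator,
# which reproduces the behaviour reported in the review comment.
unseparated <- tryCatch(
  warning("first sentence.", "second sentence"),
  warning = function(w) conditionMessage(w)
)

# Hypothetical helper (not part of the PR): builds the message with
# explicit line breaks and a comma-separated list of column names.
format_missing_cols_msg <- function(missing) {
  paste0(
    "The following optional columns in `serodata` are missing.\n",
    "Consider including them to get more information from this analysis:\n",
    toString(missing)
  )
}

# Example call with the columns mentioned in the review comment
msg <- format_missing_cols_msg(c("country", "test", "antibody"))
```

Passing `msg` to `warning()` would then print each sentence and the column list on separate lines. Adding `"\n"` directly at the end of each string inside the existing `warning()` call would presumably work just as well.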
R/modelling.R
Outdated
) {
  missing <- must_have_cols[which(!(must_have_cols %in% colnames(serodata)))]
  stop(
    "The following mandatory columns in `serodata` are missing.",
Please add a line break at the end of this message too.
R/modelling.R
Outdated
# Check that the serodata has the necessary columns to fully
# identify the age groups
stopifnot(
Thinking about the implementation of the age-varying models and the additional time-varying models, I realized that this kind of validation may not be necessary, given that we will need `age_min` and `age_max` anyway to correctly specify the chunks on which the FoI values are estimated. Both of these should therefore be validated in `validate_serodata` as mandatory columns, whereas `age_mean_f` should be validated in `validate_prepared_serodata`. This way the data validation will be simpler.
validate_serodata <- function(serodata) {
  col_types <- list(
Please add both `age_min` and `age_max` to this list.
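As an illustration of the request, a rough sketch of a validator that treats `age_min` and `age_max` as mandatory columns could look like the following. The function name `sketch_validate_serodata`, the use of `is.numeric` predicates instead of the PR's `col_types` list, and the assumption that all five columns are numeric are hypothetical and not taken from the PR.

```r
# Hypothetical sketch, not the PR's implementation.
sketch_validate_serodata <- function(serodata) {
  # Assumed checks: every mandatory column is expected to be numeric
  col_checks <- list(
    total = is.numeric,
    counts = is.numeric,
    tsur = is.numeric,
    age_min = is.numeric,
    age_max = is.numeric
  )

  # Report mandatory columns that are absent
  missing <- setdiff(names(col_checks), colnames(serodata))
  if (length(missing) > 0) {
    stop(
      "The following mandatory columns in `serodata` are missing.\n",
      toString(missing),
      call. = FALSE
    )
  }

  # Report columns whose contents fail their type predicate
  passes <- vapply(
    names(col_checks),
    function(col) col_checks[[col]](serodata[[col]]),
    logical(1)
  )
  if (!all(passes)) {
    stop(
      "The following columns in `serodata` have an unexpected type.\n",
      toString(names(col_checks)[!passes]),
      call. = FALSE
    )
  }

  invisible(serodata)
}
```

Using predicates such as `is.numeric` rather than class names means integer columns (for example counts read as integers) also pass, which may or may not be the behaviour wanted here.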
}

validate_prepared_serodata <- function(serodata) {
  col_types <- list(
`total`, `counts` and `tsur` are already validated by `validate_serodata(serodata)` in line 113. Here we should just make sure that both `age_mean_f` and `birth_year` have been added to the data.
For the time being we can also add `prev_obs`, `prev_obs_lower` and `prev_obs_upper` (which should be numeric) for consistency with the current version of `prepare_serodata`. Although they're not needed for modelling, they're currently used for plotting purposes, so to simplify data validation for those functions I think it's worth adding them here. In the future we may refactor `prepare_serodata` so that it only prepares the data for modelling, and compute the prevalence with its binomial confidence interval internally in the plotting functions (if we decide to keep the visualization module in the package), but we can decide this later.
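A minimal sketch of the check suggested here, limited to the columns that `prepare_serodata` is expected to add, might look like this. The function name `sketch_validate_prepared_serodata` is hypothetical, and treating `age_mean_f` and `birth_year` as numeric is an assumption (the comment only states that the `prev_obs*` columns should be numeric).

```r
# Hypothetical sketch, not the PR's implementation. In the PR,
# `validate_serodata(serodata)` runs first; that call is omitted here
# so the example stays self-contained.
sketch_validate_prepared_serodata <- function(serodata) {
  added_cols <- c(
    "age_mean_f", "birth_year",
    "prev_obs", "prev_obs_lower", "prev_obs_upper"
  )

  # Columns that the preparation step should have added but did not
  missing <- setdiff(added_cols, colnames(serodata))
  if (length(missing) > 0) {
    stop(
      "The following columns are missing from the prepared `serodata`.\n",
      toString(missing),
      call. = FALSE
    )
  }

  # Assumption: all added columns are numeric
  is_num <- vapply(serodata[added_cols], is.numeric, logical(1))
  if (!all(is_num)) {
    stop(
      "The following columns in the prepared `serodata` are not numeric.\n",
      toString(added_cols[!is_num]),
      call. = FALSE
    )
  }

  invisible(serodata)
}
```

If the prevalence columns are later computed inside the plotting functions, as the comment suggests, the last three entries of `added_cols` could simply be dropped.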
Fixes #152 since …
Please check that the validations I do for each column in `serodata` are correct.