Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full refactorization of modelling and visualisation functions #200

Merged
merged 78 commits into from
Sep 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
78 commits
Select commit Hold shift + click to select a range
d97c921
clean R and tests folders
ntorresd May 24, 2024
4dfbf46
feat: add constant models with and without seroreversion
ntorresd May 25, 2024
1fe63f6
delete old modules
ntorresd May 25, 2024
998a182
add config file with default priors and distribution indexes
ntorresd May 25, 2024
c86101a
add input validation utilities
ntorresd May 25, 2024
ba67365
feat: add working version of `fit_seromodel` using functional specifi…
ntorresd May 25, 2024
9ca8276
feat: add clean seroprevalence visualisation functions
ntorresd May 25, 2024
58a838d
feat: add age_seroreversion model
ntorresd May 25, 2024
760096d
feat: add `foi_index` (old `chunks`) default behavior
ntorresd May 25, 2024
15aad8d
feat: add `plot_foi_estimates` with option to add foi trend
ntorresd May 25, 2024
a371ac3
change model name for models without seroreversion to *_no_seroreversion
ntorresd May 25, 2024
6389d3c
feat: add age-varying model without seroreversion
ntorresd May 25, 2024
3a61712
refac: change stan functions names to avoid ambiguity
ntorresd May 26, 2024
fdcfe9b
feat: add time varying model with and without seroreversion
ntorresd May 26, 2024
36cf0b2
remove unnecessary line in `probability_exact_time_varying`
ntorresd May 26, 2024
ea07ee8
doc: add documentation for new functions
ntorresd May 27, 2024
ea6093f
refac: move data simulation validation functions to validation module
ntorresd Jul 31, 2024
962df9e
fix: change to `extract_central_estimates` to deal with 1-time estimates
ntorresd May 28, 2024
aedfc4f
feat: add 'plot_rhats' function
ntorresd Jul 31, 2024
daa6873
remove unnecessary stan file
ntorresd May 28, 2024
fd75b70
fix: add error message for constant model exceptions in plotting func…
ntorresd May 28, 2024
8ecde6b
feat: add `summarise_model` and `plot_summary` functions
ntorresd May 28, 2024
a93075b
feat: add new `plot_seromodel` function
ntorresd May 28, 2024
260e74e
fix: change reference to config file
ntorresd May 28, 2024
afe8512
fix: correct stan models reference
ntorresd May 28, 2024
d009c50
doc: add export tag to plot_seromodel
ntorresd Jul 31, 2024
de1cca9
fix: remove uneccessary argument call in `plot_seromodel`
ntorresd May 29, 2024
80b5ac1
feat: introduce 'summarise_loo_estimate' to simplify 'summarise_serom…
ntorresd Aug 1, 2024
5473e03
refac: move 'summarise_seromodel' to a separate file
ntorresd Aug 1, 2024
9684420
feat: add convergence field to summary in 'summarise_seromodel'
ntorresd Aug 1, 2024
7961fc9
refac: improve parameter specification in plotting functions
ntorresd Aug 1, 2024
aa544ce
add comments to config.yml
ntorresd May 30, 2024
a2feb51
fix: add manually null seroprev for age=0 in expanded prevalence
ntorresd Aug 1, 2024
f22090f
doc: lintr functions' documentation
ntorresd Aug 6, 2024
768c20f
refac: add `#nolint` to long variable and function names
ntorresd Aug 6, 2024
f8cd7bb
refac(lintr): add `.data$` where needed in pipes
ntorresd Aug 6, 2024
691b6c0
refac(lintr): other syntax and lintr corrections
ntorresd Aug 6, 2024
4fbc02b
fix: remove deprecated `@docType` tag
ntorresd Aug 6, 2024
67da8f4
doc(lintr): syntax and lintr corrections to data simulation vignette
ntorresd Aug 6, 2024
de8fa30
doc: update documentation
ntorresd Aug 6, 2024
db19c9a
update DESCRIPTION
ntorresd Aug 6, 2024
5bd6560
update WORDLIST
ntorresd Aug 6, 2024
76cd72e
fix: change `append()` by `c()` to append when possible
ntorresd Aug 15, 2024
a6cf80c
fix: improve error messages readability
ntorresd Aug 15, 2024
dc1166c
fix: replace validation function by `checkmate::assert_names()`
ntorresd Aug 15, 2024
59a3abc
feat: add stan time-varying models in log-scale
ntorresd Aug 15, 2024
b7b66a3
feat: modify R modules to allow log-scale model implementation
ntorresd Aug 15, 2024
3461930
feat: update stanmodels.R
ntorresd Aug 15, 2024
9d64eae
fix: correct foi_sigma_rw parameters set up
ntorresd Aug 15, 2024
a569288
feat: set default behavior for sampling initialization
ntorresd Aug 15, 2024
367724c
fix: simplify loaded datasets structure and update column NAMESPACE
ntorresd Aug 15, 2024
1f806bf
fix: change `tsur` to `survey_year` in R modules
ntorresd Aug 15, 2024
a6e34a0
update NAMESPACE
ntorresd Aug 15, 2024
fdb3d3a
update DESCRIPTION
ntorresd Aug 20, 2024
209dc30
feat: enable sigma prior specification for all foi_models
ntorresd Aug 20, 2024
3170390
fix: change default seroreversion prior to `sf_normal`
ntorresd Aug 20, 2024
60866f7
fix: add error whenever a not supported `model_type` is specified
ntorresd Aug 20, 2024
671a416
doc: improve `fit_seromodel` prior parameters documentation
ntorresd Aug 20, 2024
c5e50bc
fix: remove unecessary samples from output stanfit object
ntorresd Aug 20, 2024
c675100
feat: add `get_age_intervals` function to plot_seromodel module
ntorresd Aug 20, 2024
33f8968
feat: enable data binning for plotting in `plot_serosurvey`
ntorresd Aug 20, 2024
e3cfd30
fix: remove unecessary validation of `survey_year` in `validate_seros…
ntorresd Aug 20, 2024
ac7d073
fix: minor syntax corrections
ntorresd Aug 20, 2024
83c2c6a
doc: update documentation
ntorresd Aug 20, 2024
0e0b13a
doc: upgrade vignettes accordingly to refactorization
ntorresd Aug 20, 2024
97f5f3f
update WORDLIST
ntorresd Aug 20, 2024
7ce2600
doc: update serofoi.Rmd
ntorresd Aug 20, 2024
9571e25
doc: remove unecessary dependency from simulating_serosurveys vignette
ntorresd Aug 20, 2024
4dfbd35
fix: correct serofoi-package.Rd encoding
ntorresd Aug 20, 2024
4f236be
doc: move vignettes to articles
ntorresd Aug 20, 2024
04b0dc3
change version tag to 1.0.0
ntorresd Aug 21, 2024
5a4a554
fix: remove `survey_year` from `plot_serosurvey()`
ntorresd Aug 21, 2024
8d3ed76
doc: add some examples in functions' documentation
ntorresd Aug 22, 2024
c7d8ac8
doc: add complementary information to functions' documentation
ntorresd Aug 22, 2024
b56a8a9
doc: update documentation
ntorresd Aug 22, 2024
a671e4d
refac: change `sample_size` for `n_sample` across the package
ntorresd Sep 9, 2024
6195b5d
doc: update documentation
ntorresd Sep 9, 2024
e3092d3
change version to 1.0.1
ntorresd Sep 9, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,4 @@
^Meta$
^.lintr$
^.*/tests/docker/.*$
^vignettes/articles$
10 changes: 6 additions & 4 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: serofoi
Type: Package
Title: Estimates the Force-of-Infection of a given pathogen from population based seroprevalence studies
Version: 0.1.0
Title: Estimates the Force-of-Infection of a given pathogen from population-based seroprevalence studies
Version: 1.0.1
Authors@R:
c(
person(
Expand Down Expand Up @@ -36,7 +36,7 @@ License: MIT + file LICENSE
Encoding: UTF-8
Language: en-GB
LazyData: true
RoxygenNote: 7.3.1
RoxygenNote: 7.3.2
NeedsCompilation: yes
Depends:
R (>= 3.5.0)
Expand All @@ -55,7 +55,9 @@ Imports:
purrr,
tidyr,
tibble,
Matrix
Matrix,
glue,
config
LinkingTo:
BH (>= 1.66.0),
Rcpp (>= 0.12.0),
Expand Down
28 changes: 15 additions & 13 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1,27 +1,29 @@
# Generated by roxygen2: do not edit by hand

export(extract_seromodel_summary)
export(build_stan_data)
export(extract_central_estimates)
export(fit_seromodel)
export(get_chunk_structure)
export(get_cohort_ages)
export(get_foi_central_estimates)
export(get_prev_expanded)
export(get_table_rhats)
export(plot_foi)
export(plot_info_table)
export(plot_rhats)
export(get_foi_index)
export(plot_foi_estimates)
export(plot_seromodel)
export(plot_seroprev)
export(plot_seroprev_fitted)
export(prepare_serodata)
export(plot_seroprevalence_estimates)
export(plot_serosurvey)
export(plot_summary)
export(prepare_serosurvey_for_plotting)
export(probability_seropositive_by_age)
export(probability_seropositive_general_model_by_age)
export(run_seromodel)
export(set_foi_init)
export(sf_cauchy)
export(sf_normal)
export(sf_uniform)
export(simulate_serosurvey)
export(simulate_serosurvey_age_and_time_model)
export(simulate_serosurvey_age_model)
export(simulate_serosurvey_general_model)
export(simulate_serosurvey_time_model)
export(summarise_central_estimate)
export(summarise_loo_estimate)
export(summarise_seromodel)
import(Rcpp)
import(dplyr)
import(methods)
Expand Down
245 changes: 245 additions & 0 deletions R/build_stan_data.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,245 @@
#' Sets normal distribution parameters for sampling
#'
#' @param mean Mean of the normal distribution
#' @param sd Standard deviation of the normal distribution
#' @return List with specified statistics and name of the model
#' @export
sf_normal <- function(mean = 0, sd = 1) {
# Restricting normal inputs to be non-negative
if (mean < 0 || sd <= 0) {
stop(
"Normal distribution only accepts",
" `mean>=0` and `sd>0` for mean and standard deviation")
}

return(list(mean = mean, sd = sd, name = "normal"))
}

#' Sets uniform distribution parameters for sampling
#'
#' @param min Minimum value of the random variable of the uniform distribution
#' @param max Maximum value of the random variable of the uniform distribution
#' @return List with specified statistics and name of the model
#' @export
sf_uniform <- function(min = 0, max = 10) {
# Restricting uniform inputs to be non-negative
if (min < 0 || (min >= max)) {
stop(
"Uniform distribution only accepts",
" 0<=min<max"
)
}
if (min >= max) {
message("Uniform distribution only accepts min < max")
}

return(list(min = min, max = max, name = "uniform"))
}

#' Sets Cauchy distribution parameters for sampling
#'
#' @param scale Scale
#' of the normal distribution
#' @param sd Standard deviation of the normal distribution
#' @return List with specified statistics and name of the model
#' @export
sf_cauchy <- function(location = 0, scale = 1) {
# Restricting normal inputs to be non-negative
if (location < 0 || scale < 0) {
stop(
"Normal distribution only accepts",
" `location>=0` and `scale=>0` for mean and standard deviation")
}

return(list(location = location, scale = scale, name = "cauchy"))
}

#' Sets empty distribution
sf_none <- function() {
return(list(name = "none"))
}

#' Generates force-of-infection indexes for heterogeneous age groups
#'
#' Generates a list of integers indexing together the time/age intervals
#' for which FOI values will be estimated in [fit_seromodel].
#' The max value in `foi_index` correspond to the number of FOI values to
#' be estimated when sampling.
#' @inheritParams fit_seromodel
#' @param group_size Age groups size
#' @return Integer vector with the indexes numerating each year/age
#' (depending on the model).
#' @examples
#' data(chagas2012)
#' foi_index <- get_foi_index(chagas2012, group_size = 25)
#' @export
get_foi_index <- function(
serosurvey,
group_size
) {
checkmate::assert_int(
group_size,
lower = 1,
upper = max(serosurvey$age_max)
)

foi_index <- unlist(
purrr::map(
seq(
1,
max(serosurvey$age_max) / group_size,
1),
rep,
times = group_size
)
)

foi_index <- c(
foi_index,
rep(
max(foi_index),
max(serosurvey$age_max) - length(foi_index)
)
)

return(foi_index)
}

#' Set stan data defaults for sampling
#'
#' @param stan_data List to be passed to [rstan][rstan::sampling]
#' @inheritParams fit_seromodel
#' @return List with default values of stan data for sampling
set_stan_data_defaults <- function(
stan_data,
is_log_foi = FALSE,
is_seroreversion = FALSE
) {
config_file <- system.file("extdata", "config.yml", package = "serofoi")
prior_default <- config::get(file = config_file, "priors")$defaults

foi_defaults <- list(
foi_prior_index = prior_default$index,
foi_min = prior_default$min,
foi_max = prior_default$max,
foi_mean = prior_default$mean,
foi_sd = prior_default$sd
)
# Add sigma defaults depending on scale
if (is_log_foi) {
# Normal distribution defaults
foi_defaults <- c(
foi_defaults,
list(
foi_sigma_rw_loc = prior_default$mean,
foi_sigma_rw_sc = prior_default$sd
)
)
} else {
# Cauchy distribution defaults
foi_defaults <- c(
foi_defaults,
list(
foi_sigma_rw_loc = prior_default$location,
foi_sigma_rw_sc = prior_default$scale
)
)
}
stan_data <- c(
stan_data,
foi_defaults
)

if (is_seroreversion) {
seroreversion_defaults <- list(
seroreversion_prior_index = prior_default$index,
seroreversion_min = prior_default$min,
seroreversion_max = prior_default$max,
seroreversion_mean = prior_default$mean,
seroreversion_sd = prior_default$sd
)
stan_data <- c(
stan_data,
seroreversion_defaults
)
}

return(stan_data)
}

#' Builds stan data for sampling depending on the selected model
#'
#' @inheritParams fit_seromodel
#' @return List with necessary data for sampling the specified model
#' @export
build_stan_data <- function(
serosurvey,
model_type = "constant",
foi_prior = sf_uniform(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the config.yml file and it did seem to have a max default of 10 (which I think is fine); I just wanted to double-check that I read it correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. The default uniform distribution is sf_uniform(0, 10) and for the normal distribution it is sf_normal(0, 1).

foi_index = NULL,
is_log_foi = FALSE,
foi_sigma_rw = sf_none(),
is_seroreversion = FALSE,
seroreversion_prior = sf_none()
) {

stan_data <- list(
n_observations = nrow(serosurvey),
age_max = max(serosurvey$age_max),
ages = seq(1, max(serosurvey$age_max), 1),
n_seropositive = serosurvey$n_seropositive,
n_sample = serosurvey$n_sample,
age_groups = serosurvey$age_group
) %>%
set_stan_data_defaults(
is_log_foi = is_log_foi,
is_seroreversion = is_seroreversion
)

if (is.null(foi_index)) {
foi_index_default <- get_foi_index(serosurvey = serosurvey, group_size = 1)
stan_data <- c(
stan_data,
list(foi_index = foi_index_default)
)
} else {
# TODO: check that foi_index is the right size
stan_data <- c(
stan_data,
list(foi_index = foi_index)
)
}
config_file <- system.file("extdata", "config.yml", package = "serofoi")
prior_index <- config::get(file = config_file, "priors")$indexes

if (foi_prior$name == "uniform") {
stan_data$foi_prior_index <- prior_index[["uniform"]]
stan_data$foi_min <- foi_prior$min
stan_data$foi_max <- foi_prior$max
} else if (foi_prior$name == "normal") {
stan_data$foi_prior_index <- prior_index[["normal"]]
stan_data$foi_mean <- foi_prior$mean
stan_data$foi_sd <- foi_prior$sd
}

if (foi_sigma_rw$name == "cauchy") {
stan_data$foi_sigma_rw_loc <- foi_sigma_rw$location
stan_data$foi_sigma_rw_sc <- foi_sigma_rw$scale
}

if (is_seroreversion) {
if (seroreversion_prior$name == "none") {
stop("seroreversion_prior not specified")
} else if (seroreversion_prior$name == "uniform") {
stan_data$seroreversion_prior_index <- prior_index[["uniform"]]
stan_data$seroreversion_min <- seroreversion_prior$min
stan_data$seroreversion_max <- seroreversion_prior$max
} else if (seroreversion_prior$name == "normal") {
stan_data$seroreversion_prior_index <- prior_index[["normal"]]
stan_data$seroreversion_mean <- seroreversion_prior$mean
stan_data$seroreversion_sd <- seroreversion_prior$sd
}
}

return(stan_data)
}
17 changes: 0 additions & 17 deletions R/chagas2012.R

This file was deleted.

15 changes: 0 additions & 15 deletions R/chik2015.R

This file was deleted.

Loading
Loading