Improve simulate_*() functions: Rename columns + improve object names + comments #242
This PR also creates an opportunity to plot the network contained in the returned object:
reprex::reprex({
  library(epicontacts)
  # Install this PR and load the package
  pak::pkg_install("epiverse-trace/epichains#242")
  library(epichains)
  # Simulate an outbreak
  set.seed(32)
  outbreak <- simulate_chains(
    index_cases = 5,
    statistic = "size",
    offspring_dist = rpois,
    generation_time = function(n) rlnorm(n, meanlog = 0.58, sdlog = 1.58),
    lambda = 1.5,
    stat_max = 30
  )
  # Create an epicontacts object
  plot_df <- make_epicontacts(
    linelist = outbreak,
    contacts = outbreak,
    id = "infectee",
    from = "infector",
    to = "infectee",
    directed = TRUE
  )
  # Plot the epicontacts object
  plot(plot_df)
})
The code used above can be converted to a simple S3
Before I go into a detailed review, can I just ask whether one change I spotted was intentional? It affects the interpretation of the results and what the "correct" way of accounting is. Before (e.g. in bpmodels), each index case (/simulation) generated its own infectee ids, i.e. we could have
etc. Now the infectee_id is shared across all index cases (/simulations).
I think the way we want this relates to how we interpret the simulations (this also affects the
It only occurred to me when reviewing this PR that we're conflating these two concepts (also noting that the first argument is called
Yes, this change was in response to your suggestions in the linked issues and my summary of how I understood it here #238 (comment). For now, I'll revert to option 1, i.e., using the
e43e2ff to 3d8950f
@sbfnk Here is a reprex of what your comments here + in #238 (i.e., rename
library(epichains)
set.seed(123)
epc_out <- simulate_chains(
  index_cases = 5,
  statistic = "length",
  offspring_dist = rpois,
  stat_max = 100,
  lambda = 0.5
)
epc_out
#>    index_case infector infectee generation
#> 1           1       NA        1          1
#> 2           2       NA        1          1
#> 3           3       NA        1          1
#> 4           4       NA        1          1
#> 5           5       NA        1          1
#> 6           2        1        2          2
#> 7           4        1        2          2
#> 8           5        1        2          2
#> 9           5        1        3          2
#> 10          5        2        4          3
Created on 2024-05-09 with [reprex v2.1.0](https://reprex.tidyverse.org/)
Does this reflect what you were suggesting above?
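To make the id question concrete, here is a minimal base R sketch using a data frame copied by hand from the printed output above (it does not call epichains). It checks that infectee on its own repeats across index cases, while the (index_case, infectee) pair identifies each row. The composite `case_id` column is only an illustration, not something the package returns.

```r
# Hand-copied from the reprex output above; not generated by epichains here.
epc_out <- data.frame(
  index_case = c(1, 2, 3, 4, 5, 2, 4, 5, 5, 5),
  infector   = c(NA, NA, NA, NA, NA, 1, 1, 1, 1, 2),
  infectee   = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 4),
  generation = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3)
)

# infectee alone is reused across index cases ...
infectee_repeats <- anyDuplicated(epc_out$infectee) > 0

# ... but the (index_case, infectee) pair is unique per row.
pair_is_unique <- anyDuplicated(epc_out[c("index_case", "infectee")]) == 0

# Hypothetical composite id, only for illustration.
epc_out$case_id <- paste(epc_out$index_case, epc_out$infectee, sep = "-")
```

With per-chain ids like these, any lookup of an infector has to match on index_case as well as on the id.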
Yes except that I'd revert
This PR closes #238 and #175.
NEWS.md
Function enhancements + Documentation improvements
Currently, the <epichains> object returns columns with names infectee_id, sim_id, infector_id, and generation, and optionally, susc_pop and time (if pop and generation_time are specified respectively). However, these columns are confusing, swapped in interpretation, and not straightforward to explain as noted in the linked issues. The sim_id column is also not unique across the dataset, making it hard to interpret.
User-facing changes
The <epichains> object now returns columns with names chain, infector, infectee, generation, and optionally, time, if generation_time is specified. The susc_pop column has been removed as it was not deemed necessary to return.
The help file of simulate_chains() and simulate_chain_stats() also gain a new section providing a clear definition of what a "chain" is as used in the function.
The infectee column now contains a unique id for each infectee, which can link them to their infector and seeding index case.
The index_cases argument of simulate_chains() and simulate_chain_stats() has been renamed to n_chains to reflect the fact that the supplied number will simulate n independent chains, each starting with 1 individual.
Non-user-facing changes
Additionally, some of the objects in the code have been renamed and comments have been improved to make the code (hopefully) easier to read.
NA (package unpublished).
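As a rough illustration of how the chain, infector, and infectee columns described under the user-facing changes can link a case back to its seeding index case, here is a base R sketch. The data frame is typed by hand rather than produced by epichains, it assumes ids are unique within a chain, and `trace_to_index()` is a hypothetical helper that is not part of the package.

```r
# Hand-written example data using the new column names; an assumption,
# not actual simulate_chains() output.
chains <- data.frame(
  chain      = c(1, 2, 3, 4, 5, 2, 4, 5, 5, 5),
  infector   = c(NA, NA, NA, NA, NA, 1, 1, 1, 1, 2),
  infectee   = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 4),
  generation = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3)
)

# Hypothetical helper: walk from an infectee up to the index case of its
# chain and return the ids visited, starting with the infectee itself.
trace_to_index <- function(df, chain_id, infectee_id) {
  current <- infectee_id
  path <- current
  repeat {
    row <- df[df$chain == chain_id & df$infectee == current, ]
    # Assumes exactly one row per (chain, infectee) pair.
    stopifnot(nrow(row) == 1)
    # The index case has no infector, so stop there.
    if (is.na(row$infector)) break
    current <- row$infector
    path <- c(path, current)
  }
  path
}

# Ancestry of infectee 4 in chain 5: 4 was infected by 2, who was infected by 1.
ancestry <- trace_to_index(chains, chain_id = 5, infectee_id = 4)
```

In this example data the length of the returned path matches the generation of the starting case, which gives a quick consistency check on the columns.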