This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Report devel2 into dev (#352)
* updates to state template

* fix load_cum_inf_geounit_dates to use hosp only

* add hosp method chunks from report_devel

* adding generic mapping function

* removing grouping by time for appropriate cumsum in load_cum_inf

* fixing error in load_cum_inf

* add ventilator to scenario tbl

* add warning about loading infections from hosp data

* deprecate old functions, integration testing temp

* recreating clean NAMESPACE to remove export of setup_testing_environment preventing pkg install

* adding sim_num before post_process in load_hosp_sims_filtered for output that does not contain sim_num but requires it for post-processing

* adding warning about variable name to load_hosp_geounit_threshold

* moving make_excess_heatmap to deprecated functions

* prep report_devel2 for dev merge (#351)

* Version with pyarrow included

* Dependencies for arrow in R as well

* Fixed check_model script

* Updated for feather integration

* Updated test cases since `n` is reserved in yml

* adding make_excess_heatmap function for hosp outcomes

* Fixing parallelization mistake

* Minor fixes

- Use the "optimize" covidImportation version
- Always upgrade local packages if upgrade available (vs silently ignore)
- check_model_reports should ensure axes are dates

* new figure relative to threshold heatmap

* Update importation.R to match covidImportation package updates

* Updated model code to use the new covidImportation package, and also seed to E instead of I (and keep population fixed)

* Fixed typo

* Final fix to avoid numba

* Fixed path to install_local script

* Added package

* Fixed seeding creation

* rm NAs and fix create_seeding.R

* add new cum hosp/deaths check to check_models scr

* update indexes in check model script

* long form mobility

* Update reference to geoid-params.csv inside of hosp_run.R

* 10x seeding file

* Write the npi when writing parquet output

* template

* report after simulation

* Removed geodata read from hosp_run.R since it's not being used

* Updated things that feed into mobility

* Updated build_US_setup.R to account for the move

* These files got removed in a previous commit

* Removing unused (as far as I can tell anyway) data

* Fix bug when the places are also a number

* Changing back test cases to use size/prob instead of n/p

* Updated name to pass checks on case sensitive OS

* Updated to use `file_extension` argument

* Fix broken tests, though I recommend we eliminate the mean and var checks since they'll be flaky

* Updated build_US_setup.R to work with the current setup

* Renamed parameters to avoid confusion; print out simid as 9 digits

SEIR and hospitalization phases have more standardized file format

* read parquet file times correctly

* Revert "read parquet file times correctly"

This reverts commit 521dd25.

* parquet date fixes (#207)

Co-authored-by: hrmeredith12 <hrmeredith12@gmail.com>

* Report devel (#208)

* fix unit test code

* fix unit test for real

* fix unit tests

* adding ability to filter geoids in relative heatmap function

* adding template for county-specific report for a given state

* lower tolerance for distribution tests

* planning_models chunk

* planning scenario chunk

* add names to dev team

Co-authored-by: eclee25 <eclee25@gmail.com>
Co-authored-by: Kyra Grantz <kyragrantz@gmail.com>
Co-authored-by: hrmeredith12 <hrmeredith12@gmail.com>

* Adding Javier (#210)

Co-authored-by: hrmeredith12 <hrmeredith12@gmail.com>
Co-authored-by: Elizabeth Lee <eclee25@gmail.com>

* Delete build-model-input.R (#217)

* Dataseed merge (#215)

* Adding Javier

* Adding commute data back in

* rm fixed param and comment out bad plot

* commit namespace report gen

* fix NVentCurr name

* formatting changes to county report template, removing defaults that should be modified for each report

* adding references for county report template

* change importation seeding

* table formatting

* limitations chunk considering age specific hosp calculations

* removing build_hospdeath_geoid_par - old version not used in hosprun.R

* removing legacy hospitalization scripts. everything runs through hosp_run.R now

* using current default durations to minimize confusion

Co-authored-by: hrmeredith12 <hrmeredith12@gmail.com>
Co-authored-by: Elizabeth Lee <eclee25@gmail.com>
Co-authored-by: Kyra Grantz <kyragrantz@gmail.com>

* Removing config.yml and changing the variable name in create_seeding to be truthful. (#219)

* Fixed the low in followup issue (#224)

* Fixed the low in followup issue

* Adding initial ^

* adding county report template yaml (#221)

Co-authored-by: jkamins7 <jkaminsky@jhu.edu>

* Fix load-bearing typo (#225)

* Fix load-bearing typo

* pretty sure it's supposed to be this

Co-authored-by: Josh Wills <jwills@apache.org>
Co-authored-by: kkintaro <katkintaro@gmail.com>

* Add an environment variable that can be used for writing uniquely named output files across blocks of jobs from AWS batch

* fix for 1 scenario (#230)

Co-authored-by: Elizabeth Lee <eclee25@gmail.com>

* RStudio in the Docker container

RStudio is now available in the Docker container, which allows development and EDA with the same set of packages as is run in production.

* Update covidImportation package to v1.6 (#10)

* Update covidImportation package to v1.6 (#250)

* Adding final form of previous packrat + docker setup after merging weirdness

* Switching .so to git lfs

* Updated indexing in simulations and hospitalization

* Added better indexing for hospitalization

* Add ability to reduce alpha, sigma, and gamma (#241)

* Add the ability to reduce multiple parameters

* Add Reduce scenario template to test_simple and documentation

* minor bug test fix

* Minor bugs

Co-authored-by: Joseph Lemaitre <joseph.lemaitre@epfl.ch>

* Move the spatial setup outside of the scenarios loop since it's expensive to load and doesn't change per scenario.

* Removing source for packages installable from cran

* Updated the python rules for reticulate (tests still pass)

* Removing source for packages installable from cran

* Updated the python rules for reticulate (tests still pass)

* Updated based on review

* Fixed filter issues with makefile setup in case dynfilter isn't provided in config

* Updated makefile

* Reduce hospitalization memory pressure

Switch a critical split-apply-combine away from `do.call()`, which results in a 45% reduction in memory usage and a 35% speedup in execution time in my testing.
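
The commit message does not show which replacement was used, so the following is only a minimal sketch of the general idea in base R: the same per-group summary computed once with `do.call(rbind, ...)` over split pieces and once with a single grouped aggregation. The toy data frame and the choice of `aggregate()` are assumptions for illustration, not the package code.

```r
# Toy hospitalization output (illustrative values, not real model output)
sims <- data.frame(
  geoid  = rep(c("06001", "06003"), each = 3),
  incidH = c(1, 2, 3, 10, 20, 30)
)

# Pattern 1: split, apply, then combine the pieces with do.call(rbind, ...).
# Every piece is a separate data frame that must be held in memory and copied.
pieces <- lapply(split(sims, sims$geoid), function(d) {
  data.frame(geoid = d$geoid[1], total_hosp = sum(d$incidH))
})
via_do_call <- do.call(rbind, pieces)
rownames(via_do_call) <- NULL

# Pattern 2: one grouped aggregation, with no intermediate list of data frames.
via_aggregate <- aggregate(incidH ~ geoid, data = sims, FUN = sum)
names(via_aggregate)[2] <- "total_hosp"

print(via_do_call)
print(via_aggregate)
```

Both patterns give the same totals per geoid; the memory and timing figures quoted above come from the commit author's testing and are not reproduced by this toy example.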

* Packrat (#253)

* Adding final form of previous packrat + docker setup after merging weirdness

* Switching .so to git lfs

* Removing source for packages installable from cran

* Updated the python rules for reticulate (tests still pass)

* Updated based on review

* Updated to use dev's docker instead of dataseed's

* Added reticulate zoo and xts

* Updated docker with git-lfs

* Packrat (#267)

* Adding final form of previous packrat + docker setup after merging weirdness

* Switching .so to git lfs

* Removing source for packages installable from cran

* Updated the python rules for reticulate (tests still pass)

* Updated based on review

* Updated to use dev's docker instead of dataseed's

* Added reticulate zoo and xts

* Updated docker with git-lfs

* Updating docker to install current versions of local packages

* Update .Rprofile

* Update dockerhub.yaml

* Update aws.yaml

* Yet another packrat attempt

* Update ci.yml

* Generic version of the batch job launcher/runner (#257)

* Generic version of batch from the union of jwills_dfU_run and dataseed_batch2

* Fixes from running stuff on some test jobs

* Add a vcpu CLI option and update sims_per_job to refer to slots per job

Co-authored-by: jkamins7 <jkaminsky@jhu.edu>

* changing covidImportation tag to 1.6.1

* Reduce SEIR startup costs (#273)

* 60% speedup in one run SEIR performance

The biggest cost in a single sim SEIR run was importation of Numba and the JIT compilation. Change this to compile ahead of time, which results in a nice 60% lift in one run SEIR performance by saving these startup costs---which will be valuable for our large inference runs.

Minor performance benefit when running many simulations as JIT costs are amortized away.

```
Benchmark #1: single sim JIT compilation (current)
  Time (mean ± σ):     13.429 s ±  0.537 s
  Range (min … max):   12.973 s … 14.867 s    100 runs

Benchmark #2: single sim AOT compilation (new)
  Time (mean ± σ):      5.129 s ±  0.125 s
  Range (min … max):    4.901 s …  5.364 s    100 runs
```
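
As a quick check, the headline figure can be recomputed from the two benchmark means above. The variable names are illustrative only:

```r
# Mean wall-clock times reported in the benchmark above, in seconds
jit_mean <- 13.429  # single sim, JIT compilation (current)
aot_mean <- 5.129   # single sim, AOT compilation (new)

# Fraction of run time saved, and the equivalent speedup factor
time_reduction <- (jit_mean - aot_mean) / jit_mean
speedup_factor <- jit_mean / aot_mean

cat(sprintf("time reduction: %.1f%%\n", 100 * time_reduction))
cat(sprintf("speedup factor: %.2fx\n", speedup_factor))
```

The roughly 60% figure is therefore a reduction in run time, which corresponds to running about 2.6 times faster.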

* Add Python build directory to .gitignore

* Integrate build_US_setup into pipeline and... (#271)

* Add hard-coded territory data to build_US_setup

* Create csv of island area census data since it cannot be accessed by API

* Change the report targets to follow the conventions of make_makefile

* Integrate build_US_setup into pipeline

* Some bug fixes

* git lfs pull of commute_data.csv and switch docker image

* Update ci.yml

* Update ci.yml

* Remove generated files

* Update make_makefile.R

* Update run_tests.py

* pull census year from config

* Use census year from config to build_US_setup

* Update build_US_setup.R

Co-authored-by: eclee25 <eclee25@gmail.com>

* Add check to hospitalization that geodata geoids are in geoid-params.csv (#283)

* added state level script for creating csv reporting out quantiles

* Fixed a slight bug with static dates and added full geographic extent version of the quantile generation script

* Added countylevel script

* Various fixes and updates to post run summarization scripts.

* Integrate QuantileSummarizeGeoExtent.R into pipeline (untested)

* Integrate QuantileSummarizeGeoExtent.R into pipeline

* Create QuantileSummarizeGeoidLevel.py

* Working on the python script

* Integrate quantile scripts into Makefile

* Delete QuantileSummarizeGeoidLevel.py

* perf fix for quantile_report_script

* QuantileSummarizeGeoidLevel on Apache Spark

This commit includes a Python implementation of `QuantileSummarizeGeoidLevel.R` running on Apache Spark. The job computes quantiles grouped by geoid and time; Spark provides the shuffle and quantile estimation mechanisms that make this aggregation efficient. The job can be run locally within the container (fine for a USA run, but it takes ~45 minutes on an r5.24xlarge) or distributed on Amazon EMR. This commit adds Spark, and consequently Java, to the container.
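
The Spark job itself is not shown in this commit view. To make the aggregation concrete, here is a small base R sketch of "quantiles grouped by geoid and time" on toy data. The function name `summarize_quantiles`, the column names, and the values are assumptions for illustration; this is not the Spark implementation and it uses exact rather than estimated quantiles.

```r
# Toy simulation output: 2 geoids x 2 dates x 3 simulations (illustrative values)
sims <- data.frame(
  geoid   = rep(c("06001", "06003"), each = 6),
  time    = rep(rep(as.Date(c("2020-07-01", "2020-07-02")), each = 3), times = 2),
  sim_num = rep(1:3, times = 4),
  incidH  = c(1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60)
)

# Hypothetical helper: one row per (geoid, time, quantile level)
summarize_quantiles <- function(df, value_col, probs = c(0.025, 0.5, 0.975)) {
  key <- paste(df$geoid, df$time, sep = "|")          # one key per geoid/date pair
  pieces <- lapply(split(df, key), function(d) {
    data.frame(
      geoid    = d$geoid[1],
      time     = d$time[1],
      quantile = probs,
      value    = quantile(d[[value_col]], probs = probs, names = FALSE)
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

res <- summarize_quantiles(sims, "incidH", probs = c(0.25, 0.5, 0.75))
print(res)
```

Each group here holds only three simulations; at full scale the grouping step is the expensive part, which is the work the commit message says is delegated to Spark.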

* add `--name_filter` to quantile_summarize_geoid_level as per feedback

* Adjust quantile scripts so they all have the same interface

- Fixed bug in both R scripts where `num_files` was set incorrectly
- Adjust quantile_summarize_geoid_level.py to take scenarios (+ config file) versus path names as input to mimic the interface of the other scripts

* Revert make_makefile.R to dev branch version

* setup file for international countries

* Fatiguing NPI

* tested MVP

* other implementation, maybe cleaner

* update to hosp_run to take specified geoid-params

* Added mild infections as output of hospitalization

* minor

* Hospitalization package update

* dev setup

* fixed rate

* adding apl deployment to ecr

* international seeding and setup files created

* Update to report template docs for country reports

* update to non-US scripts

* update to international branch country setup

* non-US setup Rmd and other scripts finished.

* update

* minor print edit

* updates to script to make international functional with master

* minor update to report and setup scripts

* setup fix

* non-us update

* dev setup relative min

* relative min ready

* 1. Added integration tests for US and non-US create_seeding.R and build_US_setup.R/build_nonUS_setup.R

2. create_seeding.R now has the option to choose "CSSE" or "USAFacts" for a US run.

* Delete jhucsse_case_data_crude.csv

accidental data commit

* vignette fix

* Removed man folders from packages

* fixes in the international branch before the merge

* Do not update packages

* Update covidImportation to v1.6.1

* minor fix

* fix non-US setup

* Update local_install.R

* Fix merge error

* Reload covidImportation v1.6.1 to fix tidyverse dependency

* seeding update with inputted incidence multiplier

* minor names fix

* Minor fixes to build_US and build_nonUS integration tests

* deleted a comma

* minor bug fix

* Fix reversed international tag

* fixed error message

* fixed python error

* minor

* Adding updated severity parameters

* fixing US seeding

* adding print message

* Update covidImportation with bug fix

* minor update

* Fix filter issue

* integration testing fixes

* Non-US makefile added. This should actually work fine for US as well. It also adds the ability to use the setup_name from the config to add a file prefix to model outputs, and then only clean those model outputs when running "make clean".

* make_makefile.R now includes both US and non-US functionality

* make_makefile white space fix

* Add tictoc package to dev docker

* Updated to fix a docker bug

Co-authored-by: Josh Wills <jwills@apache.org>
Co-authored-by: jkamins7 <jkaminsky@jhu.edu>
Co-authored-by: kkintaro <katkintaro@gmail.com>
Co-authored-by: Kyra Grantz <kyragrantz@gmail.com>
Co-authored-by: Sam Shah <sam@skipflag.com>
Co-authored-by: shauntruelove <satruelove@gmail.com>
Co-authored-by: chadi <joseph.lemaitre@epfl.ch>
Co-authored-by: hrmeredith12 <hrmeredith12@gmail.com>
Co-authored-by: Josh Wills <josh.wills@gmail.com>
Co-authored-by: Sam Shah <shahsam@umich.edu>
Co-authored-by: Dave <David.Witman@jhuapl.edu>
Co-authored-by: Shaun Truelove <shauntruelove@users.noreply.github.com>

* rename report.generation folder

* update report.generation path in workflow test

Co-authored-by: Kyra Grantz <kyragrantz@gmail.com>
Co-authored-by: juanderone <57634493+juanderone@users.noreply.github.com>
Co-authored-by: Josh Wills <jwills@apache.org>
Co-authored-by: jkamins7 <jkaminsky@jhu.edu>
Co-authored-by: kkintaro <katkintaro@gmail.com>
Co-authored-by: Sam Shah <sam@skipflag.com>
Co-authored-by: shauntruelove <satruelove@gmail.com>
Co-authored-by: chadi <joseph.lemaitre@epfl.ch>
Co-authored-by: hrmeredith12 <hrmeredith12@gmail.com>
Co-authored-by: Josh Wills <josh.wills@gmail.com>
Co-authored-by: Sam Shah <shahsam@umich.edu>
Co-authored-by: Dave <David.Witman@jhuapl.edu>
Co-authored-by: Shaun Truelove <shauntruelove@users.noreply.github.com>
14 people authored Jul 15, 2020
1 parent 6bc18b4 commit 4b336a5
Showing 38 changed files with 1,761 additions and 1,411 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -41,9 +41,9 @@ jobs:
  setwd("R/pkgs/hospitalization")
  devtools::test(stop_on_failure=TRUE)
  shell: Rscript {0}
- - name: Run report_generation tests
+ - name: Run report.generation tests
  run: |
- setwd("R/pkgs/report_generation")
+ setwd("R/pkgs/report.generation")
  devtools::test(stop_on_failure=TRUE)
  shell: Rscript {0}
  - name: Run integration tests
File renamed without changes.
@@ -24,6 +24,7 @@ export(make_scn_time_summary_table)
  export(make_scn_time_summary_table_withVent)
  export(plot_event_time_by_geoid)
  export(plot_geounit_attack_rate_map)
+ export(plot_geounit_map)
  export(plot_hist_incidHosp_state)
  export(plot_line_hospPeak_time_county)
  export(plot_model_vs_obs)
@@ -35,4 +36,3 @@ export(plot_ts_incid_inf_state_sample)
  export(print_pretty_date)
  export(print_pretty_date_short)
  export(reference_chunk)
- export(setup_testing_environment)
@@ -118,7 +118,7 @@ load_scenario_sims_filtered <- function(scenario_dir,
  ##' with pre and post filters
  ##'
  ##' @param scenario_dir the subdirectory containing this scenario
- ##' @param name_filter function that
+ ##' @param name_filter string that indicates which pdeath level to import (from the hosp filename)
  ##' @param post_process function that does processing after
  ##' @param geoid_len if defined, the common length to which all geoids are padded
  ##' @param padding_char character to add to the front of geoids if fixed length
@@ -172,8 +172,8 @@ load_hosp_sims_filtered <- function(scenario_dir,

  read_file(files[i]) %>%
  padfn %>%
- post_process(...) %>%
- mutate(sim_num = i)
+ mutate(sim_num = i) %>%
+ post_process(...)
  }

  rc <- dplyr::bind_rows(rc)
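
The reordering in `load_hosp_sims_filtered` matters whenever `post_process` refers to `sim_num`, as the post-processing functions in this commit do when they group by it. A minimal base R sketch of the failure and the fix, assuming a hypothetical `post_process` that aggregates by `geoid` and `sim_num` (the data and helper are illustrative, not the package code):

```r
# One hosp output file that does not carry a sim_num column (illustrative values)
one_file <- data.frame(
  geoid  = c("06001", "06001", "06003"),
  incidH = c(1, 2, 5)
)

# Hypothetical post-processing step that needs sim_num to exist
post_process <- function(x) {
  stopifnot("sim_num" %in% names(x))
  aggregate(incidH ~ geoid + sim_num, data = x, FUN = sum)
}

# Old order: post_process first, sim_num added afterwards -> post_process fails
old_order <- tryCatch({
  out <- post_process(one_file)
  out$sim_num <- 1L
  out
}, error = function(e) NULL)

# New order: add sim_num first, then post_process
with_sim <- one_file
with_sim$sim_num <- 1L
new_order <- post_process(with_sim)

print(is.null(old_order))  # TRUE: the old order errors out
print(new_order)
```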
720 changes: 720 additions & 0 deletions R/pkgs/report.generation/R/ReportBuildUtils-deprecated.R

Large diffs are not rendered by default.

338 changes: 338 additions & 0 deletions R/pkgs/report.generation/R/ReportLoadData-deprecated.R
@@ -0,0 +1,338 @@
##' Deprecated convenience function to load cumulative geounit hosp outcomes at a specific date for the given scenario
##'
##' @param scn_dirs paste(config$name, config$interventions$scenarios, sep = "_") character vector of scenario directory names
##' @param scenariolabels config$report$formatting$scenario_labels character vector of scenario labels
##' @param name_filter character string that filenames should match
##' @param display_date character date string; outcomes are summed cumulatively up to and including this date
##' @param incl_geoids character vector of geoids that are included in the report
##' @param geoid_len if defined, the common length to which all geoids are padded
##' @param padding_char character to add to the front of geoids if fixed length
##'
##' @return a data frame with columns
##' - geoid
##' - sim_num
##' - NincidDeath, NincidInf, NincidICU, NincidHosp, NincidVent
##' - scenario_num
##' - scenario_name
##'
##' @export
load_cum_hosp_geounit_date <- function(scn_dirs,
num_files = NA,
scenariolabels = NULL,
name_filter,
display_date=config$end_date,
incl_geoids=NULL,
geoid_len = 0,
padding_char = "0",
file_extension = 'auto'){

if(is.null(scenariolabels)){
warning("You have not specified scenario labels for this function. You may encounter future errors.")
}

display_date <- as.Date(display_date)
##filter to munge the data at the scenario level
if (!is.null(incl_geoids)) {
hosp_post_process <- function(x) {
x %>%
dplyr::filter(!is.na(time) & geoid %in% incl_geoids, time <= display_date) %>%
group_by(geoid, sim_num) %>%
dplyr::summarize(NincidDeath = sum(incidD),
NincidInf = sum(incidI),
NincidICU=sum(incidICU),
NincidHosp=sum(incidH),
NincidVent = sum(incidVent)) %>%
ungroup()
}
} else {
hosp_post_process <- function(x) {
x %>%
dplyr::filter(!is.na(time) & time <= display_date) %>%
group_by(geoid, sim_num) %>%
dplyr::summarize(NincidDeath = sum(incidD),
NincidInf = sum(incidI),
NincidICU=sum(incidICU),
NincidHosp=sum(incidH),
NincidVent = sum(incidVent)) %>%
ungroup()
}
}


rc <- vector("list", length(scn_dirs)) # preallocate one slot per scenario
for (i in seq_along(scn_dirs)) {
rc[[i]] <- load_hosp_sims_filtered(scn_dirs[i],
num_files = num_files,
name_filter = name_filter,
post_process = hosp_post_process,
geoid_len = geoid_len,
padding_char = padding_char,
file_extension = file_extension)
rc[[i]]$scenario_num <- i
rc[[i]]$scenario_name <- scenariolabels[[i]]
}

return(dplyr::bind_rows(rc))
}



##' Deprecated convenience function to load timeseries current hospital outcomes
##'
##' @param scn_dirs paste(config$name, config$interventions$scenarios, sep = "_") character vector of scenario directory names
##' @param scenariolabels config$report$formatting$scenario_labels character vector of scenario labels
##' @param name_filter character string that filenames should match
##' @param end_date last date to include in timeseries
##' @param incl_geoids character vector of geoids that are included in the report
##' @param geoid_len if defined, the common length to which all geoids are padded
##' @param padding_char character to add to the front of geoids if fixed length
##'
##' @return a data frame with columns
##' - time
##' - geoid
##' - NHospCurr, NICUCurr, NVentCurr
##' - sim_num
##' - scenario_num
##' - scenario_name
##'
##' @export
load_ts_current_hosp_geounit <- function(scn_dirs,
num_files = NA,
scenariolabels = NULL,
name_filter,
end_date,
incl_geoids=NULL,
geoid_len = 0,
padding_char = "0",
qlo = 0.025,
qhi= 0.975,
file_extension = 'auto') {

if(is.null(scenariolabels)){
warning("You have not specified scenario labels for this function. You may encounter future errors.")
}


## currently too slow including the quantiles... ##
end_date <- as.Date(end_date)
##filter to munge the data at the scenario level
if (!is.null(incl_geoids)) {
hosp_post_process <- function(x) {
x %>%
dplyr::filter(!is.na(time) & geoid %in% incl_geoids, time <= end_date) %>%
dplyr::select(time,
geoid,
sim_num,
NHospCurr = hosp_curr,
NICUCurr = icu_curr,
NVentCurr = vent_curr) %>%
group_by(sim_num, time, geoid) %>%
mutate(#NHospCurrlo = quantile(NHospCurr, qlo),
#NHospCurrhi = quantile(NHospCurr, qhi),
NHospCurr = mean(NHospCurr),
#NICUCurrlo = quantile(NICUCurr, qlo),
#NICUCurrhi = quantile(NICUCurr, qhi),
NICUCurr = mean(NICUCurr),
#NVentCurrlo = quantile(NVentCurr, qlo),
#NVentCurrhi = quantile(NVentCurr, qhi),
NVentCurr = mean(NVentCurr)) %>%
ungroup()
}
} else {
hosp_post_process <- function(x) {
x %>%
dplyr::filter(!is.na(time) & time <= end_date) %>%
dplyr::select(time,
geoid,
sim_num,
NHospCurr = hosp_curr,
NICUCurr = icu_curr,
NVentCurr = vent_curr) %>%
group_by(sim_num, time, geoid) %>%
mutate(#NHospCurrlo = quantile(NHospCurr, qlo),
#NHospCurrhi = quantile(NHospCurr, qhi),
NHospCurr = mean(NHospCurr),
#NICUCurrlo = quantile(NICUCurr, qlo),
#NICUCurrhi = quantile(NICUCurr, qhi),
NICUCurr = mean(NICUCurr),
#NVentCurrlo = quantile(NVentCurr, qlo),
#NVentCurrhi = quantile(NVentCurr, qhi),
NVentCurr = mean(NVentCurr)) %>%
ungroup()
}
}


rc <- vector("list", length(scn_dirs)) # preallocate one slot per scenario
for (i in seq_along(scn_dirs)) {
rc[[i]] <- load_hosp_sims_filtered(scn_dirs[i],
num_files = num_files,
name_filter = name_filter,
post_process = hosp_post_process,
geoid_len = geoid_len,
padding_char = padding_char,
file_extension = file_extension)
rc[[i]]$scenario_num <- i
rc[[i]]$scenario_name <- scenariolabels[[i]]
}

return(dplyr::bind_rows(rc))
}


##' Deprecated convenience function to load peak geounit infections before a given date for the given scenarios
##'
##' @param scn_dirs paste(config$name, config$interventions$scenarios, sep = "_") character vector of scenario directory names
##' @param display_date character string for date before which infection peak should be identified
##' @param scenariolabels config$report$formatting$scenario_labels character vector of scenario labels
##' @param incl_geoids optional character vector of geoids that are included in the report, if not included, all geoids will be used
##' @param geoid_len required length of geoid
##' @param padding_char padding
##'
##' @return a data frame with columns
##' - time
##' - comp
##' - geoid
##' - N
##' - sim_num
##' - scenario_num
##' - scenario_name
##'
##' @export
load_inf_geounit_peaks_date <- function(scn_dirs,
display_date=config$end_date,
num_files = NA,
scenariolabels=NULL,
incl_geoids=NULL,
geoid_len = 0,
padding_char = "0",
file_extension = 'auto'){

if(is.null(scenariolabels)){
warning("You have not specified scenario labels for this function. You may encounter future errors.")
}

display_date <- as.Date(display_date)
inf_pre_process <- function(x) {
x %>%
dplyr::filter(comp == "diffI" & time <= display_date)
}

if (!is.null(incl_geoids)) {
inf_post_process <- function(x) {
x %>%
ungroup %>%
dplyr::filter(!is.na(time), geoid %in% incl_geoids) %>%
group_by(geoid) %>%
dplyr::slice(which.max(N)) %>%
ungroup()
}
} else{
inf_post_process <- function(x) {
x %>%
ungroup %>%
dplyr::filter(!is.na(time)) %>%
group_by(geoid) %>%
dplyr::slice(which.max(N)) %>%
ungroup()
}

}

rc <- list()
for (i in seq_along(scn_dirs)) {
rc[[i]] <- load_scenario_sims_filtered(scn_dirs[i],
num_files = num_files,
pre_process = inf_pre_process,
post_process = inf_post_process,
geoid_len = geoid_len,
padding_char = padding_char,
file_extension = file_extension)
rc[[i]]$scenario_num <- i
rc[[i]]$scenario_name <- scenariolabels[[i]]

}

return(dplyr::bind_rows(rc))

}


##' Deprecated convenience function to load peak geounit hospitalizations (or any other variable) by a specific date for the given scenarios
##'
##' @param scn_dirs paste(config$name, config$interventions$scenarios, sep = "_") character vector of scenario directory names
##' @param max_var character string of variable that will be maximized per geoid
##' @param display_date date before which we should search for peaks
##' @param name_filter character string that filenames should match
##' @param incl_geoids optional character vector of geoids that are included in the report, if not included, all geoids will be used
##' @param scenariolabels config$report$formatting$scenario_labels character vector of scenario labels
##' @param geoid_len required length of geoid
##' @param padding_char padding
##'
##' @return a data frame with columns
##' - time date on which the maximum occurs
##' - geoid
##' - sim_num
##' - Pk_[variableName] value of the variable that was maximized by geoid
##' - scenario_num
##' - scenario_name
##' @export
### all of the peak times for each sim and each county so we can make a figure for when things peak
load_hosp_geounit_peak_date <- function(scn_dirs,
max_var,
display_date = config$end_date,
num_files = NA,
name_filter,
incl_geoids = NULL,
scenariolabels = NULL,
geoid_len = 0,
padding_char = "0",
file_extension = 'auto'){

if(is.null(scenariolabels)){
warning("You have not specified scenario labels for this function. You may encounter future errors.")
}

display_date <- as.Date(display_date)
if (!is.null(incl_geoids)) {
hosp_post_process <- function(x) {
x %>%
dplyr::rename(mx_var = !!max_var) %>%
dplyr::filter(!is.na(time), geoid %in% incl_geoids, time <= display_date) %>%
group_by(geoid) %>%
dplyr::slice(which.max(mx_var)) %>%
ungroup()
}
} else {
hosp_post_process <- function(x) {
x %>%
dplyr::rename(mx_var = !!max_var) %>%
dplyr::filter(!is.na(time), time <= display_date) %>%
group_by(geoid) %>%
dplyr::slice(which.max(mx_var)) %>%
ungroup()
}
}
rc <- vector("list", length(scn_dirs)) # preallocate one slot per scenario
for (i in seq_along(scn_dirs)) {
rc[[i]] <- load_hosp_sims_filtered(scn_dirs[i],
num_files = num_files,
name_filter = name_filter,
post_process = hosp_post_process,
geoid_len = geoid_len,
padding_char = padding_char,
file_extension = file_extension) %>%
dplyr::select(time, geoid, sim_num, mx_var)
rc[[i]]$scenario_num <- i
rc[[i]]$scenario_name <- scenariolabels[[i]]
}

rc %>%
dplyr::bind_rows() %>%
dplyr::rename(!!paste0("Pk_", max_var) := mx_var) %>% ## notate the column that was maximized with "Pk_"
return()
}
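
The peak-finding step used by `load_hosp_geounit_peak_date` (filter to dates on or before `display_date`, then keep the row with the largest value per geoid) can be sketched in base R as follows. The helper `peak_rows` and the toy data are hypothetical and stand in for a single simulation file; the deprecated function itself relies on dplyr and `load_hosp_sims_filtered`.

```r
# Toy current-hospitalization series for two geoids (illustrative values)
hosp <- data.frame(
  geoid = rep(c("06001", "06003"), each = 3),
  time  = rep(as.Date(c("2020-07-01", "2020-07-02", "2020-07-03")), times = 2),
  hosp_curr = c(5, 9, 7, 2, 3, 8)
)

# Hypothetical helper: per-geoid peak of `max_var` on or before `display_date`
peak_rows <- function(df, max_var, display_date) {
  df <- df[!is.na(df$time) & df$time <= as.Date(display_date), ]
  pieces <- lapply(split(df, df$geoid), function(d) d[which.max(d[[max_var]]), ])
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  # Mark the maximized column with the "Pk_" prefix, as the function above does
  names(out)[names(out) == max_var] <- paste0("Pk_", max_var)
  out
}

peaks_early <- peak_rows(hosp, "hosp_curr", "2020-07-02")
peaks_late  <- peak_rows(hosp, "hosp_curr", "2020-07-03")
print(peaks_early)
print(peaks_late)
```

Moving `display_date` changes which peak is reported for a geoid whose maximum falls after the cutoff, which is why the date filter is applied before the per-geoid maximum is taken.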