Best Practices
Elizabeth Lee edited this page May 22, 2023 · 16 revisions
Checklist to run through when running a map on cholera-mapping-pipeline using the old pipeline.
- You and at least one other member of the team should review the config file that will be run.
- Commit the reviewed config file to `cholera-configs`.
- Check which branch you are on in `cholera-configs`, `cholera-mapping-pipeline`, and `cholera-covariates` in the directories from which the model will be run.
- Git pull in `cholera-configs`, `cholera-mapping-pipeline`, and `cholera-covariates` in the directories from which the model will be run.
- Revert uncommitted changes to files: `git reset --hard`
- Remove uncommitted files: `git clean -fxd`
- Re-create `database_api_key.R`
- Review the `cholera-configs` Github Kanban board and issues for notes on the run you plan to launch.
- Reinstall taxdat.
- Check for and remove old data files in the data folder that might be related to your model run.
- Run `taxdat::add_explicit_file_names_to_config` on your config.
- Review the shell (.sh) script that will be used to launch your model run.
- Record the branch name(s) for all repos, commit hash for `cholera-mapping-pipeline`, the config settings, and other model launch notes in the Github issue of the `cholera-configs` Kanban board.
- Update the Kanban board status
- While the model run is in progress, do not reinstall taxdat or make changes to the git repo.
- For many production runs, the data pull and diagnostic reports may be run on idmodeling but the Stan model is run on ARCH Rockfish cluster. In these situations, model files may be transferred between the two servers using the `scp` or `rsync` commands.
- Review the model logs to see if the run finished successfully.
- Always generate the country data report. Generate the data comparison report and generated quantities report as appropriate to the purpose of the model run. (Run all three for final production runs.)
- Update the Kanban board status as appropriate
As of 25 Feb 2022, model diagnostic reports include: data comparison report and country data report RMD files
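The branch check and record-keeping steps from the pre-launch checklist can be partly scripted. The following is a minimal sketch, not part of the pipeline: `repo_state` is a hypothetical helper, and `REPO_ROOT` is an assumed parent directory that holds the three clones. It prints the repository name, current branch, and commit hash so they can be pasted into the Kanban issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<repo> <branch> <commit>" for one git working directory.
repo_state() {
  local dir="$1"
  local branch commit
  branch="$(git -C "$dir" rev-parse --abbrev-ref HEAD)" || return 1
  commit="$(git -C "$dir" rev-parse HEAD)" || return 1
  printf '%s %s %s\n' "$(basename "$dir")" "$branch" "$commit"
}

# REPO_ROOT is an assumption: the directory that contains the three clones.
REPO_ROOT="${REPO_ROOT:-$HOME}"
for repo in cholera-configs cholera-mapping-pipeline cholera-covariates; do
  if [ -d "$REPO_ROOT/$repo/.git" ]; then
    repo_state "$REPO_ROOT/$repo"
  else
    echo "skipping $repo: no clone found under $REPO_ROOT" >&2
  fi
done
```

Running it from the server where the model will be launched gives one line per repository; repositories that are not found are reported on stderr rather than silently ignored.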
- Commit logs and diagnostic reports to the appropriate `cholera-mapping-reports` folder.
- Commit intermediate model output files to the appropriate `cholera-mapping-output` folder (Only perform this step for report/manuscript final runs). There may not be intermediate model output files for the model diagnostic reports. However, there were intermediate model output files generated in the creation of the Dec 2021 Gavi report and these should be committed to `cholera-mapping-output`.
- Update the Kanban board status
- Post the diagnostic reports in `cholera-taxonomy` Slack channel.
- Team members should then post comments on the Kanban board issue after reviewing diagnostic reports. Additional information on approving runs may be found on this wiki page.
- Commit all **approved** model input and output files to the appropriate `cholera-mapping-output` folder.
- If you are in the process of running models for a production run but the model is not yet approved, you do not need to commit model files to the `cholera-mapping-output` repository. Instead, you may use `scp` or `rsync` for large file transfer.
for large file transfer. - Do NOT delete any model files that may eventually become an approved production run. If, for example, you are running variations of a production run to see if we can improve model fit, do not delete the original Stan model output. To avoid accidental overwriting, we recommend transferring files to empty folders.
- Use the parameter-specified config in the pipeline code (e.g., use `config$<param_name>` directly in a pipeline script such as `prepare_stan_input.R`).
- Add the parameter explicitly to the config writer script `Analysis/R/write_batch_mapping_config_general.R` and the related taxdat function (`automate_mapping_config`) in `packages/taxdat/R/config_helpers.R`.
- Add a config check function for each parameter (e.g., `check_<param_name>`) in `packages/taxdat/R/setup_helpers.R` and encode the default value in the check function.
- Verify that `check_update_config` in `packages/taxdat/R/setup_helpers.R` uses the check function appropriately.
- Review `automate_mapping_config` in `packages/taxdat/R/config_helpers.R`.
- Call the check function in the config writer script and in the `set_parameters` file. That way, when no parameter is specified, the default value encoded in the check function will be used.
- Add a unit test using the testthat package for the newly added check function in `packages/taxdat/tests/testthat/test_setup_helpers.R`.
- Update the config file parameters wiki page on Github: add the parameter to the example config and update the argument dictionary with the parameter name and description.
Checklist for Github workflow and merging branches. For additional detail, we are roughly following the Integration-Manager workflow described here except individuals work on branches and not forks.
For the following checklist, assume that `dev` is the production branch and you're making updates on `dev_a`. You will start by making a new branch called `dev_a` from `dev`. All of your code changes will be made in `dev_a`. Test that `dev_a` works as expected, ideally by writing unit tests or running a map or both.
- Submit a PR for `dev` into `dev_a` and review changes. After reviewing and resolving conflicts, merge `dev` into `dev_a`.
- Submit a PR for `dev_a` into `dev` and review changes.
- Test that `dev_a` works as expected, ideally by running unit tests, integration tests, a map, or one or more of the above.
- If tests produce the expected results, merge the pull request.
- If you no longer intend to make changes to `dev_a`, delete the branch. If there are more changes to make, continue working in `dev_a` and follow these steps again from the top when ready to merge.
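The sequence of merges in this checklist can be rehearsed locally. The sketch below runs in a throwaway repository created with `mktemp`, not in a project repo, and uses plain `git merge` as a stand-in for the reviewed pull requests on Github; the file names and the demo identity are placeholders.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository whose initial branch is named "dev".
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init --quiet
git symbolic-ref HEAD refs/heads/dev
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"
echo "v1" > pipeline.txt
git add pipeline.txt
git commit --quiet -m "Initial commit on dev"

# Create dev_a from dev and make all code changes there.
git checkout --quiet -b dev_a
echo "feature" > feature.txt
git add feature.txt
git commit --quiet -m "Add feature on dev_a"

# Meanwhile, dev moves ahead with someone else's change.
git checkout --quiet dev
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit --quiet -m "Hotfix on dev"

# Step 1: merge dev into dev_a, so conflicts are resolved on the feature branch.
git checkout --quiet dev_a
git merge --quiet --no-edit dev

# Steps 2 to 4: after review and tests, merge dev_a into dev.
git checkout --quiet dev
git merge --quiet --no-edit dev_a

# Step 5: delete dev_a once no further changes are planned.
git branch --quiet -d dev_a

final_branches="$(git branch --format='%(refname:short)')"
echo "$final_branches"
```

Merging `dev` into `dev_a` first means that any conflicts are resolved and tested on the feature branch, so the later merge into `dev` is a fast-forward in this demo.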