This repository holds the data and source code for the following manuscript:
Here, you can:
- Run the source code to reproduce the figures from the input datasets. Just say `Rscript src/gensup_analysis.R`, noting the dependencies at the top of the script. It completes in about 8 minutes on a 2021 MacBook Pro. The script reproduces Figures 1-3, S2-S5, Tables S1-S30, and stats_for_text.txt, all of which you can find in display_items. To run the script in "one target only" mode, where drugs with >1 human target are removed, say `Rscript src/gensup_analysis.R --oto` and you'll find the output in oto; Figures 1-3 from that version of the analysis become Figures S6-S8 in the manuscript.
- If you're curious, you can also browse the source code for the other scripts that prepared this releasable analytical dataset, in src. These scripts require inputs that are too large for GitHub, not approved for public release, or both, so you will not be able to run them after cloning the repository; they are provided simply for reference in case you want to see what we did.
- Browse the input datasets in data. We have permission from Citeline Pharmaprojects to publicly release the subset of their data that appear here. This includes data/pp.tsv, which contains the highest phase reached for all target-indication (T-I) pairs added to Pharmaprojects since 2000.
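The two invocations described above can be wrapped in a small shell script, run from the repository root. This is only a sketch: `run_analysis` is a hypothetical helper, not something shipped in the repository, and it assumes `Rscript` is on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: run the analysis in both modes from the repository root.
SCRIPT="src/gensup_analysis.R"   # path as given in this README

# Hypothetical helper (not part of the repository): checks the two obvious
# prerequisites, then passes any extra flags through to the R script.
run_analysis() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 1
  fi
  if [ ! -f "$SCRIPT" ]; then
    echo "$SCRIPT not found; run this from the repository root" >&2
    return 1
  fi
  Rscript "$SCRIPT" "$@"
}

# Default mode first (output in display_items), then one-target-only mode
# (output in oto). The second run is skipped if the first one fails.
run_analysis && run_analysis --oto
```

Running the default mode first means a missing dependency shows up before you start the second run.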
Note about dependencies. This code was written for R 4.2.0 and the following package versions: tidyverse_1.3.1, janitor_2.1.0, binom_1.1-1.1, glue_1.6.2, lawstat_3.4, weights_1.0.4, epitools_0.5-10.1, DescTools_0.99.45, openxlsx_4.2.5, optparse_1.7.1, MASS_7.3-56.
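To compare your installed packages against the versions listed above, something like the following may help. It is a sketch rather than part of the repository: `check_versions` is a hypothetical helper, and it reports "unknown" when `Rscript` is not available and "not installed" when R cannot find the package.

```shell
#!/usr/bin/env bash
# Package versions copied from the dependency note in this README.
PINNED="tidyverse_1.3.1 janitor_2.1.0 binom_1.1-1.1 glue_1.6.2 lawstat_3.4 weights_1.0.4 epitools_0.5-10.1 DescTools_0.99.45 openxlsx_4.2.5 optparse_1.7.1 MASS_7.3-56"

# Hypothetical helper: prints one tab-separated line per package with its
# name, the version listed above, and the version found in your R library.
check_versions() {
  for p in $PINNED; do
    name="${p%%_*}"   # text before the first underscore
    want="${p#*_}"    # text after the first underscore
    have="unknown"
    if command -v Rscript >/dev/null 2>&1; then
      # installed.packages() raises an error for a missing package,
      # which triggers the fallback message.
      have="$(Rscript -e "cat(installed.packages()['$name', 'Version'])" 2>/dev/null || echo "not installed")"
    fi
    printf '%s\t%s\t%s\n' "$name" "$want" "$have"
  done
}

check_versions
```

A mismatch in the third column does not necessarily mean the script will fail; the versions listed are simply the ones the code was written for.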
This work is licensed under a Creative Commons Attribution 4.0 International License.