-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
95 lines (69 loc) · 4.06 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "80%"
)
devtools::load_all(".")
```
# Joint variable importance plot <img src="man/figures/jointVIP_logo.png" align="right" width="150"/>
<!-- badges: start -->
[![CRAN_Status_Badge](https://img.shields.io/cran/v/jointVIP?color=952100)](https://cran.r-project.org/package=jointVIP) [![CRAN_Downloads_Badge](https://cranlogs.r-pkg.org/badges/jointVIP?color=952100)](https://cran.r-project.org/package=jointVIP)
[![R-CMD-check](https://github.com/ldliao/jointVIP/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ldliao/jointVIP/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
Joint variable importance plot (jointVIP) visualizes each variable's outcome importance via Pearson's correlation and treatment importance via cross-sample standardized mean differences. Bias curves enable comparisons to support variable prioritization among potential confounders.
## Installation
You can install the `jointVIP` package on CRAN using:
``` r
# for version on CRAN
install.packages("jointVIP")
# for development version on github
devtools::install_github("ldliao/jointVIP")
```
## BRFSS Example
To demonstrate, we use the 2015 Behavioral Risk Factor Surveillance System (BRFSS) example to answer the causal question: Does smoking increase the risk of chronic obstructive pulmonary disease (COPD)? The data and background is inspired by [Clay Ford's work from University of Virginia Library](https://data.library.virginia.edu/getting-started-with-matching-methods/). First, the data is cleaned to only have numeric variables, i.e., all factored variables are transformed via one-hot-encoding. Treatment variable `smoke` only contains 0 (control) and 1 (treatment).
With the cleaned data, you can specify details in the function `create_jointVIP()` like so:
```{r example}
library(jointVIP)
## basic example code
library(dplyr)
# load data
data('brfss', package='jointVIP')
treatment = 'smoke'
outcome = 'COPD'
covariates = names(brfss)[!names(brfss) %in% c(treatment, outcome)]
## select the pilot sample from random portion
## pilot data here are considered as 'external controls'
## can be a separate dataset; should be chosen with caution
set.seed(1234895)
pilot_prop = 0.2
pilot_sample_num = sample(which(brfss %>% pull(treatment) == 0),
length(which(brfss %>% pull(treatment) == 0)) *
pilot_prop)
## set up pilot and analysis data
## we want to make sure these two data are non-overlapping
pilot_df = brfss[pilot_sample_num, ]
analysis_df = brfss[-pilot_sample_num, ]
## minimal example
brfss_jointVIP = create_jointVIP(treatment = treatment,
outcome = outcome,
covariates = covariates,
pilot_df = pilot_df,
analysis_df = analysis_df)
```
Generic functions can be used for the `jointVIP` object to extract information as a glance with `summary()` and `print()`.
```{r generic}
summary(brfss_jointVIP)
print(brfss_jointVIP)
```
```{r plot, dpi=300, fig.asp = 0.75, fig.width = 6, fig.align = "center", message=FALSE}
plot(brfss_jointVIP)
```
In this example, `age_over65` and `average_drinks` are two most important variables to adjust. At a bias tolerance of 0.01, 3 variables: `age_over65`, `average_drinks`, and `age_25to34` are above the tolerance threshold. Moreover, `age_over65` and `average_drinks` are of higher importance for adjustment than `age_25to34`. Although `race_black` and `age_over65` have similar absolute standardized mean differences (0.322 and 0.333, respectively), `age_over65` is more important to adjust for since its highly correlated with the outcome.
## Acknowledgement
Ford, C. 2018. “Getting Started with Matching Methods.” UVA Library StatLab. https://library.virginia.edu/data/articles/getting-started-with-matching-methods/ (accessed Jan 29, 2024).