For reproducible examples, please visit the rtichoke blog!
You can install rtichoke from GitHub with:
# install.packages("devtools")
devtools::install_github("uriahf/rtichoke")
rtichoke is designed to help analysts explore the performance metrics of prediction models with a binary outcome. It does so through interactive visualization.
To use rtichoke you need two inputs:
probs: Estimated probabilities, used as predictions.
reals: Binary outcomes.
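As a rough illustration of these two inputs, the base R sketch below simulates a vector of estimated probabilities and a matching vector of binary outcomes, then checks that they have the expected shape. The names sim_probs, sim_reals and check_inputs are hypothetical and are not part of rtichoke.

```r
# Hypothetical data: 100 simulated observations (not taken from rtichoke)
set.seed(42)
sim_probs <- runif(100)                               # estimated probabilities in [0, 1]
sim_reals <- rbinom(100, size = 1, prob = sim_probs)  # binary outcomes (0 or 1)

# Hypothetical helper: checks the basic shape of the two inputs
check_inputs <- function(probs, reals) {
  stopifnot(
    is.numeric(probs),
    all(probs >= 0 & probs <= 1),
    all(reals %in% c(0, 1)),
    length(probs) == length(reals)
  )
  invisible(TRUE)
}

check_inputs(sim_probs, sim_reals)

# Both arguments are wrapped in lists, as in the examples in this README
probs <- list(sim_probs)
reals <- list(sim_reals)
```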
There are 3 different cases, and for each of them rtichoke requires a different kind of input:
In the simplest case, with a single model and a single population, the user is required to provide a list with one vector of predictions and a list with one vector of outcomes.
create_roc_curve(
  probs = list(example_dat$bad_model),
  reals = list(example_dat$outcome)
)
Why? To compare the performance of several models on the same population.
How? The user is required to provide a list with one vector of predictions for each model and a list with a single vector of outcomes for the population.
create_roc_curve(
  probs = list(
    "Good Model" = example_dat$estimated_probabilities,
    "Bad Model" = example_dat$bad_model,
    "Random Guess" = example_dat$random_guess
  ),
  reals = list(rtichoke::example_dat$outcome)
)
Why? To compare performance across different populations, as in a train/test split, or to check the fairness of an algorithm.
How? The user is required to provide a list with one vector of predictions for each population and a list with one vector of outcomes for each population.
library(dplyr) # provides the %>% pipe used below

create_roc_curve(
  probs = list(
    "Train" = example_dat %>%
      dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(estimated_probabilities),
    "Test" = example_dat %>%
      dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(estimated_probabilities)
  ),
  reals = list(
    "Train" = example_dat %>%
      dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(outcome),
    "Test" = example_dat %>%
      dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(outcome)
  )
)
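The per-population lists can also be built without dplyr. The sketch below uses base R's split() on a small hypothetical data frame, toy_dat, whose column names mimic those of example_dat. The values are made up for illustration.

```r
# Hypothetical data frame mimicking the columns used above (made-up values)
toy_dat <- data.frame(
  estimated_probabilities = c(0.9, 0.2, 0.7, 0.4, 0.1, 0.8),
  outcome                 = c(1, 0, 1, 0, 0, 1),
  type_of_set             = c("train", "train", "train", "test", "test", "test")
)

# split() returns a named list with one vector per population
probs_by_set <- split(toy_dat$estimated_probabilities, toy_dat$type_of_set)
reals_by_set <- split(toy_dat$outcome, toy_dat$type_of_set)

# The two lists share the same names, in the same order
stopifnot(identical(names(probs_by_set), names(reals_by_set)))
```

The resulting lists have the shape described above, one named vector per population, so they could presumably be passed as probs and reals. Note that split() orders the names alphabetically ("test" before "train").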
For some outputs in rtichoke you can alternatively prepare the performance data in advance and use it as an input: instead of create_*_curve use plot_*_curve, and instead of create_performance_table use render_performance_table:
one_pop_one_model_as_a_vector %>%
  plot_roc_curve()
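This README does not spell out what the performance data contains. As a loose illustration only, the base R sketch below computes confusion-matrix counts and two derived rates at a few probability thresholds, which is the kind of information a ROC curve is drawn from. The helper sketch_performance_data, its column names and the toy inputs are all hypothetical. This is not rtichoke's implementation or data format.

```r
# Hypothetical helper, not rtichoke's implementation: confusion-matrix
# counts and derived rates for each probability threshold
sketch_performance_data <- function(probs, reals,
                                    thresholds = seq(0, 1, by = 0.25)) {
  rows <- lapply(thresholds, function(th) {
    predicted_positive <- probs > th
    tp <- sum(predicted_positive & reals == 1)
    fp <- sum(predicted_positive & reals == 0)
    fn <- sum(!predicted_positive & reals == 1)
    tn <- sum(!predicted_positive & reals == 0)
    data.frame(
      threshold = th, TP = tp, FP = fp, FN = fn, TN = tn,
      sensitivity = tp / (tp + fn),
      specificity = tn / (tn + fp)
    )
  })
  do.call(rbind, rows)
}

# Made-up predictions and outcomes
toy_probs <- c(0.9, 0.2, 0.7, 0.4, 0.1, 0.8)
toy_reals <- c(1, 0, 1, 0, 0, 1)

perf <- sketch_performance_data(toy_probs, toy_reals)
print(perf)
```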
To get all the supported outputs of rtichoke in one HTML file, the user can call create_summary_report().
If you encounter a bug, please file an issue with a minimal reproducible example. It will be easier for me to help you, and it might help others in the future. Alternatively, you are welcome to contact me personally: ufinkel@gmail.com