create_summary_report fails on multiple populations if one of the populations always has specific outcome #69

filbert42 · 2022-07-19T09:10:57Z

Hi Uriah, thank you for this great package! :)

I'm trying to use Rtichoke to detect fairness problems in my models, so I'm using the create_summary_report function with several populations that are actually subpopulations of my test data set.
However, if the target outcome in some subpopulation has only a single 'real' value (for example, for some reason all men in the given dataset have outcome == 1), create_summary_report throws this error:

<simpleError in roc.default(response, predictor, auc = TRUE, ...): 'response' must have two levels>

Is it possible to render the report regardless? This doesn't happen very often and it is obviously a problem with my data, but I believe it would be more convenient.
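Until the report handles this case, one way to spot the affected subpopulations up front is to count the distinct outcome values per group. This is a minimal base R sketch, not part of rtichoke: the helper `single_outcome_populations`, the toy data frame, and its column names are all hypothetical and only illustrate the check.

```r
# Toy data standing in for a test set; column names are made up for this sketch.
test_set <- data.frame(
  sex     = c("men", "men", "men", "women", "women", "women"),
  outcome = c(1, 1, 1, 0, 1, 0),
  prob    = c(0.8, 0.6, 0.9, 0.2, 0.7, 0.4)
)

# Named lists with one entry per subpopulation.
probs <- split(test_set$prob, test_set$sex)
reals <- split(test_set$outcome, test_set$sex)

# Hypothetical helper (not an rtichoke function): return the names of the
# populations whose observed outcomes take fewer than two distinct values.
single_outcome_populations <- function(reals) {
  n_levels <- vapply(
    reals,
    function(x) length(unique(x[!is.na(x)])),
    integer(1)
  )
  names(reals)[n_levels < 2]
}

single_outcome_populations(reals)
```

In this toy data only the "men" group is flagged, because all of its outcomes equal 1. The check assumes `reals` is a named list; for an unnamed list it would return NULL.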

uriahf · 2022-07-20T14:17:37Z

Thank you @filbert42 !

In order to solve this problem I will render what is possible.

If there are no real positives in the data (TP + FN = 0), I can't calculate sensitivity, TP / (TP + FN), and therefore I can't create ROC or Precision-Recall curves. The same is true for the Lift curve (the prevalence will be equal to 0).

If there are no real negatives in the data (TN + FP = 0), I can't calculate specificity, TN / (TN + FP), and therefore I can't create ROC or Gains curves.
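The two degenerate cases can be illustrated with the standard confusion-matrix definitions at a single threshold. This is a rough base R sketch rather than rtichoke's actual implementation: the function `confusion_metrics` and the 0.5 threshold are assumptions made for the example.

```r
# Illustrative only (not rtichoke code): metrics at one probability threshold.
# A metric is returned as NA when its denominator is zero.
confusion_metrics <- function(probs, reals, threshold = 0.5) {
  predicted_positive <- probs >= threshold
  tp <- sum(predicted_positive & reals == 1)
  fp <- sum(predicted_positive & reals == 0)
  fn <- sum(!predicted_positive & reals == 1)
  tn <- sum(!predicted_positive & reals == 0)
  c(
    sensitivity = if (tp + fn > 0) tp / (tp + fn) else NA_real_,
    specificity = if (tn + fp > 0) tn / (tn + fp) else NA_real_,
    ppv         = if (tp + fp > 0) tp / (tp + fp) else NA_real_
  )
}

probs <- c(0.1, 0.4, 0.6, 0.9)

# No real positives: TP + FN = 0, so sensitivity is undefined.
no_positives <- confusion_metrics(probs, reals = c(0, 0, 0, 0))

# No real negatives: TN + FP = 0, so specificity is undefined.
no_negatives <- confusion_metrics(probs, reals = c(1, 1, 1, 1))
```

With no real positives, sensitivity comes back as NA while specificity is still defined; with no real negatives, it is the other way around.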

All the other cases might be weird but possible if I'm not mistaken.

Sounds good?

uriahf · 2023-01-05T06:20:52Z

The following code should run properly now:


# example_dat ships with rtichoke; %>% is provided by magrittr
library(rtichoke)
library(magrittr)

rtichoke:::create_summary_report(
  probs = list(example_dat$estimated_probabilities),
  reals = list(rep(0, 150))
)



rtichoke:::create_summary_report(
  probs = list(example_dat$estimated_probabilities),
  reals = list(rep(1, 150))
)


rtichoke:::create_summary_report(
  probs = list(
    "Second Model" = example_dat$bad_model,
    "First Model" = example_dat$estimated_probabilities
  ),
  reals = list(rep(0, 150))
) 


rtichoke:::create_summary_report(
  probs = list(
    "Second Model" = example_dat$bad_model,
    "First Model" = example_dat$estimated_probabilities
  ),
  reals = list(rep(1, 150))
) 


rtichoke::create_summary_report(
  probs = list(
    "train" = example_dat %>% dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(estimated_probabilities),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(estimated_probabilities)
  ),
  reals = list(
    "train" = rep(0, 96),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(outcome)
  )
) 

rtichoke::create_summary_report(
  probs = list(
    "train" = example_dat %>% dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(estimated_probabilities),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(estimated_probabilities)
  ),
  reals = list(
    "train" = rep(1, 96),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(outcome)
  )
)

filbert42 changed the title ~~create_summary_report fails on multiple populations if one of the populations always have specific outcome~~ create_summary_report fails on multiple populations if one of the populations always has specific outcome Jul 19, 2022

uriahf added Difficulty: intermediate Priority: medium labels Jul 20, 2022

uriahf added this to rtichoke 0.0.5 Sep 22, 2022

uriahf added Priority: high and removed Priority: medium labels Sep 22, 2022

uriahf mentioned this issue Sep 22, 2022

Support zero variance when there are zero-variance predictions #83

Closed

uriahf moved this to Todo in rtichoke 0.0.5 Sep 22, 2022

uriahf closed this as completed in edc592f Apr 12, 2023

github-project-automation bot moved this from Todo to Done in rtichoke 0.0.5 Apr 12, 2023
