create_summary_report fails on multiple populations if one of the populations always has specific outcome #69

Closed
filbert42 opened this issue Jul 19, 2022 · 2 comments

Comments

@filbert42

Hi Uriah, thank you for this great package! :)

I'm trying to use rtichoke to detect fairness problems in my models, so I'm calling the create_summary_report function with several populations that are actually subpopulations of my test data set.
However, if the target outcome in some subpopulation takes only a single 'real' value (for example, all men in the given dataset have outcome == 1), create_summary_report throws this error:

<simpleError in roc.default(response, predictor, auc = TRUE, ...): 'response' must have two levels>

Is it possible to render the report regardless? This doesn't happen very often and it is obviously a problem with my data, but I believe it would be more convenient.
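In the meantime, one way to spot the affected subpopulations before building the report is to check each outcome vector for both classes. This is only a base-R sketch: `has_both_outcomes` and the `reals_by_population` data are made up for illustration and are not part of rtichoke.

```r
# Hypothetical helper (not part of rtichoke): TRUE when a vector of observed
# outcomes contains both classes.
has_both_outcomes <- function(reals) {
  length(unique(reals[!is.na(reals)])) == 2
}

# Made-up subpopulations, for illustration only.
reals_by_population <- list(
  women = c(0, 1, 0, 1, 1),
  men   = c(1, 1, 1, 1)
)

# One logical flag per subpopulation.
usable <- vapply(reals_by_population, has_both_outcomes, logical(1))

# Names of the subpopulations that would trigger the roc.default error.
names(usable)[!usable]
```

The flagged names could then be dropped from both `probs` and `reals`, or reported separately, before calling create_summary_report.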

@filbert42 filbert42 changed the title create_summary_report fails on multiple populations if one of the populations always have specific outcome create_summary_report fails on multiple populations if one of the populations always has specific outcome Jul 19, 2022
@uriahf
Owner

uriahf commented Jul 20, 2022

Thank you @filbert42 !

In order to solve this problem I will render what is possible.

If there are no real positives in the data (TP + FN = 0), I can't calculate sensitivity, TP / (TP + FN), and therefore I can't create ROC or Precision Recall curves. The same is true for the Lift Curve (the prevalence will be equal to 0).

If there are no real negatives in the data (TN + FP = 0), I can't calculate specificity, TN / (TN + FP), and therefore I can't create ROC or Gains curves.

All the other cases might be weird but possible if I'm not mistaken.
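The two degenerate cases can be sketched in base R. This is only an illustration of the arithmetic, not rtichoke's implementation: `sens_spec`, the 0.5 threshold, and the probabilities are assumptions made up for the example.

```r
# Hypothetical helper (not rtichoke's implementation): sensitivity and
# specificity at one probability threshold, NA when the denominator is zero.
sens_spec <- function(probs, reals, threshold = 0.5) {
  predicted_positive <- probs >= threshold
  tp <- sum(predicted_positive & reals == 1)
  fn <- sum(!predicted_positive & reals == 1)
  tn <- sum(!predicted_positive & reals == 0)
  fp <- sum(predicted_positive & reals == 0)
  c(
    sensitivity = if (tp + fn > 0) tp / (tp + fn) else NA_real_,
    specificity = if (tn + fp > 0) tn / (tn + fp) else NA_real_
  )
}

probs <- c(0.1, 0.4, 0.6, 0.9)  # made-up probabilities

no_positives <- sens_spec(probs, reals = c(0, 0, 0, 0))  # sensitivity is NA
no_negatives <- sens_spec(probs, reals = c(1, 1, 1, 1))  # specificity is NA
```

Whichever metric comes back as NA marks the curves that cannot be drawn for that population, while the remaining metric is still computable.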

Sounds good?

@uriahf
Owner

uriahf commented Jan 5, 2023

The following code should run properly now:


rtichoke:::create_summary_report(
  probs = list(example_dat$estimated_probabilities),
  reals = list(rep(0, 150))
)



rtichoke:::create_summary_report(
  probs = list(example_dat$estimated_probabilities),
  reals = list(rep(1, 150))
)


rtichoke:::create_summary_report(
  probs = list(
    "Second Model" = example_dat$bad_model,
    "First Model" = example_dat$estimated_probabilities
  ),
  reals = list(rep(0, 150))
) 


rtichoke:::create_summary_report(
  probs = list(
    "Second Model" = example_dat$bad_model,
    "First Model" = example_dat$estimated_probabilities
  ),
  reals = list(rep(1, 150))
) 


rtichoke::create_summary_report(
  probs = list(
    "train" = example_dat %>% dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(estimated_probabilities),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(estimated_probabilities)
  ),
  reals = list(
    "train" = rep(0, 96),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(outcome)
  )
) 

rtichoke::create_summary_report(
  probs = list(
    "train" = example_dat %>% dplyr::filter(type_of_set == "train") %>%
      dplyr::pull(estimated_probabilities),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(estimated_probabilities)
  ),
  reals = list(
    "train" = rep(1, 96),
    "test" = example_dat %>% dplyr::filter(type_of_set == "test") %>%
      dplyr::pull(outcome)
  )
) 

@uriahf uriahf closed this as completed in edc592f Apr 12, 2023
@github-project-automation github-project-automation bot moved this from Todo to Done in rtichoke 0.0.5 Apr 12, 2023