Implement Counterplots #402

Open. Wants to merge 4 commits into base: main.

rmazzine
Contributor

Here is the repo for CounterPlots (https://github.com/ADMAntwerp/CounterPlots), which shows what the plots look like. I believe they will greatly improve the usability of counterfactual explanations by adding comprehensive visuals and charts.

Example using the DiCE TF model:

import dice_ml
from dice_ml.utils import helpers # helper functions
from sklearn.model_selection import train_test_split

dataset = helpers.load_adult_income_dataset()
target = dataset["income"] # outcome variable
train_dataset, test_dataset, _, _ = train_test_split(dataset,
                                                     target,
                                                     test_size=0.2,
                                                     random_state=0,
                                                     stratify=target)
# Dataset for training an ML model
d = dice_ml.Data(dataframe=train_dataset,
                 continuous_features=['age', 'hours_per_week'],
                 outcome_name='income')

# Pre-trained ML model
m = dice_ml.Model(model_path=helpers.get_adult_income_modelpath(backend='TF2'),
                  backend='TF2', func="ohe-min-max")
# DiCE explanation instance
exp = dice_ml.Dice(d, m)

# Generate counterfactual examples
query_instance = test_dataset.drop(columns="income")[0:1]
dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=4, desired_class="opposite")
# Visualize counterfactual explanation
dice_exp.visualize_as_dataframe()

# Create counterplots
ctp = dice_exp.plot_counterplots(m)

# Counterplots outputs
ctp[0].constellation()
ctp[0].greedy()
ctp[0].countershapley()
ctp[0].countershapley_values()
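
Because the result of `plot_counterplots` is indexed with `[0]` above, it presumably holds one plotting object per query instance. A minimal sketch for rendering every chart for every instance under that assumption; `render_all_charts` is a hypothetical helper, not something this PR adds:

```python
def render_all_charts(counterplots,
                      chart_names=("greedy", "countershapley", "constellation")):
    """Call each named chart method on every counterplot object.

    Assumes `counterplots` is an iterable of objects exposing the
    zero-argument chart methods used in the example above (an assumption
    about what plot_counterplots returns).
    Returns the (index, chart_name) pairs that were rendered, in order.
    """
    rendered = []
    for index, counterplot in enumerate(counterplots):
        for name in chart_names:
            getattr(counterplot, name)()  # e.g. counterplot.greedy()
            rendered.append((index, name))
    return rendered
```

With the objects from the example, this would be called as `render_all_charts(ctp)`.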

Example using Scikit-Learn:

import dice_ml
from dice_ml import Dice

from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier

df_iris = load_iris(as_frame=True).frame
df_iris.head()

df_iris.info()

outcome_name = "target"
continuous_features_iris = df_iris.drop(outcome_name, axis=1).columns.tolist()
target = df_iris[outcome_name]

# Split data into train and test
datasetX = df_iris.drop(outcome_name, axis=1)
x_train, x_test, y_train, y_test = train_test_split(datasetX,
                                                    target,
                                                    test_size=0.2,
                                                    random_state=0,
                                                    stratify=target)

# All iris features are continuous, so this index is empty
categorical_features = x_train.columns.difference(continuous_features_iris)

# We create the preprocessing pipelines for both numeric and categorical data.
numeric_transformer = Pipeline(steps=[
    ('scaler', StandardScaler())])

categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

transformations = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, continuous_features_iris),
        ('cat', categorical_transformer, categorical_features)])

# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf_iris = Pipeline(steps=[('preprocessor', transformations),
                           ('classifier', RandomForestClassifier())])
model_iris = clf_iris.fit(x_train, y_train)

d_iris = dice_ml.Data(dataframe=df_iris,
                      continuous_features=continuous_features_iris,
                      outcome_name=outcome_name)

# We provide the type of model as a parameter (model_type)
m_iris = dice_ml.Model(model=model_iris, backend="sklearn", model_type='classifier')

exp_genetic_iris = Dice(d_iris, m_iris, method="genetic")

# Single input
query_instances_iris = x_test[2:3]
genetic_iris = exp_genetic_iris.generate_counterfactuals(query_instances_iris, total_CFs=7, desired_class=2)
genetic_iris.visualize_as_dataframe()

# Create counterplots
ctps = genetic_iris.plot_counterplots(m_iris)

# Counterplots output
ctps[0].greedy()
ctps[0].countershapley()
ctps[0].countershapley_values()
ctps[0].constellation()
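
The name `countershapley` suggests a Shapley-style attribution of the prediction change to each feature that differs between the query instance and a counterfactual. The following is only a toy sketch of that general idea, not the CounterPlots implementation: `counterfactual_shapley`, `score_fn`, and the dict-based inputs are hypothetical names chosen for illustration.

```python
from itertools import combinations
from math import factorial


def counterfactual_shapley(score_fn, factual, counterfactual):
    """Exact Shapley value of each changed feature.

    `factual` and `counterfactual` are dicts with the same keys;
    `score_fn` maps such a dict to a numeric prediction score.
    The returned values sum to score_fn(counterfactual) - score_fn(factual).
    """
    # Only features whose value differs can contribute to the change.
    changed = [k for k in factual if factual[k] != counterfactual[k]]
    n = len(changed)

    def score_with(subset):
        # Start from the factual point and apply only the chosen changes.
        point = dict(factual)
        point.update({k: counterfactual[k] for k in subset})
        return score_fn(point)

    values = {}
    for feature in changed:
        others = [k for k in changed if k != feature]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = score_with(subset + (feature,)) - score_with(subset)
                total += weight * gain
        values[feature] = total
    return values
```

The exact computation enumerates every subset of changed features, so it is only practical when a counterfactual changes a handful of features.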

Signed-off-by: rmazzine <mazzine.r@gmail.com>
@rmazzine
Contributor Author

@amit-sharma the issues don't seem related to my changes, or am I wrong? Thanks!
