Merge pull request #124 from DataResponsibly/development
Release 0.5.0
denysgerasymuk799 committed Jun 2, 2024
2 parents 14c275a + e7a304a commit 76e4bb3
Showing 158 changed files with 115,694 additions and 182,002 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -15,7 +15,7 @@ jobs:
 fail-fast: false
 matrix:
 python: [3.8, 3.9]
-os: [ubuntu-latest, macos-latest]
+os: [ubuntu-latest, macos-13]

 uses: ./.github/workflows/build-virny.yml
 with:
@@ -28,7 +28,7 @@ jobs:
 fail-fast: false
 matrix:
 python: [3.8, 3.9]
-os: [ubuntu-latest, macos-latest]
+os: [ubuntu-latest, macos-13]

 uses: ./.github/workflows/unit-tests.yml
 with:
1 change: 1 addition & 0 deletions .gitignore
@@ -5,6 +5,7 @@ notebooks
 .DS_Store
 .ipynb_checkpoints
 docs/examples/test.py
+tests/results

 # Remove big files from GitHub repo
 virny/datasets/2018
6 changes: 3 additions & 3 deletions MANIFEST.in
@@ -1,3 +1,3 @@
-include virny/datasets/*.csv
-include virny/datasets/*.gz
-include virny/datasets/*.zip
+include virny/datasets/data/*.csv
+include virny/datasets/data/*.gz
+include virny/datasets/data/*.zip
50 changes: 35 additions & 15 deletions README.md
@@ -32,7 +32,8 @@

**Virny** is a Python library for in-depth profiling of model performance across overall and disparity dimensions.
In addition to its metric computation capabilities, the library provides an interactive tool called _VirnyView_
to streamline responsible model selection and generate nutritional labels for ML models.
to streamline responsible model selection and generate nutritional labels for ML models.

The Virny library was developed based on three fundamental principles:

1) easy extensibility of model analysis capabilities;
@@ -65,33 +66,52 @@ pip install virny
* [Interactive Demo](https://huggingface.co/spaces/denys-herasymuk/virny-demo)


## 💡 Features
## 😎 Why Virny

In contrast to existing fairness software libraries and model card generating frameworks, our system stands out in four key aspects:

1. Virny facilitates the measurement of **all normatively important performance dimensions** (including _fairness_, _stability_, and _uncertainty_) for a set of initialized models, both overall and broken down by user-defined subgroups of interest.

2. Virny enables data scientists to analyze performance using **multiple sensitive attributes** (including _non-binary_) and their _intersections_.

3. Virny offers **diverse APIs for metric computation**, designed to analyze multiple models in a single execution, assessing stability and uncertainty on correct and incorrect predictions broken down by protected groups, and testing models on multiple test sets, including in-domain and out-of-domain.

4. Virny implements a streamlined flow design tailored for **responsible model selection**, reducing the complexity associated with numerous model types, performance dimensions, and data-centric and model-centric interventions.


## 💡 List of Features

* Entire pipeline for profiling model accuracy, stability, uncertainty, and fairness
* Profiling of all normatively important performance dimensions: accuracy, stability, uncertainty, and fairness
* Ability to analyze non-binary sensitive attributes and their intersections
* Compatibility with [pre-, in-, and post-processors](https://aif360.readthedocs.io/en/latest/modules/algorithms.html#) for fairness enhancement from AIF360
* Convenient metric computation interfaces: an interface for multiple models, an interface for multiple test sets, and an interface for saving results into a user-defined database
* Interactive _VirnyView_ visualizer that profiles dataset properties related to protected groups, computes comprehensive [nutritional labels](http://sites.computer.org/debull/A19sept/p13.pdf) for individual models, compares multiple models according to multiple metrics, and guides users through model selection
* Compatibility with [pre-, in-, and post-processors](https://aif360.readthedocs.io/en/latest/modules/algorithms.html#) for fairness enhancement from AIF360
* An `error_analysis` computation mode to analyze model stability and confidence for correct and incorrect predictions broken down by groups
* Metric static and interactive visualizations
* Data loaders with subsampling for popular fair-ML benchmark datasets
* User-friendly parameters input via config yaml files
* Check out [our documentation](https://dataresponsibly.github.io/Virny/) for a comprehensive overview
* User-friendly parameters input via config yaml files

Check out [our documentation](https://dataresponsibly.github.io/Virny/) for a comprehensive overview.

## 📖 Library Overview

![Virny_Architecture](https://github.com/DataResponsibly/Virny/assets/42843889/91620e0f-11ff-4093-8fb6-c88c90bff711)
## 🤗 Affiliations

![NYU-UCU-Logos](https://user-images.githubusercontent.com/42843889/216840888-071bf184-f0e3-4a3e-94dc-c0d1c7784143.png)

The software framework decouples the process of model profiling into several stages, including **subgroup metric computation**,
**disparity metric composition**, and **metric visualization**. This separation empowers data scientists with greater control and
flexibility in employing the library, both during model development and for post-deployment monitoring. The above figure demonstrates
how the library constructs a pipeline for model analysis. Inputs to a user interface are shown in green, pipeline stages are shown in blue,
and the output of each stage is shown in purple.

## 💬 Citation

## 🤗 Affiliations
If Virny has been useful to you, and you would like to cite it in a scientific publication, please refer to the [paper](https://dl.acm.org/doi/abs/10.1145/3626246.3654738) published at SIGMOD:

![NYU-UCU-Logos](https://user-images.githubusercontent.com/42843889/216840888-071bf184-f0e3-4a3e-94dc-c0d1c7784143.png)
```bibtex
@inproceedings{herasymuk2024responsible,
title={Responsible Model Selection with Virny and VirnyView},
author={Herasymuk, Denys and Arif Khan, Falaah and Stoyanovich, Julia},
booktitle={Companion of the 2024 International Conference on Management of Data},
pages={488--491},
year={2024}
}
```


## 📝 License
3 changes: 2 additions & 1 deletion docs/.pages
@@ -1,5 +1,6 @@
 nav:
 - introduction
-- api
 - examples
 - glossary
+- api
+- release_notes
4 changes: 4 additions & 0 deletions docs/api/analyzers/AbstractOverallVarianceAnalyzer.md
@@ -42,6 +42,10 @@ Abstract class for an analyzer that computes overall variance metrics for subgro

Number of estimators in ensemble to measure base_model stability

- **random_state** (*int*) – defaults to `None`

[Optional] Controls the randomness of the bootstrap approach for model arbitrariness evaluation

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag if model can return probabilities for its predictions. If no, only metrics based on labels (not labels and probabilities) will be computed.
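
The effect of a fixed `random_state` on a bootstrap procedure can be sketched with the standard library alone. The helper `bootstrap_indexes` and its `sample_fraction` argument are hypothetical names introduced for illustration; this is not Virny's implementation, only a minimal sketch of seeded sampling with replacement.

```python
import random


def bootstrap_indexes(n_rows, n_estimators, sample_fraction=0.8, random_state=None):
    """Draw one bootstrap sample of row indexes per estimator.

    Hypothetical helper for illustration; not part of Virny's API.
    """
    # random_state=None lets Python seed the generator from the OS,
    # so repeated runs differ; an int makes the draws reproducible.
    rng = random.Random(random_state)
    sample_size = int(n_rows * sample_fraction)
    return [
        [rng.randrange(n_rows) for _ in range(sample_size)]  # sampling with replacement
        for _ in range(n_estimators)
    ]


run_a = bootstrap_indexes(n_rows=100, n_estimators=3, random_state=42)
run_b = bootstrap_indexes(n_rows=100, n_estimators=3, random_state=42)
print(run_a == run_b)  # identical seeds give identical samples
```

With the same seed, every estimator in the ensemble is trained on the same rows across runs, so differences in variance metrics between runs cannot come from the sampling step.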
4 changes: 4 additions & 0 deletions docs/api/analyzers/BatchOverallVarianceAnalyzer.md
@@ -46,6 +46,10 @@ Analyzer to compute subgroup variance metrics for batch learning models.

Number of estimators in ensemble to measure base_model stability

- **random_state** (*int*) – defaults to `None`

[Optional] Controls the randomness of the bootstrap approach for model arbitrariness evaluation

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag if model can return probabilities for its predictions. If no, only metrics based on labels (not labels and probabilities) will be computed.
@@ -54,6 +54,10 @@ Analyzer to compute subgroup variance metrics using the defined post-processor.

Number of estimators in ensemble to measure base_model stability

- **random_state** (*int*) – defaults to `None`

[Optional] Controls the randomness of the bootstrap approach for model arbitrariness evaluation

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag if model can return probabilities for its predictions. If no, only metrics based on labels (not labels and probabilities) will be computed.
8 changes: 8 additions & 0 deletions docs/api/analyzers/SubgroupVarianceAnalyzer.md
@@ -50,10 +50,18 @@ Analyzer to compute variance metrics for subgroups.

A sensitive attribute to use for post-processing

- **random_state** (*int*) – defaults to `None`

[Optional] Controls the randomness of the bootstrap approach for model arbitrariness evaluation

- **computation_mode** (*str*) – defaults to `None`

[Optional] A non-default mode for metrics computation. Should be included in the ComputationMode enum.

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] True, if models in models_config have a predict_proba method and can return probabilities for predictions, False, otherwise. Note that if it is set to False, only metrics based on labels (not labels and probabilities) will be computed. Ignored when a postprocessor is not None, and set to False in this case.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True, if this interface was executed in a Jupyter notebook, False, otherwise.
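
The rule described for `with_predict_proba` (probability-based metrics only when models expose `predict_proba`, and forced to False when a postprocessor is set) can be sketched as follows. The classes and the helpers `supports_predict_proba` and `resolve_with_predict_proba` are hypothetical and written for illustration; how Virny resolves the flag internally is not shown in this diff.

```python
class LabelOnlyModel:
    """Toy model that can only return labels (hypothetical)."""

    def predict(self, X):
        return [0 for _ in X]


class ProbabilisticModel(LabelOnlyModel):
    """Toy model that can also return class probabilities (hypothetical)."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def supports_predict_proba(model):
    # A model qualifies only if it has a callable predict_proba attribute.
    return callable(getattr(model, "predict_proba", None))


def resolve_with_predict_proba(models_config, postprocessor=None):
    # Mirrors the documented behaviour: a postprocessor forces the flag to False.
    if postprocessor is not None:
        return False
    return all(supports_predict_proba(m) for m in models_config.values())


models_config = {"prob": ProbabilisticModel(), "labels": LabelOnlyModel()}
print(resolve_with_predict_proba(models_config))
```

A check like this can be run before calling a metric computation interface to decide which value to pass for `with_predict_proba`.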
4 changes: 2 additions & 2 deletions docs/api/custom-classes/BaseFlowDataset.md
@@ -6,9 +6,9 @@ Dataset class with custom train and test splits that is used as input for metric

 ## Parameters

-- **init_features_df** (*pandas.core.frame.DataFrame*)
+- **init_sensitive_attrs_df** (*pandas.core.frame.DataFrame*)

-Full train + test non-preprocessed dataset of features without the target column. It is used for creating test groups.
+Full train + test non-preprocessed dataset of sensitive attributes with initial indexes. It is used for creating test groups.

 - **X_train_val** (*pandas.core.frame.DataFrame*)

19 changes: 19 additions & 0 deletions docs/api/datasets/BankMarketingDataset.md
@@ -0,0 +1,19 @@
# BankMarketingDataset

Dataset class for the Bank Marketing dataset that contains sensitive attributes among feature columns. Source: https://github.com/tailequy/fairness_dataset/blob/main/experiments/data/bank-full.csv General description and analysis: https://arxiv.org/pdf/2110.00530.pdf (Section 3.1.5) Broad description: https://archive.ics.uci.edu/dataset/222/bank+marketing



## Parameters

- **subsample_size** (*int*) – defaults to `None`

Subsample size to create based on the input dataset

- **subsample_seed** (*int*) – defaults to `None`

Seed for sampling using the sample() method from pandas
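
The two parameters map naturally onto `pandas.DataFrame.sample`. The sketch below uses a hypothetical `subsample` helper and a toy frame with made-up columns; it illustrates seeded subsampling in general and is not the loader's actual code.

```python
import pandas as pd


def subsample(df, subsample_size=None, subsample_seed=None):
    """Return a seeded random subsample, or the full frame when no size is given.

    Hypothetical helper for illustration; not part of Virny's API.
    """
    if subsample_size is None:
        return df
    # n is the number of rows to draw; random_state makes the draw reproducible.
    return df.sample(n=subsample_size, random_state=subsample_seed)


# Toy data with made-up columns.
df = pd.DataFrame({"age": list(range(20, 30)), "balance": list(range(0, 1000, 100))})

first = subsample(df, subsample_size=4, subsample_seed=42)
second = subsample(df, subsample_size=4, subsample_seed=42)
print(first.index.tolist() == second.index.tolist())  # same seed, same rows
```

Note that `sample()` keeps the original row indexes, which matters when the indexes are later used to look up sensitive attributes for the same rows.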




19 changes: 19 additions & 0 deletions docs/api/datasets/CardiovascularDiseaseDataset.md
@@ -0,0 +1,19 @@
# CardiovascularDiseaseDataset

Dataset class for the Cardiovascular Disease dataset that contains sensitive attributes among feature columns. Source and broad description: https://www.kaggle.com/datasets/sulianova/cardiovascular-disease-dataset



## Parameters

- **subsample_size** (*int*) – defaults to `None`

Subsample size to create based on the input dataset

- **subsample_seed** (*int*) – defaults to `None`

Seed for sampling using the sample() method from pandas




15 changes: 0 additions & 15 deletions docs/api/datasets/CreditCardDefaultDataset.md

This file was deleted.

23 changes: 0 additions & 23 deletions docs/api/datasets/DiabetesDataset.md

This file was deleted.

23 changes: 23 additions & 0 deletions docs/api/datasets/DiabetesDataset2019.md
@@ -0,0 +1,23 @@
# DiabetesDataset2019

Dataset class for the Diabetes 2019 dataset that contains sensitive attributes among feature columns. Source and broad description: https://www.kaggle.com/datasets/tigganeha4/diabetes-dataset-2019/data



## Parameters

- **subsample_size** (*int*) – defaults to `None`

Subsample size to create based on the input dataset

- **subsample_seed** (*int*) – defaults to `None`

Seed for sampling using the sample() method from pandas

- **with_nulls** (*bool*) – defaults to `True`

Whether to keep nulls in the dataset or drop rows with any nulls. Default: True.
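
The behaviour described for `with_nulls` corresponds to keeping the frame as is or dropping every row that has at least one null. A minimal sketch, assuming a hypothetical `apply_with_nulls` helper and made-up column names:

```python
import pandas as pd


def apply_with_nulls(df, with_nulls=True):
    """Keep nulls, or drop rows that contain any null value.

    Hypothetical helper for illustration; not part of Virny's API.
    """
    return df if with_nulls else df.dropna(how="any")


# Toy data with made-up columns; one null in each of two different rows.
df = pd.DataFrame({"Age": ["40-49", None, "50-59"], "BMI": [24.0, 31.0, None]})

print(len(apply_with_nulls(df)))                    # all rows kept
print(len(apply_with_nulls(df, with_nulls=False)))  # only fully populated rows kept
```

Keeping nulls by default leaves the imputation strategy to the user, which is useful when the handling of missing values is itself part of the experiment.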




19 changes: 19 additions & 0 deletions docs/api/datasets/GermanCreditDataset.md
@@ -0,0 +1,19 @@
# GermanCreditDataset

Dataset class for the German Credit dataset that contains sensitive attributes among feature columns. Source: https://github.com/tailequy/fairness_dataset/blob/main/experiments/data/german_data_credit.csv General description and analysis: https://arxiv.org/pdf/2110.00530.pdf (Section 3.1.3) Broad description: https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data



## Parameters

- **subsample_size** (*int*) – defaults to `None`

Subsample size to create based on the input dataset

- **subsample_seed** (*int*) – defaults to `None`

Seed for sampling using the sample() method from pandas




6 changes: 4 additions & 2 deletions docs/api/overview.md
@@ -49,10 +49,12 @@ The purpose is to provide sample datasets for functionality testing and show exa
 - [ACSMobilityDataset](../datasets/ACSMobilityDataset)
 - [ACSPublicCoverageDataset](../datasets/ACSPublicCoverageDataset)
 - [ACSTravelTimeDataset](../datasets/ACSTravelTimeDataset)
+- [BankMarketingDataset](../datasets/BankMarketingDataset)
+- [CardiovascularDiseaseDataset](../datasets/CardiovascularDiseaseDataset)
 - [CompasDataset](../datasets/CompasDataset)
 - [CompasWithoutSensitiveAttrsDataset](../datasets/CompasWithoutSensitiveAttrsDataset)
-- [CreditCardDefaultDataset](../datasets/CreditCardDefaultDataset)
-- [DiabetesDataset](../datasets/DiabetesDataset)
+- [DiabetesDataset2019](../datasets/DiabetesDataset2019)
+- [GermanCreditDataset](../datasets/GermanCreditDataset)
 - [LawSchoolDataset](../datasets/LawSchoolDataset)
 - [RicciDataset](../datasets/RicciDataset)
 - [StudentPerformancePortugueseDataset](../datasets/StudentPerformancePortugueseDataset)
4 changes: 4 additions & 0 deletions docs/api/preprocessing/preprocess-dataset.md
@@ -14,6 +14,10 @@ Preprocess an input dataset using sklearn ColumnTransformer. Split the dataset o

Instance of sklearn ColumnTransformer to preprocess categorical and numerical columns.

- **sensitive_attributes_dct** (*dict*)

Dictionary of sensitive attribute names and their disadvantaged values.

- **test_set_fraction** (*float*)

Fraction from 0 to 1. Used to split the input dataset on the train and test sets.
4 changes: 4 additions & 0 deletions docs/api/user-interfaces/compute-metrics-with-config.md
@@ -26,6 +26,10 @@ Return a dictionary where keys are model names, and values are metrics for sensi

[Optional] Postprocessor object to apply to model predictions before metrics computation

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] True, if models in models_config have a predict_proba method and can return probabilities for predictions, False, otherwise. Note that if it is set to False, only metrics based on labels (not labels and probabilities) will be computed. Ignored when a postprocessor is not None, and set to False in this case.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True, if this interface was executed in a Jupyter notebook, False, otherwise.
4 changes: 4 additions & 0 deletions docs/api/user-interfaces/compute-metrics-with-db-writer.md
@@ -30,6 +30,10 @@ Return a dictionary where keys are model names, and values are metrics for sensi

[Optional] Postprocessor object to apply to model predictions before metrics computation

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] True, if models in models_config have a predict_proba method and can return probabilities for predictions, False, otherwise. Note that if it is set to False, only metrics based on labels (not labels and probabilities) will be computed. Ignored when a postprocessor is not None, and set to False in this case.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True, if this interface was executed in a Jupyter notebook, False, otherwise.
@@ -30,6 +30,10 @@ Compute stability and accuracy metrics for each model in models_config based on

Python function object has one argument (run_models_metrics_df) and save this metrics df to a target database

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] True, if models in models_config have a predict_proba method and can return probabilities for predictions, False, otherwise. Note that if it is set to False, only metrics based on labels (not labels and probabilities) will be computed. Ignored when a postprocessor is not None, and set to False in this case.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True, if this interface was executed in a Jupyter notebook, False, otherwise.
4 changes: 2 additions & 2 deletions docs/api/utils/create-test-protected-groups.md
@@ -10,9 +10,9 @@ Return a dictionary where keys are subgroup names, and values are X_test row ind

 Test feature set

-- **init_features_df** (*pandas.core.frame.DataFrame*)
+- **init_sensitive_attrs_df** (*pandas.core.frame.DataFrame*)

-Initial full dataset without preprocessing
+Initial full dataset of sensitive attributes without preprocessing

 - **sensitive_attributes_dct** (*dict*)

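
The renamed parameter suggests that test groups are built by looking up sensitive attributes for the rows of the test set through their shared index. A rough sketch of that idea for a single attribute follows; the helper `split_test_rows_by_attribute` and the `_dis` / `_priv` key names are assumptions made for illustration, and the real function may differ in naming, in its handling of intersections, and in its return type.

```python
import pandas as pd


def split_test_rows_by_attribute(init_sensitive_attrs_df, X_test, attr, disadvantaged_value):
    """Map subgroup names to the test-set row indexes that belong to them.

    Hypothetical helper for illustration; not part of Virny's API.
    """
    # Preprocessing may drop or encode sensitive columns in X_test,
    # so the raw attribute values are looked up by index instead.
    test_attrs = init_sensitive_attrs_df.loc[X_test.index]
    is_dis = test_attrs[attr] == disadvantaged_value
    return {
        f"{attr}_dis": test_attrs[is_dis].index.tolist(),
        f"{attr}_priv": test_attrs[~is_dis].index.tolist(),
    }


# Toy data: sensitive attributes for the full dataset, features for the test rows only.
init_sensitive_attrs_df = pd.DataFrame(
    {"sex": ["female", "male", "female", "male"]}, index=[10, 11, 12, 13]
)
X_test = pd.DataFrame({"feature": [0.1, 0.2, 0.3]}, index=[11, 12, 13])

groups = split_test_rows_by_attribute(init_sensitive_attrs_df, X_test, "sex", "female")
print(groups)
```

Passing only the sensitive-attribute columns, rather than the full feature frame, keeps the lookup small and works even when the model is trained without those columns.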
1 change: 0 additions & 1 deletion docs/examples/.pages
@@ -1,7 +1,6 @@
title: Examples 🍱
nav:
- Multiple_Models_Interface_Use_Case.md
- Interactive_Web_App_Demo.md
- Multiple_Models_Interface_With_DB_Writer.md
- Multiple_Models_Interface_With_Error_Analysis.md
- Multiple_Models_Interface_With_Multiple_Test_Sets.md