
Development #106

Merged
merged 184 commits into from
Jan 29, 2024
184 commits
0190e22
Added extra tests
denysgerasymuk799 Aug 14, 2023
e65dcc1
Added plot 1 to a gradio app
Sep 30, 2023
c84456e
Created subgroup and group heatmaps
Oct 1, 2023
21f783f
Added bar charts to a web app
Oct 1, 2023
e0113e1
Added bar charts to a web app
denysgerasymuk799 Oct 1, 2023
8580bbc
Added init for view 1
denysgerasymuk799 Oct 1, 2023
de6b7d7
Improved a metrics bar chart
denysgerasymuk799 Oct 2, 2023
3afea56
Tested a gradio app on a big metrics df
denysgerasymuk799 Oct 2, 2023
6da26a2
Added a gradio app for Law_School
denysgerasymuk799 Oct 4, 2023
cb207bb
Added an overall subgrop to heatmaps
denysgerasymuk799 Oct 4, 2023
498b0ef
Added a gradio app for Ricci
denysgerasymuk799 Oct 5, 2023
e49c7c8
Added minor fixes to a model selection ap
denysgerasymuk799 Oct 6, 2023
69306c5
Reveresed a color bar for heatmaps
denysgerasymuk799 Oct 7, 2023
6262752
Added a table with model names that satisfy all 4 constraints
denysgerasymuk799 Oct 7, 2023
6146aae
Added tolerance to heatmaps
denysgerasymuk799 Oct 8, 2023
b8ea341
Added tolerance to heatmaps
denysgerasymuk799 Oct 8, 2023
b981218
Added a test sample for data stats panel
denysgerasymuk799 Oct 9, 2023
97ccb6f
Added subgroup proportions and base rates
denysgerasymuk799 Oct 9, 2023
ab34104
Changed a default range for Label_Stability_Ratio
denysgerasymuk799 Oct 10, 2023
84cf426
Restructured section 3 in the gradio app
denysgerasymuk799 Oct 11, 2023
df44d29
Added dynamic variables for the stats bar chart
denysgerasymuk799 Oct 12, 2023
03f4c19
Added new uncertainty disparity metrics
denysgerasymuk799 Oct 12, 2023
a7bdd87
Merge pull request #60 from DataResponsibly/feature/add_error_analysi…
denysgerasymuk799 Oct 12, 2023
d26f528
Added new dependencies
denysgerasymuk799 Oct 12, 2023
477331a
Merge pull request #61 from DataResponsibly/feature/add_error_analysi…
denysgerasymuk799 Oct 12, 2023
866f30f
Added minor fixes to a visualization component
denysgerasymuk799 Oct 13, 2023
7e5714e
Added gc collect and fixed metrics computation issue for a correct/in…
denysgerasymuk799 Oct 14, 2023
6056c69
Merge pull request #62 from DataResponsibly/feature/add_error_analysi…
denysgerasymuk799 Oct 14, 2023
0ee929a
Added flushing for warning prints
denysgerasymuk799 Oct 14, 2023
436fa4c
Merge pull request #63 from DataResponsibly/feature/add_error_analysi…
denysgerasymuk799 Oct 14, 2023
5130cce
Fixed a bug with group partitioning for extra test sets
denysgerasymuk799 Oct 18, 2023
6a1ef30
Merge pull request #64 from DataResponsibly/feature/add_error_analysi…
denysgerasymuk799 Oct 18, 2023
a32cba9
Created functions for each separate metric
denysgerasymuk799 Oct 20, 2023
985a483
Improved MetricsComposer
denysgerasymuk799 Oct 20, 2023
8bb2b29
Fixed tests for metrics
denysgerasymuk799 Oct 20, 2023
f01ac76
Fixed tests for metrics
denysgerasymuk799 Oct 20, 2023
7a531ed
Fixed tests for metrics
denysgerasymuk799 Oct 20, 2023
6d44b1e
Aligned API based on with_predict_proba
denysgerasymuk799 Oct 21, 2023
a61cfa0
Removed test files
denysgerasymuk799 Oct 21, 2023
79512a6
Checked error_analysis mode
denysgerasymuk799 Oct 21, 2023
3427782
Added functions for postprocessing
proc1v Oct 21, 2023
e591687
Merge branch 'feature/add_postprocessing_mode' of https://github.com/…
proc1v Oct 21, 2023
06edd35
Added tests for MetricsComposer
denysgerasymuk799 Oct 21, 2023
072085b
Added tests for all metrics
denysgerasymuk799 Oct 22, 2023
aa47d41
Added parameters to computation interfaces
proc1v Oct 22, 2023
c298054
Updated user_iterfaces for postprocessing
proc1v Oct 23, 2023
3672df5
Added CreditCardDefault data loader
proc1v Nov 6, 2023
09155c0
Updated requirements
proc1v Nov 6, 2023
c6006b2
Fixed with_predict_proba parameter
proc1v Nov 6, 2023
eea74b4
Fixed numpy version
proc1v Nov 8, 2023
4383688
Added garbage collector
proc1v Nov 12, 2023
1ffc39d
Added Label_Stability_Difference
denysgerasymuk799 Nov 20, 2023
085b1c9
Merge pull request #66 from DataResponsibly/feature/add_label_stabili…
denysgerasymuk799 Nov 20, 2023
3b2a222
Added Label_Stability_Difference
denysgerasymuk799 Nov 20, 2023
c1b9f6a
Merge pull request #67 from DataResponsibly/feature/add_postprocessin…
denysgerasymuk799 Nov 21, 2023
7db1b9b
Resolved merge conflict
denysgerasymuk799 Nov 27, 2023
620f36f
Added model performance summary
denysgerasymuk799 Nov 28, 2023
7d2e557
Added Positive-Rate to a model performance summary plot
denysgerasymuk799 Nov 29, 2023
a773815
Improved dataset stats plot
denysgerasymuk799 Nov 29, 2023
1d5ad3a
Added overall and disparity constraints to a model selection bar chart
denysgerasymuk799 Nov 29, 2023
2a6e71e
Added uncertainty disparity bar charts
denysgerasymuk799 Nov 29, 2023
ae533db
Set red-green color palette
denysgerasymuk799 Nov 30, 2023
c185456
Add labels
dmytro-omelian Dec 3, 2023
f008441
Merge pull request #70 from Dichik/add_labels_to_conf_mat
denysgerasymuk799 Dec 3, 2023
c4c0c8c
Added saving of eqodss fitted params
proc1v Dec 6, 2023
466d81a
dubug
proc1v Dec 6, 2023
c1feaf9
Added test metrics for ACS Public Coverage
denysgerasymuk799 Dec 7, 2023
612b025
Save current version of tolerance
denysgerasymuk799 Dec 10, 2023
8259f79
Added dynamic tolerance
denysgerasymuk799 Dec 10, 2023
87318fd
Added tests for tolerance
denysgerasymuk799 Dec 10, 2023
52ea843
Added tests for tolerance
denysgerasymuk799 Dec 10, 2023
1d2cc3c
wip
denysgerasymuk799 Dec 17, 2023
06c60fc
Added error handling for a dataset stats screen
denysgerasymuk799 Dec 17, 2023
f06cc9f
Added all error handling
denysgerasymuk799 Dec 18, 2023
b3c88f2
wip
denysgerasymuk799 Dec 18, 2023
9ec7ef1
Merge branch 'development' into feature/add_postprocessing_mode
denysgerasymuk799 Dec 18, 2023
f409f9b
Merge pull request #71 from DataResponsibly/feature/add_postprocessin…
denysgerasymuk799 Dec 18, 2023
44cb335
Merge pull request #72 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 18, 2023
901e1a6
Fixed tests
denysgerasymuk799 Dec 18, 2023
f60bfde
Merge pull request #73 from DataResponsibly/feature/add_postprocessin…
denysgerasymuk799 Dec 18, 2023
6bd58d5
wip
denysgerasymuk799 Dec 18, 2023
d5893af
Cleaned unnecessary files
denysgerasymuk799 Dec 19, 2023
c9d1d20
Removed datapane
denysgerasymuk799 Dec 19, 2023
a48e489
Merge pull request #74 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 19, 2023
fe7fbf4
Merge pull request #75 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 19, 2023
8fcd8f6
Added gradio to dependencies
denysgerasymuk799 Dec 19, 2023
42bc639
Merge pull request #76 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 19, 2023
ed7ab57
Merge branch 'virny_demo' into feature/add_visualization_component
denysgerasymuk799 Dec 19, 2023
933100c
wip1
denysgerasymuk799 Dec 19, 2023
986438d
Resolved merge conflicts
denysgerasymuk799 Dec 19, 2023
d8f64d0
Resolved merge conflicts
denysgerasymuk799 Dec 19, 2023
6879532
Resolved merge conflicts
denysgerasymuk799 Dec 19, 2023
1b67372
Merge pull request #78 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 19, 2023
f4d0f31
Added auto-creation of model_composed_metrics_df
denysgerasymuk799 Dec 19, 2023
dd72b3d
Simplified input for MetricsInteractiveVisualizer
denysgerasymuk799 Dec 19, 2023
75db0ea
Merge pull request #80 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 19, 2023
4bfc40f
Removed unnecessary dependencies
denysgerasymuk799 Dec 19, 2023
4a5a0a5
Changed structured user interfaces
denysgerasymuk799 Dec 19, 2023
d784771
wip
denysgerasymuk799 Dec 20, 2023
2a7c1df
Added postprocessing in compute_metrics_with_config()
denysgerasymuk799 Dec 20, 2023
ac25882
Added postprocessing in compute_metrics_with_config()
denysgerasymuk799 Dec 20, 2023
b925dab
Added a notebook_logs_stdout argument for all interfaces
denysgerasymuk799 Dec 20, 2023
11dcd2b
wip
denysgerasymuk799 Dec 20, 2023
0ede2ab
Added a notebook_logs_stdout argument for all interfaces
denysgerasymuk799 Dec 20, 2023
330a412
Added a notebook_logs_stdout argument for all interfaces
denysgerasymuk799 Dec 20, 2023
ed55c1f
Completed a use case for a postprocessor
denysgerasymuk799 Dec 21, 2023
e46ccf1
Fixed visualizations
denysgerasymuk799 Dec 21, 2023
a4e904a
Fixed visualizations
denysgerasymuk799 Dec 21, 2023
953fbaf
Updated all visualizations in use case notebooks
denysgerasymuk799 Dec 21, 2023
2ef8802
Merge pull request #82 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 21, 2023
68e2304
Improved models tuning
denysgerasymuk799 Dec 21, 2023
f128858
Merge pull request #84 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 21, 2023
a473faa
Improved documentation
denysgerasymuk799 Dec 21, 2023
73bc45c
Added started_app argument in the interactive visualizer
denysgerasymuk799 Dec 21, 2023
a96873d
Merge pull request #87 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 21, 2023
c37559f
Added figsize_scale
denysgerasymuk799 Dec 21, 2023
2fa1aea
Added figsize_scale
denysgerasymuk799 Dec 21, 2023
4867249
Added figsize_scale
denysgerasymuk799 Dec 21, 2023
4e867e2
Added font_increase
denysgerasymuk799 Dec 21, 2023
e204f79
Added figsize_scale
denysgerasymuk799 Dec 21, 2023
9e0b2d5
Added figsize_scale
denysgerasymuk799 Dec 21, 2023
421a449
Merge pull request #89 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 21, 2023
3f6c4a2
Improved visualizations
denysgerasymuk799 Dec 21, 2023
1492ee0
Merge pull request #90 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 21, 2023
f38e37a
Aligned namings for dimensions in model performance summary
denysgerasymuk799 Dec 22, 2023
e06a769
Merge pull request #92 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 22, 2023
698c62c
Aligned namings for dimensions in model performance summary
denysgerasymuk799 Dec 22, 2023
10ceb14
Merge pull request #94 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Dec 22, 2023
d20a68b
Added interactive web app demonstration
denysgerasymuk799 Dec 25, 2023
732f4b6
Updated README
denysgerasymuk799 Dec 25, 2023
1a5fff1
Updated README
denysgerasymuk799 Dec 25, 2023
fb4e2fd
Updated welcome page
denysgerasymuk799 Dec 25, 2023
2c0340f
Fixed tests
denysgerasymuk799 Dec 25, 2023
8bc4c8f
Added documentation for analyzers
denysgerasymuk799 Dec 25, 2023
1e92435
Added documentation for custom classes
denysgerasymuk799 Dec 26, 2023
f4ea2d1
Added documentation for datasets
denysgerasymuk799 Dec 26, 2023
1f908de
Added docstrings for all packages
denysgerasymuk799 Dec 26, 2023
6f00936
Added docstrings for all packages
denysgerasymuk799 Dec 26, 2023
21ceb1e
Added docstrings for all packages
denysgerasymuk799 Dec 26, 2023
b2f03f4
Added docstrings for all packages
denysgerasymuk799 Dec 26, 2023
0d20563
Added docstrings for all packages
denysgerasymuk799 Dec 26, 2023
0e2e8b0
Added docstrings for all packages
denysgerasymuk799 Dec 26, 2023
1651556
Added release notes
denysgerasymuk799 Dec 26, 2023
ef74b15
Updated documentation in example notebooks
denysgerasymuk799 Dec 26, 2023
76abe2d
Updated a use case notebook for postprocessor
denysgerasymuk799 Dec 26, 2023
e5c6abe
Updated all documentation
denysgerasymuk799 Dec 26, 2023
2bcdab8
Added StudentPerformancePortugueseDataset
denysgerasymuk799 Dec 26, 2023
c7f7d95
Added StudentPerformancePortugueseDataset
denysgerasymuk799 Dec 26, 2023
0f6b975
Added StudentPerformancePortugueseDataset
denysgerasymuk799 Dec 26, 2023
c026196
Added tests for Epistemic Uncertainty
denysgerasymuk799 Dec 30, 2023
f3e8616
Merge pull request #96 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Jan 3, 2024
b7052cd
Simplified postprocessor usage
denysgerasymuk799 Jan 3, 2024
7dc6199
Added a debug print
denysgerasymuk799 Jan 4, 2024
f64e102
Removed setting seed to None by default
denysgerasymuk799 Jan 4, 2024
4aef681
Added a fair inprocessing wrapper
denysgerasymuk799 Jan 6, 2024
929182f
Updated seaborn version to 0.13.1
denysgerasymuk799 Jan 7, 2024
705bcae
Implemented copy methods in FairInprocessingWrapper
denysgerasymuk799 Jan 7, 2024
af25296
Implemented copy methods in FairInprocessingWrapper
denysgerasymuk799 Jan 7, 2024
0df9627
Removed fair inprocessing wrapper
denysgerasymuk799 Jan 8, 2024
da7bf5a
Merge pull request #97 from DataResponsibly/feature/prepare_experimen…
denysgerasymuk799 Jan 17, 2024
afd1cc9
Reverted seaborn version
denysgerasymuk799 Jan 17, 2024
ad67d56
Set selection rate as a default metric for Representation dimension
denysgerasymuk799 Jan 17, 2024
1003d11
Merge pull request #99 from DataResponsibly/feature/add_visualization…
denysgerasymuk799 Jan 17, 2024
42d9c7a
Added a tutorial for inprocessor usage
denysgerasymuk799 Jan 17, 2024
a4af765
Added base inprocessing wrapper
denysgerasymuk799 Jan 18, 2024
3e97130
Merge branch 'development' into feature/prepare_for_uncertainty_exper…
denysgerasymuk799 Jan 21, 2024
6c11a5c
Merge pull request #101 from DataResponsibly/feature/prepare_for_unce…
denysgerasymuk799 Jan 21, 2024
ffa584d
Checked dependencies
denysgerasymuk799 Jan 21, 2024
ddf756d
Updated all parity metrics to difference metrics
denysgerasymuk799 Jan 29, 2024
bff6ad5
Updated tests for difference metrics
denysgerasymuk799 Jan 29, 2024
5b78cea
Updated documentation for difference metrics
denysgerasymuk799 Jan 29, 2024
0e3eda4
Updated use case notebooks for difference metrics
denysgerasymuk799 Jan 29, 2024
100e607
Checked 3 out of 6 interactive notebooks
denysgerasymuk799 Jan 29, 2024
40d73c1
Checked all interactive tests
denysgerasymuk799 Jan 29, 2024
573af67
Updated README and release notes
denysgerasymuk799 Jan 29, 2024
374aa4c
Updated README and release notes
denysgerasymuk799 Jan 29, 2024
1d90ade
Merge branch 'development' into feature/prepare_for_uncertainty_exper…
denysgerasymuk799 Jan 29, 2024
7b2806b
Merge pull request #102 from DataResponsibly/feature/prepare_for_unce…
denysgerasymuk799 Jan 29, 2024
4b9fd31
Finalized virny demo
denysgerasymuk799 Jan 29, 2024
b7188da
Merge pull request #104 from DataResponsibly/feature/prepare_for_unce…
denysgerasymuk799 Jan 29, 2024
83c2216
Updated README
denysgerasymuk799 Jan 29, 2024
4d5bf86
Updated README
denysgerasymuk799 Jan 29, 2024
d605e6b
Updated documentation
denysgerasymuk799 Jan 29, 2024
5190574
Release v0.4.0
denysgerasymuk799 Jan 29, 2024
2 changes: 1 addition & 1 deletion .github/workflows/unit-tests.yml
@@ -23,5 +23,5 @@ jobs:
- name: pytest [Branch]
run: |
source ~/.venv/bin/activate
pip install requests-toolbelt==1.0.0
pip install xgboost~=1.7.2
pytest --durations=10 -n logical # Run pytest on all logical CPU cores
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
*_venv
virny_env
notebooks
*.env
.DS_Store
45 changes: 24 additions & 21 deletions README.md
@@ -28,28 +28,29 @@
</p>



## 📜 Description

**Virny** is a Python library for auditing model stability and fairness. The Virny library was
developed based on three fundamental principles:
**Virny** is a Python library for in-depth profiling of model performance across overall and disparity dimensions.
In addition to its metric computation capabilities, the library provides an interactive tool called _VirnyView_
to streamline responsible model selection and generate nutritional labels for ML models.
The Virny library was developed based on three fundamental principles:

1) easy extensibility of model analysis capabilities;

2) compatibility to user-defined/custom datasets and model types;

3) simple composition of parity metrics based on context of use.
3) simple composition of disparity metrics based on the context of use.

Virny decouples model auditing into several stages, including: **subgroup metrics computation**, **group metrics composition**,
and **metrics visualization and reporting**. This gives data scientists and practitioners more control and flexibility
to use the library for model development and monitoring post-deployment.
Virny decouples model auditing into several stages, including: **subgroup metric computation**, **disparity metric composition**,
and **metric visualization**. This gives data scientists more control and flexibility to use the library
for model development and monitoring post-deployment.

For quickstart, look at our [Use Case Examples](https://dataresponsibly.github.io/Virny/examples/Multiple_Models_Interface_Use_Case/).
For quickstart, look at [use case examples](https://dataresponsibly.github.io/Virny/examples/Multiple_Models_Interface_Use_Case/), [an interactive demo](https://huggingface.co/spaces/denys-herasymuk/virny-demo), and [a demonstrative Jupyter notebook](https://huggingface.co/spaces/denys-herasymuk/virny-demo/blob/main/notebooks/ACS_Income_Demo.ipynb).


## 🛠 Installation

Virny supports **Python 3.8 (recommended), 3.9** and can be installed with `pip`:
Virny supports **Python 3.8 and 3.9** and can be installed with `pip`:

```bash
pip install virny
```

@@ -61,29 +62,31 @@ pip install virny
* [Introduction](https://dataresponsibly.github.io/Virny/)
* [API Reference](https://dataresponsibly.github.io/Virny/api/overview/)
* [Use Case Examples](https://dataresponsibly.github.io/Virny/examples/Multiple_Models_Interface_Use_Case/)
* [Interactive Demo](https://huggingface.co/spaces/denys-herasymuk/virny-demo)


## 💡 Features

* Entire pipeline for auditing model stability and fairness
* Metrics reports and visualizations
* Ability to analyze intersections of sensitive attributes
* Entire pipeline for profiling model accuracy, stability, uncertainty, and fairness
* Ability to analyze non-binary sensitive attributes and their intersections
* Compatibility with [pre-, in-, and post-processors](https://aif360.readthedocs.io/en/latest/modules/algorithms.html#) for fairness enhancement from AIF360
* Convenient metric computation interfaces: an interface for multiple models, an interface for multiple test sets, and an interface for saving results into a user-defined database
* An `error_analysis` computation mode to analyze model stability and confidence for correct and incorrect prodictions splitted by groups
* Data loaders with subsampling for fairness datasets
An `error_analysis` computation mode to analyze model stability and confidence for correct and incorrect predictions broken down by groups
Static and interactive metric visualizations
* Data loaders with subsampling for popular fair-ML benchmark datasets
* User-friendly parameters input via config yaml files
* Check out [our documentation](https://dataresponsibly.github.io/Virny/) for a comprehensive overview


## 📖 Library Terminology
## 📖 Library Overview

This section briefly explains the main terminology used in our library.
![Virny_Architecture](https://github.com/DataResponsibly/Virny/assets/42843889/91620e0f-11ff-4093-8fb6-c88c90bff711)

* A **sensitive attribute** is an attribute that partitions the population into groups with unequal benefits received.
* A **protected group** (or simply _group_) is created by partitioning the population by one or many sensitive attributes.
* A **privileged value** of a sensitive attribute is a value that gives more benefit to a protected group, which includes it, than to protected groups, which do not include it.
* A **subgroup** is created by splitting a protected group by privileges and disprivileged values.
* A **group metric** is a metric that shows the relation between privileged and disprivileged subgroups created based on one or many sensitive attributes.
The software framework decouples the process of model profiling into several stages, including **subgroup metric computation**,
**disparity metric composition**, and **metric visualization**. This separation empowers data scientists with greater control and
flexibility in employing the library, both during model development and for post-deployment monitoring. The above figure demonstrates
how the library constructs a pipeline for model analysis. Inputs to a user interface are shown in green, pipeline stages are shown in blue,
and the output of each stage is shown in purple.


## 🤗 Affiliations
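The updated "Library Overview" section above describes partitioning a population by sensitive attributes into privileged and disadvantaged subgroups, including intersections. As a hedged illustration of that idea — the dictionary shape merely mirrors the `sensitive_attributes_dct` input described in the docs, and every name here is a hypothetical sketch, not Virny's actual API:

```python
# Illustrative sketch of partitioning a population into privileged/disadvantaged
# subgroups from a sensitive-attributes dictionary ({attribute: privileged value}).
# This is a conceptual sketch, not Virny's implementation.

rows = [
    {"sex": "M", "race": "White"},
    {"sex": "F", "race": "White"},
    {"sex": "M", "race": "Black"},
    {"sex": "F", "race": "Black"},
]
sensitive_attributes_dct = {"sex": "M", "race": "White"}

def subgroup_indexes(rows, attrs_dct):
    """Return row indexes for privileged/disadvantaged subgroups per attribute."""
    subgroups = {}
    for attr, priv_value in attrs_dct.items():
        subgroups[f"{attr}_priv"] = [i for i, r in enumerate(rows) if r[attr] == priv_value]
        subgroups[f"{attr}_dis"] = [i for i, r in enumerate(rows) if r[attr] != priv_value]
    # Intersectional subgroup: privileged on every sensitive attribute.
    subgroups["intersection_priv"] = [
        i for i, r in enumerate(rows) if all(r[a] == v for a, v in attrs_dct.items())
    ]
    return subgroups

print(subgroup_indexes(rows, sensitive_attributes_dct))
```

Metrics computed over such per-subgroup index sets can then be composed into disparity metrics, matching the pipeline stages named in the figure caption (subgroup metric computation, disparity metric composition, metric visualization).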
15 changes: 9 additions & 6 deletions docs/api/analyzers/AbstractOverallVarianceAnalyzer.md
@@ -42,6 +42,14 @@ Abstract class for an analyzer that computes overall variance metrics for subgroups.

Number of estimators in ensemble to measure base_model stability

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag indicating whether the model can return probabilities for its predictions. If not, only metrics based on labels (not on labels and probabilities) will be computed.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True if this interface was executed in a Jupyter notebook, False otherwise.

- **verbose** (*int*) – defaults to `0`

[Optional] Level of log printing; a greater level produces more logs. Currently, levels 0, 1, and 2 are supported.
@@ -65,17 +73,12 @@

???- note "compute_metrics"

Measure metrics for the base model. Display plots for analysis if needed. Save results to a .pkl file
Measure metrics for the base model. Save results to a .csv file.

**Parameters**

- **make_plots** (*bool*) – defaults to `False`
- **save_results** (*bool*) – defaults to `True`
- **with_fit** (*bool*) – defaults to `True`

???- note "get_metrics_dict"

???- note "print_metrics"

???- note "save_metrics_to_file"

1 change: 1 addition & 0 deletions docs/api/analyzers/AbstractSubgroupAnalyzer.md
@@ -40,6 +40,7 @@ Abstract class for a subgroup analyzer to compute metrics for subgroups.
**Parameters**

- **y_preds**
- **models_predictions** (*dict*)
- **save_results** (*bool*)
- **result_filename** (*str*) – defaults to `None`
- **save_dir_path** (*str*) – defaults to `None`
17 changes: 10 additions & 7 deletions docs/api/analyzers/BatchOverallVarianceAnalyzer.md
@@ -12,7 +12,7 @@ Analyzer to compute subgroup variance metrics for batch learning models.

- **base_model_name** (*str*)

Model name like 'HoeffdingTreeClassifier' or 'LogisticRegression'
Model name like 'DecisionTreeClassifier' or 'LogisticRegression'

- **bootstrap_fraction** (*float*)

@@ -46,6 +46,14 @@

Number of estimators in ensemble to measure base_model stability

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag indicating whether the model can return probabilities for its predictions. If not, only metrics based on labels (not on labels and probabilities) will be computed.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True if this interface was executed in a Jupyter notebook, False otherwise.

- **verbose** (*int*) – defaults to `0`

[Optional] Level of log printing; a greater level produces more logs. Currently, levels 0, 1, and 2 are supported.
@@ -69,17 +77,12 @@

???- note "compute_metrics"

Measure metrics for the base model. Display plots for analysis if needed. Save results to a .pkl file
Measure metrics for the base model. Save results to a .csv file.

**Parameters**

- **make_plots** (*bool*) – defaults to `False`
- **save_results** (*bool*) – defaults to `True`
- **with_fit** (*bool*) – defaults to `True`

???- note "get_metrics_dict"

???- note "print_metrics"

???- note "save_metrics_to_file"

96 changes: 96 additions & 0 deletions docs/api/analyzers/BatchOverallVarianceAnalyzerPostProcessing.md
@@ -0,0 +1,96 @@
# BatchOverallVarianceAnalyzerPostProcessing

Analyzer to compute subgroup variance metrics using the defined post-processor.



## Parameters

- **postprocessor**

One of postprocessors from aif360 (https://aif360.readthedocs.io/en/stable/modules/algorithms.html#module-aif360.algorithms.postprocessing)

- **sensitive_attribute** (*str*)

A sensitive attribute to use for post-processing

- **base_model**

Base model for stability measuring

- **base_model_name** (*str*)

Model name like 'DecisionTreeClassifier' or 'LogisticRegression'

- **bootstrap_fraction** (*float*)

[0-1], fraction from train_pd_dataset for fitting an ensemble of base models

- **X_train** (*pandas.core.frame.DataFrame*)

Processed features train set

- **y_train** (*pandas.core.frame.DataFrame*)

Targets train set

- **X_test** (*pandas.core.frame.DataFrame*)

Processed features test set

- **y_test** (*pandas.core.frame.DataFrame*)

Targets test set

- **target_column** (*str*)

Name of the target column

- **dataset_name** (*str*)

Name of dataset, used for correct results naming

- **n_estimators** (*int*)

Number of estimators in ensemble to measure base_model stability

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag indicating whether the model can return probabilities for its predictions. If not, only metrics based on labels (not on labels and probabilities) will be computed.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True if this interface was executed in a Jupyter notebook, False otherwise.

- **verbose** (*int*) – defaults to `0`

[Optional] Level of log printing; a greater level produces more logs. Currently, levels 0, 1, and 2 are supported.




## Methods

???- note "UQ_by_boostrap"

Quantifying uncertainty of the base model by constructing an ensemble from bootstrapped samples and applying a post-processing intervention.

Return a dictionary where keys are models indexes, and values are lists of correspondent model predictions for X_test set.

**Parameters**

- **boostrap_size** (*int*)
- **with_replacement** (*bool*)
- **with_fit** (*bool*) – defaults to `True`

???- note "compute_metrics"

Measure metrics for the base model. Save results to a .csv file.

**Parameters**

- **save_results** (*bool*) – defaults to `True`
- **with_fit** (*bool*) – defaults to `True`

???- note "save_metrics_to_file"
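The `UQ_by_boostrap` method documented above quantifies uncertainty by fitting an ensemble on bootstrapped samples and inspecting how much predictions vary. A minimal conceptual sketch of that bootstrap idea, assuming a trivial majority-class "model" — all names and logic here are illustrative, not Virny's implementation:

```python
import random

# Conceptual sketch of bootstrap-based uncertainty quantification:
# fit an ensemble on bootstrapped samples, then measure how strongly
# the ensemble members agree (a label-stability-style metric).

def fit_majority_classifier(labels):
    # A trivial "model": predict the majority class of its training sample.
    return 1 if sum(labels) * 2 >= len(labels) else 0

def uq_by_bootstrap(train_labels, n_estimators, bootstrap_size, seed=42):
    """Return {estimator_index: prediction}, one per bootstrapped model."""
    rng = random.Random(seed)
    predictions = {}
    for idx in range(n_estimators):
        sample = [rng.choice(train_labels) for _ in range(bootstrap_size)]
        predictions[idx] = fit_majority_classifier(sample)
    return predictions

def label_stability(preds):
    # Fraction of the ensemble agreeing with the majority vote.
    votes = list(preds.values())
    majority = max(set(votes), key=votes.count)
    return votes.count(majority) / len(votes)

preds = uq_by_bootstrap([0, 0, 1, 1, 1], n_estimators=50, bootstrap_size=5)
print(label_stability(preds))
```

In the post-processing variant described above, a fairness intervention would additionally be applied to each ensemble member's predictions before stability is measured.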

1 change: 1 addition & 0 deletions docs/api/analyzers/SubgroupErrorAnalyzer.md
@@ -40,6 +40,7 @@ Analyzer to compute error metrics for subgroups.
**Parameters**

- **y_preds**
- **models_predictions** (*dict*)
- **save_results** (*bool*)
- **result_filename** (*str*) – defaults to `None`
- **save_dir_path** (*str*) – defaults to `None`
15 changes: 13 additions & 2 deletions docs/api/analyzers/SubgroupVarianceAnalyzer.md
@@ -6,7 +6,7 @@ Analyzer to compute variance metrics for subgroups.

## Parameters

- **model_setting** (*virny.configs.constants.ModelSetting*)
- **model_setting** (*[metrics.ModelSetting](../../metrics/ModelSetting)*)

Model learning type; a constant from virny.configs.constants.ModelSetting

@@ -42,10 +42,22 @@

A dictionary of protected groups where keys are subgroup names, and values are X_test row indexes correspondent to this subgroup.

- **postprocessor** – defaults to `None`

One of postprocessors from aif360 (https://aif360.readthedocs.io/en/stable/modules/algorithms.html#module-aif360.algorithms.postprocessing)

- **postprocessing_sensitive_attribute** (*str*) – defaults to `None`

A sensitive attribute to use for post-processing

- **computation_mode** (*str*) – defaults to `None`

[Optional] A non-default mode for metrics computation. Should be included in the ComputationMode enum.

- **notebook_logs_stdout** (*bool*) – defaults to `False`

[Optional] True if this interface was executed in a Jupyter notebook, False otherwise.

- **verbose** (*int*) – defaults to `0`

[Optional] Level of log printing; a greater level produces more logs. Currently, levels 0, 1, and 2 are supported.
@@ -66,7 +78,6 @@ Analyzer to compute variance metrics for subgroups.
- **save_results** (*bool*)
- **result_filename** (*str*) – defaults to `None`
- **save_dir_path** (*str*) – defaults to `None`
- **make_plots** (*bool*) – defaults to `True`
- **with_fit** (*bool*) – defaults to `True`

???- note "set_test_protected_groups"
5 changes: 5 additions & 0 deletions docs/api/analyzers/SubgroupVarianceCalculator.md
@@ -26,6 +26,10 @@ Calculator that calculates variance metrics for subgroups.

[Optional] A non-default mode for metrics computation. Should be included in the ComputationMode enum.

- **with_predict_proba** (*bool*) – defaults to `True`

[Optional] A flag indicating whether the model can return probabilities for its predictions. If not, only metrics based on labels (not on labels and probabilities) will be computed.




@@ -39,6 +43,7 @@ Calculator that calculates variance metrics for subgroups.

**Parameters**

- **y_preds**
- **models_predictions** (*dict*)
- **save_results** (*bool*)
- **result_filename** (*str*) – defaults to `None`
4 changes: 2 additions & 2 deletions docs/api/custom-classes/MetricsComposer.md
@@ -1,8 +1,8 @@
# MetricsComposer

Composer class that combines different subgroup metrics to create group metrics such as 'Disparate_Impact' or 'Accuracy_Parity'

Metric Composer class that combines different subgroup metrics to create disparity metrics such as 'Disparate_Impact' or 'Accuracy_Difference'.

Definitions of the disparity metrics could be observed in the __init__ method of the Metric Composer: https://github.com/DataResponsibly/Virny/blob/main/virny/custom_classes/metrics_composer.py

## Parameters

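The updated docstring above says the Metric Composer combines per-subgroup metrics into disparity metrics such as 'Accuracy_Difference' or 'Disparate_Impact'. A hedged sketch of that composition step, assuming illustrative metric names and toy values (not Virny's actual data structures):

```python
# Sketch of disparity-metric composition: a difference metric subtracts the
# privileged subgroup's value from the disadvantaged one's, while
# Disparate_Impact is conventionally a ratio of positive rates.
# All names and numbers here are illustrative.

subgroup_metrics = {
    "Accuracy":      {"sex_priv": 0.90, "sex_dis": 0.84},
    "Positive-Rate": {"sex_priv": 0.50, "sex_dis": 0.40},
}

def accuracy_difference(metrics, attr):
    return metrics["Accuracy"][f"{attr}_dis"] - metrics["Accuracy"][f"{attr}_priv"]

def disparate_impact(metrics, attr):
    return metrics["Positive-Rate"][f"{attr}_dis"] / metrics["Positive-Rate"][f"{attr}_priv"]

print(accuracy_difference(subgroup_metrics, "sex"))  # roughly -0.06
print(disparate_impact(subgroup_metrics, "sex"))     # roughly 0.8
```

This also illustrates why the PR renames "parity" metrics to "difference" metrics: the composed value is literally a subgroup difference (or ratio), not a boolean parity check.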
37 changes: 37 additions & 0 deletions docs/api/custom-classes/MetricsInteractiveVisualizer.md
@@ -0,0 +1,37 @@
# MetricsInteractiveVisualizer

Class to create an interactive web app based on models metrics.



## Parameters

- **X_data** (*pandas.core.frame.DataFrame*)

An original features dataframe

- **y_data** (*pandas.core.frame.DataFrame*)

An original target column pandas series

- **model_metrics**

A dictionary or a dataframe where keys are model names and values are dataframes of subgroup metrics for each model

- **sensitive_attributes_dct** (*dict*)

A dictionary where keys are sensitive attributes names (including attributes intersections), and values are privilege values for these attributes




## Methods

???- note "create_web_app"

Build an interactive web application.

**Parameters**

- **start_app** – defaults to `True`
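
The interactive visualizer documented above is meant to streamline responsible model selection; conceptually, that means filtering candidate models by user-defined constraints on overall and disparity metrics. A hedged, self-contained sketch of that selection step — the model names, metric keys, and thresholds are hypothetical, not Virny's API:

```python
# Sketch of constraint-based model selection over per-model metrics:
# keep only models that satisfy both an overall constraint (minimum accuracy)
# and a disparity constraint (bounded accuracy difference between subgroups).

model_metrics = {
    "LogisticRegression":     {"Accuracy": 0.86, "Accuracy_Difference": -0.02},
    "DecisionTreeClassifier": {"Accuracy": 0.81, "Accuracy_Difference": -0.01},
    "RandomForestClassifier": {"Accuracy": 0.88, "Accuracy_Difference": -0.09},
}

def models_satisfying(metrics, min_accuracy, max_abs_acc_diff):
    """Return the sorted names of models meeting both constraints."""
    return sorted(
        name for name, m in metrics.items()
        if m["Accuracy"] >= min_accuracy
        and abs(m["Accuracy_Difference"]) <= max_abs_acc_diff
    )

print(models_satisfying(model_metrics, min_accuracy=0.85, max_abs_acc_diff=0.05))
# Here LogisticRegression passes both constraints; DecisionTreeClassifier fails
# the accuracy floor, and RandomForestClassifier fails the disparity bound.
```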
