Feature/smartphone example #167

Closed
wants to merge 8 commits into from
2 changes: 1 addition & 1 deletion VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
1.1.0
1.2.0
3 changes: 3 additions & 0 deletions docs/source/api_reference/api_reference.md
Expand Up @@ -57,5 +57,8 @@ raw specifications might not be sufficient to provide comprehensive instructions
xaiographs.datasets.load_education_performance
xaiographs.datasets.load_education_performance_discretized
xaiographs.datasets.load_education_performance_why
xaiographs.datasets.load_phone_brand_preferences
xaiographs.datasets.load_phone_brand_preferences_discretized
xaiographs.datasets.load_phone_brand_preferences_why
```

6 changes: 6 additions & 0 deletions docs/source/api_reference/datasets.md
Expand Up @@ -22,6 +22,9 @@
xaiographs.datasets.load_education_performance
xaiographs.datasets.load_education_performance_discretized
xaiographs.datasets.load_education_performance_why
xaiographs.datasets.load_phone_brand_preferences
xaiographs.datasets.load_phone_brand_preferences_discretized
xaiographs.datasets.load_phone_brand_preferences_why
```

```{eval-rst}
Expand All @@ -39,6 +42,9 @@
.. autofunction:: xaiographs.datasets.load_education_performance
.. autofunction:: xaiographs.datasets.load_education_performance_discretized
.. autofunction:: xaiographs.datasets.load_education_performance_why
.. autofunction:: xaiographs.datasets.load_phone_brand_preferences
.. autofunction:: xaiographs.datasets.load_phone_brand_preferences_discretized
.. autofunction:: xaiographs.datasets.load_phone_brand_preferences_why
```


2 changes: 2 additions & 0 deletions docs/source/contributors/contributors.md
Expand Up @@ -7,9 +7,11 @@ XAIoGraphs has been developed by ***Applied AI & Privacy*** team (Telefónica In
* [Enrique Fernandez](https://github.com/QuiqueFdez)
* [Alejandro Manuel Arranz](https://github.com/cx02747)
* [Manuel Martín](https://github.com/mmarmar)
* [Morganne De Witte](https://www.linkedin.com/in/morgannedw/)
* [Mario Villaizan](https://github.com/mvvmvv)
* [Cesar García](https://github.com/cesarggtid)
* [David Cadenas](https://github.com/davidcadi)
* [Alejandra Maria Alonso](https://www.linkedin.com/in/alejandraalonsodiaz/)
* [Miguel Angel Martín](https://github.com/mamj-telefonica)
* [Oriol Arnau](https://github.com/oarnau)
* [Morganne De Witte](https://github.com/MorganneDeWitte)
16 changes: 9 additions & 7 deletions docs/source/examples/examples.md
Expand Up @@ -5,13 +5,14 @@ XAIoGraphs contains a set of examples that can be executed as `entry points`:



| Example | Entry Point | Rows | Num. Feats | Task |
|:--------------------------------------------------|:------------------------------|:-----:|:----------:|:----------------:|
| [Titanic](titanic.md) | titanic_example | 1309 | 8 | Binary |
| [COMPAS](compas.md) | compas_example | 4230 | 7 | Multi-Class (3) |
| [COMPAS Reality](compas_reality.md) | compas_reality_example | 4230 | 7 | Binary |
| [Body Performace](body_performance.md) | body_performance_example | 13393 | 11 | Multi-Class (3) |
| [Education Performance](education_performance.md) | education_performance_example | 145 | 29 | Multi-Class (5) |
| Example | Entry Point | Rows | Num. Feats | Task |
|:----------------------------------------------------------------|:------------------------------|:-----:|:----------:|:----------------:|
| [Titanic](titanic.md) | titanic_example | 1309 | 8 | Binary |
| [COMPAS](compas.md) | compas_example | 4230 | 7 | Multi-Class (3) |
| [COMPAS Reality](compas_reality.md) | compas_reality_example | 4230 | 7 | Binary |
| [Body Performance](body_performance.md)                         | body_performance_example      | 13393 | 11         | Multi-Class (3)  |
| [Education Performance](education_performance.md) | education_performance_example | 145 | 29 | Multi-Class (5) |
| [Smartphone Brand Preferences](smartphone_brand_preferences.md) | smartphone_example | 259 | 11 | Multi-Class (5) |


Use the entry points to see an example run with the XAIoGraphs library installed in a Python virtual environment
Expand All @@ -36,3 +37,4 @@ You can see more information about each of these examples at the links below:
* [Compas Reality Example](compas_reality.md)
* [Body Performace Example](body_performance.md)
* [Education Performance Example](education_performance.md)
* [Smartphone Brand Preferences](smartphone_brand_preferences.md)
134 changes: 134 additions & 0 deletions docs/source/examples/smartphone_brand_preferences.md
@@ -0,0 +1,134 @@
[< ✏️ Examples](examples/examples)

# Smartphone Brand Preferences Example


This example highlights which smartphone features matter most when predicting the brand a buyer is most likely to purchase. The [`Smartphone Brand Preferences Dataset`](../user_guide/datasets.md#smartphone-brand-preferences) captures the characteristics of each smartphone and helps explain
why people choose to buy a particular brand.

This dataset can be obtained using the [`load_phone_brand_preferences()`](../api_reference/datasets.md#xaiographs.datasets.load_phone_brand_preferences)
function:

```python
>>> from xaiographs.datasets import load_phone_brand_preferences
>>> df_dataset = load_phone_brand_preferences()
>>> df_dataset.head(5)
brand internal_memory performance main_camera selfie_camera battery_size screen size weight price age gender occupation
0 Samsung 128 8.81 50 10 3700 6.1 167 528 38 Female Data analyst
1 Apple 256 7.94 12 12 3065 6.1 204 999 38 Female Data analyst
2 Google 128 6.76 50 8 4614 6.4 207 499 31 Female sales
3 Samsung 128 7.22 50 10 4500 6.6 195 899 31 Female sales
4 Google 128 6.88 12 8 4410 6.1 178 449 27 Female Team leader

```

To compute explainability on this dataset, XAIoGraphs provides a version that has already been discretized and that includes
columns with the target probabilities. It can be loaded with the
[`load_phone_brand_preferences_discretized()`](../api_reference/datasets.md#xaiographs.datasets.load_phone_brand_preferences_discretized) function:


```python
>>> from xaiographs.datasets import load_phone_brand_preferences_discretized
>>> df_dataset, features_cols, target_cols, y_true, y_predict = load_phone_brand_preferences_discretized()
>>> df_dataset.head(5)
id internal_memory performance main_camera selfie_camera battery_size screen_size weight price age gender occupation y_true y_predict Apple Google Motorola Samsung Xiaomi
0 0 128_GB Ultra top 15_50_MP <10_MP <4000_mAh <6.4_inches <190_g 450_700_dollars 35_45_years Female Technology Samsung Apple 1 0 0 0 0
1 1 >=256_GB Top <15_MP 10_30_MP <4000_mAh <6.4_inches 190_205_g >700_dollars 35_45_years Female Technology Apple Apple 1 0 0 0 0
2 2 128_GB Mid 15_50_MP <10_MP 4000_4700_mAh <6.4_inches >205_g 450_700_dollars 25_35_years Female Business Google Google 0 1 0 0 0
3 3 128_GB Mid 15_50_MP <10_MP 4000_4700_mAh 6.4_6.6_inches 190_205_g >700_dollars 25_35_years Female Business Samsung Samsung 0 0 0 1 0
4 4 128_GB Mid <15_MP <10_MP 4000_4700_mAh <6.4_inches <190_g 200_450_dollars 25_35_years Female Administration Google Google 0 1 0 0 0

```
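To give an intuition for how raw values map to the discretized labels shown above, here is a minimal sketch for the `battery_size` column. The bin edges and the `>=4700_mAh` label are assumptions inferred from the five sample rows, and `discretize` is a hypothetical helper, not part of XAIoGraphs; the shipped dataset may have been binned differently.

```python
from bisect import bisect_right

# Assumed bin edges and labels, inferred from the sample rows above.
# The '>=4700_mAh' label does not appear in the excerpt and is hypothetical.
BATTERY_EDGES = [4000, 4700]
BATTERY_LABELS = ['<4000_mAh', '4000_4700_mAh', '>=4700_mAh']


def discretize(value, edges, labels):
    """Return the label of the bin that `value` falls into.

    A value equal to an edge is assigned to the bin above that edge.
    """
    return labels[bisect_right(edges, value)]


# battery_size values from the first five raw rows shown earlier
raw_battery_sizes = [3700, 3065, 4614, 4500, 4410]
discretized = [discretize(v, BATTERY_EDGES, BATTERY_LABELS) for v in raw_battery_sizes]
print(discretized)
```

With these assumed edges, the five values land in the same bins as the `battery_size` column of the discretized rows above.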


&nbsp;
## Code Example

The following entry point (with Python virtual environment enabled) is used to demonstrate this example.

```python
>> smartphone_example
```

Alternatively, you may run the code below to view a full implementation of all XAIoGraphs functionalities with this Dataset:

```python
from xaiographs import Explainer
from xaiographs import Why
from xaiographs import Fairness
from xaiographs.datasets import load_phone_brand_preferences_discretized, load_phone_brand_preferences_why

LANG = 'en'

# LOAD DATASETS & SEMANTICS
df_phone_brand_pref, feature_cols, target_cols, y_true, y_predict = load_phone_brand_preferences_discretized()
df_values_semantics, df_target_values_semantics = load_phone_brand_preferences_why(language=LANG)

# EXPLAINER
explainer = Explainer(importance_engine='LIDE', verbose=1)
explainer.fit(df=df_phone_brand_pref, feature_cols=feature_cols, target_cols=target_cols)

# WHY
why = Why(language=LANG,
          explainer=explainer,
          why_values_semantics=df_values_semantics,
          why_target_values_semantics=df_target_values_semantics,
          verbose=1)
why.fit()

# FAIRNESS
f = Fairness(verbose=1)
f.fit(df=df_phone_brand_pref[feature_cols + [y_true] + [y_predict]],
      sensitive_cols=['gender', 'age'],
      target_col=y_true,
      predict_col=y_predict)
```
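As a rough intuition for what checking predictions against a sensitive column such as `gender` can look like, the sketch below compares how often one brand is predicted in each group. This is only an illustration of one common idea (comparing prediction rates across groups); it is not the `Fairness` class implementation, the rows are made up, and `prediction_rate_by_group` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows of (gender, y_predict). These values are invented for
# illustration and are not taken from the dataset.
rows = [
    ('Female', 'Apple'),
    ('Female', 'Samsung'),
    ('Female', 'Apple'),
    ('Male', 'Apple'),
    ('Male', 'Google'),
    ('Male', 'Samsung'),
    ('Male', 'Google'),
]


def prediction_rate_by_group(rows, target):
    """Fraction of rows in each group whose prediction equals `target`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, predicted in rows:
        totals[group] += 1
        if predicted == target:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


rates = prediction_rate_by_group(rows, target='Apple')
# A large gap between groups suggests the prediction depends on the sensitive column.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A gap close to zero would mean the brand is predicted at a similar rate for every group in these toy rows.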

&nbsp;
## XAIoWeb Smartphone Brand Preferences

After running the `.fit()` method of each class (one, two, or all three), a set of JSON files is
generated in the `xaioweb_files` folder so that the results can be visualized in the XAIoWeb interface.


To launch the web (with the virtual environment enabled), run the following entry point:

```python
>> xaioweb -d xaioweb_files -o -f
```

The results shown in XAIoWeb are the following:

&nbsp;
#### Global Explainability
&nbsp;
```{image} ../../imgs/smartphone_brand_preferences_example/XaioWeb_Global_Explainability.png
:alt: Global Explainability
:class: bg-primary
:width: 600px
:align: center
```

&nbsp;
#### Local Explainability
&nbsp;
```{image} ../../imgs/smartphone_brand_preferences_example/XaioWeb_Local_Explainability.png
:alt: Local Explainability
:class: bg-primary
:width: 600px
:align: center
```

&nbsp;
#### Fairness
&nbsp;
```{image} ../../imgs/smartphone_brand_preferences_example/XaioWeb_Fairness.png
:alt: Fairness
:class: bg-primary
:width: 600px
:align: center
```
&nbsp;


[< ✏️ Examples](examples/examples)
29 changes: 22 additions & 7 deletions docs/source/user_guide/datasets.md
Expand Up @@ -8,13 +8,14 @@ To test the capabilities of XAIoGraphs, it provides a series of datasets via

The following datasets are included:

| Datset | Rows | Num. Feats | Task |
|:------------------------------------------------|:-----:|:----------:|:---------------:|
| [Titanic](#titanic) | 1309 | 8 | Binary |
| [Compas](#compas) | 4230 | 7 | Multi-Class (3) |
| [Compas Reality](#compas) | 4230 | 7 | Binary |
| [Body Performace](#body-performance) | 13393 | 11 | Multi-Class (3) |
| [Education Performance](#education-performance) | 145 | 29 | Multi-Class (5) |
| Dataset                                                       | Rows  | Num. Feats | Task             |
|:--------------------------------------------------------------|:-----:|:----------:|:----------------:|
| [Titanic](#titanic)                                           | 1309  | 8          | Binary           |
| [COMPAS](#compas)                                             | 4230  | 7          | Multi-Class (3)  |
| [COMPAS Reality](#compas)                                     | 4230  | 7          | Binary           |
| [Body Performance](#body-performance)                         | 13393 | 11         | Multi-Class (3)  |
| [Education Performance](#education-performance)               | 145   | 29         | Multi-Class (5)  |
| [Smartphone Brand Preferences](#smartphone-brand-preferences) | 259   | 11         | Multi-Class (5)  |

These datasets are accessible in both raw and discretized form, ready for usage by the
[`Explainability`](../api_reference/explainability.md) and [`Fairness`](../api_reference/fairness.md) classes.
Expand Down Expand Up @@ -99,5 +100,19 @@ purpose is to predict students' end-of-term performances using ML techniques.
| **function to obtain dataset** | [`xaiographs.datasets.load_education_performance()`](../api_reference/datasets.md#xaiographs.datasets.load_education_performance) |
| **function to obtain discretized dataset** | [`xaiographs.datasets.load_education_performance_discretized()`](../api_reference/datasets.md#xaiographs.datasets.load_education_performance_discretized) |

&nbsp;
## Smartphone Brand Preferences

The data combines three datasets: the most noteworthy features of the preferred smartphones in the US in 2022, user data, and smartphone ratings. The information was obtained through a Mechanical Turk survey in which participants rated 10 randomly presented phones by likelihood of purchase and provided personal information. This example highlights which smartphone features matter most when predicting the brand a buyer is most likely to purchase.

| | |
|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Source** | [https://www.kaggle.com/datasets/meirnizri/cellphones-recommendations/data?select=cellphones+ratings.csv](https://www.kaggle.com/datasets/meirnizri/cellphones-recommendations/data?select=cellphones+ratings.csv) |
| **Num Rows:** | 259 |
| **Num Features** | 11 |
| **Num Targets:** | 5 |
| **function to obtain dataset**               | [`xaiographs.datasets.load_phone_brand_preferences()`](../api_reference/datasets.md#xaiographs.datasets.load_phone_brand_preferences)                         |
| **function to obtain discretized dataset**   | [`xaiographs.datasets.load_phone_brand_preferences_discretized()`](../api_reference/datasets.md#xaiographs.datasets.load_phone_brand_preferences_discretized) |


[< 📚 User Guide](user_guide/user_guide)
1 change: 1 addition & 0 deletions docs/source/user_guide/user_guide.md
Expand Up @@ -30,5 +30,6 @@
* [COMPAS](datasets.md#compas)
* [Body Performace](datasets.md#body-performace)
* [Education Performance](datasets.md#education-performance)
* [Smartphone Brand Preferences](datasets.md#smartphone-brand-preferences)

[//]: # (* [Compas]&#40;datasets.md#compas&#41;)
46 changes: 46 additions & 0 deletions examples/smartphone_brand_preferences_example.py
@@ -0,0 +1,46 @@
# -*- coding: utf-8 -*-

u"""
© 2023 Telefónica Digital España S.L.
This file is part of XAIoGraphs.

XAIoGraphs is free software: you can redistribute it and/or modify it under the terms of the Affero GNU General Public
License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any
later version.

XAIoGraphs is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Affero GNU General Public License
for more details.

You should have received a copy of the Affero GNU General Public License along with XAIoGraphs. If not,
see https://www.gnu.org/licenses/."""

from xaiographs import Explainer
from xaiographs import Why
from xaiographs import Fairness
from xaiographs.datasets import load_phone_brand_preferences_discretized, load_phone_brand_preferences_why

LANG = 'en'

# LOAD DATASETS & SEMANTICS
df_phone_brand_pref, feature_cols, target_cols, y_true, y_predict = load_phone_brand_preferences_discretized()
df_values_semantics, df_target_values_semantics = load_phone_brand_preferences_why(language=LANG)

# EXPLAINER
explainer = Explainer(importance_engine='LIDE', verbose=1)
explainer.fit(df=df_phone_brand_pref, feature_cols=feature_cols, target_cols=target_cols)

# WHY
why = Why(language=LANG,
          explainer=explainer,
          why_values_semantics=df_values_semantics,
          why_target_values_semantics=df_target_values_semantics,
          verbose=1)
why.fit()

# FAIRNESS
f = Fairness(verbose=1)
f.fit(df=df_phone_brand_pref[feature_cols + [y_true] + [y_predict]],
      sensitive_cols=['gender', 'age'],
      target_col=y_true,
      predict_col=y_predict)
8 changes: 4 additions & 4 deletions requirements.txt
@@ -1,5 +1,5 @@
numpy==1.21.6
pandas==1.3.5
scikit-learn==0.24.2
scipy==1.7.3
numpy>=1.19.5,<=1.23.1
pandas>=1.3.5,<2.0.0
scikit-learn>=0.24.2,<=1.3.2
scipy>=1.7.3
tqdm==4.64.1
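
Comma-separated clauses in a requirement line are combined with a logical AND, so a version range needs one lower bound and one upper bound. The sketch below illustrates this with a deliberately simplified checker; `parse` and `satisfies` are hypothetical helpers that assume purely numeric three-part versions, and they are not a substitute for pip's real specifier handling.

```python
import operator

# Only the comparison operators that appear in this requirements file.
# Real version specifiers cover more cases (pre-releases, wildcards, '~=', ...).
OPS = {'>=': operator.ge, '<=': operator.le, '==': operator.eq,
       '<': operator.lt, '>': operator.gt}


def parse(version):
    """Turn a purely numeric version such as '1.3.2' into (1, 3, 2)."""
    return tuple(int(part) for part in version.split('.'))


def satisfies(version, spec):
    """True if `version` meets every comma-separated clause in `spec`."""
    for clause in spec.split(','):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for symbol in sorted(OPS, key=len, reverse=True):
            if clause.startswith(symbol):
                if not OPS[symbol](parse(version), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError('unsupported clause: ' + clause)
    return True


print(satisfies('1.0.2', '>=0.24.2,<=1.3.2'))  # lower and upper bound
print(satisfies('1.0.2', '<=0.24.2,<=1.3.2'))  # two upper bounds
```

In this sketch, a spec with two upper bounds collapses to the stricter one, which is why `1.0.2` passes the first spec and fails the second.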
3 changes: 2 additions & 1 deletion setup.py
Expand Up @@ -36,7 +36,7 @@
author='Telefonica I+D',
author_email='ricardo.moyagarcia@telefonica.com',
license='AGPL-3.0 license',
python_requires='>=3.7',
python_requires='>=3.7,<3.10',
packages=['xaiographs'],
include_package_data=True,
install_requires=required,
Expand All @@ -50,6 +50,7 @@
"education_performance_example = xaiographs.examples.education_performance_example:main",
"compas_example = xaiographs.examples.compas_example:main",
"compas_reality_example = xaiographs.examples.compas_reality_example:main",
"smartphone_example = xaiographs.examples.smartphone_brand_preferences_example:main",
]
}
)
20 changes: 20 additions & 0 deletions xaiographs/__init__.py
Expand Up @@ -15,7 +15,27 @@
You should have received a copy of the Affero GNU General Public License along with XAIoGraphs. If not,
see https://www.gnu.org/licenses/."""

import sys
import numpy as np

from .exgraph.explainer import Explainer
from .fairness import Fairness
from .why.why import Why

# Check the Python version
if sys.version_info < (3, 7) or sys.version_info >= (3, 10):
    raise RuntimeError("Your Python {}.{}.{} version is incompatible with XAIoGraphs.\n"
                       "XAIoGraphs requires Python 3.7 or higher, but less than 3.10."
                       .format(sys.version_info.major, sys.version_info.minor, sys.version_info.micro))

# Check the numpy version
numpy_version = np.__version__

# Split the version into a tuple of integers (e.g., '1.20.3' -> (1, 20, 3))
numpy_version_tuple = tuple(map(int, numpy_version.split('.')))

# Check if the version is within the allowed range (>=1.19.5, <=1.23.1)
if not ((1, 19, 5) <= numpy_version_tuple <= (1, 23, 1)):
    raise ImportError("Your numpy version {} is incompatible with XAIoGraphs.\n"
                      "XAIoGraphs requires numpy version 1.19.5 or higher, up to and including 1.23.1."
                      .format(numpy_version))
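
The numpy check above converts every dot-separated component with `int`, which would raise `ValueError` for a version string with a suffix such as `1.23.0rc1`. One possible way to make the parsing more tolerant is sketched below; `version_tuple` is a hypothetical helper, not part of this change, and it simply ignores any pre-release or local suffix, so `1.23.0rc1` is treated like `1.23.0`.

```python
import re


def version_tuple(version, width=3):
    """Leading numeric release components, e.g. '1.23.0rc1' -> (1, 23, 0)."""
    match = re.match(r'\d+(?:\.\d+)*', version)
    if match is None:
        raise ValueError('no numeric release segment in {!r}'.format(version))
    # Keep at most `width` components and pad short versions with zeros.
    parts = [int(p) for p in match.group(0).split('.')][:width]
    parts += [0] * (width - len(parts))
    return tuple(parts)


for candidate in ('1.20.3', '1.23.0rc1', '1.22.0.dev0+1234', '1.21'):
    in_range = (1, 19, 5) <= version_tuple(candidate) <= (1, 23, 1)
    print(candidate, version_tuple(candidate), in_range)
```

Whether pre-releases should count as compatible is a design choice this sketch does not settle.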