Artificial intelligence systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human bias. Consequently, such systems may exhibit unintended demographic bias against specific characteristics (e.g., gender, occupation, country-of-origin, etc.). Such bias manifests in an SA system when it predicts different sentiments for similar texts that differ only in the characteristic of the individuals described. To automatically uncover bias in SA systems, this paper presents BiasFinder, an approach that can discover biased predictions in SA systems via metamorphic testing. A key feature of BiasFinder is the automatic curation of suitable templates from any given text inputs, using various Natural Language Processing (NLP) techniques to identify words that describe demographic characteristics. Next, BiasFinder generates new texts from these templates by mutating words associated with a class of a characteristic (e.g., gender-specific words such as female names, 'she', 'her'). These texts are then used to tease out bias in an SA system. BiasFinder identifies a bias-uncovering test case (BTC) when an SA system predicts different sentiments for texts that differ only in words associated with a different class (e.g., male vs. female) of a target characteristic (e.g., gender). We evaluate BiasFinder on 10 SA systems and 2 large-scale datasets, and the results show that BiasFinder can create more BTCs than two popular baselines. We also conduct an annotation study and find that human annotators consistently consider the test cases generated by BiasFinder to be more fluent than those generated by the two baselines.
For fine-tuning the SA models, we use the HuggingFace library, which provides many pre-trained language models, including BERT, RoBERTa, and XLNet.
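The actual fine-tuning is done by the scripts described below; for orientation only, here is a minimal sketch of loading one of these pre-trained models for binary sentiment classification with the transformers library. The model name and label count are illustrative assumptions, not the repository's configuration.

```python
# Minimal sketch (not the repository's fine-tuning code) of loading a pre-trained model
# for binary sentiment classification with HuggingFace transformers.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # could also be e.g. "roberta-base" or "xlnet-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize a review and get a sentiment prediction (only meaningful after fine-tuning).
inputs = tokenizer("This movie was surprisingly good!", return_tensors="pt", truncation=True)
logits = model(**inputs).logits
print("predicted class:", logits.argmax(dim=-1).item())
```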
For the NLP tasks, please install these libraries:
- spacy (the `en_core_web_lg` model is also needed)
- pandas
- numpy
- scikit-learn
- nltk
- neuralcoref
- fastNLP
For occupation bias, you also need StanfordCoreNLP and these libraries:
- inflect
- pycorenlp (see the Stack Overflow guide on serving StanfordCoreNLP as an API)
For preparing data from genderComputer, please install these libraries:
- python-nameparser
- unidecode
Tip: you may use Docker for a faster setup of your coding environment. https://hub.docker.com/r/pytorch/pytorch/tags provides several versions of PyTorch containers. Please pull the appropriate PyTorch container with the 1.9 tag, using this command:
docker pull pytorch/pytorch:1.9.0-cuda10.2-cudnn7-devel
dataset | description |
---|---|
asset/imdb/ | The IMDB movie review dataset proposed by Zhang et al. (2015), downloaded from Google Drive. |
asset/gender_associated_word/ | Pre-determined values for gender-associated words. |
asset/gender_computer/ | Contains the notebook asset/gender_computer/genderComputer/prepare_male_female_names.ipynb to prepare the names for the BiasFinder experiments. |
asset/predefined_occupation_list/neutral-occupation.csv | Pre-determined words for neutral occupations. |
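As a quick sanity check of the assets, you can inspect them with pandas. The sketch below assumes the neutral-occupation CSV can be read directly; its actual column layout may differ.

```python
# Sketch: inspect a bundled asset (adjust the path/columns to the actual file contents).
import pandas as pd

occupations = pd.read_csv("asset/predefined_occupation_list/neutral-occupation.csv")
print(occupations.head())
print(len(occupations), "neutral occupations loaded")
```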
Run these commands inside the codes/fine-tuning/ folder to fine-tune the SA models:
bash fine-tune-imdb.sh
bash fine-tune-twitter-s140.sh
Then check the test accuracy of the fine-tuned models
bash test-imdb.sh
bash test-twitter-s140.sh
Check the accuracy in codes/evaluation/Model-Performance.ipynb
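Conceptually, the accuracy check boils down to something like the sketch below, assuming the test scripts write a file with gold labels and model predictions; the path and column names here are hypothetical.

```python
# Sketch of the accuracy check (file path and column names are hypothetical;
# see codes/evaluation/Model-Performance.ipynb for the actual computation).
import pandas as pd
from sklearn.metrics import accuracy_score

preds = pd.read_csv("path/to/test-predictions.csv")  # hypothetical output of test-imdb.sh
print("Test accuracy:", accuracy_score(preds["label"], preds["prediction"]))
```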
Our framework, BiasFinder, can be instantiated to identify different kinds of bias. In this work, we show how BiasFinder can be instantiated to uncover bias in three different demographic characteristics: gender, occupation, and country-of-origin.
BiasFinder automatically identifies and curates suitable texts in a large corpus of reviews and transforms these texts into templates. Each template can be used to produce a large number of mutant texts by filling in placeholders with concrete values associated with a class (e.g., male vs. female) of a demographic characteristic (e.g., gender) (see Sections III and IV). Using these mutant texts, BiasFinder then runs the SA system under test, checking whether it predicts the same sentiment for two mutants associated with different classes (e.g., male vs. female) of the given characteristic (e.g., gender). Such a pair of mutants is related through a metamorphic relation: a fair SA system should predict the same sentiment for both (see Sections V and VI).
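To make the metamorphic relation concrete, the following is a hypothetical toy sketch (not the repository's code): a template with gendered placeholders is instantiated once per class, and the SA system under test should return the same sentiment for both mutants. Here, `predict_sentiment` is a deliberately biased stand-in so that the sketch prints a BTC when run.

```python
# Hypothetical sketch of BiasFinder's metamorphic relation (not the repository's code).
# A template is instantiated with values of two classes of a characteristic (here, gender);
# a fair SA system should predict the same sentiment for both mutants.

def predict_sentiment(text: str) -> str:
    """Toy stand-in for the SA system under test; replace with a fine-tuned model."""
    # Deliberately biased toy rule so that running the sketch produces a BTC.
    return "negative" if "Mary" in text else "positive"

template = "{name} said {pronoun} really enjoyed the movie."
mutants = {
    "male": template.format(name="John", pronoun="he"),
    "female": template.format(name="Mary", pronoun="she"),
}

predictions = {cls: predict_sentiment(text) for cls, text in mutants.items()}
if predictions["male"] != predictions["female"]:
    # The pair of mutants is a bias-uncovering test case (BTC).
    print("BTC found:", mutants, predictions)
```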
Run this command inside the codes/gender/ folder:
bash biasfinder-generate-mutant.sh
Some troubleshooting tips:
- If you face a problem with neuralcoref, please build the library from source instead of installing it with pip. Check here.
- If you get ModuleNotFoundError: No module named 'en_core_web_lg', run the following commands:

python -m spacy download en
python -m spacy download en_core_web_lg
This script generates mutant texts for gender and saves them inside the data/biasfinder/gender/ folder.
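For intuition only, here is a heavily simplified, hypothetical sketch of the gender mutation step: spaCy finds PERSON entities and a small map swaps gendered pronouns. The actual pipeline additionally relies on neuralcoref for coreference resolution and on the curated gender-associated word and name lists in asset/, so this sketch omits most of the real logic.

```python
# Hypothetical, simplified sketch of gender mutation (the real pipeline also uses
# neuralcoref and the curated gender-associated word lists in asset/).
import spacy

nlp = spacy.load("en_core_web_lg")

# Toy pronoun map for the sketch; the real word lists are much larger.
PRONOUN_MAP = {"he": "she", "him": "her", "his": "her", "himself": "herself"}

def mutate_to_female(text: str, female_name: str = "Mary") -> str:
    """Replace PERSON entities and male pronouns with female counterparts (simplified)."""
    doc = nlp(text)
    pieces = []
    for tok in doc:
        if tok.ent_type_ == "PERSON":
            pieces.append(female_name)  # multi-token names are not handled in this sketch
        elif tok.lower_ in PRONOUN_MAP:
            pieces.append(PRONOUN_MAP[tok.lower_])
        else:
            pieces.append(tok.text)
        pieces.append(tok.whitespace_)
    return "".join(pieces)

print(mutate_to_female("John loved the film; he watched it twice."))
```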
Run this command inside the codes/occupation/ folder:
python main.py
This script generates mutant texts for occupation and saves them inside the data/biasfinder/occupation/ folder. Important note: occupation bias detection needs StanfordCoreNLP to detect occupation terms in the text, so please make sure StanfordCoreNLP is served as an API (see the Stack Overflow guide on serving StanfordCoreNLP as an API).
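A quick way to verify that the CoreNLP server is reachable is to annotate a sentence through pycorenlp; the sketch below assumes the server listens on http://localhost:9000 (adjust the URL and port to your setup).

```python
# Sketch: verify the StanfordCoreNLP server is reachable (assumes it listens on localhost:9000).
from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP("http://localhost:9000")
output = nlp.annotate(
    "The nurse said she was tired.",
    properties={"annotators": "tokenize,ssplit,pos,ner", "outputFormat": "json"},
)
# Print the NER tag for each token of the first sentence.
for token in output["sentences"][0]["tokens"]:
    print(token["word"], token["ner"])
```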
Run this command inside the codes/country/ folder:
bash generate-country-mutant.sh
This script generates mutant texts for country-of-origin and saves them inside the data/biasfinder/country/ folder.
Run this command inside the codes/fine-tuning/ folder:
bash predict-imdb.sh
This script produces the predictions for the mutant texts.
Run this command inside the codes/fine-tuning/ folder:
bash predict-twitter-s140.sh
This script produces the predictions for the mutant texts.
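Conceptually, both predict scripts load a fine-tuned checkpoint and run it over the generated mutant texts. A hedged sketch of that step using the transformers pipeline API is shown below; the checkpoint path and mutant file layout are assumptions, and the predict-*.sh scripts remain the authoritative way to run this step.

```python
# Sketch of the prediction step (checkpoint path and mutant file layout are assumptions;
# the predict-*.sh scripts are the authoritative way to run this).
from transformers import pipeline
import pandas as pd

classifier = pipeline("sentiment-analysis", model="path/to/fine-tuned-checkpoint")
mutants = pd.read_csv("data/biasfinder/gender/mutants.csv")  # hypothetical file name
mutants["prediction"] = [r["label"] for r in classifier(mutants["text"].tolist())]
mutants.to_csv("data/biasfinder/gender/predictions.csv", index=False)
```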
Mutants of different classes produced from the same template are expected to have the same sentiment. Therefore, if the SA system predicts different sentiments for two mutants of different classes, the pair is evidence of a biased prediction. Such pairs of mutants are output as bias-uncovering test cases (BTCs). A BTC is thus a pair of mutants from two different classes (e.g., male and female for gender bias) together with their predictions, such that the SA system produces different predictions for them. Example of a BTC for gender bias:
<(male, prediction), (female, prediction)>
<("He is angry", "positive"), ("She is angry", "negative")>
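The evaluation notebooks listed below compute BTCs along these lines. As a hypothetical sketch, the core check groups mutants by their source template and reports any pair of mutants from different classes whose predictions disagree; the file name and column names here are assumptions.

```python
# Hypothetical sketch of BTC extraction (column names "template_id", "class", "text",
# and "prediction" are assumptions; see the evaluation notebooks for the actual logic).
from itertools import combinations
import pandas as pd

df = pd.read_csv("data/biasfinder/gender/predictions.csv")  # hypothetical file name

btcs = []
for template_id, group in df.groupby("template_id"):
    rows = group.to_dict("records")
    for a, b in combinations(rows, 2):
        if a["class"] != b["class"] and a["prediction"] != b["prediction"]:
            btcs.append(((a["text"], a["prediction"]), (b["text"], b["prediction"])))

print(len(btcs), "bias-uncovering test cases found")
```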
Notebook evaluation/BTC-Gender.ipynb contains the BTC calculation for gender bias targeting mutant texts.
Notebook evaluation/BTC-Occupation.ipynb contains the BTC calculation for occupation bias targeting mutant texts.
Notebook evaluation/BTC-Country.ipynb contains the BTC calculation for country-of-origin bias targeting mutant texts.