This repository contains code and data for the paper:
Luo, Y., Card, D. and Jurafsky, D. (2020). Detecting Stance in Media on Global Warming. In Findings of the Association for Computational Linguistics: EMNLP 2020.
BibTeX TBA
- Create and activate a Python 3.6 environment.
- Run `pip install -r requirements.txt`.
- Re-install neuralcoref with the `--no-binary` option: `pip uninstall neuralcoref`, then `pip install neuralcoref --no-binary neuralcoref`.
- Download SpaCy's English model: `python -m spacy download en`.
- Update the `config.json` file with your local OS variables.
- Our dataset GWSD itself can be accessed via `GWSD.tsv` in the main directory. The dataset contains tab-separated fields for each of the following:
  - `sentence`: the sentence
  - `worker_0`, ..., `worker_7`: ratings from each of the 8 workers for the stance of the sentence
  - `disagree`: the probability that the sentence expresses disagreement with the target (that climate change/global warming is a serious concern), as estimated by our Bayesian model
  - `agree`: the same for the "agrees" label
  - `neutral`: the same for the "neutral" label
  - `guid`: a unique ID for each sentence
  - `in_held_out_test`: whether the sentence was used in our held-out test set for model and baseline evaluation
Note: The first 5 rows are the 5 screening sentences we used to check that annotators correctly understood the task; they therefore do not have estimated probability distributions.
- Our lexicons of framing devices are located in `4_analyses/lexicons`.
- The sequence of code to replicate our results can be found in the individual READMEs of the numbered sub-directories.
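
As a quick way to inspect `GWSD.tsv`, the sketch below reads the file with Python's standard `csv` module and assigns each sentence the label with the highest estimated probability. It is not part of the repository's pipeline. The helper names `load_gwsd` and `_to_prob` are hypothetical. The sketch also assumes that the file has a header row using the field names listed above, that it follows default CSV quoting rules, and that rows without estimates (such as the first 5) have blank or non-numeric probability cells.

```python
import csv
import math

# Probability columns described in the field list above.
LABELS = ("disagree", "agree", "neutral")


def _to_prob(value):
    """Parse a probability cell; return None if it is blank, non-numeric, or NaN."""
    try:
        p = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(p) else p


def load_gwsd(lines):
    """Yield (sentence, most_probable_label, probs) for rows with all three estimates.

    `lines` is any iterable of text lines, such as an open file handle.
    Assumption: the first line is a header naming the columns.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        probs = {label: _to_prob(row.get(label)) for label in LABELS}
        if any(p is None for p in probs.values()):
            # Skip rows with no estimated distribution (assumed to be the
            # annotator-check sentences at the top of the file).
            continue
        best = max(probs, key=probs.get)
        yield row["sentence"], best, probs
```

To use it, pass an open handle, for example `load_gwsd(open("GWSD.tsv", newline="", encoding="utf-8"))`, and iterate over the result. Taking the most probable label is only one way to collapse the distribution; the full `probs` dictionary is returned so that the soft labels remain available.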