Name		Name	Last commit message	Last commit date
parent directory ..
facebook_candidate		facebook_candidate
facebook_instagram_keyword		facebook_instagram_keyword
reddit_keyword		reddit_keyword
twitter_candidate		twitter_candidate
twitter_keyword		twitter_keyword
README.md		README.md
facebook_ads_ids.csv.gz		facebook_ads_ids.csv.gz

README.md

Introduction

Here we release the data.

Facebook candidate posts

See /facebook_candidate. We share the IDs of the posts in CSV files with the following schema:

Column	Note
post_id	ID of the post
post_url	URL of the post

Facebook and Instagram posts based on keywords

See /facebook_instagram_keyword. We share the IDs of the posts in CSV files with the following schema:

Column	Note
post_id	ID of the post
platform	facebook or instagram
post_url	URL of the post
relevant*	1=related to midterm; 0=irrelevant

Post relevance label: see the paper for details. The procedure is not perfect, use with discretion.

Reddit submissions and comments based on keywords

See /reddit_keyword.

We share the data from each day in json files. Each line of the file is an object. We add a relevant label to each object.

Candidate tweets

See /twitter_candidate. We share the IDs of the tweets in CSV files with the following schema:

Column	Note
tid	ID of the tweet

Tweets based on keywords

See /twitter_keyword. We share the IDs of the tweets in CSV files with the following schema:

Column	Note
tid	ID of the tweet
relevant*	1=related to midterm; 0=irrelevant

Post relevance label: see the paper for details. The procedure is not perfect, use with discretion.

Facebook ads

See facebook_ads_ids.csv.gz . We share the IDs of the ads in CSV files with the following schema:

Column	Note
id	ID of the ad

4chan data

Since the 4chan data is too large for GitHub, we host it on zenodo.

For the archive files, each line is a thread in json format. For the snapshot files, each line is a json object containing the snapshots of 4chan's catalog endpoint.

Note: we did not perform any keyword matching on the 4chan data so not all content is related to the 2022 US midterm elections.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

README.md

Introduction

Facebook candidate posts

Facebook and Instagram posts based on keywords

Reddit submissions and comments based on keywords

Candidate tweets

Tweets based on keywords

Facebook ads

4chan data

Files

data

Directory actions

More options

Directory actions

More options

Latest commit

History

data

Folders and files

parent directory

README.md

Introduction

Facebook candidate posts

Facebook and Instagram posts based on keywords

Reddit submissions and comments based on keywords

Candidate tweets

Tweets based on keywords

Facebook ads

4chan data