fec-standardizer

An experiment to standardize individual donor names in campaign finance data using simple graph theory and machine learning.

Basics

The objective of this project is to build an entirely automated workflow that can identify canonical individual donors in an arbitrary set of campaign finance data. To test its accuracy, this experiment uses federal campaign finance data whose donors have already been standardized by the Center for Responsive Politics (CRP). The workflow is designed to show how often the automated process's judgment matches CRP's, which is treated as the gold standard.
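As a rough illustration of what "matching CRP's judgment" could mean, the sketch below scores agreement pair by pair: two contributions count as a match when both labelings either assign them to the same donor or keep them apart. This is an assumption made for illustration and not necessarily the metric this project uses; `pairwise_agreement` and the sample labels are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(predicted, gold):
    """Return the share of record pairs on which two donor labelings agree.

    predicted, gold: dicts mapping a contribution ID to a donor label.
    A pair counts as agreement when both labelings put the two
    contributions under the same donor, or both keep them apart.
    """
    # Only compare contributions that appear in both labelings.
    ids = sorted(set(predicted) & set(gold))
    agree = total = 0
    for a, b in combinations(ids, 2):
        same_pred = predicted[a] == predicted[b]
        same_gold = gold[a] == gold[b]
        agree += same_pred == same_gold
        total += 1
    # With fewer than two shared records there is nothing to disagree on.
    return agree / total if total else 1.0


# Hypothetical labels: the gold standard sees three donors,
# the automated pass wrongly merges contribution 3 into the first donor.
gold = {1: "A", 2: "A", 3: "B", 4: "C"}
predicted = {1: "x", 2: "x", 3: "x", 4: "y"}
score = pairwise_agreement(predicted, gold)
```

Label names do not need to line up between the two sides, since only the grouping is compared. Comparing every pair is quadratic in the number of records, so for a sample the size of the one used here a real evaluation would likely restrict itself to candidate pairs.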

Results

Using a set of 100,000 random individual contributions selected from CRP's data, this workflow identified the same canonical donors as CRP's combination of human and automated classifiers between 96 and 98 percent of the time.

More details

The whole process is documented in detail in this repo's wiki.

Questions or comments

At best, I'm an amateur when it comes to a lot of the techniques used here. I'm sure I made some mistakes and did some stupid things. If so, I'd love to hear about them!

If you have any questions or comments, feel free to leave them in the wiki or contact chase.davis@gmail.com. Thanks for your interest!
