Applications and next steps

In the end, our process did a decent job producing results that matched CRP's -- in some cases even catching things the CRP data got wrong. But the question remains: So what? What good does it do to work with campaign data at the donor level rather than the contribution level?

Here are a few examples of applications that might derive from this process at the local, state and federal levels.

Potential applications

Recall that part of the motivation behind this project was to generalize the standardization of donor names across campaign finance datasets. The main fields in most campaign finance datasets -- local, state or federal -- look pretty much the same: donor name, recipient name, some location information, and often some information about occupation and employer. CRP does a great job cleaning up this data at the national level, and the National Institute for Money in State Politics does similar cleanup at the state level, but neither is going to be able to standardize your local city council's campaign finance records on demand. So there's one application right there.

But that still leaves a bigger question: What's the point of standardizing this stuff at all?

Improving accuracy

Check bias vs. variance and recommend improvements accordingly
Build better features, particularly ZIP code distance
Review the training data; some of the CRP data is wrong
Improve the name parser: handle nicknames, and fix suffixes such as Jr. and Sr., which are parsed incorrectly in some places
Zero-pad ZIP codes so leading zeros are preserved
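
As a rough illustration of the last two items, here is a minimal sketch of ZIP zero-padding and suffix handling. It is not the project's actual code: the function names, the suffix set and the rule for ZIP+4 values are all assumptions made for the example, and inputs are assumed to be strings or integers.

```python
import re

# Hypothetical suffix set for illustration; a real parser would likely need more.
SUFFIXES = {"JR", "SR", "II", "III", "IV"}


def normalize_zip(raw):
    """Return a five-digit ZIP string, restoring leading zeros.

    Assumes `raw` is a str or int. Values longer than five digits are
    treated as ZIP+4 (an assumption; six-digit values are ambiguous).
    """
    digits = re.sub(r"\D", "", str(raw))
    if not digits:
        return ""
    if len(digits) > 5:
        # Pad to nine digits first so a dropped leading zero is restored,
        # then keep only the five-digit ZIP.
        digits = digits.zfill(9)[:5]
    return digits.zfill(5)


def split_suffix(name):
    """Split a generational suffix (Jr., Sr., III ...) off a donor name.

    Returns (name_without_suffix, suffix), both uppercased.
    """
    tokens = name.upper().replace(".", "").split()
    kept, suffix = [], ""
    for tok in tokens:
        bare = tok.strip(",")
        if bare in SUFFIXES and kept:
            suffix = bare
            # Keep the "LAST, FIRST" comma if it was attached to the suffix.
            if tok.endswith(",") and not kept[-1].endswith(","):
                kept[-1] += ","
        else:
            kept.append(tok)
    return " ".join(kept), suffix
```

For example, `normalize_zip(2101)` gives `"02101"`, and `split_suffix("SMITH JR, JOHN")` gives `("SMITH, JOHN", "JR")`, so the suffix no longer ends up inside the last name. Nickname handling would need a separate lookup table and is not shown here.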
