Applications and next steps
At the end of the day, our process did a pretty decent job coming up with results that matched CRP's -- in some cases even finding things that the CRP data got wrong. But the question still remains: So what? What good does it do to work with campaign data at the donor level as opposed to the contribution level?
Here are a few examples of applications that might derive from this process at the local, state and federal levels.
Recall that part of the motivation behind this project was to generalize the standardization of donor names across campaign finance datasets. The main fields in most campaign finance datasets -- local, state or federal -- look pretty much the same: donor name, recipient name, some location information, and often some info about occupation and employer. CRP does a great job cleaning up this data on the national level, and the National Institute for Money in State Politics does some similar cleanup at the state level, but neither of them is going to be able to standardize your local city council's campaign finance records on demand. So there's one application right there.
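To make the "same fields everywhere" point concrete, here is a minimal sketch of mapping a local export onto one shared record shape. Everything in it is an assumption for illustration: the `Contribution` class, the `to_contribution` helper, and the city council column names are hypothetical and do not come from this project's code or from any real dataset.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """Hypothetical common record shape for local, state or federal data."""
    donor_name: str
    recipient_name: str
    city: str
    state: str
    zip_code: str
    occupation: str = ""
    employer: str = ""


# Hypothetical column names for an imaginary city council export.
# Keys are Contribution fields; values are the source file's headers.
CITY_COUNCIL_COLUMNS = {
    "donor_name": "Contributor",
    "recipient_name": "Committee",
    "city": "City",
    "state": "State",
    "zip_code": "Zip",
    "occupation": "Job",
    "employer": "Employer",
}


def to_contribution(row, columns):
    """Map one source row (a dict) onto the common record shape.

    Missing or empty source values become empty strings.
    """
    values = {
        field: (row.get(source) or "").strip()
        for field, source in columns.items()
    }
    return Contribution(**values)


row = {
    "Contributor": " SMITH, JOHN ",
    "Committee": "Friends of Jane Doe",
    "City": "Berkeley",
    "State": "CA",
    "Zip": "94704",
    "Job": "Teacher",
}
record = to_contribution(row, CITY_COUNCIL_COLUMNS)
```

Supporting another jurisdiction would then mostly mean writing another column mapping, with the matching step itself left unchanged.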
But that still leaves a bigger question: What's the point of standardizing this stuff at all?
- Check bias vs. variance and recommend accordingly
- Better features, particularly ZIP code distance
- Review training data. Some of the CRP data is wrong.
- Name parser improvements: nicknames; suffixes like Jr. and Sr. are parsed incorrectly in some places
- Zero padding on ZIPs
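
Two of the items above, zero padding on ZIPs and the Jr./Sr. handling, are small enough to sketch. The following is one possible approach, not the project's actual code: `normalize_zip` and `split_suffix` are hypothetical helpers, the suffix list is an assumed starting set, and the ZIP helper assumes its input is a string or an integer (a ZIP that was read in as a float would need to be converted first).

```python
import re

# Assumed starting set of generational suffixes; extend as needed.
SUFFIX_RE = re.compile(r"[,\s]+(JR|SR|II|III|IV)\.?$", re.IGNORECASE)


def normalize_zip(raw):
    """Return a five-digit ZIP string, restoring dropped leading zeros.

    Assumes `raw` is a str or int. Returns "" if it contains no digits.
    """
    digits = re.sub(r"\D", "", str(raw))
    if not digits:
        return ""
    if len(digits) > 5:
        # Treat longer values as ZIP+4, pad back to nine digits in case
        # a leading zero was lost, then keep the five-digit part.
        digits = digits.zfill(9)[:5]
    return digits.zfill(5)


def split_suffix(name):
    """Split a trailing suffix off a name; returns (name, suffix)."""
    name = name.strip()
    match = SUFFIX_RE.search(name)
    if match is None:
        return name, ""
    return name[:match.start()], match.group(1).upper()
```

For example, `normalize_zip(2138)` gives `"02138"`, and `split_suffix("SMITH, JOHN JR.")` gives `("SMITH, JOHN", "JR")`. Pulling the suffix out before parsing keeps it from being mistaken for part of the first or last name. Nicknames would likely need a separate lookup table, which is not attempted here.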