Releases: openeventdata/Dictionaries
v1.0 release
Phoenix has been in daily production for 9 months now, and many bugs and omissions in the dictionaries and coding pipeline have been ironed out in that time.
This release includes several major updates. While several important gaps remain (especially in the pre-1990 period), the dictionaries are suitable for basic production use.
Major features:
- Added many now-defunct countries, most significantly the Soviet Union and East Germany.
- Added many political titles (e.g. "U.S. Secretary of Defense") and departments/institutions ("The Pentagon"). Previously, the holders of these offices had good coverage, but news articles often refer only to the institution or office, and we missed all of these references before.
- For prominent figures, added many alternative spellings. Before, most people had only one version of their names in the dictionaries. Variations in middle initials, titles, etc. would not be coded. Expanding beyond the most prominent figures will require a degree of automation.
- Coverage of militarized non-state actors has been significantly improved and a bug that prevented most codings of groups was corrected (specifically, abbreviations were removed from the group's name).
- Several hundred actors that appeared frequently in the news but were not in the dictionaries were automatically identified and added, giving us better coverage of current events.
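The note above on alternative spellings says that extending them beyond the most prominent figures will need some automation. A minimal sketch of what that could look like follows. It is not the tool used for this release: the function name `name_variants`, its parameters, and the uppercase underscore-joined output format are all assumptions made for illustration.

```python
from itertools import product


def name_variants(first, last, middle="", titles=()):
    """Return simple spelling variants of a person's name.

    Hypothetical helper: the uppercase, underscore-joined output format
    is an assumption, not the confirmed dictionary format.
    """
    # Middle name may be absent, spelled out, or reduced to an initial.
    middles = [""]
    if middle:
        middles += [middle, middle[0], middle[0] + "."]

    # Each variant may appear with or without a title.
    prefixes = [""] + list(titles)

    variants = set()
    for title, mid in product(prefixes, middles):
        parts = [p for p in (title, first, mid, last) if p]
        variants.add("_".join(parts).upper())
    return sorted(variants)


# Made-up example name, with one title and a middle name.
for variant in name_variants("Jane", "Doe", middle="Quinn", titles=("Senator",)):
    print(variant)
```

A generator like this only covers mechanical variation (titles, middle initials). Transliteration differences would still need a separate source of spellings.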
0.0.2 beta
Only a few updates from 0.0.1b:
- removed SEP and INS, replaced with REB. See the commits and PR for details.
- removed SET (settler), replaced with CVL
- used @philip-schrodt's new tool to add alternative names for high-frequency actors
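For anyone holding data coded with the older dictionaries, the code replacements listed above can be applied with a small remapping step. The sketch below assumes that actor codes are concatenations of three-letter segments (for example a country segment followed by a role segment). The names `RETIRED_CODES` and `remap_code` are hypothetical and not part of the repository.

```python
# Replacements taken from the release notes: SEP and INS become REB,
# SET becomes CVL. The dictionary name itself is hypothetical.
RETIRED_CODES = {"SEP": "REB", "INS": "REB", "SET": "CVL"}


def remap_code(code: str) -> str:
    """Replace retired three-letter segments in a concatenated actor code.

    Assumes the code is built from consecutive three-letter segments.
    """
    chunks = [code[i:i + 3] for i in range(0, len(code), 3)]
    return "".join(RETIRED_CODES.get(chunk, chunk) for chunk in chunks)


# Made-up example codes.
for old in ("SYRINS", "ISRSET", "USAGOV"):
    print(old, "->", remap_code(old))
```

Splitting into fixed-width segments rather than doing a plain substring replace avoids rewriting a retired code that happens to straddle two adjacent segments.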
Dictionaries Version 0.0.1b (changes from alpha Phoenix release)
This release includes a number of changes based on feedback and tests on the first alpha release of Phoenix. These changes are primarily aimed at reducing false positives on CAMEO 19s (Fight), along with several more basic updates to bring the dictionaries up to date. See the commit logs for more details, but broadly, the changes include:
- update some frequently covered countries for leadership changes in the last few years
- improve actor accuracy by tweaking actor and agent files
- reduce miscodings (specifically 190s) by changing the verb file
- dramatically ramp up the ruthlessness of the discard list