Releases: ElectionDataAnalysis/electiondata
electiondata version 2.0.1
Changes since v.2.0
- Improved documentation
- revised existing files
- added comprehensive instructions for getting started (Sample_Session.md, Installation.md)
- added instructions for unit tests (Testing_Code_with_pytest.md)
- Added CODE_OF_CONDUCT.md, CONTRIBUTING.md and CONTACT_US.md
- Improved in-line documentation of code
- Improved error messaging
- Moved all references to postgres into database submodule
- Corrected reference and name for NIST json export
- Cleaned up some munger, jurisdiction and ini_files_for_results files
- Rounded bar chart output percentages to nearest 0.1%
- Slugified output file names
- Fixed encoding-related bug for Windows environments (per Journal of Open Source Software review)
- Improved file names for analysis exports
- Fixed some bugs
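The rounding and slugifying changes listed above can be illustrated with a minimal sketch. This is not the package's own code: the function names slugify and percent_label are hypothetical, and the exact slug rules used by electiondata may differ.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Reduce a string to lowercase ASCII letters, digits and hyphens."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of other characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def percent_label(part: float, whole: float) -> str:
    """Format part/whole as a percentage with one decimal place."""
    return f"{100 * part / whole:.1f}%"
```

A slug of this kind contains no spaces, colons or quotes, which avoids file names that are invalid on some operating systems.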
Version 2.0 Release
Changes since Version 1.0:
Requirements:
- python3.9
- Some later package versions -- see requirements.txt
Running the software:
- Main parameter file no longer required to be run_time.ini in current directory (though that is still the default)
- Database name no longer required to be given in the main parameter file (though that is still the default)
- Headers for parameter files have changed: [electiondata] instead of [election_data_analysis] in the main parameter file; result parameter files need header [election_results]
- Module and repository name changed to electiondata; submodule names changed to remove underscores.
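As an illustration of the new headers, here is a minimal sketch that checks a parameter file for the expected section using Python's standard configparser. Only the two header names and the reports_and_plots_dir parameter come from these notes; the parameter values, the results_file parameter and the read_section helper are hypothetical.

```python
import configparser

# Hypothetical file contents: only the section headers and
# reports_and_plots_dir come from the release notes; the values and the
# results_file parameter are placeholders.
MAIN_PARAMETERS = """
[electiondata]
reports_and_plots_dir = /tmp/reports_and_plots
"""

RESULT_PARAMETERS = """
[election_results]
results_file = example_results.csv
"""


def read_section(text: str, required_header: str) -> dict:
    """Parse ini text and return the parameters under required_header."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(required_header):
        raise ValueError(f"missing [{required_header}] header")
    return dict(parser[required_header])


main = read_section(MAIN_PARAMETERS, "electiondata")
results = read_section(RESULT_PARAMETERS, "election_results")
```

A file that still uses the old [election_data_analysis] header would raise ValueError in this sketch, which is the kind of mismatch to look for when migrating Version 1.0 parameter files.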
Database
- Structure has changed substantially, and new code is not backwards-compatible.
- BallotMeasureSelection table no longer exists; instead, ballot measure selections ('Yes' or 'No') are treated as CandidateSelections with Party 'ballot measure selection'.
- Census and other external data now stored in two tables: ExternalDataSet to hold metadata, and ExternalData to hold data.
Jurisdiction Prep
- Individual jurisdiction folders no longer contain Election.txt files; instead there is a single Election.txt file in the jurisdictions/000_for_all_jurisdictions folder.
Data Loading
- By default, the following data loading routines will decline to load the files for a particular jurisdiction and election to the desired database unless they pass the tests in Analyzer.test_loaded_results(), including comparison to reference result contest totals:
  - DataLoader.load_ej_pair()
  - DataLoader.load_all()
  - load_or_reload_all()
  Each of these routines has an optional parameter run_tests defaulting to True. If False, data will not be tested.
- File structure for mungers has changed substantially. Each munger is defined by one *.munger file, and all those files are in the mungers directory.
- Munger parameter scheme has changed substantially. See User_Guide.md.
- Handling of munge formulas has changed, allowing more flexibility in combining parameters, but requiring that certain reserved characters be avoided. See User_Guide.md.
- For flat text files, strings that are in the headers of the count columns are now referred to by, e.g., <count_header_0> rather than <header_0>.
- If the rows_to_skip parameter is 1 or more, these rows are removed as the file is read, and all other row numbers should not count the skipped rows. For example, if rows_to_skip=2, then count_header_0 references the third row of the original file.
- json and xml files can be munged.
- Multi-block flat text files can be munged.
- Values to ignore (e.g., Candidate=Total) can be specified.
- Handling of lookups in auxiliary files has changed substantially. See User_Guide.md.
- Ballot Measure contest functionality has not been tested and is not guaranteed to work correctly.
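The row numbering described for rows_to_skip can be sketched in plain Python. This only illustrates the counting convention and is not how the package reads files; the skip_rows helper and the sample file contents are hypothetical.

```python
def skip_rows(lines: list[str], rows_to_skip: int) -> list[str]:
    """Drop the first rows_to_skip lines, as if they had never been read."""
    return lines[rows_to_skip:]


# Hypothetical flat text file with two preamble rows before the header row.
original = [
    "Unofficial results",
    "Generated 2020-11-04",
    "County,Candidate,Election Day,Absentee",
    "Adams,Alice Smith,120,45",
]

remaining = skip_rows(original, rows_to_skip=2)

# Row numbers are counted after skipping, so row 0 of what remains is the
# third row of the original file.
header_row_0 = remaining[0].split(",")
```

With rows_to_skip=2, header row 0 is the line beginning "County,Candidate", which matches the count_header_0 example given above.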
Error reporting
- Errors are now recorded in files in the directory given by the reports_and_plots_dir parameter specified in the main parameter file.
- All data loading routines have a suppress_warnings parameter allowing the user to keep error reporting but omit warnings (such as CandidateContests not found).
Exporting and Analyzing Data
- There are substantial changes to the routines and parameters for exporting and analyzing data. See User_Guide.md.
Testing
- tests/load_and_test_all.py is no longer available. Instead, run pytest tests/dataloading_tests.
- Added reference results for testing correctness of uploaded data. Note that reference results use the convention that, e.g., 2021 Democratic Primary and 2021 Republican Primary are two separate elections. Some of the mungers for pre-2021 Primary Elections have not caught up with this change.
- tests directory also contains tests for Analyzer and JurisdictionPrepper functionality (applied before each pull request touching code)
Version 1.0 release
This codebase allows ingestion of state-level election data files into a common data format stored in a postgres database. Processing files for a number of states are already included, and additional states will be added on an ongoing basis. Additionally, this codebase provides a number of analysis methods, such as aggregation and visualization.