
[REVIEW]: electiondata: a Python package for consolidating, checking, analyzing, visualizing and exporting election results #3739

Closed
40 tasks done
whedon opened this issue Sep 20, 2021 · 81 comments
Labels: accepted, published, Python, recommend-accept, review

Comments


whedon commented Sep 20, 2021

Submitting author: @sfsinger19103 (Stephanie Singer)
Repository: https://github.com/ElectionDataAnalysis/electiondata
Version: v2.0.1
Editor: @ajstewartlang
Reviewers: @vaneseltine, @andrewheiss
Archive: 10.5281/zenodo.5802556

⚠️ JOSS reduced service mode ⚠️

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

[status badge]

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/27e248fc1e46488384e884dc242aac80"><img src="https://joss.theoj.org/papers/27e248fc1e46488384e884dc242aac80/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/27e248fc1e46488384e884dc242aac80/status.svg)](https://joss.theoj.org/papers/27e248fc1e46488384e884dc242aac80)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@vaneseltine & @andrewheiss, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @ajstewartlang know.

Please start on your review when you are able, and be sure to complete it within the next six weeks at the very latest.

Review checklist for @vaneseltine

✨ Important: Please do not use the Convert to issue functionality when working through this checklist; instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@sfsinger19103) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support?

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @andrewheiss

✨ Important: Please do not use the Convert to issue functionality when working through this checklist; instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@sfsinger19103) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support?

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of Need' that clearly states what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

whedon commented Sep 20, 2021

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @vaneseltine, @andrewheiss it looks like you're currently assigned to review this paper 🎉.

⚠️ JOSS reduced service mode ⚠️

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

⭐ Important ⭐

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this repository (https://github.com/openjournals/joss-reviews). As a reviewer, you're probably currently watching this repository, which means that under GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews:

[screenshot: repository watch settings]

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

[screenshot: notification settings]

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf


whedon commented Sep 20, 2021

PDF failed to compile for issue #3739 with the following error:

 Can't find any papers to compile :-(

@sfsinger19103

Paper is (and has been) here: https://github.com/ElectionDataAnalysis/electiondata/blob/joss-submission/JoSS_Submission/paper.md

Not sure why @whedon couldn't find it. Please let me know what to do to fix the problem.

@danielskatz

@whedon generate pdf from branch joss-submission


whedon commented Sep 20, 2021

Attempting PDF compilation from custom branch joss-submission. Reticulating splines etc...

@danielskatz

@whedon check references from branch joss-submission


whedon commented Sep 20, 2021

Attempting to check references... from custom branch joss-submission


whedon commented Sep 20, 2021

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1089/elj.2019.0544 is OK
- 10.6028/NIST.SP.1500-100 is OK
- 10.6028/NIST.SP.1500-100r2 is OK

MISSING DOIs

- None

INVALID DOIs

- None


whedon commented Sep 20, 2021

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@danielskatz

> Paper is (and has been) here: https://github.com/ElectionDataAnalysis/electiondata/blob/joss-submission/JoSS_Submission/paper.md
>
> Not sure why @whedon couldn't find it. Please let me know what to do to fix the problem.

JOSS doesn't store branch info when the paper is in a branch, so we just have to issue commands manually and add "from branch joss-submission" to them to tell whedon where to look.


whedon commented Oct 4, 2021

👋 @andrewheiss, please update us on how your review is going (this is an automated reminder).


whedon commented Oct 4, 2021

👋 @vaneseltine, please update us on how your review is going (this is an automated reminder).

@vaneseltine

Currently getting prerequisites in place (PostgreSQL, Python 3.9). Two comments while I run through setup:

  1. There's some inconsistency in the Python version: the instructions in User_Guide.md require Python 3.9, but the main README.md only mentions 3.7 for contributions, and no version requirement is enforced (e.g. with python_requires) in setup.py (a sketch of such a constraint follows this list).

  2. It would be best to avoid this requirement in the user guide: "If you use the alias python, make sure it points to python3.9." In most cases, calling the literal "python" command can be avoided entirely in a Python-based project, and after a couple of searches through the code, I'm not sure why this matters for electiondata. Happy to help troubleshoot if there's a point in the code where this is an issue.
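A minimal sketch of the constraint mentioned in item 1 (illustrative only; apart from the package name and the src/ layout mentioned elsewhere in this thread, the arguments are assumptions rather than a copy of the repository's setup.py):

# setup.py (sketch): enforce the Python floor stated in User_Guide.md
from setuptools import setup, find_packages

setup(
    name="electiondata",
    package_dir={"": "src"},
    packages=find_packages(where="src"),
    python_requires=">=3.9",  # pip will refuse to install on older interpreters
)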

@sfsinger19103

  1. Thanks for flagging the version inconsistency. It should be 3.9.
  2. The comment about aliasing is not intended as a requirement, but rather as a helpful hint for users who may not have much experience with Python.

@ajstewartlang

Hi 👋 @andrewheiss and @vaneseltine. I'm just checking in to ask how your reviews are progressing.

@andrewheiss

Hi! Sorry for the delay on this! Here's my review:


electiondata is a powerful new Python package that allows users to read and standardize raw election data that state and local election agencies provide in a wide variety of file formats and data structures. It is important software for anyone interested in election data, which is ordinarily incredibly messy and hard to work with, and will make it much easier to analyze and visualize election results in a standard way.

Installation

In the paper, the authors state that “Users do not need to know python (other than the basics for installing and calling the package)”, but I found the installation process more difficult than expected. Using a clean, new macOS Monterey installation with freshly installed Python 3.9, without using any virtual environments or conda, and following the installation instructions in the user guide, I ran into these surprises:

  • I ran python3 setup.py install, following the instructions, and it seemed that the package and all dependencies installed correctly. But running import electiondata as ea from the Python shell gave errors about various missing dependencies (numpy, lxml, and others).
  • So I ran pip3 install -r requirements.txt to install from the requirements file instead, and that mostly worked, except…
    • …it’s not immediately clear in the installation section of the guide that PostgreSQL is required (and is specifically a prerequisite for installing the package; psycopg2 wouldn’t install through pip without it on my system). Running brew install postgresql fixed this just fine.
    • matplotlib wouldn’t install because I was missing freetype. brew install freetype fixed this.
    • After installing PostgreSQL and Freetype, pip3 install -r requirements.txt worked fine and the package installed; running import electiondata as ea in the Python shell produced no errors.

These ↑ are all issues specific to a clean installation of macOS, so they can't all be addressed universally (and Windows and Linux will have their own issues), but it might be helpful to expand the installation section of the user guide to note that external programs like PostgreSQL and Freetype are necessary before installing electiondata.

Additionally, it might be helpful to include basic instructions for how to create a new local PostgreSQL user and database so those settings and credentials can be included in run_time.ini. If the package’s goal is to be accessible to people with minimal Python experience, providing this extra guidance would be helpful (it’s been a few years since I last used PostgreSQL, for instance, so I had to google how to create a new user and db).
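For example, the guide could include something along these lines (a minimal sketch using SQLAlchemy, which is already a dependency; the superuser connection, role name, password, and database name are placeholders, not values from the repository):

from sqlalchemy import create_engine, text

# Connect to the default "postgres" maintenance database as an existing superuser.
# CREATE DATABASE cannot run inside a transaction, so use autocommit.
admin_engine = create_engine(
    "postgresql+psycopg2://postgres@localhost:5432/postgres",
    isolation_level="AUTOCOMMIT",
)
with admin_engine.connect() as conn:
    conn.execute(text("CREATE ROLE electiondata_user WITH LOGIN PASSWORD 'change-me'"))
    conn.execute(text("CREATE DATABASE electiondata OWNER electiondata_user"))

The same host, user, password, and database name would then go into run_time.ini.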

It’s also not clear in the user guide what needs to happen to set up the initial database. Before trying to load any data, I poked around in the package code to find what the expected schema should be, expecting to need to create different tables on my own, but then discovered that the requisite tables are built by the create_database() function in src/electiondata/database/__init__.py upon importing the package, which was a nice surprise. Documenting that process could be helpful.
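One quick way to document (and let users verify) that step is to list the tables with SQLAlchemy's inspector after importing the package (a sketch; the connection URL is a placeholder for whatever was configured in run_time.ini):

import electiondata as ea  # per the note above, importing triggers create_database()
from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql+psycopg2://electiondata_user:change-me@localhost:5432/electiondata")
print(sorted(inspect(engine).get_table_names()))  # the schema the package just built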

Finally (and this is definitely more of a distant feature request/idea!), it might be worth someday using something more local like SQLite by default, which embeds the database directly in the application and doesn't require a separate SQL server process. It would simplify code testing and development, since users wouldn't need to keep a separate server running. Currently it is possible to do something like this (the documentation points out that users can modify src/database/__init__.py to use other flavors of SQL, and the package uses SQLAlchemy to provide a standardized API for all sorts of SQL backends, including SQLite), but switching between PostgreSQL and SQLite is nontrivial. Again, this is definitely not a dealbreaker at all! Just a future thought to simplify installation and onboarding for the package.
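To illustrate the point with a generic SQLAlchemy sketch (not code from electiondata): the connection URL is the only thing that has to change between the two backends, although any PostgreSQL-specific SQL or column types in the package would still need attention.

from sqlalchemy import create_engine

# Server-based backend: needs a running PostgreSQL service plus credentials.
pg_engine = create_engine("postgresql+psycopg2://electiondata_user:change-me@localhost:5432/electiondata")

# File-based backend: the whole database lives in one local file, no server process required.
sqlite_engine = create_engine("sqlite:///electiondata.db")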

Documentation and onboarding

The user guide is comprehensive and works well as a complete documentation reference, but it is very difficult to get started initially. There are sample .ini files, but they’re scattered throughout the project.

It would be really helpful to have a sort of “Getting Started” vignette or guide that showed a minimal working example of how to:

  1. Structure the project directory with the different [electiondata] folders like results_dir, repository_content_dir, etc.
  2. Generate the jurisdiction files for a known state (like Georgia, since some raw election data is already included in the tests)
  3. Load the data from one of the raw election datasets (like Georgia’s XML file or Guam’s Excel file from the tests)
  4. Analyze the loaded data (again, for Georgia or Guam, since those raw datasets are included with the package already)

Again, the current user guide is quite comprehensive and covers all sorts of edge cases, conventions, additional arguments, and other details, but it's hard to use it to get started from scratch. (I had to figure out lots of the program flow from the tests, but I eventually got it to work and load Georgia and Guam data into my local PostgreSQL database.)

A complete working example that users can run would help them understand the flow of the package and ensure that all the moving parts are working before they start feeding their own downloaded data files into the database.
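As a rough illustration of step 1 from the list above (the folder names results_dir and repository_content_dir come from the run_time.ini conventions described in this review; the project location is hypothetical):

from pathlib import Path

# Scaffold a working directory for the folders named in the [electiondata] section of run_time.ini.
project = Path.home() / "electiondata_project"
for folder in ("results_dir", "repository_content_dir"):
    (project / folder).mkdir(parents=True, exist_ok=True)

A vignette could then walk through pointing run_time.ini at these folders, generating the Georgia jurisdiction files, loading the bundled Georgia XML results, and producing a single plot.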

Testing

There is a suite of tests for each of the package’s main purposes: jurisdiction creation, data loading, and analyzing. These run well and are sufficient.

Contributions

The README includes a section on contributing to the project, specifying that code should be compatible with Python 3.7 (though this should be 3.9) and that it should follow the black code style.

It might be helpful to have some community guidelines in the package too, such as a CONTRIBUTING.md file (perhaps modeled after something like this or this) and a CONDUCT.md file (like this or this), in order to meet JOSS’s requirements that guidelines tell people how to (1) contribute, (2) report issues, and (3) seek support.


In general this is great! My only big concerns really just deal with getting it all set up and running: a vignette with a minimal working reproducible example that takes a raw XML/Excel/JSON/whatever file from its raw state, into the database, and on to nice graphs would be extraordinarily helpful and would make it a lot easier for people to start using this package right away.

@ajstewartlang

Many thanks for your very detailed and helpful review @andrewheiss

@sfsinger19103 would you be able to address the points that have been raised? On my reading, it does sound like additional installation instructions would help the novice Python user, and adding a 'Getting Started' vignette sounds like another great suggestion. The other points raised are important too, especially in terms of adding/improving documentation and giving a minimal working reproducible example. This looks like a great submission, and addressing these points will likely make the package even more usable and increase adoption. The suggestions raised by @andrewheiss sound very doable to me, so I hope you can address them. @sfsinger19103 it might be easiest if you open a separate issue for each so we can then tick them off as they are addressed. Does this sound OK?

@sfsinger19103

Yes, sounds good. This is the first time I've been involved in (much less led) a project of this scale, so I'm grateful for such explicit suggestions.

@sfsinger19103

@ajstewartlang what's the appropriate way to make a small correction to the submitted paper?

(Specifically, the paper says that there is an export in json format that conforms to the NIST Common Data Format V1 when in fact the json format conforms to NIST Common Data Format V2.)

@ajstewartlang

@sfsinger19103 you can go ahead and correct the paper on the joss-submission branch (or on the joss-submission-updated branch if that's where changes have been made); I'll then re-generate the PDF. Just let me know which branch it's on.

@sfsinger19103

@ajstewartlang corrected version now on joss-submission branch.

@ajstewartlang

@whedon generate pdf from branch joss-submission


whedon commented Nov 7, 2021

Attempting PDF compilation from custom branch joss-submission. Reticulating splines etc...


whedon commented Jan 3, 2022

Attempting dry run of processing paper acceptance...


whedon commented Jan 3, 2022

PDF failed to compile for issue #3739 with the following error:

 Can't find any papers to compile :-(

@ajstewartlang

@whedon recommend-accept from branch joss-submission


whedon commented Jan 3, 2022

Attempting dry run of processing paper acceptance...


whedon commented Jan 3, 2022

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1089/elj.2019.0544 is OK
- 10.6028/NIST.SP.1500-100 is OK
- 10.6028/NIST.SP.1500-100r2 is OK

MISSING DOIs

- None

INVALID DOIs

- None


whedon commented Jan 3, 2022

👋 @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof 👉 openjournals/joss-papers#2858

If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#2858, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true from branch joss-submission 

@sfsinger19103

Just noticed a sentence fragment in the summary. If it's not too late, how can we correct it per: openjournals/joss-papers#2858 (comment)

@ajstewartlang

> Just noticed a sentence fragment in the summary. If it's not too late, how can we correct it per: openjournals/joss-papers#2858 (comment)

@arfon can you merge this PR please? I don't think I can.


arfon commented Jan 4, 2022

@whedon recommend-accept from branch joss-submission


whedon commented Jan 4, 2022

Attempting dry run of processing paper acceptance...


whedon commented Jan 4, 2022

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1089/elj.2019.0544 is OK
- 10.6028/NIST.SP.1500-100 is OK
- 10.6028/NIST.SP.1500-100r2 is OK

MISSING DOIs

- None

INVALID DOIs

- None


whedon commented Jan 4, 2022

👋 @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof 👉 openjournals/joss-papers#2860

If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#2860, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true from branch joss-submission 


arfon commented Jan 4, 2022

@ajstewartlang – I believe @kthyng is the EiC on rotation this week. She should be picking this up 🔜


kthyng commented Jan 5, 2022

@sfsinger19103 -

  • version looks updated ✅
  • Please update Zenodo metadata so that title and author list exactly match JOSS paper
  • Paper comments:
    • first bullet, page 2: need punctuation before "See for example"
    • first paragraph under "Statement of Need": "Academic sources ..." should have the parenthetical references together in one set of parentheses
    • references aren't properly capitalized, for example Herron 2019 and "north carolina"; preserve capitalization with {} around the relevant characters in the .bib file (see the example below)
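For instance, in the .bib entry (an illustrative title field, not the actual reference):

title = {An Illustrative Title About Elections in {North Carolina}},

The inner braces tell BibTeX to keep "North Carolina" capitalized regardless of the bibliography style.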

@sfsinger19103

Done! @kthyng


kthyng commented Jan 5, 2022

@whedon generate pdf from branch joss-submission


whedon commented Jan 5, 2022

Attempting PDF compilation from custom branch joss-submission. Reticulating splines etc...


whedon commented Jan 5, 2022

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈


kthyng commented Jan 5, 2022

ok looks ready to go!


kthyng commented Jan 5, 2022

@whedon accept deposit=true from branch joss-submission


whedon commented Jan 5, 2022

Doing it live! Attempting automated processing of paper acceptance...

whedon added the accepted and published labels Jan 5, 2022

whedon commented Jan 5, 2022

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦


whedon commented Jan 5, 2022

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited 👉 Creating pull request for 10.21105.joss.03739 joss-papers#2866
  2. Wait a couple of minutes, then verify that the paper DOI resolves https://doi.org/10.21105/joss.03739
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? Notify your editorial technical team...


kthyng commented Jan 5, 2022

Congrats on your new publication @sfsinger19103! Many thanks to editor @ajstewartlang and reviewers @vaneseltine and @andrewheiss for your time, hard work, and expertise!!

kthyng closed this as completed Jan 5, 2022

whedon commented Jan 5, 2022

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.03739/status.svg)](https://doi.org/10.21105/joss.03739)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.03739">
  <img src="https://joss.theoj.org/papers/10.21105/joss.03739/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.03739/status.svg
   :target: https://doi.org/10.21105/joss.03739

This is how it will look in your documentation:

[DOI badge]

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing either one (or both) of the following:
