scraper_senate-lobbying-disclosures

Scripts to download and process quarterly Lobbying Disclosure Act reports from the United States Senate.

Requirements

Python 3.x — brew install python
Pipenv — brew install pipenv

What's in here

.env.example: Sample configuration variables.
scrape_lda_filings.py: The principal code of this repo, which pulls these disclosure reports and related files. NOTE: This needs to be refactored into numerous smaller files.
utils/: Utilities called in scrape_lda_filings.py.
reports/: A folder that holds the downloaded reports for each quarter.

Getting started

First-time installation

Clone this repo and cd into it:

```
$ git clone git@github.com:The-Politico/scraper_senate-lobbying-disclosures.git
$ cd scraper_senate-lobbying-disclosures
```

Create a .env file with the following setting (see .env.example):
```
SENATE_LDA_API_KEY='token-goes-here'
```
Set up a Python 3 virtual environment and install the dependencies:
```
$ pipenv install --dev
```

Updating your local project

After pulling someone else's changes from GitHub, you may need to bring your local virtual environment back in line with the updated dependencies.

Use pipenv sync to install the exact package versions pinned in Pipfile.lock (run this from the project directory):
```
$ pipenv install --dev
$ pipenv sync
```

Configuration

The following configuration is automatically read from a .env file in the project's root.

| Variable | What it does |
| --- | --- |
| `SENATE_LDA_API_KEY` | Required: An API key from the Senate LDA site, used to request data from their systems. Sign up at this link, or use the INT's existing key as listed in the password manager. |
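
As an illustration of how a script might pick this setting up, here is a minimal sketch that reads the key from the process environment and fails early with a readable message. It assumes the variable has already been loaded into the environment (pipenv loads a project's .env file when you use `pipenv run` or `pipenv shell`). The helper name `get_api_key` is hypothetical and is not taken from this repo.

```python
import os


def get_api_key(environ=None):
    """Return the Senate LDA API key, or raise if it is missing.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    key = env.get("SENATE_LDA_API_KEY", "").strip()
    # Treat the placeholder from .env.example as "not configured".
    if not key or key == "token-goes-here":
        raise RuntimeError(
            "SENATE_LDA_API_KEY is not set. Copy .env.example to .env "
            "and replace the placeholder with a real key."
        )
    return key
```

Checking for the placeholder value as well as an empty string catches the common case where .env.example was copied but never edited.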

Capturing a new quarterly report

For now, run the following command (replacing 2020 and Q4 with your desired year and quarter):

```
$ pipenv run python -c \
    'from scrape_lda_filings import scrape_lda_filings; filings = scrape_lda_filings("2020", "Q4")'
```
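
If you need several quarters at once, the one-liner above could be wrapped in a small script. The sketch below is one possible shape, not part of this repo: `validate_period` and `capture_quarters` are hypothetical helpers, and the only thing taken from the source is that `scrape_lda_filings` accepts a year string and a quarter string such as "2020" and "Q4". The accepted quarter labels Q1 through Q4 are an assumption, and whatever the function returns is stored untouched.

```python
import re

# Assumed set of quarter labels; the README only shows "Q4".
VALID_QUARTERS = ("Q1", "Q2", "Q3", "Q4")


def validate_period(year, quarter):
    """Normalize and check a (year, quarter) pair before scraping."""
    year = str(year)
    quarter = str(quarter).upper()
    if not re.fullmatch(r"\d{4}", year):
        raise ValueError(f"Year must be four digits, got {year!r}")
    if quarter not in VALID_QUARTERS:
        raise ValueError(f"Quarter must be one of {VALID_QUARTERS}, got {quarter!r}")
    return year, quarter


def capture_quarters(periods, scrape=None):
    """Run the scraper once per (year, quarter) pair.

    `scrape` defaults to the repo's scrape_lda_filings; it is imported
    lazily so this module can be loaded without the project installed.
    """
    if scrape is None:
        from scrape_lda_filings import scrape_lda_filings as scrape
    results = {}
    for year, quarter in periods:
        year, quarter = validate_period(year, quarter)
        results[(year, quarter)] = scrape(year, quarter)
    return results
```

Saved as, say, a file next to scrape_lda_filings.py, this could be called with `capture_quarters([("2020", "Q3"), ("2020", "Q4")])` under `pipenv run python`. Validating the inputs first means a typo fails before any requests are made.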
