-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lobbyist employer scraper #31
Conversation
|
||
|
||
@click.command() | ||
@click.option("--employer", "is_employer_scrape", is_flag=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used a flag to hook into the existing filing scraper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking legit! One suggestion inline, and a question about how you organized your abstract base class and inheriting classes.
scrapers/lobbyist/scrape_filings.py
Outdated
"ClientVersionID": version, | ||
} | ||
|
||
def scrape(self, id, version): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this call signature different from the abstract base class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't figure out a clean way for the LobbyistScraper
base class to reference the url
attribute in a child class.
Turns out you can't stack @classmethod and @property to define a static class attribute as of 3.11.
If I'm overlooking something, please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about stacking @property
and @abstractmethod
? https://docs.python.org/3/library/abc.html#abc.abstractproperty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
@hancush was there any other feedback you had for this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing inline. Are these years hard coded elsewhere? If so, you can open an issue to address this separately and bring this in as is.
|
||
|
||
class IndependentExpenditureScraper(scrapelib.Scraper): | ||
election_years = ("2021", "2022", "2023", "2024") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably have this be a range from 2021 to the current year (or maybe the current year plus one) so it auto-updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea-- yes, they're also hard coded in scrape_offices.py
. I'll open an issue.
Overview
This PR adds a lobbyist employer scraper to the existing lobbyist scraper in
scrapers/lobbyist/scrape_filings.py
. The after scraping both individual lobbyist filings and lobbyist employer filings, the pipeline is unchanged.Depends on #30
Closes #29
Testing
Run
make data/processed/lobbyist_expenditures.csv