Lobbyist employer scraper #31

Merged
merged 4 commits into from
Oct 7, 2024

antidipyramid
Contributor

Overview

This PR adds a lobbyist employer scraper to the existing lobbyist scraper in scrapers/lobbyist/scrape_filings.py. After scraping both individual lobbyist filings and lobbyist employer filings, the rest of the pipeline is unchanged.

Depends on #30

Closes #29

Testing

Run make data/processed/lobbyist_expenditures.csv



@click.command()
@click.option("--employer", "is_employer_scrape", is_flag=True)
Contributor Author

Used a flag to hook into the existing filing scraper.

@antidipyramid antidipyramid marked this pull request as ready for review September 20, 2024 15:49
Member

@hancush hancush left a comment

Looking legit! One suggestion inline, and a question about how you organized your abstract base class and inheriting classes.

lobbyists.mk (outdated, resolved)
"ClientVersionID": version,
}

def scrape(self, id, version):
Member

Why is this call signature different from the abstract base class?

Contributor Author

I couldn't figure out a clean way for the LobbyistScraper base class to reference the url attribute in a child class.

Turns out stacking @classmethod and @property to define a class-level attribute is deprecated as of Python 3.11.

If I'm overlooking something, please let me know.

Member

@hancush hancush Sep 26, 2024

How about stacking @property and @abstractmethod? https://docs.python.org/3/library/abc.html#abc.abstractproperty

Contributor Author

Good idea!
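
For reference, a minimal sketch of the pattern suggested in this thread: stacking @property on top of @abstractmethod lets the base class read self.url while forcing every subclass to supply it. Only LobbyistScraper and the url attribute come from the discussion; the subclass names, the build_request helper, the parameter names, and the example.com URLs are hypothetical and not taken from this PR.

```python
from abc import ABC, abstractmethod


class LobbyistScraper(ABC):
    """Base class; every subclass must say which URL it scrapes."""

    @property
    @abstractmethod
    def url(self) -> str:
        """Endpoint for this scraper, supplied by subclasses."""

    def build_request(self, filing_id, version):
        # Shared logic can read self.url without knowing the subclass.
        # build_request is a hypothetical stand-in for the real scrape logic.
        return {"url": self.url, "params": {"id": filing_id, "version": version}}


class IndividualFilingScraper(LobbyistScraper):
    # A plain class attribute is enough to override the abstract property.
    url = "https://example.com/lobbyist-filings"  # placeholder URL


class EmployerFilingScraper(LobbyistScraper):
    # Overriding with a concrete property works as well.
    @property
    def url(self) -> str:
        return "https://example.com/employer-filings"  # placeholder URL
```

With this shape, the base class cannot be instantiated directly (Python raises TypeError), and both subclasses can share one call signature because the URL no longer has to be passed in or looked up per method.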

@antidipyramid
Contributor Author

@hancush was there any other feedback you had for this?

Member

@hancush hancush left a comment

One thing inline. Are these years hard coded elsewhere? If so, you can open an issue to address this separately and bring this in as is.



class IndependentExpenditureScraper(scrapelib.Scraper):
election_years = ("2021", "2022", "2023", "2024")
Member

We should probably have this be a range from 2021 to the current year (or maybe the current year plus one) so it auto-updates.

Contributor Author

Good idea. Yes, they're also hard-coded in scrape_offices.py. I'll open an issue.
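
For the follow-up issue, one possible shape for the auto-updating range discussed here, as a rough sketch. The function name, the FIRST_ELECTION_YEAR constant, and the years_ahead parameter are assumptions for illustration; only the 2021 start year and the string format come from the tuple in the diff.

```python
from datetime import date

FIRST_ELECTION_YEAR = 2021  # earliest year in the hard-coded tuple


def election_years(today=None, years_ahead=1):
    """Return year strings from 2021 through the current year plus years_ahead."""
    today = today or date.today()
    last_year = today.year + years_ahead
    # range() excludes its end point, so add 1 to include last_year.
    return tuple(str(year) for year in range(FIRST_ELECTION_YEAR, last_year + 1))
```

Passing today explicitly keeps the helper easy to test, and setting years_ahead=0 reproduces the current tuple for any date in 2024.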

@antidipyramid antidipyramid merged commit fb438dd into main Oct 7, 2024
@hancush hancush deleted the feature/29-lobbyist-employer branch October 15, 2024 23:41
Development

Successfully merging this pull request may close these issues.

Scrape lobbyist employer expenditures
2 participants