tigeorgia/GeorgiaCorporationScraper

This is a scraper for the corporate registry of the country of Georgia. It is implemented in Python, using the excellent Scrapy framework.

Although it still has bugs, this scraper already does significantly more than our old scraper, so please use this one from now on.

Installation

Should be pretty simple:

  1. virtualenv geo_corp_scrape
  2. cd geo_corp_scrape
  3. source bin/activate, then clone the repo
  4. cd into the repo folder and pip install -r requirements.txt
  5. cp settings.py.example settings.py and edit to suit.
  6. Install poppler using your system's package manager (it is not installed by pip)
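
After step 6 it can help to confirm that poppler's command-line tools are actually visible on your PATH. The following is a minimal sketch, not part of the scraper: it assumes the scraper relies on poppler utilities such as pdftotext (the README only says to install poppler), and the function name find_missing_tools is hypothetical.

```python
import shutil


def find_missing_tools(tools=("pdftotext", "pdftohtml")):
    """Return the names in `tools` that cannot be found on PATH.

    The default tool names are an assumption: both ship with poppler's
    utilities, but this README does not say which ones the scraper calls.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing poppler tools:", ", ".join(missing))
    else:
        print("poppler tools found")
```

If anything is reported missing, reinstall poppler (on some systems the tools live in a separate utilities package) and run the check again from inside the activated virtualenv.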

Usage

Run scrapy crawl corps. That's it. You should get a series of JSON files containing the scraped data.
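
Once the crawl has finished, the JSON files can be read back with the standard library. The sketch below is only an illustration under stated assumptions: the README does not say where the files are written or how each one is laid out, so the directory argument is yours to supply, and the loader accepts either a single JSON document per file or one JSON object per line. The function name load_scraped_records is hypothetical.

```python
import json
from pathlib import Path


def load_scraped_records(output_dir):
    """Collect records from every *.json file in `output_dir`.

    Assumption: each file is either one JSON document (an object or an
    array of objects) or JSON Lines (one object per line). The real
    layout depends on the scraper's settings.
    """
    records = []
    for path in sorted(Path(output_dir).glob("*.json")):
        text = path.read_text(encoding="utf-8").strip()
        if not text:
            continue  # skip empty files
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            # Fall back to JSON Lines: parse each non-empty line separately.
            data = [json.loads(line) for line in text.splitlines() if line.strip()]
        if isinstance(data, list):
            records.extend(data)
        else:
            records.append(data)
    return records
```

Calling load_scraped_records with the directory that holds the scraper's output returns a flat list of records that you can inspect or pass on to other tools.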
