A simple web crawler built with Scrapy. It crawls Wikipedia's list of countries and extracts each country's name along with its ISO 3166-1 alpha-2 and alpha-3 codes. It then follows the link for each country and extracts its subdivisions (regions) together with their corresponding ISO 3166-2 codes.
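For orientation, a minimal sketch of how such a Scrapy spider might be structured is shown below. This is not the spider shipped in this repository; the CSS selectors and field names are illustrative assumptions about the Wikipedia page markup.

```python
import scrapy


class CountryCodesSpider(scrapy.Spider):
    """Hypothetical sketch of a country-codes spider, not this repository's actual code."""

    name = "codes"  # matches the `scrapy crawl codes` command used below
    start_urls = ["https://en.wikipedia.org/wiki/ISO_3166-1"]

    def parse(self, response):
        # Each row of the country table carries the short name, the
        # alpha-2 and alpha-3 codes, and a link to the country's
        # ISO 3166-2 page, which lists its subdivisions.
        for row in response.css("table.wikitable tr"):
            cells = row.css("td")
            if len(cells) < 5:
                continue
            item = {
                "name": cells[0].css("a::text").get(),
                "alpha2": cells[1].css("::text").get("").strip(),
                "alpha3": cells[2].css("::text").get("").strip(),
            }
            subdivisions_href = cells[4].css("a::attr(href)").get()
            if subdivisions_href:
                # Follow the link to the country's ISO 3166-2 page to
                # collect its regions before yielding the item.
                yield response.follow(
                    subdivisions_href,
                    callback=self.parse_subdivisions,
                    cb_kwargs={"item": item},
                )
            else:
                yield item

    def parse_subdivisions(self, response, item):
        # Collect each region's ISO 3166-2 code and name from the
        # country's subdivisions table (selectors are illustrative).
        regions = []
        for row in response.css("table.wikitable tr"):
            code = row.css("td:nth-child(1) ::text").get()
            name = row.css("td:nth-child(2) a::text").get()
            if code and name:
                regions.append({"code": code.strip(), "name": name})
        item["regions"] = regions
        yield item
```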
All of the extracted data is exported to a JSON file.
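The exact structure depends on the repository's item definitions, but based on the fields described above the output might look roughly like this (field names are illustrative):

```json
[
  {
    "name": "United States",
    "alpha2": "US",
    "alpha3": "USA",
    "regions": [
      {"name": "California", "code": "US-CA"},
      {"name": "New York", "code": "US-NY"}
    ]
  }
]
```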
Requirements:
- Python 3.5+
- Scrapy
Usage:
- Install Scrapy following its documentation
- Clone this repository
- From the repository directory, run `scrapy crawl codes`
Note that the crawler will not overwrite the existing country_codes.json output file; it will append to it. You may therefore want to back up the previous output first by renaming it.
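For example, a small helper along these lines could rename the previous output before a new crawl (the filename comes from the note above; the timestamped backup name is just one convention):

```python
from datetime import datetime
from pathlib import Path

output = Path("country_codes.json")
if output.exists():
    # Move the previous crawl's output aside so the new run starts
    # from an empty file instead of appending to old data.
    backup = output.with_name(f"country_codes-{datetime.now():%Y%m%d-%H%M%S}.json")
    output.rename(backup)
```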