A web crawler project for real estate websites without an API
This project is designed to build multiple spiders, each with its own settings, to crawl and gather real estate data from a range of very basic websites, including some that neither make their data available through an API nor allow scraping.
-Database: So far, a MySQL database has been created to programmatically manage and index URLs. Later on, all real estate data will be added either to this database or to a separate one.
-Spiders: Only one spider has been added so far, and it crawls one specific domain. More spiders will be created, mainly to cover additional domains.
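
As a rough illustration of the "one spider per domain, each with its own settings" idea, the sketch below uses only the Python standard library. It is not the project's actual spider: the names `SpiderConfig`, `extract_links`, and `delay_seconds`, and the `example.com` domain, are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


@dataclass(frozen=True)
class SpiderConfig:
    """Hypothetical per-spider settings; field names are assumptions."""
    name: str
    allowed_domain: str
    start_url: str
    delay_seconds: float = 1.0  # illustrative politeness delay, unused below


class _LinkParser(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def extract_links(config, page_url, html):
    """Return absolute, de-duplicated links that stay on the spider's domain."""
    parser = _LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)
        same_domain = urlsplit(absolute).hostname == config.allowed_domain
        if same_domain and absolute not in links:
            links.append(absolute)
    return links
```

With this shape, adding a spider for another domain would mostly mean creating another `SpiderConfig`; the fetching step is deliberately left out of the sketch.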
Much remains to be defined.
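
The URL index mentioned under Database could look roughly like the sketch below. It is only an assumption about the design: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, and the table name `urls`, its columns, and the helper functions are hypothetical rather than the project's real schema.

```python
import hashlib
import sqlite3
from urllib.parse import urlsplit

# Hypothetical schema: sqlite3 stands in for MySQL, names are illustrative.
SCHEMA = """
CREATE TABLE IF NOT EXISTS urls (
    url_hash TEXT PRIMARY KEY,
    url      TEXT NOT NULL,
    domain   TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT 'pending'
)
"""


def open_index(path=":memory:"):
    """Open (or create) the URL index and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_url(conn, url):
    """Index a URL once; return True if it was new, False if already known."""
    url_hash = hashlib.sha256(url.encode("utf-8")).hexdigest()
    domain = urlsplit(url).hostname or ""
    cur = conn.execute(
        "INSERT OR IGNORE INTO urls (url_hash, url, domain) VALUES (?, ?, ?)",
        (url_hash, url, domain),
    )
    conn.commit()
    return cur.rowcount == 1


def next_pending(conn, limit=10):
    """Return up to `limit` URLs that have not been crawled yet."""
    rows = conn.execute(
        "SELECT url FROM urls WHERE status = 'pending' ORDER BY rowid LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]


def mark_crawled(conn, url):
    """Flag a URL as crawled so it is not handed out again."""
    conn.execute("UPDATE urls SET status = 'crawled' WHERE url = ?", (url,))
    conn.commit()
```

Hashing the URL for the primary key keeps the uniqueness check cheap even for long URLs; the same idea should carry over to MySQL with a unique index, though the exact SQL would differ.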