Scrapy-Basic

In this scraping project, the site 'https://quotes.toscrape.com/' was scraped. The following actions were performed:

  • We logged into the site by extracting the CSRF token from the login form and submitting it with the login request
  • We scraped the following data:
    1. Quotes
    2. Author
    3. Tags
  • We cleaned some of the quotes with a regex to strip unwanted Unicode characters.
  • We extracted the data into a JSON file named 'quotes'.
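
The steps above can be sketched roughly as follows. This is not the project's spider: it is a minimal, standard-library-only illustration of the same ideas, written without Scrapy so it can be run on its own. The hidden field name `csrf_token`, the choice of curly quotation marks as the characters to strip, the output path `quotes.json`, and all function and class names here are assumptions made for the example.

```python
import json
import re
from html.parser import HTMLParser


class CsrfTokenParser(HTMLParser):
    """Collects the value of an <input name="csrf_token"> element.

    The field name "csrf_token" is an assumption for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("name") == "csrf_token":
            self.token = attributes.get("value")


def extract_csrf_token(html):
    """Return the CSRF token found in the login page HTML, or None."""
    parser = CsrfTokenParser()
    parser.feed(html)
    return parser.token


def clean_quote(text):
    """Strip curly quotation marks (assumed to be the unwanted Unicode)."""
    return re.sub(r"[\u201c\u201d]", "", text).strip()


def export_quotes(items, path="quotes.json"):
    """Write the scraped items to a JSON file (hypothetical default path)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(items, handle, ensure_ascii=False, indent=2)
```

In a Scrapy spider, the token would typically be read from the login response and sent back together with the credentials before the quote pages are requested; the cleaning and export functions would then be applied to each item with its quote, author, and tags.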

Keep in mind that the site used does not load its data dynamically, and the parsed data was not stored in any database (SQL / NoSQL).