This is a simple web crawler, implemented in Python, that crawls the HTML pages of websites such as Wikipedia. It is currently optimized for German Wikipedia articles.
The URL to crawl can be passed directly to the script:
deep_crawl_wiki_article("https://de.wikipedia.org/wiki/Wikipedia")
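For orientation, here is a minimal sketch of how a depth-limited crawl of article links could work. It is not the actual implementation of deep_crawl_wiki_article; the names extract_article_links, fetch_html and crawl_breadth_first are hypothetical, and the link filter assumes that articles live under /wiki/ and that namespace pages (such as "Datei:" or "Spezial:") contain a colon in their title.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_article_links(html, base_url):
    """Return absolute URLs of article links found in html, in page order."""
    parser = _LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        href, _fragment = urldefrag(href)  # drop "#section" anchors
        # Assumption: articles start with /wiki/ and have no colon in the title.
        if not href.startswith("/wiki/") or ":" in href[len("/wiki/"):]:
            continue
        url = urljoin(base_url, href)
        if url not in links:
            links.append(url)
    return links


def fetch_html(url):
    """Download one page. A User-Agent is sent because some sites reject requests without one."""
    request = Request(url, headers={"User-Agent": "simple-wiki-crawler-example"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_breadth_first(start_url, max_depth=1, fetch=fetch_html):
    """Visit start_url and linked articles up to max_depth; return {url: html}."""
    pages = {}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in pages:
            continue  # already visited
        pages[url] = fetch(url)
        if depth < max_depth:
            for link in extract_article_links(pages[url], url):
                if link not in pages:
                    queue.append((link, depth + 1))
    return pages
```

In this sketch, calling crawl_breadth_first("https://de.wikipedia.org/wiki/Wikipedia", max_depth=1) would download the start article and every article it links to directly. The fetch parameter exists so that the download step can be swapped out, for example for a cached or rate-limited version.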
Inspired by: https://www.freecodecamp.org/news/scraping-wikipedia-articles-with-python/