
An 'end to end' data science project analysing the letters between two German poets. Includes a simple scraper for getting raw data, data cleaning, preprocessing, analysis and visualisation.


Ma-Fi-94/letters


Letters

A small data science project I recently started for fun, analysing the letters between two famous German poets: J. W. v. Goethe and J. C. F. v. Schiller :).

Contents

  • scrape.py downloads 14 HTML files from Projekt Gutenberg (www.projekt-gutenberg.org) containing ~1000 letters exchanged between Goethe and Schiller.
  • preprocess.py extracts the letter numbers, writers and contents from the raw HTML files and writes them to a single CSV file, which is used for further analysis.
  • all_letters.csv is this CSV file.
  • The Jupyter notebooks show the results of the analyses.
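
As a rough illustration of the preprocessing step, here is a minimal sketch using only the Python standard library. It is not the actual preprocess.py. It assumes, purely for illustration, that each letter starts with an `<h3>` heading of the form "12. An Schiller" (number and addressee), that the letter text follows in `<p>` paragraphs, and that the writer is whoever is not the addressee. The column names `number`, `writer` and `content` and the helpers `parse_letters` and `write_csv` are hypothetical as well; the real HTML layout and CSV header may differ.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Hypothetical heading format: "<number>. An <addressee>", e.g. "12. An Schiller".
HEADING = re.compile(r"^\s*(\d+)\.\s*An\s+(Goethe|Schiller)\s*$")
# Assumption: the writer is the correspondent who is not the addressee.
OTHER = {"Goethe": "Schiller", "Schiller": "Goethe"}


class LetterParser(HTMLParser):
    """Collects one row per letter from <h3> headings and <p> paragraphs."""

    def __init__(self):
        super().__init__()
        self.letters = []  # finished rows: number, writer, content
        self._tag = None   # block tag currently being read ("h3" or "p")
        self._buf = []     # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag in ("h3", "p"):
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # ignore inline tags such as <i> inside a paragraph
        text = " ".join("".join(self._buf).split())  # normalise whitespace
        self._tag = None
        if tag == "h3":
            match = HEADING.match(text)
            if match:
                self.letters.append({
                    "number": int(match.group(1)),
                    "writer": OTHER[match.group(2)],
                    "content": "",
                })
        elif self.letters and text:
            # Append the paragraph to the most recently started letter.
            row = self.letters[-1]
            row["content"] = (row["content"] + " " + text).strip()


def parse_letters(html):
    """Return a list of letter rows found in one HTML document."""
    parser = LetterParser()
    parser.feed(html)
    parser.close()
    return parser.letters


def write_csv(rows, out):
    """Write letter rows to an open text file object as CSV."""
    writer = csv.DictWriter(out, fieldnames=["number", "writer", "content"])
    writer.writeheader()
    writer.writerows(rows)
```

Under these assumptions, calling `parse_letters` on each downloaded HTML file and passing the combined rows to `write_csv` would yield one CSV file with a row per letter.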

Writeup

A writeup of the analyses and results can be found on my blog: https://mmfischer.de/003_letters/003_letters.html
