Analysing word frequency in documents and books from Project Gutenberg.
The Pareto principle (also known as the 80/20 rule, the law of the vital few, or the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes.
In this project, I visualise word frequencies in the Gutenberg corpus and check how well they follow the Pareto principle and Zipf's law, which states that a word's frequency is roughly inversely proportional to its frequency rank.
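
As a rough illustration of what such a check could look like, here is a minimal Python sketch using only the standard library. It is not the project's actual code: the function names `word_counts`, `top_share`, and `zipf_slope` are hypothetical, and the tokenisation (lowercase letters with an optional internal apostrophe) is an assumption that will not suit every text.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens; the token pattern is an assumed simplification."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def top_share(counts, fraction=0.2):
    """Share of all word occurrences covered by the most frequent `fraction` of distinct words."""
    freqs = sorted(counts.values(), reverse=True)
    k = max(1, math.ceil(len(freqs) * fraction))
    return sum(freqs[:k]) / sum(freqs)


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Under an idealised Zipf's law the slope is close to -1.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

To try it, read a plain-text book into a string, pass it to `word_counts`, and then call `top_share` and `zipf_slope` on the result. A `top_share` value near 0.8 would be in line with the Pareto principle, and a slope near -1 would be in line with Zipf's law.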