In this repository I explore statistical concepts and selected exercises using Python. My objective is to break down concepts in such a way that readers with a high school knowledge of math and statistics can easily follow along and learn something useful.
You will need Python 3.x and the following packages:
- numpy
- pandas
- scipy
- matplotlib
- statsmodels
Almost all scripts are presented in Jupyter notebook format.
In a Medium article on statistical significance, we explore how to determine whether your results are significant. This requires examining both the invariant metrics, which should not differ between the control and experiment groups, and the evaluation metrics, which the experiment is expected to change.
Practical significance is another topic for statisticians to consider: can your experiment be done in the real world? This is covered in this Medium article.
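To make the invariant versus evaluation metric distinction concrete, here is a minimal sketch, not the code from the article. It assumes a hypothetical A/B test with made-up visitor and conversion counts, an intended 50/50 traffic split, and a 0.05 significance level. The helper `two_proportion_ztest` is an illustrative name, not a library function.

```python
import math
from scipy import stats


def two_proportion_ztest(x_control, n_control, x_treatment, n_treatment):
    """Two-sided z-test for a difference in proportions (pooled standard error)."""
    p_control = x_control / n_control
    p_treatment = x_treatment / n_treatment
    p_pooled = (x_control + x_treatment) / (n_control + n_treatment)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_control + 1 / n_treatment))
    z = (p_treatment - p_control) / se
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value


# Hypothetical counts, chosen only for illustration.
n_control, n_treatment = 10_000, 10_100        # visitors per group
conv_control, conv_treatment = 1_000, 1_111    # conversions per group

# Invariant metric: group sizes should match the intended 50/50 split.
# Normal approximation to Binomial(n_total, 0.5).
n_total = n_control + n_treatment
z_inv = (n_control - 0.5 * n_total) / math.sqrt(n_total * 0.25)
p_inv = 2 * stats.norm.sf(abs(z_inv))

# Evaluation metric: conversion rate, compared between the two groups.
z_eval, p_eval = two_proportion_ztest(
    conv_control, n_control, conv_treatment, n_treatment
)

print(f"invariant check:   z = {z_inv:.2f}, p = {p_inv:.3f}")
print(f"evaluation metric: z = {z_eval:.2f}, p = {p_eval:.3f}")
```

With these assumed counts, the invariant check shows no significant imbalance between the groups, while the difference in conversion rate is significant at the 0.05 level. If the invariant check failed, the evaluation result would not be trustworthy.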
In this part we go over three statistical questions and how to solve them in Python. The flow is as follows:
- State the problem
- Set up the experiment
- Implement in code
- Validate the result analytically
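The flow above can be sketched on a small example. The question used here is an assumption chosen for illustration and is not necessarily one of the three covered in the article: what is the probability of getting at least 8 heads in 10 fair coin flips?

```python
import numpy as np
from scipy import stats

# State the problem: P(at least 8 heads in 10 fair coin flips).

# Set up the experiment: repeat the 10 flips many times.
rng = np.random.default_rng(seed=0)
n_trials = 200_000
n_flips = 10

# Implement in code: each row is one experiment, 1 = heads, 0 = tails.
flips = rng.integers(0, 2, size=(n_trials, n_flips))
heads = flips.sum(axis=1)
simulated = np.mean(heads >= 8)

# Validate analytically: P(X >= 8) = P(X > 7) for X ~ Binomial(10, 0.5).
analytic = stats.binom.sf(7, n_flips, 0.5)

print(f"simulated: {simulated:.4f}")
print(f"analytic:  {analytic:.4f}")
```

The analytic value is 56/1024, and the simulated estimate should land close to it. A large gap between the two usually points to a bug in the simulation or a mistake in the analytic setup.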
For more information, see this article.
In this part we go over the mathematical formulations behind PCA as outlined in Lindsay Smith's A Tutorial on Principal Component Analysis. The complete walkthrough can be found in this article.
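As a rough illustration of the covariance-based approach to PCA, here is a minimal NumPy sketch. It is not the notebook's code. The data is synthetic and generated only for this example, and the variable names are assumptions.

```python
import numpy as np

# Synthetic 2-D data with two correlated features (illustrative only).
rng = np.random.default_rng(seed=0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)
data = np.column_stack([x, y])

# 1. Subtract the mean of each feature.
centered = data - data.mean(axis=0)

# 2. Compute the covariance matrix (columns are features).
cov = np.cov(centered, rowvar=False)

# 3. Eigendecomposition; eigh is appropriate because cov is symmetric.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort components by decreasing eigenvalue (explained variance).
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 5. Project the centered data onto the principal components.
projected = centered @ eigenvectors

explained = eigenvalues / eigenvalues.sum()
print("explained variance ratio:", np.round(explained, 3))
```

After projection, the covariance matrix of `projected` is diagonal, with the eigenvalues on the diagonal. This is the sense in which the principal components are uncorrelated directions of decreasing variance.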