The project draws inferences from three case studies: Wholesale Customer Data (Store Sales), University Survey Data, and Manufacturing Shingles Data. The analysis applies descriptive statistics, probability and probability distributions, and estimation and hypothesis testing.
- Created a summary table for the numerical variables.
- Identified that the region that spent the most is 'Other' and the region that spent the least is 'Oporto'; the channel that spent the most is 'Hotel' and the channel that spent the least is 'Retail'.
- Examined how spending on the 6 item varieties behaves across all regions and channels.
- Identified that the item 'Fresh' shows the most inconsistent behaviour and the item 'Delicatessen' shows the least inconsistent behaviour.
- Identified the outliers in the dataset.
- Created contingency tables and computed probabilities for different combinations of categories to gain better insight into the data.
- Analyzed distribution graphs to better understand the numerical variables.
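The descriptive steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's notebook code. It assumes that "inconsistent behaviour" is measured with the coefficient of variation and that outliers are flagged with the 1.5 x IQR rule; the summary does not state either choice. The function names and the sample records are hypothetical.

```python
import statistics
from collections import Counter

# Made-up illustrative records; NOT the project's data.
records = [
    {"Region": "Other", "Channel": "Hotel", "Fresh": 12000},
    {"Region": "Other", "Channel": "Retail", "Fresh": 3000},
    {"Region": "Oporto", "Channel": "Hotel", "Fresh": 9000},
    {"Region": "Oporto", "Channel": "Retail", "Fresh": 1500},
    {"Region": "Other", "Channel": "Hotel", "Fresh": 60000},
]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (a unitless spread)."""
    return statistics.stdev(values) / statistics.mean(values)


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def contingency(rows, row_key, col_key):
    """Count the records for each (row_key, col_key) combination."""
    return Counter((r[row_key], r[col_key]) for r in rows)


fresh = [r["Fresh"] for r in records]
print("CV of Fresh:", coefficient_of_variation(fresh))
print("Outliers in Fresh:", iqr_outliers(fresh))

table = contingency(records, "Region", "Channel")
# Joint probability of one combination = its count / total records.
print("P(Other and Hotel):", table[("Other", "Hotel")] / len(records))
```

A higher coefficient of variation means spending on that item varies more relative to its mean, which makes items with very different spending levels comparable.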
Hypothesis testing on the shingles data indicates that:
- Mean moisture content of Shingle type A is not within permissible limits.
- Mean moisture content of Shingle type B is within permissible limits.
- There is no statistically significant difference between the population means of Shingles A and B.
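The tests behind these conclusions can be sketched as follows. This is a hedged illustration, not the project's code: it assumes a one-sample t-test of each mean against the permissible limit and a pooled-variance two-sample t-test for A versus B, which the summary does not confirm. The function names, the sample values, and the `limit` value are hypothetical placeholders.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    return (mean - mu0) / (sd / math.sqrt(n)), n - 1


def two_sample_t_pooled(a, b):
    """Return (t, df) for H0: mean(a) == mean(b), assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


# Made-up moisture readings and a placeholder limit; NOT the project's data.
shingle_a = [0.31, 0.36, 0.29, 0.33, 0.35, 0.30]
shingle_b = [0.27, 0.30, 0.26, 0.31, 0.28, 0.29]
limit = 0.35  # hypothetical permissible limit

print("A vs limit:", one_sample_t(shingle_a, limit))
print("B vs limit:", one_sample_t(shingle_b, limit))
print("A vs B:", two_sample_t_pooled(shingle_a, shingle_b))
```

Each t statistic is then compared with the t distribution for the given degrees of freedom to obtain a p-value. In a notebook, a statistics library such as SciPy (`scipy.stats.ttest_1samp`, `scipy.stats.ttest_ind`) would typically handle that step.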
Skills: Descriptive Statistics, Probability, Estimation, Hypothesis Testing.
Tools: Jupyter Notebook (Python), MS Word.