Solvve/ml_lottery_eda


Lottery-EDA

License Python 3.7 Solvve

Description

Exploratory data analysis of a four-digit lottery.

A fair lottery presumes that every number has the same chance of winning. When the winning numbers are not evenly distributed, it signals that the draws are not random. That, in turn, means the process is biased, whether deliberately or through some unknown flaw, toward drawing certain numbers more often than others. Such a lottery is not fair.

The first step is to figure out how random the winning numbers are. To assess their randomness, machine learning experts start with statistical analysis. The goal is to see whether the numbers fit a uniform distribution, meaning that every number has the same odds of being drawn.
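One simple way to check uniformity is a chi-square goodness-of-fit test on digit frequencies. The sketch below is illustrative only and is not taken from the repository's code: it assumes each draw is an integer from 0 to 9999, pools the four digits of every draw, and compares the statistic with the 5% critical value for 9 degrees of freedom. The function name `digit_chi_square` and the simulated draws are hypothetical.

```python
import random
from collections import Counter

# Upper 5% critical value of the chi-square distribution with 9 degrees of freedom.
CHI2_CRIT_DF9_ALPHA05 = 16.919


def digit_chi_square(draws):
    """Chi-square statistic for the pooled digit counts of four-digit draws.

    Assumes every draw is an integer in 0..9999 (an assumption of this sketch).
    """
    digits = Counter()
    for number in draws:
        # Zero-pad so that, for example, 42 contributes the digits 0, 0, 4, 2.
        digits.update("{:04d}".format(number))
    total = sum(digits.values())
    expected = total / 10.0  # a fair lottery gives each digit an equal share
    return sum(
        (digits.get(str(d), 0) - expected) ** 2 / expected for d in range(10)
    )


if __name__ == "__main__":
    rng = random.Random(0)  # simulated draws, used only to exercise the code
    simulated = [rng.randrange(10000) for _ in range(2000)]
    stat = digit_chi_square(simulated)
    verdict = "reject" if stat > CHI2_CRIT_DF9_ALPHA05 else "do not reject"
    print("chi-square = {:.3f}: {} uniformity at the 5% level".format(stat, verdict))
```

A statistic above the critical value suggests that some digits appear more often than a uniform draw would allow. Pooling digits hides position-specific bias, so running the same check on each digit position separately may be a useful follow-up.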

There are several criteria for testing randomness: the extreme points criterion, the Foster-Stuart criterion, and Spearman’s rank correlation coefficient. They help reveal whether the time series of draws has features that point to patterns in which numbers win more often.
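As a rough illustration, two of these criteria can be sketched with the standard library alone. The code assumes the draws are given as a list of numbers in chronological order. It implements the extreme points criterion in its common form as the turning point test, and Spearman's coefficient as the rank correlation between draw order and drawn value. The function names are hypothetical and not taken from the repository, and both z-scores rely on large-sample normal approximations.

```python
import math


def turning_point_z(series):
    """Count local extrema and return (count, z-score) under the randomness null."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    count = sum(
        1
        for i in range(1, n - 1)
        if series[i - 1] < series[i] > series[i + 1]
        or series[i - 1] > series[i] < series[i + 1]
    )
    mean = 2.0 * (n - 2) / 3.0           # expected turning points for a random series
    variance = (16.0 * n - 29.0) / 90.0  # variance of the count under the null
    return count, (count - mean) / math.sqrt(variance)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_trend(series):
    """Spearman correlation between draw order and value, with an approximate z-score."""
    n = len(series)
    time_ranks = [float(i) for i in range(1, n + 1)]
    value_ranks = average_ranks(series)
    mean_t = sum(time_ranks) / n
    mean_v = sum(value_ranks) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(time_ranks, value_ranks))
    var_t = sum((t - mean_t) ** 2 for t in time_ranks)
    var_v = sum((v - mean_v) ** 2 for v in value_ranks)
    if var_v == 0:
        raise ValueError("all values are identical; correlation is undefined")
    rho = cov / math.sqrt(var_t * var_v)
    return rho, rho * math.sqrt(n - 1)
```

For a random series both z-scores should stay close to zero. An absolute value above roughly 1.96 would hint, at the 5% level, at too many or too few local extrema in the first case, or at a monotonic trend over time in the second.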

The second step is to identify the distribution. This is another way to see which numbers tend to win more often and which are less common among the winning numbers. There are three ways to identify the distribution: by histogram, by skewness and kurtosis, and by probability grid. All of them help to describe the data and locate anomalies.
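The histogram and the skewness and kurtosis checks can be sketched as follows. This is a minimal example, not the repository's implementation: it assumes draws are integers from 0 to 9999, and the names `histogram` and `skew_and_excess_kurtosis` are hypothetical. For an ideal uniform distribution the bins are equally filled, the skewness is 0, and the excess kurtosis is close to -1.2.

```python
import random


def histogram(draws, bins=10, upper=10000):
    """Count draws in equal-width bins over 0..upper-1 (assumed range of the lottery)."""
    counts = [0] * bins
    for number in draws:
        counts[number * bins // upper] += 1
    return counts


def skew_and_excess_kurtosis(draws):
    """Return (skewness, excess kurtosis) computed from central moments."""
    n = len(draws)
    mean = sum(draws) / float(n)
    m2 = sum((x - mean) ** 2 for x in draws) / n
    m3 = sum((x - mean) ** 3 for x in draws) / n
    m4 = sum((x - mean) ** 4 for x in draws) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    rng = random.Random(0)  # simulated draws, used only to exercise the code
    simulated = [rng.randrange(10000) for _ in range(5000)]
    for index, count in enumerate(histogram(simulated)):
        # A crude text histogram: one '#' per 25 draws in the bin.
        print("{:04d}-{:04d} {}".format(index * 1000, index * 1000 + 999, "#" * (count // 25)))
    skew, kurt = skew_and_excess_kurtosis(simulated)
    print("skewness = {:.3f}, excess kurtosis = {:.3f}".format(skew, kurt))
```

Bins that stand well above or below the others, a skewness far from 0, or an excess kurtosis far from -1.2 would each be a reason to look more closely at the draws.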