simplestatistics


simple-statistics for Python.

simplestatistics is compatible with Python 2 & 3.

Installation

Install the current PyPI release:

pip install simplestatistics

Or install the development version from GitHub:

pip install git+https://github.com/sheriferson/simplestatistics

Usage

>>> import simplestatistics as ss
>>> ss.mean([1, 2, 3])
2.0
>>> ss.t_test([1, 2, 2.4, 3, 0.9], 2)
-0.3461277235039039
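
The t_test value above can be checked by hand. The following is a minimal sketch, not the library's implementation. It assumes the statistic is the usual one-sample t computed with the sample (n - 1) standard deviation, and the helper name one_sample_t is hypothetical.

```python
import math

def one_sample_t(data, mu):
    """One-sample t statistic: (mean - mu) / (s / sqrt(n)).

    s is the sample standard deviation (n - 1 in the denominator).
    """
    n = len(data)
    mean = sum(data) / float(n)
    # Sample variance: squared deviations from the mean, divided by n - 1.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    standard_error = math.sqrt(variance) / math.sqrt(n)
    return (mean - mu) / standard_error

t = one_sample_t([1, 2, 2.4, 3, 0.9], 2)
print(round(t, 6))  # -0.346128
```

Under that assumption the sketch agrees with the value shown in the usage example.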

Documentation

You can read the documentation online.

Or you can generate it yourself by running the following inside simplestatistics/:

make html

Documentation will be generated in _build/html/.

Tests

To run all doctests and see test coverage:

pip3 install -r requirements.txt
pytest simplestatistics --doctest-modules --cov=simplestatistics
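
The --doctest-modules flag tells pytest to collect the examples embedded in docstrings. As a rough illustration of what such a test looks like, the sketch below runs one docstring's examples with the standard library's doctest module. The midrange function is hypothetical and not part of the package.

```python
import doctest

def midrange(data):
    """
    Return the midpoint between the smallest and largest values.

    >>> midrange([1, 2, 3])
    2.0
    >>> midrange([-3, 0, 5])
    1.0
    """
    return (min(data) + max(data)) / 2.0

# Parse the examples out of the docstring and run them, which is roughly
# what a doctest collector does for every function in a module.
parser = doctest.DocTestParser()
test = parser.get_doctest(midrange.__doc__, {"midrange": midrange}, "midrange", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)  # 0 2
```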

The code adheres to PEP 8 guidelines, except that the following pylint checkers are disabled:

  • invalid-name
  • len-as-condition
  • superfluous-parens
  • unidiomatic-typecheck

To lint the code, make sure you have pylint installed (pip install pylint), cd into the simplestatistics/statistics directory, then run:

pylint -d 'invalid-name, len-as-condition, superfluous-parens, unidiomatic-typecheck' *.py

Functions and examples

Descriptive statistics

Function Example
Min min([-3, 0, 3])
Max max([1, 2, 3])
Sum sum([1, 2, 3.5])
Quantiles quantile([3, 6, 7, 8, 8, 9, 10, 13, 15, 16, 20], [0.25, 0.75])
Product product([1.25, 2.75], [2.5, 3.40])

Measures of central tendency

Function Example
Mean mean([1, 2, 3])
Median median([10, 2, -5, -1])
Mode mode([2, 1, 3, 2, 1])
Geometric mean geometric_mean([1, 10])
Harmonic mean harmonic_mean([1, 2, 4])
Root mean square root_mean_square([1, -1, 1, -1])
Add to mean add_to_mean(40, 4, (10, 12))
Skewness skew([1, 2, 5])
Kurtosis kurtosis([1, 2, 3, 4, 5])

Measures of dispersion

Function Example
Sample and population variance variance([1, 2, 3], sample = True)
Sample and population standard deviation standard_deviation([1, 2, 3], sample = True)
Sample and population coefficient of variation coefficient_of_variation([1, 2, 3], sample = True)
Interquartile range interquartile_range([1, 3, 5, 7])
Sum of Nth power deviations sum_nth_power_deviations([-1, 0, 2, 4], 3)
Sample and population standard scores (z-scores) z_scores([-2, -1, 0, 1, 2], sample = True)

Linear regression

Function Example
Simple linear regression linear_regression([1, 2, 3, 4, 5], [4, 4.5, 5.5, 5.3, 6])
Linear regression line function generator <http://simplestatistics.readthedocs.io/en/latest/#linear-regression-line-function> linear_regression_line([.5, 9.5])([1, 2, 3])
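
As a rough sketch of what these two rows compute (not the package's code), ordinary least squares for a single predictor can be written in plain Python. The helper names fit_line and line_function are hypothetical, and the (slope, intercept) ordering of the coefficient pair is an assumption.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for paired observations."""
    n = float(len(x))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the co-deviation of x and y divided by the squared deviation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def line_function(coefficients):
    """Return a function that maps a list of x values onto the line."""
    slope, intercept = coefficients  # assumed ordering: (slope, intercept)
    return lambda xs: [slope * xi + intercept for xi in xs]

slope, intercept = fit_line([1, 2, 3, 4, 5], [4, 4.5, 5.5, 5.3, 6])
print(round(slope, 2), round(intercept, 2))  # 0.48 3.62
predicted = line_function([.5, 9.5])([1, 2, 3])
print(predicted)  # [10.0, 10.5, 11.0]
```

Returning a closure keeps the fitted coefficients together with the prediction step, so the line can be reused on new x values without refitting.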

Similarity

Function Example
Correlation correlate([2, 1, 0, -1, -2, -3, -4, -5], [0, 1, 1, 2, 3, 2, 4, 5])
Covariance covariance([1,2,3,4,5,6], [6,5,4,3,2,1])

Distributions

Function Example
Factorial factorial(20) or factorial([1, 5, 20])
Choose choose(5, 3)
Normal distribution normal(4, 8, 2) or normal([1, 4], 8, 2)
Binomial distribution binomial(4, 12, 0.2) or binomial([3,4,5], 12, 0.5)
Bernoulli distribution bernoulli(0.25)
Poisson distribution poisson(3, [0, 1, 2, 3])
Gamma function gamma_function([1, 2, 3, 4, 5])
Beta distribution beta([.1, .2, .3], 5, 2)
One-sample t-test t_test([1, 2, 3, 4, 5, 6], 3.385)
Chi Squared Distribution Table chi_squared_dist_table(k = 10, p = .01)

Classifiers

Function Example
Naive Bayesian classifier See documentation for examples of how to train and classify.
Perceptron See documentation for examples of how to train and classify.

Errors

Function Example
Gauss error function error_function(1)

Hyperbolic functions

Function Example
sinh sinh(2)
cosh cosh(2.5)
tanh tanh(.2)

Spirit and rules

  • Everything should be implemented in raw, organic, locally sourced Python.
  • Use libraries only if you have to, and only when they are unrelated to the math/statistics. For example, from functools import reduce makes reduce available for those using Python 3. That’s okay, because it’s about making Python work and not about making the stats easier.
  • It’s okay to use operators and functions if they correspond to regular calculator buttons. For example, all calculators have a built-in square root function, so there is no need to implement that ourselves; we can use math.sqrt(). Anything beyond that, such as mean or median, we have to write ourselves.
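
The rules above can be illustrated with a short sketch. These are hypothetical helpers written for this README, not the package's own implementations: reduce is imported only to make Python 3 work, math.sqrt stands in for a calculator button, and the median is written by hand.

```python
import math
from functools import reduce  # reduce is not a builtin on Python 3

def product_sketch(data):
    """Multiply all values together using reduce."""
    return reduce(lambda a, b: a * b, data, 1)

def median_sketch(data):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(data)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2.0

def root_mean_square_sketch(data):
    """Square root of the mean of the squared values."""
    return math.sqrt(sum(x * x for x in data) / float(len(data)))

print(product_sketch([1.25, 2.75]))             # 3.4375
print(median_sketch([10, 2, -5, -1]))           # 0.5
print(root_mean_square_sketch([1, -1, 1, -1]))  # 1.0
```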

Pull requests are welcome!

Contributors