[statistics] posterior, likelihood, prior, evidence, MLE, MAP

Myungchul Shin edited this page Sep 6, 2021 · 7 revisions
  • what do we want to do?

    • estimate a parameter S given evidence X, that is, estimate P(S | X).
    • for example, flip a coin 100 times and get 54 heads, 46 tails. then what are P(HEAD) and P(TAIL)?
    • evidence X => '54 heads, 46 tails'
    • parameter S => P(HEAD), P(TAIL) = 1 - P(HEAD)
  • bayesian inversion

P(S | X) = P(X | S) * P(S) / P(X)

P(S) : prior probability, our belief about S before seeing any data; for example, we might assume P(HEAD) = 0.4
P(S | X) : posterior probability, our belief about S updated after observing the evidence X
P(X | S) : likelihood of S,
           the probability of X given S;
           viewed as a function of S with X held fixed, it is written L(S | X),
           the likelihood of S given that X was observed
P(X) : evidence probability (observation probability); it does not depend on S, so it acts as a fixed normalizing constant

so, `posterior` is proportional to `likelihood * prior`
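the `posterior ∝ likelihood * prior` relation can be sketched numerically. this is a minimal illustration, not part of the original notes: it assumes a discrete grid of candidate values for P(HEAD) and a uniform prior, and normalizes by P(X) to get a proper posterior.

```python
# candidate values for the parameter S = P(HEAD), on a discrete grid
# (an illustrative assumption; the real parameter space is continuous on [0, 1])
s_grid = [i / 100 for i in range(1, 100)]

# prior P(S): uniform over the grid (another illustrative assumption)
prior = {s: 1 / len(s_grid) for s in s_grid}

# likelihood P(X | S) for the evidence X = '54 heads, 46 tails'
likelihood = {s: s**54 * (1 - s)**46 for s in s_grid}

# posterior P(S | X) = P(X | S) * P(S) / P(X); P(X) is the normalizing constant
evidence = sum(likelihood[s] * prior[s] for s in s_grid)
posterior = {s: likelihood[s] * prior[s] / evidence for s in s_grid}

best = max(posterior, key=posterior.get)
print(best)  # with a uniform prior the posterior peaks at 54/100
```

with a uniform prior the prior term is the same for every candidate, so the posterior peak coincides with the likelihood peak; a non-uniform prior would pull the peak toward the prior's belief.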

  • example
for example, let's get back to our coin example.
we have evidence X = '54 heads, 46 tails'.
 
as our prior belief, let's assume P(HEAD) = 0.4, i.e.
S0 = {P(HEAD) = 0.4, P(TAIL) = 0.6}

and let's pick a candidate parameter from the parameter search space ( 0 <= P(HEAD) <= 1 ):
S1 = {P(HEAD) = 0.3, P(TAIL) = 0.7}

P(X | S1) = P(HEAD)^54 * P(TAIL)^46 = 0.3^54 * 0.7^46
(ignoring the binomial coefficient, which does not depend on S and so does not affect the argmax)

P(S1 | X) ∝ P(X | S1) * P(S1), where P(S1) is the prior probability our belief assigns to the candidate S1

this likelihood should be a very small value.
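just how small can be checked directly. a quick sketch (not part of the original notes) that also shows the usual trick of working in log space, since products of many probabilities underflow for larger data sets:

```python
import math

p_head, p_tail = 0.3, 0.7            # candidate parameter S1
likelihood = p_head**54 * p_tail**46  # P(X | S1), ignoring the binomial coefficient
print(likelihood)                     # a very small value, around 1e-36

# log-likelihood avoids underflow and turns the product into a sum
log_likelihood = 54 * math.log(p_head) + 46 * math.log(p_tail)
print(log_likelihood)
```

because log is monotonic, maximizing the log-likelihood gives the same argmax as maximizing the likelihood itself.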
anyway, we want to find the best parameter among the available parameters (the search space).
there are two kinds of techniques.

MAP (Maximum A Posteriori)
S' = argmax{ P(S | X) } = argmax{ P(X | S) * P(S) } over all S
(P(X) does not depend on S, so it can be dropped from the argmax)
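a sketch of MAP by grid search. the prior here is a hypothetical choice for illustration: a Beta(40, 60)-shaped weight, one way to encode the belief that P(HEAD) is around 0.4 while still letting the data move the estimate.

```python
s_grid = [i / 100 for i in range(1, 100)]

# illustrative prior peaked at 0.4: Beta(40, 60)-shaped weights
# (a hypothetical encoding of the prior belief P(HEAD) = 0.4)
prior = {s: s**39 * (1 - s)**59 for s in s_grid}

# likelihood P(X | S) for X = '54 heads, 46 tails'
likelihood = {s: s**54 * (1 - s)**46 for s in s_grid}

# MAP: S' = argmax P(X | S) * P(S); the P(X) term is dropped
s_map = max(s_grid, key=lambda s: likelihood[s] * prior[s])
print(s_map)  # lands between 0.40 (the prior's peak) and 0.54 (the data's answer)
```

the MAP estimate is a compromise: the stronger the prior (larger Beta pseudo-counts) or the smaller the data set, the closer S' stays to the prior's peak.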

MLE (Maximum Likelihood Estimation)
S' = argmax{ P(X | S) } over all S
in our coin example, finding S' is very easy: it can be computed directly from the observed data.
P(HEAD) = 54/100, P(TAIL) = 46/100
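why the closed form works: setting the derivative of the log-likelihood 54*log(s) + 46*log(1-s) to zero gives 54/s = 46/(1-s), i.e. s = 54/100. a small sketch (not from the original notes) checking that grid search agrees:

```python
s_grid = [i / 100 for i in range(1, 100)]

# MLE by grid search: S' = argmax P(X | S); note there is no prior term
s_mle = max(s_grid, key=lambda s: s**54 * (1 - s)**46)

# closed form for coin flips: heads / total flips
print(s_mle, 54 / 100)  # both give 0.54
```

this matches the MAP result only when the prior is uniform; with an informative prior the two estimates differ.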
