Lecture 2: Learning to Answer Yes/No

1. Perceptron Hypothesis Set -- hyperplanes/linear classifiers in R^d

Input: x ∈ X = R^d (a feature vector). Output: y ∈ Y = {+1, -1}.

Hypothesis: h(x) = sign(w^T x), i.e. a hyperplane/linear classifier, with the threshold absorbed into w via an extra constant coordinate x_0 = 1.
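A minimal NumPy sketch of this hypothesis set (the function name `perceptron_predict` and the sign(0) → +1 convention are my own choices, not from the lecture):

```python
import numpy as np

def perceptron_predict(w, X):
    """Perceptron hypothesis h(x) = sign(w^T x).

    X is an (N, d+1) array whose first column is the constant
    feature x_0 = 1, so the threshold is already folded into w.
    """
    scores = X @ w                         # w^T x for every example
    return np.where(scores >= 0, 1, -1)    # map scores to labels {+1, -1}
```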

2. Perceptron Learning Algorithm (PLA) -- correct mistakes and improve iteratively

Start from some w_0 (say, 0) and 'correct' its mistakes on D.

Update rule: on iteration t, find an example (x_n, y_n) that w_t misclassifies, i.e. sign(w_t^T x_n) ≠ y_n, and update w_{t+1} = w_t + y_n x_n. Repeat until no mistakes remain.
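A compact NumPy sketch of PLA under these conventions (the function name `pla` and the `max_iters` safety cap are my own additions; the lecture's algorithm simply loops until no mistakes remain):

```python
import numpy as np

def pla(X, y, w0=None, max_iters=10_000):
    """Perceptron Learning Algorithm: start from w0 and correct mistakes.

    X: (N, d+1) array with x_0 = 1 prepended; y: (N,) labels in {+1, -1}.
    Halts when no example is misclassified (guaranteed only if the data
    is linearly separable); max_iters is just a safety cap.
    """
    w = np.zeros(X.shape[1]) if w0 is None else w0.astype(float)
    for _ in range(max_iters):
        preds = np.where(X @ w >= 0, 1, -1)
        mistakes = np.flatnonzero(preds != y)
        if mistakes.size == 0:          # no mistakes left: halt
            break
        n = mistakes[0]                 # pick one misclassified example
        w = w + y[n] * X[n]             # correct it: w <- w + y_n x_n
    return w
```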

3. Guarantee of PLA -- no mistakes eventually if linearly separable

There are two cases: linearly separable and not linearly separable.


If linearly separable:

  1. As t increases, w_t gradually gets closer (in angle) to the target weights w_f.

Each update uses a misclassified example (x_n, y_n). First we need the inequality:

w_f^T w_{t+1} = w_f^T (w_t + y_n x_n) ≥ w_f^T w_t + min_n y_n w_f^T x_n > w_f^T w_t    (1)

so the inner product between w_f and w_t keeps growing. Next we need the inequality (the update only happens on a mistake, so y_n w_t^T x_n ≤ 0):

||w_{t+1}||^2 = ||w_t + y_n x_n||^2 ≤ ||w_t||^2 + max_n ||y_n x_n||^2    (2)

so the length of w_t grows slowly.

  2. Prove that the number of updates is bounded.

From (1), applied over T updates starting from w_0 = 0, we can see:

w_f^T w_T ≥ T · min_n y_n w_f^T x_n

From (2), applied over T updates starting from w_0 = 0, we can see:

||w_T||^2 ≤ T · max_n ||x_n||^2

Combining the two, the following inequality can be obtained for the cosine of the angle between w_f and w_T:

(w_f^T w_T) / (||w_f|| · ||w_T||) ≥ √T · constant

where constant = ρ / R, with ρ = min_n y_n w_f^T x_n / ||w_f|| and R^2 = max_n ||x_n||^2.

Hence proved: the left-hand side is a cosine and therefore at most 1, so T ≤ R^2 / ρ^2, i.e. PLA halts after a bounded number of updates.
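As a sanity check, one can generate linearly separable data from a known w_f, run PLA from w_0 = 0, and confirm the number of updates stays below R^2 / ρ^2. This toy example (the particular w_f, the data range, and the variable names are my own assumptions, not from the lecture) is only an illustration of the bound:

```python
import numpy as np

rng = np.random.default_rng(0)
w_f = np.array([-0.3, 1.0, -0.5])                  # a known separating weight vector (bias first)
X = np.c_[np.ones(200), rng.uniform(-1, 1, (200, 2))]
y = np.where(X @ w_f >= 0, 1, -1)                  # labels generated by w_f, so D is separable

R2 = np.max(np.sum(X ** 2, axis=1))                # R^2 = max_n ||x_n||^2
rho = np.min(y * (X @ w_f)) / np.linalg.norm(w_f)  # rho = min_n y_n w_f^T x_n / ||w_f||

w, T = np.zeros(3), 0                              # run PLA from w_0 = 0 and count updates
while True:
    mistakes = np.flatnonzero(np.where(X @ w >= 0, 1, -1) != y)
    if mistakes.size == 0:
        break
    n = mistakes[0]
    w = w + y[n] * X[n]
    T += 1

print(T, "<=", R2 / rho ** 2)                      # the bound T <= R^2 / rho^2
```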

4. Non-Separable Data -- hold somewhat 'best' weights in pocket

More about PLA

  • Guarantee: as long as D is linearly separable and PLA corrects a mistake each round
    • the inner product of w_f and w_t grows fast; the length of w_t grows slowly
    • the PLA 'line' becomes more and more aligned with w_f => PLA halts
  • Pros: simple to implement, fast, works in any dimension d
  • Cons:
    • 'assumes' D is linearly separable in order to halt
    • not fully sure how long halting takes

If not linearly separable:

Modify the PLA algorithm by keeping the best weights seen so far 'in the pocket' (the Pocket Algorithm): after each PLA-style update, compare the new weights with the pocket weights on all of D and keep whichever makes fewer mistakes; after enough iterations, return the pocket weights.

Pocket is noticeably slower than plain PLA, because every update requires evaluating the error of the new weights on the entire dataset D to decide whether to replace the pocket weights.
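A minimal NumPy sketch of this pocket idea (the function name `pocket`, the random choice of which mistake to correct, and the `max_updates` budget are my own illustrative choices):

```python
import numpy as np

def pocket(X, y, max_updates=1000, rng=None):
    """Pocket algorithm: PLA-style updates, but remember the best weights so far.

    X: (N, d+1) with x_0 = 1; y: (N,) labels in {+1, -1}. Because the data
    may not be linearly separable, we stop after max_updates and return the
    weights with the fewest training mistakes seen so far.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    best_w = w.copy()
    best_err = np.mean(np.where(X @ w >= 0, 1, -1) != y)
    for _ in range(max_updates):
        mistakes = np.flatnonzero(np.where(X @ w >= 0, 1, -1) != y)
        if mistakes.size == 0:
            return w                      # perfectly separated: done
        n = rng.choice(mistakes)          # correct one (random) mistake, PLA-style
        w = w + y[n] * X[n]
        err = np.mean(np.where(X @ w >= 0, 1, -1) != y)  # full-data error: the slow part
        if err < best_err:                # better than the pocket? replace it
            best_w, best_err = w.copy(), err
    return best_w
```

The per-update error evaluation over the whole dataset is exactly why pocket runs slower than PLA, which only needs to find a single mistake per round.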