Structure and Predictions
Machine Learning: Jordan Boyd-Graber
University of Maryland
PERCEPTRON: SLIDES ADAPTED FROM LIANG HUANG
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 1 / 1
How do we set the feature weights?
Goal is to minimize errors
Want to reward features that lead to right answers
Penalize features that lead to wrong answers
Problem: predictions are correlated
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 2 / 1
Perceptron Algorithm
Rather than just counting up how often we see events?
We’ll use this for intuition in 2D case
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 3 / 1
Perceptron Algorithm
1: w ~ 1 ← ~ 0
2: for t ← 1 . . . T do
3: Receive x t
4: ˆ y t ← sgn ( w ~ t · ~ x t )
5: Receive y t
6: if ˆ y t 6 = y t then
7: w ~ t + 1 ← ~ w t + y t ~ x t
8: else
9: w ~ t + 1 ← w t
return w T + 1
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 4 / 1
Binary to Structure
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 5 / 1
Binary to Structure
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 5 / 1
Binary to Structure
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 5 / 1
Generic Perceptron
perceptron is the simplest machine learning algorithm
online-learning: one example at a time
learning by doing
find the best output under the current weights
update weights at mistakes
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 6 / 1
2D Example
Initially, weight vector is zero:
~
w 1 = 〈 0 , 0 〉 (1)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 7 / 1
Observation 1
x 1 = 〈− 2 , 2 〉 (2) ˆ
y 1 = 0 (3)
y 1 = + 1 (4)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 8 / 1
Update 1
~
w t + 1 ← ~ w t + y t x ~ t (5)
~
w 2 ← (6)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 9 / 1
Update 1
~
w t + 1 ← ~ w t + y t x ~ t (5)
~
w 2 ← 〈 0 , 0 〉 + 〈− 2 , 2 〉 (6) (7)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 9 / 1
Update 1
~
w t + 1 ← ~ w t + y t x ~ t (5)
~
w 2 ← 〈 0 , 0 〉 + 〈− 2 , 2 〉 (6)
~
w 2 = 〈− 2 , 2 〉 (7)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 9 / 1
Observation 2
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 10 / 1
Observation 2
x 2 = 〈− 2 , − 3 〉 (8) ˆ
y 2 = + 4 + − 6 = − 2 (9)
y 2 = − 1 (10)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 10 / 1
Update 2
~
w t + 1 ← ~ w t (11)
~
w 2 ← (12)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 11 / 1
Update 2
~
w t + 1 ← ~ w t (11)
~
w 2 ← 〈− 2 , 2 〉 (12)
(13)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 11 / 1
Update 2
~
w t + 1 ← ~ w t (11)
~
w 2 ← 〈− 2 , 2 〉 (12)
~
w 2 = 〈− 2 , 2 〉 (13)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 11 / 1
Observation 3
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 12 / 1
Observation 3
x 3 = 〈 2 , − 1 〉 (14) ˆ
y 3 = − 4 + − 2 = − 6 (15)
y 3 = + 1 (16)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 12 / 1
Update 3
~
w t + 1 ← ~ w t + y t x ~ t (17)
~
w 3 ← (18)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 13 / 1
Update 3
~
w t + 1 ← ~ w t + y t ~ x t (17)
~
w 3 ← 〈− 2 , 2 〉 + 〈 2 , − 1 〉 (18) (19)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 13 / 1
Update 3
~
w t + 1 ← ~ w t + y t ~ x t (17)
~
w 3 ← 〈− 2 , 2 〉 + 〈 2 , − 1 〉 (18)
~
w 3 = 〈 0 , 1 〉 (19)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 13 / 1
Observation 4
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 14 / 1
Observation 4
x 4 = 〈 1 , − 4 〉 (20) ˆ
y 4 = − 4 (21) y 4 = − 1 (22)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 14 / 1
Update 4
~
w 4 ← (23)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 15 / 1
Update 4
~
w 4 ← ~ w 3 (23)
(24)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 15 / 1
Update 4
~
w 4 ← ~ w 3 (23)
~
w 4 = 〈 0 , 1 〉 (24)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 15 / 1
Observation 5
x 5 = 〈 2 , 2 〉 (25) ˆ
y 5 = 2 (26) y 5 = + 1 (27)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 16 / 1
Update 5
~
w 5 ← (28)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 17 / 1
Update 5
~
w 5 ← ~ w 4 (28)
(29)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 17 / 1
Update 5
~
w 5 ← ~ w 4 (28)
~
w 5 = 〈 0 , 1 〉 (29)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 17 / 1
Observation 6
x 6 = 〈 2 , 2 〉 (30) ˆ
y 6 = 2 (31) y 6 = + 1 (32)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 18 / 1
Update 6
~
w 6 ← (33)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 19 / 1
Update 6
~
w 6 ← ~ w 5 (33)
(34)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 19 / 1
Update 6
~
w 6 ← ~ w 5 (33)
~
w 6 = 〈 0 , 1 〉 (34)
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 19 / 1
Structured Perceptron
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 20 / 1
Perceptron Algorithm
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 21 / 1
POS Example
Machine Learning: Jordan Boyd-Graber | UMD Structure and Predictions | 22 / 1