(1)

Non-Parametric Methods

+ Multiclass Classification

CS540-002, Spring 2015 Lecture 41

(2)

● Remember WHW3

● We have a final exam coming up

○ Thursday, May 14

○ 5:05 - 7:05 PM

○ You can bring TWO sheets (8.5x11) of notes

○ Exam is cumulative, but focuses on material from after the midterm.

(3)

Today:

● Non-Parametric Methods

(4)

Recall: Parametric Methods

With previous methods, we produce a model:

● DTs: A tree

● Linear Regression: A vector*

● Perceptrons: A vector*

● Logistic Regression: A vector*

(5)

Non-Parametric Methods

k-Nearest Neighbors classification:

Basic idea:

Given a test point x, find the k closest points in the training set.
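A minimal sketch of the idea in Python. The slide stops at finding the neighbors, so the prediction rule here (a majority vote over the neighbors' labels) is an assumption, as are the Euclidean distance, the function name `knn_classify`, and the toy data.

```python
import math
from collections import Counter

def knn_classify(x, training_set, k=3):
    """Predict a label for x by majority vote among its k nearest training points.

    training_set is a list of (point, label) pairs; points are tuples of numbers.
    """
    # Brute force: sort all training examples by Euclidean distance to x.
    by_distance = sorted(training_set, key=lambda pair: math.dist(x, pair[0]))
    # Count the labels of the k closest examples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up training data for illustration.
train = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"), ((0.9, 1.1), "cat"),
         ((5.0, 5.0), "dog"), ((5.2, 4.9), "dog"), ((4.8, 5.1), "dog")]

print(knn_classify((1.1, 1.0), train, k=3))
```

Sorting the whole training set costs O(n log n) per query, which is what motivates the faster data structures discussed for the implementation.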

(6)

[Figure: plot; recovered labels: x, x2]

(7)

kNN: Implementation

The “learning” phase of kNN is putting the data into a structure such that queries are fast.

For example:

● A k-d Tree
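As a rough illustration of what such a structure might look like, here is a small k-d tree with a 1-nearest-neighbor query. The dict-based node layout and the names `build_kdtree` and `nearest` are choices made for this sketch, not something given in the lecture.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested dicts; returns None for an empty list."""
    if not points:
        return None
    axis = depth % len(points[0])            # cycle through the coordinates
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                   # the median point becomes this node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to query (Euclidean distance)."""
    if node is None:
        return best
    if best is None or math.dist(query, node["point"]) < math.dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best so far.
    if abs(diff) < math.dist(query, best):
        best = nearest(far, query, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))   # (8, 1)
```

The pruning test in `nearest` is what makes queries fast on well-spread data: whole subtrees are skipped when they cannot contain a closer point.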

(8)

kNN: A Potential Problem

[Figure: plot with an axis labeled Height (Feet)]

(9)

kNN: A Potential Problem

[Figure: plot with an axis labeled Height (cm)]

(10)

Data Normalization:

Basic Idea:

The units used shouldn’t make a difference. So, convert everything to a “unitless” measure. For the ith feature:

Let m_i be the mean of that feature (that we observed), and let s_i be the standard deviation.

Replace x_i^(j) by (x_i^(j) - m_i) / s_i for every j.
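The rescaling above can be sketched as follows. The slide does not say whether the standard deviation is the population or the sample version; this sketch assumes the population version, and the function name and data are made up for illustration.

```python
import statistics

def zscore_columns(rows):
    """Rescale each feature (column) to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Population standard deviation (an assumption); requires every s_i > 0.
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]

# Height in feet and weight in pounds (made-up values).
data = [(5.0, 100.0), (6.0, 200.0), (5.5, 150.0)]
scaled = zscore_columns(data)
for row in scaled:
    print(row)
```

Because each feature is divided by its own spread, measuring height in feet or in centimeters leads to the same rescaled values, up to rounding.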

(11)

Data Standardization:

Basic Idea:

The units used shouldn’t make a difference. So, convert everything to a “unitless” measure. For the ith feature:

Let m_i be the max of that feature (that we observed).

Let n_i be the min of that feature.

Let r_i be m_i - n_i.

Replace x_i^(j) with (x_i^(j) - n_i) / r_i for every j.
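A short sketch of this min/max rescaling, which maps every feature into the range [0, 1]. The function name and the height values are illustrative assumptions.

```python
def minmax_columns(rows):
    """Rescale each feature (column) so its observed min maps to 0 and max to 1."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # r_i = max - min; requires every feature to take at least two distinct values.
    ranges = [max(col) - min(col) for col in columns]
    return [
        tuple((value - lo) / r for value, lo, r in zip(row, lows, ranges))
        for row in rows
    ]

heights_ft = [(5.0,), (6.0,), (5.5,)]
heights_cm = [(152.4,), (182.88,), (167.64,)]   # the same heights in centimeters

print(minmax_columns(heights_ft))   # [(0.0,), (1.0,), (0.5,)]
print(minmax_columns(heights_cm))   # the same values, up to rounding
```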

(12)

Non-Parametric Regression

(13)

Non-parametric Regression:

Piecewise Linear Non-Parametric Regression

Given a query x and training set T:

Let (L, y(L)) be the largest* point in T such that L ≤ x.

Let (R, y(R)) be the smallest* point in T such that R ≥ x.

h(x; T) = α y(R) + (1 - α) y(L)
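The definition of α did not survive in the text. The sketch below assumes α = (x - L) / (R - L), the fraction of the way from L to R, which is the choice that makes h pass through both bracketing points. It handles scalar inputs only; the function name and data are hypothetical.

```python
def piecewise_linear(x, training_set):
    """Interpolate linearly between the two training points that bracket x.

    training_set is a list of (input, output) pairs with scalar inputs.
    """
    left = max((p for p in training_set if p[0] <= x), key=lambda p: p[0], default=None)
    right = min((p for p in training_set if p[0] >= x), key=lambda p: p[0], default=None)
    if left is None or right is None:
        raise ValueError("x lies outside the range of the training inputs")
    (L, y_L), (R, y_R) = left, right
    if L == R:                       # x coincides with a training input
        return y_L
    alpha = (x - L) / (R - L)        # assumed definition of alpha
    return alpha * y_R + (1 - alpha) * y_L

T = [(0.0, 0.0), (2.0, 4.0), (3.0, 1.0)]
print(piecewise_linear(1.0, T))   # 2.0
print(piecewise_linear(2.5, T))   # 2.5
```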

(14)

Non-Parametric Regression:

[Figure: plot of y illustrating α]

(15)

Non-parametric Regression: kNN Averaging

Given a query x and training set T:

Let (x1, y1), …, (xk, yk) be the k closest* points to x in T.

h(x ; T) = mean(y1, …, yk)
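A brute-force sketch of kNN averaging for scalar inputs. Absolute difference as the distance, the function name, and the data are assumptions made for the example.

```python
import statistics

def knn_average(x, training_set, k=3):
    """Predict the mean output of the k training points whose inputs are closest to x."""
    nearest = sorted(training_set, key=lambda p: abs(p[0] - x))[:k]
    return statistics.mean([y for _, y in nearest])

T = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (10.0, 100.0)]
print(knn_average(2.1, T, k=3))   # 4.0: the far-away point at x = 10 is ignored
```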

(16)
(17)

Non-parametric Regression: kNN (Linear) Regression

Given a query x and training set T:

Let (x1, y1), …, (xk, yk) be the k closest* points to x in T.

Let:
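The formula for this slide is missing from the text. One common reading, assumed here, is to fit an ordinary least-squares line to the k nearest points and evaluate that line at the query. The sketch covers scalar inputs only, and all names in it are hypothetical.

```python
def knn_linear_regression(x, training_set, k=3):
    """Fit a least-squares line to the k nearest training points and evaluate it at x."""
    nearest = sorted(training_set, key=lambda p: abs(p[0] - x))[:k]
    n = len(nearest)
    mean_x = sum(px for px, _ in nearest) / n
    mean_y = sum(py for _, py in nearest) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in nearest)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in nearest)
    slope = sxy / sxx                 # assumes the neighbors do not all share one input
    intercept = mean_y - slope * mean_x
    return slope * x + intercept

# Points on y = 2x + 1, plus one far-away outlier that the neighbors exclude.
T = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (10.0, 0.0)]
print(knn_linear_regression(1.5, T, k=3))   # approximately 4.0
```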

(18)
(19)

Non-parametric Regression:

Given a query point x, a training set T, and a kernel K:

Compute a weight vector w for an (imaginary) dataset where each point is weighted according to K.

Now return h_w(x)
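A sketch of this kernel-weighted fit for scalar inputs. The slide leaves the kernel K unspecified; a Gaussian kernel with bandwidth 1 is assumed here, and the function names are hypothetical. The line is fitted by weighted least squares, so points near the query dominate the fit.

```python
import math

def gaussian_kernel(distance, bandwidth=1.0):
    """Weight that decays smoothly with distance from the query (assumed kernel)."""
    return math.exp(-(distance ** 2) / (2 * bandwidth ** 2))

def locally_weighted_regression(x, training_set, kernel=gaussian_kernel):
    """Fit a weighted least-squares line around x and evaluate it at x."""
    weights = [kernel(abs(px - x)) for px, _ in training_set]
    total = sum(weights)
    # Weighted means of inputs and outputs.
    mean_x = sum(w * px for w, (px, _) in zip(weights, training_set)) / total
    mean_y = sum(w * py for w, (_, py) in zip(weights, training_set)) / total
    sxx = sum(w * (px - mean_x) ** 2 for w, (px, _) in zip(weights, training_set))
    sxy = sum(w * (px - mean_x) * (py - mean_y)
              for w, (px, py) in zip(weights, training_set))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * x + intercept

line = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]   # y = 2x + 1
print(locally_weighted_regression(1.5, line))   # approximately 4.0
```

Unlike the kNN variants, every training point contributes to every query, just with a weight that shrinks with distance.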

(20)
(21)

LWLR:

(22)

Advantages (of Non-Parametric Methods):

● They can easily leverage locality in the data.

○ Example: Suppose we can separate Small Cats from Small Dogs using some particular decision boundary, but a completely different boundary works better for Big Cats vs Big Dogs.

(23)

Issues in the Real World:

Suppose we’re classifying cats vs dogs.

(24)

Multi-Class Classification

● Not to be confused with Multi-Label Classification

● No big difference from the binary case for some algorithms

○ kNN

(25)

Multiclass Classification

With Linear Models: Perceptrons

(26)

Multiclass Classification

Training:

Augment labels to be: Class 1 / Not Class 1

Learn Perceptron h1

Repeat for Class 2, Class 3, ...

(27)

Multiclass Classification

Testing:

Given x, compute h1(x), h2(x), ...

Take whichever one says “Positive”.
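The training and testing procedure can be sketched as below. The perceptron update rule is the standard one; the epoch limit, the function names, and the toy data are assumptions. At test time the sketch returns every class whose perceptron says "Positive", so an empty list or a list with several entries is possible on harder data.

```python
def train_perceptron(points, labels, epochs=1000):
    """Standard perceptron rule; labels are +1 / -1. Returns weights with the bias last."""
    w = [0.0] * (len(points[0]) + 1)
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            xb = list(x) + [1.0]                       # constant feature for the bias
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if y * score <= 0:                         # misclassified: nudge w toward y
                w = [wi + y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:                              # a clean pass means convergence
            break
    return w

def train_one_vs_rest(points, classes):
    """Learn one perceptron per class: that class is +1, every other class is -1."""
    models = {}
    for c in sorted(set(classes)):
        relabeled = [1 if label == c else -1 for label in classes]
        models[c] = train_perceptron(points, relabeled)
    return models

def positive_classes(models, x):
    """Return every class whose perceptron says 'Positive' for x."""
    xb = list(x) + [1.0]
    return [c for c, w in models.items()
            if sum(wi * xi for wi, xi in zip(w, xb)) > 0]

points = [(0, 5), (1, 6), (5, 0), (6, 1), (-5, -5), (-6, -6)]
classes = ["bird", "bird", "plane", "plane", "frog", "frog"]
models = train_one_vs_rest(points, classes)
print(positive_classes(models, (0, 5)))   # ['bird']
```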

(28)

Multiclass Classification

With Linear Models: Perceptrons

[Figure: plot; recovered labels: x, x2]

(29)

Multiclass Classification

Problem:

x might be neither bird, nor plane, nor even frog (all Perceptrons say Negative).

When your data isn’t linearly separable, multiple Perceptrons might say Positive.

Solution?

(30)

An Additional Problem:

Suppose there are C classes.

(31)

1-of-k Encoding:

Point   x1   x2   y
A        4   -1   2
B       -1    5   3
C       -1   -2   1
D       -1    4   1

Point   y1   y2   y3
A        0    1    0
B        0    0    1
C        1    0    0
D        1    0    0
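The mapping from the label column y to the indicator columns y1, y2, y3 can be written in a few lines. The function name `one_of_k` is chosen for this sketch; the labels are the ones from the table.

```python
def one_of_k(label, num_classes):
    """Encode an integer label in 1..num_classes as a 0/1 indicator vector."""
    return [1 if label == c else 0 for c in range(1, num_classes + 1)]

labels = {"A": 2, "B": 3, "C": 1, "D": 1}
encoded = {name: one_of_k(y, 3) for name, y in labels.items()}
for name, bits in encoded.items():
    print(name, bits)
```

Each column of the encoded table is then the label vector for one binary classifier.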

(32)

1-of-k Encoding:

We’re using 1 bit per class. We can do better.
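One way to see that fewer bits can suffice: C distinct classes only need ceil(log2 C) bits to receive distinct binary codes. The sketch below assigns such codes in list order. This particular assignment and the function name are assumptions for illustration; the class names are borrowed from the examples later in the lecture.

```python
def binary_codes(classes):
    """Assign each class a distinct binary code using ceil(log2(C)) bits."""
    n_bits = max(1, (len(classes) - 1).bit_length())
    return {c: [int(b) for b in format(i, f"0{n_bits}b")]
            for i, c in enumerate(classes)}

codes = binary_codes(["Car", "Truck", "Cat", "Dog"])
for name, bits in codes.items():
    print(name, bits)
# Car [0, 0], Truck [0, 1], Cat [1, 0], Dog [1, 1]
```

With this assignment, four classes need two binary classifiers (one per bit) instead of four.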

(33)

1-of-k Encoding:

(34)

Fancier Encoding:

(35)

1-of-k Encoding: Example

Classifying Cars vs Trucks vs Cats vs Dogs

Classifier 1: Car or Not Car

Classifier 2: Truck or Not Truck

Classifier 3: Cat or Not Cat

(36)

Fancier Encoding: Example

Classifying Cars vs Trucks vs Cats vs Dogs

Classifier 1: Car/Truck vs Cat/Dog

Classifier 2: Car vs Truck

(37)

Fancier Encoding: Example

Classifying Cars vs Trucks vs Cats vs Dogs

Classifier 1: Car/Cat vs Truck/Dog

Classifier 2: Car vs Cat
