
Performance Measures in Data Mining


Common Performance Measures used in Data Mining and Machine Learning Approaches

L. Richter J.M. Cejuela

Department of Computer Science Technische Universität München

Master Lab Course Data Mining, SS 2015, Jul 1st

Outline

Item Set and Association Rule Weights
  Simple Measures
  Complex Measures

Classification
  Basic Performance Measures
  Complex Measures – Performance Curves

Regression

Assessment Strategies

Item Set and Association Rule Weights

Simple Measures

Measures for Item Sets

Various algorithms can yield frequent item sets. From the frequent item sets c and c ∪ {i} you can derive the rule "if c then {i}". Typically there is only one item in the RHS (right-hand side) of the rule.

- Support (of an item set): $\mathrm{sup}(X \to Y) = \mathrm{sup}(Y \to X) = P(X \wedge Y)$, i.e. the fraction of transactions in the database that contain the item set

- Confidence (of a rule): $\mathrm{conf}(X \to Y) = P(Y \mid X) = P(X \wedge Y)/P(X) = \mathrm{sup}(X \to Y)/\mathrm{sup}(X)$
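A minimal sketch of support and confidence, assuming transactions are given as Python sets of items. The function names and the toy basket data are hypothetical and only serve as an illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, lhs, rhs):
    """conf(lhs -> rhs) = sup(lhs ∪ rhs) / sup(lhs)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


# Hypothetical toy database of five market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

sup_bm = support(baskets, {"bread", "milk"})        # 3 of 5 baskets
conf_b_m = confidence(baskets, {"bread"}, {"milk"})  # 3 of the 4 bread baskets
```

Support is symmetric in the two sides of the rule, while confidence depends on which side is the antecedent.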

Complex Measures

Measures for Item Sets (cont'd)

These measures express how frequent an item set is, or how interesting a rule is, compared with the occurrence expected if X and Y were independent:

- Leverage (of an item set): $\mathrm{lev}(X \to Y) = P(X \wedge Y) - P(X)P(Y)$

- Lift (of a rule): $\mathrm{lift}(X \to Y) = \mathrm{lift}(Y \to X) = P(X \wedge Y)/(P(X)P(Y)) = \mathrm{conf}(X \to Y)/\mathrm{sup}(Y) = \mathrm{conf}(Y \to X)/\mathrm{sup}(X)$

- Conviction (of a rule): similar to lift, but directed. It compares the probability that X appears without Y if they were independent with the observed frequency of X appearing without Y.

  $\mathrm{conviction}(X \to Y) = P(X)P(\neg Y)/P(X \wedge \neg Y) = (1 - \mathrm{sup}(Y))/(1 - \mathrm{conf}(X \to Y))$
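The three measures can be sketched directly from the probabilities. The parameter names `p_x`, `p_y` and `p_xy` (for P(X), P(Y) and P(X ∧ Y)) are assumptions made for this illustration, as is the choice to return infinity for a rule with confidence 1.

```python
import math


def leverage(p_x, p_y, p_xy):
    """Observed joint frequency minus the frequency expected under independence."""
    return p_xy - p_x * p_y


def lift(p_x, p_y, p_xy):
    """Ratio of observed joint frequency to the one expected under independence."""
    return p_xy / (p_x * p_y)


def conviction(p_x, p_y, p_xy):
    """(1 - sup(Y)) / (1 - conf(X -> Y)); infinite if the rule never fails."""
    conf = p_xy / p_x
    if conf == 1.0:
        return math.inf
    return (1.0 - p_y) / (1.0 - conf)


# Hypothetical values: P(X) = 0.8, P(Y) = 0.8, P(X and Y) = 0.6.
lev = leverage(0.8, 0.8, 0.6)
lft = lift(0.8, 0.8, 0.6)
conv = conviction(0.8, 0.8, 0.6)
```

In this example X and Y co-occur slightly less often than independence would predict, so leverage is negative and both lift and conviction are below 1.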


Supplementary Material

J-Measure

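- empirically observed accuracy of a rule

- Cross-entropy (measuring how well one distribution approximates another) between the binary variables φ and θ with vs. without conditioning on the event θ

\[
J(\theta \Rightarrow \varphi) = p(\theta) \left[ p(\varphi \mid \theta) \log \frac{p(\varphi \mid \theta)}{p(\varphi)} + \bigl(1 - p(\varphi \mid \theta)\bigr) \log \frac{1 - p(\varphi \mid \theta)}{1 - p(\varphi)} \right]
\]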
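A small sketch of the J-measure formula. The slide does not state the base of the logarithm; base 2 is assumed here, and the convention 0 · log 0 = 0 is applied. The function and parameter names are hypothetical.

```python
import math


def j_measure(p_theta, p_phi, p_phi_given_theta):
    """J(theta => phi) with base-2 logarithms (an assumption of this sketch)."""

    def term(p, q):
        # p * log2(p / q), using the convention 0 * log 0 = 0
        return 0.0 if p == 0.0 else p * math.log2(p / q)

    return p_theta * (
        term(p_phi_given_theta, p_phi)
        + term(1.0 - p_phi_given_theta, 1.0 - p_phi)
    )


# If conditioning on theta does not change the distribution of phi, J is zero.
j_uninformative = j_measure(0.5, 0.3, 0.3)
# A rule that makes phi certain, starting from p(phi) = 0.5.
j_certain = j_measure(0.5, 0.5, 1.0)
```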



Classification

Basic Performance Measures

Basic Building Blocks for Performance Measures

- True Positive (TP): positive instances predicted as positive
- True Negative (TN): negative instances predicted as negative
- False Positive (FP): negative instances predicted as positive
- False Negative (FN): positive instances predicted as negative
- Confusion Matrix (a is the positive class, b the negative class):

              Predicted a    Predicted b
    Real a    TP             FN
    Real b    FP             TN
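The four counts can be tallied from paired label lists. This is a minimal sketch; the function name and the toy labels are hypothetical, with class "a" treated as the positive class as in the matrix above.

```python
def confusion_counts(actual, predicted, positive):
    """Return (TP, FN, FP, TN) for a binary problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1  # positive predicted as positive
            else:
                fn += 1  # positive predicted as negative
        else:
            if p == positive:
                fp += 1  # negative predicted as positive
            else:
                tn += 1  # negative predicted as negative
    return tp, fn, fp, tn


# Hypothetical labels for eight test instances.
actual = ["a", "a", "a", "b", "b", "b", "b", "a"]
predicted = ["a", "b", "a", "b", "a", "b", "b", "a"]

tp, fn, fp, tn = confusion_counts(actual, predicted, positive="a")
```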

Performance Measures

\[
\text{Accuracy, } \mathrm{acc} = \frac{TP + TN}{TP + FN + FP + TN} = \frac{\text{number of correct predictions}}{\text{total number of predictions}}
\]

\[
\text{Error rate, } \mathrm{err} = \frac{FN + FP}{TP + FN + FP + TN} = 1 - \mathrm{acc} = \frac{\text{number of wrong predictions}}{\text{total number of predictions}}
\]
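As a quick illustration, accuracy and error rate computed from the four confusion-matrix counts. The function names and the example counts are hypothetical.

```python
def accuracy(tp, fn, fp, tn):
    """Share of correct predictions among all predictions."""
    return (tp + tn) / (tp + fn + fp + tn)


def error_rate(tp, fn, fp, tn):
    """Share of wrong predictions among all predictions."""
    return (fn + fp) / (tp + fn + fp + tn)


# Hypothetical counts: TP = 3, FN = 1, FP = 1, TN = 3.
acc = accuracy(3, 1, 1, 3)
err = error_rate(3, 1, 1, 3)
```

The two values always sum to 1, since every prediction is either correct or wrong.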

Performance Measures (cont'd)

\[
\text{True Positive Rate, TPR, Sensitivity} = \frac{TP}{TP + FN}
\]

\[
\text{True Negative Rate, TNR, Specificity} = \frac{TN}{TN + FP}
\]

\[
\text{False Positive Rate, FPR} = \frac{FP}{TN + FP}
\]

\[
\text{False Negative Rate, FNR} = \frac{FN}{TP + FN}
\]
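A sketch of the four rates computed from the confusion-matrix counts. The function name, the dictionary layout and the example counts are assumptions for illustration.

```python
def rates(tp, fn, fp, tn):
    """Return TPR, TNR, FPR and FNR as a dictionary."""
    return {
        "TPR": tp / (tp + fn),  # sensitivity
        "TNR": tn / (tn + fp),  # specificity
        "FPR": fp / (tn + fp),
        "FNR": fn / (tp + fn),
    }


# Hypothetical counts: 50 real positives, 50 real negatives.
r = rates(tp=40, fn=10, fp=5, tn=45)
```

TPR and FNR share the denominator TP + FN and sum to 1; the same holds for TNR and FPR with TN + FP.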

Performance Measures (cont'd)

\[
\text{Precision, } p = \frac{TP}{TP + FP}
\]

\[
\text{Recall, } r = \frac{TP}{TP + FN}
\]

\[
F_1\ \text{measure} = \frac{2rp}{r + p} = \frac{2 \times TP}{2 \times TP + FP + FN}
\]
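A short sketch of precision, recall and the F1 measure, using hypothetical counts. It also checks numerically that both forms of the F1 formula agree.

```python
def precision(tp, fp):
    """Share of predicted positives that are real positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of real positives that were predicted as positive."""
    return tp / (tp + fn)


def f1_measure(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts: TP = 40, FP = 5, FN = 10.
p = precision(40, 5)
r = recall(40, 10)
f1 = f1_measure(40, 5, 10)
f1_from_pr = 2 * r * p / (r + p)  # same value via precision and recall
```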

Complex Measures – Performance Curves

Performance Curves

- "costs" of different error types are different
- prediction behaviour changes over the test set
- performance display in 2D
- different domains prefer different chart types

Lift Charts

Figure taken from http://www.dmg.org/rfc/pmml-3.2/ModelExplanation.html

- from the marketing area, to evaluate mailing success
- y-axis: number or percentage of responders
- x-axis: sample
- red diagonal: random lift
- green line: optimum lift
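One way to obtain the points of such a chart is sketched below, assuming each customer has a model score and a 0/1 response flag: customers are contacted in order of decreasing score, and the responders are counted cumulatively. The function names and the toy data are hypothetical.

```python
def cumulative_responders(scores, responded):
    """Responders reached after contacting the k highest-scored customers."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total = 0
    curve = []
    for i in order:
        total += responded[i]
        curve.append(total)
    return curve


def random_baseline(responded):
    """Expected responders when customers are contacted in random order."""
    n = len(responded)
    rate = sum(responded) / n
    return [rate * k for k in range(1, n + 1)]


# Hypothetical scores and responses (1 = responded) for six customers.
scores = [0.9, 0.1, 0.8, 0.4, 0.7, 0.2]
responded = [1, 0, 1, 0, 0, 1]

model_curve = cumulative_responders(scores, responded)
diagonal = random_baseline(responded)
```

Plotting `model_curve` against the sample size gives the model's curve, and `diagonal` corresponds to the random-lift diagonal.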

ROC Curves

Figure taken from http://en.wikipedia.org/wiki/File:Roccurves.png

- Receiver Operating Characteristic
- y-axis: TPR
- x-axis: FPR
- diagonal: random guessing (TPR = FPR)
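A minimal sketch of how the points of a ROC curve can be computed, assuming a classifier that outputs a score per instance and that all scores are distinct (ties are not handled). The decision threshold is lowered step by step, and each step yields one (FPR, TPR) point. Names and data are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points; labels are 1 (positive) or 0 (negative)."""
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical scores and true labels for six test instances.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
```

The curve always starts at (0, 0) and ends at (1, 1); the further it bends towards the upper left corner, the better the ranking.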

Sensitivity vs. Specificity

Figure taken from http://i.stack.imgur.com/fnUd2.png

- preferred in medicine
- y-axis: TPR (sensitivity)
- x-axis: TNR (specificity)
- also frequently drawn as a ROC curve, with 1 − specificity on the x-axis

Recall Precision Curves

Figure taken from http://scikit-learn.github.io/scikit-learn.org/

- preferred in information retrieval
- positives are the documents retrieved in response to a query
- true positives are documents really relevant to the query
- y-axis: precision
- x-axis: recall
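The points of such a curve can be sketched as follows, assuming documents are ranked by a retrieval score and the top k documents count as retrieved. The function name and the toy ranking are hypothetical.

```python
def precision_recall_points(scores, relevant):
    """Return (recall, precision) after retrieving the top k documents, k = 1..n."""
    total_relevant = sum(relevant)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    points = []
    for k, i in enumerate(order, start=1):
        tp += relevant[i]  # retrieved documents that are really relevant
        points.append((tp / total_relevant, tp / k))
    return points


# Hypothetical retrieval scores and relevance flags (1 = relevant).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
relevant = [1, 1, 0, 1, 0, 0]

pr_points = precision_recall_points(scores, relevant)
```

Recall never decreases as more documents are retrieved, while precision typically drops once irrelevant documents enter the result list.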

Regression

Error Measures for Regression

\[
\text{Mean-squared error, } \mathrm{MSE} = \frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{n}
\]

\[
\text{Root mean-squared error, } \mathrm{RMSE} = \sqrt{\frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{n}}
\]

\[
\text{Mean-absolute error, } \mathrm{MAE} = \frac{|p_1 - a_1| + \dots + |p_n - a_n|}{n}
\]
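A small sketch of the three error measures. It assumes that the p values are the predictions and the a values the actual target values, which the slide does not state explicitly; function names and data are hypothetical.

```python
import math


def mse(predicted, actual):
    """Mean of the squared differences between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Square root of the MSE, in the same unit as the target variable."""
    return math.sqrt(mse(predicted, actual))


def mae(predicted, actual):
    """Mean of the absolute differences between predictions and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions and actual values.
predicted = [2.0, 4.0, 5.0]
actual = [1.0, 4.0, 7.0]

mse_value = mse(predicted, actual)
rmse_value = rmse(predicted, actual)
mae_value = mae(predicted, actual)
```

Because the errors are squared, MSE and RMSE weight the single large error (7 vs. 5) more heavily than MAE does.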


Relative Error Measures

\[
\text{Relative-squared error} = \frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{(a_1 - \bar{a})^2 + \dots + (a_n - \bar{a})^2}
\]

\[
\text{Root relative-squared error} = \sqrt{\frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{(a_1 - \bar{a})^2 + \dots + (a_n - \bar{a})^2}}
\]

\[
\text{Relative-absolute error} = \frac{|p_1 - a_1| + \dots + |p_n - a_n|}{|a_1 - \bar{a}| + \dots + |a_n - \bar{a}|}
\]
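The relative measures divide the model's error by the error of a trivial predictor that always outputs the mean of the actual values. The sketch below assumes p denotes predictions and a actual values; names and data are hypothetical.

```python
import math


def relative_squared_error(predicted, actual):
    """Squared error of the model relative to always predicting the mean."""
    mean_a = sum(actual) / len(actual)
    numerator = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    denominator = sum((a - mean_a) ** 2 for a in actual)
    return numerator / denominator


def root_relative_squared_error(predicted, actual):
    """Square root of the relative-squared error."""
    return math.sqrt(relative_squared_error(predicted, actual))


def relative_absolute_error(predicted, actual):
    """Absolute error of the model relative to always predicting the mean."""
    mean_a = sum(actual) / len(actual)
    numerator = sum(abs(p - a) for p, a in zip(predicted, actual))
    denominator = sum(abs(a - mean_a) for a in actual)
    return numerator / denominator


# Hypothetical predictions and actual values (mean of actual values is 4.0).
predicted = [2.0, 4.0, 5.0]
actual = [1.0, 4.0, 7.0]

rse = relative_squared_error(predicted, actual)
rrse = root_relative_squared_error(predicted, actual)
rae = relative_absolute_error(predicted, actual)
```

Values below 1 indicate that the model does better than the mean predictor; the measures are undefined if all actual values are identical.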

Assessment Strategies

General Problem

- each algorithm abstracts from observations (instances)
- the aspects kept and the aspects discarded differ between learning schemes (inductive bias)
- this also means that information about individual instances is contained in the model, too
- individual instance information leads to overfitting

Solution Strategies

- use fresh data, i.e. instances not used for the training
- for very large numbers of instances: simple split into test and training set
- most common: 10-fold cross validation
- LOOCV: leave-one-out cross validation
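The splitting schemes above can be sketched with plain index lists. This is an illustration only: the function name is hypothetical, and the instances are assumed to be in random order already (in practice they would usually be shuffled first).

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross validation on n instances."""
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if f < n % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# 10-fold cross validation on 20 instances: each instance is tested exactly once.
folds = list(k_fold_indices(20, 10))

# LOOCV is the special case k = n: every test set holds a single instance.
loocv = list(k_fold_indices(5, 5))
```

Each model is trained on the `train` indices and evaluated on the `test` indices, so no instance is ever used for both in the same fold; the fold results are then averaged.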
