Methods of digital classification accuracy assessment


Rochester Institute of Technology

RIT Scholar Works

Theses

Thesis/Dissertation Collections

6-1-1997

Methods of digital classification accuracy assessment

Jeffrey R. Allen


This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact ritscholarworks@rit.edu.


Methods of Digital Classification Accuracy Assessment

M.S. Thesis

by

Jeffrey R. Allen

B.S. Rochester Institute of Technology (1996)

Rochester Institute of Technology

Center for Imaging Science

Digital Imaging and Remote Sensing Laboratory

Acknowledgments

I would like to acknowledge the following people for their contributions...

First, the efforts of all three members of my thesis committee. Your time and patience are greatly appreciated. I am particularly thankful to Dr. Navalgund Rao for kindling my interest in digital image processing, Mr. Rolando Raqueño for always taking time from his hectic schedule to deal with my insignificant problems, and Dr. John R. Schott for exposing me to the field of remote sensing, providing me with the opportunity to conduct this research, and helping shape my future.

I am grateful to the National Reconnaissance Office (NRO) for providing funding, without which this thesis would not have been possible.

Words cannot begin to express my appreciation towards my mother, Lorna. Her influence has been permanent, her support without bounds, and her love unconditional and absolute.

All of my family for their love and support including my sisters Kim and Maryann, brothers Tim and Scott, and the newest member to join our ranks, my adorable niece Rachel Allen. In addition, I would like to congratulate Scott and his fiancee Rebecca on their pending nuptials.

My girlfriend of nearly six years, Tracie Bonacci, for her love and understanding through years of schooling and beyond.

All of the professors I have had the pleasure of knowing from the Center for Imaging Science and the College of Science at RIT. Your efforts are appreciated.

My friends from the DIRS lab, the Center, and Triangle fraternity for helping maintain my sanity (I think).

Mr. Stephen L. Schultz, unix wizard, for his programming assistance and Mr. Scott D. Brown, DIRSIG guru, for generating the required synthetic images.

All the support staff at the Center, especially Mrs. Sue Chan for her academic planning.

Dr. Paul Wilson for his contributions toward the content of this thesis along with his recommendations and useful comments.

Lastly, all the unmentioned friends, colleagues, and study partners I have collected over the last 6 years at RIT.

Dedication

This thesis is dedicated to my Father, Edward Oliver Allen Jr. It is difficult to be brief when I speak of him. His kind nature, endless patience, and limitless curiosity have shaped my personality and identity. I am well served by attempting to emulate him in every respect.

My father's technical knowledge and mechanical expertise are inspirational to all who have known him and have made him my greatest mentor. I consider myself very

APPROVAL OF M.S. THESIS

Methods of Digital Classification Accuracy Assessment

by

Jeffrey R. Allen

B.S. Rochester Institute of Technology (1996)

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Imaging Science from the Center for Imaging Science, Rochester Institute of Technology.

June 1997

Signature of Author: Jeffrey R. Allen

Dr. Harvey Rhody, Coordinator, M.S. D

Center for Imaging Science

Rochester Institute of Technology

Rochester, New York

Certificate of Approval

Master of Science Degree Thesis

The Master of Science degree thesis of Jeffrey R. Allen has been examined and approved by the thesis committee as satisfactory for the thesis requirement for the Master of Science Degree.

Dr. John R. Schott, Primary Thesis Advisor

Dr. Navalgund Rao, Committee Member

Rolando Raqueño, Committee Member

Center for Imaging Science

Rochester Institute of Technology

Rochester, New York

Thesis Release Permission Form

Thesis Title: Methods of Digital Classification Accuracy Assessment

I, Jeffrey R. Allen, grant permission to the Wallace Memorial Library of the Rochester Institute of Technology to reproduce this thesis in whole or in part provided any reproduction will not be of commercial use or for profit.

Jeffrey R. Allen

1. Abstract

Landcover classification of remotely sensed data has found many useful applications in industries such as forestry, agriculture, and defense. With the push toward end users, class maps are often incorporated directly into geographical information systems for use in solving large, complex problems. However, errors are inherent in the classification process. The importance of assessing the thematic accuracy of data derived from remote sensing platforms is universally recognized and has motivated much research. Classification accuracy assessment is often required to determine the "fitness of use" or suitability of a data set for a particular application. Failure to identify the magnitude of inaccuracies in classified data can result in errors cascading into subsequent exploitation and eventually result in false conclusions or flawed products.

Many different techniques have been developed and utilized by the remote sensing community for performing thematic accuracy assessment. To date, no one procedure has been adopted as an industry-wide standard.

The purpose of this research was to evaluate the effectiveness and compare the results of several state-of-the-art assessment techniques. Synthetically generated imagery, along with real multispectral line scanner data, served as the baseline for the comparison. Synthetic imagery is uniquely suited for this task because the exact classification accuracy

Table of Contents

1. Abstract
2. Introduction
2.1 Collecting Reference Data
2.2 Accuracy Representation
2.3 Factors Degrading Classifier Performance
2.4 Correcting for Reference Bias
2.5 Relative Classifier Performance
3. Objectives
3.1 Analysis of Accuracy Assessment
3.2 Application of Accuracy Assessment
4. Work Statement / Deliverables
5. Background
5.1 Utility of Classification
5.2 Motivations for Accuracy Assessment
5.3 Classification Algorithms
5.3.1 Gaussian Maximum Likelihood
5.3.2 Fuzzy ARTMAP
5.3.3 Rule Based Genetic Algorithm
5.4 Image Data Sets
5.4.1 Tank Scene
5.4.2 Desert Scene
5.4.3 Forest Scene
6. Approach
6.1 Experimental Data Set Matrix
6.2 Importing Training Data
6.3 Use of Synthetic Image Data
6.4 Simulation of Stressing Parameters
6.4.1 Modulation Transfer Function
6.4.2 Atmospheric Effects
7. Theory
7.1 Factors Affecting Classification Accuracy
7.2 Assessing Classification Accuracy
7.3 Confusion Matrices
7.3.1 User Selected Reference Data
7.3.1.1 Dependent Data Sets
7.3.1.2 Independent Data Sets
7.3.2 Random Point Sampling
7.3.2.1 Simple Random Sampling
7.3.2.2 Stratified Random Sampling
7.3.2.4 Systematic Random Sampling
7.3.3 Synthetic Imagery Verification
7.4 Accuracy Metrics
7.4.1 Uncertainty of Estimates and Confidence Intervals
7.4.2 Image Wide Accuracy Metrics
7.4.2.1 Simple Accuracy
7.4.2.2 Weighted Accuracy
7.4.2.3 Kappa Coefficient
7.4.2.4 Brennan and Prediger's Kappa
7.4.2.5 Tau Coefficient
7.4.3 Single Class Metrics
7.4.3.1 Producer's Accuracy Metric
7.4.3.2 User's Accuracy Metric
8. Discussion
8.1 Optimistic Bias
8.2 Conservative Bias
8.3 Confusion Matrix Marginal Distribution Scaling
8.3.1 Example of Confusion Matrix Scaling
8.3.2 Kolmogorov-Smirnov Testing of Post Priori Probabilities
9. Results
9.1 Effect of Reference Data Source
9.2 Accuracy Metric Results
9.2.1 Simple Accuracy
9.2.2 Weighted Accuracy
9.2.3 Kappa Coefficient
9.2.4 Brennan and Prediger's Kappa
9.2.5 Tau Coefficient
9.3 Effect of Stressing Parameters
9.3.1 Resolution
9.3.2 Atmosphere
9.4 Results of Confusion Matrix Scaling
9.4.1 Scaling of Forest Scene Confusion Matrices
9.4.2 Scaling of Tank Scene Confusion Matrices
9.4.3 Scaling of Desert Scene Confusion Matrices
9.5 Mystic Classifier Performance
9.5.1 Classification Results
9.5.2 Suggestions for Improvement
10. Summary & Conclusion
11. Recommendations for Future Work
12. References

List of Figures

Figure 5-1 GML Classification of a Two Band Image
Figure 5-2 Fuzzy ARTMAP Architecture
Figure 5-3 Weight Vector Operation
Figure 5-4 Inter-ART Field Operation
Figure 5-5 Southern Rainbow Tank Scene
Figure 5-6 Western Rainbow Desert Scene
Figure 5-7 Bandpasses of Daedalus Sensor
Figure 5-8 Synthetic Forest Scene
Figure 6-1 Experimental Matrices
Figure 7-1 Contingency Diagram
Figure 7-2 Sample Confusion Matrix
Figure 7-3 Probabilistic Confusion Matrix
Figure 7-4 Classification and Verification of Synthetic Scene
Figure 7-5 Mapping of DIRSIG Materials to Class Map Categories
Figure 7-6 Standard Normal Density
Figure 8-1 Confusion Matrix Marginal Scaling
Figure 8-2 Sample Class Map
Figure 8-3 Confusion Matrix for Sample Class Map
Figure 8-4 Scaling Sample Confusion Matrix
Figure 8-5 Scaled Confusion Matrix for Sample Class Map
Figure 8-6 Forest Class Probability Distributions
Figure 8-7 Forest Class Cumulative Probability Distributions
Figure 9-1 Class Maps from Forest 23km Visibility Scene Classification
Figure 9-2 Synthetic Reference Map and Original Forest Image
Figure 9-3 Class Maps from Tank 23km Visibility Scene Classification
Figure 9-4 Class Maps from Desert 1m GIFOV Scene Classification
Figure 9-5 Class Maps from Desert 2m & 4m GIFOV Scene GML Classification
Figure 9-6 Random Sampling of 278 Points
Figure 9-7 Multisource Assessment of Desert Scene GML Class Map
Figure 9-8 Error of Random Forest Assessment
Figure 9-9 Classifier Performance on Forest 23km 1m Image
Figure 9-10 GML Classification Accuracy for Desert Scene
Figure 9-11 ARTMAP Classification Accuracy for Desert Scene
Figure 9-12 Mystic Classification Accuracy for Desert Scene
Figure 9-13 Effect of Tank Scene Atmospheric Visibility on Classifier Performance
Figure 9-14 Effect of Forest Scene Atmospheric Visibility on Classifier Performance
Figure 9-15 Spatial Correlation of Classifier Error
Figure 9-16 Spatial Correlation of Classifier Error at 5km Visibility

List of Tables

Table 5-1 Southern Rainbow Bandpasses
Table 5-2 Western Rainbow Bandpasses
Table 5-3 DIRSIG Scene Bandpasses
Table 7-1 Confidence Interval Z-Scores
Table 8-1 Distribution of Class Map
Table 8-2 Distribution of Reference
Table 8-3 Quantiles of the Smirnov Two Sample Test Statistic of size n
Table 8-4 Calculation of Smirnov Test Statistic for Forest 23k GML Class Map
Table 8-5 Results of Kolmogorov-Smirnov Two-Sample Test
Table 9-1 Effect of Forest Reference Source on Measured Percent Correct
Table 9-2 Effect of Tank Reference Source on Kappa Coefficient
Table 9-3 Effect of Desert Reference Source on Tau Coefficient
Table 9-4 Verification of Dependent Reference Data
Table 9-5 Verification of Independent Reference Data
Table 9-6 Chance Agreement Coefficients
Table 9-7 Probability Distributions of Reference Data
Table 9-8 Probability Distributions of Forest Scene Class Maps
Table 9-9 RMS Error of Marginal Distribution Approximation for Forest Scene
Table 9-10 Scaling Coefficients for Forest Scene
Table 9-11 Percent Accuracy Results of Matrix Scaling Forest Scene
Table 9-12 Absolute Error Result of Matrix Scaling Forest Scene
Table 9-13 Histograms of Tank Scene Class Maps
Table 9-14 RMS Error of Marginal Distribution Approximation for Tank Scene
Table 9-15 Scaling Coefficients for Tank Scene Independent Reference
Table 9-16 Kappa Coefficient Accuracy Results of Matrix Scaling Tank Scene
Table 9-17 RMS Error of Marginal Distribution Approximation for Desert Scene
Table 9-18 Scaling Coefficients for Desert Scene

2. Introduction

Digital image classification is one of the most common operations performed on remotely sensed data. Classification refers to a process where each pixel in an image is assigned to a certain category, known as a class. In the context of remote sensing, these classes usually correspond to types of ground cover. The result of classification is known as a class map. The term 'map' should not be confused with the cartographic meaning. A class map is digital raster data where digital counts (DC) correspond to class membership and spatial location corresponds to the same location as in the original image. Recent interest in the integration of remote sensing data into geographical information systems (GIS) has rekindled research and heightened interest in classification accuracy assessment (Janssen and Van der Wel, 1994). The push towards real world applications and the end user has further increased the need for reliable methods of accuracy assessment. Errors are introduced into classification when a pixel is misclassified by assigning it to the wrong class. The term pixel (picture element) is used to refer to the smallest element of the original and classified images. The original and classified images consist of a two dimensional array of pixels but the original image usually has an additional dimension of spectral data as well.

Ideally, accuracy assessment would consist of comparing the class of all pixels in a classified image to their true class. In practice, accuracy assessment consists of comparing a small sampling of classified pixels to a set of data believed to be their true class. Over the years, many methods for accuracy assessment have been presented in remote sensing literature but no dominant standard has yet been adopted.

In this thesis, the current state-of-the-art accuracy assessment techniques are presented and a few unique adaptations are proposed, as well. These assessment techniques are then implemented on a series of baseline images. Three scenes are used

images were acquired using airborne multispectral line scanners while the last image was synthetically generated. Details about these image sets can be found in 5.4. Classifier performance is affected by many real world imaging parameters. Two such parameters are image resolution and atmospheric visibility. The three scenes were degraded using these two parameters to create nine images. These nine images were useful because, after classification, they provided a complete range of classification accuracy which was needed to thoroughly compare the various assessment techniques. In addition, they were also used to quantitatively measure the effect of the stressing parameters on classification accuracy. Three different classifiers were used to produce the requisite class maps: the Gaussian Maximum Likelihood (GML) using parametric multivariate statistics, the Fuzzy ARTMAP neural network utilizing a fuzzy logic set, and Mystic, a new classifier using mathematical rules, optimized by a genetic algorithm, to segment classes. These three classifiers, described in 5.3, were selectively used on the nine images to generate twenty three class maps. All of these class maps then underwent accuracy assessment based on a variety of reference data sources. The result was one hundred and nineteen confusion matrices and several corresponding accuracy metrics for each. For the exact combination of image, classifier, and reference data the reader is referred to the experimental matrices in 6.1. The research of the thesis is divided into five major thrusts: obtaining reference data, accuracy metrics, parameters stressing classifier performance, correction for biased reference data, and the relative performance of the rule based classifier. Each of these topics is discussed in greater detail below.

2.1 Collecting Reference Data

The process of classification accuracy assessment can be grouped into two distinct steps. In the first step, the class map is spot checked against reference data. The second step involves calculating a meaningful metric from the data collected in the first step.

next. Classification accuracy assessment is presented, in detail, in 7.2. Reference data is a group of pixels which belong to known classes that are used to estimate the accuracy of the entire map. Several methods of obtaining reference data are presented in 7.3. When assessing classifier performance, reference data is compared against the class map to build a confusion matrix. A confusion matrix is a contingency table, often used in categorical data analysis, usually with reference data along the columns and class map data along the rows. In each element of the confusion matrix, the number of pixels that fall into both the corresponding row and column categories is posted. Any discrepancy between reference data and classified data is considered a classification error. The difficulty lies in obtaining reference data, sometimes known as verified, identified, known, or truth data, which is representative of the entire scene.
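As an illustration of the layout described above, the following Python sketch tallies a confusion matrix with class map labels along the rows and reference labels along the columns. The function name `confusion_matrix`, the integer class coding, and the eight-pixel sample are assumptions made for this example; they are not taken from the Mathematica library listed among the deliverables.

```python
def confusion_matrix(class_map, reference, n_classes):
    """Tally pixels by (class map label, reference label).

    Rows index the class map label and columns index the reference
    label, following the convention described in the text. Labels are
    assumed to be integers in range(n_classes) (an assumption of this
    sketch, not a requirement stated in the thesis).
    """
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for mapped, truth in zip(class_map, reference):
        matrix[mapped][truth] += 1
    return matrix


# Hypothetical eight-pixel sample with three classes (0, 1, 2).
class_map = [0, 0, 1, 1, 2, 2, 2, 1]
reference = [0, 1, 1, 1, 2, 2, 0, 1]

cm = confusion_matrix(class_map, reference, n_classes=3)
errors = sum(1 for m, t in zip(class_map, reference) if m != t)

for row in cm:
    print(row)
print("misclassified pixels:", errors)
```

Off-diagonal entries count the discrepancies between reference and classified data, so the sum of the off-diagonal elements equals the number of classification errors in the sample.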

Determining the exact accuracy of a class map is impossible in almost all circumstances. For certain, it is impractical in all cases involving real imagery. There are, however, several widely accepted methods for formulating a reasonably close approximation to the true accuracy. A proper estimate will also include the corresponding confidence interval. When selecting a method for accuracy assessment, there is a tradeoff between cost and accuracy. The cost of accuracy assessment includes many factors such as labor, physical resources, time, travel, and others. The largest cost of assessment is incurred obtaining the reference data. Less robust methods result in less accurate approximations of accuracy with large confidence intervals but at a lower cost. High quality assessment procedures are more accurate, but also more expensive and time consuming. Each project must find the balance point between cost and acceptable fidelity.

Many accuracy assessment techniques introduce bias into their estimations. Bias is the systematic error resulting from consistent over or under estimation of the true class map accuracy. Optimistic and conservative bias will be discussed in 8.1 and 8.2

For this thesis, real and synthetic imagery will be employed for comparing different sampling techniques. In the context of classification accuracy assessment, sampling refers to selecting certain pixels, or groups of pixels, and determining which class they truly belong in. Synthetic imagery is of particular interest because sampling is not necessary since the exact class membership of each pixel is known. This a priori knowledge will permit an unbiased, quantitative evaluation of the popular sampling techniques. The use of synthetic imagery in this research is explained in 6.3.

Over the years, several sampling techniques have been employed by the remote sensing community for this purpose. However, each sampling technique has corresponding advantages and disadvantages. In 9.1 the results of the analysis of the effect of reference data source on the reported accuracy metric will be detailed.
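The idea of checking a sampling technique against a fully known synthetic scene can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the toy scene, the 15 percent error pattern, the sample size of 278, the function names, and the normal-approximation confidence interval (z = 1.96 for roughly 95 percent) are all choices made for this example and are not the procedures defined later in the thesis.

```python
import math
import random


def exact_accuracy(class_map, truth):
    """Fraction of pixels whose class map label matches the known truth."""
    matches = sum(1 for c, t in zip(class_map, truth) if c == t)
    return matches / len(truth)


def sampled_accuracy(class_map, truth, n_samples, z=1.96, seed=0):
    """Estimate accuracy from a simple random sample of pixels.

    Returns the estimate and the half width of a normal-approximation
    confidence interval. The finite population correction is ignored,
    which is a simplification of this sketch.
    """
    rng = random.Random(seed)
    picks = rng.sample(range(len(truth)), n_samples)
    p_hat = sum(1 for i in picks if class_map[i] == truth[i]) / n_samples
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_samples)
    return p_hat, half_width


# Hypothetical synthetic scene: 10,000 pixels, three classes, truth known.
truth = [i % 3 for i in range(10000)]
# Hypothetical class map: 3 pixels in every block of 20 are misclassified.
class_map = [(t + 1) % 3 if i % 20 < 3 else t for i, t in enumerate(truth)]

true_acc = exact_accuracy(class_map, truth)
estimate, half_width = sampled_accuracy(class_map, truth, n_samples=278)

print(f"exact accuracy:   {true_acc:.4f}")
print(f"sampled estimate: {estimate:.4f} +/- {half_width:.4f}")
```

Because the synthetic truth is complete, the difference between the sampled estimate and the exact accuracy is a direct measure of the error introduced by the sampling step alone.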

2.2 Accuracy Representation

Once reference data is used to create a confusion matrix, it is often desirable to reduce the matrix into a single, meaningful index of accuracy. This single metric, usually expressed as a coefficient between zero and one, estimates the true average map accuracy or classifier performance. Many different accuracy metrics have been introduced to compensate for the fact that the estimate is being made on less than complete information. Other metrics, ideal for measuring classifier performance rather than class map accuracy, correct for the proportion of pixels properly classified only by chance. It is important to keep in mind the method used to generate the confusion matrix when selecting this metric. The most often quoted metrics are the Simple accuracy, Weighted accuracy, Kappa coefficient, Brennan and Prediger's Kappa, and the Tau coefficient, which will be introduced in 7.4. This lack of a standard has created difficulties in comparing different class maps. A conversion from one metric to another cannot be made because they also depend on the marginal distribution of the confusion matrix in

noting the advantages and disadvantages of each. The appropriate confidence interval, accounting for uncertainty from all sources, will be reported along with this metric. This will be accomplished with highly characterized real imagery and computer generated synthetic images.

The metrics will be evaluated on class maps generated with the GML classifier, the Fuzzy ARTMAP classifier, and Mystic, a rule based classifier. Supervised classification algorithms and uncorrected images will be employed in this study because they are most commonly used by the remote sensing community. Methods for accuracy assessment and accuracy metrics are normally considered completely independent of the classification technique utilized. However, because classifiers may exhibit different degrees of spatial correlation of errors, three different classifiers will be used to ensure universal applicability of the results. The baseline images will also contain a variety of landcovers to avoid correlation in the final results. In 9.2, the results of the analysis and comparisons between the accuracy metrics are covered.
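To make the reduction of a confusion matrix to single indices concrete, the Python sketch below computes several of the metrics named in this section using their commonly published forms: simple accuracy as the proportion of agreement, the Kappa coefficient with chance agreement taken from the row and column marginals, and Brennan and Prediger's Kappa with chance agreement fixed at one over the number of classes. The function name, the dictionary keys, and the 3 x 3 example matrix are assumptions of this sketch, and the definitions given in 7.4 may differ in detail. Weighted accuracy and the Tau coefficient are omitted because they depend on weights and a priori probabilities not specified here.

```python
def accuracy_metrics(matrix):
    """Summarize a square confusion matrix.

    Rows are class map labels and columns are reference labels,
    matching the convention used in this thesis.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]

    observed = sum(diagonal) / total
    # Chance agreement estimated from the marginal distributions (Kappa).
    chance_marginal = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    # Chance agreement assuming equally likely classes (Brennan and Prediger).
    chance_uniform = 1.0 / k

    return {
        "simple": observed,
        "kappa": (observed - chance_marginal) / (1.0 - chance_marginal),
        "bp_kappa": (observed - chance_uniform) / (1.0 - chance_uniform),
        # Producer's accuracy: correct pixels over the reference (column) total.
        "producers": [d / c if c else None for d, c in zip(diagonal, col_sums)],
        # User's accuracy: correct pixels over the class map (row) total.
        "users": [d / r if r else None for d, r in zip(diagonal, row_sums)],
    }


# Hypothetical 3-class confusion matrix built from eight reference pixels.
example = [[1, 1, 0],
           [0, 3, 0],
           [1, 0, 2]]

metrics = accuracy_metrics(example)
for name, value in metrics.items():
    print(name, value)
```

The two chance-corrected coefficients differ for the same matrix because each assumes a different model of chance agreement, which is one reason a value of one metric cannot simply be converted into another.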

2.3 Factors Degrading Classifier Performance

In addition to image content and the quality of training data, the accuracy of image classification is a function of several real world imaging parameters. A discussion of several factors affecting classification accuracy is contained in 7.1. However, only two significant factors were examined as part of this research. The first, image resolution, was examined using the desert scene. The second factor, atmospheric visibility, was analyzed using the forest and the tank scenes. Both stressing parameters were simulated using the procedure outlined in 6.4. To determine the extent of the effect on

2.4 Correcting for Reference Bias

Quality reference data to be used for classification accuracy assessment is often difficult, time consuming, and costly to obtain. Often analysts utilize user selected reference as a quick, low-cost alternative to rigorous random verification. However, user selected reference data almost always suffers from overly optimistic bias. User selected reference also has another problem. In general, its marginal distribution in the resulting confusion matrix does not accurately approximate the true class probability distribution. In 8.3, a method is proposed to correct for this shortcoming. This process is called confusion matrix marginal distribution scaling by post priori probabilities. It is used in this thesis to adjust the confusion matrices of all three scenes constructed using independent reference data. The accuracies resulting from scaled matrices are then compared to the unscaled and true accuracies. These results are presented in 9.4.

2.5 Relative Classifier Performance

The last area of research is the performance of the Mystic classifier relative to the other two, more traditional, classifiers. Mystic is a new, rule based classifier. It uses a genetic algorithm to optimize the parameters of the rules to obtain the highest classification accuracy possible. The Mystic classifier, along with the GML and Fuzzy ARTMAP, is developed in 5.3. To this point in time, the accuracy and properties of the Mystic classifier are relatively untested. It however appears to be a unique classifier with promising potential. While a major thrust of this thesis is not a comparison between

3. Objectives

The objective of this thesis is severalfold. First, the current techniques of classification accuracy assessment, also known as classification validation, will be presented. One objective will be to develop a common formalism and taxonomy of accuracy assessment. Many independent researchers have presented results on classification accuracy assessment. Several contrasting approaches have been given, as well. Many of these papers have used different terminology even when referring to the same phenomenology because no standards yet exist. Ideally, this thesis will serve as a compendium of classification validation by providing a common source of research results drawn from years of remote sensing literature.

3.1 Analysis of Accuracy Assessment

In this project several different sampling schemes will be employed. The accuracy of these schemes will be determined using synthetic reference or more rigorous sampling. Corrections for reference data which poorly estimates the true class probability distributions will also be made. Accuracy metrics will be evaluated and compared in a similar manner. In addition, it will be determined if each metric is accurately estimating the quantity it is supposed to be measuring.

3.2 Application of Accuracy Assessment

As part of this project, a database with a significant number of confusion matrices has been generated. In this thesis, the purpose of these matrices was to analyze accuracy metrics, reference data sources, classifier performance, and the effect of stressing

When analyzing the effect of a parameter on classification accuracy, it is often difficult to separate the effect of the desired parameters from the effect of the assessment procedure. The quantitative assessment of stressing parameters will inevitably include bias introduced by the method of assessment. Different assessment methods will result in differing values. This is because it is difficult to obtain an accurate or precise accuracy assessment. For this reason, the evaluation of stressing parameters and classifier

4. Work Statement / Deliverables

Statement of Work

Designate and optimize a common training set to be used by all classifiers.

Classify candidate imagery with GML, Fuzzy ARTMAP, and rule based genetic algorithm (Mystic) classifiers using common training data.

Obtain reference data from dependent, independent, random, and synthetic sources.

Generate confusion matrices from reference data and evaluate classification accuracy using Simple, Weighted, Kappa, B&P's Kappa, and single class accuracy coefficients.

Generate and analyze confusion matrices made from user selected reference which have been scaled to match post priori distributions.

Utilize synthetic imagery to identify the most precise and efficient method of classification accuracy assessment.

Analyze the effect of stressing parameters on classification accuracy and the relative effectiveness of the Mystic classifier.

List of Deliverables

Program for converting ENVI training data to Mystic and AVS format.

A Mathematica library for generating confusion matrices from dependent, independent, random point, and synthetic data sources.

A Mathematica library for evaluating Simple, Weighted, Kappa, B&P's Kappa, and single class accuracy coefficients with confidence intervals.

A written database containing confusion matrices for all classified images based on user selected, random, and synthetic reference.

A written document containing suggestions for minimizing bias and increasing precision of classification accuracy assessment.

5. Background

5.1 Utility of Classification

As mentioned previously, digital image classification is one of the most important processes when preparing remotely sensed data for use in applications or research. Different users sometimes refer to image classification as class segmentation, categorization, or landcover determination. A variety of users have found classification of satellite and aerial images a cost effective solution to challenging large scale problems. However, the synoptic view, high availability, and frequent overflights have made satellite imagery the preferred, low-cost data source of many users. Classified images are known by several names including class maps, thematic maps, product maps, land-use maps, and landcover maps. Classification can be used to determine the landcover, constituent material type, or object class of each pixel in an image acquired at great distances.

The environmental community has made wide use of classification as a tool when studying large areas of isolated environments. Data is often collected over time to monitor environmental change such as deforestation and changes in wetlands. Classification has proven to be an invaluable aid in the mapping of wildlands and in drafting inventories of isolated locations (Fitzpatrick-Lins, 1980; Senseman et al., 1995; Rutchey and Vilcheck, 1994).

National governments have been the largest user of classification on remote sensing data. It is often used for surveying and monitoring vast natural resources (Bauer et al., 1994). For example, classification has helped optimize water usage in developing countries (Nageswara Rao and Mohankumar, 1994). Governments have also been successful in predicting crop failure and avoiding famine in developing nations by

The commercial sector has also found many useful applications. Segmented images often aid in oil exploration, identification of mineral deposits in remote locations, populating GIS databases, and cartography. Class maps of crops have been used in many ways to optimize agriculture. Crop yields can be maximized by determining where and when it is best to plant, fertilize, irrigate, or harvest. It has been used by the logging industry to identify and manage forest resources. Even large brokerage houses have utilized classification of remotely sensed images for predicting commodity and future prices by monitoring crop health and measuring biomass.

Many of these applications base important decisions on evidence uncovered by image classification. This underscores the need for high quality class maps where the accuracy and confidence intervals are known.

Classification is sometimes used as a preprocessor to further digital image processing. For example, class maps can be used for atmospheric calibration or emissivity determination in thermal studies. For these applications especially, class maps must be of high accuracy to ensure excessive error is not propagated to further processing steps. Classification is commonly used in image exploitation as an analyst's tool. It reduces the dimensionality of data with little or no loss of critical information which in turn aids in human assimilation (Harsanyi, 1994).

5.2 Motivations for Accuracy Assessment

There are several types of error introduced into remote sensing data. Other than radiometric, there are two major types of error that are of concern. The first, positional error, refers to the improper relative location of a pixel in a scene when compared to the original scene geometry. Positional accuracy is often measured in root mean square (RMS) units and corrected for using one of the methods of image registration or

this thesisisthematic _accuracyof classifiedimagesbutpositional_accuracy affectsthe

measurementofthematic accuracy. Positionalerror when_collectingreferencedata from

a second registeredimagewill resultinunderestimatedthematicaccuracy. Inaddition, class_mappixelswith positional error willnolongercorrespondtoproper relative locationontheground. Inthisthesis,accuracy,unless noted_otherwise,willbe referring to the thematic_accuracyof a class map.

Accuratedataiscriticalto all oftheapplications mentioned above. Classified imagesare of no use if

they

containexcessiveamounts of_error,_therefore, thevalidation of classifieddataisparamount. Theamount oftolerableerrorisspecifictoeach

applicationbuttheneed for accuracyassessmentisconsistent. Thethematic _{accuracy is} oftenthe

deciding

factorin

determining

whethera class_{map is} appropriatefora study.

Precise accuracy assessment is needed to determine the effectiveness of different classifiers. Continuing research into new, more robust classifiers requires effective methods for measuring their performance. Vigorous accuracy assessments can also point out flaws in existing classifiers and lead to improvements. Assessment has also been used to facilitate studies to determine how imaging parameters such as view angle, time of day, spatial resolution, and even the sensor used affect the final classification accuracy.

5.3 Classification Algorithms

There are two distinct techniques of image classification. The first type is unsupervised classifiers such as the k-means (Duda and Hart, 1973) and ISODATA (Tou and Gonzalez, 1974) algorithms. These routines are highly automated and require only one input from the user, the number of desired class categories. The categories segmented by these methods may or may not correspond to classes which are desirable for the user. However, unsupervised classifiers are often used as a quick first run to determine how separable desired classes might be. They are also often used to provide pure training data

the supervised algorithms. Supervised classification routines require prototype training data from the user. Training data are sample pixels, along with thematic labels, which belong to the classes which the user wishes to segment. Once they have been selected, the entire spectral vector of each pixel is used by the classifier. The gathering of training data is a subjective, man-in-the-loop process, which has a large bearing on the ultimate classification accuracy. Supervised classification training data is usually identified by an image analyst using one of two techniques. With the first and most common method, the analyst interactively selects solid polygons over areas of an image which are believed to contain only the desired class. The second technique requires the image analyst to select only a single point in the center of a homogeneous area of the image representing the desired image class. This single point is then used as a seed for an unsupervised classifier, such as the fuzzy k-means clustering algorithm, which extrapolates to select spectrally and spatially near image data to be used as training data. Both supervised training methods require the analyst to determine the number of desired classes and select at least one region per category.

After the classifier is trained or developed with the training data, the supervised classifier proceeds to assign thematic labels to the pixels in the image. Most classifiers allow pixels which do not fit well into any of the established categories to remain undefined. In this case, the user must supply a membership coefficient threshold that must be exceeded for any given pixel to be classified. Undefined pixels will not increase or decrease measured accuracy because they are not included in confusion tables. Three supervised classifiers will be used in this project. They are all 'per-pixel' classifiers. This means they assign pixels individually to classes based only on the spectral signature from that pixel, with no regard to the surrounding pixels. Other classification routines

5.3.1 Gaussian Maximum Likelihood

The Gaussian Maximum Likelihood (GML) classifier is the most popular of all classifiers. The GML is a supervised classification algorithm which employs Bayesian probability theory to select the class to which a pixel most likely belongs. This is accomplished by segmenting feature space with n-dimensional clouds called hyperellipsoids. If the statistical assumptions set forth by this method are valid for a given data set, the resulting classification will minimize overall classification error. Because this classifier is so widely accepted and theoretically understood, it is often used as a benchmark for comparisons against new classifiers.

The subsequent derivation of the GML classification routine closely follows that of Schott (1997). The GML classifier is most readily derived and visualized by considering a single band, grayscale image. This treatment of the univariate case can then be scaled to the multivariate case with the appropriate number of spectral channels. Using Bayesian probability theory, the a posteriori probability, $p(i \mid DC)$, is the probability that a pixel with an observed digital count of $DC$ will belong to class $i$.

$$ p(i \mid DC) = \frac{p(DC \mid i)\, p(i)}{p(DC)} \tag{5-1} $$

The a priori probability, $p(i)$, is the probability that any class $i$ will be observed. In other words, this term is the proportion of pixels which belong in the class $i$. The chance that a particular digital count $DC$ will be observed within a certain class is given by $p(DC \mid i)$. This value is evaluated by the GML classifier using Equation 5-2 for all values of $DC$ and $i$ based on the training data supplied by the user. A few years ago the computer storage requirement of this calculation was significant when dealing with multispectral and especially with hyperspectral imagery. Today, with modern computers, this same amount

counts within any given class have a Gaussian distribution. This is because the spectral distributions of classes are represented only by their means and standard deviations.

$$ p(DC \mid i) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\!\left[-\frac{\left(DC - \overline{DC}_i\right)^2}{2\sigma_i^2}\right] \tag{5-2} $$

Where:
$i$ is the class,
$DC$ is the digital count of a pixel,
$\overline{DC}_i$ is the average DC of the class $i$, and
$\sigma_i$ is the standard deviation of the class $i$.

The term $p(DC)$ is the probability of any digital count occurring, otherwise known as the image-wide normalized histogram. This function is the same for all classes and simply scales the resulting a posteriori probability. If the $p(DC)$ term is dropped from the GML classifier, it will have no effect on the results. The goal is to find the class, $i$, with the highest probability, not the value of the absolute probability. The rank ordering is maintained with the following simplified relation:

$$ p(i \mid DC) \propto p(DC \mid i)\, p(i) \tag{5-3} $$

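The Bayes decision function is then defined to be the GML discriminant metric, $D_i'$, by substituting Equation 5-2 into Equation 5-3. The GML discriminant metric (Equation 5-4) is the value by which class membership will be decided on a pixel by pixel basis.

$$ D_i' = p(DC \mid i)\, p(i) = \frac{p(i)}{\sqrt{2\pi\sigma_i^2}} \exp\!\left[-\frac{\left(DC - \overline{DC}_i\right)^2}{2\sigma_i^2}\right] \tag{5-4} $$

This metric can be further simplified by taking the logarithm of $D_i'$:

$$ D_i'' = \ln\!\left[D_i'\right] = \ln[p(i)] - \tfrac{1}{2}\ln[2\pi] - \ln[\sigma_i] - \frac{\left(DC - \overline{DC}_i\right)^2}{2\sigma_i^2} \tag{5-5} $$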

Finally, adding a constant to Equation 5-5, we arrive at the final GML discriminant shown as Equation 5-6. Neither taking the logarithm nor adding a constant will change the rank ordering determined by the discriminant.

$$ D_i = \ln[p(i)] - \ln[\sigma_i] - \frac{\left(DC - \overline{DC}_i\right)^2}{2\sigma_i^2} \tag{5-6} $$

At each pixel in the image, the discriminant, $D_i$, is evaluated for all classes, $i$. The class with the highest value is selected as the class of that pixel. In many implementations, the user is allowed to select a probability threshold which must be exceeded before a pixel can be assigned to a class. Pixels that are not assigned to a class are left as undefined in the final class map.
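The per-pixel decision rule built on Equation 5-6 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the thesis: the function names, the `min_discriminant` threshold argument, and the class statistics are assumptions made for the example.

```python
import math

def gml_discriminant(dc, prior, mean, sigma):
    """Equation 5-6: ln p(i) - ln sigma_i - (DC - mean_i)^2 / (2 sigma_i^2)."""
    return math.log(prior) - math.log(sigma) - (dc - mean) ** 2 / (2.0 * sigma ** 2)

def classify_pixel(dc, class_stats, min_discriminant=None):
    """Return the label with the highest discriminant, or None if undefined.

    class_stats maps label -> (prior, mean, sigma).
    min_discriminant is a hypothetical stand-in for the user-selected
    threshold; pixels whose best score falls below it stay undefined.
    """
    best_label, best_value = None, -math.inf
    for label, (prior, mean, sigma) in class_stats.items():
        value = gml_discriminant(dc, prior, mean, sigma)
        if value > best_value:
            best_label, best_value = label, value
    if min_discriminant is not None and best_value < min_discriminant:
        return None
    return best_label

# Illustrative class statistics only (not taken from the thesis imagery).
stats = {"soil": (0.5, 100.0, 10.0), "forest": (0.5, 40.0, 5.0)}
print(classify_pixel(45, stats))
print(classify_pixel(70, stats, min_discriminant=-5.0))
```

Because only the rank ordering of the discriminants matters, the sketch never computes the normalized a posteriori probability itself.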

Figure 5-1 GML Classification of a Two Band Image (axes: DC band 1 and DC band 2; isoprobability contours drawn about each class mean)

While the GML classifier has been derived thus far assuming a one band image is being classified, this is rarely the case in practice. Images which are normally classified

classifying multidimensional imagery, the algorithm is the same except scalar mathematics is replaced with vector mathematics. For example, take an n-dimensional spectral vector of a pixel $\mathbf{x}$ and an m-dimensional vector $\mathbf{w}$ of the target classification classes, where $n$ is the number of image bands and $m$ is the number of target classes.

$$ \mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathbf{w} = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{pmatrix} \tag{5-7} $$

In this case, Equation 5-1 would need to be transformed to its vector equivalent form shown by Equation 5-8. The same is true for the rest of the calculations in the GML classifier.

$$ p(w_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid w_i)\, p(w_i)}{p(\mathbf{x})} \tag{5-8} $$

Figure 5-1 has been provided to aid in visualizing the classification of a two band image containing three distinct classes. The three classes are centered about their respective multivariate means $M_1$, $M_2$, and $M_3$. The concentric ellipsoids centered about these means represent iso-contour intervals of equal class membership probability or GML discriminant value. The distribution of pixels has a mean in both band one and band two. However, as seen in Figure 5-1, the distribution can take on a diagonal character as well. This is due to correlation in digital counts of classes in multiband images. The multivariate statistical approach taken by the GML classifier accounts for the shape of this type of distribution with a covariance matrix. This ability of the GML classifier results in higher classification accuracy than similar classifiers such as the parallelepiped classifier, which lacks this ability.
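To illustrate how a covariance matrix enters the multiband decision, the following NumPy sketch uses the standard multivariate Gaussian log-discriminant, ln p(w_i) - 0.5 ln|C_i| - 0.5 (x - M_i)^T C_i^{-1} (x - M_i). The text does not write this form out, so it is an assumption here, as are the function names, the use of training-set proportions as priors, and the synthetic two-band training data.

```python
import numpy as np

def train_gml(samples_by_class):
    """Estimate (prior, mean vector, covariance matrix) for each class.

    samples_by_class maps label -> array of shape (n_pixels, n_bands).
    Priors are taken as training-set proportions (an assumption).
    """
    total = sum(len(s) for s in samples_by_class.values())
    model = {}
    for label, samples in samples_by_class.items():
        samples = np.asarray(samples, dtype=float)
        mean = samples.mean(axis=0)
        cov = np.cov(samples, rowvar=False)  # bands are columns
        model[label] = (len(samples) / total, mean, cov)
    return model

def classify(x, model):
    """Assign spectral vector x to the class with the largest log-discriminant."""
    x = np.asarray(x, dtype=float)
    scores = {}
    for label, (prior, mean, cov) in model.items():
        diff = x - mean
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
        scores[label] = np.log(prior) - 0.5 * logdet - 0.5 * maha
    return max(scores, key=scores.get)

# Synthetic two-band training data with correlated bands (illustrative only).
rng = np.random.default_rng(0)
train = {
    "a": rng.multivariate_normal([50, 60], [[30, 20], [20, 30]], size=200),
    "b": rng.multivariate_normal([120, 100], [[40, -10], [-10, 25]], size=200),
}
model = train_gml(train)
print(classify([52, 61], model), classify([118, 99], model))
```

The off-diagonal covariance terms are what let the isoprobability contours tilt along the correlation direction, which a parallelepiped classifier with axis-aligned limits cannot represent.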

Unlike the Fuzzy ARTMAP and the Rule Based Genetic Algorithm which will be

statistics and makes decisions using class orientation and spectral extent information contained in the mean vector and variance-covariance matrix. This parametric model minimizes the effects of noisy or outlying training data due to its averaging properties. This advantage is moderated by the fact that image data which varies greatly from normal can be problematic. It is commonly noted that GML performance maps best to visual interpretation when compared to non-parametric classifiers. The Environment for Visualizing Images (ENVI) software package was selected for its GML implementation for use in this thesis.

5.3.2 Fuzzy ARTMAP

In recent years, several neural-network type architectures have been implemented to classify images. The interest in neural networks for use in classifiers is due to their ability to learn and remain flexible. Their rule for deciding in which category to classify a pixel will change and adapt from region to region in an attempt to make optimal decisions. Traditional neural-network classifiers have two primary disadvantages. First, neural networks use traditional logic which allows for only crisp set, binary decisions. Secondly, conventional networks have required an excessive number of training cycles, or epochs. The fuzzy ARTMAP supervised classifier, developed by Grossberg and Carpenter at Boston University in 1991, overcomes both of these limitations. It combines a fast learning neural-network architecture with fuzzy logic decision making. Underlying principles of the network's operation are based on modeling of the human eye-brain system. The fuzzy ARTMAP architecture's ability to learn and adapt makes it well suited to the classification of remotely sensed images. It will be one of the supervised classifiers

Figure 5-2 Fuzzy ARTMAP Architecture (Nessmiller, 1995)

The Fuzzy ARTMAP does not have a classical mathematical derivation as does the GML classifier. The fuzzy ARTMAP classifier consists of an advanced neural network known as the Adaptive Resonance Theory MAPping (ARTMAP) combined with fuzzy logic algorithms. The architecture of the fuzzy ARTMAP, shown in Figure 5-2, will help briefly describe its operation as outlined by Nessmiller (1995). The fuzzy ARTMAP classifier consists of two Adaptive Resonance Theory (ART) neural networks, labeled ARTa and ARTb. The ARTs are unsupervised classifiers by themselves. Two ARTs can be combined to form a supervised classifier known as an ARTMAP. An ARTMAP can be modified to incorporate fuzzy logic, which then forms the fuzzy ARTMAP classifier. The first step, as with any supervised classifier, is to supply training pixels. The spectral vector from a training pixel is supplied at a and the corresponding class label is supplied at b. The intensity values of each of the N spectral bands in the training data must first, however, be normalized to values between 0 and 1. Next, both inputs undergo a calculation called complement coding at the preprocessing fields F0a and

the label is encoded with a binary designator which will be used by the network to specify the class categories. Next comes the long term memory of the fuzzy ARTMAP, which consists of the activity vector F1 and the classification vector F2.
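The normalization and complement coding steps described above can be sketched as follows. Complement coding, in the form given by Carpenter et al. (1991), maps a normalized vector a to A = (a, 1 - a), matching the A = (a, a^c) notation of Figure 5-2. The vigilance check shown is the usual fuzzy ART match ratio |A ∧ w| / |A|, where ∧ is the element-wise minimum. The function names and the 8-bit scaling constant are assumptions for illustration.

```python
import numpy as np

def normalize_bands(pixels, max_dc=255.0):
    """Scale raw digital counts to [0, 1]; max_dc=255 assumes 8-bit imagery."""
    return np.asarray(pixels, dtype=float) / max_dc

def complement_code(a):
    """Return A = (a, 1 - a), doubling the vector length from N to 2N."""
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a], axis=-1)

def passes_vigilance(A, w, rho):
    """Fuzzy ART match test: |min(A, w)| / |A| >= rho (city-block norms)."""
    return np.minimum(A, w).sum() / A.sum() >= rho

a = normalize_bands([51, 204, 102])
A = complement_code(a)
w = np.ones_like(A)  # weight vectors start at unity before training
print(A, passes_vigilance(A, w, rho=0.9))
```

One consequence of complement coding is that every coded input has the same city-block norm N, so the match ratio is not biased toward bright or dark pixels.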

Figure 5-3 Weight Vector Operation (Nessmiller, 1995)

Before training, all the weight vectors are set to unity. The goal of this network is to find the strongest connection of the weight factor between the input field and the classification field. However, before the classification can be considered acceptable it must meet or exceed the vigilance parameter. The vigilance parameter, ρ, is a certainty threshold which must be exceeded in order to classify a pixel in a given class. The higher the value, the more certain the classifier must be. This is an example of fuzzy logic

Figure 5-4 Inter-ART Field Operation (Nessmiller, 1995)

The last step is the inter-ART field, represented by Fab, which couples the two ARTs together. The inter-ART field has two purposes. First, it maps the classification from ARTa to the classification output of ARTb. Secondly, it realizes the match tracking rule. When there is a mismatch during training between the output of ARTa and the correct classification of ARTb, match tracking occurs. Compared to other image classifiers, the fuzzy ARTMAP tends to be mathematically complex and computationally intensive. For a more rigorous development of the Fuzzy ARTMAP classifier, the reader may consult Nessmiller (1995) or Carpenter et al. (1991).

The fuzzy ARTMAP is a non-parametric classifier, so it makes no assumption of normality as the GML classifier does. However, as with other non-parametric classifiers, experience has indicated that it tends to be extremely sensitive to biased training sets and noisy data points. For this reason it requires a highly homogeneous training set. This property is important to remember when selecting training regions. Therefore, the criterion for selecting training sets for the fuzzy ARTMAP is very different from that for the GML classifier. Nevertheless, when supplied with robust training data it

5.3.3 Rule Based Genetic Algorithm

Mystic is a classifier, termed terrain categorization (TERCAT), which uses logical rules to assign image pixels to their respective classes. It has been implemented within the MATRIX environment. Rules can be powerful and flexible methods for associating an observed pixel with a specific class. Mystic's reliance on rules rather than statistics allows the classifier to make no assumption of normality. Therefore, this type of non-traditional classifier does not make the same errors that traditional classifiers, such as the GML, make by erroneously assuming target reflectance is distributed in a Gaussian manner. A rule is simply a logical statement which selects some pixels and rejects others. A sample rule (Equation 5-9) is provided to illustrate the classification process. Parameters within each rule are optimized in such a way that the rules function in the best manner possible on the supplied training data. The measure of how well a specific rule functions is based on its performance during the optimization process, where it is used against the training data, for which the 'true' class is known. This measure for a given rule is called a reward function and is calculated by applying the rule to all pixels in the training set and finding the number of correctly classified pixels. The more pixels properly classified, the higher the reward value for that combination of variables. In other words, the dependent set accuracy assessment is used as feedback into the classifier. Obviously, assessing the accuracy with this same data set will result in an overly optimistic accuracy estimate. The enormous number of parameter combinations allowed by even simple rules necessitates the use of an advanced optimization algorithm. Attempting to test each combination is precluded by practical time constraints on any current or foreseeable computer. Recent developments of sophisticated

function, to continue to the next generation. Even with this advanced optimization technique, the Mystic classifier is extremely computation intensive.

One of the simplest and most successful rules is given below. This typical rule is called the one band threshold. Once selected, this rule would be optimized by Mystic on the entire supervised training set provided by the user. The reward function for the set of optimization variables i, j, k is the number of pixels correctly classified when the prototype rule is applied to the training set. The set of optimization variables with the highest reward function is then selected and used with the rule to classify the entire image. Theoretically, once a rule is optimized it can be applied to other, similar data sets.

(5-9)

Where:
$b_i$ is the DC in the $i$-th band, and
$i$, $j$, $k$ are variables optimized by the GA.

Then:
$b_i$ belongs to the class associated with $i$, $j$, and $k$.
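The reward function described above can be sketched in Python. The exact form of the one band threshold rule is not reproduced here, so the sketch assumes a rule of the form "the pixel belongs to the class if lo <= DC in band i <= hi", with `band`, `lo`, and `hi` standing in for the optimized variables i, j, k. It also counts a pixel as correctly classified when the rule's accept or reject decision agrees with the training label, and it replaces the genetic algorithm with an exhaustive search over a coarse grid, which is only practical for a toy example. All names are hypothetical.

```python
import numpy as np
from itertools import product

def one_band_rule(pixels, band, lo, hi):
    """Assumed rule form: select pixels whose DC in `band` lies in [lo, hi]."""
    dc = pixels[:, band]
    return (dc >= lo) & (dc <= hi)

def reward(pixels, is_target, band, lo, hi):
    """Number of training pixels on which the rule agrees with the truth."""
    return int(np.sum(one_band_rule(pixels, band, lo, hi) == is_target))

def best_parameters(pixels, is_target, thresholds):
    """Exhaustive grid search, standing in for the GA in this toy example."""
    n_bands = pixels.shape[1]
    candidates = (
        (band, lo, hi)
        for band, lo, hi in product(range(n_bands), thresholds, thresholds)
        if lo <= hi
    )
    return max(candidates, key=lambda c: reward(pixels, is_target, *c))

# Four two-band training pixels; band 1 separates the target class.
pixels = np.array([[10, 200], [12, 50], [11, 180], [13, 40]])
is_target = np.array([True, False, True, False])
best = best_parameters(pixels, is_target, thresholds=range(0, 256, 32))
print(best, reward(pixels, is_target, *best))
```

Because the reward is computed on the same pixels used for optimization, it is a dependent-set measure and will tend to overstate accuracy, as noted in the text.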

Mystic requires that the user select the rules which will be used to identify pixels in each of the classes. Mystic is packaged with 6 predefined rules, and allowances are made for user defined rules. A different rule can be used to identify each class, but only one rule is allowed within each class. For example, different rules can be used to assign pixels to class A or class B, but only one rule can assign pixels to class A and only one rule can classify pixels as class B. The Mystic algorithm uses the GA to optimize the parameters of each rule, but not which rule is used. Currently, the Mystic classifiers are very simple and utilize only the spectral information of each pixel. All rules are based on the DC in the bands of one pixel without regard to the neighboring pixels. Neglecting the surrounding pixels fails to utilize any of the spatial information of a scene

5.4 Image Data Sets

Three different scenes were selected to be used in this study. Of these images, one was synthetically generated on a computer while the rest were acquired using real airborne sensors. These particular images were selected because they represent a wide sampling of terrain, phenomenology, and content. The M7 and Daedalus sensors used to acquire these multispectral images are of particular interest because of their combination of high spectral and spatial resolution. This combination has great potential for generating images which can be classified to a high degree of accuracy and precision. The images used in this project were taken in the visible (VIS) to short-wave infrared (SWIR) spectral region of the electromagnetic spectrum. Bands longer than this, if any, were eliminated to avoid thermal photon contributions. Thermal bands are often avoided when classifying images because these bands have low day-to-day correlation. This attribute is not desirable because it makes training data collected from one image not applicable to images acquired on subsequent days. Portions around the perimeter of two images have been removed because they exhibited erroneous sensor effects. These portions were not classified and did not contribute to accuracy assessment. The images consisted of raw digital counts.

Nearly any study involving different image classification algorithms will utilize the GML classifier. The GML classifier has consistently demonstrated high classification accuracy and is frequently used as a baseline for comparisons of new classifiers. It was selected for use in this study for these reasons. However, the non-parametric nature of the Mystic classifier differs significantly from the GML. The non-parametric fuzzy ARTMAP classifier was chosen because it utilized an equally nontraditional approach as

5.4.1 Tank Scene

The first image (Figure 5-5), which will be called the tank scene, was acquired as part of the Southern Rainbow collection by the Environmental Research Institute of Michigan (ERIM). It was captured at 8 bits per pixel using the 16 band M7 aerial line scanner. Band number 16, the thermal band, was removed and not used in this study. The bandpasses for the remaining bands are listed in Table 5-1. This image in particular was selected for its diversity of content. In addition to forest, brush, and exposed soils, the scene contains a variety of man-made objects. The scene derived its name from the fact that several military vehicles, including tanks, are camouflaged throughout the image. During classification, all vehicles were categorized into one metal class. To reduce classification error and produce a useful class map, 9 classes were needed to categorize this image, compared to approximately 5 for other scenes. This scene was imaged as part of a well organized collection and is therefore highly characterized. Many ground photos are available for building accurate reference data sets.

Table 5-1 Southern Rainbow Bandpasses

M-7 Band    Bandpass (µm)    "Color"
1           0.45 - 0.47
2           0.48 - 0.50      Blue
3           0.51 - 0.55
4           0.55 - 0.60      Green
5           0.60 - 0.64
6           0.63 - 0.68      Red
7           0.68 - 0.75
8           0.71 - 0.81      Near IR
9           0.81 - 0.92
10          1.02 - 1.11
11          1.21 - 1.30
12          1.53 - 1.64
13          1.54 - 1.75
14          2.08 - 2.20

[image:40.552.176.382.90.302.2]

Figure 5-5 Southern Rainbow Tank Scene

5.4.2 Desert Scene

The desert scene (Figure 5-6) was acquired as part of the Western Rainbow, Joint Camouflage Concealment and Deception (JCCD) field collection using the Daedalus airborne sensor. The site of this scene is the Yuma Proving Grounds. The original GIFOV of the scene was one meter, but the image was also degraded to two and four meter resolutions for use in this study. The scene consists mostly of desert pavement (or desert varnish), but notable features have been expanded for illustration purposes in Figure 5-6. The thermal bands have been removed again, and the edges which exhibited severe geometric distortion have been masked out. The collection was well documented and many ground photographs are available for verifying the land cover. The image was

[image:41.552.102.455.154.606.2]

Figure 5-6 Western Rainbow Desert Scene

Table 5-2 Western Rainbow Bandpasses

M-7 Band    Bandpass (µm)      "Color"
1           0.405 - 0.455      Blue
2           0.435 - 0.535
3           0.500 - 0.625      Green
4           0.570 - 0.650
5           0.595 - 0.720      Red
6           0.645 - 0.790
7           0.700 - 0.955      Near IR
8           0.785 - 1.070
9           1.495 - 1.835
10          2.011 - 2.560

[Figure: horizontal axis "Wavelength", 0.385 to 2.385]

5.4.3 Forest Scene

The forest scene (Figure 5-8) is the final image. Unlike the first two scenes, which were imaged with real airborne sensors, this image was generated synthetically with the Digital Imaging and Remote Sensing Image Generation (DIRSIG) model. The bandpasses (Table 5-3) simulate those of the M7 line scanner. The radiance field generated by DIRSIG was convolved with a 3x3 equal weighted kernel, resampled to one third of the original size using cubic convolution, and quantized to 8 bits per pixel for each of the 15 bands. Convolution was necessary because the radiance field pixels are spectrally pure, whereas the convolution results contain mixed pixels, as is the case in real images. Three versions of the synthetic scene were generated. These images had LOWTRAN atmospheric visibilities of 23, 7, and 5 kilometers. For further details about synthetic images generated by DIRSIG, the reader is referred to DIRSIG, Digital Imaging and Remote Sensing Image Generation, Description, Enhancement, and Validation (Schott et al., 1993).
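The degradation chain described above (3x3 equal weighted convolution, reduction to one third size, 8-bit quantization) can be sketched for a single band as follows. This is a simplified illustration and not the processing code used for the thesis: it keeps every third blurred pixel instead of performing cubic convolution resampling, and the linear scaling by a hypothetical `max_radiance` argument is an assumed quantization scheme.

```python
import numpy as np

def box_filter_3x3(band):
    """Convolve one band with a 3x3 equal weighted kernel (edges replicated)."""
    padded = np.pad(band.astype(float), 1, mode="edge")
    rows, cols = band.shape
    total = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0

def degrade(band, max_radiance):
    """Blur, reduce to one third size, and quantize to 8 bits per pixel."""
    blurred = box_filter_3x3(band)
    # Simplification: keep the center pixel of each 3x3 block rather than
    # resampling with cubic convolution as described in the text.
    reduced = blurred[1::3, 1::3]
    scaled = np.clip(reduced / max_radiance, 0.0, 1.0) * 255.0
    return np.round(scaled).astype(np.uint8)

# A 9x9 radiance field with two spectrally pure regions (illustrative values).
field = np.zeros((9, 9))
field[:, 4:] = 2.0
print(degrade(field, max_radiance=2.0))
```

In the printed result the middle column takes an intermediate value, which is the mixed-pixel effect the convolution is meant to introduce.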

Table 5-3 DIRSIG Scene Bandpasses

Synthetic Band    Bandpass (µm)    "Color"
1                 0.44 - 0.50      Red
2                 0.46 - 0.53
3                 0.49 - 0.57      Green
4                 0.53 - 0.62
5                 0.58 - 0.67      Blue
6                 0.61 - 0.72
7                 0.66 - 0.76
8                 0.70 - 0.93      Near IR
9                 0.76 - 1.04
10                0.90 - 1.38
11                1.10 - 1.39
12                1.30 - 1.79
13                1.40 - 1.89
14                1.90 - 2.39

6. Approach

All three classifiers were trained using the same training regions for each image. Providing an optimal, common training set for all classifiers was difficult, but a quantitative comparison would not be possible without it. The accuracy of each of the resulting class maps was assessed using dependent, independent, and random reference sources. Reference data from DIRSIG material maps was used for the synthetic images as well. From these reference sources, the Simple, Weighted, Kappa, Prediger's Kappa, and Tau coefficients were calculated. The results were obtained using a combination of real and synthetic imagery. The synthetic data sets served as a good indicator of bias in the other sampling techniques. Trends were then observed in the results obtained from both the sampling methods and accuracy metrics. The goal of this novel approach was to identify the optimal overall method for accuracy assessment of class maps based on accuracy and efficiency.

A single program was written to generate a confusion matrix and evaluate the five most common accuracy metrics. The confusion matrices were generated from any one of four different ground truth sources. Dependent, independent, random, and synthetic data sets were read in as raw image files. In addition to any one of these data sets, the user must also supply a class map. This class map can be generated by any of the classification methods but must also be supplied in the form of a raw image file. Each reference and class map must be a single band image. Each class was designated by a unique digital count (DC), and the background class, if any, was designated by a DC of zero (black). The DC in the class map must match the DC in the truth data set for each corresponding class. This was done using a UNIX utility (XV) by changing the gray level in either image to match for each class. A key file was used for each class map to identify
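The confusion matrix step described above can be sketched with NumPy. This is a hedged illustration rather than the program written for the thesis: it assumes the class map and reference image are already loaded as equal-sized integer arrays, treats a DC of zero in either image as background or undefined and excludes it, and computes only the overall proportion of agreement and Cohen's Kappa in its standard form. The Weighted, Prediger's Kappa, and Tau coefficients are omitted, and all function names are hypothetical.

```python
import numpy as np

def confusion_matrix(class_map, reference, background=0):
    """Rows index the class map DC, columns index the reference DC."""
    class_map = np.asarray(class_map).ravel()
    reference = np.asarray(reference).ravel()
    # Assumption: DC 0 in either image is background/undefined and is skipped.
    keep = (class_map != background) & (reference != background)
    labels = np.unique(np.concatenate([class_map[keep], reference[keep]]))
    index = {dc: k for k, dc in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)), dtype=int)
    for c, r in zip(class_map[keep], reference[keep]):
        matrix[index[c], index[r]] += 1
    return labels, matrix

def overall_accuracy(matrix):
    """Proportion of assessed pixels on the diagonal."""
    return np.trace(matrix) / matrix.sum()

def kappa(matrix):
    """Cohen's Kappa: (Po - Pe) / (1 - Pe), Pe from the row/column marginals."""
    n = matrix.sum()
    po = np.trace(matrix) / n
    pe = np.sum(matrix.sum(axis=0) * matrix.sum(axis=1)) / n ** 2
    return (po - pe) / (1.0 - pe)

# Tiny illustrative images; DC 0 marks an undefined pixel in the class map.
class_map = [[1, 1, 2], [2, 0, 1]]
reference = [[1, 2, 2], [2, 1, 1]]
labels, matrix = confusion_matrix(class_map, reference)
print(matrix, overall_accuracy(matrix), kappa(matrix))
```

Skipping the zero-valued pixels mirrors the statement that undefined pixels neither increase nor decrease measured accuracy.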

6.1 Experimental Data Set Matrix

Three scenes were used as the basis for this effort. Images were generated from these scenes with degraded atmospheric visibility or spatial resolution. Images had three possible spatial resolutions: 1 meter, 2 meter, and 4 meter ground spot size. The atmospheric visibility of the images was either 23 kilometers, 7 kilometers, or 5 kilometers. Due to the large number of possible combinations of scenes and stressing parameters, only a limited number were selected for analysis. Figure 6-1 illustrates the experimental matrices for the stressing parameters of resolution and atmospheric visibility which were selected for each of the scenes. The figure indicates the source of the reference data used to assess the accuracy of each class map. The number next to the reference source indicates which classifier or classifiers were used to categorize that image. For each of the numbers, the scene was degraded, the classifier(s) were trained, the image classified, and the final class map accuracy was evaluated. As part of this thesis, a total of one hundred and nineteen (119) confusion matrices were generated. The results of these accuracy

Figure 6-1 Experimental Matrices (panels: Tank Scene, Forest Scene, Desert Scene; axis: Resolution, 1 m, 2 m, 4 m; each cell lists the reference sources dependent, independent, scaled independent, and random, plus synthetic for the Forest Scene, each followed by classifier numbers. In the Tank Scene panel one cell lists classifiers 1,2,3 for every source and the other two list classifier 1 only; all Forest Scene and Desert Scene cells list 1,2,3.)

1 - Gaussian Maximum Likelihood
2 - Rule Based Genetic Algorithm (MYSTIC)
3 - Fuzzy ARTMAP Neural Network

6.2 Importing Training Data

Training data consists of the digital counts (DC) in each band of a selected pixel and the proper class to which it should be assigned. Training regions are the image areas over

selected as the application from which training regions will be selected. Using the mouse, polygon vertices will be selected in each image to designate the desired classes. Different color polygons will be used for each class. These regions of interest (ROI) can then be used directly for supervised classification (GML) within ENVI. To make an impartial comparison between classification algorithms, it was decided that common training sets would be used for each image with all three classification methods.

The following procedure was used to import training data into the Mystic rule based genetic algorithm classifier. First, the image underneath the polygons will be replaced by a black background within ENVI. The ROIs superimposed over the black background will then be saved as a GIF image. This GIF image will then be converted to a portable pixel map (PPM) using PNMTOOLS. Once this is complete, the PPM image can be imported into a program which uses this image as a mask against the original image. Areas in the mask which are black are kept black, and in areas where the mask is not black the original image will pass. This will be done on each of the bands in the original image automatically by the program. The result is a Mystic training image which is black everywhere except where the desired ROIs were selected in ENVI. In these areas, the original multiband image will appear. Due to the Mystic 256x256 pixel limit on training image sizes, one extra step is required. The Mystic training images were larger than this, so ENVI will be used to generate a smaller image (<256x256) into which each of the training regions will be cut and pasted. The mosaic image will serve as the final Mystic training image. This image is then imported into Mystic's training function. Each class region is selected by specifying the proper region from the Mystic training image. The Mystic iso-data function helps automate this process by automatically selecting the proper polygon after the user clicks within each training region with the mouse. The iso-data parameters will be adjusted to the proper threshold to allow proper functioning. After the region selection is done, Mystic is trained (rules are optimized) on this data. Once Mystic has been

classification is performed on the original image. This procedure will be repeated for all the baseline images.
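The masking step in the Mystic import procedure (mask pixels that are black stay black, all others pass the original image, band by band) can be sketched with NumPy. This is an illustrative sketch only: it assumes the ROI mask and the multiband image have already been read into arrays of matching size, and the function name and array layouts are hypothetical.

```python
import numpy as np

def apply_roi_mask(image, mask_rgb):
    """Zero every band of `image` wherever the ROI mask is black.

    image:    array of shape (rows, cols, bands) holding raw digital counts
    mask_rgb: array of shape (rows, cols, 3); black (0, 0, 0) means outside
              every training polygon
    """
    inside = np.any(mask_rgb != 0, axis=-1)          # True inside any ROI
    return image * inside[..., np.newaxis].astype(image.dtype)

# A 2x2 image with 3 bands; only the top-left pixel lies inside an ROI.
image = np.full((2, 2, 3), 7, dtype=np.uint8)
mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0] = (255, 0, 0)                             # a red ROI polygon pixel
print(apply_roi_mask(image, mask)[:, :, 0])
```

The polygon color is ignored here; only whether the mask pixel is black matters, since class assignment happens later when each region is selected within Mystic.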

The procedure for training the Fuzzy ARTMAP will require fewer steps on the part of the user when compared to that of Mystic. First, the training polygons were saved as an image with a black background. Each of the polygons corresponding to an individual class will be designated by a unique color. This image is then saved from within ENVI as a GIF image in exactly the same manner as was used while training Mystic. This image will then be conve