A Comparison of pattern classification techniques for orienting chest X-rays


Rochester Institute of Technology

RIT Scholar Works

Theses

Thesis/Dissertation Collections

1997


Martin R. Hoffmann


Rochester Institute of Technology

Computer Science Department

A Comparison of Pattern Classification Techniques for Orienting Chest X-rays

by

Martin R Hoffmann

A thesis, submitted to

The Faculty of the Computer Science Department,

in partial fulfillment of the requirements for the degree of

Master of Science in Computer Science.

Approved by:

Professor Peter Anderson

Dr. Roger Gaborski

Professor John Biles


Permission To Copy

Title of thesis: A Comparison of Pattern Classification Techniques for Orienting Chest X-rays

I, Martin R. Hoffmann, hereby grant permission to the Wallace Library of the Rochester Institute of Technology to reproduce my thesis in whole or in part. Any reproduction will not be for commercial use or profit.


Abstract

The problem of orienting digital images of chest x-rays, which were captured at some multiple of 90 degrees from their true orientation, is a typical pattern classification problem. In this case, the solution to the problem must assign an instance of a digital image to one of four classes, where each class corresponds to one of the four possible orientations.

A large number of techniques are available for developing a pattern classifier. Some of these techniques are characterized by independent variables whose values are difficult to relate back to the problem being solved. If a technique is highly sensitive to the values of these variables, the lack of a rigorous way of defining them can be a significant disadvantage to the inexperienced researcher.

This thesis presents experiments by the author to solve the chest x-ray orientation problem using four different pattern classification techniques: genetic programming, an artificial neural network trained with back propagation, a probabilistic neural network, and a simple linear classifier. In addition, the author will demonstrate that an understanding of the design of a feature set may allow a programmer to develop a traditional program which does an adequate job of solving the classification problem. Comparisons of the different techniques will be based not only on their success at solving the problem, but also on the time required to find an acceptable solution and the degree to which each technique is sensitive to the values of the variables which characterize it.

The thesis demonstrates that all of the techniques can be used to derive very accurate chest x-ray orientation classifiers. While it is dangerous to generalize the results of these experiments to pattern classification problems in general, the author will argue that the magnitude of the differences in performance between the different techniques minimizes this danger. In particular, the experiments suggest that the linear classifier is so computationally inexpensive that it is always worth trying, unless there is a priori knowledge that it will fail. The experiments also suggest that genetic programming is much more computationally expensive than are the linear classifier, artificial neural network, and probabilistic neural network techniques.

Of the four conventional pattern classification techniques which were examined, it will be shown that the artificial neural network produced the most accurate classifiers for the x-ray orientation problem. In addition, the results of a number of trials suggest that the final accuracy of the classifier is relatively insensitive to the values of the parameters which characterize this technique, making it an appropriate choice for the inexperienced researcher. With respect to the ability of the resulting classifier to accurately orient sample x-rays which were not included in the training set, the artificial neural network performed well, when compared to the other techniques.

Although the classifiers produced by the genetic programming technique were significantly more expensive to construct and were slightly less accurate than the best artificial neural networks, the results of genetic programming experiments can provide insights into the problem being studied, which would be difficult to discern from the classifiers produced by the other techniques. For example, one of the classifiers which was produced by genetic programming uses only eight of the twenty feature values extracted from the sample x-ray. Not only does this reduce the cost of extracting the feature values from an unknown sample, but the classifier itself would be much

Table Of Contents

1. INTRODUCTION
2. PATTERN CLASSIFICATION TECHNIQUES
    2.1. Genetic Programming
    2.2. Artificial Neural Network
    2.3. Probabilistic Neural Network (PNN)
    2.4. Linear Pattern Classifier
3. CHEST X-RAY FEATURE EXTRACTION
    3.1. Common Elements Of The Two Feature Sets
    3.2. Feature Set F1
    3.3. Feature Set F2
    3.4. The Feature Extraction Program
4. DESIGN OF EXPERIMENTS
    4.1. The Latin Square Technique
    4.2. Analysis Of Variance
5. THE CHEST X-RAY ORIENTATION EXPERIMENTS
    5.1. Genetic Programming Experiments
    5.2. Artificial Neural Network Experiments
    5.3. Probabilistic Neural Network Experiments
    5.4. Linear Classifier Experiments
6. A TRADITIONAL PROGRAM FOR CHEST X-RAY ORIENTATION
7. CONCLUSIONS
    7.1. Accuracy Of The Classifiers
    7.2. Validity Of The Sample Data
    7.3. The Cost Of Developing A Classifier
    7.4. Sensitivity To Tunable Parameters
    7.5. Ideas For Further Research
A. ANALYSIS OF CLASSIFIER PRODUCED BY GENETIC PROGRAMMING
B. NEURAL NETWORK WEIGHTS
C. PROBABILISTIC NEURAL NETWORK EXEMPLARS
D. LINEAR CLASSIFIER MATRICES

1. Introduction

In order to investigate the problem of chest x-ray orientation, the author implemented classifiers using four different pattern classification techniques: genetic programming, artificial neural networks trained with back propagation, probabilistic neural networks, and simple linear classifiers. In addition, the author employed knowledge used in the design of a particular feature set to develop a more traditional program which could also be used as a chest x-ray orientation classifier. Each of these classifiers was implemented using Sun C++ 3.1, and all of the experiments were executed on a Sun SPARCstation 10 with 64 MBytes of RAM, running Solaris V2.3.

The author obtained 238 sample chest x-ray images from the Eastman Kodak Company Research Labs. Each sample was repeatedly rotated 90 degrees to produce a sample image in each of the four target orientations, resulting in a total of 952 sample images. The original samples were 45x55 pixel, 8-bit gray-scale images. These were cropped about their centers to produce square, 45x45 pixel images.
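The preparation of the sample images described above (cropping each image about its center to a square, and rotating it repeatedly by 90 degrees) can be illustrated with a short sketch. This is not the author's code: the Image structure, the function names, the row-major pixel layout, and the clockwise direction of rotation are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit gray-scale image, stored row by row.
struct Image {
    std::size_t width, height;
    std::vector<std::uint8_t> pixels;   // size == width * height
    std::uint8_t at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

// Crop the largest centered square out of an image (e.g. 45x55 -> 45x45).
Image cropCenterSquare(const Image &src) {
    const std::size_t side = src.width < src.height ? src.width : src.height;
    const std::size_t x0 = (src.width - side) / 2;
    const std::size_t y0 = (src.height - side) / 2;
    Image out{side, side, std::vector<std::uint8_t>(side * side)};
    for (std::size_t y = 0; y < side; ++y)
        for (std::size_t x = 0; x < side; ++x)
            out.pixels[y * side + x] = src.at(x0 + x, y0 + y);
    return out;
}

// Rotate a square image 90 degrees clockwise: pixel (x, y) moves to (n-1-y, x).
Image rotate90(const Image &src) {
    const std::size_t n = src.width;
    Image out{n, n, std::vector<std::uint8_t>(n * n)};
    for (std::size_t y = 0; y < n; ++y)
        for (std::size_t x = 0; x < n; ++x)
            out.pixels[x * n + (n - 1 - y)] = src.at(x, y);
    return out;
}
```

Under these assumptions, keeping each cropped sample and its three successive rotations yields four images per original, which matches the 4 x 238 = 952 sample images reported above.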

Two feature sets were extracted from the 952 sample images, to be used in training and evaluating the performance of the various techniques. The simpler of the two feature sets was derived by summing pixel values across various paths through the image. A second, more complex feature set was designed to detect the orientation of the dark region which appears between the lungs in typical chest x-ray images.

A number of experiments were performed in which a classifier was trained using a fraction of the sample images, and the overall accuracy of the resulting classifier was evaluated using the remaining samples. Care was taken to ensure that the samples used for training originated from a different set of the original 238 images than did the samples used for evaluating the final classifier.

Experiments using the genetic programming and artificial neural network techniques require the selection of values for a number of different independent variables which are difficult to relate to the problem being solved. The author employed the Latin Squares technique to develop a suite of experiments which tested the classifiers against various

This paper is organized as follows. Following this introduction are sections which describe each of the four pattern classification techniques which were used to solve the chest x-ray orientation problem. Each of these sections begins by briefly describing the technique, and concludes with a description of the author's implementation of the technique used in the experiments. After the sections which describe the pattern classification techniques is a section which describes the sample x-ray images used in the experiments and the two feature sets which were extracted from these images. Subsequent sections of the paper will describe the experiments performed with each of the four techniques and two feature sets, including the setup of the experiments and the results from these experiments. Finally, the author describes his attempt to write a traditional program for orienting the x-rays and offers his

2. Pattern Classification Techniques

This section of the paper describes the four different pattern classification techniques which were used to orient the chest x-rays. Each description begins with some background about the technique and ends with details of the implementation used in the experiments described in this paper.

2.1. Genetic Programming

In 1975, John Holland published Adaptation In Natural And Artificial Systems, which showed how the evolutionary process can be applied to problems in adaptation. The technique of Genetic Algorithms was developed as a means of using evolution to solve such problems and is the foundation upon which the Genetic Programming technique is based.

2.1.1. Genetic Algorithms

With the simplest form of Genetic Algorithm, candidate solutions to a problem are represented by fixed-length strings. A population of candidate solutions is randomly selected from the set of all possible solutions and the individuals in the population are ranked according to their ability to solve the target problem. Based on the results obtained for this population, a new population is drawn from the solution space and evaluated. This is repeated until an acceptable solution is found.

A new population of candidate solutions is generated from the previous population by applying three different methods of breeding: asexual reproduction, cross-over, and mutation. Asexual reproduction involves selecting a candidate from the current population and copying it into the new population.

Figure 1: Example Of Breeding By Asexual Reproduction

Candidate Solution A:      "A1 A2 A3 A4 A5 A6 A7 A8 A9"
New Candidate Solution A': "A1 A2 A3 A4 A5 A6 A7 A8 A9" (same as original)

Cross-over is done by selecting two individuals from the existing population and randomly selecting a point where

Figure 2: Example Of Breeding By Cross-Over

Candidate Solution A:      "A1 A2 A3 A4 A5 A6 A7 A8 A9"
Candidate Solution B:      "B1 B2 B3 B4 B5 B6 B7 B8 B9"
Randomly Selected Cross-Over Position = 4
New Candidate Solution A': "A1 A2 A3 B4 B5 B6 B7 B8 B9"
New Candidate Solution B': "B1 B2 B3 A4 A5 A6 A7 A8 A9"

Like asexual reproduction, mutation involves selecting a single individual from the current population and copying it into the new population. With mutation, however, one character in the new string is randomly selected and changed to a different value.

Figure 3: Example Of Breeding By Mutation

Candidate Solution A:      "A1 A2 A3 A4 A5 A6 A7 A8 A9"
New Candidate Solution A': "A1 A2 A3 A4 A5 A6 B7 A8 A9"
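The cross-over and mutation operations of Figures 2 and 3 can be sketched on fixed-length strings as follows. The function names and the use of std::string as the chromosome type are assumptions made for illustration; the source does not show an implementation of the basic Genetic Algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <utility>

// One-point cross-over on two equal-length strings. `point` is the 0-based
// index of the first character taken from the other parent, so Figure 2
// (cross-over position 4, counting from 1) corresponds to point == 3.
std::pair<std::string, std::string>
crossOver(const std::string &a, const std::string &b, std::size_t point) {
    return std::make_pair(a.substr(0, point) + b.substr(point),
                          b.substr(0, point) + a.substr(point));
}

// Mutation: copy the parent and overwrite one position with a new symbol.
std::string mutate(const std::string &parent, std::size_t position, char symbol) {
    std::string child = parent;
    child[position] = symbol;
    return child;
}

// Cross-over at a uniformly chosen interior point (strings of length >= 2).
std::pair<std::string, std::string>
randomCrossOver(const std::string &a, const std::string &b, std::mt19937 &rng) {
    std::uniform_int_distribution<std::size_t> pick(1, a.size() - 1);
    return crossOver(a, b, pick(rng));
}
```

Asexual reproduction needs no helper in this representation, since it is a plain copy of the selected string.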

While mutation can preserve variation in the resulting population, in practice it sees little use (Koza, 1992:105). The relative frequencies by which asexual reproduction, cross-over, and mutation are used to generate the new population of candidate solutions are tunable parameters of the application of the algorithm.

By relating the fitness of a candidate solution to the probability that it is selected for breeding, the Genetic Algorithm directs the search for the optimal solution to the target problem. Selection is done with replacement, so that the same solution may be selected to participate in a number of breeding operations. Holland showed that the generation of the new population using these techniques was nearly optimal in minimizing the cost associated with sampling the solution space (Holland, 1992:139).
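Fitness-proportionate selection with replacement, as described above, is often implemented as a "roulette wheel". The following is a minimal sketch under that assumption; the function names are hypothetical, and the fitness values are assumed to be non-negative with a positive sum.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Return the index whose slice of the cumulative fitness contains `spin`,
// where spin lies in [0, sum of all fitness values).
std::size_t selectBySpin(const std::vector<double> &fitness, double spin) {
    double running = 0.0;
    for (std::size_t i = 0; i < fitness.size(); ++i) {
        running += fitness[i];
        if (spin < running) return i;
    }
    return fitness.size() - 1;   // guards against floating-point rounding
}

// Select one candidate with probability proportional to its fitness.
// Nothing is removed from the population, so the same candidate can be
// selected again on a later call (selection with replacement).
std::size_t selectParent(const std::vector<double> &fitness, std::mt19937 &rng) {
    double total = 0.0;
    for (double f : fitness) total += f;
    std::uniform_real_distribution<double> spin(0.0, total);
    return selectBySpin(fitness, spin(rng));
}
```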

2.1.2. Genetic Programming

John Koza has developed a variation on Genetic Algorithms called Genetic Programming. With Genetic Programming, solutions to the target problem are represented by programs rather than simple strings (actually,

In his book, Genetic Programming, Koza argues that representing candidate solutions as parse trees instead of strings increases the expressive power of the algorithm, without invalidating the work of Holland. Therefore, he concludes that the algorithm described below is a near optimal method of sampling a solution space consisting of candidate programs (Koza, 1992:116).

Each node within a candidate parse tree represents an input or function. Terminal nodes are inputs or functions which require no arguments. Non-terminal nodes represent functions which take one or more arguments. The values of these arguments are found by evaluating child nodes in the tree.

Figure 4: Example Of A Genetic Programming Parse Tree

In order to ensure that randomly generated parse trees and parse trees resulting from breeding are always valid programs, Genetic Programming requires that the program inputs, function arguments, and function return values be of the same data type.

In a Genetic Programming experiment, a population of programs is randomly generated. Each member of the population is evaluated against test data and the programs are ranked as to their fitness for solving the target problem. A new population is generated by breeding individuals from the previous population.

The terminal and non-terminal nodes available for program generation depend on the nature of the problem being solved, and are part of the characterization of a particular experiment. The fitness measure used to associate a numerical rating for the suitability of each candidate program in solving the target problem is also domain specific. For example, in the experiments conducted by the author, each x-ray was reduced to a set of 20 feature values. The

plus one node for each of four integer constants: 0, 1, 2, and 3. The set of non-terminal nodes included addition, the greater-than comparison operator, a conditional "if" operator, and bitwise AND, OR, and NOT operators.

The fitness of a particular solution was determined by evaluating the candidate program repeatedly against the feature vectors associated with a fixed set of sample x-ray images. The low order two bits of the result of an evaluation of the program were used to select from amongst the four possible orientation values. The fitness value assigned to the program was simply the fraction of correctly classified training samples.
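The fitness measure just described can be sketched as follows. The Sample structure, the representation of a candidate program as a std::function, and the encoding of the four orientations as the integers 0 through 3 are assumptions made for illustration; only the use of the low order two bits and the fraction of correct classifications come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical training sample: a feature vector plus its known orientation,
// encoded here as 0, 1, 2, or 3 (an assumed encoding).
struct Sample {
    std::vector<int> features;
    int orientation;
};

// A candidate program maps a feature vector to an integer result.
using Program = std::function<int(const std::vector<int> &)>;

// The low order two bits of the program result select the orientation.
int classify(const Program &program, const std::vector<int> &features) {
    return program(features) & 3;
}

// Fitness is the fraction of training samples that are classified correctly.
double fitness(const Program &program, const std::vector<Sample> &training) {
    if (training.empty()) return 0.0;
    std::size_t correct = 0;
    for (const Sample &s : training)
        if (classify(program, s.features) == s.orientation) ++correct;
    return static_cast<double>(correct) / static_cast<double>(training.size());
}
```

Masking with 3 means that every integer result maps to a valid class, so no candidate program can produce an "invalid" answer; a poor program is simply wrong more often.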

2.1.3. Creating The Initial Population Of Programs

The initial population of candidate programs is generated randomly using the sets of terminal and non-terminal nodes defined for the experiment. Koza describes two methods of generating different shaped parse trees and then recommends a hybrid method called "ramped half-and-half" (Koza, 1992:91).

The first of the two basic parse tree generation methods is the full method. The full method begins by selecting a target depth for the parse tree. The tree is generated from the top down. When a node is needed as an argument to a function in the previous level of the tree, a non-terminal node is randomly selected, if and only if the depth of the current branch of the tree is less than the target depth. If the depth of the current branch is equal to the target depth, a terminal node is randomly selected. Uniform probabilities are used for node selection.

The second of the two basic methods of parse tree generation is the grow method. With the grow method, only the maximum depth of all branches is pre-defined. When a node is needed as an argument to a function in the previous level of the tree and the depth of the current branch is less than the target maximum depth, then a node is randomly selected from either the set of terminal or non-terminal nodes. This means that the lengths of the different branches of the parse tree can vary.

Koza suggests that the selection of nodes at intermediate levels of the tree be made uniformly from the union of the sets of terminal and non-terminal nodes (Koza, 1992:92). In practice, this leads to rather uninteresting parse trees when the number of terminal nodes is significantly greater than the number of non-terminal nodes.

Figure 6: A Parse Tree Generated By The "Grow" Method

[Tree diagram of function nodes (F1, F2, F3) and terminal nodes (T1, T2, T3).]
The depth of each branch can be different (max depth is 5)

The ramped half-and-half method is a hybrid of these two methods of parse tree generation. With this method, half of the initial population of parse trees is generated using the full method and the other half is generated using the grow method. The minimum target depth used to generate any tree is two. The maximum target depth is a tunable parameter typically around six (Koza, 1992:116). The same number of parse trees are generated for each target

During the generation of the initial population of programs, duplicate parse trees are eliminated by replacing one of the trees with a new tree with the same characteristics. Because the programs do not interact with one another, there is no advantage to having duplicate programs in the initial population of a Genetic Programming experiment.

After the first generation of programs is created, the programs are evaluated against test data and are assigned fitness values which define their relative success at solving the target problem. Subsequent generations of programs are created by breeding individuals from the current generation.
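The full, grow, and ramped half-and-half methods of this section can be sketched as below. The Tree and Primitives structures and all function names are hypothetical. Three points are assumptions rather than statements from the text: the root is counted as depth 1, the root is always a function node, and the population alternates between the two methods while cycling through the target depths. Elimination of duplicate trees is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <random>
#include <vector>

struct Tree {
    bool terminal;
    std::size_t symbol;                           // index into terminal or function set
    std::vector<std::unique_ptr<Tree>> children;
};

struct Primitives {
    std::size_t numTerminals;         // number of terminal symbols (>= 1)
    std::vector<int> functionArity;   // argument count of each function (each >= 1)
};

enum class Method { Full, Grow };

std::unique_ptr<Tree> generate(const Primitives &p, Method method, int depth,
                               int targetDepth, std::mt19937 &rng) {
    bool terminal;
    if (depth >= targetDepth) {
        terminal = true;                           // branch has reached the target depth
    } else if (method == Method::Full || depth == 1) {
        terminal = false;                          // full method; root is a function (assumed)
    } else {                                       // grow: uniform over the union of both sets
        std::uniform_int_distribution<std::size_t> pick(
            0, p.numTerminals + p.functionArity.size() - 1);
        terminal = pick(rng) < p.numTerminals;
    }
    std::unique_ptr<Tree> node(new Tree);
    node->terminal = terminal;
    if (terminal) {
        std::uniform_int_distribution<std::size_t> pick(0, p.numTerminals - 1);
        node->symbol = pick(rng);
    } else {
        std::uniform_int_distribution<std::size_t> pick(0, p.functionArity.size() - 1);
        node->symbol = pick(rng);
        for (int i = 0; i < p.functionArity[node->symbol]; ++i)
            node->children.push_back(generate(p, method, depth + 1, targetDepth, rng));
    }
    return node;
}

// Alternate full and grow while cycling the target depth over 2..maxTargetDepth.
std::vector<std::unique_ptr<Tree>> rampedHalfAndHalf(const Primitives &p, std::size_t size,
                                                     int maxTargetDepth, std::mt19937 &rng) {
    std::vector<std::unique_ptr<Tree>> population;
    for (std::size_t i = 0; i < size; ++i) {
        const int target = 2 + static_cast<int>((i / 2) % (maxTargetDepth - 1));
        const Method method = (i % 2 == 0) ? Method::Full : Method::Grow;
        population.push_back(generate(p, method, 1, target, rng));
    }
    return population;
}

int deepestLeaf(const Tree &t) {
    int d = 0;
    for (const auto &c : t.children) d = std::max(d, deepestLeaf(*c));
    return d + 1;
}

int shallowestLeaf(const Tree &t) {
    if (t.children.empty()) return 1;
    int d = deepestLeaf(t);
    for (const auto &c : t.children) d = std::min(d, shallowestLeaf(*c) + 1);
    return d;
}
```

In a full tree every leaf sits at the target depth, so deepestLeaf and shallowestLeaf agree; in a grown tree they may differ, which is the variation in branch length described above.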

2.1.4. Creating Subsequent Generations Of Programs

As with the general Genetic Algorithm, breeding is done by asexual reproduction, cross-over, and mutation. In this case, cross-over is performed by randomly selecting a node in each of the parent programs. The sub-trees rooted at these two nodes are then swapped to generate two programs for the new population (see Figure 7).
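Sub-tree cross-over can be sketched as a swap of two owning pointers. The Tree structure and function names below are hypothetical, and the sketch modifies its arguments in place, so it assumes the caller has already copied the two selected parents into the new population.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <random>
#include <string>
#include <utility>
#include <vector>

struct Tree {
    std::string label;
    std::vector<std::unique_ptr<Tree>> children;
};

std::unique_ptr<Tree> leaf(const std::string &label) {
    std::unique_ptr<Tree> t(new Tree);
    t->label = label;
    return t;
}

typedef std::vector<std::unique_ptr<Tree> *> Slots;

// Gather every position that holds a sub-tree, in pre-order (root first).
void collectSlots(std::unique_ptr<Tree> &slot, Slots &out) {
    out.push_back(&slot);
    for (std::unique_ptr<Tree> &c : slot->children) collectSlots(c, out);
}

// Swap the sub-tree at pre-order position ia of a with position ib of b.
void crossOverAt(std::unique_ptr<Tree> &a, std::size_t ia,
                 std::unique_ptr<Tree> &b, std::size_t ib) {
    Slots slotsA, slotsB;
    collectSlots(a, slotsA);
    collectSlots(b, slotsB);
    std::swap(*slotsA[ia], *slotsB[ib]);
}

// Cross-over at uniformly chosen nodes of each parent.
void crossOver(std::unique_ptr<Tree> &a, std::unique_ptr<Tree> &b, std::mt19937 &rng) {
    Slots slotsA, slotsB;
    collectSlots(a, slotsA);
    collectSlots(b, slotsB);
    std::uniform_int_distribution<std::size_t> pickA(0, slotsA.size() - 1);
    std::uniform_int_distribution<std::size_t> pickB(0, slotsB.size() - 1);
    std::swap(*slotsA[pickA(rng)], *slotsB[pickB(rng)]);
}

// Render a tree as label(child,child,...) for inspection.
std::string toString(const Tree &t) {
    std::string s = t.label;
    if (!t.children.empty()) {
        s += "(";
        for (std::size_t i = 0; i < t.children.size(); ++i) {
            if (i > 0) s += ",";
            s += toString(*t.children[i]);
        }
        s += ")";
    }
    return s;
}
```

Because whole sub-trees change owners, each offspring remains a valid parse tree as long as all nodes share one data type, which is the closure requirement stated earlier.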

The mutation operation (which is rarely used and was not used in the experiments described in this paper) is performed by pruning a randomly selected sub-tree of a selected program and replacing it with another randomly generated sub-tree.

In addition to asexual reproduction, cross-over, and mutation, Koza describes three methods of generating new programs from an existing population of programs (Koza, 1992:107-112). Although described below, these methods were not employed by the author in his experiments.

A permutation operation generates a new program by randomly re-ordering the arguments to one of the functions in an existing parse tree.

The editing operation modifies an existing parse tree by recursively applying a set of domain independent and optionally domain dependent editing rules. Koza gives an example of a domain independent editing rule as follows:

"If any function that has no side effects and is not context dependent has only constant atoms as arguments, the editing operation will evaluate the function and replace it with the value obtained from the evaluation" (Koza, 1992:108)

The final method of generating new programs is encapsulation. With encapsulation, a randomly selected sub-tree of a randomly selected candidate program is wrapped by a new primitive, which then replaces the sub-tree in the original program. The new primitive is like a subroutine. If mutation is being used to breed programs, the new

Figure 7: Example Of Cross-Over In Genetic Programming

[Tree diagrams: Parent A with randomly selected cross-over point A5; Parent B with randomly selected cross-over point B4; Offspring A'; Offspring B'.]

As with Genetic Algorithms, the relative frequencies with which the different breeding methods are employed in generating a new population of programs are tunable parameters. Programs are selected for breeding with a frequency which is proportional to their relative fitness at solving the target problem. The same program can be selected to participate in multiple breeding operations, and because asexual reproduction is also used during breeding, the same program can appear more than once in the new

In his book, Koza describes other techniques which are useful for improving the rate at which a Genetic Programming experiment converges to a solution. The author of this paper included two of these techniques in his implementation: decimation and greedy over-selection.

Because the generation of the initial population of programs is almost purely random, there are likely to be a large number of candidate programs in this first generation with extremely poor fitness ratings. Decimation provides a fast way of eliminating the poorest candidate programs, by simply removing a fixed number of them from the initial population, before breeding begins.

Even after employing decimation to trim the initial population of an experiment, the number of programs which remain may be quite large. The probability that even the fittest candidate program is selected for breeding may be relatively small. Greedy over-selection is a method which increases the probability that the programs which are better candidate solutions are selected for breeding.

With greedy over-selection, the programs from the current population are divided into two groups. The first group contains the fittest individuals which collectively account for some total fraction of the overall fitness of the population. The second group contains all of the other programs. When a program needs to be selected for breeding, a group is randomly selected and then a program is randomly selected from amongst the candidate programs in the group.

The probabilities of selecting between the two groups are skewed to greatly favor the group of programs which contains the fittest individuals. Once a group is selected, a program is selected based on the relative fitness of the programs within the group. Koza's rules of thumb are that the first group is selected 80% of the time and that the fraction of total fitness which determines the division of programs between the two groups depends on the total number of programs as follows (Koza, 1992:99):

Table 1: Koza's Allocation Of Programs For Greedy Over-Selection

Number Of Programs    Fraction Of Total Fitness From Group-1 Programs
1,000                 32%
2,000                 16%
4,000                 8%
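Greedy over-selection as described above can be sketched in two steps: split the population into the two groups, then select a group and a program within it. The Groups structure and function names are hypothetical, fitness values are assumed to be non-negative with a positive total, and the uniform fallback for a group whose fitness sums to zero is an added assumption, not something stated in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct Groups {
    std::vector<std::size_t> group1;   // fittest programs (indices into the population)
    std::vector<std::size_t> group2;   // all other programs
};

// Fill group 1 with the fittest programs until they account for the given
// fraction of the total fitness (e.g. 0.32 for a population of 1,000).
Groups splitGroups(const std::vector<double> &fitness, double fraction) {
    std::vector<std::size_t> order(fitness.size());
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::sort(order.begin(), order.end(),
              [&fitness](std::size_t a, std::size_t b) { return fitness[a] > fitness[b]; });
    const double total = std::accumulate(fitness.begin(), fitness.end(), 0.0);
    Groups g;
    double running = 0.0;
    for (std::size_t idx : order) {
        if (running < fraction * total) {
            g.group1.push_back(idx);
            running += fitness[idx];
        } else {
            g.group2.push_back(idx);
        }
    }
    return g;
}

// Fitness-proportionate choice within one group.
std::size_t pickFromGroup(const std::vector<double> &fitness,
                          const std::vector<std::size_t> &group, std::mt19937 &rng) {
    std::vector<double> weights;
    double sum = 0.0;
    for (std::size_t idx : group) {
        weights.push_back(fitness[idx]);
        sum += fitness[idx];
    }
    if (sum <= 0.0) {   // assumed fallback: uniform choice when no weight is positive
        std::uniform_int_distribution<std::size_t> uniform(0, group.size() - 1);
        return group[uniform(rng)];
    }
    std::discrete_distribution<std::size_t> pick(weights.begin(), weights.end());
    return group[pick(rng)];
}

// Select group 1 with the given probability (0.8 by Koza's rule of thumb).
std::size_t selectForBreeding(const std::vector<double> &fitness, const Groups &g,
                              double group1Probability, std::mt19937 &rng) {
    std::bernoulli_distribution useGroup1(group1Probability);
    const bool first = g.group2.empty() || useGroup1(rng);
    return pickFromGroup(fitness, first ? g.group1 : g.group2, rng);
}
```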

2.1.5. Characteristics Of A Genetic Programming Experiment

With the inclusion of decimation and greedy over-selection, the set of tunable parameters associated with a genetic programming experiment includes:

Table 2: Tunable Parameters For Genetic Programming Experiments

Terminal Set (the set of program inputs which can appear as parse tree leaves)
Function Set (the set of functions which can appear as nodes in the parse trees)
Fitness Measure (a method of assigning a numeric rating to the fitness of a program)
Population Size (number of programs in the initial population)
Maximum Number Of Generations (number of iterations during the experiment)
Maximum Depth Of Parse Tree In Initial Population
Maximum Allowable Depth Of Parse Tree During Experiment
Probability Of Cross-Over
Probability Of Asexual Reproduction
Probability Of Mutation
Probability Of Permutation
Probability Of Edit
Probability Of Encapsulation
Probability Of Selecting Leaf Node During Cross-Over
Fraction Of Programs Discarded By Decimation Step
Fraction Of Overall Fitness Allocated To Greedy Over-Selection Group-I
Probability Of Selecting Program From Greedy Over-Selection Group-I

Koza claims that the accuracy of the Genetic Programming algorithm is relatively insensitive to many of these variables and he typically uses the same set of values for most of his experiments (Koza, 1992:114). In the experiments described later, the author varies a small number of these parameters to evaluate the sensitivity of the chest x-ray orientation problem to these values.

2.1.6. Implementing The Genetic Programming Simulator

The implementation of the Genetic Programming algorithm by Koza was done using LISP, because of the relative ease with which individual programs, stored as s-expressions, could be manipulated. To perform the Genetic Programming experiments described in this paper, the author developed a reasonably flexible implementation of the algorithm using the C++ programming language.

Unlike LISP, C++ is a strongly typed language. To make the Genetic Programming environment as flexible as possible, the classes described below are in general implemented as template classes, parameterized by the type of data associated with the program inputs, function arguments, and function return values of the generated parse trees (there is one data type for all of these). For clarity in the descriptions which follow, the author will forego the C++ template notation in references to class names after the first (e.g., Node<T> and Node will be used to refer to the same class).

The most basic of classes in the design is the Node<T> class. Node is an abstract base class for all terminal and non-terminal nodes which can appear in a parse tree. Derived classes are implemented for each type of non-terminal node to be included in the experiment, and a single class, TerminalNode<T>, is used to represent the leaf nodes of the parse trees. Each derived class implements a small number of member functions which characterize the type of operation performed by this class of node. These functions include the following functions which are declared pure virtual in Node:

int args (void);
    Returns the number of arguments required as input to this node. TerminalNode implements this function to return zero. The integer add node described later returns the value 2, because it requires two inputs: the left and right addends.

const char *name (void);
    Returns a short descriptive name for the class of node, which is used when a parse tree is displayed to the user.

Node *gnu (void);
    Returns a new instance of the same exact class as the receiver of the gnu( ) call. This is used to generate new nodes of a particular class at runtime.
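The three pure virtual functions listed above suggest a class layout along the following lines. This is a sketch in present-day C++, not the author's Sun C++ 3.1 source: the const qualifiers, the child container, the destructor, the index member, and the display names "add" and "input" are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Abstract base class for every node that can appear in a parse tree.
template <class T>
class Node {
public:
    virtual ~Node() { for (Node *c : child) delete c; }
    virtual int args() const = 0;            // number of arguments this node requires
    virtual const char *name() const = 0;    // short name used when displaying a tree
    virtual Node *gnu() const = 0;           // new instance of the receiver's exact class
    std::vector<Node *> child;               // argument sub-trees (owned by this node)
};

// Leaf node: refers to one element of the program's input array.
template <class T>
class TerminalNode : public Node<T> {
public:
    explicit TerminalNode(int inputIndex = 0) : index(inputIndex) {}
    int args() const override { return 0; }
    const char *name() const override { return "input"; }
    Node<T> *gnu() const override { return new TerminalNode<T>(); }
    int index;
};

// Integer addition: a non-terminal node with two arguments.
class IntAdd : public Node<int> {
public:
    int args() const override { return 2; }
    const char *name() const override { return "add"; }
    Node<int> *gnu() const override { return new IntAdd; }
};
```

With this layout, a tree generator can hold one prototype object per node class and call gnu( ) on a randomly chosen prototype, without needing to know the concrete class at compile time.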

In the author's original design for the simulator, each derivative Node also implemented a virtual function, eval( ), which obtained the node's inputs from its child nodes in the parse tree and combined these inputs to produce the result for the node. For example:

    int IntAdd::eval (void)
    {
        // Evaluate both child sub-trees recursively and add the results.
        return ( child[0]->eval( ) + child[1]->eval( ) );
    }

The inputs to the program were stored in an array, and each instance of TerminalNode was assigned an index into the array. The implementation of eval( ) in TerminalNode simply returned the appropriate element of the input array. Because of the recursive calls to eval( ) in its implementation by non-terminal nodes, an entire parse tree could be evaluated by calling eval( ) against the root node of the tree.

While the author believes this to be good object-oriented design, the evaluation of a parse tree then requires a number of virtual function calls, equal to the number of nodes in the parse tree. Virtual function calls are relatively expensive when compared to the execution of a switch statement, upon which a parse tree interpreter might be implemented. In fact, through a separate experiment, the author estimated that a virtual function call of the form shown above requires about twice as much CPU time as the execution of a switch statement (in the environment in which the experiments were conducted).

In order to avoid introducing an unfair bias against the Genetic Programming technique, the author implemented a different approach to program evaluation. In this approach, a 4-byte code is associated with each class of non-terminal node and each unique input value. For non-terminal nodes, the high-order bit of the code is set to distinguish the code from that of a terminal node. For terminal nodes, this bit is not set and the code is equal to the node's index into the input array.

Through a process, which will be referred to as flattening, a parse tree is compiled into a one dimensional array of these byte codes. Flattening is accomplished by calling a virtual flatten( ) function against the root node of the tree. This function is passed a pointer to an array and a reference to an index variable which points to the next writable position in the array. Each node responds to flatten( ) by calling flatten( ) against each child, and then pushing the byte code which corresponds to the node itself, onto the end of the array:

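    void IntAdd::flatten (unsigned int *array, unsigned int &index)
    {
        // Emit the codes for both argument sub-trees first ...
        child[1]->flatten (array, index);
        child[0]->flatten (array, index);
        // ... then the code for this node (post-order).
        array[index++] = IntAdd_Byte_Code;
    }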

In order to determine the size of the array required to hold the flattened version of a particular parse tree, Node also implements a virtual function to return the number of bytes required to flatten the sub-tree rooted at a node. The default implementation of this function, in class Node, is sufficient for most derived classes. That implementation simply calls the function recursively against each child (summing the results) and then adds one, to account for the byte-code of the node itself. Thus, the flattening process requires two calls to virtual functions for each node in the parse tree.

Fortunately,

the

flattened

parsetreewill

be

evaluated a

large

number of

times, resulting

in

a net savings of

CPU

time

(in

each ofthe

Genetic

Programming

experiments

described later

in

this

paper,

theoriginal

implementation

ofthesimulatorwas

testedand was

found

to

be

at

least

twiceas slow asthisnew

version,

whose results are used

for

comparison withthe

otherclassification

techniques).
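The two virtual calls per node (one to size the array, one to fill it) might look like the following sketch. It is not the author's class hierarchy: the names Input, Add, flattenedSize and flattenTree are hypothetical, the flag value is assumed, and std::unique_ptr is used in place of whatever ownership scheme the original code employed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

const unsigned int kNonTerminalFlag = 0x80000000u;        // assumed flag value
const unsigned int kAddCode = kNonTerminalFlag | 1u;

struct Node {
    std::vector<std::unique_ptr<Node>> child;
    virtual ~Node() {}
    // Default sizing rule: one slot for this node plus the slots of every child.
    virtual std::size_t flattenedSize() const {
        std::size_t n = 1;
        for (const auto &c : child) n += c->flattenedSize();
        return n;
    }
    virtual void flatten(unsigned int *array, unsigned int &index) const = 0;
};

struct Input : Node {                                      // terminal node
    unsigned int slot;
    explicit Input(unsigned int s) : slot(s) {}
    void flatten(unsigned int *array, unsigned int &index) const override {
        array[index++] = slot;                             // code == input index
    }
};

struct Add : Node {                                        // non-terminal node
    Add(std::unique_ptr<Node> a, std::unique_ptr<Node> b) {
        child.push_back(std::move(a));
        child.push_back(std::move(b));
    }
    void flatten(unsigned int *array, unsigned int &index) const override {
        child[1]->flatten(array, index);                   // children first
        child[0]->flatten(array, index);
        array[index++] = kAddCode;                         // then this node
    }
};

// Size the array with the first virtual call, fill it with the second.
std::vector<unsigned int> flattenTree(const Node &root) {
    std::vector<unsigned int> codes(root.flattenedSize());
    unsigned int index = 0;
    root.flatten(codes.data(), index);
    assert(index == codes.size());
    return codes;
}
```

For the tree in0 + (in1 + in2), this sketch yields the five codes 2, 1, Add, 0, Add, which is the post-order layout the stack evaluator expects.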

To evaluate the flattened parse tree, a Stack<T> class is needed to hold intermediate results. The stack provides an inlined member function to push a value onto the stack, and another inlined function which pops the stack and returns the value just removed. Starting with an empty stack, the evaluator function iterates through the array of byte-codes. If the current byte-code is that of a terminal node, then the corresponding value from the input array is pushed onto the stack. Otherwise, an inlined eval( ) function is called against the class of Node indicated by the byte-code. This eval( ) function pops the required arguments from the stack, computes the result of the operation, and pushes the result onto the stack. After the last element of the array has been processed, the stack contains a single element which corresponds to the program result.

    int evaluateByteCodes (unsigned int *array, unsigned int num_codes)
    {
        static Stack<int> s;
        s.clear();
        for (unsigned int i = 0; i < num_codes; i++)
            switch (array[i])
            {
                case IntAdd_Byte_Code: IntAdd::eval (s); break;
                default: s.push (TerminalNode<int>::inputs[array[i]]);
            }
        return s.pop();
    }

    inline void IntAdd::eval (Stack<int> &s)
    {
        s.push ( s.pop() + s.pop() );
    }
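A self-contained version of this stack evaluator can be sketched as follows. It substitutes std::vector for the author's Stack<T> class, passes the input array as a parameter instead of using a static member, and adds a hypothetical second operator (kMulCode) to show how the switch grows; the flag value is an assumption.

```cpp
#include <cassert>
#include <vector>

const unsigned int kNonTerminalFlag = 0x80000000u;   // assumed flag value
const unsigned int kAddCode = kNonTerminalFlag | 1u;
const unsigned int kMulCode = kNonTerminalFlag | 2u; // hypothetical operator

int evaluate(const std::vector<unsigned int> &codes,
             const std::vector<int> &inputs) {
    std::vector<int> stack;                          // stands in for Stack<int>
    auto pop = [&stack]() -> int {
        assert(!stack.empty());
        int v = stack.back();
        stack.pop_back();
        return v;
    };
    for (unsigned int code : codes) {
        switch (code) {
        case kAddCode: { int a = pop(); int b = pop(); stack.push_back(a + b); break; }
        case kMulCode: { int a = pop(); int b = pop(); stack.push_back(a * b); break; }
        default:       stack.push_back(inputs.at(code)); break;  // terminal: code is an index
        }
    }
    assert(stack.size() == 1);                       // a well-formed program leaves one value
    return pop();
}
```

The sketch pops each operand into a named variable before combining them. For addition the order does not matter, but for a non-commutative operator an expression such as s.pop() - s.pop() would depend on an evaluation order that C++ leaves unspecified.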

This design has the disadvantage that a hard-coded switch statement must be formulated in the implementation of ... the design becomes even more obfuscated with the introduction of the "conditional if" operation as a type of non-terminal node. The if-operator accepts three arguments. If the first argument evaluates to a non-zero value, then the operation returns its second argument. Otherwise, the operation returns its third argument (thus the if-operator is equivalent to the C++ "? :" operator). This operation could be implemented as follows:

    inline void IntIf::eval (Stack<int> &s)
    {
        int arg1 = s.pop();
        int arg2 = s.pop();
        int arg3 = s.pop();
        s.push (arg1 ? arg2 : arg3);   // all three arguments were already evaluated
    }

But this design is horribly inefficient, since all three arguments are evaluated, even though only one of the second and third is actually needed (in the author's experiments, there is no chance of side-effects being introduced by the evaluation of a sub-tree, so the evaluation need not be done at all if the result is not used). In order to prevent the unnecessary evaluation of unused results, a way is needed to delay the evaluation of the latter two arguments to the if-operator until a decision is made as to which will be evaluated and which will be ignored.

The author's solution is similar to one which he later discovered was described by Keith and Martin (Keith, 1994:294). First, a new byte-code called a "skip" code is introduced, which is composed of an identifying flag and an array index value which represents the element to which the evaluator should proceed next. This new skip code is used in the implementation of flatten( ) for the class IntIf.

    void IntIf::flatten (unsigned int *array, unsigned int &index)
    {
        child[0]->flatten (array, index);     // encode 1st arg
        array[index++] = IntIf_Byte_Code;     // add self to byte-code array
        unsigned int save = index++;          // leave room for "goto"
        child[1]->flatten (array, index);     // encode 2nd arg
        array[save] = SkipFlag | (index+1);   // "goto" points past 2nd arg
        save = index++;                       // leave room for 2nd "goto"
        child[2]->flatten (array, index);     // encode 3rd arg
        array[save] = SkipFlag | index;       // "goto" points past 3rd arg
    }

The flattened version of an "if" operation requires two additional slots for the skip codes. This is the main reason that the function which determines the number of array elements required by a flattened parse tree must be a virtual function. Unlike other Node eval( ) functions, the implementation of eval( ) for IntIf does not push a result value onto the stack. Instead, it adjusts the index which is being used to walk through the array of byte-codes during program evaluation:

    inline void IntIf::eval (Stack<int> &s, int &i)
    {
        if ( s.pop() )
            i++;        // non-zero condition: step over the first skip code
    }

This function pops the stack to retrieve the first argument to the if-operator. If this argument is zero, then the eval( ) function does nothing. The next byte-code processed will be a skip code which causes the evaluator to jump past the byte-codes associated with the if-operator's second argument to the start of the byte-codes associated with the if-operator's third argument. Subsequently, when that section of the byte-code array has been processed, the value corresponding to the third argument will reside at the top of the stack, as if it had been returned by the if-operator.

However, if the first argument to the if-operator is non-zero, then the eval( ) function increments the index being used by the evaluator, so that it will miss the skip code. At this point, the second argument to the if-operator will be evaluated and pushed onto the stack. The byte-code which immediately follows the flattened second argument to the if-operator is another skip code, which causes the evaluator to skip past the third argument, which does not need to be evaluated. Thus, in either case, only one of the second and third arguments to the if-operator will be evaluated.
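The skip-code mechanism can be traced with a small self-contained evaluator. This is a sketch under stated assumptions: the bit chosen for kSkipFlag, the explicit while loop, and the processed counter are all illustrative, and the program array is laid out by hand in the order that IntIf::flatten( ) produces (condition, if-code, skip, second argument, skip, third argument).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const unsigned int kNonTerminalFlag = 0x80000000u;  // assumed flag value
const unsigned int kSkipFlag        = 0x40000000u;  // assumed: a separate bit marks skip codes
const unsigned int kAddCode = kNonTerminalFlag | 1u;
const unsigned int kIfCode  = kNonTerminalFlag | 2u;

// Evaluates a flattened program; *processed (optional) receives the number
// of byte-codes actually visited, to show that one branch is skipped.
int evaluate(const std::vector<unsigned int> &codes,
             const std::vector<int> &inputs,
             std::size_t *processed = nullptr) {
    std::vector<int> stack;
    auto pop = [&stack]() -> int {
        assert(!stack.empty());
        int v = stack.back();
        stack.pop_back();
        return v;
    };
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < codes.size()) {
        const unsigned int code = codes[i];
        ++count;
        if (code == kAddCode) {
            int a = pop(); int b = pop();
            stack.push_back(a + b);
            ++i;
        } else if (code == kIfCode) {
            // Non-zero condition: step over the first skip code.
            // Zero condition: fall onto it, which jumps to the third argument.
            i += (pop() != 0) ? 2 : 1;
        } else if (code & kSkipFlag) {
            i = code & ~kSkipFlag;                   // jump to the stored index
        } else {
            stack.push_back(inputs.at(code));        // terminal node
            ++i;
        }
    }
    if (processed) *processed = count;
    assert(stack.size() == 1);
    return pop();
}

// Hand-flattened form of: if (in0) in1 else (in2 + in3)
std::vector<unsigned int> exampleProgram() {
    return {
        0u,                // [0] first argument: input 0
        kIfCode,           // [1] the if-operator itself
        kSkipFlag | 5u,    // [2] skip to the start of the third argument
        1u,                // [3] second argument: input 1
        kSkipFlag | 8u,    // [4] skip past the third argument
        3u, 2u, kAddCode   // [5..7] third argument: input 2 + input 3
    };
}
```

With a non-zero condition only four of the eight codes are visited; with a zero condition six are visited and the second argument is never touched.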

In addition to the overrides of the pure virtual functions of class Node and the member functions related to program evaluation, each derived class which represents a non-terminal node implements a special constructor which is used only once, to register the class of non-terminal node in a static registry. The contents of this registry determine which Node derivatives will be created and used during the Genetic Programming experiment. A single static instance of the class is constructed using this constructor, to accomplish the registration. Here is an example for the class IntAdd:

    const unsigned int IntAdd_Byte_Code = NonTerminalFlag | 0x00000001;

    class IntAdd : public Node<int>
    {
    private:
        static IntAdd instance;                  // static instance used to register class
        IntAdd (NodeType t) : Node<int>(t) { }   // for registration

    public:
        static void eval (Stack<int> &s) { s.push ( s.pop() + s.pop() ); }
        int args (void) { return 2; }
        const char *name (void) { return "+"; }
        Node<int> *gnu (void) { return new IntAdd (instance); }
        unsigned int byte_code (void) { return IntAdd_Byte_Code; }
        void flatten (unsigned int *array, unsigned int &index)
        {
            child[1]->flatten (array, index);
            child[0]->flatten (array, index);
            array[index++] = IntAdd_Byte_Code;
        }
    };

    IntAdd IntAdd::instance (NONTERMINAL);

When the object module for IntAdd is linked into an executable, the static instance of the class is automatically registered at application startup. Registration involves storing a pointer to the instance in a static array of pointers maintained by the Node class. At run time, a class of non-terminal node is randomly selected by choosing an instance from this array. A new node of this class is produced by calling the gnu( ) function against that instance.
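The self-registration idiom described above can be sketched as follows. This is an illustration rather than the author's code: the registry is held in a function-local static (an assumption made here so that it exists before any static instance registers itself), RegisterTag replaces the NodeType argument, and the Mul class and the newNonTerminal helper are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

class Node {
public:
    enum RegisterTag { REGISTER };
    virtual ~Node() {}
    virtual const char *name() const = 0;
    virtual std::unique_ptr<Node> gnu() const = 0;   // new node of the same class

    // Function-local static: constructed on first use, so it is ready
    // even while other static objects are still being initialized.
    static std::vector<const Node *> &registry() {
        static std::vector<const Node *> r;
        return r;
    }
protected:
    Node() {}
    explicit Node(RegisterTag) { registry().push_back(this); }  // registration
};

class Add : public Node {
public:
    Add() {}
    const char *name() const override { return "+"; }
    std::unique_ptr<Node> gnu() const override { return std::unique_ptr<Node>(new Add()); }
private:
    explicit Add(RegisterTag t) : Node(t) {}         // used exactly once, below
    static Add instance;
};
Add Add::instance(Node::REGISTER);

class Mul : public Node {
public:
    Mul() {}
    const char *name() const override { return "*"; }
    std::unique_ptr<Node> gnu() const override { return std::unique_ptr<Node>(new Mul()); }
private:
    explicit Mul(RegisterTag t) : Node(t) {}
    static Mul instance;
};
Mul Mul::instance(Node::REGISTER);

// Produce a new non-terminal node from a (random) pick into the registry.
std::unique_ptr<Node> newNonTerminal(std::size_t pick) {
    const std::vector<const Node *> &r = Node::registry();
    assert(!r.empty());
    return r[pick % r.size()]->gnu();
}
```

Adding a new operator then requires no change to the code that builds trees: linking in another class with its own static instance is enough for it to appear in the registry.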

The registration of terminal nodes is handled a little differently, because there is only one C++ class to represent all of the terminal nodes in the experiment. At application startup, a global function named numberOfInputs( ) is called to determine the number of unique terminal nodes. This function must be provided by the experimenter. The value returned by this function is used to create the correct number of instances of TerminalNode (one per input). These instances are stored in another static array of Node pointers associated with the base class. At run time, when a new terminal node is needed, one of the instances maintained by the Node class is selected at random and cloned using the gnu( ) function implemented in TerminalNode.

Each instance of TerminalNode maintains an index into a static array of input values (the size of this array is also equal to the value returned by numberOfInputs( )). When a TerminalNode is created via gnu( ), the index value is copied so that the new node points to the same input value. During the run, the array of input values is loaded appropriately before a program is evaluated.

The Node base class maintains information about the node's place within its parse tree. This information includes:

- A pointer to the node's parent in the parse tree
- An array of pointers to the node's children in the parse tree
- The maximum depth of any sub-tree rooted at this node
- The number of non-terminal descendants of this node
- The number of terminal descendants of this node

This information is used during the construction and breeding of parse trees through a number of helper functions on the Node class (these functions describe the characteristics of the sub-tree rooted at the node and can return specific elements of that sub-tree given a depth-first numeric index into the sub-tree). The class which uses these member functions is the class whose instances represent the individual parse trees in the experiment, Program<T>.

Each Program stores a pointer to the root node of the parse tree it represents. The first time a Program is asked to evaluate itself, it creates the flattened version of the parse tree by calling the flatten( ) function against the root node of the tree. The Program then calls the global function, evaluateByteCodes( ), which was described earlier. The flattened version of the parse tree is not discarded until the Program is deleted. Thus, it can be evaluated multiple times.

In addition to storing a pointer to the parse tree and the byte-code array, Program maintains a number of other data members, including data members for:

- the number of times the program has been run (i.e., evaluated).
- the number of hits scored by the program (a "hit" is scored each time the program returns an exactly correct result and is useful if some non-zero amount of the fitness measure is awarded for answers which are "close but not exact").
- the probability that this program should be selected for breeding.
- the number of times the program has actually been selected for breeding.
- a description of how the program was produced (e.g., from cross-over), with data to identify the parent program(s) which were bred to produce this program.

Much of this data is simply gathered for measuring statistics of the experiment, although the fitness measure is used to compute the selection probability, and the selection probability is used during breeding to select programs at frequencies which are proportional to their fitness at solving the target problem.

The Program class provides three methods for creating a new program. During the creation of the first generation of programs within an experiment, a public constructor of Program is used to create new programs by either the "full" or "grow" methods described earlier. Subsequently, a copy constructor is provided for asexual reproduction and a breed( ) method is implemented for cross-over (mutation is also supported via a mutate( ) method, but the author has not tested this).

Both the "full" and "grow" methods of generating a tree from scratch are implemented by Program via recursive calls to its generateTree( ) member function. This function is passed an enumerated value to distinguish the method of generation and a pair of values which represent the minimum and maximum depths allowed for the generated tree. The generateTree( ) function randomly selects a target depth, d, for the new tree, such that d lies in the range between the specified minimum and maximum allowable. The run-time library function erand48( ) is used for this and all other situations in which random numbers are needed. Separate random number streams are used for each decision, by maintaining multiple seed arrays. The "full" method of tree generation is implemented as follows:

    if ( d is less than or equal to one )
    {
        return a new, randomly selected, terminal node
    }
    else
    {
        create a new, randomly selected, non-terminal node n
        for each argument required by n, generate a new sub-tree by calling
            generateTree( ) recursively, with minimum and maximum depth
            parameters both equal to (d - 1) (and using the "full" method)
        attach these arguments (i.e., sub-trees) to node n and return n.
    }

This algorithm results in a tree in which the depth of each branch extending from the root is exactly equal to the depth selected in the original call to generateTree( ).

Program implements the "grow" method of tree generation slightly differently than as described by Koza. In this implementation, there will always be exactly one branch which reaches the target maximum depth for the tree. The generateTree( ) method implements "grow" as follows:

    if ( d is less than or equal to one )
    {
        return a new, randomly selected, terminal node
    }
    else
    {
        create a new, randomly selected, non-terminal node n
        randomly select one of the arguments to n and create a sub-tree for
            it by calling generateTree( ) recursively, with minimum and maximum
            depths set to (d - 1) (and using the "grow" method)
        create the other arguments to n by calling generateTree( ) recursively
            with a minimum depth parameter of 1 and a maximum depth of (d - 1)
            (again, the "grow" method is used in these recursive calls).
        attach these arguments to node n and return n.
    }
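Both pseudocode listings can be combined into one runnable sketch. Several simplifications are assumptions made here for brevity: every non-terminal node takes a fixed number of arguments (arity), nodes carry no operator or input identity, and std::mt19937 is substituted for the erand48( ) streams used by the author. The helpers depth and shortestBranch exist only to check the result.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <random>
#include <vector>

enum Method { FULL, GROW };

struct Tree {
    bool terminal;
    std::vector<std::unique_ptr<Tree>> child;
};

// Length of the longest root-to-leaf path, counting nodes.
int depth(const Tree &t) {
    int d = 0;
    for (const auto &c : t.child) d = std::max(d, depth(*c));
    return d + 1;
}

// Length of the shortest root-to-leaf path, counting nodes.
int shortestBranch(const Tree &t) {
    if (t.child.empty()) return 1;
    int d = shortestBranch(*t.child[0]);
    for (const auto &c : t.child) d = std::min(d, shortestBranch(*c));
    return d + 1;
}

std::unique_ptr<Tree> generateTree(Method m, int minDepth, int maxDepth,
                                   std::mt19937 &rng, int arity = 2) {
    assert(minDepth >= 1 && minDepth <= maxDepth);
    // Pick the target depth d within the allowed range.
    int d = std::uniform_int_distribution<int>(minDepth, maxDepth)(rng);
    auto node = std::make_unique<Tree>();
    if (d <= 1) {
        node->terminal = true;                 // randomly selected terminal
        return node;
    }
    node->terminal = false;                    // randomly selected non-terminal
    // "grow": exactly one randomly chosen argument is forced to depth d - 1.
    int forced = std::uniform_int_distribution<int>(0, arity - 1)(rng);
    for (int i = 0; i < arity; ++i) {
        bool exact = (m == FULL) || (i == forced);
        int childMin = exact ? d - 1 : 1;      // "full": min == max == d - 1
        node->child.push_back(generateTree(m, childMin, d - 1, rng, arity));
    }
    return node;
}
```

A call such as generateTree(FULL, 4, 4, rng) produces a tree in which every branch has depth 4, while generateTree(GROW, 2, 5, rng) produces a tree whose deepest branch matches the depth drawn at the root.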

This algorithm also differs from that described by Koza in that the depths of the various branches of the trees are independent of the relative difference in the number of classes of non-terminal nodes and the number of different inputs (i.e., terminal nodes). Thus the generated trees will not be artificially flattened when the number of inputs is ...

The "full" and "grow" methods are only used to generate the initial, random population of programs evaluated as part of the first generation of the Genetic Programming experiments. Later generations of programs are created by breeding programs from the previous generation. The two methods of breeding employed in the experiments by the author are implemented via the copy constructor of the Program class and Program's breed( ) method.

The copy constructor is used for asexual reproduction of a Program during breeding. The counts for number of runs, hits, and cumulative fitness are preserved in the new Program, but other data members, like the probability of selection for breeding and the number of times the program was selected, are not copied from the existing Program, because these data members will depend on the results found in evaluating the other programs in the new generation.

As an optimization, the fitness of a program produced via asexual reproduction is not recomputed for the new generation of programs. Such evaluation would be a waste of time, since it is assumed that the fitness of a particular program is independent of the generation in which it appears. Obviously, this same optimization cannot be applied to programs produced by cross-over, since the resulting programs will almost certainly be different from their parent programs.

Cross-over is implemented in the breed( ) method of class Program. The method is passed a pointer to another program (which serves as the second parent in the cross-over) and it returns the two programs which result from the cross-over. Cross-over is implemented as follows:

1. Use the copy constructor of Program to produce two new programs which are identical in structure to the two parent programs involved in the cross-over.
2. Randomly select a node from each of the two parse trees as the cross-over points.
3. Swap the sub-trees rooted at these two nodes by extracting each from its current parse tree and inserting it in the other parse tree.

Even if both of the parent programs were in fact the same instance of class Program, the above algorithm works, because the children are first produced by cloning, before the cross-over occurs. One situation which does require special handling is when one or both of the cross-over points are the root nodes of the child programs. In this case, an entire program is being replaced and the pointer stored to the root node in the Program object must be updated.
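The clone-then-swap procedure, including the root-node special case, can be sketched as follows. The Node layout, the integer labels, and the helper names (leaf, branch, clone, ownerOf, swapSubtrees) are hypothetical; the sketch uses std::unique_ptr ownership, so "updating the root pointer" amounts to swapping the smart pointer that owns the root.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Node {
    int label;
    Node *parent;                                   // nullptr for a root node
    std::vector<std::unique_ptr<Node>> child;
    explicit Node(int l) : label(l), parent(nullptr) {}
};

std::unique_ptr<Node> leaf(int label) {
    return std::unique_ptr<Node>(new Node(label));
}

std::unique_ptr<Node> branch(int label, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    std::unique_ptr<Node> n(new Node(label));
    l->parent = n.get();
    r->parent = n.get();
    n->child.push_back(std::move(l));
    n->child.push_back(std::move(r));
    return n;
}

// Step 1: children start as structural copies of the parents.
std::unique_ptr<Node> clone(const Node &n) {
    std::unique_ptr<Node> copy(new Node(n.label));
    for (const auto &c : n.child) {
        std::unique_ptr<Node> cc = clone(*c);
        cc->parent = copy.get();
        copy->child.push_back(std::move(cc));
    }
    return copy;
}

// The pointer that owns n: a slot in its parent, or the program's root pointer.
std::unique_ptr<Node> &ownerOf(std::unique_ptr<Node> &root, Node *n) {
    if (n->parent == nullptr) {
        assert(root.get() == n);
        return root;                                // special case: n is the root
    }
    for (auto &c : n->parent->child)
        if (c.get() == n) return c;
    assert(false && "node is not a child of its parent");
    return root;
}

// Step 3: exchange the sub-trees rooted at a (in tree A) and b (in tree B).
void swapSubtrees(std::unique_ptr<Node> &rootA, Node *a,
                  std::unique_ptr<Node> &rootB, Node *b) {
    std::swap(ownerOf(rootA, a), ownerOf(rootB, b));
    std::swap(a->parent, b->parent);
}
```

Because the swap is applied to clones, the parent trees are left untouched, which is why breeding a program with itself remains safe.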


The selection of a cross-over point is a two-step process. First, a decision is made as to whether the cross-over point should be a terminal or non-terminal node. The probability of selecting a terminal node in this step is a tunable parameter. Next, the root Node is asked for the number of nodes of the selected type which exist in the tree rooted at that node. An index to one of these nodes is randomly generated and the root Node is asked to return a pointer to the node which corresponds to this index, given a depth-first ordering of terminal or non-terminal nodes. The node which is returned becomes the cross-over point.
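The two-step selection can be sketched as below. The names countNodes, nthNode and pickCrossoverPoint are hypothetical, std::mt19937 stands in for erand48( ), and the fallback used when the tree contains no node of the chosen type (for example, a tree that is a single terminal) is an assumption that the text does not address.

```cpp
#include <cassert>
#include <memory>
#include <random>
#include <utility>
#include <vector>

struct Node {
    int id;
    std::vector<std::unique_ptr<Node>> child;
    bool isTerminal() const { return child.empty(); }
};

// Number of nodes of the requested type in the sub-tree rooted at n.
int countNodes(const Node &n, bool terminal) {
    int total = (n.isTerminal() == terminal) ? 1 : 0;
    for (const auto &c : n.child) total += countNodes(*c, terminal);
    return total;
}

// k-th node of the requested type in depth-first (pre-order) order.
// k is decremented as matching nodes are passed.
const Node *nthNode(const Node &n, bool terminal, int &k) {
    if (n.isTerminal() == terminal && k-- == 0) return &n;
    for (const auto &c : n.child)
        if (const Node *found = nthNode(*c, terminal, k)) return found;
    return nullptr;
}

const Node *pickCrossoverPoint(const Node &root, double pTerminal, std::mt19937 &rng) {
    // Step 1: choose the type of node, with tunable probability pTerminal.
    bool terminal = std::uniform_real_distribution<double>(0.0, 1.0)(rng) < pTerminal;
    // Assumed fallback: if no node of that type exists, use the other type.
    if (countNodes(root, terminal) == 0) terminal = !terminal;
    // Step 2: choose an index among the nodes of that type.
    int k = std::uniform_int_distribution<int>(0, countNodes(root, terminal) - 1)(rng);
    return nthNode(root, terminal, k);
}
```

Choosing the type first means the chance of picking a terminal does not depend on how many terminals the tree happens to contain.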

The two Program constructors and the breed( ) member function each implement a single step in the process of generating a population of test programs. These steps must be repeated an appropriate number of times to produce the complete population. The responsibility for generating and evaluating a population of programs falls to the Generation<T> class. A Generation maintains a collection of Program objects. The class provides two constructors for creating new generations, a method to evaluate all of the programs in the generation, and a method to save information about the generation to a file. This information includes the programs themselves, as well as some performance statistics, which will be described shortly.

The class Generation provides two constructors. The first constructor is the void constructor, which is used to create the first generation of programs for an experiment. This constructor implements the "ramped half-and-half" method of program generation described earlier, using the "full" and "grow" methods of parse-tree generation implemented by Program.

The second Generation constructor is passed a pointer to an existing generation. This is not a copy constructor. Instead, a new generation is produced by breeding the programs from the specified generation. The programs in the existing generation must already have been evaluated and sorted according to their relative fitness. In addition, the probability of selection for breeding must already have been computed for each program and a cumulative value for this probability stored in each of the programs. Breeding proceeds as follows:

1. A method of breeding is selected at random. The frequency with which each of the methods is selected is a tunable parameter of the experiment.
2. If the selected method is cross-over, then two programs from the previous generation are selected ... selected for breeding is proportional to its relative fitness within the previous generation of programs. Selection is done "with replacement".
3. The new program is produced (two programs in the case of cross-over) by calling the appropriate method of the Program class.
4. Steps 1 through 3 are repeated until a complete population of programs is generated.

Currently, only two methods of breeding are supported by the class Generation: cross-over and asexual reproduction. A method is selected by comparing a random number in the range 0.0 to 1.0 with the target probability for cross-over. If this random number is less than the probability of cross-over, then the method selected is cross-over.

The algorithm to select programs with appropriate selection frequencies is implemented as follows: a random number is generated in the range 0.0 to 1.0. The sorted collection of programs is searched in order of decreasing fitness, until a program is found whose cumulative probability of selection exceeds the random number. This program is selected. Greedy over-selection is implemented during the assignment of selection probabilities, so the division of the programs into two groups, as described for that technique, is implicit in the selection algorithm described here.
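The selection scan over cumulative probabilities can be sketched as below. The Candidate structure and the function names are hypothetical, the probabilities are plain fitness-proportionate values (greedy over-selection is deliberately not modelled), and the random number r is passed in by the caller so that the behaviour is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Candidate {
    double fitness;      // cumulative fitness over all tests (larger is better here)
    double cumulative;   // running sum of selection probabilities
};

// Assign fitness-proportionate probabilities to programs that are already
// sorted by decreasing fitness, storing the running total in each one.
void assignCumulative(std::vector<Candidate> &sorted) {
    double total = 0.0;
    for (const Candidate &c : sorted) total += c.fitness;
    assert(total > 0.0);
    double running = 0.0;
    for (Candidate &c : sorted) {
        running += c.fitness / total;
        c.cumulative = running;
    }
    sorted.back().cumulative = 1.0;   // guard against rounding error
}

// Return the index of the first program whose cumulative probability
// exceeds r, where r is a random number in the range 0.0 to 1.0.
std::size_t selectProgram(const std::vector<Candidate> &sorted, double r) {
    for (std::size_t i = 0; i < sorted.size(); ++i)
        if (sorted[i].cumulative > r) return i;
    return sorted.size() - 1;
}

// Choose the breeding method: cross-over if r is below its target probability.
bool useCrossover(double r, double pCrossover) {
    return r < pCrossover;
}
```

Since the same program can be returned for two different draws, the scan naturally implements selection "with replacement".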

Once the entire population of programs has been created, the Generation can be evaluated by calling its eval( ) method. Although the process of evaluating the fitness of each program in the generation is implemented in a generic way, it depends on calls to a number of functions which are external to the class Generation and are provided by the experimenter to tailor the algorithm to the problem being studied. These functions include:

int numberOfInputs (void)

This function, which was mentioned earlier, defines the number of different terminal nodes to be registered at application startup.

int numberOfTests (void)

The evaluation of a program is broken down into one or more "tests". At the start of a test, the experimenter is asked to load the array of terminal node values with inputs appropriate to the next test. Then, the experimenter is asked to evaluate each program in the population, one at a time, and assign a fitness value to the program for this test. The function numberOfTests( ) is called to determine the number of different tests which will be performed. This number must be independent of the generation being evaluated, because programs produced by asexual reproduction are not reevaluated. In the experiments described in this paper, a "test" consists of evaluating each program against the feature set extracted from a single sample x-ray image. The number of tests is therefore equal to the number of training samples.

The author's Genetic Programming framework actually evaluates the entire population of programs against only a subset of the tests indicated by numberOfTests( ). Then, after a selection probability has been assigned to each program, the remaining tests are used to evaluate the "best of generation" program. Since these latter tests never affect the selection of programs for breeding, they serve as a good measure of the success of the algorithm at finding a solution which is general enough to work well within the problem space represented by the test samples.

void loadInputs (void)

This function is called at the beginning of each test, to prepare the array of terminal node values for the start of the next test. In the experiments by the author, the set of inputs was exactly equal to the feature values extracted from a single sample x-ray (in one of its orientations), plus some constants. The implementation of loadInputs( ) would populate the input array with values for a single training sample. The number of calls to loadInputs( ) for each generation is equal to the number returned by numberOfTests( ). The version of loadInputs( ) implemented by the author automatically restarts at the beginning of the collection of training samples, after the features for the last training sample have been loaded. The function correctly presumes that this restart occurs only when a new Generation begins evaluating its programs.

void evalProgram (Program<T> *program, double &fitness, int &hits, ...)

The Generation class calls evalProgram( ) to obtain a fitness value for a specified program during the current test. The fitness values of a single program are summed over all tests to produce the program's cumulative fitness, which is a measure of the program's success at solving the target problem. In its simplest form, evaluating a program during a test involves calling the eval( ) function of the Program and using the return value to assign a fitness value to the program for this test. Koza gives examples where the evaluation of a program for a single test may involve multiple calls to its eval( ) method, where each evaluation is used to adjust the input array before the next call (Koza, 1992:147). In these cases, the fitness measure is often inversely proportional to the number of times the program needs to be evaluated to satisfy some termination condition. This kind of test can be implemented using this author's framework, provided that the evalProgram( ) function restores the input array to its original state before returning.

double adjustFitness (double fitness)

This function is called once per program after it has been evaluated against all tests. The fitness value passed to this function is the cumulative fitness of the program determined over the course of the tests. The experimenter can implement ...

Because programs produced by asexual reproduction are not reevaluated, and their cumulative fitness is not readjusted, the adjustment performed by this function must be independent of both the characteristics of the current generation and the programs it contains. This restriction severely limits what can be done by this function, and in practice, the author found no need to modify the cumulative fitness values.

double perfectScore (void)

This function returns the adjusted fitness value that would be achieved by a program which performed perfectly during all of the tests. One of the stopping criteria for the experiment is finding a program whose adjusted fitness equals perfectScore( ).

int evaluateByteCodes (unsigned int *array, unsigned int num_codes)

This is the interpreter function described earlier. The actual return type depends on the type of data associated with the program inputs and return values. The first argument is an array containing a flattened parse tree.

The

second argument

i