• No results found

Measuring static complexity

N/A
N/A
Protected

Academic year: 2020

Share "Measuring static complexity"

Copied!
14
0
0

Loading.... (view fulltext now)

Full text

(1)

lnternat. J. Math. & Math. Sci. VOL. [5 NO. (1992) 161-174

MEASURING STATIC COMPLEXITY

BEN GOERTZEL

Departmentof Mathematics

UniversityofNevada,Las Vegas Las

Vegas

NV89154

(Received August 17, 1990 and in revised form March 1, 1991}

ABSTRACT:Theconceptof

"pattern"

isintroduced,formallydefined, and usedto analyze

variousmeasures ofthe complexityof finite binary

sequences

andother objects. The standardKolmogoroff.Chaitin.Solomonoff complexitymeasureisconsidered,alongwith Bennett’s’logicaldepth; Koppel’s"sophistication’,and Chaitin’sanalysisof thecomplexity

ofgeometric objects. Thepattern.theoretic pointof view illuminatesthe shortcomingsof these measures and leads to specific improvements, it giles rise to two novel

mathematical

concepts

"orders" ofcomplexity and "levels" of

pattern,

andityieldsa new

measureof complexity, the "structural complexity’, whichmeasuresthe totalamountof structure anentity

possesses.

KEY WORDS AND PHRASES. Kolmogorov complexity,

algorithritic

Information,

pattern,

sophistication,

structure,

depth

AMS SUBJECT

CLASSIFICATION

CODE. 68Q30

1.

INTRODUCTION

Different contexts require different

concepts

of "complexity’. In the theory of computational complexity, as outlined forinstance byKronsjo [1], one deals with the complexity of problems. And the complexity of evolving

systems

falls under the aegis of dynamical

systems

theory, asconsideredforexample by Bowen

[2].

The

present

paper,

however,isconcerned with thecomplexityof staticobjects,asubjectwhichhasreceiled rather little attention. Althoughmostof the discussionfocusesonbinary

sequences,

the implicationsaremuch more general.
(2)

162 B. GOERTZEL

DEFINITION1. LetMbeauniversalTuringmachine. Let us saythata

program

for

M

isserf.delimiting ifitcontainsa

segment

tellingMitstotallengthin bits. Then,the

KCS

complexity of a finite binary

sequence

x is the length of the shortest

program

which

computes

x on M.

In the decades since its conception, this definition has led to a number of Interestingdevelopments. Chaitin[4]has useditto provideanInteresting new proofof

Godel’s theorem; and Bennett [6], Zurek[7]and othershaveapplied it to problems in

thermodynamics. However, it has Increasingly beenrealized that the concept of

KCS

complexityfallstocapturethe Intuitive meaning of"complexity."

The problem is that, according to the KCS definition, "random", structureless

sequences

arejudgedthemost complex. The leastcomplex

sequences

arethose like

O(X)IX)O000...O00, 010101010101...010101, and 1010010001000010(XXX)...O, which canbe computed byveryshort

programs.

And themostcomplex

sequences

x arethose which cannotbecomputed by

any

programshorterthan"printX’. There isasensein which this isnot adesirable

property

foradefinition ofcomplexitytohave

In

whichahumanor a

ree

orthe

sequence

ofprime numbersismore"complex"thanarandom

sequence.

Over the

past

decade, there have been two noteworthyattempts to remedythis deficiency: Bennett’s [6]"logical depth", andKoppel’s[8]"sophistication."

Weoutline a generalmathematical frameworkwithin which various measuresof

complexity

may

beformulated,analyzedandcompared. Thisapproachyields significant

modifications of these

measures,

as well as several novel, general

concepts

for the analysisofcomplexity. Furthermore,itgilesrisetoanentirelynewcomplexity

measure,

the "structural complexity’, which measures the total amount of structure an entity

possesses.

Intuitilely,thistellsone"how much thereisto say"abouta gilen object.

2. PATTERN

DEFINITION 2.

A pattern space

is a set

(S,*,l

I),

where

S

is a set, * is a binary

operationdefinedon some subsetof

SxS,

and isa

map

fromSintothe nonnegatile real numbers.

Let usconsiderasimple example: Turingmachinesandfinitebinary

sequences.

DEFINITION3. Let

y

bea

program

foraunilersalTuring machine;let beafinite

binary

sequence.

Definey*z to bethebinary

sequence

which

appears

on the

memory

tape

of the Turingmachineafter, having been startedwith onitsinput

tape

beginning directly under the

tape

headand extendingtothe right,

program y

finishesrunning. If

y

never

Mops

running, then lety*zbe undefined.

Let

zl

denote the length of as

Its

length,and let

yl

denote thelength of the

program

y.

Nowwe are ready togileageneraldefinItionof

pattern.

DEFINITION

4. Let

a,

b, andc denote

constant,

nonnegative numbers. Thenan

ordered pair(y,z)isapafternIn xifx=y*zand

alYl

+

blzl

+

cC(y,z) <

Ixl,

whereC(y,z) denotesthe complexity of obtainingxfrom(y,z).

DEFINITION6. ff

y

isa Turingmachine

program

and isafinItebinary

sequence,

(3)

MEASURING STATIC COMPLEXITY 163 equippedwith

program

yand givenzas initial input.

For many

purposes,

the numbersa,b andc arenot

Important.

Often theycanall be takento equal1,sothattheydonot

appear

in the formulaatall. Butinsomecasesit

may

beusefulto, forinstance,seta=b= 1andc=O. Then the formula reads

Yl

+

Izl

<

Ixl.

Theconstants couldbedispensedwith,butthen itwould be

necessary

toredefine and Cmore often.

Intuitively, an orderedpair (y,z) is a patternin xifthe complexity of

y,

plus the

complexityofz,plusthecomplexityofgettingxoutof

y

and

z,

isless than thecomplexity

ofx. Inotherwords, anorderedpair (y,z)isapatterninxifit issimpler to

represent

xin

termsof

y

andzthanitistosay "x’. Theconstants

a,

bandc

are,

ofcourse,weights: If a=3/4 and b=5/4, for example, then the complexity of a is counted less than the

complexity ofb.

The definition ofpattern canbe generalizedto ordered n.tuples, and to takeinto

accountthepossibilityofdifferentkindsof combination,say *,and

*:

DEFINITION7:Anorderedsetofnentities(x,,x2

,...,x.

)isa patterninxif

x=x,*,

x:*:

...*.

x.

and

a, lx,

+a:lx:l +...+a.lx.I

+

a./,C(x,

,...,x.

) <

Ixl,

whereC(x,

,...,x.

)

isthecomplexityofcomputing

x,*x,*...*x,

and

a, ,...,a.,

arenonnegative numbers. Also, the followingconceptwill be ofuse:

DEFINITION 8: The intensityinxofa ordered pair(y,z) such thaty*z=x

may

be

definedas IN[(y,z)lx] (

Ixl

-/alyl /

bizl

/cC(y,z)])/Ixl

Obviously,thisquantityispositivewhenever(y,z)isapatternin

x,

andnegativeor zerowhenever itisnot;anditsmaximum value is 1.

3.ANEXAMPLE: GEOMETRIC PATTERN

Mostof the present

paper

is devotedto Turingmachinesandbinary

sequences.

However,

the definition of

pattern

doesnotInvolvethetheoryofcomputation; essentially, a

pattern

is a "representationas something simple’. Instead of Turlng machines and

binary

sequences

letus now considerpictures.

Suppose

that

A

is a one inch

square

picture,and

B

isafive inch

square

picture made

up

of

twenty.five

non.overlappingone. inchpicturesidenticaltoA. Intuitively, it issimpler to

represent

B

as an

arrangement

of copiesofA, thanit istosimplyconsiderBasa"thingin itself’.

Very

roughly speaking,it wouldseemlikelythatpartofthe

process

ofrememberingwhatBlooks like consists of

representingBas anarrangementof copies ofA.

This Intuitionmaybeexpressedintermsof the definition of

pattern.

Wherexand

y

are

square

regions, let:

y *, z

denote theregionobtainedbyplacing

y

tothe right of

z

Y

"2

z

denote theregionobtainedbyplacing

y

to theleft ofz

y

*

zdenote theregionobtainedby placing

y

belowz

y

*, zdenote theregionobtainedbyplacing

y

above

z

And,althoughthisis obviously averycrudemeasure,letusdefinethecomplexity

Ix

of asquareregionwithablack.and.whitepicturedrawn initastheproportionof theregion
(4)

164 B. GOERTZEL

The operations *,,

"2

*

and *,

may

be called simple operations. Compound

operationsare,then, compositionsofsimple operations,suchastheoperation(x*, w*,x)*, w. Ifyisa compoundoperation, letusdefine itscomplexity

Yl

tobe thelengthof the shortest

program

which

computes

the actualstatementofthecompound operation. For

Instance, }(x*

w* x)*, w is defined to be the lengthofthe shortest

program

which outputsthesequenceofsymbols"(x* w* x)*,w’.

Whereyisa simple operationandzisa

square

region,lety*z denote theregion

that results fromapplying

y

to z. A compound operation actson a numberof

square

regions. Forinstance,(x*,

w*,

x)*, wactsonwandxboth. We mayconsideritto acton the orderedpair(x,w). Ingeneral, wemayconsidera compound operation

y

to actonan

orderedsetof

square

regions (x,

,x,

xn

), where

x,

isthe letterthatoccursfirst in the statementofy,x2isthe letter thatoccurssecond, etc. Andwe

may

definey*(x,

xn

)to

be theregionthat results fromapplyingthecompound operation

y

totheorderedsetof

regions(x,

x

).

Letusreturn tothetwopictures,AandB,discussed above. Letq=A*,A*,A*, A*, A.

Then,

itis

easy

to seethatB=q*,q*,q*,q*,q. In otherwords, B

(A*, A *, A*, A*, A)*,(A*,

A*,

A*, A*, A)*,(A*, A*, A*, A*, A)*, (A*, A*, A*, A*, A)*,(A*, A*,

A*,

A*,A). Where y is the compound operation given in the previous sentence, we have B=y*A. The complexityofthatcompound operation,

lYl,

is certainly

very

close to the

lengthof the

program

"Let

q=A

*, A*, A*, A*, A;print

q*,q*,q*,q*’.

Notethat this

program

isshorter than the

program

"Print(A*,

A*, A*,

A*, A)*,(A*, A*, A*,

A*,

A)*,(A*,

A*, A*,

A*,

A)*(A*, A*, A*, A*, )*, (A*, A*, A*, A*, A)", soitis clear that the latter shouldnotbe used in thecomputationof

Yl-We havenotyetdiscussedthe termC(y,(B,

,B.

)), which

represents

theamount

ofeffort requiredto

execute

the compound operation

y

onthe regions (x,

,...,x.

). For

simplicity’s sake, we shall simply set it equal to the number of times the symbol "*" appearsin thestatementofy;that is, tothe number ofsimple operations involvediny.

So,is(y,A) apatterninB? Letusassumethat theconstantsa,b andc areallequal to 1. Weknowy*A=B;thequestioniswhether

lYI

/

IAI

/C(y,A) <

IBI.

According tothe above definitions,

Yl

is 37 symbols long. Obviouslythis is a matterof the particularnotationbeing used. Forinstance,it would be less ifonly one characterwereusedtodenote*,,and it would bemoreif itwerewritteninbinarycode.

C(y,z)is eveneasierto

compute:

there are24 simple operations involved in the constructionofBfromA.

Sowe have,

very

roughly speaking,37

+

Izl

+

24 <

Ixl.

Thisistheinequality that mustbesatisfiedff(y,z)istobe consideredapatterninx. Rearranging, wefind:

Izl

<

Ixl

61. Recall that we defined thecomplexity ofa regionas theproportion of

blackwhich it contains. Thismeansthat(y,z)isapatterninxifandonlyifittheamount

ofblack required to drawB exceedsamountofblackrequired todrawA by atleast 62.

(5)

MEASURING S’rATIC COMPIEXI’IY 1)5

region. Ingeneral, wemaydefine

I(x,

,x, )1

{x,I

+...+

Ixnl,

assuming that theamount

of blackinaunionofdisjointregionsisthesumof theamountsof blackintheindividual

regions. From this it follows that (y,(x,

xn

)) is a

pattern

in x if and only ifa

lYl

+

b(Ix, l/.../lx.I)

/ cC(y,(x,,...,x.)) <

Ixl.

Resultssimilartothese could also beobtainedfromadifferentsortofanalysis. In ordertodealwithregions other thansquares,it isdesirableto replace *,, *,, %,

**

with

a single "joining"operation*, namelythethe set.theoreticunion U. Letz=(x,,

xn

), let ybeaTuringmachine, let f beamethodforconvertingapictureintoabinary

sequence,

and letgbe amethodforconvertingabinary

sequence

intoa picture. Thenwehave DEFINITION 9: If x

x,

U

x

U U

xn

then (y,z,f,g) is a

pattern

in x if

alYl +blzl +clfl

+dlgl +eC(y,z,f,g) <

Ixl.

Wehavenotsaidhow

Ill

and gl aretobe defined.

However,

thiswould require adetailedconsiderationof thegeometricspacecontainingx,and thatwouldtakeustoo

farafield. This general approachissomewhatsimilarto that taken in Chaitin[9].

4. ORDERS OF COMPLEXITY

It should be apparentfrom the foregoing thatcomplexityandpattern are deeply

interrelated. In this and the following sections, we shall explore several different

approaches to measuring complexity, all ofwhichseekto

go

beyondthesimplisticKCS approach. According to theKCS approach, complexitymeans structurelessness. The

most "random;least structured

sequences

arethemostcomplex. The formulation of this

approach was a great step forward. But it seems clear that the next

step

is to give

formulaswhichcapture moreof the intuitivemeaningof theword"complexity".

First,weshallconsiderthe idea thatpatternItselfmaybeusedtodefinecomplexity.

Recall thegeometricexampleof the previous section, inwhichthecomplexityofablack. and.whitepictureina

square

region wasdefinedastheamount ofblackrequired todraw it. Thismeasuredidnoteven

presume

to

gauge

theeffort requiredto

represent

ablack.

and.whitepictureina

square

region. One

way

tomeasurethe effortrequiredto

represent

such apicture, callit

x,

isto lookatallcompound operations

y,

and allsets of

square

black.and.whitepictures(x,

,...,x,

), such thaty*(x,

,...,x.

)=x. One

may

thenask which

y

and(x, x,)givethesmalleMvalue of

alYl

+ b(Ix, +

+ Ix.l)

/ cc(y,(x,,...,x.)). Th#s minimal value of

alYl

+

b(Ix, l+...+lx,

I)

may

be defined to be the "second.order" complexityofx. The second.order complexityisthen bea measureof howsimply

x

can berepresented intermsofcompound operationson

square

regions.

Ingeneral,givenanycomplexitymeasure

I,

we may usethissortofreasoningto definea complexitymeasure

I’.

DEFINITION 10: If is a complexity measure,

I’

isthe complexity measure

definedsothat,for all

x,

Ixl’

is the smallest value that thequantity

alYl

+

blzl

+

cC(y,z) takeson, forany(y,z) such thaty*z=x.

xl’

measureshowcomplexthesimplest representation ofxis, wherecomplexity

ismeasuredby

I.

Sometimes, asinourgeometricexample, and

I’

willmeasure
(6)

166 B. GOERTZEI

Extendingthisprocess,onecanderivefrom

I’

ameasure

I":

thesmallest value thatthequantity

alYl’

+

blzl’

+

cC(y,z) (4.1)

takes on, forany (y,z) such thaty*z=x.

Ixl"

measures thecomplexity of the simplest

representation ofx,wherecomplexityismeasuredby

I’.

And from

I",

one

may

obtain

ameasure

I’".

Itis clear thatthisprocess maybecontinued indefinitely.

Itisinteresting toask when and

I’

are equivalent, oralmost equivalent. For Instance, assumethatyisaTuring machine, andxand are binarysequences. If,inthe

notationgiven above, welet

I,,

then

Ix

l’

isanaturalmeasureof the complexityof

asequencex. Infact, if a=b= 1andc=O,/tis exactlytheKCScomp/ex/tyofx. Without

specifying

a,

b andc,letusnonethelessuseChaitin’s[4]notationfor thiscomplexity:I(x). Also, letus adoptChaitin’s notation

I(vl

w) for the complexityofvrelativetow. DEFINITION11. LetybeaTuringmachineprogram, vandwbinarysequences;then

I(vlw)

denotes the smallest value the quantity

alYl,+cC,

(y,w) takes on for any self.

delimitingprogram ythatcomputes vwhen itsinputconsistsofa minimal.lengthprogram forcomputing w.

Intuitively,thismeasureshow hardit istocompute vgivencomplete knowledgeofw. Finally,itshould be noted that and

I’

arenot always substantiallydifferent:

THEOREM 1. If

Ixl’=l(x),

a=b=l,andc=O,thenthere issomeKsothat for all x,

lxl’-

Ixl"l

< K.

PROOF:

alYl’

+

blzl’

+

cC(y,z)

lYI’

+

Izl:

So,whatisthesmallest value that lyl’

+

Izl’

assumesfor

any

(y,z)such thaty*z=x? Clearly, thissmallest valuemustbe

eitherequal to

xl’.

For, what ff

YI’

+

zl’

is biggerthan

xl’?

Thenitcannotbe the smallest yl’

+ zl;

because ffone tookzto be the "empty sequence" (the

sequence

consistingofno characters)and then tookytobe theshortest

program

forcomputing

x,

onewould have

Izl’=o

and

lYl

’=

Ixl:

And, onthe otherhand,is itpossiblefor

lYI’+ Izl’

tobesmaller than

Ix

’? If

Y

I’+

z

I’

weresmaller than

x,

thenorecouldsupply a Turing

machinewitha

program

saying"Plugthe

sequence

zintotheprogram y,"andthelength

of this

program

wouldbe greaterthan

Ix

l’

byno more than the lengthof the

program

P(y,z)=’Plug the

sequence

z intothe

program

y". This lengthis the constantKin the theorem.

COROLLARY1. For a Turingmachineforwhichthe

program

P(y,z)mentioned inthe

proofisa"hardware function" which takesonly oneunitoflength toprogram,

’’=

I’-PROOF:Both

I’

and

I"

are

Integer

valued,andbythetheorem, forany x,

Ixl’-<

Ixl"<- Ixl’+l.

5. PATTERNS IN

PATTERNS; SUBSTITUTION

MACHINES

Wehavediscussedpatternin

sequences,

andpatternsinpictures. Itisalsoquite

possible toanalyzepatternsinotherpatterns. Thisisinterestingfor

many reasons,

one being thatwhendealingwith machinesmorerestricted than Turing machines, it

may

often be thecasethattheonly

way

toexpressan intuitively simple phenomenonisas a

pattern

(7)

MEASURING STATIC COMPLEXITY 167

Letus considera simple example. Suppose thatwe are dealing notwith Turing

machines, but rather with "substitution machines" machines which are capable of

running onlyprogramsofthe formP(A,B,C)=’Wherever

sequence

B occursin

sequence

C, replaceitwith sequence A’. Instead ofwriting P(A,B,C) each time, we shalldenote

such a

program

with the symbol (A,B,C). For instance,

(1,10001,1000110001100011000110001)

=

11111. (A,B,C)should be read "substituteAfor Bin C’.

We maydefinethecomplexity

xl

ofa

sequence

x asthelength of the

sequence,

Le.

xl

Ix

I,,

andthecomplexity yl ofasubstitution

program y

asthe number ofsymbols required to express y in the form (A,B,C). Then,

110001100011000110001100011

=25,

I1

5and

I(OOO,

,z)

=

.

,z=l,(ooo,,z)=

ooo ooo ooo

OOOl lOOO. For example, is ((10001,1,z), 11111) a

pattern

in 1000110001100011000110001?

What is requiredisthata(11)

+

b(5)

+

cC((10001,1,z), 11111) <25. Ifwetake a=b= 1 and

c=O (thus Ignoringtimecomplexity),thisreducesto 11

+

5 < 25, soit isindeedapattern. Ifwetakec=1 instead ofc=O,and leaveaandbequal toone,then this will still be

a pattern, as long as the computational complexity of obtaining 1000110001100011000110001 from (10001,1,11111)does not exceed 9. Itwouldseem mostintuitivetoassumethat thiscomputational complexityC((10001,1,z),11111)isequal to 5, sincethereare5onesinto which 10001 mustbe substituted,and there isnoeffort

involvedinlocatingthese l’s. Inthatcase the fundamentalinequalityreads

11

+

5

+

5 < 25,which verifiesthata patternis indeedpresent.

Now,

let us look at the

sequence

x 1001001001001001000111001 1001001001001001001011101110100100100100100100110111100100100100100100.

Remember, we are not dealing withgeneralTuring machines, weare onlydealingwith

substitution machines, andanythingwhichcannotberepresentedinthe form(A,B,C),in thenotationgivenabove,isnotasubstitution machine.

Therearetwoobviousways to computethis

sequence

xo0asubstitutionmachine. First ofall, onecanlety=(100100100100100100,B,z),and

z=

B0111001 B1011101110B 110111 B. This amounts to recognizing that 100100100100100100 is repeated in x. Alternatively, onecanlety’=(lOO,B,z’),andletz’=BBBBBB0111001BBBBBB1011101110

BBBBBB

110111BBBBBB. Thisamounts to recognizingthat 100 isapatterninx. Letus assumethata=b=l,andc=O. Then in the firstcase

lYl

+

Izl

24

+

27 51; and in the secondcase ly’l /

Iz’l

9

+

47 56. Since

Ixl

95, both(y,z)and(y’,z’) are patterns

inx.

The problemisthat,sinceweare only usingsubstitutionmachines, thereisno

way

tocombinethetwo

patterns.

One may

say

that 100100100100100100 apatternin

x,

that

100 is a

pattern

in

x,

that 100 is a pattern in 100100100100100100.

BUt,

using only

substitution machines, there isno waytosaythat thesimplest

way

to lookatxis as"a
(8)

168 B. GOERTZEL

y=(100100100100100100,B,z), and z= B 0111001 B 1011101110 B 110111 B. Thus, assuming as wehave that a=b=l andc=O,

Ixl’=51.

This ismuch lessthan

Ixl,

which

equals95.

Now,letusconsider thisoptimaly. Itcontainsthesequence100100100100100100. If we ignore the fact that y denotes a substitution machine, and simply considerthe sequenceof characters "(lO0100100100100100,B,z)’, wecan search forpatternsin this sequence,just as wewould inanyother

sequence.

Forinstance, ifwelety,=(lOO,

C,z,

), and

z,=CCCCCC,

then y,*z,=y,

lY,

=1o, and

Iz,

=6. It is apparent that (y,

z,

) is a

pattern

in y, since

lY,

+

Iz,

10

+

6 16, whereas

lYl

18. Byrecognizing the

pattern

(y,z)inx,and then recognizing thepattern(y,

z,

)iny,onemay expressboth the

repetitionof 100100100100100100 inxand the repetition of 100in 100100100100100100 as patternsin

x,

usingonlysubstitutionmachines,

Is (y,

z,

)a

pattern

inx? Strictly speaking,itisnot. But wemightcall ita second-levelpatterninx. Itisapatterninapatterninx. And,iftherewere apattern

(Y2

z2)in

thesequencesofsymbols representingy, or

z,,

wecould call thatathird-level

pattern

in x, etc. Ingeneral, wemaymake thefollowingdefinition:

DEFINITION 12. Let Fbe amap from S intoS. Wherea first.levelpattern inxis

simplyapatterninx, andnisanintegergreaterthanone,weshallsaythatPisannh. levelpatterninxifthere issome QsothatPisan (n.1) h.level

pattern

inx andPisa

pattern

inF(Q).

In the examples we have given, the map F has been, implicity, the map from substitution machinesinto theirexpressionin(A,B,C)notation.

6.APPROXIMATE PATTERN

Suppose

thaty,*z,=x,whereasy2*z2doesnot equalx,butis stillveryclosetox.

Say

Ixl

=1ooo. Then, evenif ly,

l+lz,

l=9OO

and ly,

+ lz,

=lO, (y,

z,

) isnota

pattern

inx, but (y,

z,

) is. This is not a flawin the definition ofpattern afterall, computing somethingnearxisnotthesame ascomputing x. Indeed,itmightseemthat if(y=

z

) were really so close to computing

x,

it could be modified into a pattern in x without

sacrificingmuchsimplicity.

However,

theextent towhich thisisthecaseisunknown. In orderto incorporate pairs like (y=

z=

), we shall introduce the notion ofapproximate

pattern.

In

ordertodeal withapproximatepattern,wemustassumethatItismeaningfulto talk about thedistanced(x,y)betweentwoelements ofS. Let(y,z)beanyorderedpair for whichy*zis defined. Thenwe have

DEFINITION13. The orderedpair(y,z)isanapproximatepatterninxif[1

+

d(x,y*z) ][

alYl

+

blzl

+

cC(y,z) ] <

Ixl,

where a, b, c and C are definedas in the ordinary

definitionof

pattern.

Obviously, whenx=y*z, thedistance d(x,y*z)betweenxandy*zis equal to

zero,

(9)

MEASURING STATIC COMPLEXITY 169 Of

course,

ifthedistancemeasured is definedsothatd(a,b)isinfinite whenever

a and bare notthesame, thenanapproximatepattemisan exactpattern. Thismeans

that when onespeaks of"approximate

pattern’,

oneis also speakingofordinary, exact

pattern.

Most conceptsinvolving ordinaryor"strict"pattern

may

begeneralized tothecase ofapproximate

pattern.

Forinstance,wehave:

DEFINITION14: The intensityofanapproximate

pattern

(y,z)inxisIN[(y,z)lx]

=

(

Ixl-[

+d(x,y*z)][alYl

+blzl

+cC(y,z)]

)/Ix

I.

DEFINITION15: Wherevandwarebinary

sequences,

theapproximate

com

ofvrelativeto

w,

I.(v,w), isthesmallest valuethat[1 +d(v,y*w)][a }y} +cC(y,w)]takeson for

any program

ywithinput consistingofaminimal

program

forw.

TheIncorporation ofinexactitudepermitsthe definition ofpatternto

encompass

all sodsof Interesting practicalproblems. Forexample,

suppose

xisacurvein theplaneor someotherspace, zisasetofpointsinthat

space,

andyissomeinterpolationformula whichassigns toeach setofpointsa curvepassingthroughthose points. Then I,[(y,z)

ix]

isanIndicatorofhowmuchuseit isto approximatethecurvexby applyingthe

Interpolationformula

y

tothesetofpointsz. 7.

SOPHISTICATION

ANDCRUDITY

As Indicated above, Koppel [8]has recentlyproposed an alternative to the KCS complexitymeasure. According to Koppel’s measure, the

sequences

which are most complexare notthe structurelessones. Neither, ofcourse, aretheythe oneswith

very

simple

structures,

likeO00(XX)tX)(X Rather, themore complex

sequences

aretheones withmore"sophisticated"structures.

Thebasic idea[10]isthata

sequence

witha sophisticatedstructureis

part

ofa nabral class of

sequences,

allof whichare computedby thesame

program.

The

program

producesdifferent

sequences

dependingonthedataitisgiven,but these

sequences

all

possess

thesameunderlyingstructure. Essentially,the

program represents

thestructured

padofthe

sequence,

andthedata the randompart. Therefore,the"sophistication"ofa

sequence

x should be definedas the size of the

program

defining the "natural class"

containingx.

Buthow is this "natural"

program

tobe found? Asabove,where

y

isa

program

and

z

isabinary

sequence,

let yl and

zl

denote thelengthof

y

andzrespectively. Koppel

proposes

thefollowing:

ALGORITHM

1:

1)search overallpairsofbinary

sequences

(y,z)for which the

two-tape

Turingmachine with

program y

and dataz

computes x,

and find thosepairs

for which

yl

/

zl

issmallest.

2)

search overall pairs found in

Step

1, andfindtheoneforwhich

yl

is

biggest. Thisvalueof

zl

isthe "sophistication"ofx.

All the pairs foundin

Step

I are"best"representations of x.

Step

2 searchesallthe
(10)

170 B. GOERTZEL

This

program

isassumedtobe the naturalstructureofx,anditslengthistherefore taken

as ameasureof thesophisticationofthestructureofx.

Thereisnodoubtthat thedecompositionofa

sequence

intoastructuredpartand

arandompartisanimportant and usefulidea. ButKoppel’s algorithmforachievingit is

conceptually problematic. Supposethe

program

pairs (y,

z,

) and (y, z2) both

cause a Turingmachine to outputx, but whereas ly,

l=50

and

Iz,

l=300,

ly21=250and

Iz,

l=lO.

Since

lY, +

Iz,

=350, whereas

lY=I

+

Iz=l

=36o, (y=

z=

)willnotbeselectedin Step 1, which searches for thosepairs(y,z)that minimize

Yl

+

zl

What if, in

Step

2, (y,

z,

)ischosenasthepairwithmaximum

lYl

? Then thesophisticationofxwillbeset at

lY,

=50. Doesitnotseemthat theintuitivelymuchmoresophisticated

program

y,,which computes xalmostaswellas

y,,

shouldcounttoward thesophistication ofx?

Inthelanguageof

pattern,

whatKoppel’s algorithm doesis:

1) Locatethepairs(y,z)thatarethemostintensepatternsinx according to

It,

a=b=l,c=0.

2)

Among

thesepairs, select theonewhich isthemost

intensepatterninxaccordingto

I=1

I,,

a=l, b=c=0.

Itapplies twodifferentspecialcasesof thedefinitionofpattern, oneafter the other. How canallthisbe modifiedtoaccomodate exampleslikethepairs (y,

z,

), (y=

z,

)given above? Oneapproachistolookat some sortofcombinationof

Yl

+

z with

Yl- Yl

+

zl

measuresthecombinedlengthof

program

anddata,and

Yl

measuresthe lengthof the

program.

Whatis desiredisasmall

Yl

+

zl

buta large

Yl.

This issome motivationforlookingat

(lYl

+

Izl)/lYl.

The smallerlyl

+

Izl

gets,the smallerthisquantity gets;and thebigger

Yl

gets,the smalleritgets. Oneapproach to measuring complexity, then, istosearch all(y,z)such thatx=y*z,andpicktheonewhichmakes (lyl

+ Izl)/lYl

smallest. Of course,

(lYl + Izl)/lYl

+ Izl/lYl,

so whatever makes

(lYl + Izl)/lYl

smallest alsomakes

zl/lYl

smallest.

Hence,

inthiscontext,thefollowingisnatural: DEFINITION 16. Thecrudityofapattern (y,z)is z /

Yl.

The crudityis simplythe ratio of data to

program.

The

crudera pattern is, the greatertheproportion of data to

program.

A verycrudepattern is mostly data;anda patternwhichismostly

program

isnotverycrude. Obviously, "crudity"isIntendedas an

Intuitive opposite to "sophistication"; however, it is not exactly the opposite of

"sophistication"as Koppeldefined it.

Thisapproach canalso be interpretedto assign eachx a "naturalprogram" and hencea"natural class’. Onemust simplylookatthepattern(y,z)inxwhosecrudityisthe smallest. The

program

yassociated with thispattern is, in a sense, the mostnatural

program

forx.

8.LOGICAL DEPTH

Bennett [9], asmentionedabove,hasproposed a complexitymeasurecalled"logical depth’, whichincorporatesthetimefactorinaninterestingway. The KCScomplexityof x measuresonlythe length of the shortest

program

requiredforcomputingx itsays

nothingabouthow longthis

program

takestorun. Isitreally correct to calla

sequence

(11)

MEASURING STATIC COMPLEXITY 171

years to run? Bennett’sidea is to lookatthe runningtimeofthe shortest

program

for

computingasequencex. Thisquantityhe calls thelogical depthofthesequence. Oneof the motivations for thisapproach was adesireto capturethesensein which abiological organismismorecomplexthana random

sequence.

Indeed,itis

easy

tosee thata sequence xwithnopatternsinithas the smallestlogical depthof

any sequence.

The shortestprogram forcomputingitis "Print

x",

which obviouslyruns fasterthan

any

other

program

computinga

sequence

of thesame length as x. Andthem isno mason to doubtthehypothesisthatbiological organisms haveahigh logicaldepth. Butitseems to us that, in some

ways,

Bennett’sdefinition is nearly as counterintuitive as the KCS approach.

Suppose them are two competing

programs

for computing x,

program

y and

program y’.

Whatif

y

hasalength of1000 andarunningtime of 10 minutes, but

y’

has a lengthof 999 and a runningtime of 10

years.

Then if

y’

is the shortest

program

for computing

x,

thelogical depthof

x

Isten

years.

Intuitively,thisdoesnseemquite right: itIs notthecasethatxfundamentally requiresten

years

to

compute.

AtthecoreofBennett’s measureistheideathat the shortest

program

forcomputing xis themostnaturalrepresentation ofx. Otherwisewhywould the running time of this

particularprogrambea meaningfulmeasure oftheamountoftimexrequires toevolve

naturally. But one define the "most natural representation" ofa given entity in

many

different

ways.

Bennett’sisonlythesimplest. ForInstance, one

may

studythequantity dC(y,z)

+

elzl/lYl + f(lYl + Izl),

where d, andfarepositiveconstants definedsothat d+e+f=3.

The motivationfor this isas follows. Thesmaller z

I/lYl

is, the less crude is the

pattern

(y,z). And,asIndicatedabove,thecrudityofa

pattern

(y,z)

may

beInterpretedas ameasureof hownaturalarepresentationit is. ThesmallerC(y,z)is, the lesstimeittakes toget x outof(y,z). And, finally, thesmaller lyl

+

Izl s,

themoreintenseapattern (y,z) is. Allthese facts

suggest

thefollowing:

DEFINITION

17: Let

m

denotethesmallestvaluethat the quantity

dC(y,z)

+ elzl/lYl + f(lYl

/

Izl)

assumesfor

any

pair (y,z)such thatx=y*z (assuming

them is suchaminimumvalue). Thedepth complexilyofxmaythen be definedasthe timecomplexityC(y,z)ofthe

pattern

(y,z) atwhichthis minimummisattained.

Setting d=e=Oreduces thedepth complexitytothelogicaldepth as Bennettdefined it. Setting e=Omeansthateverythingisas

Bennett’s

definitionwouldhaveit,

except

that casessuchas thepatterns (y,

z,

), (y,

z,

) described above are resolved in a more

Intuitivematter. Setting f=Omeansthatoneisconsideringthe timecomplexityof themoM sophistJcated leastcrude, moststructured representationof

x,

rather thanmere/ythe

shortest. And keeping alltheconstants nonzero ensures abalancebetween time,

space,

andsophistication.

Admittedly,thisapproachisnotnearlysotidyas Bennett’s. Itskey shortcoming

(12)

172 B. GOERTZEL

for considering all the relevant factors.

9. STRUCTUREAND STRUCTURALCOMPLEXITY

Wehavediscussedseveral differentmeasuresof staticcomplexity, whichmeasure ratherdifferentthings. Butall thesemeasureshave one thingincommon:theyworkby singling outtheonepatternwhich minimizessomequantity. It isequallyinteresting to studythe totalamountofstructureinanentity.

For Instance,

suppose

xand

x,

both haveKCScomplexityA,but whereasxcan only

be computed by one

program

of lengthA,

x,

can be computed by a hundred totally

different

programs

oflength

A.

Doesitnot seemthat

x,

is insomesense morecomplex than

x,

that thereismoreto

x,

thantox?

LetusdefinetheMnectureofx asthesetofall(y,z)whichare approximatepatterns

in x (assuming the constants

a,

b, and c, and the metric d(v,w), havepreviouslybeen

fixed), and denoteitP(x). Then thequestionis: what isameaningfulwaytomeasurethe

size of P(x)? At first one might think to add up the Intensities

[l

+d(Y*Z,X)][alYl

+blzl

+cC(y,z)] of all the elements in P(x). Butthisapproachhasone crucialflaw, revealedbythefollowing example.

Say

xisa

sequence

of10,000characters,and(y,

z,

)isa patterninxwith

[z,I

=70,

lY,

1000, and C(y,

z,

)=2000.

Suppose

thaty, computesthefirst1000 digits ofxfrom thefirst7digitsof

z,

according toacertainalgorithmA. And

suppose

It computesthe

second 1000 digitsofxfrom thenext7digitsof

z,

according tothe samealgorithmA. Andso onfor thethird1000 digitsof

z,,

etc. a/ways usingthesamealgorithmA

Next,

considerthe pair(y,

z,

)whichcomputesthe first 9000digitsofxinthesame manner as

(Y2

z,

), but

computes

the last 1000 digits ofxby storing them in

z,

and

printingthemafter therestofitsprogramfinishes. Wehave

z,

1063,andsurely Y, Is notmuch larger than Y,

I.

Let’s say y, 1o. Furthermore, C(y,

z,

) is certainlyno

greater

than C(y,

z,

): after all, the change from (y,

z,

) to (y,

z,

) Involved the replacement ofseriouscomputationwithsimplestorageand printing.

Thepointisthat both(y,

z,

)and(y

z,

) are patternsin

x,

but incomputing the totalamountofstructure in

x,

itwould befoolishto countboth of them. Ingeneral, the problemisthat differentpatterns mayshare similar

components,

and it isunacceptable tocounteachofthese

components

several times. Inthe

present

examplethe solution is

easy:

don count (y, z2). BUt one

may

alsoconstruct examplesof

very

different

patterns

whichhaveasignificant, sophisticated

component

in common. Clearly,what isneeded

isa generalmethod of dealingwithsimilarities between

patterns.

Recall thatI.(v w)wasdefinedastheapproximate version of the effortrequired to

compute

vfrom aminimal

program

forw, so that ifvandwhavenothing in common,

I.(v,w)=l.(v).

And,ontheotherhand,ifvandwhavea largecommon component,then bothI.(v,w)andI.(w,v)are

very

small.

I.(vlw

)isdefinedonlywhenvandw are

sequences.

But weshall also needtotalk aboutone

program

beingsimilartoanother. Inordertodo
(13)

MEASURING STATIC COMPLEXITY !73

asit iscomputableon aTuringmachine, anditdoesnot assignthesame

sequence

toany two differentprograms.

The introduction ofa programming languageLpermits us todefine the complexity of a

program

y as I.(L(y)), and to define the complexityofone

program

y, relative to anotherprogramY2 as I.(L(y, ) L(Y, )). Asthelengths ofthe

programs

involved increase, thedifferencesbetweenprogramming languagesmatterless and less. Tobeprecise, let Land

L,

be

any

two programming languages,computable on Turingmachines. Then it canbe shown that,as L(y,

)

andL(y2

)

approachInfinity, the ratiosI.(L(y, ))/I.(L,(y, ))and I.(L(y,)

L(Y=

))/I.(L, (y, )

L, (Y,

))bothapproach 1.

Where is

any

binary

sequence

of length

n,

let D(z) be the binary

sequence

of length2n obtainedby replacingeach 1 in with01, andeach 0 in with10. Wherewand are

any

two binary

sequences,

letwz denote the

sequence

obtained by placing the

sequence

111attheendofD(w),andplacingD(z)atthe endofthiscomposite

sequence.

The point is that 111 cannot occurin eitherD(z) or D(w), so that wz is essentiallyw Juxtaposedwith

z,

with111 asamarker inbetween.

Now,we

may

definethecomplexityofa program.datapair(y,z) as

I.

(L(y)z), andwe

may

define thecomplexityof(y,z)relativeto (y,

z,

)as

I.

(L(y)z L(y,)z, ). We

may

define thecomplexityof(y,z)relativeto asetofpairs {(y,

z,

),(y,,

z,

) (y,,

z,

)} to be

I.

(L(y)z L(y, )z,L(y,)z,...L(y,)z, ). This is the toolwe need tomakesense of thephrase

the totalamountofstructure of

Let Sbeanysetofprogram-data pairs(x,y). Thenwe

may

define thesize

IS

ofS asthe resultofthe following

process:

ALGORITHM2:

StepO. Makea list ofallthepatternsinS, andlabel them(y,

z,

), (y,

z,

), (y.

z.

).

Step

1.Let s,(x)=l.(L(y, )z, )

Step

2. Let s,(x)=s,(x)+l.(L(y,

)z,

)l(L(y,

)z,

) Step3. Let s,(x) =s,(x)

+

l.(L(y,)z, L(y, )z,L(y,)z, ))

Step4. Lets,(x)=s,(x)+l.(L(y,)z, lL(y, )z,L(y, )z,)L(y,)z, ))...

Step N. Let

ISl

=s(x)=s,,(x)+l.(L(y,)z, lL(y, )z,L(y, )z,

)...i.(y,,

)z,, )

Atthekh

step,

onlythatportion of(y,

z,

)which isindependentof

{(y,

z,

), (y,.,

,z,., )}

isaddedontothecurrentestimate of

Sl.

Forinstance, in

Step

2,

if(y,

z,

)isindependent of(y,

z,

),then thisstepincreasestheinitial estimateof

Sl

by thecomplexity of(y,

z,

). Butif(y,

z,

)ishighlydependent on (y,

z,

), notmuch will be

addedontothe first estimate. Itisnotdifficultto seethat this

process

will arriveatthe

same answer regardlessof the orderin whichthe(y,

z,

)

appear:

THEOREM2:Theresu/tofAlgorithm 2is invariantunder permutationof the(y,

,z,

).

Where P(x) is the set of all

patterns

in x, we

may

now define the structural complexity ofx to be the quantity

P(x)

I.

This, we

suggest,

is the sense of the word complexity" thatone uses when one

says

thata

person

is more complex than atree, whichismore complexthana bacterium. Inaway,structuralcomplexitymeasureshow
(14)

174 B. GOERTZEL

bacterium.

REFERENCES

1.KRONSJO,I_

Alclorithms:

TheirComplexity and Efficiency, Wiley.lnterscience,NewYork,

1979.

2.

BOWEN,

R.Symbolic DynamicsforHyperbolic Flows,Am.

J.

ofMath.95(1973),421-460.

3.

KOLMOGOROV,

A.N. ThreeApproaches totheQuantitativeDefinition ofInformation,

Prob. Info. Transmission 1 (1965), 1.7.

4.

CHAITIN,

G.Information.TheoreticComputational Complexity, IEEE.TI120(1974), 10-15. 5. SOLOMONOFF, I_ A FormalTheoryof Induction, Pads I.andI1., Info. andControl

20

(1964),224-254.

6.

BENNETT,

C.H.TheThermodynamicsofComputation

A

Review,Int.

J.

Theor.Phys.21 (1982), 905-940.

7.ZUREK, W.H.AlgorithmicInformationContent,Church.Turing Thesis,Physical

Entropy,

and Maxwell’sDemon, InComplexity_,

Entro_pv

and the Physics of Information,(ed. W.H. Zurek),Addison.Wesley, New York, 1990.

8:KOPPEL, MOSNE. Complexity, DepthandSophistication, Complex

Systems

1 (1987), 1087-1091.

9. CHAITIN, G. Toward aMathematical Definition ofLife, InThe Maximum

Entropy

Formalism(ed. LevineandTribus), MIT Press,Cambridge MA, 1978.

References

Related documents