lnternat. J. Math. & Math. Sci. VOL. [5 NO. (1992) 161-174
MEASURING STATIC COMPLEXITY
BEN GOERTZEL
Departmentof Mathematics
UniversityofNevada,Las Vegas Las
Vegas
NV89154(Received August 17, 1990 and in revised form March 1, 1991}
ABSTRACT:Theconceptof
"pattern"
isintroduced,formallydefined, and usedto analyzevariousmeasures ofthe complexityof finite binary
sequences
andother objects. The standardKolmogoroff.Chaitin.Solomonoff complexitymeasureisconsidered,alongwith Bennett’s’logicaldepth; Koppel’s"sophistication’,and Chaitin’sanalysisof thecomplexityofgeometric objects. Thepattern.theoretic pointof view illuminatesthe shortcomingsof these measures and leads to specific improvements, it giles rise to two novel
mathematical
concepts
"orders" ofcomplexity and "levels" ofpattern,
andityieldsa newmeasureof complexity, the "structural complexity’, whichmeasuresthe totalamountof structure anentity
possesses.
KEY WORDS AND PHRASES. Kolmogorov complexity,
algorithritic
Information,pattern,
sophistication,
structure,
depthAMS SUBJECT
CLASSIFICATION
CODE. 68Q301.
INTRODUCTION
Different contexts require different
concepts
of "complexity’. In the theory of computational complexity, as outlined forinstance byKronsjo [1], one deals with the complexity of problems. And the complexity of evolvingsystems
falls under the aegis of dynamicalsystems
theory, asconsideredforexample by Bowen[2].
Thepresent
paper,
however,isconcerned with thecomplexityof staticobjects,asubjectwhichhasreceiled rather little attention. Althoughmostof the discussionfocusesonbinarysequences,
the implicationsaremuch more general.162 B. GOERTZEL
DEFINITION1. LetMbeauniversalTuringmachine. Let us saythata
program
forM
isserf.delimiting ifitcontainsasegment
tellingMitstotallengthin bits. Then,theKCS
complexity of a finite binary
sequence
x is the length of the shortestprogram
whichcomputes
x on M.In the decades since its conception, this definition has led to a number of Interestingdevelopments. Chaitin[4]has useditto provideanInteresting new proofof
Godel’s theorem; and Bennett [6], Zurek[7]and othershaveapplied it to problems in
thermodynamics. However, it has Increasingly beenrealized that the concept of
KCS
complexityfallstocapturethe Intuitive meaning of"complexity."
The problem is that, according to the KCS definition, "random", structureless
sequences
arejudgedthemost complex. The leastcomplexsequences
arethose likeO(X)IX)O000...O00, 010101010101...010101, and 1010010001000010(XXX)...O, which canbe computed byveryshort
programs.
And themostcomplexsequences
x arethose which cannotbecomputed byany
programshorterthan"printX’. There isasensein which this isnot adesirableproperty
foradefinition ofcomplexitytohaveIn
whichahumanor aree
orthesequence
ofprime numbersismore"complex"thanarandomsequence.
Over thepast
decade, there have been two noteworthyattempts to remedythis deficiency: Bennett’s [6]"logical depth", andKoppel’s[8]"sophistication."Weoutline a generalmathematical frameworkwithin which various measuresof
complexity
may
beformulated,analyzedandcompared. Thisapproachyields significantmodifications of these
measures,
as well as several novel, generalconcepts
for the analysisofcomplexity. Furthermore,itgilesrisetoanentirelynewcomplexitymeasure,
the "structural complexity’, which measures the total amount of structure an entity
possesses.
Intuitilely,thistellsone"how much thereisto say"abouta gilen object.2. PATTERN
DEFINITION 2.
A pattern space
is a set(S,*,l
I),
whereS
is a set, * is a binaryoperationdefinedon some subsetof
SxS,
and isamap
fromSintothe nonnegatile real numbers.Let usconsiderasimple example: Turingmachinesandfinitebinary
sequences.
DEFINITION3. Let
y
beaprogram
foraunilersalTuring machine;let beafinitebinary
sequence.
Definey*z to bethebinarysequence
whichappears
on thememory
tape
of the Turingmachineafter, having been startedwith onitsinputtape
beginning directly under thetape
headand extendingtothe right,program y
finishesrunning. Ify
never
Mops
running, then lety*zbe undefined.Let
zl
denote the length of asIts
length,and let
yl
denote thelength of theprogram
y.
Nowwe are ready togileageneraldefinItionof
pattern.
DEFINITION
4. Leta,
b, andc denoteconstant,
nonnegative numbers. Thenanordered pair(y,z)isapafternIn xifx=y*zand
alYl
+
blzl
+
cC(y,z) <Ixl,
whereC(y,z) denotesthe complexity of obtainingxfrom(y,z).DEFINITION6. ff
y
isa Turingmachineprogram
and isafinItebinarysequence,
MEASURING STATIC COMPLEXITY 163 equippedwith
program
yand givenzas initial input.For many
purposes,
the numbersa,b andc arenotImportant.
Often theycanall be takento equal1,sothattheydonotappear
in the formulaatall. Butinsomecasesitmay
beusefulto, forinstance,seta=b= 1andc=O. Then the formula reads
Yl
+
Izl
<Ixl.
Theconstants couldbedispensedwith,butthen itwould benecessary
toredefine and Cmore often.Intuitively, an orderedpair (y,z) is a patternin xifthe complexity of
y,
plus thecomplexityofz,plusthecomplexityofgettingxoutof
y
andz,
isless than thecomplexityofx. Inotherwords, anorderedpair (y,z)isapatterninxifit issimpler to
represent
xintermsof
y
andzthanitistosay "x’. Theconstantsa,
bandcare,
ofcourse,weights: If a=3/4 and b=5/4, for example, then the complexity of a is counted less than thecomplexity ofb.
The definition ofpattern canbe generalizedto ordered n.tuples, and to takeinto
accountthepossibilityofdifferentkindsof combination,say *,and
*:
DEFINITION7:Anorderedsetofnentities(x,,x2
,...,x.
)isa patterninxifx=x,*,
x:*:
...*.
x.
anda, lx,
+a:lx:l +...+a.lx.I
+
a./,C(x,,...,x.
) <Ixl,
whereC(x,,...,x.
)isthecomplexityofcomputing
x,*x,*...*x,
anda, ,...,a.,
arenonnegative numbers. Also, the followingconceptwill be ofuse:DEFINITION 8: The intensityinxofa ordered pair(y,z) such thaty*z=x
may
bedefinedas IN[(y,z)lx] (
Ixl
-/alyl /bizl
/cC(y,z)])/IxlObviously,thisquantityispositivewhenever(y,z)isapatternin
x,
andnegativeor zerowhenever itisnot;anditsmaximum value is 1.3.ANEXAMPLE: GEOMETRIC PATTERN
Mostof the present
paper
is devotedto Turingmachinesandbinarysequences.
However,
the definition ofpattern
doesnotInvolvethetheoryofcomputation; essentially, apattern
is a "representationas something simple’. Instead of Turlng machines andbinary
sequences
letus now considerpictures.Suppose
thatA
is a one inchsquare
picture,and
B
isafive inchsquare
picture madeup
oftwenty.five
non.overlappingone. inchpicturesidenticaltoA. Intuitively, it issimpler torepresent
B
as anarrangement
of copiesofA, thanit istosimplyconsiderBasa"thingin itself’.Very
roughly speaking,it wouldseemlikelythatpartoftheprocess
ofrememberingwhatBlooks like consists ofrepresentingBas anarrangementof copies ofA.
This Intuitionmaybeexpressedintermsof the definition of
pattern.
Wherexandy
aresquare
regions, let:y *, z
denote theregionobtainedbyplacingy
tothe right ofz
Y
"2
z
denote theregionobtainedbyplacingy
to theleft ofzy
*
zdenote theregionobtainedby placingy
belowzy
*, zdenote theregionobtainedbyplacingy
abovez
And,althoughthisis obviously averycrudemeasure,letusdefinethecomplexity
Ix
of asquareregionwithablack.and.whitepicturedrawn initastheproportionof theregion164 B. GOERTZEL
The operations *,,
"2
*
and *,may
be called simple operations. Compoundoperationsare,then, compositionsofsimple operations,suchastheoperation(x*, w*,x)*, w. Ifyisa compoundoperation, letusdefine itscomplexity
Yl
tobe thelengthof the shortestprogram
whichcomputes
the actualstatementofthecompound operation. ForInstance, }(x*
w* x)*, w is defined to be the lengthofthe shortestprogram
which outputsthesequenceofsymbols"(x* w* x)*,w’.Whereyisa simple operationandzisa
square
region,lety*z denote theregionthat results fromapplying
y
to z. A compound operation actson a numberofsquare
regions. Forinstance,(x*,
w*,
x)*, wactsonwandxboth. We mayconsideritto acton the orderedpair(x,w). Ingeneral, wemayconsidera compound operationy
to actonanorderedsetof
square
regions (x,,x,
xn
), wherex,
isthe letterthatoccursfirst in the statementofy,x2isthe letter thatoccurssecond, etc. Andwemay
definey*(x,xn
)tobe theregionthat results fromapplyingthecompound operation
y
totheorderedsetofregions(x,
x
).Letusreturn tothetwopictures,AandB,discussed above. Letq=A*,A*,A*, A*, A.
Then,
itiseasy
to seethatB=q*,q*,q*,q*,q. In otherwords, B(A*, A *, A*, A*, A)*,(A*,
A*,
A*, A*, A)*,(A*, A*, A*, A*, A)*, (A*, A*, A*, A*, A)*,(A*, A*,A*,
A*,A). Where y is the compound operation given in the previous sentence, we have B=y*A. The complexityofthatcompound operation,
lYl,
is certainlyvery
close to thelengthof the
program
"Letq=A
*, A*, A*, A*, A;printq*,q*,q*,q*’.
Notethat thisprogram
isshorter than the
program
"Print(A*,A*, A*,
A*, A)*,(A*, A*, A*,A*,
A)*,(A*,A*, A*,
A*,
A)*(A*, A*, A*, A*, )*, (A*, A*, A*, A*, A)", soitis clear that the latter shouldnotbe used in thecomputationof
Yl-We havenotyetdiscussedthe termC(y,(B,
,B.
)), whichrepresents
theamountofeffort requiredto
execute
the compound operationy
onthe regions (x,,...,x.
). For
simplicity’s sake, we shall simply set it equal to the number of times the symbol "*" appearsin thestatementofy;that is, tothe number ofsimple operations involvediny.
So,is(y,A) apatterninB? Letusassumethat theconstantsa,b andc areallequal to 1. Weknowy*A=B;thequestioniswhether
lYI
/IAI
/C(y,A) <IBI.
According tothe above definitions,
Yl
is 37 symbols long. Obviouslythis is a matterof the particularnotationbeing used. Forinstance,it would be less ifonly one characterwereusedtodenote*,,and it would bemoreif itwerewritteninbinarycode.C(y,z)is eveneasierto
compute:
there are24 simple operations involved in the constructionofBfromA.Sowe have,
very
roughly speaking,37+
Izl
+
24 <Ixl.
Thisistheinequality that mustbesatisfiedff(y,z)istobe consideredapatterninx. Rearranging, wefind:Izl
<Ixl
61. Recall that we defined thecomplexity ofa regionas theproportion ofblackwhich it contains. Thismeansthat(y,z)isapatterninxifandonlyifittheamount
ofblack required to drawB exceedsamountofblackrequired todrawA by atleast 62.
MEASURING S’rATIC COMPIEXI’IY 1)5
region. Ingeneral, wemaydefine
I(x,
,x, )1
{x,I
+...+Ixnl,
assuming that theamountof blackinaunionofdisjointregionsisthesumof theamountsof blackintheindividual
regions. From this it follows that (y,(x,
xn
)) is apattern
in x if and only ifalYl
+
b(Ix, l/.../lx.I)
/ cC(y,(x,,...,x.)) <Ixl.
Resultssimilartothese could also beobtainedfromadifferentsortofanalysis. In ordertodealwithregions other thansquares,it isdesirableto replace *,, *,, %,
**
witha single "joining"operation*, namelythethe set.theoreticunion U. Letz=(x,,
xn
), let ybeaTuringmachine, let f beamethodforconvertingapictureintoabinarysequence,
and letgbe amethodforconvertingabinary
sequence
intoa picture. Thenwehave DEFINITION 9: If xx,
Ux
U Uxn
then (y,z,f,g) is apattern
in x ifalYl +blzl +clfl
+dlgl +eC(y,z,f,g) <Ixl.
Wehavenotsaidhow
Ill
and gl aretobe defined.However,
thiswould require adetailedconsiderationof thegeometricspacecontainingx,and thatwouldtakeustoofarafield. This general approachissomewhatsimilarto that taken in Chaitin[9].
4. ORDERS OF COMPLEXITY
It should be apparentfrom the foregoing thatcomplexityandpattern are deeply
interrelated. In this and the following sections, we shall explore several different
approaches to measuring complexity, all ofwhichseekto
go
beyondthesimplisticKCS approach. According to theKCS approach, complexitymeans structurelessness. Themost "random;least structured
sequences
arethemostcomplex. The formulation of thisapproach was a great step forward. But it seems clear that the next
step
is to giveformulaswhichcapture moreof the intuitivemeaningof theword"complexity".
First,weshallconsiderthe idea thatpatternItselfmaybeusedtodefinecomplexity.
Recall thegeometricexampleof the previous section, inwhichthecomplexityofablack. and.whitepictureina
square
region wasdefinedastheamount ofblackrequired todraw it. Thismeasuredidnotevenpresume
togauge
theeffort requiredtorepresent
ablack.and.whitepictureina
square
region. Oneway
tomeasurethe effortrequiredtorepresent
such apicture, callit
x,
isto lookatallcompound operationsy,
and allsets ofsquare
black.and.whitepictures(x,,...,x,
), such thaty*(x,,...,x.
)=x. Onemay
thenask whichy
and(x, x,)givethesmalleMvalue of
alYl
+ b(Ix, +
+ Ix.l)
/ cc(y,(x,,...,x.)). Th#s minimal value ofalYl
+
b(Ix, l+...+lx,
I)
may
be defined to be the "second.order" complexityofx. The second.order complexityisthen bea measureof howsimplyx
can berepresented intermsofcompound operationsonsquare
regions.Ingeneral,givenanycomplexitymeasure
I,
we may usethissortofreasoningto definea complexitymeasureI’.
DEFINITION 10: If is a complexity measure,
I’
isthe complexity measuredefinedsothat,for all
x,
Ixl’
is the smallest value that thequantityalYl
+
blzl
+
cC(y,z) takeson, forany(y,z) such thaty*z=x.xl’
measureshowcomplexthesimplest representation ofxis, wherecomplexityismeasuredby
I.
Sometimes, asinourgeometricexample, andI’
willmeasure166 B. GOERTZEI
Extendingthisprocess,onecanderivefrom
I’
ameasureI":
thesmallest value thatthequantityalYl’
+blzl’
+
cC(y,z) (4.1)takes on, forany (y,z) such thaty*z=x.
Ixl"
measures thecomplexity of the simplestrepresentation ofx,wherecomplexityismeasuredby
I’.
And fromI",
onemay
obtainameasure
I’".
Itis clear thatthisprocess maybecontinued indefinitely.Itisinteresting toask when and
I’
are equivalent, oralmost equivalent. For Instance, assumethatyisaTuring machine, andxand are binarysequences. If,inthenotationgiven above, welet
I,,
thenIx
l’
isanaturalmeasureof the complexityofasequencex. Infact, if a=b= 1andc=O,/tis exactlytheKCScomp/ex/tyofx. Without
specifying
a,
b andc,letusnonethelessuseChaitin’s[4]notationfor thiscomplexity:I(x). Also, letus adoptChaitin’s notationI(vl
w) for the complexityofvrelativetow. DEFINITION11. LetybeaTuringmachineprogram, vandwbinarysequences;thenI(vlw)
denotes the smallest value the quantityalYl,+cC,
(y,w) takes on for any self.delimitingprogram ythatcomputes vwhen itsinputconsistsofa minimal.lengthprogram forcomputing w.
Intuitively,thismeasureshow hardit istocompute vgivencomplete knowledgeofw. Finally,itshould be noted that and
I’
arenot always substantiallydifferent:THEOREM 1. If
Ixl’=l(x),
a=b=l,andc=O,thenthere issomeKsothat for all x,lxl’-
Ixl"l
< K.PROOF:
alYl’
+
blzl’
+
cC(y,z)lYI’
+
Izl:
So,whatisthesmallest value that lyl’+
Izl’
assumesforany
(y,z)such thaty*z=x? Clearly, thissmallest valuemustbeeitherequal to
xl’.
For, what ffYI’
+
zl’
is biggerthanxl’?
Thenitcannotbe the smallest yl’+ zl;
because ffone tookzto be the "empty sequence" (thesequence
consistingofno characters)and then tookytobe theshortest
program
forcomputingx,
onewould have
Izl’=o
andlYl
’=
Ixl:
And, onthe otherhand,is itpossibleforlYI’+ Izl’
tobesmaller than
Ix
’? IfY
I’+
zI’
weresmaller thanx,
thenorecouldsupply a Turingmachinewitha
program
saying"Plugthesequence
zintotheprogram y,"andthelengthof this
program
wouldbe greaterthanIx
l’
byno more than the lengthof theprogram
P(y,z)=’Plug the
sequence
z intotheprogram
y". This lengthis the constantKin the theorem.COROLLARY1. For a Turingmachineforwhichthe
program
P(y,z)mentioned intheproofisa"hardware function" which takesonly oneunitoflength toprogram,
’’=
I’-PROOF:Both
I’
andI"
areInteger
valued,andbythetheorem, forany x,Ixl’-<
Ixl"<- Ixl’+l.
5. PATTERNS IN
PATTERNS; SUBSTITUTION
MACHINESWehavediscussedpatternin
sequences,
andpatternsinpictures. Itisalsoquitepossible toanalyzepatternsinotherpatterns. Thisisinterestingfor
many reasons,
one being thatwhendealingwith machinesmorerestricted than Turing machines, itmay
often be thecasethattheonlyway
toexpressan intuitively simple phenomenonisas apattern
MEASURING STATIC COMPLEXITY 167
Letus considera simple example. Suppose thatwe are dealing notwith Turing
machines, but rather with "substitution machines" machines which are capable of
running onlyprogramsofthe formP(A,B,C)=’Wherever
sequence
B occursinsequence
C, replaceitwith sequence A’. Instead ofwriting P(A,B,C) each time, we shalldenote
such a
program
with the symbol (A,B,C). For instance,(1,10001,1000110001100011000110001)
=
11111. (A,B,C)should be read "substituteAfor Bin C’.We maydefinethecomplexity
xl
ofasequence
x asthelength of thesequence,
Le.
xl
Ix
I,,
andthecomplexity yl ofasubstitutionprogram y
asthe number ofsymbols required to express y in the form (A,B,C). Then,110001100011000110001100011
=25,I1
5andI(OOO,
,z)=
.
,z=l,(ooo,,z)=ooo ooo ooo
OOOl lOOO. For example, is ((10001,1,z), 11111) apattern
in 1000110001100011000110001?What is requiredisthata(11)
+
b(5)+
cC((10001,1,z), 11111) <25. Ifwetake a=b= 1 andc=O (thus Ignoringtimecomplexity),thisreducesto 11
+
5 < 25, soit isindeedapattern. Ifwetakec=1 instead ofc=O,and leaveaandbequal toone,then this will still bea pattern, as long as the computational complexity of obtaining 1000110001100011000110001 from (10001,1,11111)does not exceed 9. Itwouldseem mostintuitivetoassumethat thiscomputational complexityC((10001,1,z),11111)isequal to 5, sincethereare5onesinto which 10001 mustbe substituted,and there isnoeffort
involvedinlocatingthese l’s. Inthatcase the fundamentalinequalityreads
11
+
5+
5 < 25,which verifiesthata patternis indeedpresent.Now,
let us look at thesequence
x 1001001001001001000111001 1001001001001001001011101110100100100100100100110111100100100100100100.Remember, we are not dealing withgeneralTuring machines, weare onlydealingwith
substitution machines, andanythingwhichcannotberepresentedinthe form(A,B,C),in thenotationgivenabove,isnotasubstitution machine.
Therearetwoobviousways to computethis
sequence
xo0asubstitutionmachine. First ofall, onecanlety=(100100100100100100,B,z),andz=
B0111001 B1011101110B 110111 B. This amounts to recognizing that 100100100100100100 is repeated in x. Alternatively, onecanlety’=(lOO,B,z’),andletz’=BBBBBB0111001BBBBBB1011101110BBBBBB
110111BBBBBB. Thisamounts to recognizingthat 100 isapatterninx. Letus assumethata=b=l,andc=O. Then in the firstcaselYl
+
Izl
24+
27 51; and in the secondcase ly’l /Iz’l
9+
47 56. SinceIxl
95, both(y,z)and(y’,z’) are patternsinx.
The problemisthat,sinceweare only usingsubstitutionmachines, thereisno
way
tocombinethetwo
patterns.
One maysay
that 100100100100100100 apatterninx,
that100 is a
pattern
inx,
that 100 is a pattern in 100100100100100100.BUt,
using onlysubstitution machines, there isno waytosaythat thesimplest
way
to lookatxis as"a168 B. GOERTZEL
y=(100100100100100100,B,z), and z= B 0111001 B 1011101110 B 110111 B. Thus, assuming as wehave that a=b=l andc=O,
Ixl’=51.
This ismuch lessthanIxl,
whichequals95.
Now,letusconsider thisoptimaly. Itcontainsthesequence100100100100100100. If we ignore the fact that y denotes a substitution machine, and simply considerthe sequenceof characters "(lO0100100100100100,B,z)’, wecan search forpatternsin this sequence,just as wewould inanyother
sequence.
Forinstance, ifwelety,=(lOO,C,z,
), andz,=CCCCCC,
then y,*z,=y,lY,
=1o, andIz,
=6. It is apparent that (y,z,
) is apattern
in y, sincelY,
+
Iz,
10+
6 16, whereaslYl
18. Byrecognizing thepattern
(y,z)inx,and then recognizing thepattern(y,z,
)iny,onemay expressboth therepetitionof 100100100100100100 inxand the repetition of 100in 100100100100100100 as patternsin
x,
usingonlysubstitutionmachines,Is (y,
z,
)apattern
inx? Strictly speaking,itisnot. But wemightcall ita second-levelpatterninx. Itisapatterninapatterninx. And,iftherewere apattern(Y2
z2)inthesequencesofsymbols representingy, or
z,,
wecould call thatathird-levelpattern
in x, etc. Ingeneral, wemaymake thefollowingdefinition:DEFINITION 12. Let Fbe amap from S intoS. Wherea first.levelpattern inxis
simplyapatterninx, andnisanintegergreaterthanone,weshallsaythatPisannh. levelpatterninxifthere issome QsothatPisan (n.1) h.level
pattern
inx andPisapattern
inF(Q).In the examples we have given, the map F has been, implicity, the map from substitution machinesinto theirexpressionin(A,B,C)notation.
6.APPROXIMATE PATTERN
Suppose
thaty,*z,=x,whereasy2*z2doesnot equalx,butis stillveryclosetox.Say
Ixl
=1ooo. Then, evenif ly,l+lz,
l=9OO
and ly,+ lz,
=lO, (y,z,
) isnotapattern
inx, but (y,z,
) is. This is not a flawin the definition ofpattern afterall, computing somethingnearxisnotthesame ascomputing x. Indeed,itmightseemthat if(y=z
) were really so close to computingx,
it could be modified into a pattern in x withoutsacrificingmuchsimplicity.
However,
theextent towhich thisisthecaseisunknown. In orderto incorporate pairs like (y=z=
), we shall introduce the notion ofapproximatepattern.
In
ordertodeal withapproximatepattern,wemustassumethatItismeaningfulto talk about thedistanced(x,y)betweentwoelements ofS. Let(y,z)beanyorderedpair for whichy*zis defined. Thenwe haveDEFINITION13. The orderedpair(y,z)isanapproximatepatterninxif[1
+
d(x,y*z) ][alYl
+
blzl
+
cC(y,z) ] <Ixl,
where a, b, c and C are definedas in the ordinarydefinitionof
pattern.
Obviously, whenx=y*z, thedistance d(x,y*z)betweenxandy*zis equal to
zero,
MEASURING STATIC COMPLEXITY 169 Of
course,
ifthedistancemeasured is definedsothatd(a,b)isinfinite whenevera and bare notthesame, thenanapproximatepattemisan exactpattern. Thismeans
that when onespeaks of"approximate
pattern’,
oneis also speakingofordinary, exactpattern.
Most conceptsinvolving ordinaryor"strict"pattern
may
begeneralized tothecase ofapproximatepattern.
Forinstance,wehave:DEFINITION14: The intensityofanapproximate
pattern
(y,z)inxisIN[(y,z)lx]=
(Ixl-[
+d(x,y*z)][alYl+blzl
+cC(y,z)])/Ix
I.
DEFINITION15: Wherevandwarebinary
sequences,
theapproximatecom
ofvrelativeto
w,
I.(v,w), isthesmallest valuethat[1 +d(v,y*w)][a }y} +cC(y,w)]takeson forany program
ywithinput consistingofaminimalprogram
forw.TheIncorporation ofinexactitudepermitsthe definition ofpatternto
encompass
all sodsof Interesting practicalproblems. Forexample,suppose
xisacurvein theplaneor someotherspace, zisasetofpointsinthatspace,
andyissomeinterpolationformula whichassigns toeach setofpointsa curvepassingthroughthose points. Then I,[(y,z)ix]
isanIndicatorofhowmuchuseit isto approximatethecurvexby applyingtheInterpolationformula
y
tothesetofpointsz. 7.SOPHISTICATION
ANDCRUDITYAs Indicated above, Koppel [8]has recentlyproposed an alternative to the KCS complexitymeasure. According to Koppel’s measure, the
sequences
which are most complexare notthe structurelessones. Neither, ofcourse, aretheythe oneswithvery
simple
structures,
likeO00(XX)tX)(X Rather, themore complexsequences
aretheones withmore"sophisticated"structures.Thebasic idea[10]isthata
sequence
witha sophisticatedstructureispart
ofa nabral class ofsequences,
allof whichare computedby thesameprogram.
Theprogram
producesdifferent
sequences
dependingonthedataitisgiven,but thesesequences
allpossess
thesameunderlyingstructure. Essentially,theprogram represents
thestructuredpadofthe
sequence,
andthedata the randompart. Therefore,the"sophistication"ofasequence
x should be definedas the size of theprogram
defining the "natural class"containingx.
Buthow is this "natural"
program
tobe found? Asabove,wherey
isaprogram
andz
isabinarysequence,
let yl andzl
denote thelengthofy
andzrespectively. Koppelproposes
thefollowing:ALGORITHM
1:1)search overallpairsofbinary
sequences
(y,z)for which thetwo-tape
Turingmachine with
program y
and datazcomputes x,
and find thosepairsfor which
yl
/zl
issmallest.2)
search overall pairs found inStep
1, andfindtheoneforwhichyl
isbiggest. Thisvalueof
zl
isthe "sophistication"ofx.All the pairs foundin
Step
I are"best"representations of x.Step
2 searchesallthe170 B. GOERTZEL
This
program
isassumedtobe the naturalstructureofx,anditslengthistherefore takenas ameasureof thesophisticationofthestructureofx.
Thereisnodoubtthat thedecompositionofa
sequence
intoastructuredpartandarandompartisanimportant and usefulidea. ButKoppel’s algorithmforachievingit is
conceptually problematic. Supposethe
program
pairs (y,z,
) and (y, z2) bothcause a Turingmachine to outputx, but whereas ly,
l=50
andIz,
l=300,
ly21=250andIz,
l=lO.
SincelY, +
Iz,
=350, whereaslY=I
+
Iz=l
=36o, (y=z=
)willnotbeselectedin Step 1, which searches for thosepairs(y,z)that minimizeYl
+
zl
What if, inStep
2, (y,z,
)ischosenasthepairwithmaximumlYl
? Then thesophisticationofxwillbeset atlY,
=50. Doesitnotseemthat theintuitivelymuchmoresophisticatedprogram
y,,which computes xalmostaswellasy,,
shouldcounttoward thesophistication ofx?Inthelanguageof
pattern,
whatKoppel’s algorithm doesis:1) Locatethepairs(y,z)thatarethemostintensepatternsinx according to
It,
a=b=l,c=0.2)
Among
thesepairs, select theonewhich isthemostintensepatterninxaccordingto
I=1
I,,
a=l, b=c=0.Itapplies twodifferentspecialcasesof thedefinitionofpattern, oneafter the other. How canallthisbe modifiedtoaccomodate exampleslikethepairs (y,
z,
), (y=z,
)given above? Oneapproachistolookat some sortofcombinationofYl
+
z withYl- Yl
+
zl
measuresthecombinedlengthofprogram
anddata,andYl
measuresthe lengthof theprogram.
Whatis desiredisasmallYl
+
zl
buta largeYl.
This issome motivationforlookingat(lYl
+
Izl)/lYl.
The smallerlyl+
Izl
gets,the smallerthisquantity gets;and thebiggerYl
gets,the smalleritgets. Oneapproach to measuring complexity, then, istosearch all(y,z)such thatx=y*z,andpicktheonewhichmakes (lyl+ Izl)/lYl
smallest. Of course,
(lYl + Izl)/lYl
+ Izl/lYl,
so whatever makes(lYl + Izl)/lYl
smallest alsomakes
zl/lYl
smallest.Hence,
inthiscontext,thefollowingisnatural: DEFINITION 16. Thecrudityofapattern (y,z)is z /Yl.
The crudityis simplythe ratio of data to
program.
The
crudera pattern is, the greatertheproportion of data toprogram.
A verycrudepattern is mostly data;anda patternwhichismostlyprogram
isnotverycrude. Obviously, "crudity"isIntendedas anIntuitive opposite to "sophistication"; however, it is not exactly the opposite of
"sophistication"as Koppeldefined it.
Thisapproach canalso be interpretedto assign eachx a "naturalprogram" and hencea"natural class’. Onemust simplylookatthepattern(y,z)inxwhosecrudityisthe smallest. The
program
yassociated with thispattern is, in a sense, the mostnaturalprogram
forx.8.LOGICAL DEPTH
Bennett [9], asmentionedabove,hasproposed a complexitymeasurecalled"logical depth’, whichincorporatesthetimefactorinaninterestingway. The KCScomplexityof x measuresonlythe length of the shortest
program
requiredforcomputingx itsaysnothingabouthow longthis
program
takestorun. Isitreally correct to callasequence
MEASURING STATIC COMPLEXITY 171
years to run? Bennett’sidea is to lookatthe runningtimeofthe shortest
program
forcomputingasequencex. Thisquantityhe calls thelogical depthofthesequence. Oneof the motivations for thisapproach was adesireto capturethesensein which abiological organismismorecomplexthana random
sequence.
Indeed,itiseasy
tosee thata sequence xwithnopatternsinithas the smallestlogical depthofany sequence.
The shortestprogram forcomputingitis "Print
x",
which obviouslyruns fasterthanany
other
program
computingasequence
of thesame length as x. Andthem isno mason to doubtthehypothesisthatbiological organisms haveahigh logicaldepth. Butitseems to us that, in someways,
Bennett’sdefinition is nearly as counterintuitive as the KCS approach.Suppose them are two competing
programs
for computing x,program
y andprogram y’.
Whatify
hasalength of1000 andarunningtime of 10 minutes, buty’
has a lengthof 999 and a runningtime of 10years.
Then ify’
is the shortestprogram
for computingx,
thelogical depthofx
Istenyears.
Intuitively,thisdoesnseemquite right: itIs notthecasethatxfundamentally requirestenyears
tocompute.
AtthecoreofBennett’s measureistheideathat the shortest
program
forcomputing xis themostnaturalrepresentation ofx. Otherwisewhywould the running time of thisparticularprogrambea meaningfulmeasure oftheamountoftimexrequires toevolve
naturally. But one define the "most natural representation" ofa given entity in
many
different
ways.
Bennett’sisonlythesimplest. ForInstance, onemay
studythequantity dC(y,z)+
elzl/lYl + f(lYl + Izl),
where d, andfarepositiveconstants definedsothat d+e+f=3.The motivationfor this isas follows. Thesmaller z
I/lYl
is, the less crude is thepattern
(y,z). And,asIndicatedabove,thecrudityofapattern
(y,z)may
beInterpretedas ameasureof hownaturalarepresentationit is. ThesmallerC(y,z)is, the lesstimeittakes toget x outof(y,z). And, finally, thesmaller lyl+
Izl s,
themoreintenseapattern (y,z) is. Allthese factssuggest
thefollowing:DEFINITION
17: Letm
denotethesmallestvaluethat the quantitydC(y,z)
+ elzl/lYl + f(lYl
/Izl)
assumesforany
pair (y,z)such thatx=y*z (assumingthem is suchaminimumvalue). Thedepth complexilyofxmaythen be definedasthe timecomplexityC(y,z)ofthe
pattern
(y,z) atwhichthis minimummisattained.Setting d=e=Oreduces thedepth complexitytothelogicaldepth as Bennettdefined it. Setting e=Omeansthateverythingisas
Bennett’s
definitionwouldhaveit,except
that casessuchas thepatterns (y,z,
), (y,z,
) described above are resolved in a moreIntuitivematter. Setting f=Omeansthatoneisconsideringthe timecomplexityof themoM sophistJcated leastcrude, moststructured representationof
x,
rather thanmere/ytheshortest. And keeping alltheconstants nonzero ensures abalancebetween time,
space,
andsophistication.
Admittedly,thisapproachisnotnearlysotidyas Bennett’s. Itskey shortcoming
172 B. GOERTZEL
for considering all the relevant factors.
9. STRUCTUREAND STRUCTURALCOMPLEXITY
Wehavediscussedseveral differentmeasuresof staticcomplexity, whichmeasure ratherdifferentthings. Butall thesemeasureshave one thingincommon:theyworkby singling outtheonepatternwhich minimizessomequantity. It isequallyinteresting to studythe totalamountofstructureinanentity.
For Instance,
suppose
xandx,
both haveKCScomplexityA,but whereasxcan onlybe computed by one
program
of lengthA,x,
can be computed by a hundred totallydifferent
programs
oflengthA.
Doesitnot seemthatx,
is insomesense morecomplex thanx,
that thereismoretox,
thantox?LetusdefinetheMnectureofx asthesetofall(y,z)whichare approximatepatterns
in x (assuming the constants
a,
b, and c, and the metric d(v,w), havepreviouslybeenfixed), and denoteitP(x). Then thequestionis: what isameaningfulwaytomeasurethe
size of P(x)? At first one might think to add up the Intensities
[l
+d(Y*Z,X)][alYl
+blzl
+cC(y,z)] of all the elements in P(x). Butthisapproachhasone crucialflaw, revealedbythefollowing example.Say
xisasequence
of10,000characters,and(y,z,
)isa patterninxwith[z,I
=70,lY,
1000, and C(y,z,
)=2000.Suppose
thaty, computesthefirst1000 digits ofxfrom thefirst7digitsofz,
according toacertainalgorithmA. Andsuppose
It computesthesecond 1000 digitsofxfrom thenext7digitsof
z,
according tothe samealgorithmA. Andso onfor thethird1000 digitsofz,,
etc. a/ways usingthesamealgorithmANext,
considerthe pair(y,z,
)whichcomputesthe first 9000digitsofxinthesame manner as(Y2
z,
), butcomputes
the last 1000 digits ofxby storing them inz,
andprintingthemafter therestofitsprogramfinishes. Wehave
z,
1063,andsurely Y, Is notmuch larger than Y,I.
Let’s say y, 1o. Furthermore, C(y,z,
) is certainlynogreater
than C(y,z,
): after all, the change from (y,z,
) to (y,z,
) Involved the replacement ofseriouscomputationwithsimplestorageand printing.Thepointisthat both(y,
z,
)and(yz,
) are patternsinx,
but incomputing the totalamountofstructure inx,
itwould befoolishto countboth of them. Ingeneral, the problemisthat differentpatterns mayshare similarcomponents,
and it isunacceptable tocounteachofthesecomponents
several times. Inthepresent
examplethe solution iseasy:
don count (y, z2). BUt onemay
alsoconstruct examplesofvery
differentpatterns
whichhaveasignificant, sophisticated
component
in common. Clearly,what isneededisa generalmethod of dealingwithsimilarities between
patterns.
Recall thatI.(v w)wasdefinedastheapproximate version of the effortrequired to
compute
vfrom aminimalprogram
forw, so that ifvandwhavenothing in common,I.(v,w)=l.(v).
And,ontheotherhand,ifvandwhavea largecommon component,then bothI.(v,w)andI.(w,v)arevery
small.I.(vlw
)isdefinedonlywhenvandw aresequences.
But weshall also needtotalk aboutone
program
beingsimilartoanother. InordertodoMEASURING STATIC COMPLEXITY !73
asit iscomputableon aTuringmachine, anditdoesnot assignthesame
sequence
toany two differentprograms.The introduction ofa programming languageLpermits us todefine the complexity of a
program
y as I.(L(y)), and to define the complexityofoneprogram
y, relative to anotherprogramY2 as I.(L(y, ) L(Y, )). Asthelengths oftheprograms
involved increase, thedifferencesbetweenprogramming languagesmatterless and less. Tobeprecise, let LandL,
beany
two programming languages,computable on Turingmachines. Then it canbe shown that,as L(y,)
andL(y2)
approachInfinity, the ratiosI.(L(y, ))/I.(L,(y, ))and I.(L(y,)L(Y=
))/I.(L, (y, )L, (Y,
))bothapproach 1.Where is
any
binarysequence
of lengthn,
let D(z) be the binarysequence
of length2n obtainedby replacingeach 1 in with01, andeach 0 in with10. Wherewand areany
two binarysequences,
letwz denote thesequence
obtained by placing thesequence
111attheendofD(w),andplacingD(z)atthe endofthiscompositesequence.
The point is that 111 cannot occurin eitherD(z) or D(w), so that wz is essentiallyw Juxtaposedwith
z,
with111 asamarker inbetween.Now,we
may
definethecomplexityofa program.datapair(y,z) asI.
(L(y)z), andwemay
define thecomplexityof(y,z)relativeto (y,z,
)asI.
(L(y)z L(y,)z, ). Wemay
define thecomplexityof(y,z)relativeto asetofpairs {(y,z,
),(y,,z,
) (y,,z,
)} to beI.
(L(y)z L(y, )z,L(y,)z,...L(y,)z, ). This is the toolwe need tomakesense of thephrasethe totalamountofstructure of
Let Sbeanysetofprogram-data pairs(x,y). Thenwe
may
define thesizeIS
ofS asthe resultofthe followingprocess:
ALGORITHM2:
StepO. Makea list ofallthepatternsinS, andlabel them(y,
z,
), (y,z,
), (y.z.
).Step
1.Let s,(x)=l.(L(y, )z, )Step
2. Let s,(x)=s,(x)+l.(L(y,)z,
)l(L(y,)z,
) Step3. Let s,(x) =s,(x)+
l.(L(y,)z, L(y, )z,L(y,)z, ))Step4. Lets,(x)=s,(x)+l.(L(y,)z, lL(y, )z,L(y, )z,)L(y,)z, ))...
Step N. Let
ISl
=s(x)=s,,(x)+l.(L(y,)z, lL(y, )z,L(y, )z,)...i.(y,,
)z,, )Atthekh
step,
onlythatportion of(y,z,
)which isindependentof{(y,
z,
), (y,.,,z,., )}
isaddedontothecurrentestimate ofSl.
Forinstance, inStep
2,if(y,
z,
)isindependent of(y,z,
),then thisstepincreasestheinitial estimateofSl
by thecomplexity of(y,z,
). Butif(y,z,
)ishighlydependent on (y,z,
), notmuch will beaddedontothe first estimate. Itisnotdifficultto seethat this
process
will arriveatthesame answer regardlessof the orderin whichthe(y,
z,
)appear:
THEOREM2:Theresu/tofAlgorithm 2is invariantunder permutationof the(y,
,z,
).
Where P(x) is the set of all
patterns
in x, wemay
now define the structural complexity ofx to be the quantityP(x)
I.
This, wesuggest,
is the sense of the word complexity" thatone uses when onesays
thataperson
is more complex than atree, whichismore complexthana bacterium. Inaway,structuralcomplexitymeasureshow174 B. GOERTZEL
bacterium.
REFERENCES
1.KRONSJO,I_
Alclorithms:
TheirComplexity and Efficiency, Wiley.lnterscience,NewYork,1979.
2.
BOWEN,
R.Symbolic DynamicsforHyperbolic Flows,Am.J.
ofMath.95(1973),421-460.3.
KOLMOGOROV,
A.N. ThreeApproaches totheQuantitativeDefinition ofInformation,Prob. Info. Transmission 1 (1965), 1.7.
4.
CHAITIN,
G.Information.TheoreticComputational Complexity, IEEE.TI120(1974), 10-15. 5. SOLOMONOFF, I_ A FormalTheoryof Induction, Pads I.andI1., Info. andControl20
(1964),224-254.
6.
BENNETT,
C.H.TheThermodynamicsofComputationA
Review,Int.J.
Theor.Phys.21 (1982), 905-940.7.ZUREK, W.H.AlgorithmicInformationContent,Church.Turing Thesis,Physical
Entropy,
and Maxwell’sDemon, InComplexity_,
Entro_pv
and the Physics of Information,(ed. W.H. Zurek),Addison.Wesley, New York, 1990.8:KOPPEL, MOSNE. Complexity, DepthandSophistication, Complex
Systems
1 (1987), 1087-1091.9. CHAITIN, G. Toward aMathematical Definition ofLife, InThe Maximum