Automatic Extraction of
Object-Oriented Component Interfaces
John Whaley Michael C. Martin Monica S. Lam
Computer Systems Laboratory
Stanford University
{jwhaley, mcmartin, lam}@stanford.edu
ABSTRACT
Component-based software design is a popular and effective approach to designing large systems. While components typically have well-defined interfaces, sequencing information (which calls must come in which order) is often not formally specified.
This paper proposes using multiple finite state machine (FSM) submodels to model the interface of a class. A submodel includes a subset of methods that, for example, implement a Java interface, or access some particular field. Each state-modifying method is represented as a state in the FSM, and transitions of the FSMs represent allowable pairs of consecutive methods. In addition, state-preserving methods are constrained to execute only under certain states.
We have designed and implemented a system that includes static analyses to deduce illegal call sequences in a program, dynamic instrumentation techniques to extract models from execution runs, and a dynamic model checker that ensures that the code conforms to the model. Extracted models can serve as documentation; they can serve as constraints to be enforced by a static checker; they can be studied directly by developers to determine if the program is exhibiting unexpected behavior; or they can be used to determine the completeness of a test suite.
Our system has been run on several large code bases, including the joeq virtual machine, the basic Java libraries, and the Java 2 Enterprise Edition library code. Our experience suggests that this approach yields useful information.
1. INTRODUCTION
A popular approach to designing large systems is component-based software design. The idea is to improve software reuse by crafting carefully engineered software elements suitable for a broad array of applications. Also, once the applications programming interface (API) has been designed, implementors can lower the number of cross-application dependencies that their code needs to worry about, concentrating instead on providing the services that the API specifies. As a result, modern enterprise computing architectures are not monolithic systems, but typically consist of multiple tiers of reusable components such as user interfaces or system services such as transactions, security, and databases. The resulting component libraries can then be reused across many different applications.
This research was supported in part by NSF award #0086160, an NSF student fellowship, and a Stanford Graduate Fellowship.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright 2001 ACM X-XXXXX-XX-X/XX/XX ...$5.00.
Figure 1: FSM for the database example. (Diagram not reproduced; states: begin, save, rollback, commit.)
The API of these components often includes constraints on when a public method may be invoked. For example, an SQL server may include the commands begin, commit, save, and rollback. The command begin must first be executed, then a series of save and rollback commands can be issued, and then, finally, commit. It would be incorrect, for example, if a save, rollback, or commit is invoked before any begin command. We can represent these constraints as a finite state machine (FSM), as shown in Figure 1. Constraints such as these are very common in APIs.
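As a rough illustration of the constraint just described, the following sketch encodes the begin/save/rollback/commit protocol as a set of allowable consecutive pairs. The class name `TransactionFsm`, the string encoding of edges, and the choice to treat commit as terminal are assumptions made for this example; they are not taken from Figure 1.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical FSM for the database example: one state per command. */
final class TransactionFsm {
    // Allowable pairs of consecutive commands, written "from->to".
    // Treating commit as terminal (commit->end only) is an assumption.
    private static final Set<String> ALLOWED = new HashSet<>(Arrays.asList(
        "start->begin",
        "begin->save", "begin->rollback", "begin->commit",
        "save->save", "save->rollback", "save->commit",
        "rollback->save", "rollback->rollback", "rollback->commit",
        "commit->end"));

    private String state = "start";

    /** Advances the FSM, or throws if the command is out of sequence. */
    void invoke(String command) {
        String edge = state + "->" + command;
        if (!ALLOWED.contains(edge)) {
            throw new IllegalStateException("illegal transition " + edge);
        }
        state = command;
    }

    /** Returns true if the whole sequence is accepted from the start state. */
    static boolean accepts(String... commands) {
        TransactionFsm fsm = new TransactionFsm();
        try {
            for (String c : commands) {
                fsm.invoke(c);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With this encoding, `TransactionFsm.accepts("begin", "save", "commit")` holds, while a sequence that starts with save, rollback, or commit is rejected.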
There has been a good deal of recent work that deals with modeling objects with finite state machines. Previous systems like Vault[7], NIL and Hermes[23] provide programmers with linguistic constructs to specify the state of the variables in a program, and a compiler is used to ensure that the object is always in the correct state. The advantage of this approach is that code written in this way is guaranteed to conform to the finite state models. This, however, requires programmers to rewrite their code; retrofitting large systems of existing code into this framework is not plausible. Systems such as PREfix[3], Metal[5, 8] and SLAM[2] operate directly on existing software; they check if the code conforms to pre-defined correctness constraints, many of which can be expressed as FSMs. These systems have been demonstrated to be successful in finding many errors in operating systems.
One of the bottlenecks of such systems is that the correctness constraints are often not readily available for large systems. Few programmers are both willing and able to specify these constraints, and programmers may also be wrong when specifying the constraints. To deal with this, there have been various recent attempts to infer constraints automatically. Engler et al.[16] proposed techniques to infer constraints from programs by analyzing the behavior of the code statically. To allow for errors in the program, behavior that is observed most of the time is considered the norm. Their results suggest that this approach is effective in finding bugs like whether a lock function call is followed by an unlock. In addition, dynamic techniques to extract information directly from programs have also been proposed[1, 9]. In particular, Ammons et al. use machine learning techniques, with limited success, to extract specifications from the program by analyzing dynamic program traces[1].
1.1 The Problem
The focus of this paper is on developing techniques that can automatically extract application interfaces directly from existing code. However, unlike most of the previous work in this area, we are interested in techniques that apply to large object-oriented component-level software rather than system code written in low-level languages like C. Large applications typically manipulate many dynamically allocated objects stored in recursive data structures. Instances of the same class may all follow the same state transition model, but they can be in different states at any one time. The states they are in may be encoded in some cases by explicit variables; however, more often than not, their states are governed by highly application-specific semantics that are unknown to the programming tools. Techniques that track the control flow or values of variables through limited programming paths are not effective here. Because of our interest in handling large software, on the order of millions of lines of code, formal methods such as model checking are also not feasible.
Our techniques are designed to take advantage of the modularity in object-oriented programming languages. In a well-designed object-oriented system, the set of public methods in each class definition defines a complete interface for all instances of objects belonging to the class. It is also guaranteed that only the methods of a class can mutate the state of the object. We can thus operate at the method level: instead of dealing with the definitions and uses of individual memory locations, we can focus on finding higher-level relations like how methods such as commit and rollback relate to each other.
This paper proposes using multiple FSM submodels to model the interface of a class. A submodel includes a subset of methods that, for example, implement a Java interface, or access some particular field. Each state-modifying method is represented as a state in the FSM, and transitions of the FSMs represent allowable pairs of consecutive methods. In addition, state-preserving methods are constrained to execute only under certain states. We explore two fully automatic techniques for extracting partial interface models from large Java programs, one being a dynamic tool and the other a static analysis tool.
Our dynamic analysis technique observes the program as it runs to build up a model that reflects the operations performed during the execution of the program. The dynamic analysis does not suffer the disadvantage of static analyses, which must necessarily be conservative. Here, the results reflect behavior that can actually take place. Unfortunately, a dynamic system can only prove things existentially. Any purported universal that a dynamic system finds may prove to be only an artifact of the test cases that the system was trained on.
To demonstrate that a model actively forbids certain cases, we must turn to static analysis. Our static analysis technique searches for method sequences on a component that invariably cause exceptions to be thrown. Such sequences are deemed illegal. A static system by itself, however, suffers from the disadvantage that it cannot easily identify a priori whether or not a given sequence is part of typical or permitted usage. The advantages and disadvantages of the static and dynamic approaches complement one another nicely; a hybrid system that uses them both can acquire a great deal more information than either one alone.
The hybrid system can also exploit further advantages of the dynamic and static approaches that do not directly pertain to the model. A static system has fuller access to the structure of the code, and can make a variety of deductions about the nature of the classes to let the dynamic system focus more carefully on useful features of the program runs. The dynamic system in turn has easy access to the current state of the heap at any given time, so there is no need for advanced pointer alias analysis on the static side. Deductions which require detailed alias information can simply be delayed until run time.
1.2 Applications of Automatically Extracted Models
Partial models extracted by our system are useful in several ways. First, the information can serve as documentation of the system. This is especially useful for programmers faced with using a large piece of software for the first time. Second, component designers may wish to examine the extracted model to see if it coincides with their expectations. Third, the dynamic analysis can be used as a means to measure the coverage of a test suite. The model extracted from analyzing the test suite gives a succinct summary of the sequencing of methods tested. These results may suggest additional tests be written to test important subsequences of method invocations. Fourth, given a model that may have been either extracted automatically or prepared by the programmer, we can monitor the execution of a program and automatically flag errors that deviate from correct behavior. Finally, static tools to check the conformance of the model can be implemented to locate errors in programs. We have implemented these analyses and systems to operate on Java bytecodes.
In this paper, we present several case studies to show that our relatively simple but scalable techniques are successful in extracting useful models from several large programs. We tested our tools on four different applications that together consist of over 1.2 million lines of code. We demonstrate:
1. The effectiveness of the model. The model automatically extracted for the socket implementation in java.net illustrates how our interface models are effective in capturing the correct usage of a class.
2. Automatic static model extraction. Our static analysis successfully extracts useful models for the standard Java libraries. The information is useful as documentation; also, they can be used by static tools to look for errors in programs.
3. Use of the model to characterize test suites. We demonstrate scalability by using the dynamic analysis tool on the J2EE enterprise edition platform. We show that the extracted models convey interesting information about the program being tested and the test suite itself.
4. Use of the models in software auditing. To get a sense of the applicability of our system for program evolution and developer feedback, we applied our technique to a program that we are familiar with: the joeq virtual machine. The dynamic system was able to find some discrepancies between the intended and implemented API, and the static system provided an accurate rendition of the appropriate call sequence.
1.3 Paper Overview
We first define our model of component interfaces in Section 2, illustrate it with an example, and discuss how to optimize its utility. We describe our static analysis in Section 3 and our dynamic analysis in Section 4. We present our preliminary experience in Section 5. Section 6 discusses related work and Section 7 concludes.
2. A COMPONENT INTERFACE MODEL
The design of our model of the interface is inspired by the concept of path expressions[4]. Path expressions were originally designed to allow simple specification of complex synchronization schemes for concurrent processes and operating systems. Regular expressions are used to define the set of admissible execution histories of operations on a shared resource. These histories represent an interleaving of actions on multiple threads.
Our approach is based on a simplified version of path expressions that treat objects as a resource shared between different sections of an application. Instead of allowing a general regular expression to capture the history of methods invoked on an object, we instead place restrictions on our FSMs to make them easier to extract and more efficient to enforce dynamically.
A naive approach is to have every method in the model represented by exactly one state.
Definition 2.1 (Naive model). A naive interface model of a class $cl$ is a finite state machine, denoted $M_{cl} = \langle S, T \rangle$, where $S$ is the set of methods in class $cl$ plus the two special methods start and end, and $T \subseteq (S \times S)$ is the set of legal method sequences. An instance is in state $op$ if $op$ is the last method from the set $S$ that has been invoked on the instance, or in state start if no such method exists. A method $op' \in S$ can be invoked on an instance if and only if the instance is in state $op$ and $(op, op') \in T$.
There are two major problems with this naive model. First, most objects have sufficiently complex sequencing constraints that merely knowing the last method called cannot capture the proper behavior of the object. Secondly, many methods are state-preserving: they do not have any side effects and do not advance the state of the object. Including these in the naive model destroys its accuracy.
In the remainder of this section we refine our naive model to address these issues. Section 2.1 discusses the issue of more complex sequencing constraints, and Section 2.2 discusses state-preserving transitions. Section 2.3 concludes with a formal specification of the model.
Figure 2: Naive model example. (Diagram not reproduced; states: START, set_a, set_b, get_a, get_b, END.)
Figure 3: Improved model with field splitting. (Diagram not reproduced; two FSMs, one over START, set_a, get_a, END and one over START, set_b, get_b, END.)
2.1 Model Slicing by Type and by Member Fields
Models are associated with individual classes; whether the class is an instantiable object or some abstract superclass is unimportant to the model. Most objects are members of multiple classes: they are members of a class that subclasses some other type, or they implement a number of interfaces. Each type, with its methods, encapsulates a set of constraints. Thus, each type to which an object may be cast specifies a separate model, which in turn is represented as an FSM. The full set of sequencing constraints for the object is represented by the product of the FSMs.
By the same token, we also slice the methods in a class by their field accesses. More precisely, given a class of n fields, we create up to n submodels, where each submodel contains the methods (not necessarily unique to that submodel) that access the same field. A submodel may subsume another if its methods and allowable method pairs are a superset of the other's. The rationale for slicing by fields is that unless methods refer to the same variable, they are not related, and their relative ordering is independent of each other. Representing them separately allows us to accurately model independent aspects of a single object.
Consider a simple class with two fields a and b, and four methods, set_a, get_a, set_b, and get_b. Here, the method get_a cannot be invoked before set_a, and similarly, the method get_b cannot be invoked before set_b. Attempting to model this behavior with the naive model produces the model shown in Figure 2. Notice that in order to allow arbitrary interleaving of get and set operations, we had to give up the capability of enforcing that both a and b be defined before use. If we split the model in two, we get the FSMs shown in Figure 3. This new system is thus able to enforce that get_a and get_b are preceded by set_a and set_b, respectively.
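The slicing step can be pictured with a small sketch. It assumes that the set of fields each method accesses is already known (here entered by hand for the get/set example); the class and method names are hypothetical and the code is only an illustration of grouping methods by field, not the analysis used in the system.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that groups methods into one candidate submodel per field. */
final class FieldSlicer {
    /**
     * Maps each field to the set of methods that access it.
     * A method that touches several fields appears in several submodels.
     */
    static Map<String, Set<String>> sliceByField(Map<String, Set<String>> fieldsAccessedBy) {
        Map<String, Set<String>> submodels = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : fieldsAccessedBy.entrySet()) {
            for (String field : e.getValue()) {
                submodels.computeIfAbsent(field, f -> new TreeSet<>()).add(e.getKey());
            }
        }
        return submodels;
    }

    /** The get/set example: set_a and get_a touch field a; set_b and get_b touch field b. */
    static Map<String, Set<String>> example() {
        Map<String, Set<String>> access = new LinkedHashMap<>();
        access.put("set_a", new TreeSet<String>(Arrays.asList("a")));
        access.put("get_a", new TreeSet<String>(Arrays.asList("a")));
        access.put("set_b", new TreeSet<String>(Arrays.asList("b")));
        access.put("get_b", new TreeSet<String>(Arrays.asList("b")));
        return sliceByField(access);
    }
}
```

For the example class this yields two method sets, one for field a and one for field b, which correspond to the two FSMs obtained by splitting the model.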
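Figure 4: Final model for get/set example. (Diagram not reproduced; two FSMs, one with states START, set_a, END and the state-preserving method get_a, and one with states START, set_b, END and the state-preserving method get_b.)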
2.2 State-Preserving Transitions
In a naive model, every method call changes the state of the model. However, not all methods update the state of the object; the toString and hashCode methods defined on all Java objects are examples of this. Having methods like these in our model allows any method to be followed by any other method, provided the state-preserving method is invoked in between. A common approach used to improve accuracy is to create finite state machines that keep track of the last k calls instead of only the most recent call[21]. However, using such an approach would only exacerbate the problem. The model may now contain up to $n^{2k}$ transitions, greatly complicating the analysis. Furthermore, it is legal to make k state-preserving calls in a row, again landing the object in a state that can reach any other state.
Our approach is to distinguish between methods that produce side effects and those that do not. Side-effect free methods do not advance the state of the object; common examples of side-effect free methods in Java include toString and hashCode. Including them would create an overly weak model as discussed above. We thus ignore these side-effect free methods when building our set S.
However, although a side-effect free method may not advance the state of an object, it may still have sequencing constraints; for example, a method to get a value may not be invokable before the value has been set. We thus define for each side-effect free method the set of states for which it is legal to call that method. We represent this in our model diagrams with dotted lines connecting methods that are not states to the states in which that method is legal.
To illustrate this, let us consider again the example from the previous section, and assume that get_a and get_b do not have side effects. The final models are shown in Figure 4.
2.3 Formal definition of the model
We now expand our naive definition of a model to include these two refinements:
Definition 2.2 An interface model of a class is a collection of submodels, each of which is a triple, denoted $M = \langle U, S, T \rangle$, where $U$ is the set of methods governed by the submodel plus the special methods start and end, $S \subseteq U$ is the set of methods that qualify as states, and $T \subseteq (S \times U)$ is the set of allowable method pairs. An instance is in state $op$ if $op$ is the last method from the set $S$ that has been invoked on the instance, or in state start if no such method exists. A method $op' \in U$ can be invoked on an instance if and only if the instance is in state $op$ and $(op, op') \in T$.
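To make Definition 2.2 concrete, here is a minimal sketch of a single submodel with its three sets U, S and T. The class name `Submodel`, the string encoding of pairs, and the particular edge set used for field a (our reading of the get/set example, with start and end treated as members of S) are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical encoding of one submodel <U, S, T> from Definition 2.2. */
final class Submodel {
    private final Set<String> governed;     // U: methods governed by this submodel
    private final Set<String> states;       // S: methods that qualify as states (subset of U)
    private final Set<String> transitions;  // T: allowable pairs, encoded "op->op'"
    private String current = "start";

    Submodel(Set<String> governed, Set<String> states, Set<String> transitions) {
        this.governed = governed;
        this.states = states;
        this.transitions = transitions;
    }

    /** Returns false if the call violates the submodel; otherwise records it. */
    boolean invoke(String method) {
        if (!governed.contains(method)) {
            return true;                    // not governed here, so no constraint applies
        }
        if (!transitions.contains(current + "->" + method)) {
            return false;                   // (op, op') is not in T
        }
        if (states.contains(method)) {
            current = method;               // only members of S advance the state
        }
        return true;
    }

    /** Submodel for field a of the get/set example; the edge set is an assumed reading. */
    static Submodel forFieldA() {
        return new Submodel(
            new HashSet<String>(Arrays.asList("start", "set_a", "get_a", "end")),
            new HashSet<String>(Arrays.asList("start", "set_a", "end")),
            new HashSet<String>(Arrays.asList(
                "start->set_a", "start->end",
                "set_a->set_a", "set_a->get_a", "set_a->end")));
    }
}
```

Because get_a is in U but not in S, calling it any number of times after set_a leaves the instance in state set_a, which is the behavior the state-preserving refinement is meant to capture.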
3. STATIC MODEL EXTRACTOR
This section describes a static analysis technique that can automatically detect the sequencing constraints in component interfaces designed to guard against misuse. The basic idea is that we analyze the uses and definitions of fields in the class, and identify pairs of method invocations that would cause an exception in the program. For example, if one method stores a null to a field, it cannot be followed by an invocation to a method that dereferences that field unconditionally. We first discuss why this information is often available in well-designed Java components, and then discuss our technique that extracts such information.
3.1 Defensive Programming
Widely used components are often defensively programmed to detect illegal sequences of method invocations so as to ensure the integrity of a component. A typical approach is to maintain a state value and then check if the state is valid before performing an operation. If it is not valid, a Java program will usually throw an appropriate exception; programs written in other languages would typically return a return code indicating an error or would assert that the state is invalid and exit the program.
The state may be implicitly encoded in a field that holds the normal data being operated on. For example, a null field would indicate that the method that sets it has not been invoked. Sometimes, the state is explicitly encoded in a field. This field is typically implemented as an enumerated type holding all the possible states the object is in. Without the benefit of enumerated types, Java implementations typically use integers or booleans. The field is similar to the concept of typestate variables[22]. This paper presents empirical evidence that the notion of typestate variables is particularly applicable to object-oriented software, and especially to languages that have a good exception handling facility.
Let us consider as an example the design of a list iterator, as shown in Figure 5. This code is taken from the java.util.AbstractList.ListItr class in the standard Java 1.3.1 class library. java.util.AbstractList.ListItr implements the java.util.ListIterator interface. The set operation writes some object to the "current" element of the iterator, as indicated by the index variable lastRet. The next or previous methods adjust the index variable, whereas the remove or add methods eliminate the notion of having a current element. The latter is indicated by setting the variable lastRet to -1, which will cause the set operation to throw an IllegalStateException if there are no intervening next or previous method invocations. The lastRet variable thus doubles as a value used in the algorithm as well as a typestate.
For code that has been defensively programmed, it is possible to analyze the implementation to deduce sequences that are forbidden by the component.
3.2 Algorithm
Our algorithm consists of three steps. First, for each method $m$, we identify those fields and predicates that guard whether exceptions can be thrown. Second, we find those methods $m'$ that set these fields to values that can cause the exception. Immediate transitions from $m'$ to $m$ are illegal. Finally, an interprocedural analysis is run over the class to determine which methods may modify or reference the field. The complement of the illegal transitions with respect to the relevant methods forms a model of transitions accepted by the static analysis.
    public Object next() { // ...
        lastRet = cursor++; // ...
    }
    public Object previous() { // ...
        lastRet = cursor; // ...
    }
    public void remove() {
        if (lastRet == -1)
            throw new IllegalStateException(); // ...
        lastRet = -1; // ...
    }
    public void add(Object o) { // ...
        lastRet = -1; // ...
    }
    public void set(Object o) {
        if (lastRet == -1)
            throw new IllegalStateException(); // ...
    }
Figure 5: Example code from AbstractList.ListItr.
The general problem of whether the invocation of a method may cause another to raise exceptions is undecidable. Fortunately, state variables are often used in a rather simple manner, as illustrated in the example above. Thus, we restrict our analysis to finding those simple predicates that involve testing a field with a constant null or a constant integer. Note that in Java, all dereferences are preceded with a null check, thus this definition includes finding the runtime exceptions thrown as a result of Java semantics.
First, the algorithm finds predicates that control whether or not exceptions are thrown. The algorithm computes the control dependence information[10] for each method. Then, for each site that can throw an exception, it checks if the predicate guarding its execution is a single comparison between a field of the current object and a constant value, and that the field is not written prior to being tested. If so, we say that that single predicate is a pre-condition for that method, and that the field being tested is a state variable.
The second step analyzes all the methods to find out if they must assign some constant values to the state variables. That is, we perform a constant propagation analysis on the methods and determine if the state variables are set to some constant at the exit. We have identified an illegal transition if the constant value satisfies the condition that guards the exception-throwing statements.
We have implemented these algorithms using the joeq compiler system[25]. On the ListItr example, the implementation correctly identifies start → set, start → remove, remove → set, add → set, remove → remove, and add → remove as transitions that will throw an exception. The complement, that is, the set of allowable transitions, is shown in Fig. 6.
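The way the first two steps combine can be sketched as follows. The sketch does not perform control dependence or constant propagation analysis; it assumes their results are already available as two maps (entered by hand for the lastRet slice of ListItr) and handles only equality tests against an integer constant. All names are hypothetical, and the entry for start assumes that lastRet is initialized to -1, which is consistent with start → set being reported as illegal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical combination step: pairs exit constants with exception-guarding predicates. */
final class IllegalTransitionFinder {
    /**
     * @param throwsWhen   method m -> constant c such that m throws when the state variable equals c
     * @param exitConstant method m' -> constant the state variable must hold when m' exits
     * @return illegal immediate transitions, encoded "m' -> m"
     */
    static Set<String> illegalTransitions(Map<String, Integer> throwsWhen,
                                          Map<String, Integer> exitConstant) {
        Set<String> illegal = new TreeSet<>();
        for (Map.Entry<String, Integer> from : exitConstant.entrySet()) {
            for (Map.Entry<String, Integer> to : throwsWhen.entrySet()) {
                // The exit value of m' satisfies the predicate that makes m throw.
                if (from.getValue().equals(to.getValue())) {
                    illegal.add(from.getKey() + " -> " + to.getKey());
                }
            }
        }
        return illegal;
    }

    /** Facts for the lastRet slice of ListItr, entered by hand from Figure 5. */
    static Set<String> listItrExample() {
        Map<String, Integer> throwsWhen = new LinkedHashMap<>();
        throwsWhen.put("set", -1);       // set throws if lastRet == -1
        throwsWhen.put("remove", -1);    // remove throws if lastRet == -1
        Map<String, Integer> exitConstant = new LinkedHashMap<>();
        exitConstant.put("start", -1);   // assumption: lastRet is initialized to -1
        exitConstant.put("remove", -1);
        exitConstant.put("add", -1);
        return illegalTransitions(throwsWhen, exitConstant);
    }
}
```

On these inputs the sketch produces the same six pairs listed above; next and previous contribute nothing because they do not leave lastRet at a known constant.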
Figure 6: Submodel extracted statically from the slice on lastRet in AbstractList.ListItr. (Diagram not reproduced; nodes: start, "next, previous", remove, add, set.)
4. THE DYNAMIC EXTRACTOR AND CHECKER
This section describes a dynamic instrumentation technique that can automatically detect the usage pattern of objects, building up a series of submodels, or can enforce an already specified model at run time. The basic idea is to track each instance of an instrumented class individually. When a method is invoked, we update the history of the object and record the sequence in each relevant submodel. We first discuss our basic extraction technique, then discuss some issues that must be addressed when implementing the system in Java.
4.1 Extracting a model
Given a set of class files to instrument, we begin by running the mod/ref analysis of Section 3 over the bytecode to determine which methods are relevant to each field. Each field f is assigned a submodel. Any method that writes f is deemed state-modifying, while any method that reads but does not write f is deemed relevant, but state-preserving. We then use this to guide the insertion of calls to our analysis routines at method boundaries.
Our system uses the Byte Code Engineering Library[6] to rewrite the bytecodes to insert the calls to our analysis routines. The analysis routines only deal with one submodel at a time; if a method is relevant to multiple models, multiple calls are inserted. Once all the calls have been placed, we then run training programs that use our instrumented objects.
Each instance keeps track of, for each submodel, the last state-modifying method that was called on that instance. When a method is invoked on an object, the extractor examines these records, updates the submodels appropriately, and then updates the last-call information of the instance for each submodel where the method is state-modifying. In order to capture the exported interface and ignore the internal usage of the object, the dynamic system maintains knowledge of the local call stack and ignores any call that is internal to that instance. Once the program finishes, transitions to the end state are noted for each instance, and the final set of models is written out.
The dynamic checker accepts a set of models as input. It is nearly identical to the extractor; the primary difference is that where the extractor adds elements to the set of legal transitions, the checker prints warnings or raises exceptions when a previously unseen transition occurs.
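The bookkeeping shared by the extractor and the checker can be sketched for a single submodel as below. This is a simplified illustration under several assumptions: the class `SubmodelMonitor` and its methods are hypothetical, the hook is called by hand rather than inserted by bytecode rewriting, internal calls and end transitions are not handled, and instances are held through strong references for brevity.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical extractor/checker for one submodel. */
final class SubmodelMonitor {
    private final Set<String> stateModifying;                  // methods that write the field
    private final Set<String> transitions = new TreeSet<>();   // pairs encoded "op->op'"
    private final Map<Object, String> lastCall = new IdentityHashMap<>();
    private final boolean checking;

    private SubmodelMonitor(Set<String> stateModifying, Set<String> model, boolean checking) {
        this.stateModifying = stateModifying;
        this.transitions.addAll(model);
        this.checking = checking;
    }

    /** Extraction mode: starts from an empty model and records what it observes. */
    static SubmodelMonitor extractor(Set<String> stateModifying) {
        return new SubmodelMonitor(stateModifying, new TreeSet<String>(), false);
    }

    /** Checking mode: treats the given model as the set of legal transitions. */
    static SubmodelMonitor checker(Set<String> stateModifying, Set<String> model) {
        return new SubmodelMonitor(stateModifying, model, true);
    }

    /** Hook for a method boundary. Returns false only when the checker sees an unknown pair. */
    synchronized boolean onCall(Object instance, String method) {
        String last = lastCall.containsKey(instance) ? lastCall.get(instance) : "start";
        String edge = last + "->" + method;
        boolean ok = true;
        if (checking) {
            ok = transitions.contains(edge);   // previously unseen transition: report it
        } else {
            transitions.add(edge);             // extractor: add to the set of legal transitions
        }
        if (stateModifying.contains(method)) {
            lastCall.put(instance, method);    // only state-modifying calls update the history
        }
        return ok;
    }

    /** Returns a copy of the model built or enforced so far. */
    synchronized Set<String> model() {
        return new TreeSet<String>(transitions);
    }
}
```

A model produced by `extractor(...)` on training runs can be passed to `checker(...)`, which then returns false from `onCall` where the described checker would print a warning or raise an exception.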
4.2 Complications
Program               Description               Lines of code   Analysis
java.net 1.3.1        Networking library        12,000          static & dynamic
Java libraries 1.3.1  General purpose library   300,000         static
J2EE 1.2.1            Business platform         900,000         dynamic
joeq                  Java JVM implementation   65,000          static & dynamic
Table 1: Applications Used in the Experiments
Several complications arise when implementing this system.
Inheritance. A submodel specifies which methods to instrument in a class. However, some of these methods may not actually exist in the .class file because they are inherited from a superclass. We solve this problem by adding dummy methods that simply call the methods of the superclass explicitly. These may then be instrumented along with the methods that the class itself actually defines.
Exceptional control flow. Control flow may leave a method via an uncaught exception. Therefore, it is not sufficient to merely insert calls at the beginning of each method and before each return instruction. To deal with uncaught exceptions, we add an additional exception handler to each method that will catch any uncaught exception, call our analysis routines, and then rethrow the exception.
Multithreading. The Java language supports and encourages multithreading. This raises two issues: first, we must ensure that our model extraction routines do not get corrupted due to race conditions, and second, we must address the issue of what a "last call" means when multiple threads are making concurrent calls on an object.
To ensure that model state is not corrupted, only one thread should actually be in the model extracting code at any given time. We ensure mutual exclusion by declaring the extraction/checking routines to be synchronized. Since these routines are fairly short and only occur at method boundaries, this does not significantly impact program execution.
When an object is being used by multiple threads, our system keeps a separate call history for each thread that accesses an object, thus capturing the protocol followed by each thread independently.
Memory leaks. We may not maintain any hard references to any object in the actual program, because this will preclude otherwise unreachable storage from being reclaimed by the garbage collector. We solve this problem by taking advantage of the Java 1.2 facility of weak references to refer to an object without inhibiting its reclamation by the garbage collector. When an object loses all non-weak references, the garbage collector notifies the dynamic system, which then adds an end transition and ceases tracking the object.
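Two of the complications above, exceptional control flow and multithreading, can be pictured at the source level. The system described here inserts the equivalent code into bytecode; the sketch below only shows the shape of the result for one method, with a hypothetical `record` routine standing in for the analysis routines and `parse` standing in for an instrumented method.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical source-level view of an instrumented method. */
final class InstrumentedExample {
    static final List<String> EVENTS = new ArrayList<>();

    /** Stand-in for an analysis routine; synchronized so concurrent callers cannot corrupt EVENTS. */
    static synchronized void record(String event) {
        EVENTS.add(event);
    }

    /** Entry hook, original body, and an exit hook on both the normal and the exceptional path. */
    static int parse(String text) {
        record("enter parse");
        try {
            int value = Integer.parseInt(text);   // original method body
            record("exit parse");                 // inserted before the normal return
            return value;
        } catch (RuntimeException | Error e) {
            record("exit parse (exception)");     // inserted handler: notify the analysis
            throw e;                              // then rethrow so program behavior is unchanged
        }
    }
}
```

The caller still observes the original exception, while the analysis sees a matching exit event for every entry event.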
5. EXPERIENCES
We applied our tools to several real-life applications. The set of programs is shown in Table 1. We first use a model extracted from the java.net library v1.3.1 to illustrate our model. We show that the model provides significantly better results than the naive one where all the methods are considered state transformers in the same model. Second, we use our static analysis technique to extract the models from the Java standard library v1.3.1. Third, we apply the dynamic instrumentation tool to extract models from the J2EE application, a large application framework for which we had no prior knowledge about the implementation. We have developed a simple "test suite" generator, and the models extracted give us useful insight into the completeness of the test suite. Finally, we apply the static and dynamic tools to the joeq program, which is a Java compiler and a Java virtual machine written in Java, and show the value that the automatic model extraction can bring to a system designer.
5.1 Illustration of our Model and Tools
Our first experiment is to demonstrate the feasibility of our model in capturing a component interface. To do that we apply our technique to the abstract class java.net.SocketImpl in the java.net library. The java.net.SocketImpl is a common superclass of all classes that implement sockets in Java. It is used to create both client and server sockets.
The class consists of 16 methods, shown in Figure 7. We ran two experiments. In the first experiment, we used the dynamic model extractor to find a lower bound on the model. We instrumented the Eteria IRC client and exercised it by connecting to an IRC server. The second experiment applied the static analysis tool to find an upper bound of the model.
Method              Description
create              creates datagram socket
connect             connects the socket
bind                binds the socket
listen              sets the maximum queue length for incoming connections
accept              accepts a connection
getInputStream      returns input stream view
getOutputStream     returns output stream view
available           returns the number of bytes that can be read without blocking
close               closes the socket
shutdownInput       shuts down input stream
shutdownOutput      shuts down output stream
getFileDescriptor   returns file descriptor
getInetAddress      returns address for this socket
getPort             returns the remote port number
getLocalPort        returns the local port number
reset               closes and reinitializes the socket
Figure 7: Important methods in java.net.SocketImpl
5.1.1 Experiment: the dynamic model extractor
We ran the dynamic tools first assuming the naive model where every method call changes the state of the finite-state machine. We obtained the model in Figure 8. The model has 11 states and 17 edges. Most of the implied usage rules were simply artifacts in the client code: for example, we found that the only call that followed getInputStream was getOutputStream. This shows that a naively dynamically-extracted model, without slicing by fields or identifying state-preserving methods, is basically pointless in this case.
Figure 8: Dynamically extracted naive model for java.net.SocketImpl. (Diagram not reproduced; states: start, create, connect, getInputStream, getOutputStream, available, setOption, getFileDescriptor, close, finalize, end.)
Figure 9: Dynamically extracted submodel of the fd field in java.net.SocketImpl. (Diagram not reproduced; nodes: start, create, connect, close, end, getInputStream, getOutputStream, available, getFileDescriptor.)
We proceeded to slice the methods by the fields that they access, separate out the state-preserving methods, and re-run the dynamic tests on these submodels. We obtained six models, five of which were simple two-state "put-get" models. The model for the fd file descriptor field was more interesting. The fd field has the following state-setting and state-dependent methods:
state-modifying: create, connect, close
state-dependent: available, close, shutdownInput, getInputStream, getOutputStream, shutdownOutput, getFileDescriptor
We re-ran the dynamic tests using this information, and obtained the model in Figure 9.
Slicing by fields and separating state-preserving methods produced a dramatically simpler model, one which succinctly summarizes the sequencing constraints on the object.
5.1.2 Experiment: the static analysis tool
In the second experiment, we applied our static analysis tool to determine transitions that would throw an exception. Our static analysis correctly identified that it was illegal to perform the sequences (start) → getInputStream, (start) → getOutputStream, (start) → available, close → getInputStream, close → getOutputStream, and close → available.
It is worth noting that the standard Java API documentation for the java.net.SocketImpl class does not specify under what conditions an exception would be thrown. It was only through inspection of the source code that we were able to check the results of our tool.
Inspecting the source code, we found that the connect method implicitly performs a create operation, therefore the create is unnecessary. Furthermore, it is legal to call connect multiple times, to call create after connect, and also to call close at any time. These transitions were not triggered in our dynamic tests, but our static analysis (and subsequent inspection of the source code) did confirm that these are in fact legal.
5.2 Automatic static model extraction: Java standard class library
To evaluate the effectiveness of our static analysis in extracting models from bytecode directly, we applied our tool to the Java standard class library v1.3.1. This library contains 914 classes, out of which the tool found 81 classes that had method sequences that invariably threw exceptions.
A wide variety of classes are amenable to this technique. The tool was able to identify seven iterator classes in the library as following the iterator model as discussed in Section 3. In addition, it also finds conformance to our model in many other classes:
Vector and LinkedList: Data cannot be retrieved if none exists. Calling a retrieval method immediately after construction or clearing the collection is illegal.
I/O and socket classes: Data cannot be read or written before the connection or the file stream are set up or after the file streams are closed.
Timer: New tasks cannot be scheduled if the timer has been canceled or finalized.
SimpleTimeZone: Certain methods cannot be accessed following some initializations.
AlgorithmParameters, KeyStore, SecureClassLoader, and ClassLoader classes: Attempts to use the object before initialization or reinitialize an already-initialized object throw an exception.
ThreadGroup: Attempting to destroy a ThreadGroup more than once throws an exception.
Signature: Initialization, updating and signing must proceed sequentially.
Figure 10: Model extracted statically from Signature (nodes: start, initSign, initVerify, sign, update, verify).
The actual constraints of the classes that we have extracted are shown in Figure 11. For brevity and clarity, we have consolidated and simplified the extracted constraints. Each row of the table describes the constraints that we found for different sets of classes; classes that were found to have similar constraints are merged into a single row in the table. The table contains two types of constraints. Negative constraints, marked with a "-" sign, say that calling a method in the "from" column immediately followed by calling a method in the "to" column will always throw an exception, and is therefore illegal. Positive constraints, marked with a "+" sign, say that the call in the "from" column must precede any of the calls in the "to" column, or an exception will be thrown. The last column in the table gives the condition on the instance variable by which the constraint was found.
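The two kinds of constraints can be made concrete with a small checker. The following Java sketch is illustrative only: the class SequencingConstraint and its method firstViolation are hypothetical names, not part of the tool. It also makes one assumption that the text leaves open, namely that a positive constraint is satisfied once any "from" call has occurred at some earlier point in the sequence, not necessarily immediately before the "to" call.

```java
import java.util.*;

/** Hypothetical encoding of one from/to constraint of the kind listed in Figure 11. */
class SequencingConstraint {
    final boolean negative;   // true for a "-" constraint, false for a "+" constraint
    final Set<String> from;
    final Set<String> to;

    SequencingConstraint(boolean negative, Collection<String> from, Collection<String> to) {
        this.negative = negative;
        this.from = new HashSet<>(from);
        this.to = new HashSet<>(to);
    }

    /** Returns the index of the first call that breaks this constraint, or -1 if none does. */
    int firstViolation(List<String> calls) {
        boolean seenFrom = false;
        for (int i = 0; i < calls.size(); i++) {
            String call = calls.get(i);
            if (negative) {
                // "-": a "to" call directly after a "from" call is illegal.
                if (i > 0 && from.contains(calls.get(i - 1)) && to.contains(call)) {
                    return i;
                }
            } else {
                // "+": a "to" call is illegal until some "from" call has happened (assumption).
                if (to.contains(call) && !seenFrom) {
                    return i;
                }
                if (from.contains(call)) {
                    seenFrom = true;
                }
            }
        }
        return -1;
    }
}
```

For instance, encoding the Timer constraint as a negative constraint from {cancel, finalize} to {sched} flags the third call in the sequence sched, cancel, sched. Encoding the PipedInputStream constraint as a positive constraint from {connect} to {receive, read} flags a read that occurs before any connect.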
The model extracted for java.security.Signature is particularly interesting. The state of the object is explicitly encoded by a field called state in the class. This state field is an integer whose value can be one of 0, 2 or 3 (see footnote 1). The state is set to 0 in the initializer. Calling initSign() sets the state to 2, and calling initVerify() sets the state to 3. If sign() is called without being in state 2, an exception is thrown. Likewise, if verify() is called without being in state 3, an exception is thrown. If update() is called without being in state 2 or 3, an exception is thrown. Using this information, our static analysis was able to generate the model shown in Figure 10.
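The reasoning in this paragraph can be replayed on a hand-written summary of the class. The Java sketch below is an illustration under stated assumptions, not the bytecode analysis itself: MethodSummary, IllegalPairs, and signatureLike are hypothetical names, the summaries are typed in by hand from the description above, and sign(), verify(), and update() are assumed to leave the state field unchanged. A pair from → to is reported when every state the object can be in after a successful from call is rejected by to.

```java
import java.util.*;

/** Hypothetical per-method summary: the states a call accepts and the state it writes. */
class MethodSummary {
    final String name;
    final Integer sets;           // state written by the call, or null if state-preserving
    final Set<Integer> accepts;   // states in which the call returns normally

    MethodSummary(String name, Integer sets, Integer... accepts) {
        this.name = name;
        this.sets = sets;
        this.accepts = new TreeSet<>(Arrays.asList(accepts));
    }
}

class IllegalPairs {
    /** Lists pairs "from -> to" where `to` must throw directly after a successful `from`. */
    static List<String> compute(int initialState, List<MethodSummary> methods) {
        List<String> illegal = new ArrayList<>();
        // The pseudo-method "start" stands for the point right after construction.
        collect("start", Collections.singleton(initialState), methods, illegal);
        for (MethodSummary from : methods) {
            Set<Integer> after = (from.sets != null)
                    ? Collections.singleton(from.sets)
                    : from.accepts;
            collect(from.name, after, methods, illegal);
        }
        return illegal;
    }

    private static void collect(String from, Set<Integer> after,
                                List<MethodSummary> methods, List<String> out) {
        for (MethodSummary to : methods) {
            if (Collections.disjoint(after, to.accepts)) {
                out.add(from + " -> " + to.name);
            }
        }
    }

    /** Summaries transcribed from the prose description of the state field (0, 2, 3). */
    static List<MethodSummary> signatureLike() {
        return Arrays.asList(
                new MethodSummary("initSign", 2, 0, 2, 3),
                new MethodSummary("initVerify", 3, 0, 2, 3),
                new MethodSummary("sign", null, 2),
                new MethodSummary("verify", null, 3),
                new MethodSummary("update", null, 2, 3));
    }
}
```

Under these assumptions, IllegalPairs.compute(0, IllegalPairs.signatureLike()) reports that sign, verify, and update are illegal directly after start, that verify is illegal directly after initSign or sign, and that sign is illegal directly after initVerify or verify. A sign call directly after update is not reported, because update is accepted in state 2 as well as state 3.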
5.3 Evaluating Test Suites: J2EE
Our third experiment attempted to use the dynamic model extractor to characterize a test suite. In addition, we also wished to find the limit of dynamic instrumentation by exercising it on a large application. For this purpose, we selected the Java 2 Enterprise Edition (J2EE) version 1.2.1 [18], a popular component architecture for creating multitier enterprise applications in Java. It is a large system comprising nearly a million lines of code in over 5,000 source files. As an application for the J2EE architecture, we used the Java PetStore demo [19], a web enterprise application provided by Sun and intended to be used as a framework for developing J2EE business applications.
We have found that instrumenting every class of J2EE renders the system too slow to be usable. We identified the web service, the org.apache hierarchy, to be the bottleneck, and found that just skipping the instrumentation for that package is sufficient to create a usable system. The typical latency for loading a page was in the 5-10 second range.
The goal of this experiment is to determine if we can use the dynamic tool to evaluate a test suite. Unfortunately, we do not have access to an official test suite for the J2EE platform. For the purpose of our experiment, we have developed a tool that automatically generates test cases for web applications with a graphical user interface. This "button pusher" simulates a user traversing a web site by parsing and traversing the web pages. It randomly clicks on links, goes back, aborts transfers, reloads pages, etc. It also parses form data so that it can correctly fill out web forms. This tool was adapted from Ralf Wiebicke's LinkVerify program [26].
Footnote 1: Presumably, they used 0 rather than 1 to avoid having to initialize the state variable.
Figure 12: Sample model: TransactionManager (nodes: start, begin, suspend, resume, commit, end, rollback).
Figure 13: Sample model: IIOPOutputStream (nodes: start, end, increaseRecursionDepth, decreaseRecursionDepth, simpleWriteObject).
Running multiple concurrent copies of our button pusher simulates a web server under heavy load. We collected our models by concurrently running two instances of the button pusher and running each instance for two full iterations. A complete test run over the PetStore took approximately 20 minutes.
To get a sense of how the button pusher compared to manual testing, we also performed a separate experiment by manually accessing the server from about a dozen different machines simultaneously and performing various operations. By comparing the models derived by these two testing methods, we found that the button pusher was much more complete and effective. In fact, the button pusher found several sequences of commands that triggered uncaught exceptions in the J2EE system.
We found that running the button pusher over the sample PetStore application invoked just under a fifth of the methods in the J2EE library, and extracted models for 657 classes. Many of these, as we expected, have rather uninteresting models. For example, many of the models merely specified an initializer (often the constructor itself) followed by arbitrary sequences of different get and set routines. There were, however, approximately 50 very interesting models extracted, many of which had to do with database transactions.
As an example, the interface automatically extracted for the class javax.transaction.TransactionManager is shown in Figure 12. This example illustrates that the tool is effective in extracting sensible models automatically from Java bytecodes. This uses our naive model, because the class in question is an abstract interface, with no internal fields. A programmer unfamiliar with the system can learn some salient information about the code by just perusing these models. Developers may take this information as the starting point and augment it to create a complete model. More importantly, this also provides an interesting characterization of the behavior of the program or the test suite from which the model was generated. In this case, we see that rollback methods are always the last methods invoked on the object. This suggests that either the application simply discards all transactions once they were rolled back, or the test suite does not exercise any code that performs other operations after the rollback.

| Classes | +/- | From | To | Condition |
| AbstractList.Itr, AbstractList.ListItr, Collections.5, HashMap.HashIterator, HashMap.Enumerator, LinkedList.ListItr, TreeMap.Iterator | + | next, previous | remove, set | lastRet==null, index==-1 |
|  | - | add, remove | remove, set |  |
| Vector, LinkedList | - | init, removeAllElements | getFirst, getLast | size==0 |
| i/o & sockets | + | close, finalize | any other methods | fd==null |
| PipedInputStream, PipedReader | + | connect | receive, read | connected==0,1 |
|  | - | connect | connect |  |
|  | - | receiveLast, close | receive, read |  |
| PipedOutputStream, PipedWriter | + | connect | write | connected==0,1 |
|  | + | connect | connect |  |
| ObjectInputStream | + | readObject, inputObject | defaultReadObject, readFields | currentObject==null |
| ObjectOutputStream | + | writeObject, outputObject | defaultWriteObject, putFields | currentObject==null |
|  | + | putFields | writeFields | currentPutFields==null |
| Timer | - | cancel, finalize | sched | newTasksMayBeScheduled==0 |
| SimpleTimeZone | - | some init function | decodeStartRule, decodeEndRule | various |
| AlgorithmParameters, KeyStore, SecureClassLoader, ClassLoader | + | init | "use" methods | initialized==0 |
|  | - | init | init |  |
| ThreadGroup | - | destroy | destroy | destroyed==0 |
| Signature | + | start | initSign, initVerify | state==0,2,3 |
|  | + | initSign, sign | sign, update, initSign, initVerify |  |
|  | + | initVerify, verify | verify, update, initSign, initVerify |  |
Figure 11: Constraints Extracted by Static Analysis
Shown in Figure 13 is another example that provides useful information about the test suite. The figure is the automatically extracted model for the J2EE internal class com.sun.corba.ee.internal.io.IIOPOutputStream. It is clear from the model that the recursion depth of the stream at any simpleWriteObject call is never anything other than 1. Thus, we see that our test suite does not adequately test the scenarios where recursion is encountered.
Our experiments with J2EE show that the automatic model extractor is capable of extracting useful object behavior from a large piece of software and also providing useful information that characterizes a test suite.
5.4 Protocol checking: joeq
Our fourth experiment is to explore if our tools can help a developer gain insight into his or her program and to find errors in the program. For this purpose, we applied our tool to the joeq system, which is a complete Java virtual machine and a Just-In-Time compiler developed by one of the authors of this paper [25]. Joeq can also compile Java .class files into Intel object code. The sources of joeq, consisting of approximately 65,000 lines of code, are available via http://joeq.sourceforge.net.
In joeq, there is a jq_Method object for every Java method in the system. As a method is loaded, prepared, and compiled in joeq, the corresponding jq_Method object goes through a sequence of states: load → prepare → compile → execute. Because of the complexity due to dynamic loading and linking, a large number of defects have been found in joeq that could be attributed to erroneous assumptions on the state that a jq_Method is in. In fact, as an aid to debugging, an explicit state variable has already been introduced to the jq_Method class so as to detect state violations at run time.
In this experiment, the developer of the software already has a reasonable model of the system. Our first step of the experiment is to use our dynamic model extractor to extract a model of the jq_Method class automatically. To our surprise, the model included several instances of a load → compile transition, which is a violation of the
in-joeq, using as an input the intended API. This tool pinpointed the invocation of the compile method that created a violation. Further investigation revealed that a method in a subclass of jq_Method, jq_InstanceMethod.setOffset, actually duplicated the code in jq_Method.prepare. The assertions in the joeq source did not catch the lack of a prepare state because the jq_InstanceMethod.setOffset method also updates the state variable, indicating that prepare had been called. There was no external indication that setOffset actually performed the prepare operation. The program happens to work correctly, but it is very fragile: inadvertently calling setOffset may trigger an incorrect prepare transition. Furthermore, if the functionality of prepare is changed in the future, it is highly likely that setOffset would not be updated accordingly.
We also used the static analysis to analyze the behavior of the jq_Method class. It correctly identified the presence of an integer state variable, state, in the jq_Method class, and deduced the correct sequence, load → (prepare | setOffset) → compile → execute. The static analysis was also able to identify the "incorrect" setOffset transition. Submodels on other fields also identified data dependencies between methods; for example, the execute method dereferences the compiled code field, which is only set in the compile method. The submodel for the compiled code field correctly identifies this dependency.
This example illustrates the usefulness of an independent tool that checks the behavior of the program: it found a discrepancy between the intended and the implemented API, which may be a potential source of errors as the software evolves. In this case, had the programmer provided the specification, it would have been incorrect. This example also illustrates the usefulness of a dynamic model checker.
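The kind of check performed in this experiment can be illustrated with a small Java sketch. It is a simplified stand-in, not the dynamic model checker described in this paper: the class ModelChecker and its methods are hypothetical, the observed calls on one object are assumed to be available as a list of method names, and every call in the list is treated as state-modifying. The intended protocol load → prepare → compile → execute is supplied as input, and the first observed transition outside it is reported.

```java
import java.util.*;

/** Hypothetical checker: reports the first transition that the intended model does not allow. */
class ModelChecker {
    private final Map<String, Set<String>> allowed = new HashMap<>();

    /** Declares that `to` may directly follow `from`. */
    ModelChecker allow(String from, String to) {
        allowed.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        return this;
    }

    /** Returns the first disallowed transition as "from -> to", or null if the trace conforms. */
    String firstViolation(List<String> trace) {
        String state = "start";
        for (String call : trace) {
            Set<String> next = allowed.getOrDefault(state, Collections.emptySet());
            if (!next.contains(call)) {
                return state + " -> " + call;
            }
            state = call;
        }
        return null;
    }

    /** The intended protocol for a method object, as stated in the text. */
    static ModelChecker intendedProtocol() {
        return new ModelChecker()
                .allow("start", "load")
                .allow("load", "prepare")
                .allow("prepare", "compile")
                .allow("compile", "execute");
    }
}
```

With this sketch, the trace load, prepare, compile, execute conforms, while the trace load, compile, execute is reported as the violation load -> compile, which is the transition that drew attention to setOffset in the experiment.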
6. RELATED WORK
Our work on the dynamic model extractor and modeler was inspired in part by Ernst's Daikon system [9], which extracts invariants from programs dynamically. While Daikon tries to discover invariants such as relations between variables in a program, our system uses dynamic techniques to discover the component interfaces. Whereas Daikon has only been applied to relatively small programs, our instrumented code has much lower overhead and is applicable to large applications. Recently, Daikon and the static checker ESC have been integrated into a system such that Daikon can discover invariants that ESC tries to verify [20].
Other dynamic techniques include the DIDUCE system, which instruments data locations and tracks changes in the invariants as time goes by. This system has proven successful at finding the sources of errors in difficult corner cases [13].
Using finite-state machine models to model program behavior is quite common. The Metal system [5, 8] and the SLAM toolkit [2] have been very successful in applying FSM models statically to operating system code. Programming languages such as Vault [7], NIL, and Hermes [23] encode these machines directly into the source code. Systems such as PREfix also contain models that can be represented as FSMs [3].
The Metal system uses a simple global FSM model to track changes in state [5, 8]. It uses a metalevel compilation step to statically identify locations in the code where the model may be violated. Our models differ from the
rather than low-level system resources, and therefore we use per-instance FSM models rather than a single global FSM model.
The SLAM toolkit checks temporal safety properties of general C programs [2]. These properties are specified in a language called SLIC, which uses a safety automaton to model execution behavior at the level of function calls. The authors found that it works well for programs whose behavior is governed by an underlying finite-state protocol. Like us, they also found that splitting models based on data was effective in isolating behaviors.
Ammons et al. have a system for specification mining, which attempts to extract FSM models of program behavior from program traces [1]. Their system runs processed traces of C program code through an off-the-shelf probabilistic NFA learner. Our system, in contrast, trades generality in order to leverage object-oriented program designs to produce more focused specifications of component behavior.
Our models of component behavior are very similar to the notion of typestates [22, 23]. In typestates, the type of the object changes as the values stored in its fields change or as the program invokes operations on the object. Strom and Yemini developed a system called Typestate and two languages, NIL and Hermes, that use formal typestate rules to enforce static invariants on the state of a program at different program points. They do not support traditional pointers. Programs are checked for conformity by using a flow-sensitive data-flow analysis [23] or a demand-driven backward analysis [22]. They used typestates to statically verify the initialization properties of values [23]. Xu uses typestates to check the safety of machine code [27].
Vault is a type-safe variant of C [7]. Vault requires that the programmer annotate or rewrite their code to match the strict type model. Vault has no notion of path-sensitivity, so the state of objects must be consistent at every point in the program. It also has many restrictions on aliasing. This allows Vault to make many strong assertions about program behavior, but many programs are written such that their roles are guarded by predicates. Vault's system forbids many valid programs written using this style.
Systems like Vault, NIL, and Hermes are complementary to our system. They are able to take advantage of strong models because they require the programmer to write their code to obey a strict model. Our technique is able to obtain weaker models over arbitrary, large pieces of code.
Our technique is similar to other notions of type. Girard introduces the notion of linear logic, which says that resources can be used only once [11]. Wadler uses linear logic to propose a new type system for functional languages that models changes of state [24]. Gifford and Lucassen introduced a type and effect discipline that uses type inference techniques to track accesses to resources [17].
Our ideas about models of objects map closely to the object-oriented design notion of roles. A role is the part of an object that fulfills its responsibilities to other objects. There has been work on specifying role models [12] and mechanisms for role-based programming [14]. Kuncak et al. describe a system for the programmer to specify the roles of objects by their aliasing relationships with other objects, along with a mechanism to statically verify those aliasing constraints [15].
7. CONCLUSIONS
This paper proposes a new model for component interfaces. We propose using a finite state machine for each field of a class, with one state for each method that writes that field. We then add restrictions on the states from which methods that read the field may be invoked. We have proposed and implemented two techniques that automatically extract such models. The first is a dynamic instrumentation technique that records legal method sequences from working programs. The second is a static analysis that infers pairs of methods that cannot be called consecutively. We have also developed a dynamic model enforcer to ensure that a given model is obeyed.
We have applied our tools to four large software systems and found that they can automatically extract a variety of useful data, assisting in various stages of software development. A programmer may run these tools to learn about the common uses of a component. One can build a model of a component by extracting the models implied by a test suite, or those resulting models may be examined to evaluate the completeness of the test suite itself. A component developer may also use these tools to determine if the implemented API matches the intended one.
8. REFERENCES
[1] G. Ammons, R. Bodik, and J. Larus. Mining specifications. In Proceedings of the 29th ACM Symposium on Principles of Programming Languages, pages 4-16, 2002.
[2] T. Ball and S. Rajamani. Automatically validating temporal safety properties of interfaces. In Proceedings of the SPIN 2001 Workshop on Model Checking of Software, pages 103-122, 2001.
[3] W. R. Bush, J. D. Pincus, and D. J. Sielaff. A static analyzer for finding dynamic programming errors. Software - Practice and Experience (SPE), 30:775-802, 2000.
[4] A. N. Habermann and R. H. Campbell. The specification of process synchronization by path expressions. Lecture Notes on Computer Science, 16, 1974.
[5] A. Chou, B. Chelf, D. Engler, and M. Heinrich. Using meta-level compilation to check FLASH protocol code. In Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 59-70, 2000.
[6] M. Dahm. Byte code engineering library. http://bcel.sourceforge.net, 2000.
[7] R. DeLine and M. Fahndrich. Enforcing high-level protocols in low-level software. In Proceedings of the ACM SIGPLAN '01 Conference on Programming Language Design and Implementation, pages 59-69, 2001.
[8] D. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system-specific, programmer-written compiler extensions. In Proceedings of the Symposium on Operating Systems Design and Implementation, pages 1-16, 2000.
[9] M. Ernst, J. Cockrell, W. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. In Proceedings of the 1999 International Conference on Software Engineering (ICSE '99), pages 213-224, 1999.
[10] J. Ferrante, K. Ottenstein, and J. Warren. The program dependence graph and its uses in optimization. ACM Transactions on Programming Languages and Systems, 9(3):319-349, 1987.
[11] J. Girard. Linear logic. Theoretical Computer Science, 50(1):1-101, 1987.
[12] G. Gottlob, M. Schrefl, and B. Rock. Extending object-oriented systems with roles. ACM Transactions on Information Systems, 14(3):268-296, 1996.
[13] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. In Proceedings of the International Conference on Software Engineering, 2002.
[14] M. VanHilst and D. Notkin. Using C++ templates to implement role-based designs. Technical Report TR 95-07-02, Department of Computer Science and Engineering, University of Washington, 1996.
[15] V. Kuncak, P. Lam, and M. Rinard. Role analysis. In Proceedings of the 29th Annual ACM Symposium on the Principles of Programming Languages, pages 17-32, 2002.
[16] D. Lie, A. Chou, D. Engler, and D. Dill. A simple method for extracting models from protocol code. In Proceedings of the International Symposium on Computer Architecture, pages 192-203, 2001.
[17] J. Lucassen and D. Gifford. Polymorphic effect systems. In Proceedings of the Fifteenth Annual ACM Symposium on Principles of Programming Languages, pages 47-57, 1988.
[18] Sun Microsystems. Java 2 platform, enterprise edition. http://java.sun.com/j2ee/, 2001.
[19] Sun Microsystems. Pet store application for the java 2 platform, enterprise edition. http://java.sun.com/features/2001/05/petstore.html, 2001.
[20] J. Nimmer and M. Ernst. Static verification of dynamically detected program invariants: Integrating Daikon and ESC/Java. In Electronic Notes in Theoretical Computer Science, volume 55, 2001.
[21] S. P. Reiss and M. Renieris. Encoding program executions. In Proceedings of the International Conference on Software Engineering, pages 221-230, 2001.
[22] R. Strom and D. Yellin. Extending typestate checking using conditional liveness analysis. Software Engineering, 19(5):478-485, 1993.
[23] R. Strom and S. Yemini. Typestate: A programming language concept for enhancing software reliability. Software Engineering, 12(1):157-171, 1986.
[24] P. Wadler. Linear types can change the world. In M. Broy and C. B. Jones, editors, IFIP TC 2 Working Conference on Programming Concepts and Methods, pages 561-581, 1990.
[25] J. Whaley. The joeq virtual machine. http://sourceforge.net/projects/joeq, 2001.
[26] R. Wiebicke. Linkverify. http://rw7.de/ralf/htmltools/index.en.html, 1998.
[27] Z. Xu, T. Reps, and B. Miller. Typestate checking of machine code. In Proceedings of the European Symposium on Programming, pages 335-351, 2001.