• No results found

Web Server. Repository (static information) Presentation Content Application Data and

N/A
N/A
Protected

Academic year: 2021

Share "Web Server. Repository (static information) Presentation Content Application Data and"

Copied!
16
0
0

Loading.... (view fulltext now)

Full text

(1)

Filipp oRiccaandPaoloTonella

ITC-irst

Centrop er laRicercaScienticaeTecnologica

38050Povo(Trento),Italy

fricca,[email protected]

tel.+39.0461.314592,fax+39.0461.314591

Abstract. Webapplicationsareb ecomingincreasingl ycomplexand

im-p ortant forcompanies.Theirdesign, development, analysis and testing

needthereforetob eapproachedbymeansofsupp ortto ols andmetho

d-ologies.Inthispap erweconsidertheproblemsrelatedtobuildingto ols

forthe analysis and testing ofWebapplicati ons and we tryto provide

someindicationsonp ossible solutions,basedup onourexp erience inthe

development oftheto ols ReWebandTestWeb.

Thedenition ofaprop er reference mo delwill b ediscussed, as well as

theimpact of dynamic pages duringWeb sitedownloading and

subse-quentmo delconstruction.Visualizati ontechniquesaddressingthelarge

amountofextracteddatawillb epresented,whileinfeasibili ty problems

willb econsidered withreferencetothetesting phase.

1 Introduction

Inthelastyears,Webapplicationshaveb ecomeimp ortantassetsforseveral

com-panies,b eingaconvenientandinexp ensive waytoprovidepro ductinformation,

e-commerceandserviceson-line.SinceasoftwarebuginaWebapplicationcould

interruptanentirebusinessandcostmillionsofdollars,thereisastrongdemand

formetho dologies,to olsandmo delsthat canimprove thewebsite qualityand

reliability [7,8].For example, to ols can supp ort develop ers understanding the

abstractstructure of aWebapplicationbymeansofviewsandanalyses,

ensur-ingthattherequirementsp ecicationsaresatisedbytheapplication,andthey

canhelpinthetestingphase.Developingato olthatextracts amo delofaWeb

application,implementssomestaticanalysesandsupp ortsthedevelop ersinthe

testing phase is noteasy.Main problemsare relatedto:mo delingthe abstract

structure ofWeb applications,adaptingknown analysisand testing techniques

tothecharacteristicsofWebbased systems,andvisualizinglargegraphs[6,10].

Onlyfewworks haveinsofarconsideredtheproblemsrelatedto Websitestatic

analysis,maintenance,testingand to buildingtheasso ciatedto ols. One ofthe

rst systematic studies on Webmaintenance is [12],where theauthors

recog-nize the similarity b etween software systems and web based systems and the

imp ortance of the maintenance phase. They have built a to ol called SiteSeer

(2)

that downloadsweb sites and computes some metricson them.The pap er [9]

describ es SPHINX,aJavato olkitandinteractive developmentenvironmentfor

Webspiders.SPHINX consistsoftwoparts:theSpiderworkb ench,a

customiz-ablespider that supp ortsagraphical user interface and visualizesthewebsite

recoveredas agraph,andtheWebSPHINXclasslibrary,thatprovidessupp ort

forwritingWebspidersinJava.TheCAPBAK/Webto ol,explainedin[8],is a

webtesting to ol that supp orts functionaltesting and regression testing. In [7]

anapproachto dataowtestingofWebapplicationsispresented.

Inthispap erweconsidertheproblemsrelatedtobuildingto olsforthe

anal-ysisandtesting ofWebapplicationsandwetry to providesomeindicationson

p ossible solutions,based up on our exp erience in the developmentof the to ols

ReWeband TestWeb.ReWebdownloadsand analyzes the pages of aWeb

applicationwithatwofoldpurp ose:buildingamo deloftheapplicationand

sup-plyingsomeviewsandanalysestothedevelop er.TestWeb,astructuraltesting

to ol, generates and executes a set of test cases for a Web application whose

mo delwascomputedbyReWeb.

Theremainderofthispap erisorganizedasfollows:thenextsectiondescrib es

agenericWebapplicationinfrastructure,Section3intro ducesthegeneral

archi-tecture of our to ols andpresents the adopted analysis mo delfor Web

applica-tions,Sections 4and 5explain problemsencountered and solutionsadopted in

thedevelopmentoftheto olsReWebandTestWeb.Finally,Section6concludes

thepap er.

2 Web Applications

AtypicalgenericWebApplicationinfrastructureisshowninFigure1(asimilar

schemaisprop osed in[13]).

Web

Browser

Web

Server

Back-end

Services

HTTP Response

HTTP Request

Request

HTML Output

Content Response

Content Request

Request

Response

Presentation

Content

Application

Data and

Layer

Layer

Layer

Request

Response

Service Layer

Databases

Repository

(dynamic contents)

(static information)

Application

Server

(3)

The browser sends the requests via HTTP to the server for an interactive

view of Web pages. Web pages can b e static or dynamic. While the content

of astatic page is xed and stored in a rep ository, the content of a dynamic

pageiscomputedat run-timebytheapplicationserver andmaydep end onthe

informationprovided by the user through input elds (a similardistinction is

prop osedin[4]and[5]).Theprogramsthatgeneratedynamicpagesatrun-time,

as forexampleCGI scriptsand servlets,run onthe applicationserver andcan

useinformationstoredindatabasesandotherresources.TheWebserverandthe

applicationservercanb elo catedonthesamemachineorondierentmachines.

Similarlyto [5],we classifyWebapplications 1

according toataxonomy,

or-deredbygrowingcomplexity,whichischaracterizedbydynamism,page

decom-p ositionanddataow.

{ Level0:staticpageswithoutframes.

{ Level1:staticpageswithframes.

{ Level2:dynamic pageswithoutdatatransfer fromclient.

{ Level3:dynamic pageswithdatatransferfromclient.

The diÆculties and problems in the construction of a to ol, that supp orts

develop ersinthephases ofanalysisandtestingofWebapplications,growwith

increasinglevelsinthetaxonomy.Applicationsatlevels2and3typicallyexploit

informationstoredinsideadatabasetobuildthecontentofthedynamicpages.

Allfourlevelsareinthescop eoftheprop osed techniques.

3 Tool Architecture

Output

pages

Coverage

Pass/fail

values

Expected

UML

model

TestWeb

ReWeb

Internet

URL

Input forms

Views

Analyses

Uses,

Dynamic pages,

Cond. edges

Input values

Test criterion

Fig.2. RolesofRewebandTestWeb.

Thetwoto olsReWebandTestWebhave b eendevelop edtosupp ort

anal-ysis and testing of Web applications. Their relative roles are schematized in

1

Although some authors distinguish b etween Websites and applicatio ns, using the

(4)

Figure2.ReWebdownloadsandanalyzes thepagesofaWebapplicationwith

the purp oseof buildingamo delof it andpro ducing someanalyses andviews.

TestWebgeneratesandexecutesasetoftestcasesforaWebapplicationwhose

mo delwascomputedbyReWeb.Thewholepro cessissemi-automatic,andthe

interventionsoftheuserareindicatedwithindiamondsinFigure2.Explanations

onthemanualinterventionswillb egiveninthefollowingsections.

Both to ols p erformtheir op erations onan abstraction of theWeb

applica-tions, indicated inFigure 2as UML mo del. UML,the Unied Mo deling

Lan-guage[3],wasexploitedtoexpresssuchamo del.Letusconsiderthekey

require-mentsonthemo del.Weareinterestedinamo delthatcanb edirectlyabstracted

fromtheimplementation.Someimp ortantcharacteristicsthatitshouldhavecan

b esummarizedasfollows:

{ Thefo cusshouldb eonthenavigational featuresofthesite;

{ Itshouldb ecompletei.e.themostimp ortantentitiesas,forexamplelinks,frames,

formsanddynamicpagesmustb eexplicitl y representedin themo del;

{ Itshouldb ep ossible toprovide (partial)automatic supp ortforitsextraction;

{ It should b e p ossible to apply to it some staticanalyses and testing techniques

derivedfromthoseusedwithtraditionalsoftwaresystems.

{ Itshouldb ep ossible toderive someviews fromitthatrepresentthe Websitein

anintuitive mo de;

WebPage

Form

input: Set<Var>

Frame

StaticPage

DynamicPage

use: Set<Var>

link

LoadPageIntoFrame

f: Frame

ConditionalEdge

{optional}

c: Condition<Var>

{optional}

include

into

initial page

submit

split

split into

0..*

0..*

0..*

0..*

1

{only from DynamicPage}

0..1

Fig.3.Meta mo delofagenericWebapplicationstructure.

Figure3showsourmetamo delused todescrib e theelementsinthemo del

(5)

b e displayed to theuser, and the navigation linkstoward other pages. It also

includes organizationandinteractionfacilities(e.g.,framesandforms).

Thetwosub classesof WebPagemo delthestaticanddynamicpages.When

thecontentofadynamicpagedep ends onthevalueof aset ofinputvariables,

theattributeuseofclassDynamicPage containsthem.

Aframeisarectangulararea inthecurrent pagewherenavigationcantake

placeindep endently.Moreover thedierentframesintowhichapageis

decom-p osedcaninteractwitheachother,sincealinkinapageloadedintoaframecan

forcetheloadingofanotherpageintoadierentframe.Thiscanb eachievedby

addingatargettothehyp erlink.Organizationintoframesisrepresentedbythe

asso ciationsplitinto,whosetargetisasetofFrameentities.Framesub division

mayb erecursive(auto-asso ciationsplitintowithinclassFrame),andeachframe

hasaunaryasso ciationwiththeWebpageinitiallyloadedintotheframe(absent

incase ofrecursive sub divisionintoframes).WhenalinkinaWebpageforces

theloadingofanotherpageintoadierentframe,thetargetframeb ecomesthe

datamemb erof the(optional)asso ciationclassLoadPageIntoFrame.

InHTML user inputcanb e gathered byexploitingforms. AWebpagecan

include any numb er of forms(asso ciation include). Each form is characterized

by theinput variables that areprovided by the user through it (data memb er

input).ValuescollectedbyformsaresubmittedtotheWebserverviathesp ecial

linksubmit,whosetargetisalwaysadynamicpage.Sincelinks,framesandforms

are partofthecontent ofaWebpage,and fordynamicpagesthecontentmay

dep end on the input variables, even the organization of a page is, ingeneral,

notxedanddep ends ontheinput. Thisisthereason fortheasso ciationclass

ConditionalEdge, which optionally adds a b o olean condition, function of the

input variables, representing the existence condition of the asso ciation (which

caninturnb ealink,anincludeorasplitinto).Thetarget,page,formorframe,

isreferenced bythesource dynamicpageonlywhentheinputvaluessatisfythe

conditionintheConditionalEdge.

4 ReWeb

TheReWebto olconsistsofthreemo dules:aSpider,anAnalyzerandaViewer.

TheSpiderdownloadsallpagesofatargetwebsite,startingfromagivenURL

and providingthe input required by dynamic pages, and it builds a mo del of

the downloaded site. Each page found withinthe site host is downloadedand

markedwith thedate ofdownloading.TheHTML do cumentsoutside the web

site host arenotconsidered. The user hasto sp ecify the set ofinputs foreach

pagethatcontainsForms.TheAnalyzerusestheUMLmo delofthewebsiteand

the downloaded pages to p erform several analyses, presented in the following,

some of which are exploited during static verication. The Viewer provides a

(6)

navigationand query facilities. Web Spider and Analyzer are written in Java,

whiletheViewer isbasedonDotty 2 . 4.1 Spider

SPIDER(target url)

1 UML Model

;

2 Error urls

;

3 Pages already visited

;

4 Urls found

;

5 S

f

target url

g

6

while

(S

6

=

;

)

7

choosen url

chooseElement(S)

8

S

S

nf

choosen url

g

9

ifnot

(choosen url

2

Pages already visited)

then

10

if

(choosen url is OK)

then

11

Pages already visited

Pages already visited

[f

choosen url

g

12

if

(choosen url is a HTML page)

then

13

Download(choosen url)

14

Urls found

scanPage(choosen url)

15

S

S

[

Urls found

16

AddElementsToModel(UML Model, choosen url, Urls found)

17

endif

18

else

19

Error urls

Error urls

[f

choosen url

g

20

endif

21

endif

22

endwhile

Fig.4.Pseudo-co de oftheSpider.

Web pages are not actually written using a single language. They can b e

ratherregardedasmultilanguagedo cuments,whereco defragmentsinlanguages

dierentfromHTMLcanb eloaded(e.g.Applets)orinterpreted(e.g.Javascript).

LibrariesfortheconstructionofSpiderprogramsareavailableforprogramming

languagessuchasPerl,C/C++,orJava.AnexampleistheWebSPHINXclass

library [9].We decided to implement ourWebSpider just exploiting the Java

language and its standard library, starting from scratch, in order to have

to-talcontrolonthemultilingualasp ects ofthedownloadedpages.Wedevelop eda

parserwhichrecognizesb othHTMLandJavascriptco defragments,andextracts

theneeded information(links,forms,frames,etc.)fromthem.

Figure4showsthepseudo-co deofourSpider.Thepro cedureSPIDER takes

agivenURLininput and buildsthe asso ciatedUML mo del. Theb o dy ofthe

commandwhile(containedinlines6-22) isexecuted untilthereareelementsin

theset S.ThefunctionchooseElement (line7) cho oses anelementinthesetS

whiletheconditionchosen urlisOK(line10)istrueifchosen urliswell-formed

andthecorresp onding pageexists intheWebsite.The conditionchosen urlis

2

(7)

an HTMLpage (line12)is true ifthecontenttyp eof thedo cumentconnected

tochosen urlisHTML.Inline13thepro cedureDownloadiscalledtostorethe

retrievedpageinthele-system.ThefunctionscanPage(line14)scansthepage

andreturnsthesetofURLsfoundwithinitandcontainedinthesite host.The

pro cedure AddElementsToModel (line 16) adds no des and edges to the mo del

inaccordance withourmetamo del.Iftheset Sisimplementedas astack,the

algorithmvisitstheWebsiteindepth-rstway,whileusingaqueue pro duces a

breadth-rst visit.

Problems encountered inthe construction of the Spider are due to

irregu-larities and ambiguities present in HTML co de, also noted in[12], and to the

currentstateoftheWebtechnology,oeringalargesp ectrum ofalternatives to

implementawebsite.Oursolutionwastoimprovetherobustnessoftheparser,

so that it could accept a sup erset of HTML including the main irregularities

commonlyrecognizedandprop erlyinterpreted byavailablebrowsers.

Dynamicpagesp ose additional problemto theactivityof theSpider.Since

the content of these pages is decided at run time, it may in general dep end

on the input previously provided by the user. In particular, the structure of

a dynamicpage may change when it isencountered in adierent interaction.

Since the mo del of aWeb site encompasses all p ossibilities,the Spider has to

recover all variants of a dynamic page, and has to merge them into a single

representative object.Thiscanb eachievedbysp ecifyingtheinputvaluestob e

providedb eforedownloadingthedynamicpageof interest. Moreover,thesame

dynamicpagehas to b edownloadedseveral times,with dierentinputs, when

thedierentconditionsgenerateadierentpagestructure.Allsequencesofinput

valuestob e providedb eforeeachpagedownloadaresp eciedinalewhich is

readbytheSpider.Alldynamicpagessp eciedinthelearedownloadedafter

providingtheWebserver withthegiveninputs.Finally,allversionsofthesame

page are merged.Although the numb erof inputs to b e provided mayexplo de

combinatorially, out exp erience suggests that in practice few alternatives are

suÆcienttocoverallvariants.

Anadditionalinputthatmayaectthecontentdisplayedinadynamicpage

is the cookie that the browser provides to the Web server. A cookie is a user

identier thatis stored bythe browser inthelo calle systemand is provided

totheWebservertoallowuseridenticationeachtimeanewconnectionwitha

givenserverisestablished.Afterrecognizingtheuser,theWebservercanprovide

acustomizedversionofthedynamicpagesinthesite.Sincetheirstructuremay

dep end ontheco okie,theSpiderneeds theabilitytosendaco okietotheWeb

server, in order to obtain also the pages that are generated when the user is

identied.

4.2 Analyzer

The UML mo del of a Web site can b e interpreted as a graph by asso ciating

objects withno des andasso ciations withedges. Somesimpleanalysesmay

(8)

FLOWANALYSIS(graphG=(N,E))

1 foreach(n2N)

2 initializeINnandOUTn

3 endfor 4 change true 5 while(change) 6 change false 7 foreach(n2N) 8 INn L p2pred(n) OUTp 9 OLDOUT n OUT n 10 OUT n GEN n S (IN n ,KILL n ) 11 ifOUT n 6=OLDOUT n then 12 change true 13 endif 14 endfor 15 endwhile

Fig.5.Pseudo-co deoftheowanalysisalgorithm.

page. They are obtained as the dierence b etween the pages available in the

Webserver le systemand thosedownloadedbytheSpider.Ghostpages are

asso ciatedwithp endinglinks,whichreference anonexistingpage.

More advanced analyses[11] can b e derived fromthegeneral frameworkof

owanalysis[1],describ ed bythealgorithminFigure5.Thealgorithm

propa-gatesowinformationinside agraphuntilthex-p ointisreached.Thekindof

owinformationtob epropagateddep endsonthepurp oseoftheanalysisb eing

p erformed.Some examplesaregivenb elow.Moreover,the conuence op erator

exploitedatline8tocollectoutgoinginformationfrompredecessorno desisalso

dep endentontheanalysis,andistypicallyeither theintersectionortheunion.

Afterinitializingtheinputandoutputsetsofeachno de (IN

n

andO UT

n ,lines

1-3)withtheinitialowinformation,propagationisachievedinsideax-p oint

lo op(line5)bysubtractingthedestroyedinformation(KILL

n

set)andadding

thegeneratedinformation(GEN

n

set)totheincominginformation(line10)for

each no deninthegraph.

An example of analysis which sp ecializes the algorithm in Figure 5 is the

computation of the reaching frames, which determines the set of frames in

which each page can app ear.Whena pageis loaded into aframe as its initial

pageorisreachablethroughanedgedecoratedwithaLoadPageIntoFrame

asso-ciationclassinstance,itgeneratesthenameoftheframeasowinformation.By

propagatingsuchinformationalongthesitegraphuntilthex-p ointisreached,

thereaching frames ofeach pageare determined.The outcomeofthereaching

framesanalysis isusefulto understand theassignmentofpagesto frames.The

presence of undesirable reaching frames is thus made clear. Examples are the

p ossibilityto loada page at the toplevel, while it was designed to always b e

loaded into agivenframe,or thep ossibilityto loadapage intoaframe where

itshould notb e.

Flowanalysescanb eemployedinamoretraditionalfashiontodeterminethe

(9)

graph.Ifadenitionofavariablereachesano dewherethesamevariableisused

(useattributeofadynamicpage),thereisadatadep endence b etweendening

no de and user no de. Datadep endences areuseful to representthe information

owsintheapplication.Theymayrevealthepresenceofundesirablep ossibilities,

such as using a variable not yet dened or using an incorrect denition of a

variable.Datadep endencesarealsoextremelyimp ortantfordynamicvalidation,

whendataowtestingtechniquesare adopted.

Whenthepagesofasitearetraversed, itisimp ossibletoreach ado cument

withouttraversing aset of other pages,called itsdominators.Sites inwhich

traversing a given page is considered mandatory, e.g., b ecause it contains

im-p ortantinformation,willhaveitinthedominatorsetofevery no de.Dominator

analysis,alsoderivedfromthealgorithminFigure 5,automatesthecheck.

Theevolutionofwebsites [10,12] isanother interesting object of

investiga-tion. Such ananalysis requires theabilitytocompare successive versions ofits

pages and to graphically display the dierences. Given two versions of a web

site,downloadedatdierentdates,theircomparisonaimsatdeterminingwhich

pageswereadded,mo died,deletedorleftunchanged.Itcanb ecombinedwith

thestaticanalysesdescrib ed ab ove,since theirre-computationover timeallows

controllingtheevolutionoftheapplicationquality.

4.3 Viewer

ThegraphviewofaWebapplicationis agraph,whoseno descorresp ondtothe

objects inthe mo deland whose edges corresp ond to the asso ciations b etween

objects. Lab eled edges are used for the links having a LoadPageIntoFrame or

ConditionalEdge relationsp ecier. Inthegraphview,to intuitivelysuggest

de-comp osition into frames, we adopt the convention of joining horizontally the

no des of typ e frame contained in the same page, and collapsing the edges of

typ e \split into" into asingle edge. An example of decomp osition into frames

is shown inFigure6. Page madmaxpub/index.html(the mainpage)is divided

intotwoframeswithidentiersaandb,andframeaisusedasamenutoforce

theloadingofpagesintotheotherframe.

Thegraphviewofawebapplicationcanb eenrichedwithinformationab out

itshistory[10],bycoloringtheno desandasso ciatingdierentcolorstodierent

timep oints(see Figure6).Inparticular,ascaleofcolorsrangingfromtheblue,

goingthroughthegreenandreachingtheredcanb eemployedtorepresentno des

added/mo diedinthefarpast,inthemediumpastormorerecently.

TheViewer is based on Dotty,and uses the algorithmexplained in [6] for

drawingdirectedgraphs.Theaestheticprinciplesfollowedbythealgorithmare:

to exp ose hierarchical structure (if any) in the graph, to avoid edge crossings

(ifp ossible) andsharpb ends, to keep edgesshort, tofavorsymmetryand

bal-ance.Thelayoutalgorithmof thegraphviewofaweb siteisveryimp ortantto

understanditsstructure,esp ecially whenthesite isverycomplex.

Another problemconnected with thevisualization of awebsite is thefact

(10)

diÆ-b

b

b

b

Fig.6.Coloredgraph viewofthesitewww.ubicum.it/madmaxpubatdate

3-2-2000.

abstract, to simplify,to extract ap ortionof,or to seeonlyapart of thegraph

view.Anotherp ossibilitycanb etoaddotherviewsofawebsiteasforexample

thebirdeye andoverviewdiagramsor todisplayonlythedepth-rst tree

with-outreturn edges(solutionadopted in[9]). Theviews and facilitieswe prop ose

follow. The system view represents the organization of pages into directories;

thedataow viewdisplaystheread/writeaccessesofpagestovariables,resp

ec-tivelythrough incoming/outgoingedgeslinking pagesto variables; thehistory

representation with percentage bars describ es, incompactway, thep ercentages

ofno deswiththesamecolor.Amongtheprovidedfacilities,theviewersupp orts

zo om, search, deletion of incomingor outgoingedges, and fo cus. Thefacilities

forfo cusing onandsearching ano deare usefulwhenthevisualizedgraphs are

very large. By exploiting the fo cusing facility it is p ossible to display only a

limitedneighb orho o dofaselected no de.Anotherp ossibilitytoaccessthegraph

view,not yet implemented,is the identication and extraction of ap ortion of

a web site by means of pattern matching techniques. Recurrent patterns are

exp ected tob e usedinthedesignof websites (forexampletree,hierarchy,full

connectivity,indexed-sequence).

5 TestWeb

Web sites can involve a complex interaction among Web browser, op erating

systems,plug-inapplications,communicatingproto cols,Webservers,databases,

server programs (for example CGI programs) and rewalls. Such complexity

makes the test of Web Sites a great challenge [8].Ideally all comp onents and

(11)

the extreme timepressure under which Websystems are develop ed. Available

testingtechniques [8]dieronthefeatures oftheWebsitewewanttotest.For

example it is p ossible to execute link testing, HTML validation, p erformance

testing, andsecurity testing.We areinterestedindynamic validationusingour

UML mo delas abase for this typ e of testing. In general, dynamic validation

metho dsaimatexercising thesystembysupplyingavector ofinputdata(test

case)andcomparingtheexp ectedoutputswiththeactualonesafterexecution.

Inparticular,weconsidered whitebox testingofWebApplications:theinternal

structure ofaWebapplicationisaccessed tomeasure thecoveragethat agiven

testsuite(collectionoftestcases) reaches,withresp ect toagiventestcriterion

(statingthefeaturestob etested).Somewhiteb oxtestingcriteria,derivedfrom

thoseavailablefortraditionalsoftware[2],are:Pagetesting, Hyp erlinktesting,

Denition-use testing,All-uses testing,All-pathstesting. AtestcaseforaWeb

application is a triple: URL, input (a sequence of variable-value assignments

separated by the character '&'), typ e of parameter passing (GET or POST).

Execution consistsof requestingtheWebserver fortheURLinthetriplewith

theasso ciatedinputandstoringtheoutputpages.Satisfactionofanyofthewhite

b oxtesting criteria involves selecting aset ofpaths intheWebsite graph and

providinginput values. Since path selection is indep endent (conditional edges

excluded) frominputvalues,it canb eautomated.

Output

pages

Coverage

Pass/fail

model

UML

Test

cases

Executor

Test

values

Expected

Test Criterion

Generator

Test

dynamic pages

cond. edges

uses

Input values

Fig.7.Architecture oftheto olTestWeb.

TestWeb(see Figure 7) contains a test case generation engine (Test

gen-erator),ableto generate testcases fromtheUMLmo delof aWebapplication.

The user has to add some informationto the mo delpro duced by ReWeb to

complete it fortesting purp oses andfurthermore theuser has tocho ose atest

criterion. Theuser sp ecies the pagetyp e when thedistinction b etween static

anddynamicpagescannotb eobtainedautomatically(e.g.,dynamicpageswith

no input). The user also provides the set of used variables, use, for each

(12)

Additionalmanualinterventions,relatedtostate unrolling,willb edescrib ed in

thefollowingonanexample.GeneratedtestcasesaresequencesofURLswhich,

onceexecuted, grantthecoverageoftheselected criterion.Inputvaluesineach

URLsequence areleft emptybytheTest generator,and theuser hasto ll in

them,p ossiblyexploitingthetechniques traditionallyusedinblackb oxtesting

(b oundary values, etc.). TestWeb's Test executor can now provide the URL

request sequence of each test case to theWeb server, attaching prop er inputs

to each form. Theoutput pages pro duced bythe server are stored for further

examination.Afterexecution,thetestengineerintervenestoassessthepass/fail

result of each testcase. Asecond, numeric outputoftest case execution isthe

levelofcoverage reached bythecurrenttestsuite.

Regressiontestinghighlyb enetsfromtheautomationintestcaseexecution,

sinceeach testcasecanb ere-executed unattendedonanewversionoftheWeb

application, and its output pages can b e automatically compared with those

obtainedfromarun ofthepreviousversion.

5.1 TestGenerator

Given the graph representation of a Web application,a reduced graph can b e

computedforthepurp osesofwhiteb oxtesting:each staticpagewithoutforms

isremovedfromthegraphbyaCross-Termstepdescrib ed in[2](see Figure8).

A

D

B

C

E

A

A

B

B

B

B

D

D

C

E

C

E

A

Fig.8.StepoftheCross-Termalgorithmatano deselected forremoval.

Inthe resulting graph,a ctitiousentry no de is added, connected with all

no deswithnopredecessor,andactitiousexitno deisdirectlyreachablefromall

outputno des,i.e.,dynamicno deswithnonemptyuseattribute.Infact,theend

ofacomputationisreached,inaWebapplication,whensomeresultisdisplayed

totheuser,butnointrinsicnotionofterminationforanavigationsessionexists.

Dierentlyfromtheow-graphofastructuredprogram,thegraphview ofa

Webapplicationcancontainhorrible-lo ops[2],i.e.,theremayb eno desjumping

into or out of a lo op and/or there may b e more than one iterating no de for

the samelo op. In presence of horrible-lo opstheusual strategies used to cover

nested-lo opsandconcatenated-lo opsdonotwork.Wehavechosenageneral

so-lution:a testcase generation technique based on the computationof thepath

expression [2] ofthereduced Website graph.Apath expressionis analgebraic

representation of allpaths inagraph.Variablesina pathexpression are edge

(13)

ec-REDUCTION(graph)

1 Combine all serial links bymultiplying their expressions

2 Combine all parallel links by adding their path expressions

3 Remove all self-loops by replacing them this a link of the form

X

4

while

(number of nodes in the graph

>

2)

5

n

choose a node of the graph dierent from initial or nal node

6

Apply Cross-Term elimination to n

7

Combine any remaining serial links as in step 1

8

Combine all parallel links as in step 2

9

Remove all self-loops as in step 3

10

endwhile

Fig.9.Reductionalgorithm.

Computationofthepathexpressionforasitecanb ep erformedbymeansofthe

Reductionalgorithmdescrib ed in[2]anddepicted inFigure9.

Thelines 1-3of the Reductionalgorithminitializethepro cess and put the

graph in normal form. The b o dy of the commandwhile is executed until the

numb er of no des in the graph is greater than 2. Line 5 assigns ano de of the

graphdierentfromtheinitialornalno detovariablen,whileline6executes

aCross-Term step onno de n.This step eliminatesthe no den and transforms

the graphaccording to thediagramshown inFigure8. Lines 7-10combine all

serial andparallellinksandremoveself-lo ops. Attheend oftheexecution,the

path-expressionoftheinputgraphisobtained.

PATH GENERATION(path expression)

1

while

criterion not satised

2

for each

alternative from inner to outer nesting

3

choose

one never considered before, if any

4

or randomly choose

one

5

endfor

6

if

computed path increases coverage

then

7

add

it to the resulting paths

8

endif

9

endwhile

Fig.10.Theheuristictechniqueadopted toobtainthepathssatisfyinga

crite-rion.

Since thepath expression directly represents allpaths inthe graph, itcan

b e employed to generate sequences of no des (test cases) which satisfy any of

thecoverage criteria.Determiningtheminimumnumb erofpaths, fromapath

expression,satisfyingagivencriterionisingeneralahardtask.However,

heuris-ticscanb edened tocomputeanapproximationoftheminimum.Theheuristic

techniqueadopted forthisworkis basedontheschemeofFigure10(the

alter-native at line 2 for alo op is whether to re-iterate or not). Denition-use and

(14)

such as denition-use and all-pathstesting, for which thecoverage of p ossibly

innitepathsshould b eachieved,couldrequirethat onlyindep endentpathsb e

consideredor thatlo opsb ek -limited.

:Form

input = {va}

:Form

input = {va}

:Form

input = {va}

:Form

:Form

:ConditionalEdge

:Form

input = {va}

:Form

input = {jump}

thesaurus.htm

thesaurus

:Form

input = {va}

dictionary.htm

dictionary

:Form

input = {va}

:Form

a

b

c

e

d

input = {jump}

c = ’#entries(va) > 1 and

not (jump selected)’

Fig.11.Mo delofap ortionofwww.m-w.com,includingtwoconditionaledges.

Figure11showsthep ortionoftheWebsitewww.m-w.comwhichprovideson

line access to the Merriam-WebsterEnglishdictionary and thesaurus. A word

can b e entered inthe initialpage (eitherdictionary.htmor thesaurus.htm).

A dynamicpagedictionary isthen comp osedinresp onse to theinput word,

storedinvariableva.Thecontentoftheresultingpagedep ends onthenumb er

of entries foundinthe dictionary.If thereare morethan one entry,aselection

listisdisplayedtotheuser,togetherwithanexplanationofthemainentry.The

usercancho oseamongthealternatives{theselectionisstoredinvariablejump

{andmovetoapage,stillnameddictionary,withanexplanationofsuchan

entry. The mo del of the site represents the conditionalexistence of the list of

alternativeswithaConditionalEdgeobject asso ciated withtheedgelab elledb.

Thesiteoersthep ossibilitytoenteranewwordfromb oththedynamicpages

dictionaryand thesaurus,andallowsswitching fromdictionarytothesaurus

andviceversafromtheinitialandresult pages.

Inordertodetermineasetofpathstob eexercisedduringwhiteb oxtesting,

thepathexpressioniscomputed.Letusconsiderthep ortionofthesitedevoted

to the extraction of entries fromthe dictionary (symmetricconsiderations can

b e made for the thesaurus). The path expression asso ciated with the lab elled

edges of Figure 11 is a(bc+de)

. Some of thepaths generated fromit can b e

traversedonlyifprop erinputsareprovided,whilesomeotherpathsareinfeasible

for every input. For example, the path abc can b e traversed only if the input

wordhas morethan one entry in the dictionary. This condition is quite easy

(15)

selected fromthelistdisplayedtotheuser,thenext dynamicpagedictionary

that is obtainedwillnotinclude the listof alternative entriesanylonger. This

conditionisrepresentedas 'not (jump selected)'inFigure11:ifaselection

was p erformed by the user, edge b do es not exist. As a consequence the path

expressioncannotb eeasily exploitedforpathgeneration.

dictionary2

dictionary1

:Form

input = {va}

:Form

input = {jump}

:ConditionalEdge

:ConditionalEdge

:Form

input = {va}

a

b

c = ’#entries(va) > 1’

d

c

e

h

g

f

c = ’#entries(va) = 0 or 1’

Fig.12.Pagedictionarywasunrolled intodictionary1anddictionary2.

The problemhighlighted ab ove derives fromthe p ossibility to use asame

dynamic page for dierent purp oses. Actually, some Web sites consist of just

one dynamic page which displays dierent information according to an

inter-nallyrecorded state oftheinteraction.Inotherwords,whileforstaticsitesthe

Web page is coincident with the state of the interaction, this is not

necessar-ily true with dynamic sites. A p ossible solution to this problemis to p erform

an op eration of state unrollingon thedynamicpages that are used to display

dierentcontents underdierent conditions.IntheexampleofFigure 11, page

dictionary is used for two purp oses: to prop ose a list of alternative

dictio-naryentries and toprovidethenal result ofthesearch,once asingleentry is

identied. Such two purp oses mayb e represented explicitly by the two pages

dictionary1and dictionary2intowhichtheinitialpageisunrolled (see

Fig-ure 12). Theconditions intheConditionalEdge objects arenowsimpliedand

verify only the numb er of dictionary entries. The path expression of such site

p ortionb ecomes(ac+bf+bhg c)(dc+ef+ehg c)

andallthepathsthatcanb e

generatedfromitarefeasible,providedthataninputwordwiththeappropriate

numb erofdictionaryentriesis selected.

6 Conclusions and Future Work

Weprop osedsomeanalysisandtestingtechniquesworkingonWebapplications.

Thestartingp ointfortheirdenitionisamo delofWebsites,designedtoinclude

all characteristics that are relevant from an architectural p oint of view. Page

downloadingand mo del construction were achieved by providinginput values

(16)

Facilitiesforthedisplayoftheresultingmo delareprovided bytheanalysis

to olReWeb, whiletest generation and execution isautomated byTestWeb.

Ourexp erience withReWebandTestWebsuggeststhat thechoiceofago o d

mo del is fundamental for b oth analysis and testing. We showed some views

and facilitiesof ReWeb on a real example (www.ubicum.it).Path testing in

presence of aninternalrepresentation of theinteractionstate can b esimplied

bymeansofastateunrollingop eration,whichwasalsodescrib ed withreference

toarealworldexample(www.m-w.com).

Ourfuture work will b e devoted to extending theset of analyses available

(to include,for example, pattern matching),adding abstraction techniques to

supp ortahighlevelviewofthesite,(partially)automatinginputselectionduring

testing in presence of conditional edges and providing b etter supp ort to state

unrolling.

References

1. A. V. Aho,R. Sethi, and J.D. Ullman. Compilers.Principles,Techniques,and

Tools. Addison-Wesley Publishin g Company,Reading,MA,1985.

2. B.Beizer.SoftwareTestingTechniques,2ndedition.InternationalThomson

Com-puterPress,1990.

3. G.Bo o ch,J.Rumbaugh,andI.Jacobson. TheUniedModelingLanguage{User

Guide. Addison-Wesley Publishin gCompany,Reading,MA,1998.

4. J. Conallen. BuildingWeb Applicationswith UML. Addison-Wesley Publishing

Company,Reading,MA,2000.

5. D.Eichmann.Evolvinganengineeredweb.InProc.oftheInternationalWorkshop

onWebSiteEvolution,Atlanta,GA,USA,Octob er1999.

6. E.R.Gasner,E.Koutsoos,S.North,andKiem-PhongVo.Atechniquefordrawing

directedgraphs. InIEEE-TSE1993,March1993.

7. Chien-HungLiu,David C.Kung,PeiHsia,andChih-TungHsu. Structuraltesting

ofwebapplications . InProc.ofISSRE2000,Internationalsymposiumonsoftware

reliabilityengineering,SanJose,California,pages84{96,Octob er2000.

8. EdwardMiller. Thewebsitequality challenge.-companionpap er:"website

test-ing".InProc.ofQW1998conference,901MinesotastreetSanFrancisco,CA94107

USA,1998.

9. Rob ertC.MillerandKrishnaBharat.Sphinx:Aframeworkforcreatingp ersonal,

site-sp ecic web-crawlers. InProc.ofWWW7,BrisbaneAustralia,April1998.

10. F. Ricca and P. Tonella. Visualizati on of website history. In Proc. of the

In-ternationalWorkshop on Web SiteEvolution, pages 30{33, Zurich,Switzerland,

2000.

11. F.RiccaandP.Tonella.Websiteanalysis:Structureandevolution. InProceedings

oftheInternationalConferenceonSoftwareMaintenance,pages76{86, SanJose,

California,USA,2000.

12. P.Warren,C.Boldyre,andM.Munro.Theevolution ofwebsites. InProc.ofthe

InternationalWorkshopon Program Comprehension,pages 178{185,Pittsburgh,

PA, USA,May1999.

13. Y. Zou and K. Kontogiannis. Enabling technologies for web-based legacy

sys-tem integration. InProc.ofthe InternationalWorkshop onWebSite Evolution,

Figure

Fig. 2. Roles of Reweb and TestWeb.
Figure 2. ReW eb downloads and analyzes the pages of a Web application with
Fig. 4. Pseudo-co de of the Spider.
Fig. 5. Pseudo-co de of the 
ow analysis algorithm.
+7

References

Related documents

Madeleine’s “belief” that she is Carlotta Valdez, her death and rebirth as Judy (and then Madeleine again) and Scottie’s mental rebirth after his breakdown.. All of these

The projected gains over the years 2000 to 2040 in life and active life expectancies, and expected years of dependency at age 65for males and females, for alternatives I, II, and

Marie Laure Suites (Self Catering) Self Catering 14 Mr. Richard Naya Mahe Belombre 2516591 [email protected] 61 Metcalfe Villas Self Catering 6 Ms Loulou Metcalfe

Application Procedure, The Academic Year, Structure of Programmes, Course Types, How to Look for Courses, ECTS, Thesis and Project Work, Language Proficiency, Contacts.. PLAN YOUR

On Tenerife, a spectral camera with an objective grating is mounted under an angle of 14 ◦ such that the first-order spectrum of meteors recorded by the zero-order camera is

Fédéli, “High efficiency diffractive gra- ting couplers for interfacing a single mode optical fiber with a nano- photonic silicon-on-insulator waveguide circuit,” Appl. Zhou,

In view of the importance of aryloxy tetrazoles and aryloxy imidoylazides [38-43], we want to report a facile, effective and less hazard method for synthesis the 5-aryloxy

The ethno botanical efficacy of various parts like leaf, fruit, stem, flower and root of ethanol and ethyl acetate extracts against various clinically