Automatic derivation of software performance models from CASE documents☆
V. Cortellessa∗, A. D’Ambrogio, G. Iazeolla
Department of Computer Science, S&P, University of Roma “Tor Vergata”, 110 Via di Tor Vergata, I-00133 Rome, Italy
Abstract
Lifecycle validation of software performance (that is, prediction of the product's ability to satisfy the user's performance requirements) is based on the automatic derivation of software performance models from CASE documents or rapid prototypes.
This paper deals with the CASE document alternative. After a brief overview of existing automatic derivation methods, it introduces a method that unifies existing techniques that use CASE documents.
The method is step-wise clear, can be used from the early phases of the software lifecycle, is distributed-software oriented, and can be easily incorporated into modern (e.g., UML-based) CASE tools.
The method enables a software designer with no specific knowledge of performance theory to predict, at design time, the performance of alternative versions of the final product. The designer only needs to feed the CASE documents into the performance model generator.
The paper presents an application case study on the development of distributed software, in which the method is used to predict the performance of the different distributed architectures the designer could select at preliminary design time to obtain the best-performing final product.
The method can be easily incorporated into modern object-oriented software development environments to encourage software designers to introduce lifecycle performance validation into their development best practices. © 2001 Elsevier Science B.V. All rights reserved.
Keywords: Software performance; Software validation; CASE tools
1. Introduction
Software quality is not an add-on feature of software products. In other words, it cannot be introduced by retrofit actions (when the product has already been developed), but has to be built-in across the software development process [11].
☆ Work partially supported by funds from the MURST project on Software Architectures and Languages to Coordinate Distributed Mobile Components, from the University of Roma “Tor Vergata” research on the Performance Validation of Advanced Systems, and from the University of Roma “Tor Vergata” CERTIA Research Center.
∗Corresponding author.
E-mail addresses:[email protected] (V. Cortellessa), [email protected] (A. D’Ambrogio), [email protected] (G. Iazeolla).
0166-5316/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0166-5316(01)00036-0
Fig. 1. Strategy scheme for lifecycle validation of software performance [11].
Software performance is one of the main attributes of software quality. The approach of introducing performance by retrofit actions has been termed the fix-it-later approach in [3], where the many problems this approach creates have been documented.
Lifecycle validation of software performance is the process of predicting (at the early phases of the lifecycle) and evaluating (at the end) the ability of the software product to satisfy the user performance goals. To implement lifecycle validation of software performance, the strategy scheme in Fig. 1 has to be applied to each phase of the development lifecycle [11]. It applies to lifecycle artifacts. The term artifact is conventionally used to mean either the final software program or an intermediate version of it. The requirement document, the analysis document, the design documents, etc. are examples of artifacts.
When applied to the ith phase, the scheme assumes that the previously validated (i−1)th phase artifact is received as input. On its basis, a tentative phase i artifact is produced and validated in the PV block by comparing the predicted performance of the product with the user performance goals. If the predicted performance is unacceptable and it is cost-effective to do so, a new tentative artifact is produced for better performance. Otherwise (i.e., if it is not cost-effective to produce a new tentative artifact), a feedback loop to the user goals takes place, for goals revision.
Validation in the PV block involves three activities:
1. performance model production;
2. performance model evaluation;
3. comparison of evaluation results with user performance goals.
The first activity deals with the derivation of a performance model from existing development documents (e.g., analysis documents, design documents or source code). However, manually translating standard software documents into a performance model requires considerable effort and expertise, and this usually makes model development unattractive for software developers.
Only a few methods have been developed for the automatic generation of performance models [4,8–10,12,14].
In some cases, as for example in [4,10,12], they do not give the detailed step-wise algorithms for model generation.
In some other cases, as for example in [9], they are tailored to the detailed design stage of the software lifecycle (while in many circumstances performance predictions at earlier development phases, such as analysis and preliminary design, are important to system developers).
In still other cases, as for example in [4,7,9,14], they do not explicitly model distributed software (e.g., they do not model the performance of software servers).
In yet other cases, as for example in [8], they are based on a different approach, which makes use of executable rapid prototypes of the software product.
This paper gives a unified view of performance model generation methods based on non-executable software documents and includes a method that overcomes some difficulties of existing methods. In particular, it introduces a model generation method which:
• is step-wise clear from the algorithmic point of view;
• can be used from the early phases of the software lifecycle, such as analysis and preliminary design;
• does not require executable prototypes;
• includes features to also deal with distributed software;
• can be easily incorporated into modern (e.g., UML-based) CASE tools, among which System Architect from Popkin, Rose from Rational and COOL:Jex from Sterling.
This paper is organized as follows. Section 2 describes the general method for generating a performance model from software development documents. Section 3 gives details of the method application by use of a case study. The generated performance model can be of two types: the extended queuing network (EQN) type [1] or the layered queuing network (LQN) type [6]. In this paper, the LQN type is illustrated. Finally, in Section 4 the issues relevant to the method application are discussed, with details of the method complexity and size and an example of model use to obtain performance metric values.
2. Method outline
The assumption is made that the software is developed according to an object-oriented approach, by use of UML notation [2,5].
The method described here generates a performance model from object-oriented analysis (OOA) documents produced with a UML-based CASE tool. The method also assumes that some documents are available at object-oriented design (OOD) time, when performance aspects of distributed software are also to be considered. The method is illustrated in Fig. 2. The top area refers to input data to the performance model generator. The top-left part relates to input documents that are available at analysis time (class diagram CD, set of sequence diagrams SD1 through SDn, platform data PD, and operational profile OP, which includes user workload data). In the case of distributed software, the top-right part is also used; it relates to input documents that are available at preliminary design time (software modules architecture (SW), client/server structure (C/S), and modules-platform mapping (MP)).
Fig. 2. Method outline for model generation.
The remaining blocks in Fig. 2 relate to the steps of the performance model generation, which can be summarized as follows:
Step 1 generation of the set of precedence graphs (PG1 through PGn), each graph relevant to a specific scenario (object interactions). Such graphs are also called scenario PGs;
Step 2 generation of the global precedence graph (global PG), which merges the scenario PGs and represents the behavior of the whole system;
Step 3 generation of the extended flow graph (EFG). This generation is based on the global PG, the workload data (from the OP document), the SW modules architecture and the C/S structure. The EFG is a standard flow graph with extensions to represent client/server structures;
Step 4 generation of the so-called extended execution graph (EEG) on the basis of the platform configuration (PC) from the PD document (see Fig. 2), the EFG, experience data and the MP, as discussed in Section 3.4;
Step 5 generation of the LQN performance model on the basis of the EEG and resource capacity data (RC) from the PD document (see Fig. 2), as discussed in Section 3.5.
The LQN model so obtained is then evaluated by use of an LQN evaluation tool [15]. Example results of such an evaluation are given in Section 4.2.
3. Method description
A constructive description of the method is given here through an application case that deals with the development of a distributed software system called mail system. The system lets users download mail messages in bulk from external sources. A database stores log records of each user operation, and is updated by the system when the user closes the session [16].
There are two categories of users: administrators, who are allowed to download both the log records and the mail messages, and standard-users, who are allowed to download only mail messages. Two scenarios are thus defined for this system, each one referring to a particular category of users.
3.1. Inputs from the analysis phase artifacts
According to the OP document, administrators are 10% of total users and download one mail message per session, while standard-users are 90% of total users and download three mail messages per session.
Table 1
PC devices list
| Device | Name | Description |
|---|---|---|
| 1 | T | Generic peripheral terminal of the user workstation |
| 2 | UWS | CPU of the user workstation |
| 3 | UWS-DB | DB unit of the user workstation |
| 4 | HWS-A | CPU of the HW server A |
| 5 | HWSA-DB | DB unit of the HW server A |
| 6 | HWS-B | CPU of the HW server B |
| 7 | HWSB-DB | DB unit of the HW server B |
| 8 | LAN | Local area network |
| 9 | WAN | Wide area network |
A mail system process is run for each user logged on to the user workstation. N users are logged on simultaneously, and their think time is 10 s on average. Each user must be authenticated in order to get access to the mail system.
It is assumed that, according to the PD document, the mail system PC is the one in Fig. 3. It consists of one User Workstation, with a number of peripheral terminals operated by users, and two servers, HW server A and HW server B. The user workstation is connected to HW server A by the LAN and to HW server B by the WAN. The WAN also directs the external mail messages to HW server A or B according to design-time decisions (see Section 3.4). The user workstation and the hardware servers each have their own DB device, as described in Table 1, which gives the complete set of PC devices in Fig. 3.
Additional artifacts from the object-oriented analysis phase are the CD and the set of SDs, on whose basis the PGs of Step 1 can be generated as described below.
3.2. Generation of precedence graphs
Precedence graphs (PGs) are used to identify the execution flow and the interaction among system actors. This section describes the generation of a PG from data provided by the OOA, more specifically from the CD and the set of SDs. This step of the method consists of two sub-steps:
• the generation of a scenario PG for each scenario (Section 3.2.1);
• the generation of a global PG for the overall system, obtained by merging all scenario PGs (Section 3.2.2).
3.2.1. Generation of the scenario PGs
The OOA provides the diagrams used for the generation of scenario PGs. Such diagrams are the CD and the set of SDs. The CD depicts the static structure of the system in terms of classes, with attributes and methods, and relationships between classes. Class methods are the main elements used for the scenario PG generation. Fig. 4 describes the CD of the considered mail system, consisting of four classes:
• the User class, which refers to the users of the mail system;
• the System class, with the mail session and verify user methods;
• the Mail Mgr class, which manages the Mail DB, with the read mail method to download a single mail message;
• the History Mgr class, which manages the History DB, with the view log and update log methods.
Fig. 4. Class diagram of the considered mail system.
While the CD refers to the overall system, SDs are diagrams that refer to a particular scenario. An SD shows the interaction in time sequence among objects (instances of the given classes) in the scenario: the vertical dashed lines represent the lifelines of the objects (time progresses top-down) and the horizontal arrows represent control transfer (or object interactions) from the sender object to the receiving object. Each UML-type activation box [5] between an incoming arrow and the subsequent outgoing arrow of the SD graphically represents the code executed by a class method.
Fig. 6. SD for the administrator scenario.
Only two SDs are illustrated here, given in Figs. 5 and 6 for the standard-user scenario and the administrator scenario, respectively.
In the administrator scenario, for example (see Fig. 6), the User object initially calls the mail session method of the System object. The System executes the method by first verifying the user rights and then by calling the view log method of the History Mgr object and the read mail method of the Mail Mgr object. Finally, the System returns control to the User object and, at the same time, performs a call to the update log method of the History Mgr object. Such a call is asynchronous, so there is no return arc to the System object.
From each SD a precedence graph (PG), also called “scenario PG”, can be derived (see Figs. 7 and 8). This is done by introducing a PG node for each method call or method execution in the SD.
PG nodes are vertically grouped into subsets, each for an actor corresponding to a specific SD object. Nodes are labeled by U, S, M and H to denote they belong to the User actor, to the System actor, to the Mail Mgr actor and to the History Mgr actor, respectively.
PG arcs give the transfer of control between PG nodes. Arcs with the call label represent synchronous calls, which expect a return arc. Arcs with the goto label represent asynchronous calls, which have no return arcs.
Each PG node can be a standard node (representing a method CALL or a method EXECUTION) or a special node (SPLIT, LOOP, NOP, etc.).
The correspondence between the method calls in Figs. 5–8 is illustrated in Table 2, where the LOOP node S1 represents the three-fold repetition of visits to nodes S3 and M0, corresponding to the sequence of three calls to the read mail method of the Mail Mgr object in Fig. 5, the SPLIT node S4 represents the simultaneous transfer of control to nodes U1 and H1, and the NOP node U1 is used to represent the user think time. The EXECUTION nodes (S0, M0, H0, H1) and the CALL nodes (U0, S2, S3, S5) are self-explanatory.
Fig. 7. PG for the standard-user scenario of Fig. 5.
Table 2
SD–PG correspondence
| PG node | SD element |
|---|---|
| U0 | CALL of System:mail session |
| U1 | NOP |
| S0 | EXECUTION of System:verify user |
| S1 | LOOP:3 |
| S2 | CALL of History Mgr:view log |
| S3 | CALL of Mail Mgr:read mail |
| S4 | SPLIT |
| S5 | CALL of History Mgr:update log |
| M0 | EXECUTION of Mail Mgr:read mail |
| H0 | EXECUTION of History Mgr:view log |
| H1 | EXECUTION of History Mgr:update log |
Fig. 9. Merge operation.
3.2.2. Generation of the global PG
The global PG represents the behavior of the overall system and is obtained by combining the specific scenario PGs. The combination is performed by merging the specific PGs according to the following operation.
Let PG(h) be the result of merging PG(i) and PG(j), in symbols:
PG(h) := PG(i) ⊕ PG(j)
where ⊕ denotes the merge operation. Then PG(h) is the precedence graph consisting of four sub-graphs (g1, g2, g3, g4) connected by a CASE node (see Fig. 9), with g1 being the entrance sub-graph to the CASE, g2 and g3 the alternative sub-graphs and g4 the continue sub-graph of the CASE node.
Sub-graphs g1 through g4 are defined as follows:
• g1 is the sub-graph (if any) common to the initial parts of PG(i) and PG(j) (for example the (U0, S0) sub-graph in Figs. 7 and 8);
• g2 and g3 are the immediately following sub-graphs in PG(i) and PG(j) which are structurally different (for example g2 = (S1, S3, M0) in Fig. 7 and g3 = (S2, H0, S3, M0) in Fig. 8);
• g4 is the final sub-graph (if any) common to PG(i) and PG(j) (for example the (S3, S4, U1//H1) sub-graph in Figs. 7 and 8).
Based on the merge operation defined above, the global PG, denoted gPG, is obtained from the specific scenario PGs (PG(1), PG(2), ..., PG(n)) according to the following algorithm:
Global PG generation algorithm
Begin
    gPG := PG(1)
    For i = 2, ..., n ⇒ gPG := gPG ⊕ PG(i)
End
Fig. 10 shows the global PG that results from the merge of the specific scenario PGs of Figs. 7 and 8.
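The merge operation and the global PG generation algorithm can be illustrated by a minimal Python sketch. It rests on a strong simplifying assumption that is not part of the method: each scenario PG is linearized into a tuple of node labels, with a LOOP node and its body kept as a single nested element. The function names, the tuple encoding and the two example sequences (loosely based on the nodes of Table 2) are hypothetical and serve only to show the g1/g2/g3/g4 decomposition and the left-to-right fold.

```python
from functools import reduce

def merge(pg_i, pg_j):
    """Merge two linearized PGs into g1 + CASE(g2 | g3) + g4."""
    if pg_i == pg_j:
        return pg_i  # identical scenarios need no CASE node
    shortest = min(len(pg_i), len(pg_j))
    # g1: longest common initial part
    p = 0
    while p < shortest and pg_i[p] == pg_j[p]:
        p += 1
    # g4: longest common final part that does not overlap g1
    s = 0
    while s < shortest - p and pg_i[-1 - s] == pg_j[-1 - s]:
        s += 1
    g1 = pg_i[:p]
    g2 = pg_i[p:len(pg_i) - s]   # alternative taken by PG(i)
    g3 = pg_j[p:len(pg_j) - s]   # alternative taken by PG(j)
    g4 = pg_i[len(pg_i) - s:]
    return g1 + (("CASE", g2, g3),) + g4

def global_pg(scenario_pgs):
    """gPG := PG(1); then gPG := gPG (+) PG(i) for i = 2, ..., n."""
    return reduce(merge, scenario_pgs)

# Hypothetical linearizations of the two scenario PGs (illustrative only).
PG_STANDARD = ("U0", "S0", ("S1:LOOP3", ("S3", "M0")), "S4", "S5", "U1//H1")
PG_ADMIN = ("U0", "S0", "S2", "H0", "S3", "M0", "S4", "S5", "U1//H1")

gpg = global_pg([PG_STANDARD, PG_ADMIN])
print(gpg)
```

In this encoding the common initial part (U0, S0) becomes g1, the loop and the administrator-only sequence become the two CASE alternatives, and the shared tail becomes g4. A faithful implementation would operate on graphs rather than on sequences.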
Table 3
Workload data
| Parameter | Value |
|---|---|
| Number of users | N |
| Think time | 10 s |
| % Administrators | 10% |
| % Standard-users | 90% |
3.3. Generation of the EFG
According to Fig. 2, the EFG is the graph generated at Step 3 of the method, which prescribes the following inputs to the EFG generation:
1. the global PG obtained at Step 2 of the method;
2. the workload data (from the OP document);
3. the SW modules architecture;
4. the C/S structure.
Input 1 has been described in Section 3.2.2. Input 2 describes the workload imposed by the users, consisting of:
2.1. number of users,
2.2. users’ think time,
2.3. percentage of different user categories,
as illustrated in Table 3 for the example case study.
Input 3 is the SW modules architecture given by the structure chart generated at preliminary design time, as illustrated in Fig. 11 for the example case study. Comparing Figs. 4 and 11 shows that, in the example case study, a one-to-one mapping is assumed between classes from the CD and software modules in Fig. 11 (for more complex systems, various alternative mappings can be found), except for the User class, which is assumed not to denote a specific software component. Nevertheless, the global PG node U0 will be translated into an EFG block to be considered as the user think time block.
Input 4 is a description of which modules play the role of clients and which the role of servers. According to Table 4, in the example case study it is assumed that the System module plays the role of client, while the Mail Mgr and History Mgr modules play the role of servers (see below for the meaning of the multiplicity data).
Table 4
C/S structure
| Module | Role | Multiplicity |
|---|---|---|
| System | Client | – |
| Mail Mgr | Server | 2 |
| History Mgr | Server | 1 |
An EFG is a graph semantically equivalent to the global PG but notationally different. It is an enrichment of the standard flow graph (FG) introduced in [3] for SPE. According to [3], an FG is a graph whose nodes are called blocks, which are of various types: basic blocks, expanded blocks, repetition blocks, case blocks, split blocks, fork-join blocks, lock-free blocks and share blocks. The semantics of such blocks is largely self-explanatory; the reader is referred to [3] for details.
The EFG we introduce enriches the mentioned FG blocks with a so-called software server block, whose semantics is illustrated in Fig. 12, where the number inside the left diamond denotes the degree of multiplicity, in other words the number of simultaneous users admitted to the block. Each software server block may contain nested software server blocks.
The translation of a global PG into an EFG is performed by applying the following set of rules:
• each SPLIT node of the global PG is converted into an EFG split block;
• each LOOP node of the global PG is converted into an EFG repetition block;
• each CASE node of the global PG is converted into an EFG case block;
• each node of the global PG belonging to an actor (object) that, according to the C/S structure of Table 4, plays the role of client is converted into an EFG basic block (i.e., a plain rectangular block). Examples of such nodes are nodes U0, S0, S2, S3, S5 of the global PG;
• each node of the global PG belonging to an actor (object) that, according to the C/S structure of Table 4, plays the role of server is converted into an EFG software server block (i.e., a rectangular block with diamond sides). Examples of such nodes are nodes M0, H0, H1 of the global PG.
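The translation rules above amount to a lookup on the node kind and on the role of the actor that owns the node. The following Python sketch illustrates this under stated assumptions: the function name `efg_block` and the dictionaries are hypothetical, and the User actor is treated as a client because the rules list U0 among the nodes converted into basic blocks (Table 4 itself does not list User).

```python
# C/S roles from Table 4; the "User" entry is an assumption (see lead-in).
ROLES = {
    "User": "client",
    "System": "client",
    "Mail Mgr": "server",
    "History Mgr": "server",
}

# Special PG nodes map directly to the EFG block of the same meaning.
SPECIAL = {"SPLIT": "split", "LOOP": "repetition", "CASE": "case"}

def efg_block(kind, actor):
    """Return the EFG block type for a global PG node."""
    if kind in SPECIAL:
        return SPECIAL[kind]
    # CALL and EXECUTION nodes are translated according to the actor's role.
    return "basic" if ROLES[actor] == "client" else "software server"

# A few global PG nodes taken from Table 2: (label, kind, actor).
NODES = [
    ("U0", "CALL", "User"),
    ("S0", "EXECUTION", "System"),
    ("S1", "LOOP", "System"),
    ("S3", "CALL", "System"),
    ("S4", "SPLIT", "System"),
    ("M0", "EXECUTION", "Mail Mgr"),
    ("H1", "EXECUTION", "History Mgr"),
]

efg = {label: efg_block(kind, actor) for label, kind, actor in NODES}
for label, block in efg.items():
    print(label, "->", block)
```

Checking the special kinds first reflects the order of the rules: a LOOP or SPLIT node owned by a client is still a repetition or split block, not a basic block.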
Fig. 13 gives a view of the EFG obtained from the global PG in Fig. 10. The alternative branches leaving the CASE block have associated branching probabilities of 0.1 and 0.9 for administrators and standard-users, respectively, as obtained from the user workload in Table 3.
Fig. 13. Extended flow graph (EFG).
The blocks on the 0.1 branch correspond to nodes S2, H0, S3 and M0 of the global PG in sequence, while the blocks on the 0.9 branch correspond to global PG nodes S3 and M0 connected by the repetition block S1.
Finally, the blocks that follow the CASE block correspond to nodes S4, S5 and H1 of the global PG, where S4 is the SPLIT node converted into a split block.
The Mail Mgr server multiplicity of 2 gives the server the capability to accommodate two simultaneous executions (e.g., two standard-users or one standard-user and the administrator).
According to the nesting rule of the software server block, the outer thin-line software server block in Fig. 13 denotes the execution of the System mail session method and illustrates the fact that all included blocks play the role of server for the User client block U0. Since a System is run for each user (according to the OP document described in Section 3.1), the degree of multiplicity of such a software server block is N, i.e., the number of users from Table 3.
Table 5
Modules-platform mapping
| Platform resource | SW modules (MP1 alternative) | SW modules (MP2 alternative) |
|---|---|---|
| User workstation | System | System |
| HW server A | History Mgr | Mail Mgr |
| HW server B | Mail Mgr | History Mgr |
3.4. Generation of the EEG
An EEG is derived directly from the EFG obtained above. It is a weighted graph in which each EFG block is associated with a resource demand vector:
d = (d1, d2, ..., dn)
where di is the number of elementary operations the block demands of device i of the PC. Examples of di are the number of program statements executed by the CPU, the number of accesses to the DB and the number of accesses to the LAN and/or WAN.
As already outlined in Section 2 (see Fig. 2), the EEG is obtained at Step 4 of the method, on the basis of the PC (see Fig. 3), of the EFG (see Fig. 13), and on the basis of MP data and experience data.
Modules-platform mapping (MP) data are decisions on which software module to allocate to which PC device. Example mapping decisions are the two alternatives MP1 and MP2 described in Table 5. In MP1, the incoming mail messages are collected by the HWSB, while they are collected by the HWSA in MP2. In the following, alternative MP1 will be assumed.
Experience data are estimates of the values taken by the components di, based on data from similar systems.
Table 6 describes the results of such an estimation for the di of the various blocks illustrated in Fig. 13. Each row in Table 6 gives the resource demands d1 through d9 submitted by each block (U0 through H1) of the EFG. Each di is expressed in terms of number of accesses (acc) or number of statements (stat). Row S3, for example, says that the demand from the call Mail Mgr.read mail block is for 10 UWS statements, one LAN access, one WAN access and zero elsewhere. The resulting EEG is the weighted graph illustrated in Fig. 14, where each block has the associated demand vector obtained from Table 6.
3.5. Derivation of the LQN model
According to [6], an LQN model is a high-level abstraction of queuing network models introduced for modeling the contention for software servers and hardware devices.
The basic LQN components are called entities. An LQN includes an entity for each software task (client or server) and an entity for each processor or hardware device (see Fig. 15).
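Table 6
Resource demands (resource demand vector of each EFG block)
| EFG block | T (acc) | UWS (stat) | UWS-DB (acc) | HWSA (stat) | HWSA-DB (acc) | HWSB (stat) | HWSB-DB (acc) | LAN (acc) | WAN (acc) |
|---|---|---|---|---|---|---|---|---|---|
| (U0) Call System.mail session | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| (S0) System.verify user | 0 | 100 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| (S2) Call History Mgr.view log | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| (S3) Call Mail Mgr.read mail | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| (S5) Call History Mgr.update log | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| (M0) Mail Mgr.read mail | 0 | 0 | 0 | 0 | 0 | 150 | 1 | 0 | 0 |
| (H0) History Mgr.view log | 0 | 0 | 0 | 100 | 1 | 0 | 0 | 0 | 0 |
| (H1) History Mgr.update log | 0 | 0 | 0 | 200 | 2 | 0 | 0 | 0 | 0 |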
Software tasks in Fig. 15 are connected by directed edges that represent method calls, going from client tasks to server tasks. Such edges are labeled by numbers denoting the average number of calls. In a similar way, hardware devices are connected to software tasks by boldface directed edges called device calls and labeled by numbers denoting the number di of elementary operations the task demands of device i. Tasks may be labeled by numbers denoting the degree of multiplicity of the task. When omitted, the degree of multiplicity is assumed to be 1. The numeric values associated with directed edges are called LQN parameters. Each software task has associated entries, each denoting the part of the task that provides a certain service (analogous to object methods). For more details the reader is referred to [6].
Fig. 15. LQN components.
This section illustrates the generation of the LQN performance model of Step 5 of the method outlined in Section 2. The algorithm to generate an LQN model from an EEG consists of two steps:
• generation of the LQN,
• parameterization of the LQN,
described in Sections 3.5.1 and 3.5.2, respectively.
3.5.1. Generation of the LQN
In this step the generation of an LQN without parameters is illustrated for the example case study. An LQN model is generated from an EEG by applying the following set of rules:
• one software task is generated for each type of software server block (i.e., System, Mail Mgr, History Mgr). One software task is also generated for each type of client block (i.e., User);
• the degree of multiplicity of each server task is obtained from the degree of multiplicity of the corresponding software server block. For client tasks, the degree of multiplicity is obtained from workload data (Table 3, see later);
• entries for each task are obtained by visiting the EEG and, for each pair of client/server blocks (i.e., a calling basic block followed by a software server block), the name of the called method and the name of the calling method become the entry name in the server task and the entry name in the client task, respectively;
• method calls are obtained by drawing a directed edge from the calling entry (of the client task) to the called entry (of the server task);
• devices are derived from the platform configuration devices (Table 1);
• device calls are obtained by drawing boldface directed edges from each entry to each device for which a non-zero value is specified in the entry resource demand vector (Table 6).
Fig. 16 describes the resulting LQN model for the example case study of Fig. 14. According to the set of rules described above, the following tasks, and entries for each task, have been identified and introduced in the LQN model:
• a User client task, with a dummy entry user entry, corresponding to the U0 block of the EEG, and with a degree of multiplicity of N, corresponding to the number of users in Table 3;
• a System server task, with a mail session entry, corresponding to the set of S0, S2, S3 and S5 blocks of the EEG, and with a degree of multiplicity of 10, as from Fig. 14;
• a History Mgr server task, with view log and update log entries, corresponding to the H0 and H1 blocks of the EEG, respectively, and with a degree of multiplicity of 1, as from Fig. 14;
• a Mail Mgr server task, with a read mail entry, corresponding to the M0 block of the EEG, and with a degree of multiplicity of 2, as from Fig. 14.
Fig. 16. LQN model for alternative MP1.
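The device-call rule can be illustrated with a short Python sketch: an entry is connected to every device for which at least one of its EEG blocks has a non-zero demand in Table 6. The demand vectors and the association of blocks to entries are taken from Table 6 and from the task list above; the names `TABLE_6`, `ENTRY_BLOCKS` and `device_calls` are hypothetical.

```python
DEVICES = ["T", "UWS", "UWS-DB", "HWSA", "HWSA-DB", "HWSB", "HWSB-DB", "LAN", "WAN"]

# Resource demand vector of each EEG block (values copied from Table 6).
TABLE_6 = {
    "U0": [1, 5, 0, 0, 0, 0, 0, 0, 0],
    "S0": [0, 100, 1, 0, 0, 0, 0, 0, 0],
    "S2": [0, 10, 0, 0, 0, 0, 0, 1, 0],
    "S3": [0, 10, 0, 0, 0, 0, 0, 1, 1],
    "S5": [0, 10, 0, 0, 0, 0, 0, 1, 0],
    "M0": [0, 0, 0, 0, 0, 150, 1, 0, 0],
    "H0": [0, 0, 0, 100, 1, 0, 0, 0, 0],
    "H1": [0, 0, 0, 200, 2, 0, 0, 0, 0],
}

# (task, entry) -> EEG blocks associated with the entry.
ENTRY_BLOCKS = {
    ("User", "user entry"): ["U0"],
    ("System", "mail session"): ["S0", "S2", "S3", "S5"],
    ("History Mgr", "view log"): ["H0"],
    ("History Mgr", "update log"): ["H1"],
    ("Mail Mgr", "read mail"): ["M0"],
}

def device_calls(blocks):
    """Devices that receive a device call from an entry made of `blocks`."""
    return [
        dev
        for k, dev in enumerate(DEVICES)
        if any(TABLE_6[b][k] != 0 for b in blocks)
    ]

lqn_device_calls = {entry: device_calls(b) for entry, b in ENTRY_BLOCKS.items()}
for entry, devs in lqn_device_calls.items():
    print(entry, "->", devs)
```

The sketch only derives the topology of the device calls; the numeric labels on those edges are the subject of the parameterization step.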
Method calls, devices and device calls are self-explanatory.
3.5.2. Parameterization of the LQN
In this step the parameterization of the LQN is performed by associating values to directed edges representing method calls and device calls.
The value associated to a method call is called MCP (method call parameter), while the value associated to a device call is called DCP (device call parameter). The MCP gives the expected number of calls that the client task entry makes to the server task entry. The DCP gives the device demand of the task entry, expressed in different units (statements, bits, transactions) depending on the device type considered.
DCPs for each entry are first evaluated as follows:
1. let Ni be the number of EEG blocks associated to each entry i of the LQN model;
2. let dij be the resource demand vector of the EEG block j (see Table 6) associated to LQN entry i, where j = 1, ..., Ni;
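Table 7
vij and dij for the example case study
| LQN entry (i) | EEG block (j) | vij | T (acc) | UWS (stat) | UWS-DB (acc) | HWSA (stat) | HWSA-DB (acc) | HWSB (stat) | HWSB-DB (acc) | LAN (acc) | WAN (acc) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| User entry | U0 | 1 | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mail session | S0 | 1 | 0 | 100 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| | S2 | 0.1 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| | S3 | 2.8 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| | S5 | 1 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Read mail | M0 | 2.8 | 0 | 0 | 0 | 0 | 0 | 150 | 1 | 0 | 0 |
| View log | H0 | 0.1 | 0 | 0 | 0 | 100 | 1 | 0 | 0 | 0 | 0 |
| Update log | H1 | 1 | 0 | 0 | 0 | 200 | 2 | 0 | 0 | 0 | 0 |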
3. let vij be the mean number of visits to the EEG block j associated to LQN entry i, where j = 1, ..., Ni. This number represents the average number of times that an EEG block is passed through during one execution of the EEG, and is obtained by visiting the EEG while counting the number of visits to each block, weighted by the branching probabilities. Refer to [3] for details about methods to derive the mean number of visits to blocks of an execution graph;
4. the resource demand vector associated to each LQN task entry i, denoted by Di, is then obtained as:
$$D_i = \sum_{j=1}^{N_i} v_{ij}\, d_{ij}$$
5. the elements of vector Di are finally used to obtain the DCP values for entry i. These are obtained by expressing such elements in proper units, as from Table 11.² Thus, a number of statements in Di remains a number of statements in the corresponding DCP value, while numbers of accesses to DB and LAN/WAN devices become numbers of transactions and bits, respectively. The number of accesses to the peripheral terminal device in Di remains unchanged in the corresponding DCP value, since such a device is only used to represent the users’ think time in LQN models and does not appear as a device in Table 11.
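Step 4 is a visit-weighted sum of demand vectors. As a minimal illustration, the Python sketch below applies it to the vij and dij values of Table 7; the names `TABLE_7` and `entry_demand` are hypothetical, and results are rounded to six decimals only to hide floating-point noise.

```python
DEVICES = ["T", "UWS", "UWS-DB", "HWSA", "HWSA-DB", "HWSB", "HWSB-DB", "LAN", "WAN"]

# LQN entry -> list of (v_ij, d_ij) pairs, copied from Table 7.
TABLE_7 = {
    "user entry": [(1, [1, 5, 0, 0, 0, 0, 0, 0, 0])],            # U0
    "mail session": [
        (1,   [0, 100, 1, 0, 0, 0, 0, 0, 0]),                    # S0
        (0.1, [0, 10, 0, 0, 0, 0, 0, 1, 0]),                     # S2
        (2.8, [0, 10, 0, 0, 0, 0, 0, 1, 1]),                     # S3
        (1,   [0, 10, 0, 0, 0, 0, 0, 1, 0]),                     # S5
    ],
    "read mail": [(2.8, [0, 0, 0, 0, 0, 150, 1, 0, 0])],         # M0
    "view log": [(0.1, [0, 0, 0, 100, 1, 0, 0, 0, 0])],          # H0
    "update log": [(1, [0, 0, 0, 200, 2, 0, 0, 0, 0])],          # H1
}

def entry_demand(blocks):
    """D_i = sum over j of v_ij * d_ij, computed component by component."""
    total = [0.0] * len(DEVICES)
    for visits, demand in blocks:
        for k, d in enumerate(demand):
            total[k] += visits * d
    return [round(x, 6) for x in total]

D = {entry: entry_demand(blocks) for entry, blocks in TABLE_7.items()}
for entry, vec in D.items():
    print(entry, dict(zip(DEVICES, vec)))
```

For the mail session entry, for instance, the UWS component is 1·100 + 0.1·10 + 2.8·10 + 1·10 = 139 statements and the LAN component is 0.1 + 2.8 + 1 = 3.9 accesses.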
Table 7 gives the values of vij and dij for the example case study, while Table 8 gives the resulting Di, which are finally mapped to DCP values in Table 9. DCPs for each entry are obtained by assuming one transaction for each access to DB-type devices and a mean number of bytes transferred for each access to LAN/WAN devices, in particular:
• 2500 bytes for mail messages;
• 4000 bytes for view log records;
• 250 bytes for update log records.
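As a worked check of the conversion from accesses to bits, the Python sketch below recomputes the LAN and WAN DCP values of the mail session entry. It assumes 8 bits per byte and assumes that each calling block transfers the record type named in its call (S2: view log records, S3: mail messages, S5: update log records); this reading is ours, and the dictionary and function names are hypothetical.

```python
# Mean number of visits to the calling blocks (from Table 7).
VISITS = {"S2": 0.1, "S3": 2.8, "S5": 1}

# Mean bytes transferred per network access, by calling block (assumed mapping).
BYTES_PER_ACCESS = {"S2": 4000, "S3": 2500, "S5": 250}

# Calling blocks that access each network (non-zero LAN/WAN columns of Table 6).
LAN_BLOCKS = ["S2", "S3", "S5"]
WAN_BLOCKS = ["S3"]

def network_bits(blocks):
    """Visit-weighted bytes over the given blocks, converted to bits."""
    total_bytes = sum(VISITS[b] * BYTES_PER_ACCESS[b] for b in blocks)
    return round(8 * total_bytes)

lan_bits = network_bits(LAN_BLOCKS)
wan_bits = network_bits(WAN_BLOCKS)
print("LAN:", lan_bits, "bits; WAN:", wan_bits, "bits")
```

Under these assumptions the LAN demand is 8 · (0.1·4000 + 2.8·2500 + 1·250) = 61200 bits and the WAN demand is 8 · 2.8·2500 = 56000 bits.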
² It is understood that CPU statements are high-level language statements. In other words, it is assumed that the capacity takes into consideration the overhead introduced by the middleware and the operating system. In line with other authors, a CPU speed of 2–5 MIPS at machine level is assumed, with a ratio of high-level to machine instructions of 1/20 [3,13]. A similar assumption holds for DB requests and capacity. In line with other authors, an access time of 20–50 ms per request for I/O devices is assumed, with a ratio of DB requests to I/O requests of 1/1 [3,13].
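Table 8
Di for the example case study
| LQN entry (i) | T (acc) | UWS (stat) | UWS-DB (acc) | HWSA (stat) | HWSA-DB (acc) | HWSB (stat) | HWSB-DB (acc) | LAN (acc) | WAN (acc) |
|---|---|---|---|---|---|---|---|---|---|
| User entry | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mail session | 0 | 139 | 1 | 0 | 0 | 0 | 0 | 3.9 | 2.8 |
| Read mail | 0 | 0 | 0 | 0 | 0 | 420 | 2.8 | 0 | 0 |
| View log | 0 | 0 | 0 | 10 | 0.1 | 0 | 0 | 0 | 0 |
| Update log | 0 | 0 | 0 | 200 | 2 | 0 | 0 | 0 | 0 |

Table 9
DCP values for the example case study
| LQN entry (i) | T (acc) | UWS (stat) | UWS-DB (trans) | HWSA (stat) | HWSA-DB (trans) | HWSB (stat) | HWSB-DB (trans) | LAN (bits) | WAN (bits) |
|---|---|---|---|---|---|---|---|---|---|
| User entry | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mail session | 0 | 139 | 1 | 0 | 0 | 0 | 0 | 61200 | 56000 |
| Read mail | 0 | 0 | 0 | 0 | 0 | 420 | 2.8 | 0 | 0 |
| View log | 0 | 0 | 0 | 10 | 0.1 | 0 | 0 | 0 | 0 |
| Update log | 0 | 0 | 0 | 200 | 2 | 0 | 0 | 0 | 0 |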
The resulting DCP values for the example case study are the device call labels (5, 139, 1, etc.) in Fig. 17.
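The conversion of the Mail session network accesses into bits can be checked with a short sketch. The message sizes are those listed above; the split of the 3.9 LAN accesses into 2.8 mail, 0.1 view log and 1 update log accesses is an assumption, chosen because it is consistent with the Table 8 and Table 9 figures, and the names used are hypothetical.

```python
BITS_PER_BYTE = 8

# Mean message sizes in bytes, as listed in the text.
MSG_BYTES = {"mail": 2500, "view_log": 4000, "update_log": 250}

# Assumed split of the Mail session accesses by message type
# (an inference from Tables 8 and 9, not stated in the text).
lan_accesses = {"mail": 2.8, "view_log": 0.1, "update_log": 1.0}
wan_accesses = {"mail": 2.8}

def accesses_to_bits(accesses):
    """Convert mean access counts per message type into a mean bit count."""
    return sum(n * MSG_BYTES[kind] * BITS_PER_BYTE
               for kind, n in accesses.items())

lan_bits = accesses_to_bits(lan_accesses)
wan_bits = accesses_to_bits(wan_accesses)
```

Under this assumed split the sketch gives the 61200 LAN bits and 56000 WAN bits reported for the Mail session entry.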
MCPs are obtained, for each LQN entry, by use of the mean number of visits vij to each EEG calling block j associated to the LQN entry i. An EEG calling block is a basic block that represents a method call to a software server block. The mean number of visits to such a block corresponds to the number of calls from the LQN entry associated to the EEG calling block to the LQN entry associated to the EEG software server block.
The resulting MCP values for the example case study are represented by method call labels (1, 0.1, 2.8, 1) in Fig. 17. Their values are obtained from the vij values in Table 7 for blocks U0, S2, S3 and S5, respectively.
4. Application issues
This section gives a few details on the application of the method and on its size and complexity, and concludes with an example of model use.
4.1. Method size and complexity
It is of interest to the software engineer to have some details about the amount of work to be done to translate the software project into a performance model.
We shall assume the designer knows the project size, expressed in:
• number of scenarios, each represented by a sequence diagram (SD) (see Section 3.2.1);
• a detailed view of each SD, including number of interactions (horizontal arrows) and number of class methods (UML-type activation boxes) (see Figs. 5 and 6);
• total number of devices in the PC (see Section 3.1).
Let k be the number of scenarios, and let mj be the number of methods and ij the number of interactions in scenario j (j = 1, ..., k). Then
$$M = \sum_{j=1}^{k} m_j \quad\text{and}\quad I = \sum_{j=1}^{k} i_j$$
will denote the total number of methods and interactions in the project, respectively. Besides, let r be the total number of devices.
It is easy to see that the global PG consists of O(M) nodes and O(I) arcs (see Sections 3.2.1 and 3.2.2) and thus its space complexity is O(M), since I and M are of the same order. On the other hand, O(M) also gives the time complexity for the generation of the global PG, since M methods in the SDs are to be visited to generate a corresponding number of PG nodes.
In summary, O(M) is the time and space complexity of Steps 1 and 2 of the method (see Section 2). Step 3 is again O(M), since according to Section 3.3 it is simply performed by a one-to-one translation of each global PG node into an EEG node. Step 4 requires somewhat more work, since it requires the production of an r-sized vector for each EEG node, i.e., O(rM) work. This is again O(M), r being in general much lower than M.
Step 5 of the method finally requires again O(M) both in space and time. Indeed, let e denote the number of entries of the resulting LQN (see Section 3.5); such a number is nothing other than the total number of methods in the class diagram, which is obviously not larger than M. The space complexity of the LQN is by definition the space to accommodate e + r + c items, where r is the above defined number of devices and c the number of LQN method plus device calls. It is easy to see that c is not larger than M + er, and thus the total LQN space complexity is O(M), as stated above. On the other hand, the time to perform the LQN generation (see Section 3.5.1) is the time to visit the O(M) nodes of the EEG and to generate, for each node, at most r device calls. Thus, the time complexity is again O(M). The further LQN parameterization of Section 3.5.2 can be performed in the course of the LQN generation itself, and thus it does not increase the above given time complexity for Step 5.
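The space bound for Step 5 can be spelled out as follows. This is only a sketch of the reasoning, under the assumption already made in the text that the number of devices r is small relative to M and can be treated as a constant:

```latex
\begin{align*}
c &\le M + e\,r, \qquad e \le M \\
e + r + c &\le e + r + M + e\,r \\
          &\le M + r + M + M\,r \;=\; (2 + r)\,M + r
\end{align*}
```

For bounded r the right-hand side grows linearly in M, which gives the O(M) space complexity.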
In conclusion, the amount of work the software engineer can expect to be performed by the model generation algorithm is of the same order of magnitude as M, the total number of methods recognizable in the project scenarios, and this is also the amount of space required to allocate the performance model.
4.2. Model use
Model evaluation is not the focus of this work. Nevertheless, in order to show the usefulness of the approach, in this section a few performance results are obtained that give a prediction of the performance of the mail system software project under development.
As stated in Section 3.4, the generated LQN performance model in Fig. 17 is related to alternative MP1 (see Table 5). By introducing slight changes in the resource demands of columns LAN and WAN in Table 6, the procedure illustrated in Section 3.5 yields a similar LQN model for alternative MP2 (not illustrated here).
The MP1 and MP2 models can be used to predict the MP alternative (MP1 or MP2) that will yield the shortest response time, i.e., the time to execute the System mail session illustrated in Fig. 14.
The two models have been evaluated by using the LQNS tool [15] for several values of the number (N) of users (see Table 3), ranging from 2 to 10 users.
For each given number of users, the average response time has been obtained (see Table 10).
Table 10
Average response time (s)

No. users  MP1 alternative  MP2 alternative
2          32.7             22.0
3          49.4             33.1
4          66.4             44.2
5          83.8             55.4
6          101.4            66.6
7          119.0            77.8
8          136.3            89.1
9          154.4            100.4
10         172.8            111.9

Table 11
Resource capacity

Platform resource    Capacity
User workstation i   10^5 statements/s
HW server A          10^5 statements/s
HW server B          10^5 statements/s
LAN                  10^6 bits/s
WAN                  10^4 bits/s
User workstation DB  50 transactions/s
Mail DB              50 transactions/s
History DB           50 transactions/s
As seen from Table 10, alternative MP2 performs about 1.5 times better than alternative MP1. Such a result gives the software designer a quantitative evaluation of the relative importance of the History Mgr and Mail Mgr software modules from the performance point of view. In alternative MP2, the History Mgr is allocated to HWSB, which is reached by the WAN (see Fig. 3), while the Mail Mgr is reached by the LAN. The opposite holds for alternative MP1. Regardless of the fact that the Mail Mgr module is called more frequently than the History Mgr module (see Fig. 14), its relative importance from the performance point of view is lower. Indeed, in order to obtain a better performance (i.e., lower response time) it is necessary to make the History Mgr reachable by the LAN instead of by the WAN.
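The factor of about 1.5 can be verified directly from the Table 10 figures. The sketch below only post-processes the published response times; the variable names are hypothetical.

```python
# Average response times (s) from Table 10, keyed by number of users.
MP1 = {2: 32.7, 3: 49.4, 4: 66.4, 5: 83.8, 6: 101.4,
       7: 119.0, 8: 136.3, 9: 154.4, 10: 172.8}
MP2 = {2: 22.0, 3: 33.1, 4: 44.2, 5: 55.4, 6: 66.6,
       7: 77.8, 8: 89.1, 9: 100.4, 10: 111.9}

# Ratio of MP1 to MP2 response time for each population size.
ratios = {n: MP1[n] / MP2[n] for n in MP1}

for n in sorted(ratios):
    print(f"{n:2d} users: MP1/MP2 = {ratios[n]:.2f}")
```

The ratio stays between roughly 1.48 and 1.55 over the whole range of 2 to 10 users.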
The tool also gives quantitative insights on the relative effects of the LAN and WAN. Indeed, although the WAN capacity is 100 times lower than the LAN capacity (see Table 11), the impact on the response time is comparatively small (a factor of only about 1.5).
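As a rough illustration of the units involved, the raw network transfer time per Mail session can be computed from the DCP values of Table 9 (alternative MP1) and the capacities of Table 11. This back-of-the-envelope sketch ignores queueing and contention, so it is not the computation performed by the LQNS solver; the function name is hypothetical.

```python
# Mail session network demand (bits) from Table 9, alternative MP1.
LAN_BITS = 61200
WAN_BITS = 56000

# Network capacities (bits/s) from Table 11.
LAN_CAPACITY = 1e6
WAN_CAPACITY = 1e4

def transfer_time(bits, capacity_bits_per_s):
    """Raw time (s) to move `bits` over a link, with no queueing."""
    return bits / capacity_bits_per_s

lan_time = transfer_time(LAN_BITS, LAN_CAPACITY)  # about 0.06 s
wan_time = transfer_time(WAN_BITS, WAN_CAPACITY)  # 5.6 s
```

The raw WAN transfer time is thus far larger than the LAN one, while the end-to-end response times in Table 10 also include think time, CPU and DB demands, and queueing delays.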
In conclusion, the software designer has been enabled to predict the impact of design time choices without needing any specific knowledge of queuing model theory. The analyst is only required to input the CASE documents into the model generator illustrated in Section 3, and then feed the obtained LQN model into the LQNS solver to obtain quantitative predictions.
5. Conclusions
The use of performance modeling requires considerable expertise and is thus usually unattractive for software developers. On the other hand, the prediction of software performance during the early phases of the lifecycle, i.e., of the ability of the final product to meet the user performance requirements, is of paramount importance to software validation activities.
In order to enable software designers to introduce performance modeling in their current practice, methods are to be introduced for the automatic derivation of the performance model from CASE documents. This paper has given a method that takes as input standard CASE software documents from the analysis and preliminary design phases and yields as output an LQN performance model ready for evaluation.
The method is easy to use and can be effectively incorporated into modern object-oriented software development environments (UML-based, etc.) to encourage software designers to introduce lifecycle performance validation into their development best practices. It is presently available in a prototype form and is being developed in a web version accessible by interested practitioners.
References
[1] S.S. Lavenberg, Computer Performance Modeling Handbook, Academic Press, New York, 1983.
[2] S.R. Schach, Classical and Object-oriented Software Engineering, WCB/McGraw-Hill, New York, 1999.
[3] C.U. Smith, Performance Engineering of Software Systems, Addison-Wesley, Reading, MA, 1992.
[4] C.U. Smith, L.G. Williams, Performance engineering evaluation of object-oriented systems with SPEED, in: R. Marie, et al. (Eds.), Computer Performance Evaluation Modelling Techniques and Tools, Lecture Notes in Computer Science, Vol. 1245, Springer, Berlin, 1997.
[5] G. Booch, J. Rumbaugh, I. Jacobson, Unified Modeling Language User Guide, Addison-Wesley, Reading, MA, 1997.
[6] J.A. Rolia, K.C. Sevcik, The method of layers, IEEE Trans. Softw. Eng. 21 (8) (1995) 689–700.
[7] C. Hrischuk, C.M. Woodside, J. Rolia, R. Iversen, Trace-based load characterization for generating software performance models, IEEE Trans. Softw. Eng. 25 (1) (1999) 122–135.
[8] G. Franks, A. Hubbard, S. Majumdar, D. Petriu, J. Rolia, C.M. Woodside, A toolset for performance engineering and software design of client–server systems, Perform. Eval. (special issue) 24 (1–2) (1996) 117–135.
[9] C.M. Woodside, A three-view model for performance engineering of concurrent software, IEEE Trans. Softw. Eng. 21 (9) (1995) 754–767.
[10] U. Herzog, A concept for graph-based process algebras, generally distributed activity times and hierarchical modelling, in: Proceedings of the Fourth Workshop on Process Algebra and Performance Modelling, Torino, Italy, July 1996.
[11] G. Iazeolla, A. D’Ambrogio, R. Mirandola, Software performance validation strategies, in: Performance’99 Conference, CRC Press, Boca Raton, FL, 1999.
[12] V. Cortellessa, R. Mirandola, Deriving a queuing network based performance model from UML diagrams, in: Proceedings of the Second International Workshop on Software Performance (WOSP2000), Ottawa, Canada, September 2000, pp. 58–70.
[13] C. Hrischuk, J. Rolia, C.M. Woodside, Automatic generation of a software performance model using an object-oriented prototype, in: Proceedings of the International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS’95), 1995, pp. 399–409.
[14] C.U. Smith, B. Wong, SPE evaluation of a client/server application, in: Proceedings of the Computer Measurement Group, Orlando, FL, December 1994.
[15] C.M. Woodside, S. Majumdar, J.E. Neilson, D.C. Petriu, J.A. Rolia, A. Hubbard, R.B. Franks, A guide to performance modelling of distributed client–server software systems with layered queuing networks, Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada, November 1995.
[16] A. D’Ambrogio, G. Iazeolla, M. Versari, Software performance model generation from non-executable design, Report RI.99.03, Laboratory for Computer Science, University of Roma, Tor Vergata, Roma, Italy, July 1999.
V. Cortellessa received his Laurea degree in computer science from the University of Salerno (Italy) in 1991 and his Ph.D. degree in computer engineering from the University of Roma at Tor Vergata (Italy) in 1995. Currently, he is a Research Assistant Professor at CEMR, West Virginia University (WV, USA), and he holds a graduate research fellowship at DISP, University of Roma at Tor Vergata (Italy). His research interests include performance modeling of software/hardware systems, software engineering, and parallel simulation. He is a member of ACM.
A. D’Ambrogio is a researcher at the Department of Computer Science, Systems and Industrial Engineering, University of Roma at Tor Vergata (Italy). His research is in the fields of distributed object computing, web-based modeling and simulation, computer supported cooperative work and software quality engineering. He is a member of ACM and IEEE.
G. Iazeolla is a Full Professor of Computer Science, Software Engineering Chair, Faculty of Engineering, University of Roma at Tor Vergata (Italy). His research is in the areas of software engineering and information system engineering in relation to system performance and dependability modeling and validation.