Parameter Dependencies for Component Reliability Specifications

(1)

Parameter Dependencies

for Component Reliability Speciﬁcations

Heiko Koziolek

1

ABB Corporate Research 68526 Ladenburg, Germany

Franz Brosch

2,3

Forschungszentrum Informatik (FZI) Karlsruhe 76137 Karlsruhe, Germany

Abstract

Predicting the reliability of a software system at an architectural level during early design stages can help to make systems more dependable and avoid costs for fixing the implementation. Existing reliability prediction methods for component-based systems use Markov models and assume that the software architect can provide the transition probabilities between individual components. This is however not possible if the components are black boxes, only at the design stage, or not available for testing. We propose a new modelling formalism that includes parameter dependencies into software component reliability specifications. It allows the software architect to only model a system-level usage profile (i.e., parameter values and call frequencies), which a tool then propagates to individual components to determine the transition probabilities of the Markov model. We demonstrate the applicability of our approach by modelling the reliability of a retail management system and conduct reliability predictions.

Keywords: CBSE, Reliability, Prediction, Usage Proﬁle, Parameter Dependencies

1 Introduction

Formal techniques for analysing the properties of software design and software sys-tems are useful not only for functional properties (e.g., correctness), but also for extra-functional properties (e.g., performance, reliability, security, safety, etc.). Pre-dicting extra-functional properties of a system based on design models can help to avoid implementing software architectures that do not fulﬁl user requirements for timeliness and dependability. With this method, developers can save substantial

1 _Email: _{[email protected]} 2 _Email: _{[email protected]}

3 _{This work is supported by the Seventh Framework Programme (FP7) of the European Comission, Grant}

Agreement ICT FP7-216556.

www.elsevier.com/locate/entcs

doi:10.1016/j.entcs.2009.09.026

(2)

costs for ﬁxing an implementation based on a poor component-based software ar-chitecture.

For reliability predictions (i.e., probability of failure on demand (POFOD)) of a component-based system, software architects combine component specifications with failure probabilities given by (possibly third-party) component developers. However, component reliability – and particularly component interaction – depends on the usage profile assumed by the software architect. For example, unreliable but seldom used component services have little effect on the overall system reliability. The propagation of calls from one component to another may depend on the in-put parameter a certain service is called with. Therefore, component developers need to make the dependency between input parameters and the propagation of calls explicit in their component reliability specification, as they should not make assumptions about the users and the deployment environment of their components to keep them widely reusable.

Existing solutions for component-based reliability prediction (e.g.,

[3,6,5,19,15,2]) model the control ﬂow in a component-based software

archi-tecture using Markov chains. They assume that the software architect constructing such a model can provide the transition probabilities in the Markov chains, which model the control flow propagation between individual components. This is not possible, if the software architect builds the prediction based only on component interface specifications or on implemented components viewed as a black box alone. From the component specification, the software architect does not know which required service a provided service calls upon invocation and how many calls are made.

We present a novel approach where component developers supply component reliability speciﬁcations based on so-called stochastic regular expressions (SRE), which extend regular expressions with probabilistic attributes. An SRE describes the call propagation through a component service in dependency to input

parame-ter values. Software architects can compose these parametrised speciﬁcations using

tools and add their application specific system-level usage profile. A transformation tool then solves the parameter dependencies inside the component reliability speci-fications and thereby deduces the component transition probabilities based on the system-level usage profile. With our approach, the software architect does not need to estimate transition probabilities inside the architecture model making reliability predictions more accurate.

We demonstrate the applicability of our approach by modelling the reliabil-ity of the component services of a retail management system supported by tools

for the Palladio Component Model (PCM) [1]. They allow software architects to

compose individual component specifications by different developers. We make pre-dictions with varying system-level usage profiles and show the sensitivity of the overall system reliability to individual component failure probabilities and usage profile parameters.

The contribution of this paper is a modelling formalism to specify parameter de-pendencies in software component reliability speciﬁcations. Our approach of

(3)

mod-elling parameter dependencies can potentially be added to any existing component-based reliability prediction approach relying on the same assumptions thereby mak-ing it more accurate and ﬂexible. Furthermore, our approach can be used for other compositional quality attributes (e.g., performance [9]).

The paper is organised as follows. Section 2 surveys related work. Section 3 de-scribes the steps of our method and explains the involved developer roles. Section 4 deﬁnes our formalism on stochastic regular expressions and shows the transforma-tion algorithm to derive Markov model from the formalism. Sectransforma-tion 5 demonstrates our approach in a case study, followed by a discussion of assumptions in Section 6. Section 7 concludes the paper.

2 Related Work

The ﬁeld of reliability prediction for component-based software architectures has been surveyed in [5,4,7]. The approach presented in this paper is in the class of state-based methods, which assume that the control ﬂow between the components

of a software architecture can be modelled using a Markov chain. Goseva et al. [5]

state that most approaches rely on estimations of the transition probabilities be-tween components. In our approach, no estimation is necessary, as each component speciﬁes its call propagation in dependency to its own usage proﬁle.

One of the ﬁrst approaches for architecture-based reliability modelling was

pre-sented by Cheung in 1980 [3]. He emphasised the fact that a system is more reliable

if the unreliable parts are rarely used. He introduced a Markov model to describe the control ﬂow through a component-based software architecture. Cheung computes system reliability by analysing the Markov chain and incorporating the transition probabilities between diﬀerent components, so that seldom used components only marginally contribute to system reliability.

Recently, several models based on Cheung’s work have been introduced (e.g., [19,15]). Wang et al. [19] add special constructs to express architectural styles,

while Sharma et al. [15] broaden the scope of architectural analysis and also analyse

performance and security with a Markov model. None of the approaches provides special methods to determine the transition probabilities between individual com-ponents. Instead, they rely on testing data, which is not available for early design, or the software architect’s intuition.

Hamlet et al. [6] specify reliabilities and call propagations for each individual

component. Thus, the resulting models for each component shall be reusable across different architectures and system-level usage profiles. However, the approach re-quires the software architect to execute each component against the desired usage profile, which essentially reduces the approach to testing. The dependency of tran-sition probabilities and input parameters is not made explicit in this approach.

Reussner et al. [14] took over the concept of parametrised contracts for software

components to architecture-based reliability modelling. Each component speciﬁes probabilities of calling required services. Therefore, an architecture model can be constructed by combining these individual speciﬁcations. The approach assumes

(4)

ﬁxed transition probabilities and the models therefore have the usage proﬁle implic-itly encoded.

Cheung et al. [2] incorporate failure states into state-based models for

individ-ual components. They use information from requirements documents, simulation, and domain knowledge to train a hidden Markov model (HMM), which eventually delivers transition probabilities to failure states. However, the approach does not include the probability of control ﬂow propagation to other components in relation to the usage proﬁle.

3 Component Speciﬁcation and Reliability Prediction

This section sketches the process model underlying our approach and describes the responsibilities for the involved developer roles.

A software component is a unit of composition with contractually speciﬁed

in-terfaces and explicit context dependencies only [18]. A component speciﬁcation

consists of the provided and required interfaces of the component. This informa-tion is suﬃcient to assemble components and check their interoperability. However, additional information about each component is required to enable reasoning on extra-functional properties of a component-based architecture, such as performance, reliability, or availability.

To create a speciﬁcation language that captures the additionally required infor-mation, it is necessary to consider the involved developer roles. In component-based

software engineering (CBSE), there is a strict separation betweencomponent

devel-opers and software architects. Component developers implement components and provide functional and extra-functional speciﬁcations (i.e., models) for them. Soft-ware architects use these speciﬁcations to reason about planned architectures and later assemble the actual component implementations.

The component specifications for extra-functional properties provided by com-ponent developers need to describe how provided services of a comcom-ponent call re-quired services in terms of probabilities, frequencies, and parameter values. This enables software architects to create a model of the control and data flow through the whole architecture by simply composing these specifications and not referring to

component internals. In Section 4, we introduce a speciﬁcation language based on

stochastic regular expressions that lets component developers specify how provided services of a component call its required services via so-called external calls.

The propagation of calls from provided to required interfaces may depend on pa-rameter values used as input to provided interface calls. For example, a component service could call a required service as many times as the length of a list supplied to it as an input parameter value. If, as in many existing component reliability specifications, the component specification does not reflect such dependencies, pre-dictions based on the model are always fixed to a specific usage profile (i.e., a set of parameter values) assumed by the component developer. This is not desirable as software components should be reusable under varying usage profiles. The

(5)

for the propagation of calls to required services. Therefore the system model can be adapted by different software architects to different usage profiles.

Fig. 1depicts the process model underlying our approach. Each component

de-veloper provides a component speciﬁcation, which includes extra-functional proper-ties (i.e. in our case failure probabiliproper-ties, etc.), call propagations to required services, and parameter dependencies of each provided service of the component. Methods to determine failure rates (e.g., [2]) are out of scope for this paper. If the compo-nent is already implemented, the compocompo-nent developers may use static code analysis techniques (e.g., in [8]) or dynamic monitoring (e.g., in [11]) to derive call propaga-tions and parameter dependencies from their source code. Component speciﬁcapropaga-tions reside in public repositories, where software architects can obtain them.

Fig. 1. Model-based Reliability Prediction for Component-Based Architectures

Each software architect assembles the component specifications in the same man-ner as later the component implementations. Additionally, each software architect provides a usage profile for the complete system (i.e., having direct interaction with the user or other systems). Based on this information, a tool traverses the archi-tectural model and resolves the parameter dependencies in the component specifi-cations.

For example, if the usage proﬁle states that the length of a list supplied as an input parameter is always 10, and the component speciﬁcation states that the called component calls 2 required services for each list item, the transformation tool determines that this component makes 20 calls to required services. As depicted in

Fig.1, another software architect could parametrise the model diﬀerently (e.g., list

length 20) to get a diﬀerent architecture model (i.e., 40 calls).

This method works recursively, as each component specification includes the (parametrised) usage profile for calling its required services. Therefore, with our specification language, it is possible to propagate requests at the system level to individual components and describe the control and data flow throughout the whole architecture. This is not possible with other approaches for reliability prediction

(6)

(e.g., [3,14]), which rely on the software architect to model the control ﬂow based on provided and required interfaces only.

With the parameter dependencies resolved, the resulting model can be trans-formed into a discrete-time Markov chain (DTMC). Using existing techniques (e.g.,

[14]), the software architect can determine the probability of reaching the failure

state of the Markov chain to get a reliability prediction (i.e., POFOD) for each com-ponent service at the system boundaries based on the failure probabilities of the components inside the architecture. This enables the software architect to assess the feasibility of reliability requirements.

If the predictions show that given reliability goals cannot be met, there are several options. The software architect can revise the architecture model for example by introducing redundancy or trying different architectural configurations. The software architect can also check the effect of using other functional equivalent components with different reliability properties. If possible, the modelled usage profile may need to be adapted or the reliability goals have to be changed. This method of model-based reasoning is supposed to be significantly more cost-efficient than building prototypes or relying on personal experience with similar systems only.

4 Parameter Dependencies in Component Reliability

Speciﬁcations

This section describes our model for component reliability speciﬁcation

(Sec-tion 4.1), and system modelling (Section4.2). Afterwards it explains the necessary

transformation steps to deduce a Markov model from the speciﬁcation (Section4.3),

and shows how to solve the Markov model to get a reliability prediction (Section4.4).

4.1 Stochastic Regular Expressions for Component Speciﬁcation

Our approach requires component developers to describe the reliability and control and data ﬂow propagation of each provided service of a component using a stochastic regular expression (SRE). SREs consist of internal actions, external calls, a return value, and control ﬂow statements as explained in the following. Additionally they use a language for arithmetic and boolean expressions. An example will be provided further below.

• Internal Actions: Let I = {1, ..., n}, n ∈ N be an index set and IA = {ia₁, ..., ia_n} be a set of terminal symbols. ia_i refers to an internal action of

the provided service modelled by the SRE. Let f p : IA → [0,1] be a function,

which assigns a failure probability to each internal action. Failure probabilities

can be derived using statistical testing [12,2].

• Expression Language: LetLbe alanguagefor boolean and arithmetic

expres-sions deﬁned by the grammar in AppendixA. LetS⊂Lbe the set of all strings

in L, B ⊂Lbe the set of all boolean expressions in L, andA⊂L the set of all

(7)

arithmetic expression. Notice thatLallows the speciﬁcation of numbers as well as probability mass functions (PMF) to model values in a stochastic manner. • External Calls: Let J = {1, ..., m}, m ∈ N be an index set and EC =

{E₁, ..., E_m}be a set of non-terminal symbols. E_j refers to anexternal call to a

required service. E_j is a non-terminal symbol and can only be substituted by the

SRE of the referenced required service after its composition to other components is ﬁxed. We assume a call and return semantic (i.e., synchronous communication, the caller blocks until receiving an answer).

Let ei : EC → (S × A) and eo : EC → (S× A) be two functions that

assign parametricinput andoutput expressions to an external call. For example

ei(E1) = (a, x+ 1) means that the SRE calls the required service where the value

x+ 1 is assigned to the variable named a.

• Return Value: Let rv ∈ N denote a return value, which models the output produced by the modelled service, which is send back to the caller.

Let P be a non-terminal symbol. Then, the syntax of SREs in our approach is

as follows:

P :=ia_i |E_j |P ·P |P+_πP |Pυ

The semantic of the above syntax is:

• P·P denotes a sequence of two symbols. The dot can be omitted.

• P +_πP denotes a branch of two symbols. π represents a branch condition

(e.g. π :=y <0, y∈N,y denoting a variable) deﬁned as a boolean expression of

the languageL.

• Pυdenotes aloop. υrepresents aparametric loop count(e.g.,υ:= 3 +z, z∈

N,z denoting a variable). Inﬁnite loops are not allowed by SREs, the loop count

is always bound.

Example: The following example shows a complete stochastic regular expression:

P :=ia₁·(E₁+_πE₂), f p(ia₁) = 0.001, ei(E₁) = (x,2−x), eo(E₁) =∅

ei(E2) = (y,27∗x+ 3), eo(E2) = (z, x+ 1), π:=y <0, rv:= 0

It speciﬁes that the service ﬁrst executes some internal code (ia₁) with the failure

probability 0.001 and then either executes an external callsE₁ or E₂ depending on

the value of the input parametery.

Rationale: Notice that an SRE is an abstraction of a component service’s be-haviour. It includes control flow constructs only if they influence external calls. Internal control flow of the service not changing the behaviour visible from outside are abstracted within internal actions. A single internal action may represent thou-sands of lines of code with a single failure probability. This abstraction focuses on necessary properties for a component-based reliability prediction (i.e., failure prob-abilities and call propagations) and makes the analysis of even complex components mathematically tractable.

(8)

The included parameter dependencies for branch conditions and loop counts allow software architects to use varying usage proﬁles (i.e., the value assignments for the included variables). We restrict the values of parameters to integer values in this paper for clarity. This is not a general restriction. The formalism can be

easily extended to other data types as shown in [9]. However, the value domain

is always ﬁnite and discrete in any case. For large or unlimited value domains, a partitioning of the value domain into a manageable set of equivalence classes with similar reliability properties have to be found (also see [6]).

Using SREs instead of Markov models allows for structured control ﬂow with clearly deﬁned loop entry and exit points (also see [10]).

4.2 System Model

Usually, a whole use case spanning multiple SREs is the subject of a reliability prediction. Thus a complete system model consists of a set of SREs, with an initial usage proﬁle.

Let K ={1, ..., r}, r ∈N be an index set and Services= {P₁,· · ·, P_r}be a set of connected SREs realising the functionality of one use case. The included SREs

reference each other via their external calls. Let P₁ be the service called at the

system boundaries.

LetT ={1, ..., t}, t∈N be an index set, Let V ={v₁,· · ·, v_t} ⊂S be the set of strings denoting the variable names speciﬁed inP₁, and letN ={n₁,· · ·, n_t} ⊂Lbe

a set numbers or probability mass functions fromLdeﬁning the initial values for the

variables at the system boundaries. Then U sageP rof ile= {(v₁, n₁),· · ·,(v_t, n_t)} denotes the system-level usage proﬁle for the modelled use case.

Then, the system model is deﬁned as: SystemM odel =

{U sageP rof ile, Services}.

4.3 Transformation

Once a software architect has assembled a set of SREs from a repository to realise

a use case, and speciﬁed the usage proﬁle to form a SystemM odel, an automatic

transformation (implemented as an Eclipse plugin) substitutes the variables in the ﬁrst SRE with the values from the usage proﬁle and propagates them through all

connected SREs. The transformation algorithm in Listing 1traverses the abstract

syntax trees of the SREs of all participating component services. Listing 1: Transformation Algorithm

I n p u t : P (SRE t o t r a n s f o r m )

U ( u s a g e p r o f i l e , i . e . , a s e t o f v a r i a b l e s w i t h v a l u e s a s s i g n e d ) Z ( p o i n t e r t o t h e c u r r e n t s t a t e i n t h e Markov model )

Output : Z ( p o i n t e r t o t h e c u r r e n t s t a t e i n t h e Markov model ) S i d e E f f e c t I n p u t : MM ( i n i t i a l empty markov model )

SS ∈MM ( s t a r t s t a t e i n t h e markov model ) FS _∈MM ( f a i l u r e s t a t e i n t h e markov model

S i d e E f f e c t Output : MM ( m o d i f i e d markov model c o n s t r u c t e d i n p a r a l l e l ) p o i n t e r t r a n s f o r m (P , U, Z ) {

(9)

switch( c u r r e n t N o d e ){ case s e q u e n c e w i t h l e f t S R E , rightSRE : Z _<₋ t r a n s f o r m ( l e f t S R E , U, Z ) ; Z _<₋ t r a n s f o r m ( rightSRE , U, Z ) ; break; case b r a n c h w i t h l e f t S R E , rightSRE , _π a s b r a n c h c o n d i t i o n

U1 <− e v a l u a t e S t o c h a s t i c a l l y (U,_π) ; //new u s a g e p r o f i l e e v a l u a t e d under t h e c o n d i t i o n t h a t _π i s t r u e U2 <− e v a l u a t e S t o c h a s t i c a l l y (U, n o t π) ; //new u s a g e p r o f i l e e v a l u a t e d under t h e c o n d i t i o n t h a t _π i s f a l s e π <− s e a r c h A n d R e p l a c e (π, U) ; // r e p l a c e v a r i a b l e s i n π w i t h c u r r e n t v a l u e s from u s a g e p r o f i l e bp <− s o l v e E x p r e s s i o n (π) ; // compute t h e b r a n c h p r o b a b i l i t y S1 <− n e w S t a t e ( ) ; // i n i t i a l s t a t e f o r t h e l e f t b r a n c h S2 <− n e w S t a t e ( ) ; // i n i t i a l s t a t e f o r t h e r i g h t b r a n c h S3 <− n e w S t a t e ( ) ; // s t a t e t o j o i n t h e b r a n c h e s a g a i n T1 <− n e w T r a n s i t i o n ( Z ,S₁,bp) ; T2 <− n e w T r a n s i t i o n ( Z ,S2, 1−bp) ; Z _<₋ t r a n s f o r m ( l e f t S R E , _U₁, _S₁) ; // t r a n s f o r m l e f t b r a n c h T3 <− n e w T r a n s i t i o n ( Z ,S₃, 1 . 0 ) ; // j o i n b r a n c h Z <− t r a n s f o r m ( rightSRE , U2, S2) ; // t r a n s f o r m r i g h t b r a n c h T4 <− n e w T r a n s i t i o n ( Z ,S3, 1 . 0 ) ; // j o i n b r a n c h MM_<₋MM_∪_S₁_∪_S₂_∪_S₃_∪_T₁_∪_T₂_∪_T₃_∪_T₄; Z _<₋ _S₃; break;

case l o o p w i t h body bodySRE , _υ a s p a r a m e t r i c l o o p c o u n t

υ <− s e a r c h A n d R e p l a c e (_υ, U) ; // r e p l a c e v a r i a b l e s i n _υ w i t h c u r r e n t v a l u e s from u s a g e p r o f i l e n l _<₋ s o l v e E x p r e s s i o n (_υ) ; // compute number o f i t e r a t i o n s for i =1 t o max ( n l ) do Z <− t r a n s f o r m ( bodySRE , U, Z ) ; // u n r o l l l o o p s break; case ia w i t h f a i l u r e p r o b a b i l i t y _fp(_ia) : S1 <− n e w S t a t e ( ) ; T1 <− n e w T r a n s i t i o n ( Z ,S₁, 1−fp(ia) ) ; T2 <− n e w T r a n s i t i o n ( Z , FS ,fp(ia) ) ; // c o n n e c t t o f a i l u r e s t a t e MM<−MM∪S1∪T1∪T2; Z <− S₁; break;

case E w i t h SRE a s t h e c a l l e d s e r v i c e and rv a s t h e return v a l u e o f SRE : i n p u t <− s e a r c h A n d R e p l a c e (ei(E) , U) ; U1 <− s o l v e E x p r e s s i o n ( i n p u t ) ; // add e x t e r n a l c a l l i n p u t s t o new u s a g e p r o f i l e Z _<₋ t r a n s f o r m (SRE , _U₁, Z ) ; o u t p u t _<₋ s e a r c h A n d R e p l a c e (_eo(_E) , U _∪ _rv) ; U<− U ∪ s o l v e E x p r e s s i o n ( o u t p u t ) ; // add o u t p u t s t o u s a g e p r o f i l e break; } return Z ; }

The algorithm creates a Markov model as a side eﬀect while traversing the

abstract syntax trees of the SREs. This Markov model contains a failure state, which is taken if an internal action fails. Therefore, the transition probability from a state representing an internal actionia_ito the failure state is its failure probability

f p(ia_i).

The Markov model contains a sequence of states for subsequent internal actions and transitions for branches. Additionally, the algorithm unrolls the loops within the SREs in the Markov model (i.e., it creates the states for the symbols inside a loop body for each loop count).

(10)

After substituting a variable name with its current value from the usage proﬁle, the algorithm solves the resulting expression. For example, for a boolean expression

x < 0 as a branch condition and a usage proﬁle containing P(x ≥ 0) = 0.7, the

algorithm resolves a branch probability bp = 1−P(x ≥ 0) = 0.3 and uses it in

the Markov model as the transition probability. Analogously, for an arithmetic

expression 3 +xas a parametric loop count and a usage proﬁlex= 5, the algorithm

resolves a loop countlc= 8 and uses it to unroll the states in the Markov model.

The functionsolveExpression supports computing all arithmetic and boolean

operations speciﬁable by the expression languageL. They can be arbitrarily

com-plex and involve constants as well as probability mass functions.

Upon traversing an external call E_j, the algorithm creates a new usage proﬁle

for the called SRE. It ﬁrst resolves the parameter dependencies on the inputs of the

call deﬁned by the functionei(E_j). Then it uses these values as new usage proﬁle.

After traversing the called SRE, the algorithm resolves parameter dependencies on

the outputs of the call deﬁned by the functioneo(E_j). The parametric dependencies

of these output values may refer to the return valuerv of the called SRE.

For the ﬁrst invocation, the SREP₁of aSystemM odelis given as input

param-eter as well as the U sageP rof ileof theSystemM odel.

Fig. 2. Transformation Example: SRE Traversal (left), Resulting Markov Model (right)

Example: As an example, we use the SRE P from Section 4.1 and connect its

external calls E₁ and E₂ to the following two SREs P₂ and P₃ to form a set of

Services.

P2:= (ia1)υ, f p(ia1) = 0.001, υ=x+ 2, rv= 0 P3:=ia1, f p(ia1) = 0.002 rv= 0

Together with the U sageP rof ile = {(x,1),(P(y = −2) = 0.3),(P(y = 4) = 0.7)}, we form the SystemM odel = {U sageP rof ile,{P1, P2, P3}}. Running the

(11)

transformation algorithm in Listing1 initiates the traversal of the abstract syntax

trees of the SREs depicted on the left hand side of Fig.2. By solving the parameter

dependencies for the branch conditiony <0 and the parameteric loop countx+ 2

with the given U sageP rof ile the algorithm calculates a branch probability of 0.3

to callE1 and a loop count of 3.

After termination, the algorithm has created the Markov model on the right hand side of Fig.2. It includes a start state, a failure state, and a success state.

4.4 Reliability Prediction

With the Markov model created by the transformation, we can now compute the reliability of the overall system model (i.e., for the modelled use case). In our

approach we deﬁne the reliability as R := 1− P OF OD, where POFOD is the

probability of failure on demand. It can be computed from the Markov model by calculating the probability of reaching the success state from the start state [17].

Example: The transition matrix for the Markov chain generated in the former example is P= 0 B B B B B B B B B B B B B B B B B B B B B @ 0 0.999 0 0 0 0 0 0 0 0.001 0 0 0.3 0.7 0 0 0 0 0 0 0 0 0 0 0.9999 0 0 0 0 0.0001 0 0 0 0 0 0 0 0.9998 0 0.0002 0 0 0 0 0 0.9999 0 0 0 0.0001 0 0 0 0 0 0 0.9999 0 0 0.0001 0 0 0 0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 1.0 1 C C C C C C C C C C C C C C C C C C C C C A

where the second to last column refers to transition probabilities to the success state and the last column refers to transition probabilities to the failure state. As

this is an absorbing Markov chain, we can compute the probability b_i,j that the

chain will be absorbed in the absorbing state s_j if it starts in the transient state

s_i (cf. [17]). Let N = (I −Q)−1 be the fundamental matrix computed from the

identity matrixI and the upper left transition matrixQ. LetRbe the upper right

transition matrix from the canonical form ofP. Then, the absorption probabilities

are determined by the matrixB with the entries b_i,j, which is given by:

B=N R= (I−Q)−1R= 0 B B B B B B B B B B B B B B B @ 1 0.999 0.2997 0.6993 0.29967 0.29964 0.29961 0.69916 0 1 0.3 0.7 0.29997 0.29994 0.29991 0.69986 0 0 1 0 0.9999 0.9998 0.9997 0 0 0 0 1 0 0 0 0.9998 0 0 0 0 1 0.9999 0.9998 0 0 0 0 0 0 1 0.9999 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 C C C C C C C C C C C C C C C A 0 B B B B B B B B B B B B B B B @ 0 0.001 0 0 0 0.0001 0 0.0002 0 0.0001 0 0.0001 1.0 0 1.0 0 1 C C C C C C C C C C C C C C C A = 0 B B B B B B B B B B B B B B B @ 0.99877 0.001229761 0.99977 0.000229991 0.9997 0.00029997 0.9998 0.0002 0.9998 0.0002 0.999 0.0001 1.0 0 1.0 0 1 C C C C C C C C C C C C C C C A

Thus, the reliability (i.e., the probability of reaching the success state) of the ex-ample under the given usage proﬁle isR=b0,0= 0.99877.

(12)

5 Case Study

To illustrate our approach, we take a retail management system as an example, which has been designed in a service-oriented way. It is taken from the SLA@SOI

research project [16] and based on the Common Component Modelling Example [13].

The system enables a retail chain to handle its sales and manage its inventory data.

Fig. 3shows parts of the system architecture.

Fig. 3. System Architecture of the Service-oriented Retail Management System (excerpt)

The retail chain takes the role of the service customer. It operates several stores, each of which contains multiple cash desk lines. At each cash desk line, a dedicated retail solution client handles the individual sales processes. The service provider oﬀers a range of services to the retail solution clients, such as a payment service to handle payments by cash or by credit card, and an inventory service to update stock information in the inventory. The service provider in turn uses services of external

providers. Fig. 3shows a bank service used by the payment service as an example.

We focus on the service operation bookSale() that belongs to the inventory

service. The retail solution clients invoke this operation at the end of each sales process in order to update the stock information. The stochastic regular expression

which corresponds to the bookSale()execution looks as follows:

P1 =PbookSale:=E2(ia1+γ(E3+πE4))υE5, υ:=x, γ:= (y= 0), π:= (z = 0) P2 :=ia2, P3:=ia3, P4:=ia4, P5:=ia5

The operation takes a list of all sold items as an input parameter, and contains

a loop to iterate through this list. The iteration number υ of the loop equals the

number x of items in the list. Within the loop, stock information is updated by

(13)

Additionally, two external callsE2 andE5 to service operationsP2 andP5 (to set

up and to commit the database transaction) surround the loop over all sold items. All inventory component service operations are modelled through single internal actionsia2, ia3, ia4 andia5.

The control ﬂow of thebookSale() operation distinguishes several cases. First,

an item may be rated as small goods that do not have to be declared in the inventory. A branch located in the loop body reﬂects this fact. If an item is small goods (which

corresponds to y having value 0), there is no need for stock information update.

Rather, some internal calculations ia₁ are done. The probability P(y= 0) is part

of the usage proﬁle. Second, an item can be identiﬁed by a number or by a name,

which results in two diﬀerent types of database queriesP₃andP₄. Again, the usage

proﬁle determines the probabilities of both alternatives throughP(z = 0).

Our method allows assigning failure probabilities to internal actions. For exam-ple, a call to an inventory component service operation might fail - the underlying database might be temporarily unavailable, overloaded, or the request sent to the database might be lost. Our approach includes the propagation of the usage pro-ﬁle from the inventory service component to the inventory component: calling the

bookSale()operation with a list ofxitems results in an expected call sequence of

P(y= 0)∗P(z = 0)∗xandP(y= 0)∗P(z = 0)∗x calls to the two stock update operations, surrounded by one call for transaction setup and one call for transaction commit. ● ● ● ● ● ● 0 20 40 60 80 100 0.99985 0.99990 0.99995 1.00000 number of items reliability ● fp = 0.0 fp = 1.0E−6 fp = 2.0E−6 ● ● ● ● ● ● ● _● ● ● 0 20 40 60 80 100 2e−06 4e−06 6e−06 8e−06 1e−05 number of items failure probability ● reliability = 0.9997 reliability = 0.9998 reliability = 0.9999

Fig. 4. Reliability of thebookSale()operation

In our case study, we are interested in analysing the relation between the usage

proﬁle of thebookSale()operation and its reliability. We assume the failure

proba-bilities of the internal actions involved to be 10−6, except the failure probabilities of stock information update (f p(ia3), f p(ia4)), which we vary between 0 and 2∗10−6.

We set small goods to 5% and items identiﬁed by name to 20%.

For our reliability calculation, we have modelled the retail management system

using tools of the Palladio Component Model [1]. We have implemented an

ad-ditional algorithm for Markov Chain generation and calculation of the resulting reliability.

The left hand side of Fig.4shows the results. Evidently, the reliability decreases with a growing number of items. A longer list of items results in a higher number

(14)

of loop iterations, leading to more stock update operations. Only in the theoretical case that the stock update cannot fail at all, the reliability is nearly independent of the number of items. In that case, it is determined only by the potential failure of

transaction setup and commit, as well as the internal calculationia1.

If we set a goal of 99.99% for the bookSale() reliability, the ﬁgure shows that

this goal is reached with a stock update failure probability of 10−6 _{and at most 98}

items. If the failure probability rises to 2∗10−6_{, the goal is only reached with a}

number of items smaller than 50.

The right hand side of Fig.4shows the relation between the number of items and

the failure probability of database queries for varying reliability goals. Generally, the acceptable failure probability decreases for higher reliability goals and for a higher number of sold items. If, for instance, 60 items are sold, a failure probability

of 2∗10−6 would be acceptable to reach 99.98% reliability, but it would violate a

reliability goal of 99.99%.

6 Assumptions and Limitations

In this section, we discuss assumptions and limitations of our approach. Presumably, the most critical assumption lies in the determination of failure probabilities for internal actions. The predicted reliability can only be as close to reality as the failure

probabilities given as an input to the method. Cheung et al. [2] propose estimating

these probabilities through domain experts. Lyu et al. [12] suggest using statistical testing. If probabilities of failure are only roughly estimated, the approach can still be valuable in comparing architectural alternatives, or determining acceptable ranges for failure probability.

Besides individual failure probabilities, the parametric dependencies of the con-trol ﬂow also need to be given as an input to reliability prediction. The speciﬁcation of the internal behaviour of components given by the component developer or

deter-mined through a reverse engineering process can serve as a source of information [8].

If no speciﬁcation exists and the source code is not available, it might be still possible to reconstruct the parametric dependencies by monitoring the inputs and outputs

of the component by running it as a black box in a test-bed [11].

Our method abstracts from the concrete faults that may cause a failure. A ser-vice execution may alter the state of a component, such as the values of variables owned by the component, in an invalid way. The faulty component state might not immediately result in a failure, but at a later point in time. Such coherences cannot be explicitly expressed by our approach. The same is true for the probability of failure due to usage of certain resources or message loss over a certain communica-tion link. Also, failures caused by concurrency are not explicitly captured by the SREs. Our approach aggregates the distinct causes of failure into one probability value, which tends to decrease accuracy of prediction.

If an internal action fails, we assume that the corresponding service call is also rated as being failed. We do not consider the possibility that the system handles the failure and might successfully ﬁnish service execution even though the failure

(15)

happened. Furthermore, we assume the failure probabilities of internal actions to be stochastically independent. Currently the failure probability values are ﬁxed constants. They cannot be adapted to take factors like system or component state at run-time, previous service calls or the control ﬂow of the current call into account. Such considerations are left as a topic for future work.

7 Conclusions

We presented a novel formalism to model software component reliability specifica-tions. The formalisms allows to specify the control flow propagation of a software component from provided to required services in dependency to input parameter values. Software architects compose these specifications from individual compo-nent developers, use a tool to transform the specification into a Markov model parametrised for their specific usage profile, and can then conduct reliability pre-dictions for their design.

Our approach helps component developers by providing a language to specify component reliability. It helps software architects, who can quickly assemble com-ponent reliability specification, and parametrise them for different usage profiles without knowing the internals of the components. Similar existing component re-liability prediction approaches can adopt our method of modelling to make their prediction more accurate and more flexible. By assessing the reliability of a sys-tem design during early development stages, potentially high costs for late life-cycle fixings of a system can be avoided.

Our approach is still in an initial stage. We plan to add information about the underlying hardware resources to our model to allow to make more reﬁned pre-dictions. Furthermore, we will add reliability models for network connections and connectors between components. Another important direction is the determination of valid failure probabilities. We plan to incorporate existing approaches for deter-mining failure probabilities into our own method and validate it on a large real-life case study.

References

[1] Becker, S., H. Koziolek and R. Reussner, The Palladio Component Model for Model-Driven Performance Prediction, Journal of Systems and Software82(2009), pp. 3–22.

[2] Cheung, L., R. Roshandel, N. Medvidovic and L. Golubchik,Early prediction of software component reliability, in: ICSE ’08: Proceedings of the 30th international conference on Software engineering

(2008), pp. 111–120.

[3] Cheung, R. C.,A user-oriented software reliability model, IEEE Trans. Softw. Eng.6(1980), pp. 118– 125.

[4] Gokhale, S. S.,Architecture-based software reliability analysis: Overview and limitations, IEEE Trans. on Dependable and Secure Computing4(2007), pp. 32–40.

[5] Goseva-Popstojanova, K. and K. S. Trivedi, Architecture-based approaches to software reliability prediction, Computers & Mathematics with Applications46(2003), pp. 1023–1036.

[6] Hamlet, D., D. Mason and D. Woit,Theory of software reliability based on components, in:Proc. 23rd Int. Conf. on Software Engineering (ICSE ’01)(2001), pp. 361–370.

(16)

[7] Immonen, A. and E. Niemel,Survey of reliability and availability prediction methods from the viewpoint of software architecture, Journal on Softw. Syst. Model.7(2008), pp. 49–65.

[8] Kappler, T., H. Koziolek, K. Krogmann and R. Reussner,Towards Automatic Construction of Reusable Prediction Models for Component-Based Performance Engineering, in: Proc. Software Engineering 2008 (SE’08), LNI121(2008), pp. 140–154.

[9] Koziolek, H., “Parameter Dependencies for Reusable Performance Speciﬁcations of Software Components,” Ph.D. thesis, Department of Computing Science, University of Oldenburg, Germany (2008).

[10] Koziolek, H. and V. Firus, Parametric Performance Contracts: Non-Markovian Loop Modelling and an Experimental Evaluation, in: J. Kuester-Filipe, I. H. Poernomo and R. Reussner, editors,Proc. 3rd International Workshop on Formal Foundations of Embedded Software and Component-Based Software Architectures (FESCA’06), ENTCS176(2006), pp. 69–87.

[11] Kuperberg, M., K. Krogmann and R. Reussner,Performance Prediction for Black-Box Components using Reengineered Parametric Behaviour Models, in:Proceedings of the 11th International Symposium on Component Based Software Engineering (CBSE 2008), Karlsruhe, Germany, 14th-17th October 2008, LNCS5282(2008), pp. 48–63.

[12] Lyu, M. R., Software reliability engineering: A roadmap, in: FOSE ’07: 2007 Future of Software Engineering(2007), pp. 153–170.

[13] Rausch, A., R. Reussner, R. Mirandola and F. Pl´aˇsil, editors, “The Common Component Modeling Example: Comparing Software Component Models,” LNCS5153, Springer, 2008.

[14] Reussner, R. H., H. W. Schmidt and I. H. Poernomo,Reliability prediction for component-based software architectures, J. Syst. Softw.66(2003), pp. 241–252.

[15] Sharma, V. S. and K. S. Trivedi, Quantifying software performance, reliability and security: An architecture-based approach, Journal of Systems and Software80(2007), pp. 493–509.

[16] SLA@SOI, Empowering the Service Economy with SLA-aware Infrastructures, ICT: FP7-216556 (2008), http://www.sla-at-soi.eu/.

[17] Stewart, W., “Introduction to the Numerical Solution of Markov Chains,” Princeton University Press, 1994.

[18] Szyperski, C., “Component Software: Beyond Object-Oriented Programming,” Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002.

[19] Wang, W.-L., D. Pan and M.-H. Chen, Architecture-based software reliability modeling, Journal of Systems and Software79(2006), pp. 132–146.

A

Stochastic Expression Language

The following depicts the EBNF of the stochastic expression language: < expression >::=< compare−expr >

< compare−expr >::=< sum−expr >|< sum−expr >< compare−op >< sum−expr > < compare−op >::= ”<”|”>”|”≤”|”≥”|” = ”|”= ”

< sum−expr >::=< prod−expr >|< prod−expr >< sum−op >< prod−expr > < sum−op >::= ” + ”|”−”

< prod−expr >::=< atom >|< atom >< prod−op >< atom > < prod−op >::= ”∗”|”/”

< atom >::=< number >|< string >|< id >|< pmf >|”(”< compare−expr >”)”

< pmf >::= ”P M F[”< samples >”]”

< samples >::=< sample >|< sample >< samples > < sample >::= ”(”< number >”; ”< number >”)”

< number >::=< digit >|< digit >< number > < digit >::= ”0”|...|”9”

< string >::=< char >|< char >< string > < char >::= ”a”|...|”z”

< id >::=< ucchar >|< ucchar >< id > < ucchar >::= ”A”|...|”Z”