AVACS: Automatic Verification and Analysis of Complex Systems. Reports of SFB/TR 14 AVACS. Editors: Board of SFB/TR 14 AVACS


On Probabilistic Automata in Continuous Time

by

Christian Eisentraut

Holger Hermanns

Lijun Zhang

AVACS Technical Report No. 62 August 2010


Publisher: Sonderforschungsbereich/Transregio 14 AVACS

(Automatic Verification and Analysis of Complex Systems)

Editors: Bernd Becker, Werner Damm, Martin Fränzle, Ernst-Rüdiger Olderog, Andreas Podelski, Reinhard Wilhelm

ATRs (AVACS Technical Reports) are freely downloadable from www.avacs.org

Copyright © August 2010 by the author(s)


Christian Eisentraut¹, Holger Hermanns¹˒², and Lijun Zhang³

¹ Saarland University, Computer Science, Saarbrücken, Germany
² INRIA Grenoble – Rhône-Alpes, France
³ DTU Informatics, Technical University of Denmark, Kgs. Lyngby, Denmark

Abstract. We develop a compositional behavioural model that integrates a variation of probabilistic automata into a conservative extension of interactive Markov chains. The model is rich enough to embody the semantics of generalised stochastic Petri nets. We define strong and weak bisimulations and discuss their compositionality properties. Weak bisimulation is partly oblivious to the probabilistic branching structure, in order to reflect some natural equalities in this spectrum of models. As a result, the standard way to associate a stochastic process to a generalised stochastic Petri net can be proven sound with respect to weak bisimulation.

Keywords: Markov processes; nondeterminism; discrete time; continuous time; process algebra; weak bisimulation semantics

1 Introduction

Process calculi provide compositional theories for complex systems, especially those involving communicating, concurrently executing components [1]. The consideration of stochastic phenomena has led to the development of a plethora of stochastic and probabilistic process calculi, seeded in [2,3]. Two of these calculi stand out, in the sense that they extend classical concurrency models in simple yet conservative fashions: probabilistic automata [4,5,6], and interactive Markov chains [7]. Though different in flavour, both are equipped with compositional theories for strong and weak bisimulations and corresponding equational theories.

In probabilistic automata (PA), there is no global notion of time, and concurrent processes may perform random experiments inside a transition. This is represented by transitions of the form $s \stackrel{a}{\rightarrowtriangle} \mu$, where $s$ is a state, $a$ is an action label, and $\mu$ is a probability distribution on states. Labelled transition systems are instances of this model family, obtained by restricting to Dirac distributions (assigning full probability to single states). Thus, foundational concepts and results of standard concurrency theory are retained in their full beauty, and extend smoothly to the model of probabilistic automata.

Since the model is akin to Markov decision processes, its fundamental beauty can be paired with powerful model checking techniques, as implemented for instance in the PRISM tool [8].

Interactive Markov chains (IMC) in turn arise from classical concurrency models by adding a second type of transitions $s \stackrel{\lambda}{\dashrightarrow} s'$, that can embody random delays governed by a negative exponential distribution with some parameter $\lambda$. This twists the model to one that is running on a continuous time line, and where executions of actions take no time and happen immediately, unless an action can be blocked by the environment. This is linked to the process algebraic notion of maximal progress for internal actions. By dropping the second type of transitions, again, standard concurrency theory is regained in its entirety, and extends smoothly to the full IMC model. The availability of tool support [9] has led to several academic and industrial applications [10,11,12,13,14,15]. For long, the analysis trajectory was restricted to models where the weak bisimulation quotient is free of non-determinism (then corresponding to a continuous time Markov chain), but with the work of [16] this restriction is obsolete.

Fig. 1. Yardsticks for weak bisimulation on Markov automata.

The seemingly simple question addressed in this paper is: What happens to the theory of these two concurrency models if we integrate them? This is not only an academic question, since industrial engineers are desperately asking for formalisms that support both probabilistic branching and exponentially distributed delays. Therefore, we are going to look into a model class MA (Markov automata) that supports two types of transitions, namely $s \stackrel{a}{\rightarrowtriangle} \mu$ and $s' \stackrel{\lambda}{\dashrightarrow} s''$.

In the context of Petri nets, this move was made 25 years ago. After Molloy introduced stochastic Petri nets (which correspond to continuous time Markov chains) [17], it was a matter of two years until probabilistic branching was also supported in the form of weighted immediate transitions, leading to the model of generalised stochastic Petri nets (GSPNs) [18]. However, the inventors of GSPNs initially overlooked the issue of non-determinism arising from concurrently enabled probabilistic branching. To date, the analysis trajectory for GSPNs is a partial one. It is restricted to confusion-free nets, a class of nets where non-determinism is absent. Still, the analysis trajectory developed for this class gives us important inspiration when formulating our theory. While a direct combination of the PA and the IMC theories is an almost easy exercise, it turns out to be very demanding once we reflect on the different time scales we now work in. As in plain IMCs, internal probabilistic transitions cannot be blocked and take no time to execute. Consequently, we aim at fusing sequences of them. This implies that we need to partially ignore the branching structure of the substructures induced by our probabilistic automata when defining equalities, especially weak bisimulation, on them. In particular, we aim at a weak bisimulation that satisfies the equalities depicted in Figure 1. PA transitions are drawn with a small dot (omitted if the distribution is Dirac) after which the distribution follows. $E$ and $F$ indicate distinct equivalence classes. The rightmost equality is induced by PA weak bisimulation, the leftmost by IMC weak bisimulation, which is an implication of lumpability. The ones in the middle are peculiar. The left one motivates what we want to achieve in the interplay of probabilistic and continuous time Markov behaviour. It reflects that the minimum of two exponentially distributed delays is exponentially distributed with the sum of the rates, and that the probability of the left one finishing first is determined by its relative weight in the accumulated sum of the rates. In the middle to the right we find what this implies on the level of PA, namely that our weak bisimulation becomes ignorant to the branching structure. This equality is invalid in PA weak bisimulation, since in $u$ there is no successor state equivalent to $v'$. It is however valid with respect to the PA forward simulation kernel, the coarsest congruence induced by probabilistic trace distribution [19,4].

The main contribution of this paper is a definition of weak bisimulation on the MA model that is (i) (indeed) an equivalence satisfying the above equalities, (ii) a congruence with respect to parallel composition, (iii) a conservative extension of IMC weak bisimulation, (iv) coarser than PA weak bisimulation, and (v) can serve as a correctness criterion when associating a continuous time Markov chain to a confusion-free GSPN.

For the sake of space, we establish the congruence result only for a parallel composition operator in the style of PA and IMC, which have common roots in CSP. Other operators can be considered at will (hiding, restriction, ...), with the usual root condition being needed to arrive at a congruence for non-deterministic choice.

Organisation. Section 2 sets the ground for the paper. In Section 3 we introduce Markov automata. The main contribution of this paper is presented in Section 4. In Section 5, we prove that our notion of weak bisimulation covers the notion of equivalence for GSPNs as a special case. Section 6 discusses related work and concludes the paper.

2 Preliminaries

(Sub-)distributions. A subdistribution $\mu$ over a set $S$ is a function $\mu : S \mapsto [0,1]$ such that $\sum_{s \in S} \mu(s) \le 1$. We denote by $Supp(\mu) = \{s \in S \mid \mu(s) > 0\}$ the support of $\mu$ and define the probability of $S' \subseteq S$ with respect to $\mu$ as $\mu(S') := \sum_{s \in S'} \mu(s)$. The subdistribution $\mu$ with $Supp(\mu) = \emptyset$ is denoted by $\mu_\emptyset$. Let $|\mu| := \mu(S)$ denote the size of the subdistribution $\mu$. We say $\mu$ is a full distribution, or distribution, if $|\mu| = 1$. Let $Dist(S)$ and $Subdist(S)$ be the set of distributions and subdistributions over $S$, respectively. For $s \in S$, we let $\Delta_s \in Dist(S)$ denote the Dirac distribution, i.e., $\Delta_s(s) = 1$. Let $\mu$ and $\mu'$ be two subdistributions. We define the subdistribution $\mu'' := \mu \oplus \mu'$ by: $\mu''(s) = \mu(s) + \mu'(s)$, if $|\mu''| \le 1$. Conversely, we say that $\mu''$ can be split into $\mu$ and $\mu'$, or that $(\mu, \mu')$ is a splitting of $\mu''$. Moreover, if $x \cdot |\mu| \le 1$, we let $x\mu$ denote the subdistribution defined by: $(x\mu)(s) = x \cdot \mu(s)$.
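The operations on subdistributions introduced above can be illustrated by a minimal Python sketch. It assumes finitely supported subdistributions stored as dictionaries from states to exact fractions; the helper names `supp`, `size`, `oplus`, `scale` and `dirac` are chosen here for illustration and do not come from the paper.

```python
from fractions import Fraction

def supp(mu):
    """Support of mu: the states with strictly positive probability."""
    return {s for s, p in mu.items() if p > 0}

def size(mu):
    """|mu| = mu(S), the total probability mass of mu."""
    return sum(mu.values(), Fraction(0))

def oplus(mu, nu):
    """Pointwise sum of mu and nu; only defined if the result has size <= 1."""
    out = dict(mu)
    for s, p in nu.items():
        out[s] = out.get(s, Fraction(0)) + p
    if size(out) > 1:
        raise ValueError("the sum is not a subdistribution")
    return out

def scale(x, mu):
    """x * mu; only defined if x * |mu| <= 1."""
    if x * size(mu) > 1:
        raise ValueError("the scaled function is not a subdistribution")
    return {s: x * p for s, p in mu.items()}

def dirac(s):
    """Dirac distribution assigning probability 1 to s."""
    return {s: Fraction(1)}

# (mu, nu) is a splitting of the full distribution `full`.
mu = {"s1": Fraction(3, 4)}
nu = {"s2": Fraction(1, 4)}
full = oplus(mu, nu)
```

Exact fractions are used so that the side conditions on sizes can be checked without rounding issues.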

(Sub-)distributions can also be considered as sets over $S \times (0,1]$, where $(s_1, r_1), (s_2, r_2) \in S \times (0,1] \wedge s_1 = s_2$ implies $r_1 = r_2$, and where the second components of


will be widely used throughout the paper. For example, to denote the distribution $\mu$ with $\mu(s_1) = 0.75$ and $\mu(s_2) = 0.25$, we may write $\mu = \llbracket (s_1, 0.75), (s_2, 0.25) \rrbracket$. For an element $s \in S$ and a subdistribution $\mu$ over $S$, let the expression $\mu - s$ denote the subdistribution that is obtained from $\mu$ by removing the pair $(s, \mu(s))$ from $\mu$, if it exists. To make clear when we talk about sets representing subdistributions and when about sets in general, we use $\llbracket$ and $\rrbracket$ for subdistributions, $\{$ and $\}$ for sets. Since $\oplus$ is associative and commutative, we may use the notation $\bigoplus$ for arbitrary finite sums.

Labelled trees. For $\sigma, \sigma' \in (\mathbb{N}_{>0})^*$ we write $\sigma \le \sigma'$ if there exists a (possibly empty) $\varphi$ such that $\sigma\varphi = \sigma'$.

A partial function $T : (\mathbb{N}_{>0})^* \to L$, which satisfies

– if for $\sigma, \sigma' \in (\mathbb{N}_{>0})^*$: $\sigma \le \sigma'$ and $\sigma' \in dom(T)$ then $\sigma \in dom(T)$
– if $\sigma i \in dom(T)$ for $i > 1$, then also $\sigma(i-1) \in dom(T)$
– $\varepsilon \in dom(T)$

is called an (infinite) $L$-labelled tree. Let $\sigma \in dom(T)$: $\sigma$ is called a leaf of $T$ if there is no $\sigma' \in dom(T)$ such that $\sigma < \sigma'$. $\varepsilon$ is called the root of $T$. We denote the set of all leaves of $T$ by $Leaf_T$ and the set of all inner nodes by $Inner_T$. If the tree has only one node, the root node, then this node is contained in both $Inner_T$ and $Leaf_T$. In any other case the two sets are disjoint. For a node $\sigma$ of a tree $T$ let $Children(\sigma) = \{\sigma i \mid \sigma i \in dom(T)\}$. In this paper, we consider $L$-labelled trees with finite branching, i.e., $|Children(\sigma)| < \infty$ for all nodes $\sigma$.
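The closure conditions on the domain of a labelled tree can be checked mechanically. The following Python sketch assumes a finite domain given as a set of tuples of positive integers (the empty tuple plays the role of $\varepsilon$); the function names are illustrative only, and the restriction to finite domains is a simplification of the definition above.

```python
def is_tree_domain(dom):
    """Check the three conditions on dom(T) for a finite set of nodes."""
    if () not in dom:                       # the root must be present
        return False
    for sigma in dom:
        if not sigma:
            continue
        parent, i = sigma[:-1], sigma[-1]
        if parent not in dom:               # prefix closure (parent check suffices)
            return False
        if i > 1 and parent + (i - 1,) not in dom:   # left sibling must exist
            return False
    return True

def children(dom, sigma):
    """Children(sigma): the nodes sigma·i that belong to the domain."""
    return {t for t in dom if len(t) == len(sigma) + 1 and t[:len(sigma)] == sigma}

def leaves(dom):
    """Nodes without any proper extension in the domain."""
    return {s for s in dom if not children(dom, s)}

def inner(dom):
    """Inner nodes; in a single-node tree the root is both inner node and leaf."""
    return {()} if dom == {()} else dom - leaves(dom)

dom = {(), (1,), (2,), (1, 1)}
```

Checking only the parent of each node is enough for prefix closure, because the parent is itself a node of the domain and is checked in turn.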

3 Markov automata

We integrate probabilistic automata and interactive Markov chains into one model, defined by means of a twofold transition relation $\rightarrowtriangle$ and $\dashrightarrow$:

Definition 1. A Markov automaton MA is a quintuple $(S, Act, \rightarrowtriangle, \dashrightarrow, s_o)$, where

– $S$ is a nonempty finite set of states,
– $Act$ is a set of actions containing the internal action $\tau$,
– $\rightarrowtriangle \subset S \times Act \times Dist(S)$ is a set of probabilistic transitions, and
– $\dashrightarrow \subset S \times \mathbb{R}_{\ge 0} \times S$ is a set of Markov timed transitions, and
– $s_o \in S$ is the initial state.

We let $s, u, v, t, E, F, G$ and their variants with indices range over $S$. For Markov timed transitions, $\lambda, \mu \in \mathbb{R}_{\ge 0}$ denote rates of exponential distributions. For probabilistic transitions, $a$ ranges over $Act$, and $\mu$ ranges over $Dist(S)$. A probabilistic transition $(E, a, \mu) \in \rightarrowtriangle$ is also denoted by $E \stackrel{a}{\rightarrowtriangle} \mu$; similarly we define $E \stackrel{\lambda}{\dashrightarrow} F$. We say an action $a \in Act$ is enabled in $E$, if there exists a probabilistic transition $E \stackrel{a}{\rightarrowtriangle} \mu$. A state $E \in S$ is called stable if the internal action $\tau$ is not enabled in $E$. If $E$ is stable, we use the shorthand notation $E{\downarrow}$. We employ the maximal progress assumption. This means that if a state is not stable, time is not allowed to progress, making Markov timed transitions out of this state irrelevant [20]. As in IMC, this assumption is not evident in the model, but part of the equivalences defined on it.

Now we define a (nonnegative) real-valued function $rate_{MA} : S \times S \mapsto \mathbb{R}_{\ge 0}$, that calculates the rate to reach a state $s'$ from a state $s$ by

$$rate_{MA}(s, s') = \sum_{(s, \lambda, s') \in \dashrightarrow} \lambda .$$

Moreover, we define $rate_{MA}(s) := \sum_{s'} rate_{MA}(s, s')$ as the exit rate of $s$. The subscript is omitted if clear from the context.

The delay associated with a state that enables Markov timed transitions is exponentially distributed with the exit rate: for state $u$ in the upper part of Figure 1 the probability that the transition is executed in the next $z \in \mathbb{R}_{\ge 0}$ time units is $1 - e^{-rate(u) z} = 1 - e^{-3\lambda z}$. In general, the probability to move from $u$ to the successor state $E$ equals the probability that (one of) the Markov timed transitions that lead from $u$ to $E$ wins the race. Therefore, the discrete branching probability to move to $E$ is given by

$$P(u, E) := \frac{rate(u, E)}{rate(u)},$$

which is $\frac{1}{3}$ in our example. For $u \in S$, we use $P(u, \cdot)$ to denote this discrete branching distribution.
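The exit rate and the discrete branching probabilities can be computed directly from the set of Markov timed transitions. The sketch below assumes that $rate(s, s')$ sums the rates of all Markov timed transitions from $s$ to $s'$, and encodes state $u$ of Figure 1 with $\lambda = 1$; the triple encoding and the function names `rate`, `branching` and `fire_within` are assumptions made for this illustration.

```python
import math
from fractions import Fraction

lam = Fraction(1)  # illustrative value for the rate lambda

# Markov timed transitions as (source, rate, target) triples: state u of Figure 1.
markov = [("u", lam, "E"), ("u", 2 * lam, "F")]

def rate(s, t=None):
    """rate(s, t) if a target t is given, otherwise the exit rate rate(s)."""
    return sum((r for (src, r, tgt) in markov
                if src == s and (t is None or tgt == t)), Fraction(0))

def branching(s):
    """Discrete branching distribution P(s, .) of a state with positive exit rate."""
    total = rate(s)
    if total == 0:
        raise ValueError("state has no Markov timed transitions")
    targets = {tgt for (src, _, tgt) in markov if src == s}
    return {t: rate(s, t) / total for t in targets}

def fire_within(s, z):
    """Probability that some Markov timed transition of s fires within z time units."""
    return 1.0 - math.exp(-float(rate(s)) * z)
```

For the encoded state, `rate("u")` is $3\lambda$ and `branching("u")` assigns $\frac{1}{3}$ to $E$ and $\frac{2}{3}$ to $F$, in line with the example in the text.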

The subset of PA arises if $\dashrightarrow = \emptyset$, and the IMC subset is the one where the distributions occurring as third components of $\rightarrowtriangle$ are all Dirac.

Parallel composition

Assuming we are given two MAs $MA_1 = (S^1, Act^1, \rightarrowtriangle^1, \dashrightarrow^1, s_o^1)$ and $MA_2 = (S^2, Act^2, \rightarrowtriangle^2, \dashrightarrow^2, s_o^2)$, we consider a family of parallel operators $\|_A$ indexed by some set $A \subseteq (Act^1 \cup Act^2) - \{\tau\}$. For a clear presentation we use these operators as syntactical means to denote some state $s_1 \|_A s_2$, which arises by the parallel composition of $s_1$ and $s_2$. As syntactic sugar, we lift them to subdistributions as follows: for subdistributions $\mu_1 \in Subdist(S^1)$ and $\mu_2 \in Subdist(S^2)$, $\mu_1 \|_A \mu_2$ denotes the subdistribution in $Subdist(S^1 \times S^2)$ obtained by distributing $\|_A$. As an example, we have $(\mu_1 \|_A \mu_2)(s_1 \|_A s_2) = \mu_1(s_1) \cdot \mu_2(s_2)$.

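Definition 2. Let $MA_1$, $MA_2$ and $A$ be as discussed above. The parallel operator can be applied to the two MAs to form the parallel composition $MA_1 \|_A MA_2 = (S, Act^1 \cup Act^2, \rightarrowtriangle, \dashrightarrow, s_o)$ of processes where

– $S = \{s_1 \|_A s_2 \mid (s_1, s_2) \in S^1 \times S^2\}$,
– $(s_1 \|_A s_2, \alpha, \mu_1 \|_A \mu_2) \in \rightarrowtriangle$ iff either
  • $\alpha \in A$ and for each $i \in \{1, 2\}$, $(s_i, \alpha, \mu_i) \in \rightarrowtriangle^i$, or
  • $\alpha \notin A$ and $(s_1, \alpha, \mu_1) \in \rightarrowtriangle^1 \wedge \mu_2 = \Delta_{s_2}$ or $(s_2, \alpha, \mu_2) \in \rightarrowtriangle^2 \wedge \mu_1 = \Delta_{s_1}$
– $(s_1 \|_A s_2, \lambda, s_1' \|_A s_2') \in \dashrightarrow$ iff either
  • if for each $i \in \{1, 2\}$, $rate_{MA_i}(s_i, s_i') > 0 \wedge s_i = s_i'$ then $\lambda = rate_{MA_1}(s_1, s_1') + rate_{MA_2}(s_2, s_2')$, otherwise
  • $\lambda = rate_{MA_1}(s_1, s_1')$ and $s_2' = s_2$, or $\lambda = rate_{MA_2}(s_2, s_2')$ and $s_1' = s_1$,
– $s_o = s_o^1 \|_A s_o^2 \in S^1 \times S^2$ is the initial state.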

This composition agrees with the one for IMC and for PA (neglecting minor differences in synchronisation set constructions) on the respective subsets [7,21]. The style of defining this operator can be made more elegant though [22].
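The rule for probabilistic transitions of the composed automaton can be sketched in Python as follows. The sketch covers only the probabilistic part of Definition 2 (the Markov timed part is omitted), represents a composed state $s_1 \|_A s_2$ as the pair `(s1, s2)`, and uses hypothetical names `par_dist` and `par_prob_transitions`.

```python
from fractions import Fraction
from itertools import product

def par_dist(mu1, mu2):
    """Lift the parallel operator to distributions: product of probabilities."""
    return {(s1, s2): p1 * p2
            for (s1, p1), (s2, p2) in product(mu1.items(), mu2.items())}

def par_prob_transitions(S1, T1, S2, T2, A):
    """Probabilistic transitions of the composition.

    T1 and T2 are lists of (state, action, distribution); A is the
    synchronisation set (assumed not to contain the internal action).
    """
    out = []
    for (s1, a, mu1) in T1:
        if a in A:
            # synchronisation: both components perform a together
            for (s2, b, mu2) in T2:
                if b == a:
                    out.append(((s1, s2), a, par_dist(mu1, mu2)))
        else:
            # interleaving: the second component stays put (Dirac)
            for s2 in S2:
                out.append(((s1, s2), a, par_dist(mu1, {s2: Fraction(1)})))
    for (s2, a, mu2) in T2:
        if a not in A:
            for s1 in S1:
                out.append(((s1, s2), a, par_dist({s1: Fraction(1)}, mu2)))
    return out

half = Fraction(1, 2)
S1, T1 = {"p", "p1", "p2"}, [("p", "a", {"p1": half, "p2": half})]
S2, T2 = {"q", "q1"}, [("q", "a", {"q1": Fraction(1)}), ("q", "tau", {"q": Fraction(1)})]
composed = par_prob_transitions(S1, T1, S2, T2, A={"a"})
```

In this example the two components synchronise on `a`, while the internal step of the second component is interleaved with every state of the first.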


4 Bisimulations

We first introduce some notation that makes our further discussion more compact, at the price of mildly reduced readability. It enables a uniform treatment of probabilistic and Markov timed transitions. In doing so, we introduce the pseudo-action $\chi(r)$ to denote the exit rate $r$ of a state. Moreover, we let $\chi := \{\chi(r) \mid r \in \mathbb{R}_{\ge 0}\}$ and $Act^\chi := Act \cup \chi$, and $\alpha, \beta, \ldots$ range over this set.

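Definition 3. Let $MA = (S, Act, \rightarrowtriangle, \dashrightarrow, s_o)$ be an MA. Let $E \in S$ and $\alpha \in Act^\chi$. We write $E \xrightarrow{\alpha} \mu$ if

– $E \stackrel{\alpha}{\rightarrowtriangle} \mu \wedge \alpha \in Act$ or
– $E{\downarrow} \wedge r = rate(E) \wedge \alpha = \chi(r) \wedge \mu = P(E, \cdot)$.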

In order to compare two MAs, we compare their initial states in the direct sum of the two MAs. For this purpose we introduce the direct sum of two MAs:

Definition 4. Let $MA_1 = (S^1, Act^1, \rightarrowtriangle^1, \dashrightarrow^1, s_o^1)$ and $MA_2 = (S^2, Act^2, \rightarrowtriangle^2, \dashrightarrow^2, s_o^2)$ be two MAs with $S^1 \cap S^2 = \emptyset$. Their direct sum is defined as the MA $(S^1 \mathbin{\dot\cup} S^2, Act^1 \cup Act^2, \rightarrowtriangle^1 \mathbin{\dot\cup} \rightarrowtriangle^2, \dashrightarrow^1 \mathbin{\dot\cup} \dashrightarrow^2, s_o^1)$.

Since the initial state of the direct sum automaton does not play a role for bisimulation, the choice of $s_o^1$ as initial state is arbitrary.

4.1 Strong Bisimilarity

For strong bisimulation, the obvious combination of strong bisimulation for IMCs and strong bisimulation for PAs can be phrased as follows:

Definition 5. Let $MA = (S, Act, \rightarrowtriangle, \dashrightarrow, s_o)$ be an MA. Let $R$ be an equivalence relation on $S$. Then, $R$ is a strong bisimulation iff $E \mathrel{R} F$ implies for all $\alpha \in Act^\chi$: $E \xrightarrow{\alpha} \mu$ implies $F \xrightarrow{\alpha} \mu'$ with $\mu(C) = \mu'(C)$ for all $C \in S/R$.

Two states $E$ and $F$ are strongly bisimilar, written $E \sim F$, if $(E, F)$ is contained in some strong bisimulation. Two MAs are strongly bisimilar if their initial states are strongly bisimilar in the direct sum of the MAs.

In case that $\alpha \in Act$, the condition $\mu(C) = \mu'(C)$ is the same as for PAs [4]. For $\alpha = \chi(r)$, it is an equivalent reformulation of the following: $E{\downarrow}$ implies $rate(E) = rate(F)$ and $rate(E, C) = rate(F, C)$ for all $C \in S/R$. The latter is exactly the one used for IMCs [7], and as such implements maximal progress. Since our definition conservatively extends strong bisimulation on IMCs and PAs, some nice properties, such as strong bisimulation $\sim$ being an equivalence relation and a congruence relation with respect to parallel composition, can be established by combining standard techniques from [7] for IMCs and [4,23] for PAs. In the following, we focus on weak bisimulation, which is the central contribution of this paper. We start by introducing weak transitions.
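For a finite automaton, the strong bisimulation condition can be checked for a given partition into equivalence classes. The Python sketch below assumes that the relation $E \xrightarrow{\alpha} \mu$ of Definition 3 has already been computed and is given as a dictionary `steps` from states to lists of (label, distribution) pairs, with labels such as `("chi", 2)` standing for the pseudo-action $\chi(2)$; this encoding and the function names are assumptions of the sketch.

```python
from fractions import Fraction

def class_mass(mu, C):
    """mu(C): probability that mu assigns to the class C."""
    return sum((p for s, p in mu.items() if s in C), Fraction(0))

def agree_on_classes(mu, nu, partition):
    """mu(C) = nu(C) for every class C of the partition."""
    return all(class_mass(mu, C) == class_mass(nu, C) for C in partition)

def is_strong_bisimulation(partition, steps):
    """Check the transfer condition for all pairs of states in the same class."""
    for C in partition:
        for E in C:
            for F in C:
                for label, mu in steps.get(E, []):
                    if not any(l == label and agree_on_classes(mu, nu, partition)
                               for l, nu in steps.get(F, [])):
                        return False
    return True

half = Fraction(1, 2)
partition = [{"s", "t"}, {"E", "E2"}]
steps = {
    "s": [(("chi", 2), {"E": Fraction(1)})],
    "t": [(("chi", 2), {"E": half, "E2": half})],
}
good = is_strong_bisimulation(partition, steps)

# Changing the exit rate of t breaks the condition.
steps_bad = dict(steps, t=[(("chi", 3), {"E": Fraction(1)})])
bad = is_strong_bisimulation(partition, steps_bad)
```

Because both `E` and `F` range over the whole class, the symmetric direction of the condition is covered by the same loop.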


4.2 Weak Transitions

Weak transitions for probabilistic systems have been defined in the literature via probabilistic executions [4], trees [24], or infinite sums [25]. We adopt the tree notation, and decorate nodes with labels for our convenience in developing the theory.

We consider in the following $S \times \mathbb{R}_{\ge 0} \times (Act^\chi \mathbin{\dot\cup} \{\bot\})$-labelled trees. Briefly, a node in such trees is labelled by the corresponding state, the probability of reaching this node, and the chosen action (including the pseudo-action for Markov timed transitions) to proceed. For a node $\sigma$ we write $Sta_T(\sigma)$ for the first component of $T(\sigma)$, $Prob_T(\sigma)$ for the second component of $T(\sigma)$ and $Act_T(\sigma)$ for the third component of $T(\sigma)$.

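Definition 6. Let $MA = (S, Act, \rightarrowtriangle, \dashrightarrow, s_o)$ be an MA. A transition tree $T$ is a $S \times \mathbb{R}_{\ge 0} \times (Act^\chi \mathbin{\dot\cup} \{\bot\})$-labelled tree that satisfies the following conditions:

1. $Prob_T(\varepsilon) = 1$,
2. $\forall \sigma \in Leaf_T : Act_T(\sigma) = \bot$,
3. $\forall \sigma \in Inner_T \setminus Leaf_T : \exists \mu : Sta_T(\sigma) \xrightarrow{Act_T(\sigma)} \mu$ and $Prob_T(\sigma) \cdot \mu = \llbracket (Sta_T(\sigma'), Prob_T(\sigma')) \mid \sigma' \in Children_T(\sigma) \rrbracket$.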

Note that the definition implies that $\sum_{\sigma \in Leaf_T} Prob_T(\sigma) = Prob_T(\varepsilon) = 1$. Restricting $Act^\chi$ to $Act$, a transition tree $T$ corresponds to a probabilistic execution fragment: it starts from $Sta_T(\varepsilon)$, and resolves the non-deterministic choice by executing a probabilistic transition with the action $Act_T(\sigma)$ at the inner node $\sigma$. The second label of $\sigma$ is then the probability of reaching $Sta_T(\sigma)$, starting from $Sta_T(\varepsilon)$ and following the selected transitions. If in a node $\sigma$ the Markov timed transition is chosen, the third label $Act_T(\sigma) \in \mathbb{R}_{\ge 0}$ represents the exit rate of $Sta_T(\sigma)$. In this case, a child $\sigma'$ is reached with $Prob_T(\sigma)$ times the discrete branching probability $P(Sta_T(\sigma), Sta_T(\sigma'))$.
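The bookkeeping of probabilities in a finite transition tree can be made concrete with a small Python sketch. It assumes a tree stored as a dictionary from nodes (tuples of positive integers) to triples (state, probability, action), with `None` standing for $\bot$, and encodes the behaviour of state $v$ in Figure 1 as described in the text: a Markov timed step to $v'$ followed by an internal step to $E$ with probability $\frac{1}{3}$ and to $F$ with probability $\frac{2}{3}$. The names `leaf_nodes` and `leaf_distribution` are illustrative.

```python
from fractions import Fraction

# node -> (state, probability of reaching the node, chosen action or None for leaves)
tree = {
    ():     ("v",  Fraction(1),    "chi(3*lambda)"),
    (1,):   ("v'", Fraction(1),    "tau"),
    (1, 1): ("E",  Fraction(1, 3), None),
    (1, 2): ("F",  Fraction(2, 3), None),
}

def leaf_nodes(tree):
    """Nodes that have no child in the tree."""
    return [n for n in tree
            if not any(len(m) == len(n) + 1 and m[:len(n)] == n for m in tree)]

def leaf_distribution(tree):
    """Collect the probabilities of the leaves, grouped by their state."""
    out = {}
    for n in leaf_nodes(tree):
        state, prob, _ = tree[n]
        out[state] = out.get(state, Fraction(0)) + prob
    return out

leaf_mass = sum(tree[n][1] for n in leaf_nodes(tree))
```

For this finite tree the leaf probabilities sum to the probability of the root, as noted above.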

For two transition trees $T$ and $T'$, we say $T$ is a prefix of $T'$, $T \le T'$, if $dom(T) \subseteq dom(T')$ and $\forall \sigma \in Inner_{T'} \setminus Leaf_{T'} : T(\sigma) = T'(\sigma)$ and $\forall \sigma \in Leaf_{T'}$: either $T(\sigma) = T'(\sigma)$ or $\sigma \notin Leaf_T$ and $Sta_T(\sigma) = Sta_{T'}(\sigma)$ and $Prob_T(\sigma) = Prob_{T'}(\sigma)$.

An internal transition tree $T$ is a transition tree where each $Act_T(\sigma)$ is either $\tau$ or $\bot$. Let $T$ be a transition tree. Then the distribution associated with $T$, denoted by $\mu_T$, is defined as

$$\mu_T := \bigoplus_{\sigma \in Leaf_T} \llbracket (Sta_T(\sigma), Prob_T(\sigma)) \rrbracket .$$

We say the distribution $\mu_T$ is induced by $T$. With the above definitions we are now able to define weak transitions:

Definition 7. For $E \in S$ and $\mu$ a full distribution we write

– $E \Longrightarrow \mu$ if $\mu$ is induced by some internal transition tree $T$ with $Sta_T(\varepsilon) = E$.
– $E \stackrel{\alpha}{\Longrightarrow} \mu$ if $\mu$ is induced by some transition tree $T$ with $Sta_T(\varepsilon) = E$, where on every maximal path from the root at least one node $\sigma$ is labelled $Act_T(\sigma) = \alpha$. In case that $\alpha \ne \tau$, there must be exactly one such node on every maximal path, and all other inner nodes must be labelled by $\tau$.


For all three transition relations we say that the transition tree that induces $\mu$ also induces the transition to $\mu$.

Note that $E \Longrightarrow \Delta_E$ and $E \stackrel{\hat\tau}{\Longrightarrow} \Delta_E$ hold independently of the actual transitions $E$ can perform, whereas $E \stackrel{\tau}{\Longrightarrow} \Delta_E$ only holds if $E \stackrel{\tau}{\rightarrowtriangle} \Delta_E$. For all $\alpha \ne \tau$, $E \stackrel{\hat\alpha}{\Longrightarrow} \mu$ is identical to $E \stackrel{\alpha}{\Longrightarrow} \mu$. Below we define the notion of combined transitions [4], which arise as convex combinations of a set of transitions with the same label, including the label representing Markov timed transitions.

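Definition 8. We write $E \stackrel{\alpha}{\Longrightarrow}_C \mu$, if $\alpha \in Act^\chi$ and there is a finite indexed set $\{(c_i, \mu_i)\}_{i \in I}$ of pairs of positive real valued weights and distributions such that $E \stackrel{\alpha}{\Longrightarrow} \mu_i$ for each $i \in I$ and $\sum_{i \in I} c_i = 1$ and $\mu = \bigoplus_{i \in I} c_i \mu_i$.

We say that $E \stackrel{\alpha}{\Longrightarrow}_C \mu$ is justified by the set $\{(c_i, \mu_i)\}_{i \in I}$.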
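The target of a combined transition is a convex combination of the targets of the justifying transitions. A minimal Python sketch of this combination, assuming distributions as dictionaries with exact fractions and using the illustrative name `combine`:

```python
from fractions import Fraction

def combine(weighted):
    """Convex combination of distributions.

    `weighted` is a list of pairs (c_i, mu_i) with positive weights c_i
    that sum up to 1; the result is the sum of the scaled distributions.
    """
    if any(c <= 0 for c, _ in weighted):
        raise ValueError("weights must be positive")
    if sum(c for c, _ in weighted) != 1:
        raise ValueError("weights must sum up to 1")
    out = {}
    for c, mu in weighted:
        for s, p in mu.items():
            out[s] = out.get(s, Fraction(0)) + c * p
    return out

half = Fraction(1, 2)
# Two transitions with the same label, one to E and one to F, combined evenly.
mu = combine([(half, {"E": Fraction(1)}), (half, {"F": Fraction(1)})])
```

The sketch only computes the resulting distribution; whether each $\mu_i$ is actually reachable by a weak transition has to be established separately.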

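Transition relations from states to distributions can be generalised to take (sub)distributions $\mu$ to (sub)distributions, by weighting the result distribution of the transition of each element $E \in Supp(\mu)$ by $\mu(E)$.

Definition 9. Let $\leadsto \in \{\stackrel{\alpha}{\Longrightarrow}, \stackrel{\hat\alpha}{\Longrightarrow}, \stackrel{\alpha}{\Longrightarrow}_C, \stackrel{\hat\alpha}{\Longrightarrow}_C, \xrightarrow{\alpha}\}$. Then, we write $\mu \leadsto \gamma$ if $\gamma = \bigoplus_{s_i \in Supp(\mu)} \mu(s_i) \mu_i$, where $s_i \leadsto \mu_i$ holds for all $s_i \in Supp(\mu)$.

We say that $\mu \leadsto \gamma$ is justified by the transitions $s_i \leadsto \mu_i$. We also generalise stability to subdistributions: $\mu{\downarrow}$ if $\forall E \in Supp(\mu) : E{\downarrow}$.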

Basic Properties of Weak Transitions

Throughout the paper, we will usually assume that $S$ is the set of states of some $MA = (S, Act, \rightarrowtriangle, \dashrightarrow, s_o)$ in the lemmas and proofs without further reference.

For two subdistributions $\mu$ and $\gamma$, let the distance $\Delta(\mu, \gamma)$ be defined as $\sum_{A \subseteq S} |\mu(A) - \gamma(A)|$.
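For small state spaces the distance can be evaluated by enumerating all subsets of $S$. The following Python sketch does exactly that; it is exponential in the number of states and is only meant to illustrate the definition, with helper names `subsets`, `mass` and `distance` chosen for this example.

```python
from fractions import Fraction
from itertools import chain, combinations

def subsets(S):
    """All subsets of the finite set S, as tuples."""
    items = list(S)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def mass(mu, A):
    """mu(A): probability that mu assigns to the set A."""
    return sum((mu.get(s, Fraction(0)) for s in A), Fraction(0))

def distance(mu, gamma, S):
    """Sum over all subsets A of S of |mu(A) - gamma(A)|."""
    return sum((abs(mass(mu, A) - mass(gamma, A)) for A in subsets(S)), Fraction(0))

S = {"E", "F"}
mu = {"E": Fraction(1, 3), "F": Fraction(2, 3)}
gamma = {"E": Fraction(1, 2), "F": Fraction(1, 2)}
d = distance(mu, gamma, S)
```

In the example only the two singleton sets contribute, each with $\frac{1}{6}$.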

Lemma 1. Let $E \in S$ and $\alpha \in Act$. Then, for any sequence of distributions $\{\mu_i\}_{i \in I}$ such that $E \stackrel{\alpha}{\Longrightarrow}_C \mu_i$, there exists $E \stackrel{\alpha}{\Longrightarrow}_C \mu$ and a subsequence $\{\mu_i\}_{i \in J}$ with $J \subseteq I$ such that $\forall \epsilon : \exists k : \forall k' \in J \wedge k' \ge k : \Delta(\mu_{k'}, \mu) < \epsilon$.

The lemma means that the limiting distribution is also attained by some weak combined transition. Desharnais [24] has proven this lemma for labelled Markov processes. Since our weak transition involves only probabilistic states, the proof can be applied directly to prove the above lemma.

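Lemma 2. Let $E \in S$ and $\alpha \in Act$. Then, let $\{\mu_i\}_{i \in I}$ be a sequence of distributions, such that $E \stackrel{\alpha}{\Longrightarrow}_C \mu_i$ and the $\mu_i$ have limit distribution $\mu$, i.e., $\forall \epsilon : \exists k : \forall k' \in I \wedge k' \ge k : \Delta(\mu_{k'}, \mu) < \epsilon$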


Lemma 3. Let $T$ be an internal transition tree. Then every sequence of finite subtrees of $T$, ordered by $<$, contains a subsequence of finite transition trees, whose induced distributions have limit distribution $\mu_T$.

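Lemma 4. Let $\alpha \in Act$.

– If $\mu \Longrightarrow_C \mu'$ and $\mu' \Longrightarrow_C \mu''$ then also $\mu \Longrightarrow_C \mu''$.
– If $\mu \stackrel{\alpha}{\Longrightarrow}_C \mu'$ and $\mu' \Longrightarrow_C \mu''$ then also $\mu \stackrel{\alpha}{\Longrightarrow}_C \mu''$.
– If $\mu \Longrightarrow_C \mu'$ and $\mu' \stackrel{\alpha}{\Longrightarrow}_C \mu''$ then also $\mu \stackrel{\alpha}{\Longrightarrow}_C \mu''$.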

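Lemma 5. Let $\mu$ and $\gamma$ be subdistributions, such that $\mu \oplus \gamma$ is again a subdistribution. And let $\mu \stackrel{\hat a}{\Longrightarrow}_C \mu'$ and $\gamma \stackrel{\hat a}{\Longrightarrow}_C \gamma'$. Then $\mu \oplus \gamma \stackrel{\hat a}{\Longrightarrow}_C \mu' \oplus \gamma'$.

Proof. We partition $Supp(\mu \oplus \gamma)$ into disjoint sets:

(i) $\{E_j \mid \mu(E_j) > 0 \wedge \gamma(E_j) = 0\}_{j \in J}$,
(ii) $\{F_k \mid \mu(F_k) = 0 \wedge \gamma(F_k) > 0\}_{k \in K}$ and
(iii) $\{C_i \mid \mu(C_i) > 0 \wedge \gamma(C_i) > 0\}_{i \in I}$.

Let $\mu \stackrel{\hat a}{\Longrightarrow}_C \mu'$ be justified by the transitions $T_{E_j} = (E_j \stackrel{\hat a}{\Longrightarrow}_C \mu_{E_j})$ for $j \in J$ and $C_i \stackrel{\hat a}{\Longrightarrow}_C \mu_{C_i}$ for $i \in I$, and let $\gamma \stackrel{\hat a}{\Longrightarrow}_C \gamma'$ be justified by the transitions $T_{F_k} = (F_k \stackrel{\hat a}{\Longrightarrow}_C \gamma_{F_k})$ for $k \in K$ and $C_i \stackrel{\hat a}{\Longrightarrow}_C \gamma_{C_i}$ for $i \in I$. Combining the two justifying transitions of $C_i$, we let

$$C_i \stackrel{\hat a}{\Longrightarrow}_C \xi_{C_i} := \frac{\mu(C_i)}{\mu(C_i) + \gamma(C_i)} \mu_{C_i} \oplus \frac{\gamma(C_i)}{\mu(C_i) + \gamma(C_i)} \gamma_{C_i} .$$

Call this transition $T_{C_i}$ for each $C_i$.

First, we rewrite the distribution $\mu'$ as follows:

$$\mu' = \bigoplus_{G \in \{E_j\}_{j \in J} \cup \{C_i\}_{i \in I}} \mu(G)\, \mu_G = \underbrace{\bigoplus_{E \in \{E_j\}_{j \in J}} \mu(E)\, \mu_E}_{=:\, y_1} \oplus \underbrace{\bigoplus_{C \in \{C_i\}_{i \in I}} \mu(C)\, \mu_C}_{=:\, x_1} \tag{1}$$

Similarly, we have:

$$\gamma' = \bigoplus_{H \in \{F_k\}_{k \in K} \cup \{C_i\}_{i \in I}} \gamma(H)\, \gamma_H = \underbrace{\bigoplus_{F \in \{F_k\}_{k \in K}} \gamma(F)\, \gamma_F}_{=:\, y_2} \oplus \underbrace{\bigoplus_{C \in \{C_i\}_{i \in I}} \gamma(C)\, \gamma_C}_{=:\, x_2} \tag{2}$$

In order to obtain $\mu' \oplus \gamma'$, we first add $x_1$ and $x_2$:

$$x := x_1 \oplus x_2 \tag{3}$$

$$x = \bigoplus_{C \in \{C_i\}_{i \in I}} \mu(C)\, \mu_C \oplus \bigoplus_{C \in \{C_i\}_{i \in I}} \gamma(C)\, \gamma_C = \bigoplus_{C \in \{C_i\}_{i \in I}} \big( \mu(C)\, \mu_C \oplus \gamma(C)\, \gamma_C \big) = \bigoplus_{C \in \{C_i\}_{i \in I}} (\mu(C) + \gamma(C))\, \xi_C = \bigoplus_{C \in \{C_i\}_{i \in I}} (\mu \oplus \gamma)(C)\, \xi_C \tag{4}$$

Considering the way the partitioning was chosen, $y_1$ and $y_2$ can be further rewritten:

$$y_1 = \bigoplus_{j \in J} (\mu \oplus \gamma)(E_j)\, \mu_{E_j} \qquad y_2 = \bigoplus_{k \in K} (\mu \oplus \gamma)(F_k)\, \gamma_{F_k}$$

Then, adding together $y_1$, $y_2$ and $x$, we get:

$$\mu' \oplus \gamma' = \Big( \bigoplus_{j \in J} (\mu \oplus \gamma)(E_j)\, \mu_{E_j} \Big) \oplus \Big( \bigoplus_{k \in K} (\mu \oplus \gamma)(F_k)\, \gamma_{F_k} \Big) \oplus \Big( \bigoplus_{i \in I} (\mu \oplus \gamma)(C_i)\, \xi_{C_i} \Big) .$$

Hence the transitions $T_{E_j}$ and $T_{C_i}$ and $T_{F_k}$ together justify $\mu \oplus \gamma \stackrel{\hat a}{\Longrightarrow}_C \mu' \oplus \gamma'$. □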

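Lemma 6. Let $\mu$ be a distribution. Then for each $0 < k < 1$:

1. For all $\mu_1, \mu_2$: If $\mu = \mu_1 \oplus \mu_2$ then also $\mu = k\mu_1 \oplus ((1 - k)\mu_1 \oplus \mu_2)$.
2. For all $\mu'$: If $\mu \leadsto \mu'$ then $k\mu \leadsto k\mu'$, for every $\leadsto \in \{\stackrel{\alpha}{\Longrightarrow}, \stackrel{\hat\alpha}{\Longrightarrow}, \stackrel{\alpha}{\Longrightarrow}_C, \stackrel{\hat\alpha}{\Longrightarrow}_C, \xrightarrow{\alpha}\}$.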

4.3 Weak Bisimilarity

A notion of weak bisimulation satisfying our yardsticks in the introduction is not straightforward to find. As a first try, we combine the weak bisimulation relations for PA and IMC in the obvious way, analogously to our approach for strong bisimulation:

Definition 10 (Naive Weak Bisimulation). Let $R$ be an equivalence relation on $S$. Then, $R$ is a weak bisimulation iff $P \mathrel{R} Q$ implies for all $\alpha \in Act^\chi$: $P \xrightarrow{\alpha} \mu$ implies $Q \stackrel{\alpha}{\Longrightarrow}_C \mu'$ with $\mu(C) = \mu'(C)$ for all $C \in S/R$.


This naive weak bisimulation however is too fine. It fails to equate the states $u, v$ on the left part of our motivating example in Figure 1. To see this, consider the Markov timed transition $v \xrightarrow{\chi(3\lambda)} \Delta_{v'} =: \mu_1$. Since $E, F$ can be assumed stable, $u \xrightarrow{\chi(3\lambda)} \llbracket (E, \frac{1}{3}), (F, \frac{2}{3}) \rrbracket =: \mu_2$ is the only option to match this transition. Assuming that $E$ and $F$ lie in two different equivalence classes (for instance each of them can perform a distinct probabilistic transition with probability 1), it is easy to see that $[v'] \ne [E]$ and $[v'] \ne [F]$. It is thus impossible to fully satisfy the bisimulation condition in Definition 10, since $\mu_1([v']) = 1$, but $\mu_2([v']) = 0$. Thus $u$ and $v$ are not weak bisimilar here.

To overcome this problem, we take some inspiration from Andova and Baeten [26], and consider a state to be a representation of the distribution that is obtained by fusing distributions reachable by internal transitions, as long as there is no non-determinism. In the example above, this would remove our problems, since we would then consider $v'$ to represent the distribution $\llbracket (E, \frac{1}{3}), (F, \frac{2}{3}) \rrbracket$, which is perfectly matched by $\mu_2$.

It hence appears natural to define weak bisimulation as a relation over subdistributions over $S$. As for standard bisimulation, our notion of weak bisimulation is not defined over equivalence relations, but uses two symmetric bisimulation conditions instead.

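Definition 11 (Weak Bisimulation). A relation $R$ over subdistributions over $S$ is called a weak bisimulation if whenever $\mu_1 \mathrel{R} \mu_2$ then for all $\alpha \in Act^\chi$:

$$|\mu_1| = |\mu_2| \tag{5}$$

and

A.) $\forall E \in Supp(\mu_1)$: $\exists \mu_2^g, \mu_2^s$: $\mu_2 \Longrightarrow_C \mu_2^g \oplus \mu_2^s$ and
  (i) $\llbracket (E, \mu_1(E)) \rrbracket \mathrel{R} \mu_2^g$ and $(\mu_1 - E) \mathrel{R} \mu_2^s$
  (ii) whenever $E \xrightarrow{\alpha} \mu_1'$ for some $\mu_1'$ then $\mu_2^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu''$ and $\mu_1(E) \cdot \mu_1' \mathrel{R} \mu''$

and

B.) $\forall F \in Supp(\mu_2)$: $\exists \mu_1^g, \mu_1^s$: $\mu_1 \Longrightarrow_C \mu_1^g \oplus \mu_1^s$ and
  (i) $\mu_1^g \mathrel{R} \llbracket (F, \mu_2(F)) \rrbracket$ and $\mu_1^s \mathrel{R} (\mu_2 - F)$
  (ii) whenever $F \xrightarrow{\alpha} \mu_2'$ for some $\mu_2'$ then $\mu_1^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu''$ and $\mu'' \mathrel{R} \mu_2(F) \cdot \mu_2'$

Two subdistributions $\mu$ and $\gamma$ are weakly bisimilar, written $\mu \approx \gamma$, if the pair $(\mu, \gamma)$ is contained in some weak bisimulation. We say that two states $E, F$ are weak bisimilar, denoted by $E \approx_\Delta F$, iff $\Delta_E \approx \Delta_F$. Two MAs are weak bisimilar if their initial states are weak bisimilar in the direct sum of the MAs.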

This definition requires a careful explanation. First of all, two subdistributions can only be bisimilar if they have the same size (Condition (5)). Then, exactly as bisimulations are usually defined after Milner, our definition consists of two almost symmetric conditions, Condition A.) and B.). Let us have a closer look at Condition A.): The behaviour of a subdistribution is determined by the behaviour of the states in its support plus the respective probability it assigns to the individual states. Condition A.) now asks that for each state $E \in Supp(\mu_1)$, $\mu_2$ can reach a subdistribution via internal transitions such that $\mu_2^g$ behaves bisimilar to $\mu_1(E) \cdot \Delta_E$, and $\mu_2^s$ behaves bisimilar to $\mu_1 - E$, the remainder of distribution $\mu_1$ if we neglect $E$ (Condition A.(i)). Noteworthy, the subdistributions of the splittings must also match in their size. Condition A.(ii) is simply the usual bisimulation condition as also found for PAs and IMCs. Condition B.) is completely analogous.

The fact that in Condition A the distribution $\mu_2$ (and $\mu_1$ in Condition B, respectively) is split only after a sequence of internal transitions, and not necessarily immediately, is what adds the power of fusing distributions in this definition. It might be confusing that at this point we allow a combined transition, since this seems to allow for unexpected interference with non-determinism. However, as we will see in Theorem 2, the relation does not change if we restrict to weak (non-combined) transitions here. By repeated application of the bisimulation condition on $\mu_1 - E$ and $\mu_2^s$, it is not hard to see that in fact $\mu_2$ can reach a distribution that can be split into several subdistributions, such that each subdistribution is bisimilar to exactly one state in the support of $\mu_1$. Repeating this argument with the roles of $\mu_1$ and $\mu_2$ interchanged, this suggests that $\mu_1$ and $\mu_2$ can fuse internal transitions until they both reach distributions whose supports are state-wise bisimilar. Theorem 2 shall make this more explicit.

Fig. 2. Fusing distributions after Markov timed transitions

Example 1. Consider the MA depicted in Figure 2, and assume $E, F, G$ indicate distinct equivalence classes. For better readability, we have omitted the labels of internal transitions. Following similar discussions in the introduction, we want states $s$ and $v$, or the corresponding Dirac distributions $\Delta_s$ and $\Delta_v$, to be weak bisimilar. We prove this by giving a bisimulation $R$ containing $(\Delta_s, \Delta_v)$. Let $\mu_1 := \llbracket (E, \frac{1}{2}), (s', \frac{1}{2}) \rrbracket$ and $\mu_2 := \llbracket (E, \frac{1}{2}), (F, \frac{1}{4}), (G, \frac{1}{4}) \rrbracket$. Then $R := \{(\Delta_s, \Delta_v), (\mu_1, \mu_2), (\mu_2, \mu_2)\} \cup ID$ where $ID$ denotes the identity relation over Dirac distributions over $S$. All conditions of Definition 11 are easy to check for $(\Delta_s, \Delta_v)$. Obviously $|\Delta_s| = |\Delta_v|$. Since $Supp(\Delta_s) = \{s\}$, Condition A.(i) is easy to satisfy for $\Delta_v$, taking $\mu_2^g = \Delta_v$ and $\mu_2^s = \mu_\emptyset$. For Condition A.(ii) it suffices to mention that $\mu_1 \mathrel{R} \mu_2$. Condition B.) follows completely analogously.

Now consider the pair $(\mu_1, \mu_2)$. For $s' \in Supp(\mu_1)$, we need to split $\mu_2$: $\mu_2^g = \llbracket (F, \frac{1}{4}), (G, \frac{1}{4}) \rrbracket$ and $\mu_2^s = \llbracket (E, \frac{1}{2}) \rrbracket$. It is now routine to check the remaining conditions of Definition 11.

The fact that these two MAs are indeed weakly bisimilar is the nucleus for the fact that the relation will be a congruence with respect to (a choice operator and in particular) parallel composition of Markov timed delays.
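The arithmetic side conditions of the splitting used for the pair $(\mu_1, \mu_2)$ in Example 1 can be replayed in a few lines of Python. The sketch only checks that the proposed parts form a splitting of $\mu_2$ and that the sizes required by Condition (5) match; it does not establish bisimilarity of the parts. The variable and function names are chosen for this illustration.

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)

mu1 = {"E": half, "s'": half}
mu2 = {"E": half, "F": quarter, "G": quarter}

# Proposed splitting of mu2 for the state s' in the support of mu1.
mu2_g = {"F": quarter, "G": quarter}
mu2_s = {"E": half}

def size(mu):
    """Total probability mass of a subdistribution."""
    return sum(mu.values(), Fraction(0))

def oplus(mu, nu):
    """Pointwise sum of two subdistributions."""
    out = dict(mu)
    for s, p in nu.items():
        out[s] = out.get(s, Fraction(0)) + p
    return out

def remove(mu, s):
    """The subdistribution mu - s: mu without the pair for state s."""
    return {t: p for t, p in mu.items() if t != s}

# (mu2_g, mu2_s) is a splitting of mu2 ...
is_splitting = oplus(mu2_g, mu2_s) == mu2
# ... and the sizes agree with those of [(s', mu1(s'))] and mu1 - s'.
sizes_match = (size(mu2_g) == mu1["s'"]
               and size(mu2_s) == size(remove(mu1, "s'")))
```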


Example 2. In Figure 3, all transitions are internal, $\tau$-labels are omitted. Here, it is not obvious how to fuse distributions along internal transitions for state $s$, because it can non-deterministically choose between two internal transitions. However, we see that any resolution of non-determinism in $s$ induces the same distribution on the classes $E$, $F$ and $G$, namely the one on the right. Therefore, they are weak bisimilar. Note however that if we change a distribution for one of the internal transitions, then $s \not\approx t$.

Example 3. In Figure 4, both $s$ and $u$ have the same probabilities to reach equivalent states. For example, the probability to reach a state where the only possible transition performs action $e$ is 0.5, and the probability to reach a state that can perform exactly transitions with actions $e$ and $g$ is 0.25 for both states. Note that it does not play a role that $s$ has an action $e$ immediately enabled, since all internally reachable states of both $s$ and $u$ can internally reach a state where $e$ is enabled with probability 1. So neither $s$ nor $u$ can refuse $e$. On the other hand, replacing in $s$ the action $e$ by some other action $f \ne e$, $s$ and $u$ will not be equivalent anymore.

As a side remark, our new notion of weak bisimulation does not require that $\mathcal{R}$ is an equivalence relation, whereas the definitions for both PAs [4] and IMCs [7] do. The way it is defined, it conservatively extends the non-probabilistic case. A nice consequence is that our weak bisimulation enjoys some useful properties that are rare in probabilistic and stochastic calculi. For instance, it is easy to show that the union $\mathcal{R}_1 \cup \mathcal{R}_2$ is a weak bisimulation, provided that $\mathcal{R}_1$ and $\mathcal{R}_2$ are weak bisimulations. This property does not hold for PAs or IMCs.

Lemma 7. If $\mu_1 \approx \mu_2$ then for each $0 < k \le 1$ also $k\mu_1 \approx k\mu_2$.

The rest of this paper is devoted to the key properties of weak bisimulation. We start with a novel proof technique, which is crucial for almost all proofs. If we want to prove two distributions bisimilar, it suffices to show that they are contained in some bisimulation-up-to-splitting. The definition of bisimulation-up-to-splitting is identical to that of weak bisimulation, except that it weakens every condition of the form $\mu \,\mathcal{R}\, \gamma$ into a splitting condition $\mu \,\mathcal{R}^{\oplus}\, \gamma$, which holds if there exist $k \in \mathbb{N}$ and subdistributions $\mu_i$ and $\gamma_i$ for $i = 1, \dots, k$ such that

– $\mu = \bigoplus_{i=1,\dots,k} \mu_i$ and $\gamma = \bigoplus_{i=1,\dots,k} \gamma_i$, and
– $\forall i = 1, \dots, k : \mu_i \,\mathcal{R}\, \gamma_i$.
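The splitting condition can be made concrete with a small sketch. The Python fragment below is illustrative only: it represents a subdistribution as a dictionary from state names to exact fractions, and the helper names `oplus`, `mass`, `freeze` and `related_up_to_splitting`, as well as the sample data, are assumptions of this sketch rather than notation from the paper. It checks $\mu \,\mathcal{R}^{\oplus}\, \gamma$ for one given pair of candidate splittings; it does not search for a splitting.

```python
from fractions import Fraction as Fr

def oplus(*parts):
    """Pointwise sum of subdistributions (the operator ⊕)."""
    total = {}
    for part in parts:
        for state, prob in part.items():
            total[state] = total.get(state, Fr(0)) + prob
    return {s: p for s, p in total.items() if p != 0}

def mass(mu):
    """Total probability mass |mu| of a subdistribution."""
    return sum(mu.values(), Fr(0))

def freeze(mu):
    """Hashable form of a subdistribution, so that pairs fit into a set."""
    return frozenset((s, p) for s, p in mu.items() if p != 0)

def related_up_to_splitting(R, mu, gamma, mu_parts, gamma_parts):
    """Check mu R⊕ gamma for the given component lists (no search)."""
    if len(mu_parts) != len(gamma_parts):
        return False
    if freeze(oplus(*mu_parts)) != freeze(mu):
        return False  # mu_parts is not a splitting of mu
    if freeze(oplus(*gamma_parts)) != freeze(gamma):
        return False  # gamma_parts is not a splitting of gamma
    return all((freeze(m), freeze(g)) in R
               for m, g in zip(mu_parts, gamma_parts))

# Hypothetical data: the state names and the relation R are made up.
half, quarter = Fr(1, 2), Fr(1, 4)
mu = {"E": half, "s1": half}
gamma = {"E": half, "F": quarter, "G": quarter}
mu_parts = [{"E": half}, {"s1": half}]
gamma_parts = [{"E": half}, {"F": quarter, "G": quarter}]
R = {(freeze(m), freeze(g)) for m, g in zip(mu_parts, gamma_parts)}
print(related_up_to_splitting(R, mu, gamma, mu_parts, gamma_parts))
```

Exact fractions are used instead of floats so that the equality tests on distributions are not disturbed by rounding.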

Fig. 4. Fusing distributions in the presence of observable actions

Definition 12. $\mathcal{R}$ is a weak bisimulation-up-to-splitting, if whenever $\mu_1 \,\mathcal{R}\, \mu_2$ then for all $\alpha \in \mathit{Act}^\chi$: $|\mu_1| = |\mu_2|$ and

1. $\forall E \in \mathrm{Supp}(\mu_1) : \exists \mu_2^g, \mu_2^s : \mu_2 \Longrightarrow_C \mu_2^g \oplus \mu_2^s$ and
(a) $\llbracket (E, \mu_1(E)) \rrbracket \,\mathcal{R}^{\oplus}\, \mu_2^g$ and $(\mu_1 - E) \,\mathcal{R}^{\oplus}\, \mu_2^s$ and
(b) whenever $E \xrightarrow{\alpha} \mu_1'$ then $\mu_2^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu''$ and $(\mu_1(E) \cdot \mu_1') \,\mathcal{R}^{\oplus}\, \mu''$,
2. in addition, an according condition for $F \in \mathrm{Supp}(\mu_2)$ must hold.

Lemma 8. Whenever $\mathcal{R}$ is a bisimulation-up-to-splitting then $\mathcal{R} \subseteq {\approx}$.

Proof. Let the relation $\mathcal{S}$ consist of all pairs of subdistributions that can be split into smaller subdistributions in such a way that the components of the splitting are related by $\mathcal{R}$. Formally, $\mathcal{S}$ is the set of all pairs $(\bigoplus_{i=1,\dots,k} \mu_1^i, \bigoplus_{i=1,\dots,k} \mu_2^i)$ satisfying the condition $\mu_1^i \,\mathcal{R}\, \mu_2^i$ for $i \in \{1, \dots, k\}$ and $k \in \mathbb{N}$. We will show that $\mathcal{S}$ is a weak bisimulation. The lemma then follows from $\mathcal{R} \subseteq \mathcal{S}$. We skip completely symmetric cases without further mention.

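Consider an arbitrary pair $(\mu_1, \mu_2) \in \mathcal{S}$, with $\mu_1 = \bigoplus_{i=1,\dots,k} \mu_1^i$ and $\mu_2 = \bigoplus_{i=1,\dots,k} \mu_2^i$. The first condition is easy to check, since $\mu_1^i \,\mathcal{R}\, \mu_2^i$ implies $|\mu_1^i| = |\mu_2^i|$, and hence $|\mu_1| = |\mu_2|$ follows. We proceed by induction over the number $k$ of components of the splitting. Consider an arbitrary pair $(\bigoplus_{i=1,\dots,k} \mu_1^i, \bigoplus_{i=1,\dots,k} \mu_2^i) \in \mathcal{S}$ and assume $E \in \mathrm{Supp}(\bigoplus_{i=1,\dots,k} \mu_1^i)$. If $k = 1$ the proof follows from $\mu_1 \,\mathcal{R}\, \mu_2$ and our choice of $\mathcal{S}$.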

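Now assume $k > 1$ and that the claim holds for $k - 1$. We will use this by splitting $\mu_1$ and $\mu_2$ into two parts as follows:

$$\mu_1 = \underbrace{\Big(\bigoplus_{i=1,\dots,k-1} \mu_1^i\Big)}_{=:\ \gamma_1} \oplus\ \mu_1^k \quad\text{and}\quad \mu_2 = \underbrace{\Big(\bigoplus_{i=1,\dots,k-1} \mu_2^i\Big)}_{=:\ \gamma_2} \oplus\ \mu_2^k .$$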

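To establish Condition A of weak bisimulation, consider some $E \in \mathrm{Supp}(\mu_1)$. Then $\mu_1(E) = \gamma_1(E) + \mu_1^k(E)$. Let $l := \gamma_1(E)$ and $m := \mu_1^k(E) > 0$. Since $(\bigoplus_{i=1,\dots,k-1} \mu_1^i, \bigoplus_{i=1,\dots,k-1} \mu_2^i) \in \mathcal{S}$, we have that $\gamma_2 \Longrightarrow_C \gamma_2' = \gamma_2^g \oplus \gamma_2^s$ for two distributions $\gamma_2^g$ and $\gamma_2^s$, which satisfy $\llbracket (E, \gamma_1(E)) \rrbracket \,\mathcal{S}\, \gamma_2^g$ and $(\gamma_1 - E) \,\mathcal{S}\, \gamma_2^s$. Analogously, since $(\mu_1^k, \mu_2^k) \in \mathcal{S}$, by induction hypothesis, we have that $\mu_2^k \Longrightarrow_C {\mu_2^k}' = \mu_2^{kg} \oplus \mu_2^{ks}$ for two distributions $\mu_2^{kg}$ and $\mu_2^{ks}$ with $\llbracket (E, \mu_1^k(E)) \rrbracket \,\mathcal{S}\, \mu_2^{kg}$ and $(\mu_1^k - E) \,\mathcal{S}\, \mu_2^{ks}$. We conclude by Lemma 5 that $\gamma_2 \oplus \mu_2^k \Longrightarrow_C (\gamma_2^g \oplus \gamma_2^s) \oplus (\mu_2^{kg} \oplus \mu_2^{ks}) = (\mu_2^{kg} \oplus \gamma_2^g) \oplus (\mu_2^{ks} \oplus \gamma_2^s)$. To finish the proof of Condition A(i) of weak bisimulation, we need to show that (i) $\llbracket (E, (\gamma_1 \oplus \mu_1^k)(E)) \rrbracket \,\mathcal{S}\, \mu_2^{kg} \oplus \gamma_2^g$ and (ii) $(\mu_1^k - E) \oplus (\gamma_1 - E) \,\mathcal{S}\, \mu_2^{ks} \oplus \gamma_2^s$.

For (i) note that $\llbracket (E, (\gamma_1 \oplus \mu_1^k)(E)) \rrbracket = \llbracket (E, m + l) \rrbracket = \llbracket (E, m) \rrbracket \oplus \llbracket (E, l) \rrbracket$. Now, since $\llbracket (E, m) \rrbracket \,\mathcal{S}\, \mu_2^{kg}$ and $\llbracket (E, l) \rrbracket \,\mathcal{S}\, \gamma_2^g$, the pair $\llbracket (E, m) \rrbracket$ and $\mu_2^{kg}$ as well as the pair $\llbracket (E, l) \rrbracket$ and $\gamma_2^g$ can be split in such a way that the components of the splitting of $\llbracket (E, m) \rrbracket$ and $\mu_2^{kg}$ are pairwise related by $\mathcal{R}$, and likewise the components of the splitting of $\llbracket (E, l) \rrbracket$ and $\gamma_2^g$ are related by $\mathcal{R}$. This is enough to conclude $\llbracket (E, (\gamma_1 \oplus \mu_1^k)(E)) \rrbracket \,\mathcal{S}\, \mu_2^{kg} \oplus \gamma_2^g$.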

For (ii) we remark that $(\mu_1^k \oplus \gamma_1) - E = (\mu_1^k - E) \oplus (\gamma_1 - E)$. To establish $(\mu_1^k - E) \oplus (\gamma_1 - E) \,\mathcal{S}\, \mu_2^{ks} \oplus \gamma_2^s$ we can proceed in the same way as for (i) above.

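For Condition A(ii) of weak bisimulation suppose $E \xrightarrow{\alpha} \xi$ with $\alpha \in \mathit{Act}^\chi$. Then, by induction hypothesis, $\gamma_2^g \stackrel{\hat\alpha}{\Longrightarrow}_C \gamma''$ and $l\xi \,\mathcal{R}_\gamma\, \gamma''$. Similarly, $\mu_2^{kg} \stackrel{\hat\alpha}{\Longrightarrow}_C \mu''$ and $m\xi \,\mathcal{R}_\mu\, \mu''$. Hence by Lemma 5, $\mu_2^{kg} \oplus \gamma_2^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu'' \oplus \gamma''$. We observe that $((\mu_1^k \oplus \gamma_1)(E))\xi = m\xi \oplus l\xi$. Similarly as above, $m\xi \oplus l\xi \,\mathcal{S}\, \mu'' \oplus \gamma''$ follows. □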

This technique is helpful whenever bisimilar distributions are composed of bisimilar subdistributions. We hence immediately obtain that $\approx$ is compositional with respect to subdistribution composition. Formally, $\mu \approx \mu' \wedge \gamma \approx \gamma'$ implies $\mu \oplus \gamma \approx \mu' \oplus \gamma'$.

Lemma 9. If $\mu_1 \approx \mu_2$ and $\gamma_1 \approx \gamma_2$ then $(\mu_1 \oplus \gamma_1) \approx (\mu_2 \oplus \gamma_2)$.

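Proof. Let $\mathcal{R}_\mu$ and $\mathcal{R}_\gamma$ be bisimulations that satisfy $\mu_1 \,\mathcal{R}_\mu\, \mu_2$ and $\gamma_1 \,\mathcal{R}_\gamma\, \gamma_2$, respectively. We can show that the relation $\mathcal{S} \subseteq \mathit{Subdist} \times \mathit{Subdist}$ with

$$\mathcal{S} = \{(\mu, \mu') \mid \mu = \nu \oplus \xi \ \wedge\ \mu' = \nu' \oplus \xi' \ \wedge\ \nu \,(\mathcal{R}_\mu \cup \mathcal{R}_\gamma)\, \nu' \ \wedge\ \xi \,(\mathcal{R}_\mu \cup \mathcal{R}_\gamma)\, \xi'\}$$

is a bisimulation-up-to-splitting. Obviously $(\mu_1 \oplus \gamma_1, \mu_2 \oplus \gamma_2) \in \mathcal{S}$. The proof mainly follows using Lemma 5. □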

Theorem 1. ≈ is an equivalence relation.

Proof. Reflexivity is obvious. For every bisimulation $\mathcal{R}$, also $\mathcal{R}^{-1}$ is a bisimulation, and hence $\mathcal{R} \cup \mathcal{R}^{-1}$ is one as well. This proves symmetry. Transitivity is established by showing that $\approx \circ \approx$ is a bisimulation. This follows easily from the alternative characterisation of weak bisimulation presented in Lemma 10. □

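Lemma 10. Whenever $\mu_1 \approx \mu_2$ then for all $\alpha \in \mathit{Act}^\chi$: $|\mu_1| = |\mu_2|$ and

1. whenever $\mu_1 \Longrightarrow_C \mu_1^g \oplus \mu_1^s$ then
(a) $\mu_2 \Longrightarrow_C \mu_2^g \oplus \mu_2^s$ and $\mu_1^g \approx \mu_2^g$ and $\mu_1^s \approx \mu_2^s$
(b) and whenever $\mu_1^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu_1'$ then also $\mu_2^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu''$ and $\mu_1' \approx \mu''$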


Proof. We distinguish two cases, skipping completely symmetric cases. We start with the special case where $\mu_1$ is split immediately and does not perform initial internal transitions. Formally, $\mu_1 = \mu_1^g \oplus \mu_1^s$. Since $|S|$ is finite, we may represent $\mu_1^g$ as $\bigoplus_{i \in 1,\dots,n} \llbracket (E_i, k_i \cdot \mu_1(E_i)) \rrbracket$ for $n \in \mathbb{N}$, where $\forall i \in 1, \dots, n : k_i \cdot \mu_1(E_i) > 0$. We proceed by induction over $n$. The base case with $\mu_1^g = \mu_\emptyset$ is trivial. For the induction step, we proceed by redefining the splitting to a splitting $\mu_1'^{g} \oplus \mu_1'^{s}$, where $\mathrm{Supp}(\mu_1'^{g})$ only contains $E_n$. Formally, $\mu_1'^{g} = \llbracket (E_n, k_n \cdot \mu_1(E_n)) \rrbracket$ and $\mu_1'^{s} = \bigoplus_{i \in 1,\dots,n-1} \llbracket (E_i, k_i \cdot \mu_1(E_i)) \rrbracket \oplus \mu_1^s$. We then prove that for this splitting $\mu_2'^{g}$ and $\mu_2'^{s}$ exist with $\mu_2'^{g} \approx \mu_1'^{g}$ and $\mu_2'^{s} \approx \mu_1'^{s}$. We call this ($\star$). It follows almost immediately from $\mu_1 \approx \mu_2$. The only reason why we cannot use Condition A(i) of weak bisimulation directly is that there always implicitly $k_n = 1$. It is, however, an easy exercise to adjust the splitting obtained from this condition. Say we have obtained $\rho^g \oplus \rho^s$ as a splitting. Then $\rho^g \approx \mu_1(E_1)\Delta_{E_1}$ and $\rho^s \approx \mu_1 - E_1$. We adjust this by setting $\mu_2'^{g} = k_n \rho^g$ and $\mu_2'^{s} = (1 - k_n)\rho^g \oplus \rho^s$. It is easy to check that those subdistributions satisfy the necessary conditions.

The hard part, however, is to show that $\mu_2'^{g}$ can simulate $E_n \stackrel{\alpha}{\Longrightarrow}_C \mu_1'$, rather than only $E_n \xrightarrow{\alpha} \mu_1'$ (Condition 1(b)). This proof needs a structural induction over the transition tree justifying $E_n \stackrel{\alpha}{\Longrightarrow}_C \mu_1'$, if the tree is finite. For infinite trees, we can reduce the problem to finite trees with Lemma 3. We explain the details later. For a shorter notation we let $E = E_n$ in the following. Assume that $E \stackrel{\hat\alpha}{\Longrightarrow}_C \mu_1'$: Let this transition be justified by the indexed set $\{(c_i, \mu_i)\}_{i \in I}$ and let the respective inducing transition trees be $\{T_i\}_{i \in I}$.

We will first show how to reach for each of the $\mu_i$ a bisimilar distribution via a combined transition, and how we can combine these transitions in order to reach a distribution that is bisimilar to $\mu_1'$ itself.

Consider an arbitrary $\mu_i$ from the indexed set justifying $E \stackrel{\hat\alpha}{\Longrightarrow}_C \mu_1'$ and let $T = T_i$ be the transition tree inducing this transition.

The simple case is $E \Longrightarrow \Delta_E$. Since $\mu_2^g \Longrightarrow_C \mu_2^g$ and directly $\mu_2^g \approx k \cdot \mu_1(E) \cdot \Delta_E$, everything follows.

Now assume that $E \stackrel{\alpha}{\Longrightarrow} \mu_i$ for arbitrary $\mu_i$. Then let $T_{pre}$ be the maximal subtree of $T$ which is an internal subtree. This means intuitively that if we let all leaves of $T_{pre}$ take one more step, as given by $T$, then these steps are $\alpha$-transitions. Let $\mu_{pre}$ be the distribution induced by $T_{pre}$. We call the distribution resulting from performing all $\alpha$-transitions $\mu_{post}$. Formally, $\mu_{post}$ is the distribution induced by the smallest subtree of $T$ that represents a transition of the form $E \stackrel{\alpha}{\Longrightarrow}$.

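We then show that $\mu_2^g \stackrel{\alpha}{\Longrightarrow}_C \mu_i''$ such that $\mu_i \approx \mu_i''$ in three steps: (i) we show that $\mu_2^g \Longrightarrow_C \mu_2^{ig,pre}$ with $\mu_{pre} \approx \mu_2^{ig,pre}$, (ii) $\mu_2^{ig,pre} \stackrel{\alpha}{\Longrightarrow}_C \mu_2^{ig,post}$ with $\mu_{post} \approx \mu_2^{ig,post}$, and finally (iii) $\mu_2^{ig,post} \Longrightarrow_C \mu_i''$ with $\mu_i \approx \mu_i''$. By Lemma 4, we may then conclude that $\mu_2^g \stackrel{\alpha}{\Longrightarrow}_C \mu_i''$.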

We now combine these transitions in order to simulate distribution $\mu_1'$: combine the $\mu_i''$ with the weights $c_i$. More formally, we let $\mu_2^g \Longrightarrow_C \mu'' := \bigoplus_{i \in I} c_i \mu_i''$. Since $\mu_1(E)\mu_i \approx \mu_i''$, also $\mu_1(E) c_i \mu_i \approx c_i \mu_i''$. Hence by repeated


We now establish steps (i) to (iii): As mentioned before, we proceed by structural induction over each tree $T_i$ inducing a transition $E \Longrightarrow \mu_i$, which in turn are then combined to the combined transition of $E$ we actually consider. If $T_i$ is infinite for some $i$, by Lemma 3, it suffices to show that we can simulate the induced distribution $\mu_S$ of each finite subtree $S$ of $T_i$. Thus, it suffices to consider the finite case in the following.

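Let us hence prove by structural induction over internal transition trees that whenever $\Delta_E \approx \gamma$ and $E \Longrightarrow_C \mu$, then $\gamma \Longrightarrow_C \gamma'$ and $\mu \approx \gamma'$. Let $E \Longrightarrow \mu$ with a finite inducing transition tree $T$. Let us first note that $\gamma \Longrightarrow_C \gamma^g \oplus \gamma^s$ and $\gamma^g \approx E$ and hence $\gamma^s = \mu_\emptyset$. Furthermore, A(ii) holds true for $E$ and $\gamma^g$.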

Assume we have already established the hypothesis for all strict subtrees of the internal transition tree $T$. Since $T$ is finite, there is a subtree $S$ such that $T$ and $S$ are node-wise identical on all nodes of $S$, except for one leaf node $\sigma$ of $S$, where $S$ stops, but $T$ continues by one more transition. Formally, for this node $\sigma$ we know $E' \stackrel{\mathrm{def}}{=} \mathrm{Sta}_T(\sigma) = \mathrm{Sta}_S(\sigma)$ and $\mathrm{Prob}_T(\sigma) = \mathrm{Prob}_S(\sigma)$, but $\mathrm{Act}_T(\sigma) = \tau$, whereas $\mathrm{Act}_S(\sigma) = \varepsilon$. Furthermore, the nodes of $T$ that are not nodes of $S$ are exactly the children of $\sigma$ in $T$. They define a distribution $\mu_{E'}$ with $\mu_{E'}(G) = \frac{\mathrm{Prob}_T(\sigma_G)}{\mathrm{Prob}_T(\sigma)}$ if $\sigma_G$ exists, where $\sigma_G$ is the node in $\mathrm{Children}_T(\sigma)$ with $\mathrm{Sta}_T(\sigma_G) = G$, and $\mu_{E'}(G) = 0$ otherwise. Furthermore, $E' \xrightarrow{\tau} \mu_{E'}$.

Let $\mu_S$ be the distribution induced by $S$. Let $\mu_S^s$ be the subdistribution we obtain from $\mu_S$ by removing exactly the fraction that is caused by $\sigma$. Then $(\mathrm{Prob}_T(\sigma))\Delta_{E'} \oplus \mu_S^s$ is a splitting of $\mu_S$. And hence $\mu = \mu_S^s \oplus (\mathrm{Prob}_T(\sigma))\mu_{E'}$.

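Now by induction, $\gamma^g \Longrightarrow_C \gamma'$ for some $\gamma'$ such that $\mu_S \approx \gamma'$. Using ($\star$), there also exist $\gamma'_{E'}$ and $\gamma'_S$ such that $\gamma' \Longrightarrow_C \gamma'_{E'} \oplus \gamma'_S$ with $\gamma'_{E'} \approx (\mathrm{Prob}_T(\sigma))\Delta_{E'}$ and $\gamma'_S \approx \mu_S^s$.

Since $E' \xrightarrow{\tau} \mu_{E'}$, also $\gamma'_{E'} \Longrightarrow_C \gamma'' \oplus \mu_\emptyset$ and $\gamma'' \Longrightarrow_C \gamma'''$ with $\gamma''' \approx (\mathrm{Prob}_T(\sigma))\mu_{E'}$.

Hence by Lemma 5 and Lemma 4, $\gamma'_{E'} \oplus \gamma'_S \Longrightarrow_C \gamma''' \oplus \gamma'_S$. Since $\gamma''' \approx (\mathrm{Prob}_T(\sigma))\mu_{E'}$ and $\gamma'_S \approx \mu_S^s$ and $\mu = \mu_S^s \oplus (\mathrm{Prob}_T(\sigma))\mu_{E'}$, by Lemma 9, $\gamma''' \oplus \gamma'_S \approx \mu$. And hence finally, again by Lemma 4, since $\gamma' \Longrightarrow_C \gamma'_{E'} \oplus \gamma'_S$ and $\gamma^g \Longrightarrow_C \gamma'$ and $\gamma \Longrightarrow_C \gamma^g$, we get $\gamma \Longrightarrow_C \gamma''' \oplus \gamma'_S$, which completes the induction.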

We next consider step (ii). We want to show that there exists $\mu_2^{ig,post}$ such that $\mu_2^{ig,pre} \stackrel{\alpha}{\Longrightarrow}_C \mu_2^{ig,post}$ and $\mu_{post} \approx \mu_2^{ig,post}$.

By step (i) we know that $\mu_2^{ig,pre} \approx \mu_{pre}$. Tree $T_{post}$ can be obtained from tree $T_{pre}$ by expanding all leaves of $T_{pre}$ by one $\alpha$-transition, completely analogously to what we did in the last paragraph with one leaf.

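Using similar ideas as above, we first note that

$$\mu_{pre} = \bigoplus_{\sigma \in \mathrm{Leaf}(T_{pre})} \llbracket (\mathrm{Sta}_{T_{pre}}(\sigma), \mathrm{Prob}_{T_{pre}}(\sigma)) \rrbracket .$$

Then, using ($\star$) as above for each leaf, there exists for each leaf $\sigma$ a distribution $\gamma_\sigma$ such that $\mu_2^{ig,pre} \Longrightarrow_C \bigoplus_{\sigma \in \mathrm{Leaf}(T_{pre})} \gamma_\sigma$ and $\forall \sigma : \gamma_\sigma \approx \llbracket (\mathrm{Sta}_{T_{pre}}(\sigma), \mathrm{Prob}_{T_{pre}}(\sigma)) \rrbracket$. Then we can continue as above.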
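How a finite transition tree induces a distribution, and how extending one leaf by one transition changes it, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Node` class, the helpers `leaves`, `induced` and `extend_leaf`, and the sample tree are hypothetical; only the fields `sta` and `prob` mirror the functions Sta and Prob used in the proof.

```python
from dataclasses import dataclass, field
from fractions import Fraction as Fr

@dataclass
class Node:
    sta: str    # Sta_T(sigma): the state labelling this node
    prob: Fr    # Prob_T(sigma): probability of reaching this node
    children: list = field(default_factory=list)

def leaves(node):
    """Yield all leaf nodes of the tree rooted at `node`."""
    if not node.children:
        yield node
    else:
        for child in node.children:
            yield from leaves(child)

def induced(tree):
    """Distribution induced by a finite tree: sum of leaf probabilities per state."""
    mu = {}
    for leaf in leaves(tree):
        mu[leaf.sta] = mu.get(leaf.sta, Fr(0)) + leaf.prob
    return mu

def extend_leaf(leaf, mu_e):
    """Continue the tree at `leaf` by one transition to mu_e.

    Each child sigma_G gets probability Prob(leaf) * mu_e(G), so that
    mu_e(G) = Prob(sigma_G) / Prob(leaf), as in the proof.
    """
    leaf.children = [Node(g, leaf.prob * p) for g, p in mu_e.items()]

# Hypothetical tree: s moves to s1 or E, then s1 moves to F or G.
root = Node("s", Fr(1))
extend_leaf(root, {"s1": Fr(1, 2), "E": Fr(1, 2)})
before = induced(root)
extend_leaf(root.children[0], {"F": Fr(1, 2), "G": Fr(1, 2)})
after = induced(root)
print(before, after)
```

Here `before` plays the role of $\mu_S$ and `after` the role of $\mu$: the fraction of mass sitting on the extended leaf is replaced by that fraction times the target distribution of the added transition.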


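The general case, where $\mu_1$ does perform initial internal transitions to some distribution $\mu_1'$, formally, $\mu_1 \stackrel{\tau}{\Longrightarrow}_C \mu_1' = \mu_1^g \oplus \mu_1^s$ with $\mu_1' \neq \mu_1$, mainly reuses the special case above, setting there $\mu_1^g$ to $\mu_1$ and $\mu_1^s$ to $\mu_\emptyset$. From there, we obtain in a first step $\mu_2 \Longrightarrow_C \mu_2^g \oplus \mu_2^s$ and $\mu_1 \approx \mu_2^g$ and $\mu_\emptyset \approx \mu_2^s$.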

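We will now show that $\mu_2 \Longrightarrow_C \mu''$ and $\mu_1' \approx \mu''$. We also obtain this from the special case as follows: whenever $\mu_1 \stackrel{\hat\alpha}{\Longrightarrow}_C \mu_1'$ for some $\mu_1'$ then also $\mu_2^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu''$ and $\mu_1' \approx \mu''$. Hence by transitivity, $\mu_2 \Longrightarrow_C \mu_2^g \oplus \mu_2^s = \mu_2^g \Longrightarrow_C \mu''$.

As $\mu_1'$ is directly split now without any initial internal transitions, we can directly apply the special case for $\mu_1'$ and $\mu''$, which we have obtained in the last paragraph. This finishes our proof since then

1. $\mu'' \Longrightarrow_C \mu''^{g} \oplus \mu''^{s}$ and $\mu_1^g \approx \mu''^{g}$ and $\mu''^{s} \approx \mu_2^s$,
2. and whenever $\mu_1^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu_1'$ for some $\mu_1'$, then also $\mu''^{g} \stackrel{\hat\alpha}{\Longrightarrow}_C \mu'''$ and $\mu_1' \approx \mu'''$. □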

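Lemma 11. Let $(\mu_1, \dots, \mu_k)$ be a splitting of $\mu$. If $\mu \approx \gamma$ then there exist $\gamma_1, \dots, \gamma_k$ such that $\gamma \Longrightarrow_C \gamma_1 \oplus \dots \oplus \gamma_k$ and $\mu_i \approx \gamma_i$ for $i = 1, \dots, k$.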

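Proof. We proceed by induction over $k$. The case $k = 1$ follows immediately from the definition of weak bisimulation. Consider the splitting $(\mu_1, \dots, \mu_{k-1}, \mu_k)$ for $k > 1$. By induction hypothesis, the lemma holds for the splitting $(\mu_1, \dots, \mu_{k-1} \oplus \mu_k)$. Hence there exist $\gamma_1, \dots, \gamma_{k-1}$ such that $\gamma \Longrightarrow_C \gamma_1 \oplus \dots \oplus \gamma_{k-1}$ and $\mu_i \approx \gamma_i$ for $i = 1, \dots, k-2$ and $(\mu_{k-1} \oplus \mu_k) \approx \gamma_{k-1}$. Since $(\mu_{k-1} \oplus \mu_k) \approx \gamma_{k-1}$, we can use Lemma 10 and, setting there $\mu_1^g = \mu_{k-1}$ and $\mu_1^s = \mu_k$, we get that $\gamma_{k-1} \Longrightarrow_C \gamma_{k-1}^g \oplus \gamma_{k-1}^s$ and $\mu_{k-1} \approx \gamma_{k-1}^g$ as well as $\mu_k \approx \gamma_{k-1}^s$.

Since $\gamma_{k-1} \Longrightarrow_C \gamma_{k-1}^g \oplus \gamma_{k-1}^s$, also $\gamma_1 \oplus \dots \oplus \gamma_{k-2} \oplus \gamma_{k-1} \Longrightarrow_C \gamma_1 \oplus \dots \oplus \gamma_{k-2} \oplus \gamma_{k-1}^g \oplus \gamma_{k-1}^s$ by Lemma 4. And hence, again by Lemma 4, also $\gamma \Longrightarrow_C \gamma_1 \oplus \dots \oplus \gamma_{k-2} \oplus \gamma_{k-1}^g \oplus \gamma_{k-1}^s$, which completes the proof. □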

It is important to note again that for weak bisimulation it is vital that in general $\mu \approx \gamma$ does not imply $\forall C \in S/{\approx_\Delta} : \mu(C) = \gamma(C)$. In Example 1 we used $0.5\Delta_{s'} \,\mathcal{R}\, \llbracket (F, 0.25), (G, 0.25) \rrbracket$ in order to prove $s \,\mathcal{R}\, v$. However, neither $s' \approx_\Delta F$ nor $s' \approx_\Delta G$ holds. Although retaining exactly this condition is impossible, $\approx$ and $\approx_\Delta$ satisfy a weaker form, which can be phrased as an alternative characterisation of $\approx$ and $\approx_\Delta$.

Theorem 2. Let MA = (S, Act, , , so) be an MA,E, F ∈ S, and µ, γ ∈

Dist(S). Then

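1. $E \approx_\Delta F$ if and only if: whenever $E \xrightarrow{\alpha} \mu$ then for some $\gamma$: $F \stackrel{\alpha}{\Longrightarrow}_C \gamma$ and $\mu \approx \gamma$,
2. $\mu \approx \gamma$ if and only if: $\exists \mu' : \mu \Longrightarrow \mu'$ and $\exists \gamma' : \gamma \Longrightarrow \gamma'$ and $\mu \approx \mu'$ and $\gamma' \approx \gamma$ and $\forall C \in S/{\approx_\Delta} : \mu'(C) = \gamma'(C)$.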
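The class-wise condition in the second clause can be illustrated with a short sketch. It is a toy check under loud assumptions: the partition standing in for $S/{\approx_\Delta}$, the state names, and the internal step from `s1` to `F` and `G` with probability 1/2 each are all hypothetical, chosen only to resemble the situation recalled from Example 1. The helper names `class_mass` and `agree_on_classes` are likewise not from the paper.

```python
from fractions import Fraction as Fr

def class_mass(mu, partition):
    """Return mu(C) for every class C of the partition (a list of sets)."""
    return [sum((p for s, p in mu.items() if s in C), Fr(0))
            for C in partition]

def agree_on_classes(mu, gamma, partition):
    """Check mu(C) == gamma(C) for all classes C of the partition."""
    return class_mass(mu, partition) == class_mass(gamma, partition)

# Hypothetical equivalence classes: every state is in its own class.
partition = [{"s1"}, {"E"}, {"F"}, {"G"}]

mu = {"s1": Fr(1, 2)}                    # 0.5 * Dirac(s1)
gamma = {"F": Fr(1, 4), "G": Fr(1, 4)}   # [(F, 0.25), (G, 0.25)]

# Before fusing, the two subdistributions disagree on the classes.
print(agree_on_classes(mu, gamma, partition))

# Assumed internal step: s1 moves to F and G with probability 1/2 each.
step = {"F": Fr(1, 2), "G": Fr(1, 2)}
mu_prime = {g: mu["s1"] * p for g, p in step.items()}

# After the internal move, the class-wise masses coincide.
print(agree_on_classes(mu_prime, gamma, partition))
```

The first check fails and the second succeeds, which matches the reading of the theorem: the class-wise equality need not hold for $\mu$ and $\gamma$ themselves, only for suitable internal successors $\mu'$ and $\gamma'$.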

The above characterisation implies directly that $\approx_\Delta$ is a generalisation of the weak bisimulation we have defined in Definition 10: $\approx_\Delta$ demands that whenever $E \xrightarrow{\alpha} \xi$ then $F \stackrel{\alpha}{\Longrightarrow}_C \nu$ for some $\nu$, and $\xi \approx \nu$. In general, the latter will not imply $\forall C \in S/{\approx_\Delta} : \mu(C) = \gamma(C)$ directly. But instead, by the second clause of the theorem, these two distributions can fuse distributions to reach distributions $\mu'$ and $\gamma'$ that satisfy


Furthermore, the distributions $\gamma'$ and $\mu'$ from the second clause are as refined as possible with respect to containment of different equivalence classes with respect to $\approx_\Delta$. Phrased differently, for every state $s$ in the support of $\mu'$ or $\gamma'$, every internal transition of $s$ either reaches a distribution where every state in the support is equivalent to $s$ itself, and can thus not refine the distribution any further, or further refinement by this internal transition of the distributions $\mu'$ or $\gamma'$, respectively, would immediately yield a distribution that is not in the same equivalence class. This observation is also the key observation needed to prove Theorem 2, and is stated formally in Lemma 16.

For the proof of this lemma we need Lemma 14 and Lemma 15. The latter is the dual of the congruence property of $\approx$ with respect to subdistribution composition. It states that two distributions that are identical, except for a certain fraction, must be different with respect to $\approx$ as soon as this fraction is different. This holds independently of how small the fraction actually is. Additionally, Lemma 12 and Lemma 13 deal with convergent infinite sequences of distributions.

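Let us use infinite subsets of the natural numbers $I \subseteq \mathbb{N}$ and $J \subseteq \mathbb{N}$ as index sets for sequences of (sub)distributions and real numbers. We denote a sequence of elements $a_i$ for $i \in I$ with $\langle a_i \rangle_{i \in I}$. If the sequence $\langle a_i \rangle_{i \in I}$ has a limit $r$, we write $\langle a_i \rangle_{i \in I} \longrightarrow r$. We omit the index if it is clear from the context. Note that (sub)distributions over states can be considered elements of $\mathbb{R}^{|S|}$, since we assumed $S$ to be finite. Hence the usual results for the convergence of infinite sequences can be applied.

Lemma 12. Let $I$ be an index set and $i \in I$. If $\langle \Delta(\mu_i, \gamma_i) \rangle \longrightarrow 0$ and $\mu_i \Longrightarrow_C \mu_i^g \oplus \mu_i^s$ and $\mu_i^g \stackrel{\alpha}{\Longrightarrow}_C \mu_i'$, then $\exists \gamma_i^g, \gamma_i^s, \gamma_i'$ for every $i \in I$, such that $\gamma_i \Longrightarrow_C \gamma_i^g \oplus \gamma_i^s$ and $\gamma_i^g \stackrel{\alpha}{\Longrightarrow}_C \gamma_i'$ and $\langle \Delta(\mu_i^g, \gamma_i^g) \rangle \longrightarrow 0$, $\langle \Delta(\mu_i^s, \gamma_i^s) \rangle \longrightarrow 0$ and $\langle \Delta(\mu_i', \gamma_i') \rangle \longrightarrow 0$.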

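Proof. For every $i \in I$, the transition $\gamma_i \Longrightarrow_C \gamma_i'' = \gamma_i^g \oplus \gamma_i^s$ is constructed by reusing for every $E \in \mathrm{Supp}(\mu_i) \cap \mathrm{Supp}(\gamma_i)$ the transition of $E$ that was used to justify $\mu_i \Longrightarrow_C \mu_i^g \oplus \mu_i^s$, and by using for every $F \in \mathrm{Supp}(\gamma_i) \setminus \mathrm{Supp}(\mu_i)$ the trivial transition $F \Longrightarrow_C \Delta_F$.

Then for the splitting of $\gamma_i''$ into $\gamma_i^g$ and $\gamma_i^s$, we choose $\gamma_i^g$ such that $\gamma_i^g(E) = \min\{\gamma_i''(E), \mu_i^g(E)\}$ for every $E \in S$. This ensures that each state in $\mathrm{Supp}(\gamma_i^g)$ can in fact perform transitions of the form $\stackrel{\alpha}{\Longrightarrow}_C$. The transition $\gamma_i^g \stackrel{\alpha}{\Longrightarrow}_C$ is then chosen as above.

The claim then follows from the fact that when $i \longrightarrow \infty$ then for each $E \in \mathrm{Supp}(\mu_i) \cap \mathrm{Supp}(\gamma_i)$, $|\mu_i(E) - \gamma_i(E)| \longrightarrow 0$. □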

In the following lemma we show that two distributions are weakly bisimilar if they differ only in an infinitesimally small fraction.

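Lemma 13. The relation

$$\mathcal{R} := \{(\gamma, \mu) \mid \exists I : \exists \langle \gamma_i \rangle_{i \in I}, \langle \mu_i \rangle_{i \in I} : \langle \gamma_i \rangle_{i \in I} \longrightarrow \gamma \ \wedge\ \langle \mu_i \rangle_{i \in I} \longrightarrow \mu \ \wedge\ \forall i \in I : \mu_i \approx \gamma_i\}$$

is a bisimulation.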


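Proof. Consider $(\gamma, \mu) \in \mathcal{R}$ and assume $\gamma \Longrightarrow_C \gamma^g \oplus \gamma^s$ and $\gamma^g \stackrel{\alpha}{\Longrightarrow}_C \gamma'$. Since $(\gamma, \mu) \in \mathcal{R}$ there are infinite sequences $\langle \gamma_i \rangle_{i \in I}$ and $\langle \mu_i \rangle_{i \in I}$ as in the definition of $\mathcal{R}$. We first show that the behaviour of $\langle \gamma_i \rangle$ converges to the behaviour of $\gamma$. Then, since $\gamma_i \approx \mu_i$ for every $i \in I$, $\mu_i$ can simulate this behaviour, and the behaviour of $\mu$ is the limit of this behaviour. As we will see, this suffices to establish the condition of bisimulation. As usual, we skip symmetric cases.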

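Consider the trivial infinite sequence $\gamma, \gamma, \dots$. By Lemma 12, for each $i \in I$ we can find $\gamma_i^g$, $\gamma_i^s$ and $\gamma_i'$ such that $\gamma_i \Longrightarrow_C \gamma_i^g \oplus \gamma_i^s$ and $\gamma_i^g \stackrel{\hat\alpha}{\Longrightarrow}_C \gamma_i'$ and $\langle \gamma_i^g \rangle_{i \in I} \longrightarrow \gamma^g$ and $\langle \gamma_i^s \rangle_{i \in I} \longrightarrow \gamma^s$ and also $\langle \gamma_i' \rangle_{i \in I} \longrightarrow \gamma'$.

Since $\mu_i \approx \gamma_i$ for each $i \in I$, $\mu_i \Longrightarrow_C \mu_i^g \oplus \mu_i^s$ and $\mu_i^g \stackrel{\hat\alpha}{\Longrightarrow}_C \mu_i'$ and $\mu_i^g \approx \gamma_i^g$ and $\mu_i^s \approx \gamma_i^s$ and $\mu_i' \approx \gamma_i'$.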

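By the theorem of Bolzano-Weierstraß, there exists an infinite subsequence $J_1$ of the $\mu_i^g$ such that $\langle \mu_j^g \rangle_{j \in J_1} \longrightarrow \mu^{*g}$ for some $\mu^{*g}$. Repeating the argument for $\langle \mu_j^s \rangle_{j \in J_1}$, there must exist again an infinite subsequence $J_2$ such that $\langle \mu_j^s \rangle_{j \in J_2} \longrightarrow \mu^{*s}$ for some $\mu^{*s}$. And repeating it once more, for $\langle \mu_j' \rangle_{j \in J_2}$, there must exist an infinite subsequence $J_3$ such that $\langle \mu_j' \rangle_{j \in J_3} \longrightarrow \mu^{*\prime}$ for some $\mu^{*\prime}$.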

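Now for $\mu$, with the trivial infinite sequence $\mu, \mu, \dots$, by Lemma 12, for each $i \in J_3$, we see that there exist $\mu^{ig}$, $\mu^{is}$ and $\mu^{i\prime}$ such that $\mu \Longrightarrow_C \mu^{ig} \oplus \mu^{is}$ and $\mu^{ig} \stackrel{\hat\alpha}{\Longrightarrow}_C \mu^{i\prime}$ and $\Delta(\mu^{ig}, \mu_i^g) \longrightarrow 0$, $\Delta(\mu^{is}, \mu_i^s) \longrightarrow 0$ and $\Delta(\mu^{i\prime}, \mu_i') \longrightarrow 0$. But this means that also $\langle \mu^{ig} \rangle_{i \in J_3} \longrightarrow \mu^{*g}$ and $\langle \mu^{is} \rangle_{i \in J_3} \longrightarrow \mu^{*s}$ and $\langle \mu^{i\prime} \rangle_{i \in J_3} \longrightarrow \mu^{*\prime}$, and hence by Lemma 2, $\mu \Longrightarrow_C \mu^{*g} \oplus \mu^{*s}$. Since $\langle \Delta(\mu^{*g}, \mu^{ig}) \rangle_{i \in J_3} \longrightarrow 0$ and $\mu^{ig} \stackrel{\hat\alpha}{\Longrightarrow}_C \mu^{i\prime}$ and $\langle \mu^{i\prime} \rangle \longrightarrow \mu^{*\prime}$, there exists a sequence of $\mu^{*}_{ig}$ such that $\mu^{*g} \Longrightarrow_C \mu^{*}_{ig}$ by Lemma 12 and $\langle \mu^{*}_{ig} \rangle \longrightarrow \mu^{*\prime}$. Hence also $\mu^{*g} \stackrel{\hat\alpha}{\Longrightarrow}_C \mu^{*\prime}$ by Lemma 2.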

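Now we check whether also $(\gamma^g, \mu^{*g}) \in \mathcal{R}$. This holds since $\langle \gamma_i^g \rangle_{i \in J_3} \longrightarrow \gamma^g$ and $\langle \mu_i^g \rangle_{i \in J_3} \longrightarrow \mu^{*g}$ and finally, $\gamma_i^g \approx \mu_i^g$ for each $i \in J_3$. We can check the remaining cases completely analogously. □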

Lemma 14. Let $E \stackrel{\tau}{\Longrightarrow} \mu$ and $\Delta_E \not\approx \mu$. Then for every $0 \le k < 1$: $E \not\approx k\Delta_E \oplus (1 - k)\mu$.

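Proof. Assume that, on the contrary, for a fixed $k$ in fact $E \approx k\Delta_E \oplus (1 - k)\mu$ holds. Since $E \not\approx \mu$, but $E \stackrel{\tau}{\Longrightarrow} \mu$, there must either be a transition $E \xrightarrow{\alpha} \gamma$ such that whenever $\mu \stackrel{\hat\alpha}{\Longrightarrow}_C \gamma'$ then $\gamma' \not\approx \gamma$, or, if $\alpha \neq \tau$, then possibly directly $\mu \stackrel{\hat\alpha}{\not\Longrightarrow}_C$. We will refer to this fact by ($\star$). The second case immediately implies $k\Delta_E \oplus (1 - k)\mu \stackrel{\hat\alpha}{\not\Longrightarrow}_C$, which is a contradiction to our assumption. In the first case, since whenever $\mu \stackrel{\hat\alpha}{\Longrightarrow}_C \gamma'$ then $\gamma' \not\approx \gamma$, but still, by our assumption, $E \approx k\Delta_E \oplus (1 - k)\mu$, there must exist some $\xi$ and some $\gamma'$ such that $E \stackrel{\hat\alpha}{\Longrightarrow}_C \xi$ and $\mu \stackrel{\hat\alpha}{\Longrightarrow}_C \gamma'$ and $k\xi \oplus (1 - k)\gamma' \approx \gamma$ holds. We will refer to this fact by ($\star_1$).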

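Again, it cannot be the case that $\mu$ can simulate the transition $E \stackrel{\alpha}{\Longrightarrow}_C \xi$. To see this, assume the contrary. Then, for some $\xi'$: $\mu \stackrel{\hat\alpha}{\Longrightarrow}_C \xi'$ and $\xi \approx \xi'$. But then also $\mu \Longrightarrow_C k\xi' \oplus (1 - k)\gamma' \approx k\xi \oplus (1 - k)\gamma' \approx \gamma$. A contradiction to ($\star$)!

Nevertheless, by our assumption that $E \approx k\Delta_E \oplus (1 - k)\mu$, we must be able to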


a $\nu$ such that $E \stackrel{\hat\alpha}{\Longrightarrow}_C \nu$ and $k\nu \oplus (1 - k)\pi \approx \xi$. We will refer to this fact by ($\star_2$). Using this transition, we will show that this transition of $E$ can also be simulated by $k^2\Delta_E \oplus (1 - k^2)\mu$. Then by repeating our argumentation so far for $k^2$, we see that our assumption must also hold for $(k^2)^2$. As we made no assumption on $\alpha$ or the transition, and since $E \Longrightarrow \mu$ by the premise of the lemma, this implies indeed $E \approx k^2\Delta_E \oplus (1 - k^2)\mu$. It is well-known that the countably infinite sequence $\langle k_i \rangle_{i \in \mathbb{N}}$ with $k_i = k^{2^i}$ has limit $0$. But this implies immediately that $\langle k_i\Delta_E \oplus (1 - k_i)\mu \rangle_{i \in \mathbb{N}} \longrightarrow \mu$. By Lemma 13, this implies that $E \approx \mu$. A contradiction!

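We will now show that indeed $k^2\Delta_E \oplus (1 - k^2)\mu$ can simulate the transition $E \stackrel{\alpha}{\Longrightarrow} \gamma$. With straightforward transformations we obtain

$$k^2\Delta_E \oplus (1 - k^2)\mu = k^2\Delta_E \oplus (1 - k)(k + 1)\mu = k^2\Delta_E \oplus k(1 - k)\mu \oplus (1 - k)\mu = k(k\Delta_E \oplus (1 - k)\mu) \oplus (1 - k)\mu .$$

From ($\star_2$), we know that this distribution can perform a weak transition to $k(k\nu \oplus (1 - k)\pi) \oplus (1 - k)\mu$, which is equivalent to $k\xi \oplus (1 - k)\mu$ with respect to $\approx$. Now we obtain the distribution from ($\star_1$) by the transition $\mu \stackrel{\alpha}{\Longrightarrow} \gamma'$, which together yields $k\xi \oplus (1 - k)\gamma'$. As we have argued, this is equivalent to $\gamma$ with respect to $\approx$. From here it is then clear that $k^2\Delta_E \oplus (1 - k^2)\mu \stackrel{\alpha}{\Longrightarrow} k^2\Delta_E \oplus k(1 - k)\pi \oplus (1 - k)\gamma' \approx \gamma$. □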
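Two purely arithmetic ingredients of this proof can be checked numerically: the regrouping of $k^2\Delta_E \oplus (1 - k^2)\mu$ and the fact that the weights $k_i = k^{2^i}$ tend to $0$. The sketch below does only that, with exact fractions; it says nothing about bisimilarity itself. The helper names `scale`, `oplus` and `mix`, the value $k = 1/2$ and the distribution `mu` are assumptions chosen for illustration.

```python
from fractions import Fraction as Fr

def scale(k, mu):
    """The subdistribution k * mu."""
    return {s: k * p for s, p in mu.items()}

def oplus(*parts):
    """Pointwise sum of subdistributions (the operator ⊕)."""
    total = {}
    for part in parts:
        for state, prob in part.items():
            total[state] = total.get(state, Fr(0)) + prob
    return total

def mix(k, state, mu):
    """The distribution k * Dirac(state) ⊕ (1 - k) * mu."""
    return oplus({state: k}, scale(1 - k, mu))

k = Fr(1, 2)                           # any 0 < k < 1 works here
mu = {"F": Fr(1, 2), "G": Fr(1, 2)}    # hypothetical target of E

# Regrouping used in the proof:
# k^2 D_E ⊕ (1 - k^2) mu  ==  k (k D_E ⊕ (1 - k) mu) ⊕ (1 - k) mu
lhs = mix(k * k, "E", mu)
rhs = oplus(scale(k, mix(k, "E", mu)), scale(1 - k, mu))
print(lhs == rhs)

# Repeated squaring: k_i = k^(2^i) decreases towards 0, so the mass
# that mix(k_i, "E", mu) leaves on E vanishes in the limit.
ks = [k ** (2 ** i) for i in range(6)]
print([float(x) for x in ks])
```

Since each step replaces $k$ by $k^2$, the weight on $\Delta_E$ shrinks doubly exponentially, which is why the sequence of mixtures converges to $\mu$.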

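Corollary 1. Let $\mu_1 \approx \mu_2 \approx \Delta_E$ and $\mu \not\approx \Delta_E$, but $E \xrightarrow{\tau} \mu$. Then $\forall\, 0 \le k < 1 : \mu_1 \not\approx k\mu_2 \oplus (1 - k)\mu$.

Proof. Assume the contrary. Then for some $0 \le k < 1$: $\Delta_E \approx \mu_1 \approx k\mu_2 \oplus (1 - k)\mu \approx k\Delta_E \oplus (1 - k)\mu$. A contradiction to Lemma 14. □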

Lemma 15. If $E \xrightarrow{\tau} \mu$ and $\Delta_E \not\approx \mu$, then also for every subdistribution $\gamma$ and $0 < k \le 1$:

$$k\Delta_E \oplus (1 - k)\gamma \not\approx k\mu \oplus (1 - k)\gamma .$$

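Proof. For a concise notation, let $\xi = k\Delta_E$ and $\gamma' = (1 - k)\gamma$ and $\mu' = \mu$ for some $k$. To arrive at a contradiction, assume $\xi \oplus \gamma' \approx \mu' \oplus \gamma'$. Then $\mu' \oplus \gamma' \Longrightarrow_C \xi \oplus \gamma'$ must hold. Say $\mu' \Longrightarrow_C \mu''$ and $\gamma' \Longrightarrow_C \gamma''$ is the respective contribution of $\mu'$ and $\gamma'$ to this distribution. Since by the premise of our lemma, $\mu' \Longrightarrow_C \xi$ is impossible, it cannot be the case that $\mu'' = \xi$. Hence both $\mu''$ and $\gamma''$ need to contribute to $\xi$ (possibly with $\mu''$ contributing only $\mu_\emptyset$). Expressed differently, there must exist a splitting $\mu_1^s \oplus \mu_1^c = \mu''$ and a splitting $\gamma_1^s \oplus \gamma_1^c = \gamma''$ such that $\xi \approx \mu_1^s \oplus \gamma_1^c$ (possibly with $\mu_1^s = \mu_\emptyset$). As a consequence, $\gamma' \approx \mu_1^c \oplus \gamma_1^s$ must hold. Now we start anew with our considerations, replacing $\gamma'$ by $\mu_1^c \oplus \gamma_1^s$, which Lemma 9 allows. Thus $\mu_1^c \oplus \gamma_1^s$ must be able to simulate the transition of $\gamma'$ that we have considered before. Formally, there must exist $\gamma_1^{s\prime} \oplus \gamma_1^{c\prime}$ such that $\mu_1^c \oplus \gamma_1^s \Longrightarrow_C \gamma_1^{s\prime} \oplus \gamma_1^{c\prime}$ and $\xi \approx \mu_1^s \oplus \gamma_1^{c\prime}$ and $\gamma' \approx \mu_1^c \oplus \gamma_1^{s\prime}$. This allows us now to see how much $\mu_1^c$ could have contributed to $\xi$, by contributing to $\gamma_1^{c\prime}$. Analogously to what we have seen before, for some $\mu_2^s, \mu_2^c, \gamma_2^s, \gamma_2^c$ it must hold that $\mu_1^c \Longrightarrow_C \mu_2^c \oplus \mu_2^s$ and $\gamma_1^{s\prime} \Longrightarrow_C \gamma_2^c \oplus \gamma_2^s$, with $\gamma_1^{c\prime} = \mu_2^s \oplus \gamma_2^c$ and $\gamma_1^{s\prime} = \mu_2^c \oplus \gamma_2^s$. Using Lemma 9, we get $\xi \approx \mu_1^s \oplus \mu_2^s \oplus \gamma_2^c$. It cannot be the case that actually $\gamma_2^c = \mu_\emptyset$, which means that $\gamma_1^{s\prime}$ has not contributed any further to $\xi$, since then $\mu_2^s \approx \gamma_1^{c\prime}$, which implies $\mu' \Longrightarrow_C \mu_1^s \oplus \mu_1^c \Longrightarrow_C \mu_1^s \oplus \mu_2^s$ and $\xi \approx \mu_1^s \oplus \mu_2^s$. A contradiction to the premise of the lemma ($\star$).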

Since $\gamma_1^{s\prime} = \mu_2^c \oplus \gamma_2^s$ we can repeat our arguments for $\gamma_1^{s\prime}$, as we have done for $\gamma'$ before, and see how much $\mu^c$


this results in some subdistributions $\mu_3^s$ and $\gamma_3^c$, which yield a refined representation of $\xi$, namely $\xi \approx \mu_1^s \oplus \mu_2^s \oplus \mu_3^s \oplus \gamma_3^c$, completely analogous to the above.

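Repeating this argument, we obtain a sequence of distributions $\mu_i^s$ and distributions $\gamma_i^c$, such that $\xi \approx \mu_1^s \oplus \dots \oplus \mu_i^s \oplus \gamma_i^c$. As we have argued before ($\star$), it can never be the case that $\gamma_i^c = \mu_\emptyset$. Note that $\langle |\mu_1^s \oplus \dots \oplus \mu_i^s| \rangle_{i \in \mathbb{N}}$ is monotonically increasing. Since $|\xi|$ is bounded, there must exist $\lambda \in \mathbb{R}$ such that $\langle |\gamma_i^c| \rangle \longrightarrow \lambda$.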

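We distinguish two cases. Assume $\lambda = 0$. Then the limit $\xi'$ of $\langle \mu_1^s \oplus \dots \oplus \mu_i^s \rangle_{i \in \mathbb{N}}$ is equivalent to $\xi$. This fact follows since $(\xi, \xi')$ is contained in the bisimulation $\mathcal{R}$ of Lemma 13. This holds since $\langle \mu_1^s \oplus \dots \oplus \mu_i^s \oplus \gamma_i^c \rangle \longrightarrow \xi'$ and $\xi \approx \mu_1^s \oplus \dots \oplus \mu_i^s \oplus \gamma_i^c$ for all $i \in \mathbb{N}$. Furthermore, $\mu' \Longrightarrow_C \xi'$ by Lemma 2. In total, this is a contradiction to the premise of the lemma.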

Now assume $\lambda > 0$. This means that from a certain point on $|\mu_j^c| \ge \lambda$. Recall that for every $i \in \mathbb{N}$: $\gamma_i^s \approx \mu_{i+1}^c \oplus \gamma_{i+1}^s$. Combining those facts, this means that $|\gamma_i^s| - |\gamma_{i+1}| \ge \lambda$. Thus, after a finite number of steps, for some $k \in \mathbb{N}$ this implies $\gamma_k^s < \mu_{k+1}^c$. But this contradicts the invariant $\mu_{k+1}^c < \gamma_{k+1}^s$. □

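Lemma 16. Let $\mu \approx \gamma$. Then there exist $\mu'$ and $\gamma'$ such that $\mu \Longrightarrow \mu'$ and $\gamma \Longrightarrow \gamma'$ and $\mu \approx \mu'$ and $\gamma \approx \gamma'$ and for all $C \in S/{\approx_\Delta} : \mu'(C) = \gamma'(C)$.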

Proof. We first define a set of possibly infinite internal transition trees that will be used to justify $\mu \Longrightarrow \mu'$ and $\gamma \Longrightarrow \gamma'$, respectively. The construction is identical for each element in $\mathrm{Supp}(\mu) \cup \mathrm{Supp}(\gamma)$.

Readers familiar with the notion of schedulers can regard the following construction as a variant of deterministic schedulers.

Let $T$ be an internal transition tree with $\mathrm{Sta}_T(\varepsilon) = s$ that is stepwise constructed as follows. Consider a leaf $\sigma$. Let $\mathrm{Sta}_T(\sigma) = E$. If $E \xrightarrow{\tau} \mu$ and $\mu \approx \Delta_E$ and $\exists G \in \mathrm{Supp}(\mu) : G \not\approx_\Delta E$, then continue $T$ with this transition. If $E$ allows several different internal transitions, we choose for every node of $T$ that is labelled by $E$ the same internal transition. In all other cases, we do not continue the tree here.

If a state is continued, following the construction above, we call it a continuous state. Otherwise, we call it a stopping state.

For simplicity, we will now speak of state $E$ in the tree $T$, when formally we mean a node $\sigma$ with $\mathrm{Sta}_T(\sigma) = E$. Since we continue every such node $\sigma$ with the same internal transition, this is well justified.

We will now prove that this internal transition tree indeed induces a full distribution. Let $\mu$ be the subdistribution induced by $T$, and assume, for a contradiction, that $\mu$ is not a full distribution.

Let us consider an arbitrary countable sequence $T_1, T_2, \dots$ of finite subtrees of $T$, which have $T$ as limit. Let $\gamma_i$ be their respective induced distributions. Then $\mu$ is a full distribution if and only if the sequence $\langle \gamma_i(\{s \in S \mid s \text{ is continuous}\}) \rangle \longrightarrow 0$. Since we assumed that $\mu$ is not a full distribution, $\langle \gamma_i(\{s \in S \mid s \text{ is continuous}\}) \rangle \longrightarrow L$ for some $L \in \mathbb{N}$ must hold.

Then the set of states occurring in $T$ can be partitioned into two finite sets of states (remember that we always assume that the state space is finite), namely (i) continuous