Data structures for the analysis of large structured Markov models

Dissertations, Theses, and Masters Projects, 2000

Andrew S. Miner

College of William & Mary - Arts & Sciences

Recommended Citation

Miner, Andrew S., "Data structures for the analysis of large structured Markov models" (2000). Dissertations, Theses, and Masters Projects. Paper 1539623985.

https://dx.doi.org/doi:10.21220/s2-sjja-aj08



DATA STRUCTURES FOR THE ANALYSIS OF LARGE

STRUCTURED MARKOV MODELS

A Dissertation

Presented to

The Faculty of the Department of Computer Science

The College of William & Mary in Virginia

In Partial Fulfillment

Of the Requirements for the Degree of

Doctor of Philosophy

by

Andrew Stephen Miner

2000


APPROVAL SHEET

This dissertation is submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Andrew S. Miner

Approved, June 2000

Gianfranco Ciardo, Thesis Advisor

Steve Park

Virginia Torczon

Alex Pothen, Old Dominion University


Table of Contents

Acknowledgments
List of Tables
List of Figures
List of Algorithms
List of Symbols
Abstract

1 Introduction
1.1 Contributions
1.2 Organization

2 Background
2.1 Notation and basic definitions
2.2 Sparse matrix storage
2.3 Solving linear systems

3 Markov Chains
3.1 Random variables and important distributions
3.2 Stochastic processes and Markov chains
3.3 Discrete-time Markov chains
3.3.1 Transient analysis
3.3.2 Stationary analysis
3.3.3 Mean time to absorption
3.4 Continuous-time Markov chains
3.4.1 Transient analysis
3.4.2 Stationary analysis
3.4.3 Mean time to absorption
3.5 Phase-type distributions
3.5.1 Discrete phase-types
3.5.2 Continuous phase-types
3.5.3 Phase-types and general distributions

4 High-level formalisms
4.1 Model paradigm
4.2 Petri nets
4.3 Logical analysis
4.4 Markov analysis
4.5 Structured models

5 Explicit State Space Generation
5.1 Generation algorithm
5.2 Traditional data structures
5.2.1 Storage of states
5.2.2 Storing a set of states
5.2.3 Storage of unexplored states
5.2.4 Compression
5.3 Multi-level trees for structured models
5.3.1 Representing unexplored states
5.3.2 Exploiting locality
5.3.3 Compression
5.4 Generating local states first
5.4.1 Traditional structure
5.4.2 Bit vectors
5.4.3 Multi-level arrays
5.5 Experimental results
5.6 Conclusion

6 Symbolic State Space Generation
6.1 Decision diagrams
6.1.1 Manipulating MDDs
6.2 Generating $\mathcal{S}$ with BDDs
6.3 Generating $\mathcal{S}$ with MDDs

6.3.2 Occurrence of synchronizing events
6.3.3 Complete generation algorithm
6.4 Logical queries on the state space
6.5 Experimental results
6.5.1 Dining philosophers model
6.5.2 Slotted ring model
6.5.3 FMS model
6.5.4 Kanban model
6.6 Conclusion

7 Transition Rate Matrix Storage
7.1 Kronecker algebra
7.2 Sparse Kronecker representations
7.3 Representing $\mathbf{Q}$ with Kronecker algebra
7.4 Kronecker overheads
7.4.1 Overheads from using the potential states
7.4.2 Overheads from using the actual states
7.5 Decision diagrams to store the state space
7.5.1 State searches
7.5.2 Computing state indices
7.5.3 Determining the next reachable state
7.6 Matrix diagrams to store the transition rate matrix

7.6.1 Kronecker products with matrix diagrams
7.6.2 Addition of matrix diagrams
7.6.3 Submatrix selection with matrix diagrams
7.6.4 Representing $\mathbf{R}$ with matrix diagrams
7.6.5 Matrix diagram column access
7.7 Experimental results
7.8 Conclusion

8 A Stationary Approximation
8.1 Exact aggregation
8.2 Approximate aggregation using decision diagrams
8.2.1 Our decision diagram structure
8.2.2 A partition based on decision diagrams
8.2.3 A simple case of exact aggregation
8.2.4 Our approximation
8.2.5 An example
8.2.6 Exploiting event locality
8.3 Algorithmic details
8.3.1 The fixed-point computation
8.3.2 Data structures
8.3.3 Computing measures
8.4 Product-form models
8.5 Experimental results

8.5.2 Load-dependent service model
8.5.3 Kanban model
8.5.4 Flexible manufacturing system (FMS) model
8.6 Conclusion

9 Applications
9.1 Distributed algorithm verification
9.2 Web server performance evaluation

10 Conclusion
10.1 Future research
10.1.1 Extensions
10.1.2 Related work
10.1.3 New directions

A SMART
A.1 SMART Language
A.1.1 Function declarations
A.1.2 Arrays
A.1.3 Fixed-point iterations
A.2 Random variables
A.3 Model formalisms
A.4 Distributed version (under development)

A.4.1 Distributed algorithms
A.4.2 Concurrent solutions

B Benchmarks
B.1 Kanban model
B.2 Flexible manufacturing system (FMS) model
B.3 Dining philosophers model
B.4 Slotted ring network protocol model

C Analysis of Case

D Dining philosophers states

Bibliography

Acknowledgments

This work would have been impossible without the help of many people. First and foremost, I would like to thank my advisor, Gianfranco Ciardo, who tolerated my unusual sleeping schedules. His office door was always open and we had countless fruitful discussions about SMART, research ideas, and papers. We also attended some excellent conferences and shared an extremely dangerous daily commute via rickety bicycles in Torino.

I would also like to thank Steve Park, who somehow managed to find time in his busy schedule as Department Chair (and now as Dean of Arts and Sciences) to help me with grant proposals, cover letters, and other delicate matters of diplomacy and word-smithing. I must also profusely thank Evgenia Smirni. Although I could not possibly enumerate the entire list of items for which I owe her thanks (which grows by the day), I will at least say that I would almost certainly be unemployed without her help. Many thanks also to Andreas Stathopoulos, Virginia Torczon, and Alex Pothen (and to the rest of the committee members already named) for reading this rather large body of work and providing many valuable constructive comments. Thanks also to Susanna Donatelli for arranging my productive research visit to the Universita di Torino, Italy. I hope it was the first of many such visits. I would also like to thank the Virginia Space Grant Consortium, the NASA Graduate Student Researchers Program, and the Department of Computer Science at William and Mary for financial support during my graduate studies.

Finally, on a personal note, I thank my family and friends for their support, especially Shannon. I will always remember Wednesday pool, "celebrity" Cheese Shop lunches, disc golf, frisbee at all hours, Mystery Science Theater 3000, and 1am trips to Dennys.

This document was prepared using the document preparation system [62]. The figures were drawn with Tgif.

List of Tables

2.1 Memory required to store a matrix
3.1 Computing $\boldsymbol{\pi}(6.0)$ using uniformization
4.1 Comparison of memory required for vs.
4.2 Event rates for the structured model of Figure 4.5
5.1 State space sizes for benchmark models
5.2 Memory usage for traditional techniques
5.3 Memory usage for structured techniques
6.1 Results for Dining Philosophers, two philosophers per submodel
6.2 Results for Slotted Ring, one node per submodel
6.3 FMS decompositions
6.4 Results for FMS, 10 submodel decomposition
6.5 Results for Kanban, 4 submodel decomposition
7.1 Next functions for the structured model of Figure 4.5
7.2 CTMC sizes and memory requirements for Kanban and FMS
8.1 CTMC sizes for the fork-join model
8.2 Iterations for the fork-join model, $N = 40$, $\lambda = 1$
8.3 CTMC sizes for the load-dependent service model
8.4 Iterations for the load-dependent service model, $N = 40$, $\lambda = 1$
8.5 CTMC sizes for the Kanban model
8.6 Iterations for the Kanban model, $N = 66$
8.7 CTMC sizes for the FMS model
8.8 Iterations for the FMS model, $N = 33$
9.1 Results for verification of Algorithm 9.1
9.2 Results for verification of modified Algorithm 9.1
9.3 Rates for the model of Figure 9.4 with 5 servers
9.4 Rates for the model of Figure 9.4 with 10 servers
B.1 Transition rates for the Kanban model
B.2 Transition rates for the Flexible Manufacturing System model

List of Figures

2.1 Storage of a sparse matrix
3.1 Example DTMCs
3.2 Example CTMCs
3.3 Discrete phase distributions
3.4 Continuous phase distributions
4.1 Petri net of an open queueing network
4.2 Producer / Consumer Petri net
4.3 A simple model and its state space
4.4 Underlying CTMC based on $\mathcal{S}$
4.5 Example of a structured model
4.6 Structured models and logical product form
5.1 Some P-semiflows and invariants of a Petri net
5.2 A hash table with chaining
5.3 Storing unexplored states using an extra pointer per node
5.4 Storing unexplored states using a linked list
5.6 Compression using a sorted array of states
5.7 Compression using an unordered array of states and an ordering array
5.8 Chiola's multi-level technique to store the reachability set of a SPN
5.9 Mathematical representation of a multi-level structure
5.10 A multi-level tree representing the structure in Figure 5.9
5.11 Representing U with a second multi-level tree
5.12 Representing U and Tt in one multi-level tree
5.13 The compressed version of the multi-level tree of Figure 5.10
5.14 A multi-level array representing the structure in Figure 5.9
5.15 Traditional generation times for Kanban
5.16 Traditional generation times for FMS
5.17 Traditional generation times for Dining Philosophers
5.18 Traditional generation times for Slotted Ring
6.1 MDD representations of $\min(\{x, y, z\})$
6.2 Computing $f = \mathit{Case}(\min(\{x, y\}), z < 0, z < 1, z < 2)$
6.3 MDDs for the Case computation of Figure 6.2
6.4 A BDD encoding reachable markings for a simple Petri net
6.5 The MDD equivalent of Figure 5.14
6.6 Application of Equation 6.3 for an event local to submodel 2
6.7 MDD example for a synchronizing event $e$
6.8 Performing a query on $\mathcal{S}$
6.9 Symbolic generation times and memory usage for Dining Philosophers
6.10 Symbolic generation times and memory usage for Slotted Ring
6.11 Symbolic generation times and memory usage for FMS
6.12 Symbolic generation times and memory usage for Kanban
7.1 Example Kronecker product
7.2 Example Kronecker sum
7.3 Local matrices for the structured model of Figure 4.5
7.4 Kronecker matrices for the structured model of Figure 4.5
7.5 State space data structure
7.6 An example matrix diagram and the matrix it represents
7.7 Matrix diagram of the Kronecker product of Figure 7.1
7.8 Example of matrix diagram addition
7.9 Example of a matrix diagram representing a submatrix
7.10 Matrix diagram for the structured model of Figure 4.5
7.11 Interesting portion of the matrix diagram in Figure 7.10
7.12 A trace of Algorithm 7.4
7.13 Column multiplication times, Kanban model
7.14 Column multiplication times, FMS model
8.1 A simple aggregation example
8.2 Adding redundant nodes to an ROMDD
8.3 Example of sets $A(p)$ and $B(p)$
8.4 Decision diagram with node labels
8.6 Computing A matrices at level 3
8.7 Level 3 CTMC
8.8 Level 2 CTMC
8.9 Level 1 CTMC
8.10 MDD for a product-form network with 4 queues and 4 customers
8.11 A fork-join model
8.12 Error for the fork-join model against $N$
8.13 Error for the fork-join model against $\lambda$
8.14 A load-dependent service model
8.15 Error for the load-dependent service model against $N$
8.16 Error for the load-dependent service model against $\lambda$
8.17 Relative error for the Kanban model
8.18 Relative error for the FMS model
9.1 Message passing subnet for machine $i$, Algorithm 9.1
9.2 Probe subnet for machine $i$, Algorithm 9.1
9.3 Special subnet for machine 0, Algorithm 9.1
9.4 Model of a group of $K$ Web servers
9.5 Probability of a full system, $K = 5$ servers and $J = 5$ jobs
9.6 Average number of requests in the system, $K = 5$ servers and $J = 5$ jobs
9.7 CPU times for $K = 5$ servers and $J = 5$ jobs
9.8 Probability of a full system, $K = 10$ servers and $J = 10$ jobs
9.9 Average number of requests in the system, $K = 10$ servers and $J = 10$ jobs
9.10 CPU times for $K = 10$ servers and $J = 10$ jobs
B.1 Petri net of the Kanban model
B.2 Petri net of the Flexible Manufacturing System model
B.3 The $i$-th philosopher subnet
B.4 The Petri net for ten dining philosophers
B.5 The $i$-th network node for slotted ring
D.1 MDD encoding for $\mathcal{S}$, 5 dining philosophers

List of Algorithms

2.1 Computing a Jacobi iteration by rows
2.2 Computing a Jacobi iteration by columns
2.3 Computing a Gauss-Seidel iteration by rows
5.1 Traditional state space generation
5.2 Inserting a state into a multi-level tree
5.3 Choosing and removing a state from a multi-level tree
6.1 The Case operator on MDDs
6.2 Generating $\mathcal{S}$ using BDDs
6.3 Adding states due to local events
6.4 Determining states due to synchronizing events
6.5 Generating $\mathcal{S}$ using MDDs
7.1 Building a matrix diagram of a Kronecker product
7.2 Adding two matrix diagrams
7.3 Computing a submatrix using matrix diagrams
7.4 Obtaining a matrix diagram column
8.1 Computing the A matrices
8.2 Computing the B matrices
8.3 Our fixed-point iteration
9.1 Modified termination detection
B.1 SMART code for the Kanban model

List of Symbols

$\mathbb{N}$: Set of natural numbers
$\mathbb{R}$: Set of real numbers
$\eta(\mathbf{M})$: The number of non-zero elements of matrix $\mathbf{M}$
$\mathrm{RowSum}(\mathbf{M})$: A vector of row sums of matrix $\mathbf{M}$
$\mathrm{Diag}(\mathbf{x})$: The matrix with vector $\mathbf{x}$ along the diagonal
$\mathbf{P}$: The probability matrix of a DTMC
$\mathbf{p}$: The stationary probability vector of a DTMC
$\mathbf{R}$: The transition rate matrix of a CTMC
$\mathbf{Q}$: The infinitesimal generator matrix of a CTMC
$\boldsymbol{\pi}$: The stationary probability vector of a CTMC
$\mathcal{E}$: The set of events of a model
$\hat{\mathcal{S}}$: The set of potential (possible) states
$\mathcal{S}$: The set of actual (reachable) states
s s': n-step reachability
$\hat{\mathbf{R}}$: Transition matrix based on potential states
$\mathbf{R}^{e}$: Transitions due to event $e$
$K$: The number of submodels in a structured model
$\mathit{first}(e)$: The first submodel affected by event $e$
$\mathit{last}(e)$: The last submodel affected by event $e$
$f_{x=i}$: Cofactors of a function
$\chi_{\mathcal{S}}$: Characteristic function of set $\mathcal{S}$
$\otimes$: Kronecker product
$\oplus$: Kronecker sum
$B(p)$: Substates encoded below node $p$
$A(p)$: Substates above node $p$

Abstract

High-level modeling formalisms are increasingly popular tools for studying complex systems. Given a high-level model, we can automatically verify certain system properties or compute performance measures about the system. In the general case, measures must be computed using discrete-event simulations. In certain cases, exact numerical analysis is possible by constructing and analyzing the underlying stochastic process of the system, which is a continuous-time Markov chain (CTMC) in our case. Unfortunately, the number of states in the underlying CTMC can be extremely large, even if the high-level model is "small". In this thesis, we develop data structures and techniques that can tolerate these large numbers of states.

First, we present a multi-level data structure for storing the set of reachable states of a model. We then introduce the concept of event "locality", which considers the components of the model that an event may affect. We show how a state generation algorithm using our multi-level structure can exploit event locality to reduce CPU requirements.

Then, we present a symbolic generation technique based on our multi-level structure and our concept of event locality, in which operations are applied to sets of states. The extremely compact data structure and efficient manipulation routines we present allow for the examination of much larger systems than was previously possible.

The transition rate matrix of the underlying CTMC can be represented with Kronecker algebra under certain conditions. However, the use of Kronecker algebra introduces several sources of CPU overhead during numerical solution. We present data structures, including our new data structure called matrix diagrams, that can reduce this CPU overhead. Using our techniques, we can compute measures for large systems in a fraction of the time required by current state-of-the-art techniques.

Finally, we present a technique for approximating stationary measures using aggregations of the underlying CTMC. Our technique utilizes exact knowledge of the underlying CTMC using our compact data structure for the reachable states and a Kronecker representation for the transition rates. We prove that the approximation is exact for models possessing a product-form solution.


Chapter 1

Introduction

Advancements in technology demand increasingly complex systems. The widespread growth of the internet and wireless communications, for instance, has fueled research in techniques for analyzing large systems. The design of such a system almost certainly requires the use of computer models and simulations to assist engineers in making important design decisions. As a result, high-level modeling formalisms, such as stochastic Petri nets, are gaining acceptance as tools to study such systems. These high-level models allow for automatic verification and performance evaluation of systems whose analysis would otherwise be impossible.

Generally, a model to be analyzed has certain properties that need to be verified, such as "the system never reaches a deadlocked state". In some cases, there are also performance or reliability measures of interest to be determined, such as "what is the probability that the system is down". The former type of analysis can only be performed in general by a systematic examination of the states of the high-level model [19, 27, 49, 57]. For certain types of models, efficient symbolic techniques can be used, which do not require explicit examination of every state [17, 32, 68, 78, 79, 80, 81]. These techniques are quite promising, as the states described by a high-level model can easily number in the millions or billions.

Performance evaluation of a high-level model can be performed either by discrete-event simulation, by exact analysis, or by approximation. While discrete-event simulation is applicable to an extremely general class of problems, accurate solutions may require long simulation runs, especially if the analysis involves the study of rare events. Exact analysis, on the other hand, is applicable to certain types of stochastic models only. In our work, we consider a fairly general class of formalisms in which the underlying stochastic process is a Markov chain. In this case, analysis of the model requires generation and analysis of the underlying Markov chain. As mentioned above, it is not uncommon for these Markov chains to contain millions or billions of states. This leads to difficulties, as exact analysis requires us to represent the reachable states of the model, the transition rate matrix of the Markov chain, and a solution vector corresponding to the computed probability for each state. These three structures pose obvious storage difficulties when the number of states of the Markov chain becomes large. Much attention has been given to the transition rate matrix, as it is the largest of the three structures. Techniques based on Kronecker algebra have received much attention [12, 14, 16, 18, 28, 29, 41, 42, 44, 56, 84, 85, 86, 91], although some alternatives have also been investigated [38, 39, 52].
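To give a rough sense of scale for the solution vector alone, the following sketch estimates its size for state counts in the millions and billions. The choice of one 8-byte double-precision value per state is an assumption made here for illustration; the text does not fix a precision.

```python
# Rough estimate of the memory needed for the solution vector alone.
# Assumption (not from the text): one 8-byte double per state.
BYTES_PER_PROBABILITY = 8


def solution_vector_bytes(num_states: int) -> int:
    """Bytes needed to hold one probability per reachable state."""
    return num_states * BYTES_PER_PROBABILITY


if __name__ == "__main__":
    for num_states in (10**6, 10**9):
        gib = solution_vector_bytes(num_states) / 2**30
        print(f"{num_states:>13,d} states -> {gib:8.3f} GiB")
```

The state set and the transition rate matrix come on top of this, which is why the remaining chapters concentrate on compact representations of all three structures.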

Approximation techniques often involve decomposing the model into submodels, which are then analyzed in isolation. The results obtained for the submodels are then combined. Fixed-point iterations can be used to resolve the dependencies between submodels. The overall storage and CPU requirements for the analyses of the submodels are usually a small fraction of those for an exact analysis of the entire model. As a result, model-based decompositions have been successfully used [20, 30, 50, 51, 53, 69, 93, 98, 100, 105, 107] to accurately approximate performance measures when an exact solution is infeasible.

1.1 Contributions

In this work, we address each of the three major structures required for exact stationary analysis: the set of reachable states $\mathcal{S}$, the transition rate matrix $\mathbf{R}$ of the Markov chain, and the stationary probability vector $\boldsymbol{\pi}$. We consider only a very small subset of model verification problems; namely, that of generating and examining the set of reachable states $\mathcal{S}$.

First, we develop a multi-level data structure for storing $\mathcal{S}$ that can be used with structured models. We then show how, when an event occurs, we can update a portion of the data structure only, by exploiting structural properties of the model. This new concept of the "locality" of an event can substantially reduce generation times, and is used throughout the work.

Second, we develop a technique for symbolically generating $\mathcal{S}$ that can be applied to a general class of structured models. We present an encoding scheme that combines ideas from our multi-level structure and from decision diagrams. We then develop specialized manipulation routines for our encoding which allow us to generate $\mathcal{S}$ extremely efficiently.

It has been shown that a Kronecker representation for the transition rate matrix $\mathbf{R}$ can reduce the storage requirements for $\mathbf{R}$ by orders of magnitude. However, Kronecker techniques suffer from significant sources of CPU overhead. Our third contribution consists of new data structures and techniques that eliminate or reduce these CPU overheads.
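The storage saving can be seen on a toy example: a Kronecker product is fully determined by its small factors, so only the factors need to be kept. The sketch below is purely illustrative, using made-up dense matrices held as Python lists; it is not the representation of $\mathbf{R}$ developed in Chapter 7, and the function names are hypothetical.

```python
# Illustrative only: a dense Kronecker product of small matrices (lists of rows),
# used to compare the entries stored for the factors with those of the product.
def kron(a, b):
    """Kronecker product: entry (ia*rows_b + ib, ja*cols_b + jb) is a[ia][ja] * b[ib][jb]."""
    return [
        [a_val * b_val for a_val in a_row for b_val in b_row]
        for a_row in a
        for b_row in b
    ]


def entries(m):
    """Number of entries held when a matrix is stored in full."""
    return len(m) * len(m[0])


if __name__ == "__main__":
    # Three made-up 3 x 3 factors; their product is 27 x 27.
    factor = [[0.0, 1.0, 0.0], [0.0, 0.0, 2.0], [3.0, 0.0, 0.0]]
    product = kron(kron(factor, factor), factor)
    print("entries in the three factors:", 3 * entries(factor))
    print("entries in the full product: ", entries(product))
```

With $K$ factors of size $n \times n$, the factors hold $K n^2$ entries while the full product holds $n^{2K}$, which is where the orders-of-magnitude difference comes from. The price is that individual entries of the product must be computed from the factors on demand.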

With efficient representations for $\mathcal{S}$ and $\mathbf{R}$, the only remaining bottleneck for exact analysis is the solution vector $\boldsymbol{\pi}$. Our fourth major contribution is a technique for approximating $\boldsymbol{\pi}$. Unlike other approximations, ours makes use of exact knowledge of $\mathcal{S}$ and $\mathbf{R}$ by using our previous contributions. This enables our technique to correctly assign a zero probability to unreachable states.

Finally, the software tool SMART [26], which is discussed in Appendix A, represents a considerable contribution to the academic and modeling community.

1.2 Organization

The remainder of the thesis is organized as follows. The next three chapters are background chapters. Chapter 2 introduces our notation and gives some important background information about storing matrices and solving linear systems. Chapter 3 presents an overview of random variables and stochastic processes, with particular emphasis on Markov chains. Chapter 4 describes how a high-level formalism can be used to generate a Markov chain. Various classes of structured models are defined.

Our main contributions are presented in four chapters. Chapter 5 describes our multi-level data structure for explicit storage of the states of the model. We compare our data structure with several other explicit storage schemes. Chapter 6 presents our symbolic technique for generating and storing states of a structured model with certain properties. Our new approach is compared with existing symbolic approaches. Chapter 7 discusses approaches in which the transition rate matrix of the underlying Markov chain for a structured model is represented algebraically using Kronecker products and sums. We present data structures and techniques for reducing or eliminating overheads inherent in Kronecker approaches. Chapter 8 describes a novel technique for approximating stationary measures of a structured model, based on exact knowledge of the underlying Markov chain.

Example applications of our techniques are presented in Chapter 9. Concluding remarks and directions for future work are given in Chapter 10. There are four appendices. Appendix A discusses SMART, a software package that incorporates the techniques described in this thesis. Appendix B describes the models we use as benchmarks throughout the work. Appendix C presents a detailed analysis of one of the algorithms described in Chapter 6. Finally, Appendix D derives an expression for the number of reachable states for one of our benchmark models.

Chapter 2

Background

This chapter covers basic concepts that are used throughout our work. Section 2.1 introduces our notation and presents a few basic definitions used in the remainder of this thesis. Section 2.2 gives an overview of data structures used to store sparse matrices. Finally, Section 2.3 briefly describes the iterative techniques we use to solve the linear systems that arise in our work. For in-depth treatment of these topics, we refer the reader to [58, 83, 96] on the subject of sparse matrix storage, and to [6, 47, 96, 103] on the subject of solving linear systems.

2.1 Notation and basic definitions

Sets are denoted in upper-case calligraphic letters, such as $\mathcal{S}$. The fundamental sets are exceptions: the set of naturals is denoted $\mathbb{N}$, the set of reals is denoted $\mathbb{R}$, and the sets of positive and non-negative reals are denoted $\mathbb{R}^{+}$ and $\mathbb{R}^{*}$, respectively. The minimal and maximal elements of a set $\mathcal{S}$ of reals are denoted by $\min(\mathcal{S})$ and $\max(\mathcal{S})$, respectively.

Matrices are written in upper-case bold letters, such as $\mathbf{M}$. We say a real matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ has $m$ rows and $n$ columns. The identity matrix of size $n \times n$ is denoted as $\mathbf{I}_n$, although if the size is clear from the context it will be written as simply $\mathbf{I}$. The matrix $\mathbf{1}^{m \times n}$ ($\mathbf{0}^{m \times n}$) is the matrix of all ones (zeroes) with $m$ rows and $n$ columns, although it will be written as simply $\mathbf{1}$ ($\mathbf{0}$) if the size is clear from the context. The matrix element at row $i$ and column $j$ is denoted as $\mathbf{M}[i,j]$, for $i \in \{0, \ldots, m-1\}$ and $j \in \{0, \ldots, n-1\}$. A set is used to indicate more than one row or column. For example, $\mathbf{M}[\mathcal{I}, \mathcal{J}]$ refers to the submatrix of $\mathbf{M}$ with rows $\mathcal{I}$ and columns $\mathcal{J}$. Row $i$ (column $j$) of matrix $\mathbf{M}$ is denoted $\mathbf{M}[i, \cdot]$ ($\mathbf{M}[\cdot, j]$). The number of non-zero elements in matrix $\mathbf{M}$ is denoted $\eta(\mathbf{M})$. The transpose of matrix $\mathbf{M}$ is denoted $\mathbf{M}^{T}$. The inverse of a square matrix $\mathbf{M}$ is denoted $\mathbf{M}^{-1}$.
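For readers who prefer code to notation, the sketch below gives plain-Python counterparts of the matrix notation just introduced, with a matrix held as a list of rows. The function names are chosen here to echo the notation and are not part of the thesis or of any library.

```python
# Plain-Python counterparts of the matrix notation, using lists of rows.
# Function names are ours, chosen to echo the notation; indices start at 0.
M = [
    [0.0, 2.0, 0.0],
    [1.5, 0.0, 4.0],
]


def submatrix(m, rows, cols):
    """M[I, J]: the rows in `rows` and the columns in `cols`, in increasing order."""
    return [[m[i][j] for j in sorted(cols)] for i in sorted(rows)]


def row(m, i):
    """M[i, .]: row i of the matrix."""
    return list(m[i])


def column(m, j):
    """M[., j]: column j of the matrix."""
    return [r[j] for r in m]


def eta(m):
    """eta(M): the number of non-zero elements."""
    return sum(1 for r in m for v in r if v != 0)


def transpose(m):
    """M^T: rows and columns exchanged."""
    return [list(c) for c in zip(*m)]


if __name__ == "__main__":
    print(submatrix(M, {1}, {0, 2}))
    print(row(M, 0), column(M, 1), eta(M))
    print(transpose(M))
```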

A matrix with a single column (row) is called a column (row) vector. Vectors will be denoted with lower-case bold letters, such as $\mathbf{x}$. Elements of a vector are denoted as $\mathbf{x}[i]$. If $\mathbf{x}$ is a column vector, then $\mathbf{x}[i] = \mathbf{x}[i,0]$; otherwise $\mathbf{x}[i] = \mathbf{x}[0,i]$. The same notation is used for both row and column vectors, as usually it is clear from the context if a vector is a row or column vector. A probability vector is a vector whose elements are non-negative and sum to one:

$$\mathbf{x} \in (\mathbb{R}^{*})^{n} \;\wedge\; \sum_{i=0}^{n-1} \mathbf{x}[i] = 1.$$

The dot product of two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n}$ is defined as

$$\mathbf{x} \cdot \mathbf{y} = \sum_{i=0}^{n-1} \mathbf{x}[i]\,\mathbf{y}[i].$$
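The two vector definitions above (probability vector and dot product) translate directly into code. This is a minimal sketch with vectors as Python lists; the function names and the tolerance used to compare the floating-point sum against one are assumptions made for illustration.

```python
import math


def is_probability_vector(x, tol=1e-12):
    """True if every element is non-negative and the elements sum to one (within tol)."""
    return all(v >= 0 for v in x) and math.isclose(sum(x), 1.0, rel_tol=0.0, abs_tol=tol)


def dot(x, y):
    """Dot product of two vectors of equal length n: the sum of x[i] * y[i]."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))


if __name__ == "__main__":
    print(is_probability_vector([0.25, 0.25, 0.5]))
    print(is_probability_vector([-0.5, 1.5]))
    print(dot([1, 2, 3], [4, 5, 6]))
```

A tolerance is needed only because the sum is computed in floating point; the definition itself requires the sum to equal one exactly.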

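RowSum($\mathbf{M}$) is the vector whose elements are the row sums of matrix $\mathbf{M}$. That is, if $\mathbf{M} \in \mathbb{R}^{m \times n}$ then RowSum($\mathbf{M}$) $= \mathbf{M} \cdot \mathbf{1}_{n \times 1} \in \mathbb{R}^m$. Diag($\mathbf{x}$) is the square matrix with $\mathbf{x}$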


CHAPTER 2. BACKGROUND 9

along the diagonal and zeroes elsewhere:

$$\mathrm{Diag}(\mathbf{x})[i,j] = \begin{cases} \mathbf{x}[i] & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

2.2 Sparse matrix storage

A matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ can be stored using full storage, in which every element of $\mathbf{M}$ is explicitly stored. This is typically done using a two-dimensional array of size $m \times n$, or a one-dimensional array of size $mn$. In either case, full storage requires exactly $m \cdot n \cdot b_f$ bits of memory, where $b_f$ is the number of bits to store a floating-point number of desired precision.

A matrix is called sparse if it contains relatively few non-zero entries: $\eta(\mathbf{M}) \ll m \cdot n$. Memory can be conserved by storing only the non-zero elements of sparse matrices [83, 96]. To do so, we must also store some indexing information. Thus, sparse-storage structures may be inefficient when applied to dense matrices due to the overhead of the indexing information. The amount of “sparseness” required for a sparse-storage structure to be more memory-efficient than full storage depends on the structure, the number of bits required for floating-point representation, and other factors.

One way to represent a sparse matrix is to use a linked list for each row, which stores only the non-zero elements of that row. Each node in the list stores a column index and the associated value for that column. We say such a matrix is stored in sparse row-wise format. While it is relatively easy to access a row of a matrix stored in this format, it is not so easy to access a column of the matrix. If we require column access, we can use sparse column-wise format. This is essentially the same structure, except each list stores the non-zero elements of a column. If we require both row and column access, we can store

[Figure 2.1: Storage of a sparse matrix. Panels: full storage; sparse row-wise using linked lists; sparse column-wise using linked lists; sparse with row and column access; sparse row-wise using arrays; sparse column-wise using arrays. The matrix shown under full storage is

    0.0  0.0  3.1  0.0  0.0  4.1  0.0
    0.0  0.0  0.0  5.9  0.0  0.0  2.6
    5.3  0.0  0.0  0.0  5.8  0.0  9.7
    0.0  0.0  0.0  0.0  0.0  9.3  0.0
]

linked lists for both the rows and the columns [58]. Alternatively, we can convert from row-wise format to column-wise format in $O(\eta(\mathbf{M}))$ operations.

If we have prior knowledge of $\eta(\mathbf{M})$, the rows or columns can be represented using arrays instead of linked lists. For efficiency, instead of using a separate array for each row or column, we use a single array. In place of the pointers to linked lists, we maintain array indices to the first non-zero element of each row or column. To mark the last non-zero element of the matrix, an extra index is added.
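The array-based layout described above can be sketched in Python as follows. This is a minimal illustration, assuming the 4 x 7 matrix of Figure 2.1; the helper names `to_row_arrays` and `rows_to_columns` are hypothetical and do not come from the text. The conversion uses a counting pass over the non-zeros followed by a pass over the column starts.

```python
def to_row_arrays(dense):
    """Build sparse row-wise arrays: row start indices, column indices, values."""
    row_start, col_index, value = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0.0:
                col_index.append(j)
                value.append(v)
        # One start index per row; the final entry is the extra end marker.
        row_start.append(len(value))
    return row_start, col_index, value


def rows_to_columns(row_start, col_index, value, n_cols):
    """Convert row-wise arrays to column-wise arrays."""
    col_start = [0] * (n_cols + 1)
    for j in col_index:                  # count the non-zeros in each column
        col_start[j + 1] += 1
    for j in range(n_cols):              # prefix sums give the column starts
        col_start[j + 1] += col_start[j]
    next_slot = col_start[:-1].copy()    # next free position in each column
    row_index = [0] * len(value)
    col_value = [0.0] * len(value)
    for i in range(len(row_start) - 1):
        for k in range(row_start[i], row_start[i + 1]):
            slot = next_slot[col_index[k]]
            row_index[slot] = i
            col_value[slot] = value[k]
            next_slot[col_index[k]] += 1
    return col_start, row_index, col_value


M = [
    [0.0, 0.0, 3.1, 0.0, 0.0, 4.1, 0.0],
    [0.0, 0.0, 0.0, 5.9, 0.0, 0.0, 2.6],
    [5.3, 0.0, 0.0, 0.0, 5.8, 0.0, 9.7],
    [0.0, 0.0, 0.0, 0.0, 0.0, 9.3, 0.0],
]
row_start, col_index, value = to_row_arrays(M)
col_start, row_index, col_value = rows_to_columns(row_start, col_index, value, 7)
```

Note that the empty column 1 costs nothing beyond its start index: its start equals the start of column 2.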

Storage of matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$, where:
    $b_f$ = # bits for a floating-point number of desired precision
    $b_p$ = # bits for a pointer
    $b_i$ = # bits for an integer of appropriate size

Technique                                 Memory
Full storage                              $m n b_f$
By rows with linked lists                 $m b_p + \eta(\mathbf{M})(b_i + b_f + b_p)$
By columns with linked lists              $n b_p + \eta(\mathbf{M})(b_i + b_f + b_p)$
By rows and columns with linked lists     $m b_p + n b_p + \eta(\mathbf{M})(2 b_i + b_f + 2 b_p)$
By rows with arrays                       $(m+1) b_i + \eta(\mathbf{M})(b_i + b_f)$
By columns with arrays                    $(n+1) b_i + \eta(\mathbf{M})(b_i + b_f)$

Table 2.1: Memory required to store a matrix

An example illustrating the data structures used for sparse storage of a matrix is given in Figure 2.1. Each structure represents the same 4 x 7 matrix. For clarity, null pointers are not drawn. The storage requirement for each structure is shown in Table 2.1. Note that integers of size one or two bytes can be used if the numbers of rows, columns, and non-zero elements are sufficiently small. Further memory savings can be achieved by using the minimal number of bits to store each integer. For instance, if sparse column-wise storage is used with arrays, the row indices can be stored in $\lceil \log_2 m \rceil$ bits, and the “pointers” to the elements can be stored in $\lceil \log_2(\eta(\mathbf{M}) + 1) \rceil$ bits. Also, note that the sparse structures described do not work well for “ultra-sparse” matrices, in which many rows and columns are empty. Row-wise, sparse-storage structures can be modified to store only the non-empty rows, and these modified structures will conserve memory if most of the rows are empty. Similar modifications can be made for column-wise storage.
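The formulas of Table 2.1 are easy to evaluate for a concrete case. The sketch below is illustrative only: the function names and the default widths (64-bit floats and pointers, 32-bit integers) are assumptions, and the row-index width $\lceil \log_2 m \rceil$ used in `packed_column_bits` is the natural choice for indices in $\{0, \ldots, m-1\}$ rather than a value confirmed by the text.

```python
import math


def storage_bits(m, n, nnz, bf=64, bp=64, bi=32):
    """Memory in bits for each technique listed in Table 2.1."""
    return {
        "full": m * n * bf,
        "rows, linked lists": m * bp + nnz * (bi + bf + bp),
        "columns, linked lists": n * bp + nnz * (bi + bf + bp),
        "rows and columns, linked lists": (m + n) * bp + nnz * (2 * bi + bf + 2 * bp),
        "rows, arrays": (m + 1) * bi + nnz * (bi + bf),
        "columns, arrays": (n + 1) * bi + nnz * (bi + bf),
    }


def packed_column_bits(m, n, nnz, bf=64):
    """Column-wise arrays with minimal-width row indices and element 'pointers'."""
    row_bits = math.ceil(math.log2(m)) if m > 1 else 1
    ptr_bits = math.ceil(math.log2(nnz + 1))
    return (n + 1) * ptr_bits + nnz * (row_bits + bf)


# The 4 x 7 matrix of Figure 2.1 has 8 non-zero elements.
bits = storage_bits(m=4, n=7, nnz=8)
packed = packed_column_bits(m=4, n=7, nnz=8)
```

Under these assumed widths, row-wise arrays need 928 bits against 1792 for full storage, while the structure with both row and column lists needs 2752 bits, which illustrates how indexing overhead can outweigh the savings on a small matrix.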

Another important benefit to using sparse-storage structures is the savings in computational complexity. A frequently used matrix operation is that of vector-matrix multiplication. Given a matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ and vectors $\mathbf{x} \in \mathbb{R}^m$, $\mathbf{y} \in \mathbb{R}^n$, all stored using full storage, the cost of computing $\mathbf{x}\mathbf{M}$ or $\mathbf{M}\mathbf{y}$ is $mn$ floating-point multiplications. However, multiplication algorithms for sparse matrices require only $\eta(\mathbf{M})$ floating-point multiplications, assuming the matrix is stored using a sparse structure. For large, sparse matrices this difference is substantial.
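A sketch of the sparse product $\mathbf{x}\mathbf{M}$ makes the operation count visible. It assumes the row-wise array layout (row starts, column indices, values) and the matrix of Figure 2.1; the function name and the explicit multiplication counter are illustrative additions.

```python
def vector_times_matrix(x, row_start, col_index, value, n_cols):
    """Compute y = x M for M stored row-wise in arrays; also count multiplications."""
    y = [0.0] * n_cols
    multiplications = 0
    for i, xi in enumerate(x):
        # Row i contributes xi * M[i, j] to y[j] for each stored non-zero.
        for k in range(row_start[i], row_start[i + 1]):
            y[col_index[k]] += xi * value[k]
            multiplications += 1
    return y, multiplications


# Row-wise arrays for the 4 x 7 matrix of Figure 2.1.
row_start = [0, 2, 4, 7, 8]
col_index = [2, 5, 3, 6, 0, 4, 6, 5]
value = [3.1, 4.1, 5.9, 2.6, 5.3, 5.8, 9.7, 9.3]

y, count = vector_times_matrix([1.0, 1.0, 1.0, 1.0], row_start, col_index, value, 7)
```

With a vector of ones, `y` holds the column sums of the matrix, and the loop performs 8 multiplications instead of the 28 needed with full storage.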

2.3 Solving linear systems

Many computations of interest will require us to solve a linear system of equations of the form $\mathbf{x}\mathbf{M} = \mathbf{y}$ for an unknown vector $\mathbf{x}$. This form can be rearranged by

$$\begin{aligned} \mathbf{x}\mathbf{M} &= \mathbf{y} \\ (\mathbf{x}\mathbf{M})^T &= \mathbf{y}^T \\ \mathbf{M}^T \mathbf{x}^T &= \mathbf{y}^T \end{aligned}$$

to obtain the preferred form $\mathbf{A}\mathbf{x} = \mathbf{b}$. It is important to note that the techniques we discuss apply to $\mathbf{A}\mathbf{x} = \mathbf{b}$; thus if our solution technique requires row access of $\mathbf{A}$, then this translates to column access of $\mathbf{M}$.

Solution of the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$ is a thoroughly discussed problem [6, 47, 96, 103] and several techniques are available. For our applications, $\mathbf{A}$ is typically a very large, extremely sparse, square matrix. Usually, we do not use techniques that compute the inverse of $\mathbf{A}$, for two reasons.

1. Time requirements: Computing the inverse of an $n \times n$ matrix requires $O(n^3)$ floating-point operations.

2. Memory requirements: Since $\mathbf{A}$ is sparse, it can be stored using $O(\eta(\mathbf{A}))$ memory. However, the inverse of a sparse matrix is not necessarily sparse (and usually is not sparse), so storage of $\mathbf{A}^{-1}$ will require $O(n^2)$ memory.

Instead, we prefer “indirect” techniques that perform a series of matrix-vector multiplications, each requiring $O(\eta(\mathbf{A}))$ floating-point operations. Since the matrix $\mathbf{A}$ is never modified, relatively low-precision floating-point representation can be used for $\mathbf{A}$. This combined with a sparse-storage structure results in significant memory savings.

We consider iterative techniques that compute a sequence $\mathbf{x}_n$ of approximations to $\mathbf{x}$. Given an initial guess $\mathbf{x}_0$, the remainder of the sequence is computed from an equation of the form

$$\mathbf{x}_{n+1} = \mathbf{B}\mathbf{x}_n + \mathbf{k} \tag{2.1}$$

where the matrix $\mathbf{B}$ and the vector $\mathbf{k}$ are specified by our iterative technique. The sequence is guaranteed to converge for any initial guess $\mathbf{x}_0$, provided

$$\lim_{n \to \infty} \mathbf{B}^n = \mathbf{0}.$$

This occurs when $\rho(\mathbf{B})$, the largest magnitude of any eigenvalue of $\mathbf{B}$ (its spectral radius), is strictly less than one. The asymptotic rate of convergence depends on $\rho(\mathbf{B})$: the smaller the value of $\rho(\mathbf{B})$, the faster the sequence computed by Equation 2.1 is likely to converge.

Of course, we cannot compute $\mathbf{x}_\infty$; instead we must compute $\mathbf{x}_N$ for some large value of $N$, and hope that $\mathbf{x}_N$ is an accurate approximation to $\mathbf{x}$. The number of iterations required to satisfy a tolerance $\epsilon$ can be obtained by the approximate relationship [96]

$$\rho(\mathbf{B})^N = \epsilon,$$

which is not used in practice, since $\rho(\mathbf{B})$ is usually not known. Instead, the technique used most often in practice is to compare successive vectors. One technique frequently used is to compare some norm of the difference of successive vectors with a desired tolerance $\epsilon$. The iterations then continue until an absolute precision has been achieved:

$$\|\mathbf{x}_N - \mathbf{x}_{N-1}\| < \epsilon.$$

Alternatively, we can use a relative measure. The technique we use is to continue iterations until the maximum relative difference between elements of $\mathbf{x}_N$ and $\mathbf{x}_{N-1}$ is within the desired precision:

$$\max_i \left| \frac{\mathbf{x}_N[i] - \mathbf{x}_{N-1}[i]}{\mathbf{x}_N[i]} \right| < \epsilon.$$

Relative precision is safer to use when entries in the vector $\mathbf{x}$ differ in size by orders of magnitude. This is not uncommon, especially in computing probability vectors for Markov chains.

Another frequently used technique is that of residual testing. Since we are solving the system $\mathbf{A}\mathbf{x} = \mathbf{b}$, the idea is that $\mathbf{A}\mathbf{x}_N$ will be “close” to $\mathbf{b}$ if $\mathbf{x}_N$ is “close” to $\mathbf{x}$. Of course, if $\mathbf{A}$ is a large matrix, the cost of computing the residual can be high. Residual testing may not work well for ill-conditioned systems.
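The three stopping tests discussed above can be sketched as small Python functions. Several details are assumptions made for illustration: the max-norm is used where the text says only “some norm”, entries of the new vector that are exactly zero are skipped in the relative test, and the residual test takes a dense list-of-rows matrix for brevity.

```python
def absolute_change(x_new, x_old):
    """Max-norm of the difference of successive vectors."""
    return max(abs(a - b) for a, b in zip(x_new, x_old))


def relative_change(x_new, x_old):
    """Maximum relative difference, skipping entries of x_new that are exactly zero."""
    return max(
        (abs(a - b) / abs(a) for a, b in zip(x_new, x_old) if a != 0.0),
        default=0.0,
    )


def residual_norm(A, x, b):
    """Max-norm of b - A x for a dense matrix A given as a list of rows."""
    return max(
        abs(bi - sum(aij * xj for aij, xj in zip(row, x)))
        for row, bi in zip(A, b)
    )


# Entries that differ by orders of magnitude, as is common for probability vectors.
x_old = [0.5, 1.0e-9]
x_new = [0.5, 2.0e-9]
abs_diff = absolute_change(x_new, x_old)
rel_diff = relative_change(x_new, x_old)
```

Here the absolute change is about 1e-9 and would pass most tolerances, while the relative change is 0.5 because the small entry has doubled. This is the situation in which the relative test is the safer choice.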


2.3.1 Jacobi and Gauss-Seidel

Conceptually, we split the matrix $\mathbf{A}$ into matrices $\mathbf{L}$, $\mathbf{D}$, and $\mathbf{U}$ such that $\mathbf{A} = \mathbf{D} - \mathbf{L} - \mathbf{U}$, where $\mathbf{L}$ and $\mathbf{U}$ are strictly lower- and upper-triangular matrices, respectively, and $\mathbf{D}$ is a diagonal matrix. Thus we have the system $\mathbf{D}\mathbf{x} - \mathbf{L}\mathbf{x} - \mathbf{U}\mathbf{x} = \mathbf{b}$.

For the Jacobi technique, we use the following iteration:

$$\mathbf{D}\mathbf{x}_{n+1} - \mathbf{L}\mathbf{x}_n - \mathbf{U}\mathbf{x}_n = \mathbf{b}.$$

In this case we can compute $\mathbf{x}_{n+1}$ using

$$\mathbf{x}_{n+1} = \mathbf{D}^{-1}(\mathbf{L} + \mathbf{U})\mathbf{x}_n + \mathbf{D}^{-1}\mathbf{b}$$

where $\mathbf{D}^{-1}$ is trivial to compute since $\mathbf{D}$ is a diagonal matrix. A single Jacobi iteration can be computed either using Algorithm 2.1, which accesses elements of $\mathbf{A}$ by rows, or using Algorithm 2.2, which accesses elements of $\mathbf{A}$ by columns. In the algorithms, we store the diagonal elements of $\mathbf{A}$ separately in a vector $\mathbf{d}$. Thus, matrix $\mathbf{A}$ is represented by the two structures $\mathbf{A}'$ and $\mathbf{d}$, where $\mathbf{A}' = \mathbf{A} - \mathrm{Diag}(\mathbf{d})$. Another common practice is to store $\mathbf{d}^{-1}$, where $\mathbf{D}^{-1} = \mathrm{Diag}(\mathbf{d}^{-1})$; in that case, the divisions by $\mathbf{d}[i]$ in Algorithm 2.1 and Algorithm 2.2 are replaced with multiplications by $\mathbf{d}^{-1}[i]$.
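The two access patterns can be sketched in Python. This is an illustration under stated assumptions rather than the thesis implementation: $\mathbf{A}'$ is held as per-row or per-column lists of (index, value) pairs, the function names are hypothetical, and the 3 x 3 test system is invented so that its exact solution is (1, 1, 1).

```python
def row_jacobi(x_old, a_rows, d, b):
    """One Jacobi iteration with row access; a_rows[r] lists (column, value) of A'."""
    return [
        (b[r] - sum(v * x_old[c] for c, v in a_rows[r])) / d[r]
        for r in range(len(b))
    ]


def col_jacobi(x_old, a_cols, d, b):
    """One Jacobi iteration with column access; a_cols[c] lists (row, value) of A'."""
    x_new = list(b)
    for c, column in enumerate(a_cols):
        # Vector update: x_new <- x_new - x_old[c] * A'[., c]
        for r, v in column:
            x_new[r] -= x_old[c] * v
    return [x_new[c] / d[c] for c in range(len(d))]


# A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]], split into d and A' = A - Diag(d).
d = [4.0, 4.0, 4.0]
a_rows = [[(1, -1.0)], [(0, -1.0), (2, -1.0)], [(1, -1.0)]]
a_cols = [[(1, -1.0)], [(0, -1.0), (2, -1.0)], [(1, -1.0)]]  # A' is symmetric here
b = [3.0, 2.0, 3.0]

x = [0.0, 0.0, 0.0]
for _ in range(50):
    x = row_jacobi(x, a_rows, d, b)
```

Both functions compute the same vector from the same input. The row version needs the whole of `x_old` for every row, while the column version scatters each `x_old[c]` into `x_new`, which is why it initializes `x_new` with `b` and divides by `d` only at the end.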

Jacobi does not use the newest approximation of $\mathbf{x}$ during the computation. That is, once we have determined $\mathbf{x}_{n+1}[i]$, we do not use it until we compute $\mathbf{x}_{n+2}$. Thus Jacobi is insensitive to the ordering of the rows and columns of $\mathbf{A}$. However, it makes sense to use $\mathbf{x}_{n+1}[i]$ if it is known when computing the remaining entries of $\mathbf{x}_{n+1}$. This is the idea

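RowJacobi($\mathbf{x}_{old}$, $\mathbf{x}_{new}$, $\mathbf{A}'$, $\mathbf{d}$, $\mathbf{b}$)

• Inputs: vector $\mathbf{x}_{old}$, the current probability vector $\mathbf{x}_n$; matrix $\mathbf{A}' = \mathbf{A} - \mathbf{D}$; vector $\mathbf{d}$, where $\mathbf{D} = \mathrm{Diag}(\mathbf{d})$; and vector $\mathbf{b}$.

• Output: vector $\mathbf{x}_{new}$, the next probability vector $\mathbf{x}_{n+1}$.

1: for each row $r$ do
2:     $\mathbf{x}_{new}[r] \leftarrow (\mathbf{b}[r] - \mathbf{A}'[r, \cdot] \cdot \mathbf{x}_{old}) \,/\, \mathbf{d}[r]$     • Dot product of $\mathbf{A}'[r, \cdot]$ and $\mathbf{x}_{old}$
3: end for

Algorithm 2.1: Computing a Jacobi iteration by rows

ColJacobi($\mathbf{x}_{old}$, $\mathbf{x}_{new}$, $\mathbf{A}'$, $\mathbf{d}$, $\mathbf{b}$)

• Inputs: vector $\mathbf{x}_{old}$, the current probability vector $\mathbf{x}_n$; matrix $\mathbf{A}' = \mathbf{A} - \mathbf{D}$; vector $\mathbf{d}$, where $\mathbf{D} = \mathrm{Diag}(\mathbf{d})$; and vector $\mathbf{b}$.

• Output: vector $\mathbf{x}_{new}$, the next probability vector $\mathbf{x}_{n+1}$.

1: $\mathbf{x}_{new} \leftarrow \mathbf{b}$
2: for each column $c$ do
3:     $\mathbf{x}_{new} \leftarrow \mathbf{x}_{new} - \mathbf{x}_{old}[c]\, \mathbf{A}'[\cdot, c]$     • Vector equation
4: end for
5: for each column $c$ do
6:     $\mathbf{x}_{new}[c] \leftarrow \mathbf{x}_{new}[c] \,/\, \mathbf{d}[c]$
7: end for

Algorithm 2.2: Computing a Jacobi iteration by columns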

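behind the Gauss-Seidel iteration. Formally, we have

$$\mathbf{D}\mathbf{x}_{n+1} - \mathbf{L}\mathbf{x}_{n+1} - \mathbf{U}\mathbf{x}_n = \mathbf{b}$$

which can also be written

$$\mathbf{x}_{n+1} = (\mathbf{D} - \mathbf{L})^{-1}\mathbf{U}\mathbf{x}_n + (\mathbf{D} - \mathbf{L})^{-1}\mathbf{b}$$

although in practice we do not compute the inverse of $\mathbf{D} - \mathbf{L}$. Since $\mathbf{x}_{n+1}[i+1]$ must be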


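RowGaussSeidel($\mathbf{x}$, $\mathbf{A}'$, $\mathbf{d}$, $\mathbf{b}$)

• Inputs: vector $\mathbf{x}$, the current probability vector $\mathbf{x}_n$; matrix $\mathbf{A}' = \mathbf{A} - \mathbf{D}$; vector $\mathbf{d}$, where $\mathbf{D} = \mathrm{Diag}(\mathbf{d})$; and vector $\mathbf{b}$.

• Output: vector $\mathbf{x}$ (overwritten), the next probability vector $\mathbf{x}_{n+1}$.

1: for each row $r$ do
2:     $\mathbf{x}[r] \leftarrow (\mathbf{b}[r] - \mathbf{A}'[r, \cdot] \cdot \mathbf{x}) \,/\, \mathbf{d}[r]$     • Dot product of $\mathbf{A}'[r, \cdot]$ and $\mathbf{x}$
3: end for

Algorithm 2.3: Computing a Gauss-Seidel iteration by rows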

computed after we have computed $\mathbf{x}_{n+1}[i]$, Gauss-Seidel is usually implemented using row access of $\mathbf{A}$, as in Algorithm 2.3. One benefit of row Gauss-Seidel is that we only need to store a single vector $\mathbf{x}$, in which $\mathbf{x}_{n+1}[i]$ overwrites $\mathbf{x}_n[i]$. This is possible because $\mathbf{x}_n[i]$ is no longer used once $\mathbf{x}_{n+1}[i]$ has been computed. An algorithm for Gauss-Seidel that uses column access of $\mathbf{A}$ was recently developed in [39]; this algorithm requires an auxiliary vector $\mathbf{w}$ in addition to the single vector $\mathbf{x}$.
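Row-wise Gauss-Seidel with a single overwritten vector can be sketched as below. The storage of $\mathbf{A}'$ as per-row lists of (column, value) pairs, the helper `iterations_needed`, and the small test system are assumptions made for illustration; a Jacobi step is included only to compare iteration counts on that one system.

```python
def row_gauss_seidel(x, a_rows, d, b):
    """One Gauss-Seidel iteration by rows, overwriting x in place."""
    for r in range(len(b)):
        # Entries already updated in this sweep hold new values; the rest hold old ones.
        x[r] = (b[r] - sum(v * x[c] for c, v in a_rows[r])) / d[r]
    return x


def iterations_needed(step, n, tol=1e-10, limit=1000):
    """Apply `step` from the zero vector until the largest change drops below tol."""
    x = [0.0] * n
    for k in range(1, limit + 1):
        x_new = step(list(x))            # pass a copy so in-place steps are safe
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return k
        x = x_new
    return limit


# A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]] with exact solution (1, 1, 1).
d = [4.0, 4.0, 4.0]
a_rows = [[(1, -1.0)], [(0, -1.0), (2, -1.0)], [(1, -1.0)]]
b = [3.0, 2.0, 3.0]


def jacobi_step(x):
    """One Jacobi iteration on the same system, for comparison only."""
    return [(b[r] - sum(v * x[c] for c, v in a_rows[r])) / d[r] for r in range(3)]


gs_iters = iterations_needed(lambda x: row_gauss_seidel(x, a_rows, d, b), 3)
jacobi_iters = iterations_needed(jacobi_step, 3)
```

On this system Gauss-Seidel reaches the tolerance in fewer iterations than Jacobi, and it does so without a second vector for the new iterate.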

Both the Jacobi and Gauss-Seidel iterations fall under the type of Equation 2.1: for Jacobi we have $\mathbf{B}_{Jac} = \mathbf{D}^{-1}(\mathbf{L} + \mathbf{U})$, and for Gauss-Seidel we have $\mathbf{B}_{GS} = (\mathbf{D} - \mathbf{L})^{-1}\mathbf{U}$. The Stein-Rosenberg theorem [103] states that for non-negative Jacobi matrices $\mathbf{B}_{Jac}$, exactly one of the following statements holds.

1. $\rho(\mathbf{B}_{Jac}) = \rho(\mathbf{B}_{GS}) = 0$.

2. $0 < \rho(\mathbf{B}_{GS}) < \rho(\mathbf{B}_{Jac}) < 1$.

3. $\rho(\mathbf{B}_{Jac}) = \rho(\mathbf{B}_{GS}) = 1$.

4. $1 < \rho(\mathbf{B}_{Jac}) < \rho(\mathbf{B}_{GS})$.

Thus if the Jacobi matrix $\mathbf{D}^{-1}(\mathbf{L} + \mathbf{U})$ is non-negative, then Jacobi and Gauss-Seidel will either both converge or both diverge. Furthermore, if both techniques converge, then Gauss-Seidel has a faster asymptotic rate of convergence (meaning that Gauss-Seidel is expected to converge faster). Since updated values are used immediately with Gauss-Seidel, the variable ordering may affect the rate of convergence [96]. In contrast, the convergence rate of Jacobi is independent of the variable ordering.
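A small worked case can make the theorem concrete. The 2 x 2 system below is an illustration chosen here, not an example from the text; the symbols $a_1, a_2 > 0$ and $u, l \ge 0$ are assumptions introduced for it.

```latex
\[
\mathbf{A} = \begin{pmatrix} a_1 & -u \\ -l & a_2 \end{pmatrix},
\qquad
\mathbf{D} = \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix},\quad
\mathbf{L} = \begin{pmatrix} 0 & 0 \\ l & 0 \end{pmatrix},\quad
\mathbf{U} = \begin{pmatrix} 0 & u \\ 0 & 0 \end{pmatrix}
\]
\[
\mathbf{B}_{Jac} = \mathbf{D}^{-1}(\mathbf{L} + \mathbf{U})
  = \begin{pmatrix} 0 & u/a_1 \\ l/a_2 & 0 \end{pmatrix},
\qquad
\rho(\mathbf{B}_{Jac}) = \sqrt{\frac{u\,l}{a_1 a_2}}
\]
\[
\mathbf{B}_{GS} = (\mathbf{D} - \mathbf{L})^{-1}\mathbf{U}
  = \begin{pmatrix} 0 & u/a_1 \\ 0 & u\,l/(a_1 a_2) \end{pmatrix},
\qquad
\rho(\mathbf{B}_{GS}) = \frac{u\,l}{a_1 a_2} = \rho(\mathbf{B}_{Jac})^2
\]
```

Because $\rho(\mathbf{B}_{GS})$ is the square of $\rho(\mathbf{B}_{Jac})$ in this case, the four statements correspond to $ul = 0$, $0 < ul < a_1 a_2$, $ul = a_1 a_2$, and $ul > a_1 a_2$, respectively.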

Both the Jacobi and Gauss-Seidel techniques can make use of relaxation [96, 103]. The idea is that for each iteration, we are changing our approximation of $\mathbf{x}$ by

$$\mathbf{x}_{n+1} = \mathbf{x}_n + \delta(\mathbf{x}_n)$$

where $\delta(\mathbf{x}_n)$ is determined by our iterative technique. We can alter the speed of convergence by instead computing

$$\mathbf{x}_{n+1} = \mathbf{x}_n + \omega\,\delta(\mathbf{x}_n)$$

where $\omega$ is called the relaxation parameter. If we use $1 < \omega < 2$ ($0 < \omega < 1$), then the technique is called over-relaxation (under-relaxation). Determining an optimal value for $\omega$ is not a trivial task.
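One common way to combine relaxation with Gauss-Seidel applies the scaling element by element within a sweep, a variant usually called successive over-relaxation (SOR). The sketch below assumes that variant, which the text does not spell out; the function name, the storage of $\mathbf{A}'$ as per-row (column, value) lists, and the test system are likewise illustrative.

```python
def relaxed_gauss_seidel(x, a_rows, d, b, omega):
    """One Gauss-Seidel sweep with relaxation parameter omega, overwriting x."""
    for r in range(len(b)):
        gs_value = (b[r] - sum(v * x[c] for c, v in a_rows[r])) / d[r]
        # gs_value - x[r] is the change plain Gauss-Seidel would make to x[r];
        # omega scales that change.
        x[r] += omega * (gs_value - x[r])
    return x


# A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]] with exact solution (1, 1, 1).
d = [4.0, 4.0, 4.0]
a_rows = [[(1, -1.0)], [(0, -1.0), (2, -1.0)], [(1, -1.0)]]
b = [3.0, 2.0, 3.0]

plain = relaxed_gauss_seidel([0.0, 0.0, 0.0], a_rows, d, b, omega=1.0)

x = [0.0, 0.0, 0.0]
for _ in range(100):
    relaxed_gauss_seidel(x, a_rows, d, b, omega=1.1)
```

With `omega=1.0` the sweep reduces to plain Gauss-Seidel, which gives a simple check that the relaxation is wired in correctly.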

Another popular group of techniques to solve a linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$ are projection techniques, in which an exact solution is approximated from a sequence of approximations taken from an $m$-dimensional subspace [96]. While these techniques are quite sophisticated, they require storage of $m$ vectors in addition to the solution vector $\mathbf{x}$. As we will see in later chapters, the size of the solution vector can become quite large; thus the storage of $m$ additional vectors is often not possible due to excessive memory requirements.


Chapter 3

Markov Chains

This chapter presents some mathematical background on the concept of “randomness”, which is fundamental to much of our work. Section 3.1 discusses random variables and describes the important distribution functions for our work. Section 3.2 continues the discussion with an overview of stochastic processes (families of random variables) and Markov chains (a special type of stochastic process). For more detail on this material, we refer the reader to [31, 89]. Section 3.3 and Section 3.4 give thorough discussions on discrete-time and continuous-time Markov chains, which are critical to our work. For more on Markov chains, the reader is referred to [55], an excellent treatment of discrete-time Markov chains, and to [89, 96]. Finally, Section 3.5 gives a brief overview of phase-type random variables, which use Markov chains in their definition. This topic is covered particularly well in [75].

3.1 Random variables and important distributions

Suppose we conduct some experiment. The set of all possible outcomes of the experiment is called the sample space. Given some sample space $\mathcal{W}$, a random variable [31, 89] is a function $X : \mathcal{W} \to \mathcal{S}$. If the set $\mathcal{S}$ is countable, $X$ is a discrete random variable; otherwise $X$ is a continuous random variable. Typically, if $X$ is discrete then $\mathcal{S} \subseteq \mathbb{N}$, and if $X$ is continuous then $\mathcal{S} \subseteq \mathbb{R}$. Random variables are written in upper-case.

A discrete random variable can be completely described by its probability distribution or probability mass function [89], which specifies the probability of each possible value of $X$. An important point is that two random variables $X$ and $Y$ may have the same probability distribution, but this does not imply that $X$ and $Y$ are equal. Arguably the simplest distribution is when the random variable is not random at all: random variable $X$ is said to be a constant $x$, written $X \sim \mathrm{Const}(x)$, if

$$\Pr\{X = n\} = \begin{cases} 1 & \text{if } n = x \\ 0 & \text{otherwise} \end{cases}$$

A constant random variable is then just what its name implies: a random variable that is only allowed to take on a single value. A more interesting distribution is the Bernoulli distribution: random variable $X$ is said to be a Bernoulli random variable with success parameter $p$, written $X \sim \mathrm{Bernoulli}(p)$, if

$$\Pr\{X = n\} = \begin{cases} 1 - p & \text{if } n = 0 \\ p & \text{if } n = 1 \\ 0 & \text{otherwise} \end{cases}$$

where $0 \le p \le 1$. Thus, a Bernoulli($p$) random variable can take on values 0 and 1, except for the limiting cases Bernoulli(0) = Const(0) and Bernoulli(1) = Const(1).

Consider an infinite sequence of independent Bernoulli random variables $X_1, X_2, \ldots$ all with success parameter $p$. Let $J$ be the position of the first occurrence of the value 1. That is, $X_J = 1$ and for all $0 < i < J$, $X_i = 0$. Then $J$ is said to be a Geometric random variable with success parameter $p$, written $J \sim \mathrm{Geom}(p)$, and

$$\Pr\{J = n\} = \begin{cases} (1-p)^{n-1} p & \text{if } n > 0 \\ 0 & \text{otherwise} \end{cases}$$

Let $K$ be the number of 0's before the first 1. Note that $K$ is one less than $J$. Then $K$ is said to be a Modified Geometric random variable with success parameter $p$, written $K \sim \mathrm{ModGeom}(p)$, and

$$\Pr\{K = n\} = \begin{cases} (1-p)^{n} p & \text{if } n \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

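Let $Y$ be the sum of the first $n$ variables, $Y = \sum_{i=1}^{n} X_i$. Then $Y$ is said to be a Binomial random variable with parameters $n$ and $p$, written $Y \sim \mathrm{Binomial}(n, p)$, and

$$\Pr\{Y = i\} = \begin{cases} \binom{n}{i} p^i (1-p)^{n-i} & \text{if } 0 \le i \le n \\ 0 & \text{otherwise} \end{cases}$$

Consider the limiting case of a Binomial random variable where $n \to \infty$ and $p \to 0$ such that the product $np$ remains a constant $\lambda$. This is the Poisson distribution with parameter $\lambda$, written Poisson($\lambda$). If $Z \sim \mathrm{Poisson}(\lambda)$, then we have

$$\begin{aligned}
\Pr\{Z = i\} &= \lim_{n \to \infty,\, p \to 0,\, np = \lambda} \binom{n}{i} p^i (1-p)^{n-i} \\
&= \lim_{n \to \infty,\, p \to 0,\, np = \lambda} \frac{n!}{i!\,(n-i)!}\, p^i (1-p)^{-i} (1-p)^n \\
&= \lim_{n \to \infty,\, p \to 0,\, np = \lambda} \frac{1}{i!}\, np \cdot (n-1)p \cdots (n-i+1)p \cdot (1-p)^{-i} \left(1 - \frac{\lambda}{n}\right)^n \\
&= \frac{1}{i!}\, \lambda^i \cdot 1 \cdot e^{-\lambda} = \frac{\lambda^i}{i!}\, e^{-\lambda}
\end{aligned}$$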
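The limit can be checked numerically. The sketch below compares the Binomial($n$, $\lambda/n$) and Poisson($\lambda$) probabilities for growing $n$; the value $\lambda = 2$, the range of $i$, and the function names are arbitrary choices made for illustration.

```python
import math


def binomial_pmf(i, n, p):
    """Pr{Y = i} for Y ~ Binomial(n, p), with 0 <= i <= n."""
    return math.comb(n, i) * p**i * (1.0 - p) ** (n - i)


def poisson_pmf(i, lam):
    """Pr{Z = i} for Z ~ Poisson(lam)."""
    return lam**i / math.factorial(i) * math.exp(-lam)


lam = 2.0
gaps = []
for n in (10, 100, 10_000):
    p = lam / n          # keep the product n * p fixed at lam
    gaps.append(
        max(abs(binomial_pmf(i, n, p) - poisson_pmf(i, lam)) for i in range(11))
    )
```

The largest gap between the two distributions shrinks as $n$ grows with $np$ held at $\lambda$, as the derivation predicts.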

With continuous random variables, the distribution cannot be specified by the probabilities for each possible value of $X$. Since $X$ can take on an uncountably infinite number of values, the probability of $X$ taking on a single value is always zero. Instead, a continuous random variable is described by either its probability density function or by its cumulative distribution function (CDF) [89], which specifies $\Pr\{X \le x\}$ for every value of $x \in \mathbb{R}$. A continuous random variable $X$ is said to be an Exponential random variable with rate $\lambda$, written $X \sim \mathrm{Expo}(\lambda)$, if

$$\Pr\{X \le x\} = \begin{cases} 1 - e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

In our work we are primarily interested in the Exponential distribution. The Exponential distribution is often used to build other distributions. For instance, a continuous random variable $Y$ is said to be an Erlang random variable with $n$ stages and parameter $\mu$, written $Y \sim \mathrm{Erlang}(n, \mu)$, if

$$Y = X_1 + \cdots + X_n,$$

where $X_1, \ldots, X_n$ are independent and identically distributed random variables with distribution Expo($\mu$).
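The Erlang construction translates directly into a sampler. The sketch below assumes Python's standard `random` module; the function name, the seed, and the parameter values are arbitrary. Since each Expo($\mu$) stage has mean $1/\mu$, the sample mean should be close to $n/\mu$.

```python
import random


def sample_erlang(n, mu, rng):
    """Draw one Erlang(n, mu) sample as a sum of n independent Expo(mu) samples."""
    return sum(rng.expovariate(mu) for _ in range(n))


rng = random.Random(12345)       # fixed seed so the run is repeatable
samples = [sample_erlang(3, 2.0, rng) for _ in range(20_000)]
sample_mean = sum(samples) / len(samples)
```

With three stages of rate 2, the expected value is 1.5, and the sample mean of 20,000 draws should land close to it.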

We are especially interested in distributions that satisfy the memoryless property

$$\Pr\{X > s + t \mid X > t\} = \Pr\{X > s\} \tag{3.1}$$

for any non-negative values of $s$ and $t$. If $X \sim \mathrm{Geom}(p)$, then we have

$$\begin{aligned} \Pr\{X > n\} &= \Pr\{\text{The first } n \text{ Bernoulli random variables are zero}\} \\ &= (1-p)^n, \end{aligned}$$

which gives us

$$\begin{aligned} \Pr\{X > s + t \mid X > t\} &= \frac{\Pr\{X > s + t \wedge X > t\}}{\Pr\{X > t\}} \\ &= \frac{\Pr\{X > s + t\}}{\Pr\{X > t\}} \\ &= \frac{(1-p)^{s+t}}{(1-p)^{t}} \\ &= (1-p)^{s} \\ &= \Pr\{X > s\}, \end{aligned}$$

thus satisfying Equation 3.1. If $X \sim \mathrm{Expo}(\lambda)$, then we have

$$\begin{aligned} \Pr\{X > s + t \mid X > t\} &= \frac{\Pr\{X > s + t\}}{\Pr\{X > t\}} \\ &= \frac{e^{-\lambda(s+t)}}{e^{-\lambda t}} \\ &= e^{-\lambda s} \\ &= \Pr\{X > s\} \end{aligned}$$

and Equation 3.1 is satisfied. Thus the Geometric and Exponential distributions are memoryless, and it can be shown that they are the only memoryless distributions.
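Equation 3.1 can be checked numerically from the tail probabilities used in the derivations. In this sketch the function names and the parameter values are arbitrary illustrations.

```python
import math


def geom_tail(n, p):
    """Pr{X > n} for X ~ Geom(p): the first n Bernoulli trials are all zero."""
    return (1.0 - p) ** n


def expo_tail(x, lam):
    """Pr{X > x} for X ~ Expo(lam)."""
    return math.exp(-lam * x)


def conditional_tail(tail, s, t):
    """Pr{X > s + t | X > t}, computed as a ratio of tail probabilities."""
    return tail(s + t) / tail(t)


geom_lhs = conditional_tail(lambda n: geom_tail(n, 0.3), 4, 7)
geom_rhs = geom_tail(4, 0.3)
expo_lhs = conditional_tail(lambda x: expo_tail(x, 1.5), 0.8, 2.0)
expo_rhs = expo_tail(0.8, 1.5)
```

In both cases the conditional tail agrees with the unconditional one up to rounding, whatever value of $t$ is chosen.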

The Exponential distribution is closely related to both the Geometric and Poisson distributions. Suppose we have a sequence of independent random variables $X_1, X_2, \ldots$ that are all exponentially distributed with rate $\lambda$. Let $N$ be a Geometric random variable with success parameter $p$. Then the random variable $Y$ given by

$$Y = \sum_{i=1}^{N} X_i$$

is an Exponential random variable with rate $\lambda p$. If $J$ is the integer such that

$$\sum_{i=1}^{J} X_i \le 1 < \sum_{i=1}^{J+1} X_i \tag{3.2}$$

then $J$ is a Poisson random variable with parameter $\lambda$.
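Both relationships can be explored by simulation. The sketch below uses Python's standard `random` module and checks only sample means, which is a weak test of the full distributions; the function names, seed, and parameter values are arbitrary assumptions.

```python
import random


def geometric_sum(lam, p, rng):
    """Sum X_1 + ... + X_N with X_i ~ Expo(lam) and N ~ Geom(p)."""
    total = rng.expovariate(lam)         # N is at least 1
    while rng.random() >= p:             # continue with probability 1 - p
        total += rng.expovariate(lam)
    return total


def arrivals_before_one(lam, rng):
    """Largest J with X_1 + ... + X_J <= 1, for X_i ~ Expo(lam)."""
    count, elapsed = 0, rng.expovariate(lam)
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(lam)
    return count


rng = random.Random(2000)                # fixed seed so the run is repeatable
trials = 20_000
mean_sum = sum(geometric_sum(2.0, 0.25, rng) for _ in range(trials)) / trials
mean_count = sum(arrivals_before_one(2.0, rng) for _ in range(trials)) / trials
```

With $\lambda = 2$ and $p = 0.25$, an Exponential with rate $\lambda p = 0.5$ has mean 2, and a Poisson with parameter $\lambda$ also has mean 2, so both sample means should be near 2.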

3.2 Stochastic processes and Markov chains

A stochastic process [31, 89] is a collection of random variables $\{X(t) : t \in T\}$. The set $\mathcal{S}$ of possible values for $X(t)$ is called the state space. The parameter $t$ is often considered to be time. If $T$ is countable, the stochastic process is a discrete-time process; otherwise it is a continuous-time process. We say the process is in state $s \in \mathcal{S}$ at time $t \in T$ if $X(t) = s$.

Markov processes are special cases of stochastic processes that obey the memoryless or Markovian property: only the current state of the process determines the probability or rate of switching to another state [96]. This is expressed formally as

$$\forall n \in \mathbb{N},\ \forall i_0, \ldots, i_{n+1} \in \mathcal{S},\ \forall t_0, \ldots, t_{n+1} \in T,\ t_0 < \cdots < t_{n+1},$$

$$\Pr\{X(t_{n+1}) = i_{n+1} \mid X(t_n) = i_n \wedge \cdots \wedge X(t_0) = i_0\} = \Pr\{X(t_{n+1}) = i_{n+1} \mid X(t_n) = i_n\}.$$

A Markov process with a discrete state space $\mathcal{S}$ is called a Markov chain. A Markov chain whose transition probabilities do not depend on time is called homogeneous. We will limit our discussion to homogeneous Markov chains with finite state spaces.

The states of a Markov chain can be classified based on their ability to reach other states [55]. In a Markov chain, we say state $j$ is reachable from state $i$ if

$$\Pr\{X(t) = j \mid X(0) = i\} > 0$$


for some time $t > 0$. A state $i$ is called transient if there exists a state $j$ such that $j$ is reachable from $i$ but $i$ is not reachable from $j$. Conversely, a state $i$ is called recurrent if for every state $j$ reachable from $i$, $i$ is reachable from $j$. A recurrent state is called absorbing if no other state is reachable from it. Two states $i$ and $j$ are said to be mutually reachable if $i$ is reachable from $j$ and $j$ is reachable from $i$. The equivalence relation “mutually reachable” creates equivalence classes over the set of states, where all states in a given class are either transient or recurrent. A set of states $\mathcal{X} \subseteq \mathcal{S}$ is called a recurrent class if all pairs of states in $\mathcal{X}$ are mutually reachable and no state outside of $\mathcal{X}$ is reachable from a state in $\mathcal{X}$. A

Markov chain is called irreducible if <
