Comparison with Related Work

As we pointed out in Chapter 2, the basic idea behind our hybrid technique for probabilistic model checking is similar to that of Kronecker-based approaches for CTMC analysis. They both use a compact, structured representation of the model and explicit storage for vectors to perform numerical computation using iterative methods. Of particular interest is the implementation of Ciardo and Miner [CM99] which uses decision diagram data structures. Having now presented our technique in detail, we give a more in-depth comparison of the two approaches.

To recap, the idea of Kronecker-based techniques is that the transition matrix of a CTMC is defined as a Kronecker (tensor) algebraic expression of smaller matrices, corresponding to components of the overall model. It is only necessary to store these small matrices and the structure of the Kronecker expression; iterative solution methods can be applied directly to this representation. Extracting or accessing a single element of the matrix requires several multiplications and a summation. Hence, as with our approach, ingenious techniques must be developed to minimise the resulting time overhead during numerical solution. Typically, this overhead is outweighed by the substantial savings in memory and the corresponding increase in the size of solvable model.
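The cost of accessing one element can be illustrated with a minimal sketch. It assumes square component matrices stored as nested lists and a matrix given as a sum of Kronecker products; the function names `kron_entry` and `kron_sum_entry` are hypothetical and do not come from any of the cited implementations.

```python
def kron_entry(factors, row, col):
    """Entry (row, col) of factors[0] (x) factors[1] (x) ... without
    building the full product. Assumes every factor is square."""
    value = 1.0
    # Split the global indices into one local index per component,
    # treating the last factor as the least significant digit.
    for m in reversed(factors):
        n = len(m)
        row, r = divmod(row, n)
        col, c = divmod(col, n)
        value *= m[r][c]          # one floating point multiplication per level
    return value


def kron_sum_entry(terms, row, col):
    """Entry of a sum of Kronecker products: several products, then a sum."""
    return sum(kron_entry(factors, row, col) for factors in terms)


# Two small components and the identity (illustrative values only).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 5.0], [6.0, 7.0]]
I = [[1.0, 0.0], [0.0, 1.0]]

single = kron_entry([A, B], 3, 2)                   # A[1][1] * B[1][0]
summed = kron_sum_entry([[A, I], [I, B]], 2, 0)     # A[1][0]*I[0][0] + I[1][0]*B[0][0]
```

Only the small matrices are stored, at the price of one multiplication per component for every element accessed.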

It turns out that the two approaches share a number of issues. A good example is the need to differentiate between reachable and unreachable states. Early Kronecker approaches worked over the entire set of possible states. This increases the size of the vectors which need to be stored and means that additional work is required to detect entries in the transition matrix corresponding to unreachable states. Improvements to the Kronecker approach, such as Kemper’s use of binary search over ordered sets [Kem96] and Ciardo and Miner’s use of multi-level data structures [CM97, CM99] to store the state space, relieved these problems to a certain extent. Our approach is comparable to that of [CM99] in that both use decision diagrams to compute and store the set of reachable states, and use offsets to determine state indices. The difference is that [CM99] uses multi-valued decision diagrams (MDDs) and we use binary decision diagrams (BDDs).
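The indexing problem can be made concrete with a small sketch of the binary search idea attributed above to [Kem96]. It assumes the reachable states are kept as a sorted list of integer encodings; the name `reachable_index` and the example state space are hypothetical, and the offset-based schemes of [CM99] and of this thesis are not shown here.

```python
from bisect import bisect_left


def reachable_index(reachable, state):
    """Map a state from the full (potential) state space to its position
    in the sorted list of reachable states, or None if it is unreachable."""
    i = bisect_left(reachable, state)
    if i < len(reachable) and reachable[i] == state:
        return i
    return None


# 16 potential states, of which only 5 are reachable (illustrative values).
reachable = [0, 3, 4, 9, 15]
solution = [0.0] * len(reachable)      # vector sized by reachable states only

idx = reachable_index(reachable, 9)    # position of state 9 in the vector
missing = reachable_index(reachable, 5)  # state 5 is unreachable
```

Vectors then need one entry per reachable state rather than one per potential state, at the cost of a logarithmic lookup for each index.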

Another common issue is the selection of an iterative method for numerical solution. Early Kronecker approaches used the Power or Jacobi methods. More recent work such as [BCDK97, CM99] has given algorithms for the Gauss-Seidel method, a more attractive option since it usually requires considerably fewer iterations to converge and needs less memory. The drawback is that, to implement the method, each row or column of the matrix must be extracted individually, a process not ideally suited to techniques relying on structured storage of the matrix.
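The difference between the two methods can be seen in a minimal sketch for a linear system Ax = b. It assumes a dense matrix stored as nested lists with non-zero diagonal entries; the function names and the example system are illustrative only. Jacobi reads from one vector and writes to a second, whereas Gauss-Seidel overwrites a single vector row by row, which is why it needs access to individual rows.

```python
def jacobi_sweep(A, b, x, out):
    """One Jacobi iteration: reads only old values in x, writes to out."""
    diff = 0.0
    for i in range(len(b)):
        s = sum(A[i][j] * x[j] for j in range(len(b)) if j != i)
        out[i] = (b[i] - s) / A[i][i]
        diff = max(diff, abs(out[i] - x[i]))
    return diff


def gauss_seidel_sweep(A, b, x):
    """One Gauss-Seidel iteration: updates x in place, so later rows
    already see the new values of earlier entries."""
    diff = 0.0
    for i in range(len(b)):
        s = sum(A[i][j] * x[j] for j in range(len(b)) if j != i)
        new = (b[i] - s) / A[i][i]
        diff = max(diff, abs(new - x[i]))
        x[i] = new
    return diff


def solve(A, b, method, tol=1e-10, max_iter=10_000):
    """Iterate until the largest change in any entry falls below tol."""
    x = [0.0] * len(b)
    spare = [0.0] * len(b) if method == "jacobi" else None  # extra vector
    for k in range(1, max_iter + 1):
        if method == "jacobi":
            diff = jacobi_sweep(A, b, x, spare)
            x, spare = spare, x
        else:
            diff = gauss_seidel_sweep(A, b, x)
        if diff < tol:
            return x, k
    raise RuntimeError("no convergence within max_iter iterations")


# Diagonally dominant example whose exact solution is (1, 1, 1).
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
x_jacobi, iters_jacobi = solve(A, b, "jacobi")
x_gs, iters_gs = solve(A, b, "gauss-seidel")
```

On this example Gauss-Seidel converges in fewer iterations and stores one vector fewer, matching the general behaviour described above.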

An alternative, and the one which is most directly related to our approach, is the ‘interleaving’ idea of [BCDK97]. Here, all matrix entries are accessed in a single pass of the data structure, comparable with the depth-first traversal of MTBDDs we adopt. The advantage is that many of the multiplication operations performed to compute the matrix entries can be reused when carried out in this order. The sacrifice made to obtain this saving is that the technique is restricted to the Power and Jacobi methods.

The Kronecker implementation of Ciardo and Miner [CM99, Min00] has further similarities with our work. They have developed a data structure called a matrix diagram which stores the Kronecker representation for a CTMC in a tree-like data structure. Like an MTBDD, the tree is kept in reduced form to minimise storage requirements. Furthermore, Ciardo and Miner use a combination of MDDs for reachability and state space storage, and matrix diagrams for matrix storage and numerical solution. This is analogous to our use of BDDs and MTBDDs, respectively.

Despite the numerous similarities, there remain fundamental differences between our hybrid approach and the various Kronecker-based techniques. Firstly, the amount of work required to extract a single matrix entry differs significantly. For an offset-labelled MTBDD, this constitutes tracing a path from its root node to a terminal, reading and following one pointer at each level. The Kronecker representation is also a multi-level system, but extracting an entry is slower, requiring a floating point multiplication to be performed at each level. In some cases, several such products must be computed and then summed to determine the final value.
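The traversal of an offset-labelled structure can be sketched as follows. This is a heavily simplified illustration, not the data structure of this thesis: it builds an unshared binary tree rather than a reduced MTBDD, interleaves row and column bits with the most significant bit first, and computes offsets by counting reachable states directly. The names `build`, `multiply` and `count` are hypothetical. Each internal node is a tuple `(offset, else_child, then_child)`; following a then-edge adds the node's offset to the running row or column index, so reaching an entry needs only pointer following and integer addition.

```python
def count(reachable, prefix, bits_left):
    """Number of reachable states whose leading bits equal prefix."""
    low = prefix << bits_left
    return sum(1 for s in reachable if low <= s < low + (1 << bits_left))


def build(entries, reachable, nbits, level=0, rp=0, cp=0):
    """Build the tree for a matrix given as {(row_state, col_state): float}.
    rp and cp are the row and column bit prefixes fixed so far."""
    if level == 2 * nbits:
        return entries.get((rp, cp))            # float terminal or None
    bits_left = nbits - level // 2 - 1
    if level % 2 == 0:                          # row variable
        offset = count(reachable, rp << 1, bits_left)
        lo = build(entries, reachable, nbits, level + 1, rp << 1, cp)
        hi = build(entries, reachable, nbits, level + 1, (rp << 1) | 1, cp)
    else:                                       # column variable
        offset = count(reachable, cp << 1, bits_left)
        lo = build(entries, reachable, nbits, level + 1, rp, cp << 1)
        hi = build(entries, reachable, nbits, level + 1, rp, (cp << 1) | 1)
    if lo is None and hi is None:
        return None                             # all-zero submatrix
    return (offset, lo, hi)


def multiply(node, x, y, level=0, row=0, col=0):
    """Depth-first traversal adding (matrix * x) into y.
    row and col are indices into the reachable-state vectors."""
    if node is None:
        return
    if isinstance(node, float):
        y[row] += node * x[col]
        return
    offset, lo, hi = node
    if level % 2 == 0:
        multiply(lo, x, y, level + 1, row, col)
        multiply(hi, x, y, level + 1, row + offset, col)
    else:
        multiply(lo, x, y, level + 1, row, col)
        multiply(hi, x, y, level + 1, row, col + offset)


# States 0, 1 and 3 of a 2-bit encoding are reachable (state 2 is not).
reachable = [0, 1, 3]
entries = {(0, 1): 0.5, (0, 3): 0.5, (1, 0): 1.0, (3, 3): 1.0}
tree = build(entries, reachable, nbits=2)
x = [1.0, 2.0, 3.0]
y = [0.0, 0.0, 0.0]
multiply(tree, x, y)
```

The only floating point operation per entry is the final multiplication by the vector element; in the Kronecker case an additional multiplication is needed at every level.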

Secondly, our MTBDDs are based on binary decisions whereas data structures for storing Kronecker representations, such as matrix diagrams, are based on multi-valued decisions. From an alternative viewpoint, the former encodes state spaces using Boolean variables and the latter does so using finite, integer-valued variables. The relative merits of each are hard to judge. Multi-valued variables might be seen as a more intuitive way to structure a given model’s state space. On the other hand, we have found the flexibility of Boolean variables useful for developing efficient storage schemes for MDPs. Kronecker methods have only been used to store models which can be represented as a square, two-dimensional matrix, i.e. CTMCs and DTMCs.

With all of the above factors in mind, we conclude this section by presenting a comparison of the space and time efficiency of the two approaches. We begin by considering the amount of memory required for matrix storage. It seems that the Kronecker-based representations are more compact than MTBDDs in this respect. As an example, we use the Kanban system case study of Ciardo and Tilgner [CT96], used both in this thesis and in numerous other related sources in the literature. According to the statistics in [CM99], for example, we find that our offset-labelled MTBDD representation requires several times more memory than matrix diagrams for this model. Fortunately, this comparison is largely irrelevant since the space requirements of both approaches are dominated almost entirely by storage for vectors, not matrices.

From a time perspective, an exact comparison is more problematic. Even given the facility to run contrasting implementations on the same hardware, it is hard to make a fair evaluation without a detailed knowledge of their workings. Generally, matrix diagrams and sophisticated Kronecker implementations both claim, like us, to be comparable to sparse matrices in terms of speed. We are not aware, though, of an instance where Kronecker-based methods actually outperformed explicit techniques, as our hybrid approach did on the polling system case study. Given that an iteration of numerical solution essentially reduces to extracting matrix entries from the structured matrix representation, our observations above would suggest that offset-labelled MTBDDs should be faster than Kronecker-based data structures in this respect.

In fairness, though, Kronecker-based implementations, such as matrix diagrams, are often tailored towards performing the Gauss-Seidel method. While this usually entails more work per iteration, the total number of iterations can often be significantly reduced. The number of vectors which must be stored is also reduced by one.

We, on the other hand, have opted to focus on developing fast implementations of iterative methods which can be implemented using matrix-vector multiplication, such as the Jacobi method. We have taken steps to address some of the limitations of Jacobi by investigating Pseudo Gauss-Seidel, which exhibits many of the advantages of conventional Gauss-Seidel, only to a lesser extent.

More importantly, in this thesis we have concentrated on a wider range of analysis methods for probabilistic models. While steady-state probability computation requires the solution of a linear equation system, amenable to Gauss-Seidel, many other problems reduce to alternative iterative methods. Such problems include computing transient probabilities for CTMCs, model checking CSL time-bounded until properties for CTMCs and model checking PCTL properties for MDPs. In these cases, our approach is at a distinct advantage.
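To illustrate why such problems favour matrix-vector multiplication, the following sketch computes transient probabilities of a CTMC by uniformisation, a standard technique assumed here for illustration. The function name `transient`, the dense rate matrix with zero diagonal, and the simple truncation test are all assumptions of this sketch; a robust implementation would compute the Poisson weights more carefully to avoid underflow for large values of q·t.

```python
from math import exp


def transient(rates, init, t, eps=1e-12, max_terms=100_000):
    """Transient distribution at time t of a CTMC with off-diagonal
    rates[i][j], starting from distribution init (uniformisation)."""
    n = len(init)
    exit_rates = [sum(rates[i][j] for j in range(n) if j != i) for i in range(n)]
    q = 1.02 * max(exit_rates) or 1.0          # uniformisation rate

    def step(v):
        # One vector-matrix multiplication with P = I + Q/q.
        out = [v[i] * (1.0 - exit_rates[i] / q) for i in range(n)]
        for i in range(n):
            for j in range(n):
                if i != j and rates[i][j] != 0.0:
                    out[j] += v[i] * rates[i][j] / q
        return out

    weight = exp(-q * t)                       # Poisson probability of 0 jumps
    total = weight
    result = [weight * p for p in init]
    v = list(init)
    for k in range(1, max_terms + 1):
        if 1.0 - total <= eps:                 # remaining Poisson mass is negligible
            break
        v = step(v)
        weight *= q * t / k                    # Poisson probability of k jumps
        total += weight
        result = [r + weight * p for r, p in zip(result, v)]
    return result


# Two-state CTMC: rate 2 from state 0 to 1, rate 1 from state 1 to 0.
rates = [[0.0, 2.0], [1.0, 0.0]]
dist = transient(rates, [1.0, 0.0], t=1.0)
```

Every iteration is a single multiplication of the matrix by a vector, with no need to extract individual rows or columns, which is the access pattern that a single-pass traversal of a structured representation supports.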

Conclusions

7.1 Summary and Evaluation

The aim of this work was to develop an efficient probabilistic model checker. We set out to investigate whether BDD-based, symbolic model checking techniques, so successful in the non-probabilistic setting, could be extended for this purpose. The approach we have taken is to use MTBDDs, a natural extension of BDDs.

In terms of efficiency, we are concerned with minimising both the time and space requirements of model checking. When working with BDD-based data structures, complexity analysis is generally unhelpful; despite often having exponential worst-case complexity, it is well known that on realistic examples exhibiting structure, symbolic techniques can dramatically outperform other alternatives. For this reason, we have opted to rely on empirical results to gauge the efficiency of our techniques. By applying our work to a wide range of case studies, we aim to make such comparisons as fair as possible.

One of the main motivations for the work in this thesis was the lack of existing implementations of probabilistic model checking. Consequently, there is limited scope for making comparisons of our work with other tools. Instead we have chosen to implement an alternative version of our model checker based on more conventional, explicit data structures and use this to judge the efficiency of our techniques. This allows for fair comparisons, ensuring that tests can be carried out on identical examples and solution methods, and under the same conditions. Since probabilistic model checking requires computations on large matrices with relatively few non-zero entries, the obvious candidate for an explicit representation is sparse matrices. Fortunately, it is relatively simple to produce an efficient implementation of this data structure.
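As an indication of how simple such a data structure can be, the following sketch shows a compressed sparse row layout and the corresponding matrix-vector multiplication. It is an illustration under assumed names (`to_csr`, `csr_multiply`), not the explicit implementation used for the experiments in this thesis.

```python
def to_csr(n_rows, triples):
    """Convert (row, col, value) triples into compressed sparse row arrays."""
    values, cols = [], []
    row_ptr = [0] * (n_rows + 1)
    for r, c, v in sorted(triples):            # group entries by row
        values.append(v)
        cols.append(c)
        row_ptr[r + 1] += 1                    # count entries per row
    for r in range(n_rows):
        row_ptr[r + 1] += row_ptr[r]           # prefix sums give row starts
    return values, cols, row_ptr


def csr_multiply(values, cols, row_ptr, x):
    """Return the product of the sparse matrix with vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for r in range(len(y)):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            y[r] += values[k] * x[cols[k]]
    return y


# A 3x3 matrix with four non-zero entries (illustrative values only).
triples = [(0, 1, 0.5), (0, 2, 0.5), (1, 0, 1.0), (2, 2, 1.0)]
values, cols, row_ptr = to_csr(3, triples)
y = csr_multiply(values, cols, row_ptr, [1.0, 2.0, 3.0])
```

Storage is proportional to the number of non-zero entries, which is what limits explicit approaches on very large models.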

As demonstrated in the preceding chapters, we have successfully applied MTBDDs to the process of probabilistic model checking for two temporal logics, PCTL and CSL, and for three types of model, DTMCs, CTMCs and MDPs. We found that there was a significant amount of commonality between the various cases.

In Chapter 4, we showed that, by applying heuristics for encoding and variable ordering, MTBDDs could be used to quickly construct a very compact representation of extremely large probabilistic models from their high-level description in the PRISM language. The heuristics for MDPs are the first to be presented in the literature. Furthermore, because of the close relationship between MTBDDs and BDDs, we were able to perform reachability and model checking of qualitative temporal logic properties efficiently on models with more than 10^13 states.

The real focus of the thesis, however, has been on model checking of quantitative properties, for which numerical computation is required. In Chapter 5, we demonstrated that, for some case studies, this could be performed very efficiently with MTBDDs. On a desktop workstation of relatively modest specification, we were able to analyse models with as many as 7.5 billion states. The best results proved to be for MDP models, typically of randomised, distributed algorithms. Clearly, models of this size could not be handled explicitly under the same conditions.

We also found, in concurrence with existing work on symbolic analysis of probabilistic models, that MTBDDs were often inefficient for numerical computation because they provide a poor representation of solution vectors, which exhibit no structure and contain many distinct values. In Chapter 6, we presented a novel hybrid approach to combat this, combining our symbolic, MTBDD-based approach and the explicit version. Initially, we implemented this hybrid approach for the model checking of DTMCs and CTMCs, relying on the Jacobi and JOR methods for solving linear equation systems. Thanks to memory savings from the compactness of our MTBDD representation, we were able to analyse models approximately an order of magnitude larger than with explicit techniques. Typically, we also maintained a comparable solution speed.

We then showed how this hybrid approach could be extended for model checking of MDPs. Although the results were not as impressive as for DTMCs in terms of solution speed, we still required less memory than explicit approaches. As a second extension, we modified our hybrid approach to allow numerical solution of linear equation systems using a modified version of Gauss-Seidel called Pseudo Gauss-Seidel. We succeeded both in reducing the number of iterations for convergence and in producing a significant improvement in terms of the amount of memory required for vector storage. We also managed to reduce the average iteration time, in one case actually outperforming sparse matrices.

We concluded Chapter 6 by presenting a comparison of our hybrid approach and Kronecker-based techniques for CTMC analysis, in particular the matrix diagram data structure of Ciardo and Miner. Although the origins of these areas of work are different, the two approaches have a lot in common. Generally, the performance of the two is quite similar; both can handle state spaces an order of magnitude larger than explicit approaches, while maintaining comparable solution speed. One of the main differences is that Kronecker techniques are amenable to the more efficient Gauss-Seidel method, whereas ours are not. We have gone some way towards redressing the balance with the implementation of Pseudo Gauss-Seidel. More importantly, though, our approach applies to a wider range of probabilistic models and solution methods, several of which, such as transient analysis of CTMCs and model checking of MDPs, make no use of Gauss-Seidel.

The potential value of our implementation is illustrated by the response we have received from the release of our model checker PRISM. To date, the tool has been downloaded by more than 350 people and we have received a pleasing amount of positive feedback from those who have managed to use PRISM to analyse interesting case studies. These include applications of the tool to probabilistic anonymity protocols [Shm02], and power management strategies [NPK+02]. A promising collaboration [DKN02] with the KRONOS model checker [DOTY96] has also been established.

In document Implementation of symbolic model checking for probabilistic systems (Page 164-169)