
In order to reduce the complexity of fault-tolerant software and promote software reuse, several object-oriented patterns and frameworks have been proposed. These software structures generally translate fault tolerance concepts, such as variants and decision algorithms (also called adjudicators), into abstract classes that define interfaces for the implementation of fault tolerance techniques. A common approach is to separate the fault tolerance functionality from the application software, making it reusable. Additionally, the application program becomes a user of the fault tolerance software, which reduces system complexity.

Xu, Randell, Rubira-Calsavara and Stroud [119] proposed an object-oriented structure for dealing with software fault tolerance. They suggested applying idealized components with diverse design, using classes to implement the control algorithm, the software variants and the adjudicator, as shown in the example of Figure 2.7.

[Class diagram: the Controller class holds the attributes pa: Abjudicator * and pv1, pv2, ..., pv_n: Variant *, and offers the methods recoveryBlocks(Abjudicator *, Variant **, ...) : status and nVersionProgramming(...) : status. Voter and AT are subclasses of Abjudicator; Variant1, Variant2, ..., Variant_n are subclasses of Variant.]

Figure 2.7: Xu, Randell, Rubira-Calsavara and Stroud's framework example.

Each fault tolerance technique is implemented by a method of the Controller class, using one Abjudicator and several Variant objects passed as arguments by the application program. In this architecture, adding a new fault tolerance strategy requires adding a new method to the Controller class. The framework does not define how input data are passed to the variants or how results are returned, but a general solution must be adopted; otherwise the Controller class would not be reusable. Specialized adjudicators can be defined by deriving from the Voter and AT classes.
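The structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the method bodies, the convention of passing the input as a single `data` argument and returning `None` on failure, and the example variants are all assumptions made here for illustration.

```python
from collections import Counter
from math import isqrt


class Controller:
    """Hypothetical controller: one method per fault tolerance technique."""

    def recovery_blocks(self, acceptance_test, variants, data):
        # Try the variants in order and return the first accepted result.
        for variant in variants:
            try:
                result = variant(data)
            except Exception:
                continue  # a variant that raises counts as a failed attempt
            if acceptance_test(data, result):
                return result
        return None  # no variant produced an acceptable result

    def n_version_programming(self, variants, data):
        # Run every variant and return the result that a strict majority
        # of all variants agree on, or None when there is no such result.
        results = []
        for variant in variants:
            try:
                results.append(variant(data))
            except Exception:
                pass  # a failed variant simply casts no vote
        if not results:
            return None
        value, count = Counter(results).most_common(1)[0]
        return value if count > len(variants) / 2 else None


# Illustrative variants that square an integer; one is deliberately faulty.
def square_by_product(x):
    return x * x


def square_by_power(x):
    return x ** 2


def square_faulty(x):
    return x + x


def plausible_square(x, y):
    # Acceptance test: y is non-negative and its integer square root is |x|.
    return y >= 0 and isqrt(y) == abs(x)


controller = Controller()
rb_result = controller.recovery_blocks(
    plausible_square, [square_faulty, square_by_product], 3)
nvp_result = controller.n_version_programming(
    [square_by_product, square_by_power, square_faulty], 3)
```

Note that supporting a further technique in this sketch means adding another method to `Controller`, which mirrors the limitation discussed above.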

Variant classes can achieve design diversity by using diverse algorithms and internal data structures. This is termed class-level design redundancy. However, some mechanism must be provided for maintaining state consistency among Variant objects if they maintain their state between activations. Less general solutions to variant diversity include object-level design redundancy, in which variant objects belong to the same class but are initialized with slightly different data, and operation-level design redundancy, in which variant classes have diverse implementation algorithms but no class data.
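The three forms of redundancy can be contrasted in a short Python sketch. The sorting task and every class name below are hypothetical choices made for illustration; they do not come from the cited framework.

```python
import heapq


# Class-level redundancy: different algorithms and different internal data.
class InsertionSorter:
    def __init__(self):
        self.calls = 0  # state kept between activations

    def run(self, items):
        self.calls += 1
        out = []
        for item in items:
            pos = len(out)
            while pos > 0 and out[pos - 1] > item:
                pos -= 1
            out.insert(pos, item)
        return out


class MergeSorter:
    def __init__(self):
        self.sizes_seen = []  # a different kind of retained state

    def run(self, items):
        self.sizes_seen.append(len(items))
        return self._sort(list(items))

    def _sort(self, items):
        if len(items) <= 1:
            return items
        mid = len(items) // 2
        return list(heapq.merge(self._sort(items[:mid]),
                                self._sort(items[mid:])))


# Object-level redundancy: one class, objects initialized with different data.
class ShellSorter:
    def __init__(self, gaps):
        self.gaps = gaps  # must end with 1 so the final pass sorts fully

    def run(self, items):
        out = list(items)
        for gap in self.gaps:
            for i in range(gap, len(out)):
                item, j = out[i], i
                while j >= gap and out[j - gap] > item:
                    out[j] = out[j - gap]
                    j -= gap
                out[j] = item
        return out


# Operation-level redundancy: diverse algorithms, no class data at all.
class SelectionSorter:
    @staticmethod
    def run(items):
        rest, out = list(items), []
        while rest:
            smallest = min(rest)
            rest.remove(smallest)
            out.append(smallest)
        return out


class BuiltinSorter:
    @staticmethod
    def run(items):
        return sorted(items)


data = [5, 2, 9, 1, 5]
variants = [InsertionSorter(), MergeSorter(),
            ShellSorter([4, 1]), ShellSorter([3, 1]),
            SelectionSorter(), BuiltinSorter()]
outputs = [variant.run(data) for variant in variants]
```

Only the class-level variants in this sketch retain state between activations, so only they would need the state consistency mechanism mentioned above.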

Tso, Shokri, Tai and Dziegiel [115] developed and implemented a framework of software fault tolerance components. Figure 2.8 shows the class diagram for their implementation of the Recovery Blocks technique.

[Class diagram: RBscheme, Executive, TryBlock, CheckPointMechanism and AcceptanceTest, together with the specialized classes SingleProcess, Concurrent, SRB, DRB, PTC, Conversation, Primary, Alternate, CheckPoint, RecoveryCache, AuditTrail, Timing and Reasonable.]

Figure 2.8: Tso, Shokri, Tai and Dziegiel's class diagram for the RB technique.

The RBscheme class is responsible for implementing the Recovery Blocks technique. It delegates the control algorithm to an Executive object, which is specialized by inheritance to cover several execution schemes using single and concurrent processes. Primary and alternate variants are implemented as classes derived from the TryBlock class. Acceptance test algorithms are defined by classes that inherit from the AcceptanceTest class. Checkpointing mechanisms, such as recovery caches and audit trails, are implemented by classes derived from the CheckPointMechanism class.
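The delegation described above can be sketched in Python as follows. The class names follow Figure 2.8, but all method names and signatures, the dictionary used as shared state, and the `IsSorted` test are assumptions introduced here, not details of the original framework.

```python
import copy
from abc import ABC, abstractmethod


class TryBlock(ABC):
    @abstractmethod
    def run(self, state): ...


class AcceptanceTest(ABC):
    @abstractmethod
    def passes(self, state): ...


class CheckPointMechanism(ABC):
    @abstractmethod
    def save(self, state): ...

    @abstractmethod
    def restore(self, state): ...


class RecoveryCache(CheckPointMechanism):
    def save(self, state):
        self._saved = copy.deepcopy(state)

    def restore(self, state):
        # Overwrite the live state in place with the saved snapshot.
        state.clear()
        state.update(copy.deepcopy(self._saved))


class Executive(ABC):
    @abstractmethod
    def execute(self, blocks, test, checkpoint, state): ...


class SingleProcess(Executive):
    def execute(self, blocks, test, checkpoint, state):
        checkpoint.save(state)
        for block in blocks:
            try:
                block.run(state)
                if test.passes(state):
                    return True
            except Exception:
                pass  # treat an exception like a failed acceptance test
            checkpoint.restore(state)  # roll back before the next try block
        return False


class RBscheme:
    def __init__(self, executive, blocks, test, checkpoint):
        self.executive, self.blocks = executive, blocks
        self.test, self.checkpoint = test, checkpoint

    def run(self, state):
        # The control algorithm is delegated to the Executive object.
        return self.executive.execute(
            self.blocks, self.test, self.checkpoint, state)


class Primary(TryBlock):
    def run(self, state):
        state["items"].reverse()  # deliberately faulty "sort"


class Alternate(TryBlock):
    def run(self, state):
        state["items"].sort()


class IsSorted(AcceptanceTest):  # hypothetical acceptance test
    def passes(self, state):
        items = state["items"]
        return all(a <= b for a, b in zip(items, items[1:]))


state = {"items": [3, 1, 2]}
scheme = RBscheme(SingleProcess(), [Primary(), Alternate()],
                  IsSorted(), RecoveryCache())
succeeded = scheme.run(state)
```

Replacing `SingleProcess` with another `Executive` subclass would change the execution scheme without touching the try blocks, the test or the checkpointing class.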

The main drawback of this framework, compared with the Xu et al. framework, is that it defines a different class structure for each fault tolerance scheme. For instance, voter classes are added for NVP, and data re-expression classes are added for data diversity techniques such as Retry Block and N-Copy Programming [11].

Daniels, Kim and Vouk [35] proposed the Reliable Hybrid pattern, which targets the design of fault-tolerant applications. The focus of this pattern is on the decision mechanism, which can combine acceptance tests and voters in hybrid strategies, such as Consensus Recovery Blocks [102] and Acceptance Voting [18]. Figure 2.9 presents the Reliable Hybrid pattern structure.

[Class diagram: Master (+ request()); Version (+ request()) with subclasses Version1, Version2, ..., Version_n (each + request()); Abjudicator (+ getResult()) with subclasses Voter (+ getResult(), - vote()), AT (+ getResult(), - accTest()) and Hybrid (+ getResult()); the implementation classes VoterImplem1 and VoterImplem2 (+ getResult(), - vote()) and ATImplem1 and ATImplem2 (+ getResult(), - accTest()).]

Figure 2.9: Reliable Hybrid pattern class diagram.

The Reliable Hybrid pattern has a class diagram similar to that of the Xu et al. framework. The improvement lies in the adjudicator, which includes the Hybrid class and implements the Composite pattern [47]. The Master class has a single association with one Abjudicator object, which may be a Voter, an AT or a Hybrid object. The Hybrid class holds a list of Abjudicator objects (Voters, AT objects and other Hybrid objects), and its getResult method calls each Abjudicator object in sequence until a successful result is obtained.
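The composite decision mechanism can be sketched in Python as below. This is an illustration under stated assumptions: `get_result` receives the list of version results, `None` signals that no decision was reached (so `None` cannot itself be a valid result), and the majority rule and acceptance predicates are hypothetical.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Abjudicator(ABC):
    @abstractmethod
    def get_result(self, results):
        """Return the selected result, or None if no decision is reached."""


class Voter(Abjudicator):
    def get_result(self, results):
        if not results:
            return None
        value, count = Counter(results).most_common(1)[0]
        return value if count > len(results) / 2 else None  # strict majority


class AT(Abjudicator):
    def __init__(self, accepts):
        self.accepts = accepts  # predicate applied to each result

    def get_result(self, results):
        for result in results:
            if self.accepts(result):
                return result
        return None


class Hybrid(Abjudicator):
    """Composite: holds Voters, ATs and other Hybrid objects."""

    def __init__(self, children):
        self.children = list(children)

    def get_result(self, results):
        # Ask each child in sequence until one reaches a decision.
        for child in self.children:
            result = child.get_result(results)
            if result is not None:
                return result
        return None


# Vote first, then fall back to an acceptance test, in the spirit of
# Consensus Recovery Blocks.
consensus = Hybrid([Voter(), AT(lambda r: r % 2 == 0)])
by_vote = consensus.get_result([4, 4, 7])
by_test = consensus.get_result([3, 5, 6])
```

Because `Hybrid` is itself an `Abjudicator`, hybrids can be nested, and the caller only ever sees the single `get_result` interface.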

In this pattern, the fault tolerance strategy is performed by the Master class, which calls the several Version objects and sends their results to the Abjudicator object. However, no specific mechanism is devised to change the control algorithm.

Xu and Randell improved their previous framework and published it as the Generic Software Fault Tolerance (GSFT) pattern [121]. The class diagram of this pattern is shown in Figure 2.10.

[Class diagram: ExternalInterface (+ request()); FTObject (+ request()); GenericFTinterface (+ request()); FTController with subclasses NVP, RB and Other; Variant (+ request()) with subclasses Variant1, Variant2, ..., Variant_n (each + request()); Abjudicator (+ getResult()) with subclasses Voter, AT and Combined (each + getResult()).]

Figure 2.10: The Generic Software Fault Tolerance pattern class diagram.

A fault-tolerant class (FTObject) must implement ExternalInterface to conform to the interface characteristics of an idealized component. The FTObject class passes user requests to the GenericFTInterface class, which actually executes the fault-tolerant processing, using FTController subclasses to implement the control algorithm. The adjudicator is implemented as in the Reliable Hybrid pattern, including a Combined class that behaves like the Hybrid class in that pattern. The main difference between this pattern and the original framework proposed by the authors is the inclusion of the FTController hierarchy, which implements the control algorithm by applying the Strategy pattern [47], similarly to the Tso et al. framework (Figure 2.8).
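A minimal Python sketch of this arrangement is given below, to show how the Strategy pattern lets the control algorithm be swapped. The pattern does not define how data are passed, so the `run` signature, the use of plain callables as variants and adjudicators, and the `None` failure value are all assumptions made for this example.

```python
from abc import ABC, abstractmethod
from collections import Counter


class FTController(ABC):
    """Strategy interface for the control algorithm (signature assumed)."""

    @abstractmethod
    def run(self, variants, adjudicator, request): ...


class NVP(FTController):
    def run(self, variants, adjudicator, request):
        # Execute every variant, then adjudicate over all results at once.
        return adjudicator([variant(request) for variant in variants])


class RB(FTController):
    def run(self, variants, adjudicator, request):
        # Execute variants one at a time until a result is accepted.
        for variant in variants:
            result = adjudicator([variant(request)])
            if result is not None:
                return result
        return None


class GenericFTInterface:
    def __init__(self, controller, variants, adjudicator):
        self.controller = controller
        self.variants = variants
        self.adjudicator = adjudicator

    def request(self, data):
        return self.controller.run(self.variants, self.adjudicator, data)


class FTObject:
    """Offers request() to users and forwards it to the generic interface."""

    def __init__(self, ft_interface):
        self._ft = ft_interface

    def request(self, data):
        return self._ft.request(data)


def majority(results):  # hypothetical voter
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


def non_negative(results):  # hypothetical acceptance test
    return next((r for r in results if r >= 0), None)


def manual_abs(x):
    return x if x >= 0 else -x


def faulty_abs(x):
    return x  # deliberately wrong for negative input


nvp_object = FTObject(GenericFTInterface(
    NVP(), [abs, manual_abs, faulty_abs], majority))
rb_object = FTObject(GenericFTInterface(
    RB(), [faulty_abs, abs], non_negative))
```

In this sketch the user code calls `request()` in the same way whichever controller is installed; exceptions raised by variants are left unhandled to keep the example short.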

The GSFT pattern is, to our knowledge, the most comprehensive framework for fault tolerance presented so far. However, it leaves many issues undefined. One is how data are passed between variants, adjudicators and the user application. Another is how to implement the pattern using processes or threads as units of fault tolerance. In Chapter 5 we propose a fault tolerance framework that addresses these issues.