Software Technology
Specifying and Verifying Multicast Communication Protocols
Jong Hyo Jin and Chris George
May 2007
UNU-IIST (United Nations University International Institute for Software Technology) is a Research and Training Centre of the United Nations University (UNU). It is based in Macau, and was founded in 1991. It started operations in July 1992. UNU-IIST is jointly funded by the Governor of Macau and the governments of the People’s Republic of China and Portugal through a contribution to the UNU Endowment Fund. As well as providing two-thirds of the endowment fund, the Macau authorities also supply UNU-IIST with its office premises and furniture and subsidise fellow accommodation.
The mission of UNU-IIST is to assist developing countries in the application and development of software technology.
UNU-IIST contributes through its programmatic activities:
1. Advanced development projects, in which software techniques supported by tools are applied,
2. Research projects, in which new techniques for software development are investigated,
3. Curriculum development projects, in which courses of software technology for universities in developing countries are developed,
4. University development projects, which complement the curriculum development projects by aiming to strengthen all aspects of computer science teaching in universities in developing countries,
5. Schools and Courses, which typically teach advanced software development techniques,
6. Events, in which conferences and workshops are organised or supported by UNU-IIST, and
7. Dissemination, in which UNU-IIST regularly distributes to developing countries information on international progress of software technology.
Fellows, who are young scientists and engineers from developing countries, are invited to actively participate in all these projects. By doing the projects they are trained.
At present, the technical focus of UNU-IIST is on formal methods for software development. UNU-IIST is an internationally recognised center in the area of formal methods. However, no software technique is universally applicable. We are prepared to choose complementary techniques for our projects, if necessary.
UNU-IIST produces a report series. Reports are either Research (R), Technical (T), Compendia (C) or Administrative (A). They are records of UNU-IIST activities and research and development achievements.
Many of the reports are also published in conference proceedings and journals.
Please write to UNU-IIST at P.O. Box 3058, Macau or visit UNU-IIST’s home page: http://www.iist.unu.edu, if you would like to know more about UNU-IIST and its report series.
G. M. Reed, Director
Abstract
In this work, we study the formal design and verification of protocols using the CSP (Communicating Sequential Processes) formalism.
The protocols concern multicast communication; we develop formal models of multicast communication protocols in CSP and then verify the models using FDR, the model checking tool for CSP.
MSc. in computer science from the university of natural science. His current research interests include designing and verifying systems such as protocols using formal methods, especially the CSP formalism.
Chris George joined UNU/IIST in September 1994 as a Senior Research Fellow and is currently Associate Director. He is one of the main contributors to RAISE, particularly the RAISE method, and that remains his main research interest. Before coming to UNU/IIST he worked for companies in the UK and Denmark.
Copyright © 2007 by UNU-IIST, Jong Hyo Jin and Chris George
Contents
1 Introduction
2 Why do we need multicast
2.1 Unicast communication
2.2 Broadcast communication
2.3 IP Multicast
2.4 Application level multicast
3 Strategy for designing and verification
3.1 Requirement capturing
3.2 Informal designing
3.3 Formal specification
3.4 Formal verification
4 Protocol of simple joining and leaving
4.1 Requirements of the system
4.2 Designing the system
4.2.1 Designing the nodes
4.2.2 Designing the SM
4.2.3 Designing the OUTSIDE
4.3 Formal specification of the system
4.4 Verification of the system
4.4.1 Safety checking
4.4.2 Simple liveness checking
4.5 Mutation test
5 Protocol considering the simple packet loss
5.1 Requirements of the system
5.2 Designing the system
5.2.1 Designing the nodes
5.2.2 Designing the Loser
5.2.3 Designing the SM
5.3 Formal specification of the system
5.4 Verification of the system
5.4.1 Safety checking
5.4.2 Simple liveness checking
6 Protocol considering the hierarchical subgroup structure
6.1 Requirements of the system
6.2 Designing the system
6.2.1 Designing the nodes
6.2.2 Designing the GMs
6.2.3 Designing the SM
6.3 Formal specification of the system
6.4 Verification of the system
6.4.1 Safety checking
6.4.2 Simple liveness checking
7 Protocol considering the dynamics of channels
7.1 Requirements of the system
7.2 Designing the system
7.2.1 Designing the channels
7.2.2 Designing the nodes
7.3 Formal specification of the system
7.4 Verification of the system
7.4.1 Safety checking
7.4.2 Simple liveness checking
8 Conclusion
A Protocol of simple joining and leaving in CSPM
B Protocol considering the simple packet loss in CSPM
C Protocol considering the hierarchical subgroup structure in CSPM
D Protocol considering the dynamics of channels in CSPM
E Constructing of N-place buffer
1 Introduction
In this work, we present a case study in the formal design and verification of protocols that are components of multicast communication.
Multicasting is a key communication technique for group communication systems such as video conferencing, distance learning, internet games, video on demand, newsgroups and so on [4, 5].
It has been shown that multicast is superior to unicast and broadcast for such group communication systems, but, in contrast to unicast and broadcast, multicast communication is more complex and more difficult to implement, for several reasons.
The main reason is the lack of multicast-capable routers in the worldwide internet, but group security, group management and so on are also more complex and more difficult than in unicast and broadcast.
So far many multicast communication techniques have been developed ([6] - [13]), but most cannot be deployed in the real internet because they depend on the IP level, that is, on multicast-capable routers.
At present, most of the routers in the internet are not capable of supporting multicast.
To overcome this problem, research has recently concentrated on application level multicast ([14] - [20]), which depends only on application level routing rather than on IP level routing, that is, on the multicast-capable routers of the internet.
That is, in application level multicast, the data from the sender is propagated to all group members by intelligent relaying by the members themselves rather than by multicast-capable routers.
In this work, we formally design and verify simple protocols for several multicast situations, which will form the basis for the future development of an application level multicast protocol.
We design the protocols formally using the CSP formalism and then verify them in FDR, the model checking tool for CSP.
In section two, we give a brief overview of internet communication techniques, especially multicast techniques, including IP multicast and application level multicast.
In section three, we explain the strategy for designing and verifying a system that we follow consistently throughout the development process.
In section four, we design a simple multicast protocol for joining and leaving a single group and for simple data communication within the group, and then verify the protocol in FDR.
In section five, we design a simple multicast protocol that reflects data loss during communication, where the loss model is fixed, that is, non-dynamic, and then verify the protocol in FDR.
In section six, we cover a simple protocol for the situation in which one multicast group has a two-level hierarchical subgroup structure, which can easily be extended to a multi-level hierarchical subgroup structure, and then verify the protocol in FDR.
In section seven, we consider the design and verification of a protocol for the situation in which the communication channels are internally dynamic and each group member can intelligently relay data for other members to support the reliability of multicast.
Finally, we conclude by analysing some problems encountered while designing and verifying these simple protocols, and we point to future work towards a complete multicast protocol and to the study needed to solve those problems.
2 Why do we need multicast
Multicast is the key technique for group communication such as video conferencing, distance learning, internet games, video on demand, newsgroups and so on [4, 5].
Internet communication techniques can be broadly divided into three kinds: unicast, broadcast and multicast.
So far, the most conventional forms of internet communication have been unicast and broadcast, but multicast techniques are now emerging for developing more efficient group communication systems.
2.1 Unicast communication
Unicast is a one-to-one communication technique, in other words, communication between a single sender and a single receiver.
In this communication mode, a packet from the sender contains the destination address of only one receiver and is delivered only to that destination node.
In IPv4, class A (0.0.0.0-127.255.255.255), class B (128.0.0.0-191.255.255.255) and class C (192.0.0.0-223.255.255.255) addresses are used for unicast communication, except for the addresses whose host parts are all 0s (the subnetwork address) or all 1s (the broadcast address) in binary.
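The classful ranges above can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the function names `ipv4_class` and `is_classful_unicast` are illustrative and not part of the report, and the default host-part widths (24, 16 and 8 bits for classes A, B and C) are the usual classful assumption.

```python
import ipaddress

# Number of host bits per class under classful addressing (assumed defaults).
HOST_BITS = {"A": 24, "B": 16, "C": 8}


def ipv4_class(address: str) -> str:
    """Return the classful category (A to E) of a dotted-quad IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet <= 127:
        return "A"
    if first_octet <= 191:
        return "B"
    if first_octet <= 223:
        return "C"
    if first_octet <= 239:
        return "D"  # reserved for multicast groups
    return "E"


def is_classful_unicast(address: str) -> bool:
    """True for a class A, B or C address whose host part is neither all 0s nor all 1s."""
    cls = ipv4_class(address)
    if cls not in HOST_BITS:
        return False
    host_mask = (1 << HOST_BITS[cls]) - 1
    host = int(ipaddress.IPv4Address(address)) & host_mask
    return host not in (0, host_mask)
```

For example, `is_classful_unicast("179.253.255.255")` is false because the address is in class B and its two host bytes are all 1s.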
If we want to send one packet to several receivers using unicast, we have to make as many copies of the packet as there are receivers and send one copy to each receiver (Fig. 1).
In the figure, the internet backbone means the set of routers of the worldwide internet.
Each copy of the packet contains the destination address of its receiver.
For one packet to be sent, the channel between the sender and its router must be used as many times as there are receivers, so the bandwidth consumed on this channel grows in proportion to the number of receivers, and this causes communication delays.
Beyond the sender’s subnetwork, inside the internet, there may also be channels that are used repeatedly for one packet, so the same problems can occur there.
This waste of bandwidth may seriously affect other communication systems working on the network and may degrade the performance of the entire network.
In particular, the bottleneck on the channel between the sender and its router cannot be avoided, so the sending delay increases considerably and the number of receivers must be strictly limited.
Figure 1: The topology of multiple unicast (Snd: sender; Rcv1 to Rcv3: receivers; R: router; copy1 to copy3: copies of the packet sent through the internet backbone)
2.2 Broadcast communication
Broadcast is a one-to-all communication technique, in other words, communication between a single sender and all receivers.
In this communication mode, a packet from the sender contains a destination address whose host part is all 1s in binary, and it can reach all receivers in the subnetwork.
In IPv4, the class A, B and C addresses whose host parts are all 1s in binary are used for broadcast.
For example (Fig. 2), suppose the destination address is 203.187.255.255 and the network part is taken to be the first two bytes, 203.187, as in a class B layout; then its host part, that is, the last two bytes, is all 1s in binary. (Strictly, by the ranges given in Section 2.1, an address beginning with 203 falls in the class C range; the example treats 203.187 as a two-byte network prefix.)
This address is a broadcast address, and a packet with this destination address reaches all receivers whose addresses begin with 203.187, even though the sender sends the packet only once rather than making several copies of it.
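The broadcast address of the example can be reproduced with Python's standard `ipaddress` module. This is only a sketch: it assumes the two-byte network prefix 203.187 used in the example, written here as 203.187.0.0/16, and the helper names are illustrative.

```python
import ipaddress


def broadcast_address(prefix: str) -> str:
    """Return the address whose host part is all 1s for the given network prefix."""
    return str(ipaddress.ip_network(prefix).broadcast_address)


def reached_by_broadcast(host: str, prefix: str) -> bool:
    """True if the host lies inside the network and so receives its broadcast packets."""
    return ipaddress.ip_address(host) in ipaddress.ip_network(prefix)
```

Under this assumption, `reached_by_broadcast("203.187.26.123", "203.187.0.0/16")` holds, while a host in another network such as 204.143.25.168 is not reached.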
At first glance this seems sufficient for group communication, but broadcast also has several problems that it cannot solve for group communication.
If we want to send data to receivers outside the subnetwork as well as to its members, broadcast cannot help, because a broadcast address is confined to one subnetwork.
Likewise, if we want to send data to only some particular members of the subnetwork 203.187 rather than to all of them, broadcast cannot do it, because a broadcast packet is delivered to all members of the subnetwork.
Multicast is the only solution to these problems.
Figure 2: The topology of broadcast (Snd: sender; Rcv1 to Rcv4: receivers; R: router; packet with destination address 203.187.255.255)
2.3 IP Multicast
Multicast is a one-to-group communication technique, in other words, communication for sending data to a particular group of receivers (Fig. 3).
In this communication mode, a packet from the sender contains a group address as the destination address for the particular group of receivers and reaches all group members along the data transfer tree built by the multicast routing protocols running on the group members and on internet equipment such as routers ([4] - [13]).
The group members have joined the group in advance and share the same group IP address.
Once a multicast data packet arrives at a router on its path, the router makes copies of the packet and sends one copy to each downstream router or downstream group member according to the pre-configured data transfer tree.
No data packet traverses the same link twice, so the waste of network bandwidth is minimal and, in particular, the bottleneck problem, which causes undesirable sending delay and limits the number of receivers in a unicast based group communication system, does not occur.
Also, from the sender’s point of view, because the sender sends each data packet only once rather than once per receiver, the sender’s burden, which is a great obstacle to the scalability of unicast based group communication systems, is minimal.
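The difference in link usage between multiple unicast and IP multicast can be illustrated by counting how often each link carries one packet. The sketch below uses a small hypothetical topology (the names `R0`, `R1`, `R2` and the tree shape are assumptions, not the topology of the figures), and it simply encodes the rule stated above that, under multicast, each link of the tree carries the packet once.

```python
from collections import Counter

# Hypothetical topology: each node maps to the next node on its path towards the sender.
PARENT = {
    "R0": "Snd",
    "R1": "R0", "R2": "R0",
    "Rcv1": "R1", "Rcv2": "R1",
    "Rcv3": "R2", "Rcv4": "R2",
}
RECEIVERS = ["Rcv1", "Rcv2", "Rcv3", "Rcv4"]


def path_links(receiver):
    """Links (upstream, downstream) on the path from the sender to a receiver."""
    links = []
    node = receiver
    while node != "Snd":
        links.append((PARENT[node], node))
        node = PARENT[node]
    return links


def unicast_link_load(receivers):
    """Multiple unicast: one copy per receiver travels the whole path."""
    load = Counter()
    for receiver in receivers:
        load.update(path_links(receiver))
    return load


def multicast_link_load(receivers):
    """IP multicast: routers replicate the packet, so each tree link carries it once."""
    links = set()
    for receiver in receivers:
        links.update(path_links(receiver))
    return Counter({link: 1 for link in links})
```

In this topology the link between the sender and its router carries four copies under multiple unicast and one under multicast, which is the bottleneck effect described in Section 2.1.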
In IPv4, class D addresses (224.0.0.0-239.255.255.255) are reserved for multicast communication, to designate multicast groups.
A group address is neither the real IP address of any group member nor that of any device such as a router, switch or bridge; rather, it can be regarded as a so-called abstract address, so a packet carrying a group address as its destination address can be handled only when all the routers through which the packet is propagated are capable of supporting multicast routing.
Figure 3: The topology of IP multicast (Snd: sender; Rcv1 to Rcv4: receivers; R: router; packet copies addressed to the class D group address 231.183.218.17)
In this naive multicast mode, multicast functionality such as membership management and packet routing depends entirely on network equipment, such as routers running multicast routing protocols at the IP level.
That is, the membership state is managed by the routers, and the replication and routing of a packet are performed according to the packet’s group IP address, depending only on the IP routing tables pre-configured inside the routers by the multicast protocols.
That is why this naive multicast technique is generally called “IP multicast”.
So far many valuable IP multicast techniques have been developed for many different purposes ([6] - [13]). In theory these techniques are superior to unicast and broadcast, especially in terms of saving internet bandwidth and reducing delay and the sender’s burden, but most cannot be deployed in the real commercial internet, for several reasons [14, 15, 17].
The most important of these reasons is that most of the routing equipment of the real world internet is not capable of supporting multicast communication.
2.4 Application level multicast
To overcome these problems, research has recently concentrated on application level multicast, which depends only on application level routing rather than on IP level routing, that is, on multicast-capable routers ([14] - [20]).
In other words, the data from the sender is propagated to all group members by intelligent relaying by the members themselves rather than by multicast-capable routers (Fig. 4).
Figure 4: The topology of application level multicast (Snd: sender; Rcv1 to Rcv4: receivers; R: router; relayed copies copy1, copy1_1, copy1_2 and copy1_1_1)
Figure 4 shows application level multicast, which depends only on the group members themselves rather than on multicast-capable routers.
Figure 5 shows the logical topology of the data transfer tree derived from the physical communication topology of the application level multicast in Figure 4.
Figure 5: The data transfer tree of application multicast (nodes Snd and Rcv1 to Rcv4; relayed copies copy1, copy1_1, copy1_2 and copy1_1_1)
The multicast routing tree is formed between group members at the application level, rather than between group members and internet routers at the IP level as in IP multicast, and this application multicast tree is used to propagate data to all group members.
That is, multicast functions such as membership management, packet replication and packet routing are performed by the group members themselves rather than by multicast-capable routers as in IP multicast.
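Relaying by the members themselves can be sketched as a traversal of the overlay tree, in which every member forwards a copy of the packet to its children. The tree below is hypothetical (it is not claimed to be the tree of Figure 5), and the function name `relay` is illustrative.

```python
from collections import deque

# Hypothetical overlay tree between group members: member -> members it relays to.
CHILDREN = {
    "Snd": ["Rcv1", "Rcv4"],
    "Rcv1": ["Rcv2", "Rcv3"],
    "Rcv2": [],
    "Rcv3": [],
    "Rcv4": [],
}


def relay(source, children):
    """Return the (relayer, receiver) hops, in breadth-first order, that deliver one packet."""
    hops = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            hops.append((node, child))  # the member replicates and forwards the packet
            queue.append(child)
    return hops
```

Here the sender transmits only two copies, and Rcv1 replicates the packet for Rcv2 and Rcv3; no router needs to understand group addresses, since every hop is an ordinary unicast transmission between two members.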
Because it does not depend on multicast-capable internet routing equipment, application level multicast can easily be deployed in the real world internet.
It depends only on the group members themselves, that is, only on the applications running on the members and not on the IP level routing equipment of the internet; that is why this technique is called application level multicast.
Apart from being easily deployable in the internet, the other great benefit of application level multicast is that different efficient data transfer trees can be actively managed according to the different goals of different group applications.
The disadvantage of application level multicast compared with IP multicast is that it consumes more bandwidth, because the same channel may be used repeatedly for one packet.
The delay across the group is also longer than with IP multicast, because packet processing at the group receivers (end hosts) takes longer than in the dedicated routing equipment of the internet, and because the data transfer tree between the group members is less optimal than the tree between multicast routers and group members built by an IP multicast protocol.
However, since IP multicast cannot be deployed in the real world internet while application level multicast can, application level multicast is at present the only way to adopt multicast in internet group communication systems, and research is therefore concentrating on the development of application level multicast techniques.
Multicast communication systems involve many group members (from tens to thousands) and the interactions between them, so the protocols of such systems may be very complex and may have very large state spaces from the operational point of view; it may therefore be impossible to design and verify such systems thoroughly by hand.
So when we design and verify such systems, we must rely on a formalism, which is the fundamental basis for verifying the system fully automatically.
With this background in mind, in this work we use the CSP formalism to formally design some protocols for several possible situations, which will form the basis for the future development of a complete application level multicast protocol, and we verify the protocols automatically and almost exhaustively in FDR, a model checking tool for CSP.
3 Strategy for designing and verification
Before embarking on protocol development, in this section we establish the simple strategy that we will follow consistently.
3.1 Requirement capturing
When we design a system, it is very important first of all to have a thorough grasp of the system’s overall characteristics, including its properties, methods and events from the OOD (Object Oriented Design) point of view.
This is an important precondition for extracting all the requirements that the system must satisfy.
We try to find the requirements of the system as thoroughly as possible.
In this work we do not address whether the requirements found are sufficient to reflect the system we want to develop, that is, the domain engineering aspects.
3.2 Informal designing
Assuming that the requirements are complete, in the sense that they are sufficient to reflect the system, the next step is to design the system informally according to the requirements, in OOD style.
In general, a system has two kinds of components: physical components and logical components.
In a multicast communication system, the communication channels and the group members, including group managers, can be regarded as physical components, while the data transfer trees and control meshes, which consist of logical relations between the physical components, can be regarded as logical components.
Each component has its own dynamic properties, methods and events and is a candidate for a subprocess of the entire system to be specified formally.
We try to find all the components that make up the system we want to develop and to design the properties, methods and events of each component.
A good informal design leads to a good formal specification.
In this step we have to bear in mind that the verification of the system will use only its interface events, which can be regarded as “server providers” between the system and its environment; that is, from the CSP point of view, all events, methods and properties inside a component will be hidden, and only the interface events between the system and the environment will be exposed.
In other words, we regard each component as a black box and treat verification of the system as verification of the traces of the system’s server providers, synchronized between the system and the environment. This seems a good approach to verification and to later implementation, especially when the verifier is an agent who knows nothing about the system except its interface events.
3.3 Formal specification
After the informal design is finished, the next step is to specify the informally designed system formally using CSP, a formalism for describing concurrent systems.
We specify each component, consisting of properties, methods and events, as a so-called process that interacts with its environment, using the powerful semantics of CSP [1, 2, 3].
CSP processes are described using basic operators such as prefixing, recursion and choice (deterministic, nondeterministic and conditional).
CSP also provides advanced features such as parallel and sequential composition of processes and hiding and renaming operators, which make it possible to design and verify complex concurrent systems consisting of several concurrent components and their relations.
In CSP there are two simplest processes: STOP, which can do nothing, that is, never communicates, and SKIP, which immediately terminates successfully.
In other words, SKIP; P = P for all P, because all SKIP does is terminate successfully and pass control to P; in contrast, STOP; P = STOP, because STOP does not terminate but merely comes to an ungraceful halt.
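The two laws can be checked on a small model in which a finite process is represented simply by its set of traces. This is only a sketch in the traces model: the string "tick" stands for the termination event, the names `prefix` and `seq` are illustrative, and the definition of `seq` is the standard traces-model definition of sequential composition rather than anything specific to this report.

```python
TICK = "tick"  # stands for the successful termination event

STOP = frozenset({()})           # only the empty trace
SKIP = frozenset({(), (TICK,)})  # may terminate immediately


def prefix(event, process):
    """Traces of the process that performs `event` and then behaves like `process`."""
    return frozenset({()} | {(event,) + trace for trace in process})


def seq(p, q):
    """Traces of p ; q: unterminated traces of p, plus terminated ones continued by q."""
    unfinished = {trace for trace in p if TICK not in trace}
    continued = {trace[:-1] + tail
                 for trace in p if trace and trace[-1] == TICK
                 for tail in q}
    return frozenset(unfinished | continued)


# A sample process: a -> b -> SKIP
P = prefix("a", prefix("b", SKIP))
```

With these definitions, `seq(SKIP, P)` equals `P` and `seq(STOP, P)` equals `STOP`, matching the two laws above.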
Formal specification is a precondition for the automatic verification of a system with verification tools.
We then have to translate the CSP specification into the machine-readable dialect CSPM, which makes it possible to verify the system using FDR (Failures/Divergences Refinement).
FDR is an automatic verification tool for CSP.
3.4 Formal verification
The final step of our work is to verify the CSP specification of the system using FDR.
CSP has three levels of denotational semantics, each giving a notion of process equivalence: traces, failures and failures/divergences.
When Sys is a CSP process representing a system and Spec is a CSP process representing a property, if Spec = Sys ⊓ Spec, then we say that Sys refines Spec, written Sys ⊒ Spec (equivalently Spec ⊑ Sys), and the system Sys satisfies the property Spec.
Traces semantics deals with safety properties, failures semantics deals with both safety and liveness properties, and failures/divergences semantics deals with safety, liveness and divergence properties.
In principle, a multicast communication system must be divergence free, which means that the system has no trace after which it can perform internal actions forever without interacting with the environment.
Therefore we first verify that the systems we have formally designed are livelock free.
If we succeed in verifying that the systems are livelock free, our verification targets are reduced to two: traces refinement (safety) and failures refinement (safety and liveness), rather than failures/divergences refinement.
• Safety checking
First of all, we have to verify whether the system Sys satisfies each requirement Spec.
In general, we can regard the system’s requirements as safety properties, which means that Sys must not engage in ‘illegal’ activities, that is, events not allowed by Spec.
In CSP, a sequence of visible events that occurs in an execution of a process P is called a trace, and the set of all possible traces of the process is called its traces semantics, denoted traces(P).
We say that Sys traces refines Spec, written Spec ⊑T Sys, if and only if traces(Sys) ⊆ traces(Spec).
That is, the traces refinement Spec ⊑T Sys means that Sys satisfies the requirement Spec safely, by not doing anything that conflicts with Spec; traces semantics is therefore enough for checking safety properties.
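The definition traces(Sys) ⊆ traces(Spec) can be illustrated on small labelled transition systems by enumerating traces up to a bound. This is a sketch only, not how FDR works internally: the bound `depth`, the event names `join` and `leave`, and the three example systems are assumptions made for illustration.

```python
def traces(lts, start, depth):
    """All traces of length <= depth; lts maps a state to a list of (event, next_state)."""
    result = {()}
    frontier = {((), start)}
    for _ in range(depth):
        next_frontier = set()
        for trace, state in frontier:
            for event, target in lts.get(state, []):
                next_frontier.add((trace + (event,), target))
        result |= {trace for trace, _ in next_frontier}
        frontier = next_frontier
    return result


def trace_refines(spec, spec_start, system, system_start, depth):
    """Bounded check of Spec ⊑T Sys, that is, traces(Sys) ⊆ traces(Spec)."""
    return traces(system, system_start, depth) <= traces(spec, spec_start, depth)


# Hypothetical requirement: join and leave must alternate, starting with join.
SPEC = {"out": [("join", "in")], "in": [("leave", "out")]}
GOOD = {0: [("join", 1)], 1: [("leave", 2)]}  # joins once, leaves once, then stops
BAD = {0: [("join", 1)], 1: [("join", 0)]}    # can join twice in a row
```

`GOOD` passes the check because every one of its traces is a trace of `SPEC`, whereas `BAD` fails because the trace ⟨join, join⟩ is not allowed by `SPEC`.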
So, in safety checking, we have to formally design a process Spec representing each requirement, and our goal is then to verify that Spec ⊑T Sys.
FDR provides such traces refinement checking: in FDR, ‘assert Spec ⊑T Sys’ (written assert Spec [T= Sys in CSPM) gives the refinement relation between the two processes in the traces semantics.
However, we do not use this function; instead we use a so-called ‘trace monitor’, for several benefits that we will see later.
Even though safety checking is performed with the trace monitor strategy rather than with the traces refinement check of FDR, the result established by the trace monitor strategy is still Spec ⊑T Sys, which is our goal.
That is, only the procedure differs; the result amounts to a traces refinement check.
• Liveness checking
In CSP, the simplest process is STOP, which does nothing and has only the empty trace ⟨⟩.
So, in the traces model, STOP refines every process P, because traces(STOP) = {⟨⟩} ⊆ traces(P). This means that STOP could be said to satisfy our protocols, because it does nothing that conflicts with their requirements.
That is, in the traces model it is impossible to distinguish a deadlocking process from a non-deadlocking one.
In other words, although traces semantics can say that a process P will not do anything wrong, it cannot force P to do anything at all.
The failures model is used to solve this problem.
A failure of a process P is a pair (s, X) such that s ∈ traces(P) and X ∈ refusals(P/s), where P/s represents P after performing the trace s and refusals(P/s) is the set of all sets of events that P, after performing s, can refuse to accept forever.
The failures semantics failures(P) is the set of all failures of P.
We say that a process Sys failures refines a process Spec, written Spec ⊑F Sys, if and only if failures(Sys) ⊆ failures(Spec), which also implies traces(Sys) ⊆ traces(Spec).
That is, Sys can neither accept nor refuse an event unless Spec can also accept or refuse it.
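The difference between the two models can be seen on the same kind of small transition system. The sketch below assumes systems without internal actions, so that every state is stable and its maximal refusal is the alphabet minus the events it offers; since refusal sets are subset-closed, comparing maximal refusals is enough. The bound `depth`, the event names and the example processes are assumptions for illustration.

```python
def max_failures(lts, start, alphabet, depth):
    """Pairs (trace, maximal refusal set) for all traces of length <= depth."""
    result = set()
    frontier = {((), start)}
    for _ in range(depth + 1):
        next_frontier = set()
        for trace, state in frontier:
            offered = {event for event, _ in lts.get(state, [])}
            result.add((trace, frozenset(alphabet - offered)))
            for event, target in lts.get(state, []):
                next_frontier.add((trace + (event,), target))
        frontier = next_frontier
    return result


def failures_refines(spec, spec_start, system, system_start, alphabet, depth):
    """Bounded check of Spec ⊑F Sys: each maximal failure of Sys is covered by one of Spec."""
    spec_failures = max_failures(spec, spec_start, alphabet, depth)
    system_failures = max_failures(system, system_start, alphabet, depth)
    return all(any(s == t and refusal <= allowed for t, allowed in spec_failures)
               for s, refusal in system_failures)


ALPHABET = {"join", "leave"}
SPEC = {"out": [("join", "in")], "in": [("leave", "out")]}  # must always offer the next event
STOP = {}                                                   # no transitions at all
```

`STOP` satisfies `SPEC` in the traces model, since its only trace is the empty one, but it fails the failures check: initially it refuses {join, leave}, while `SPEC` can refuse only {leave}.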
For this reason, the traces semantics is regarded as a safety or partial correctness condition, while the failures semantics is regarded as a liveness or total correctness condition, which additionally forces a process to be able to do things.
Finding the liveness properties of a system thoroughly is as important as finding its safety properties, because total correctness depends on satisfying the liveness properties, while satisfying the safety properties gives only partial correctness.
In liveness checking, we first formally design a process Spec representing the liveness property, and our goal is then to verify that Spec ⊑F Sys.
FDR provides such failures refinement checking: in FDR, ‘assert Spec ⊑F Sys’ (written assert Spec [F= Sys in CSPM) gives the refinement relation between the two processes in the failures semantics.
However, we do not use this function; instead we use a so-called ‘trace monitor’, for several benefits that we will see later.
Even though liveness checking is performed with the trace monitor strategy rather than with the failures refinement check of FDR, the result established by the trace monitor strategy is still Spec ⊑F Sys, which is our goal.
That is, only the procedure differs; the result amounts to a failures refinement check.
• Trace monitor
In our work, we use a so-called ‘trace monitor’ to obtain refinement results for the traces and failures semantics without using the refinement checking functions of FDR.
In short, our verification strategy is as follows: to check that Sys satisfies a property Spec, we construct a monitor Moni such that the result of a deadlock check on Sys in parallel with Moni reflects the refinement check Spec ⊑T Sys or Spec ⊑F Sys.
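The general idea of turning a property check into a deadlock check can be sketched as follows. This is a generic illustration and not the monitor construction used in this report: it assumes a monitor that accepts every monitored event but moves to a dead state on a forbidden one, a system that is otherwise deadlock free, and hypothetical event and state names.

```python
from collections import deque


def find_deadlock(system, system_start, monitor, monitor_start, sync):
    """Search Sys in parallel with Moni (synchronized on `sync`) for a deadlocked state.

    Returns the trace leading to the first deadlock found, or None if there is none.
    Both arguments map a state to a list of (event, next_state) pairs.
    """
    start = (system_start, monitor_start)
    seen = {start}
    queue = deque([(start, ())])
    while queue:
        (s, m), trace = queue.popleft()
        moves = []
        for event, s_next in system.get(s, []):
            if event in sync:
                # A synchronized event needs the agreement of the monitor.
                moves += [(event, (s_next, m_next))
                          for e, m_next in monitor.get(m, []) if e == event]
            else:
                moves.append((event, (s_next, m)))
        if not moves:
            return trace
        for event, state in moves:
            if state not in seen:
                seen.add(state)
                queue.append((state, trace + (event,)))
    return None


# Hypothetical monitor for "join and leave alternate": a violation leads to the dead state "bad".
MONI = {"out": [("join", "in"), ("leave", "bad")],
        "in": [("leave", "out"), ("join", "bad")]}
GOOD = {0: [("join", 1)], 1: [("leave", 0)]}  # alternates forever
BAD = {0: [("join", 1)], 1: [("join", 0)]}    # joins twice in a row
```

For `GOOD` the search finds no deadlock, while for `BAD` it returns the trace ⟨join, join⟩, which is exactly the behaviour that violates the property. A design note: the monitor only needs to mention the events relevant to the property, which is the kind of economy discussed in the paragraphs that follow.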
— Why do we use a trace monitor?
So, why do we use a trace monitor to obtain the refinement checking result instead of using the refinement checking functions of FDR directly?
The first reason is that it seems in general very difficult to specify formally, as a CSP process separate from the system, a Spec representing the property we want to verify.
Given a system Sys to be verified, in refinement checking for traces, failures and failures/divergences (⊑FD), the first thing a verifier has to do is to specify formally the processes representing the safety (requirement), safety/liveness, and safety/liveness/divergence properties.
In our case, because we first verify that the systems we consider are divergence free (livelock free in FDR), we do not need failures/divergences refinement.
Traces and failures refinements are enough for our protocol.
If we want to verify that Spec ⊑T Sys or Spec ⊑F Sys, then we have to specify Spec as a CSP process, which seems in general very difficult.
The second reason is that a verifier has to consider too many actions of the system in comparison with the small amount of information that the verifier wants to verify.
A system may consist of many processes (CSP specifications of the components of the system). In general, the actions inside the subcomponents are hidden (in CSP, \{...}) by the designers before the system is handed to the verifiers (in principle, the most thorough verifier is a third party; especially when a system is big, the verifiers and the designers cannot be the same).
Verifiers use only the interface actions between the components of the system and the environment, and for a big system the number of interface actions can become very large.
In this case, a verifier who wants to verify that Sys satisfies Spec has to hide all interface actions of Sys except those belonging to Spec in order to use the CSP specification “assert Spec ⊑T Sys” or “assert Spec ⊑F Sys”, and then use the refinement checking functions of FDR.
That is, the verifier must attend to many actions that are not to be verified as well as to the actions that are.
This is clearly non-optimal when the number of interface actions is very large in comparison with the number of actions relevant to the refinement check at hand.
This situation is similar to a reliable communication system, where the ACK (Acknowledgment) mode can be used when the communication channels are very unreliable, and the NACK (Negative Acknowledgment) mode can be used when the channels are very reliable, to avoid receiving too many control messages.
In short, direct use of the refinement checking function of FDR to verify “assert Spec ⊑T Sys” or “assert Spec ⊑F Sys” is rather inefficient: the left-hand side Spec is very difficult to specify as a CSP process, and the right-hand side Sys requires the verifier to identify and hide many actions that are not of interest.
The third reason is that constructing a trace monitor is very simple in comparison with constructing, in CSP, a Spec representing the property to be verified.
— Trace monitor construction and result
The trace monitor construction lets a verifier focus on only the few actions related to the property to be verified, and is easier than specifying that property separately from the system.
Of course, keeping the number of actions in each property to be verified as small as possible makes the construction of the trace monitor easy.
In a word, a trace monitor captures the interactions between the system and the environment, but only the interactions of interest, disregarding the others; this naturally produces the effect of hiding without using the hiding operator of CSP.
We construct the trace monitor Moni of a system Sys such that the deadlock checking result of Sys |[ interface actions ]| Moni reflects whether Sys satisfies a property Spec.
If Spec is a safety property, then the deadlock checking result of Sys |[ interface actions ]| Moni reflects the refinement checking result of Spec ⊑T Sys, that is, whether or not Sys satisfies the safety property Spec.
Likewise, if Spec is a liveness property, then the deadlock checking result of Sys |[ interface actions ]| Moni reflects the refinement checking result of Spec ⊑F Sys, that is, whether or not Sys satisfies the liveness property.
A deadlock of that parallel composition means that Spec ⊑F Sys does not hold, that is, Sys does not satisfy the liveness property Spec.
For the above reasons, we obtain the same refinement checking results by deadlock checking the system in parallel with a trace monitor as by using the refinement checking functions of FDR directly, while the verification procedure becomes simpler than using those functions directly.
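The idea can be illustrated outside CSP with a minimal Python sketch. It is not the FDR construction: it assumes, purely for illustration, that a system is given as a finite set of traces and that a monitor is a step function returning either a new state or the sentinel ERROR (the analogue of sending an error message and then behaving as Stop). The names run_monitor, satisfies, ERROR and the toy events are hypothetical.

```python
ERROR = object()  # sentinel: the monitor has reported an error and stopped


def run_monitor(step, init, trace):
    """Feed the events of one trace to the monitor.

    Returns the index of the event at which the monitor stopped
    (the analogue of a deadlock of Sys in parallel with Moni),
    or None if the monitor never stopped.
    """
    state = init
    for n, event in enumerate(trace):
        state = step(state, event)
        if state is ERROR:
            return n
    return None


def satisfies(step, init, traces):
    """True when the monitor stops on no trace of the (finite) system."""
    return all(run_monitor(step, init, tr) is None for tr in traces)


# Toy safety property: never two 'open' events without a 'close' between them.
def no_double_open(is_open, event):
    if event == "open":
        return ERROR if is_open else True
    if event == "close":
        return False
    return is_open  # events outside the property are simply ignored


good = [["open", "log", "close", "open"]]
bad = [["open", "log", "open"]]
```

Note that the monitor ignores the event "log" without any explicit hiding, which is the effect described above.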
— Drawback of trace monitors
There are some cases when the trace monitor is not appropriate, where it is necessary to use directly the refinement checking functions of FDR.
In general, trace monitors are constructed to synchronize with the system and capture the system’s actions in order to determine whether the system satisfies a property.
So, if checking the property requires the system to run several times in synchronization with the trace monitor, then the parallel composition can become impossible to check fully automatically because of state space explosion, especially in liveness checking (see “proving the fourth liveness property” in section 7.4.2).
That is, even though we succeed in checking a system SYS, we may fail to check directly in FDR the parallel composition of SYS with trace monitors, such as
||| dta : Data • SYS |[ ... ]| TRACE MONITOR(dta, ...)
if SYS is even a little bigger in state space because of the size of “Data”.
So, we reluctantly selected each element of “Data” manually and checked the above parallel composition for the chosen data.
In this case, the only way for us to verify the property fully automatically is the direct use of the refinement checking function of FDR which, of course, involves the drawbacks mentioned above.
That is, we have to choose intelligently between the trace monitor method and the direct use of the refinement checking function of FDR according to the kind of verification at hand, but we must obtain the same refinement checking result regardless of the method.
In our work, a safety property is in general expressed as “∀ tr ∈ traces(Sys) • tr sat Spec”, so the traces of the process representing Spec must contain traces(Sys). Constructing Spec as a CSP process may therefore be relatively difficult, and in this case the trace monitor strategy may be better than the direct use of the refinement checking function of FDR.
By contrast, in our work a liveness property is in general expressed as “∃ tr ∈ traces(Sys) • tr sat Spec”, so Spec is a special case of Sys and the traces of the process representing Spec must be contained in traces(Sys). Constructing Spec as a CSP process may therefore be relatively easy, and in this case the direct use of the refinement checking function of FDR may be better than the trace monitor strategy.
4 Protocol of simple joining and leaving
Here we design and verify a simple protocol which covers simple joining, leaving, and data communication in one group (Figure 6).
The group consists of one session manager (SM) and a number of member nodes. In the figure, i, j, k denote the nodes, SM denotes the session manager, and OUTSIDE denotes the system outside the protocol.
For example, OUTSIDE may be a system which generates, receives and processes the necessary information such as multimedia data packets, control packets, etc.
The channel “chn in” is the channel between the nodes and SM, and “chn out” is the channel between SM and OUTSIDE.
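[Figure: the nodes i, j, k are connected to SM through Chn_in, and SM is connected to OUTSIDE through Chn_out; the diagram is labelled SYSTEM.]
Figure 6: The physical topology of the protocol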
4.1 Requirements of the system
This system must satisfy the following requirements.
i) There is one group communication system which consists of one session manager (SM) and a number of member nodes.
ii) Each node can join and leave the group in cooperation with the SM in the request/confirm mode.
iii) SM allows a node to join only when the requesting node is not currently a member of the group, and allows a node to leave only when the requesting node is currently a member of the group.
iv) There is a limit on the number of member nodes, and this limit is enforced by the SM.
v) A node can communicate data with SM only after joining the group.
vi) When the session manager receives data from outside, it sends the data to all member nodes.
vii) When the session manager receives data from member nodes, it forwards the data to OUTSIDE.
It is assumed that no message corruption or loss occurs on the channels “chn in” and “chn out”.
4.2 Designing the system
4.2.1 Designing the nodes
(1) Joining process
• Each node that wants to join sends a message “Join request” to the SM, and then waits for the reply from the SM.
• If the node receives a message “Join confirm” from the SM, then the node joins the group immediately.
• If the node receives a message “Join refusal” from the SM, then the node cannot join the group at that time, so the node will retry joining.
(2) Data communicating process
• Each member node receives the data addressed to it from the SM.
• Each member node can send data to the SM.
(3) Leaving process
• Each member node that wants to leave sends a message “Leave request” to the SM, and then waits for the reply from the SM.
• If the node receives a message “Leave confirm” from the SM, then the node leaves the group immediately.
• If the node receives a message “Leave refusal” from the SM, then the node cannot leave the group at that time, so the node will retry leaving.
4.2.2 Designing the SM
(1) Joining process
• Whenever SM receives a message “Join request” from any node, SM checks the number of member nodes.
• If the number is smaller than the limit k and the node is not a member node, then the SM allows the node to join and sends a message “Join confirm” to the node.
• Otherwise, SM does not allow the node to join and sends a message “Join refusal” to the node.
(2) Data communicating process
• SM receives data from any node and sends it to OUTSIDE.
• SM receives data from OUTSIDE and multicasts the data to all member nodes.
(3) Leaving process
• Whenever SM receives a message “Leave request” from any node, SM checks whether the node is a member node.
• If the node is a member of the group, then SM sends a message “Leave confirm” to the node.
• Otherwise, SM sends a message “Leave refusal” to the node.
4.2.3 Designing the OUTSIDE
• OUTSIDE receives all data from SM.
• OUTSIDE sends data to SM.
The entire system SYS consists of these processes and the interaction between them.
4.3 Formal specification of the system
Formal specification of this protocol using CSP is as follows, where
NJOIN(i): group joining process at node i.
NDATA(i): data communicating process at node i.
NLEAVE(i): group leaving process at node i.
N(i): node i.
SM(Jnode): Session Manager, where Jnode is a set of the nodes which join the group.
OUTSIDE: outside which interacts with the group.
Protocol of simple joining and leaving

datatype
  Msg = {Join request, Leave request, Join confirm, Join refusal, Leave confirm, Leave refusal, Joined already}
  Err = {err exceed, err trace, err full, err join liveness, err data only after joining, err multicast RtoL, err relay LtoR, err at least once data communi}
  Direction = {LtoR, RtoL}
set
  Node = {1..n} : The set of all nodes.
  Data : The set of all data to be communicated
  Packet = Data ∪ Msg
  k : Maximum number of the nodes which can join the group
channel
  chn in : Node.Packet.Direction
  chn out : Node.Data.Direction
  chn err : Err
process
  SYS =
    let
      NJOIN(i) =
        chn in!i.Join request.LtoR → chn in.i?x : Msg!RtoL →
          if (x = Join confirm ∨ x = Joined already) then N(i)
          else NJOIN(i)
      NDATA(i) =
        chn in.i?x : Data!RtoL → N(i)
        □
        □ x : Data • chn in!i.x.LtoR → N(i)
      NLEAVE(i) =
        chn in!i.Leave request.LtoR → chn in.i?x : Msg!RtoL →
          if x = Leave confirm then NJOIN(i) else N(i)
      N(i) = NLEAVE(i) □ NDATA(i)
      SM(Jnode) =
        chn in?i?x!LtoR →
          if x = Join request then
            if i ∈ Jnode
            then chn in!i.Joined already.RtoL → SM(Jnode)
            else
              if #(Jnode) ≥ k
              then chn in!i.Join refusal.RtoL → SM(Jnode)
              else chn in!i.Join confirm.RtoL → SM(Jnode ∪ {i})
          else
            if x = Leave request then
              if i ∈ Jnode
              then chn in!i.Leave confirm.RtoL → SM(Jnode \ {i})
              else chn in!i.Leave refusal.RtoL → SM(Jnode)
            else
              if x ∈ Data
              then chn out!i.x.LtoR → SM(Jnode)
              else SM(Jnode)
        □
        chn out?i?x!RtoL →
          (||| j : Jnode • chn in!j.x.RtoL → Skip); SM(Jnode)
      OUTSIDE =
        chn out?i?x!LtoR → OUTSIDE
        □
        □ x : Data • chn out!1.x.RtoL → OUTSIDE
    within
      ((||| i : Node • NJOIN(i)) |[ {|chn in|} ]| SM(∅)) |[ {|chn out|} ]| OUTSIDE
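The request handling of SM(Jnode) above can be mirrored by a minimal Python sketch. It covers only the two control requests, not the data branches, and it models one request/reply exchange as a single function call, which is a simplification of the CSP process. The function name sm_step is hypothetical; the message strings are those of the datatype Msg.

```python
def sm_step(jnode, i, msg, k):
    """One request/reply step of the session manager.

    jnode: frozenset of current member nodes
    i:     the requesting node
    msg:   "Join request" or "Leave request"
    k:     maximum number of member nodes
    Returns (reply, new_jnode).
    """
    if msg == "Join request":
        if i in jnode:
            return "Joined already", jnode
        if len(jnode) >= k:
            return "Join refusal", jnode
        return "Join confirm", jnode | {i}
    if msg == "Leave request":
        if i in jnode:
            return "Leave confirm", jnode - {i}
        return "Leave refusal", jnode
    raise ValueError("only control requests are handled in this sketch")
```

For example, with k = 2 the third distinct node asking to join receives “Join refusal”, and a non-member asking to leave receives “Leave refusal”.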
4.4 Verification of the system
We verify the above system model SYS using FDR.
In order to check the system in FDR, we translate the CSP system into CSPM, which is presented in appendix A.
First, we analyze and verify the safety property of the system, which is defined as “in this system, nothing bad will happen”.
Second, we analyze and verify the simple liveness property of the system, which is defined as “in this system, something good can happen”.
4.4.1 Safety checking
We check whether this system will work correctly satisfying all the requirements or not.
This system SYS is deadlock free, livelock free and deterministic.
We have to verify the correctness of this system SYS by checking that the traces of this system satisfy all the requirements. The fact that SYS satisfies the requirements i),ii) is clear.
So, we have to check whether the requirements iii)-vii) are satisfied or not.
• Correctness checking for the requirement iii).
That is, SM has to allow joining of only the node which is not a member of the group currently, and leaving of only the node which is a member of the group currently.
To satisfy this requirement, the trace model of SYS must satisfy the following specification in CSP.
∀ tr ∈ traces(SYS), ∀ i ∈ Node •   · · · (4-1)
  tr ↓ chn in.i.Join confirm.RtoL ≥ tr ↓ chn in.i.Leave confirm.RtoL ∧
  tr ↓ chn in.i.Join confirm.RtoL ≤ tr ↓ chn in.i.Leave confirm.RtoL + 1
To verify this rule of the system’s trace, we constructed the trace checking model for this rule in CSP as follows.
Trace Checking Model
process
  MONITOR S 1(a) =
    chn in?i.x.d →
      if x = Join confirm then
        if i ∈ a
        then chn err!err trace → Stop
        else MONITOR S 1(a ∪ {i})
      else
        if x = Leave confirm then
          if i ∉ a
          then chn err!err trace → Stop
          else MONITOR S 1(a \ {i})
        else MONITOR S 1(a)
  VERI TRACE = SYS |[ {|chn in|} ]| MONITOR S 1(∅)
If SYS does not satisfy this requirement, then the process MONITOR S 1(∅) will send the message “err trace” and then stop. Therefore, the entire checking process VERI TRACE will deadlock.
If SYS satisfies this requirement, then the process will not stop, so the VERI TRACE will not deadlock.
We translate this trace checking model into CSPM model(Appendix A) and then check the model using FDR. The checking result of the VERI TRACE is deadlock free.
So it is proved that SYS satisfies the requirement iii).
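The logic of MONITOR S 1 can be sketched in Python as a check over one recorded trace. This is only an illustration under the assumption that events on “chn in” are given as tuples (node, message, direction); the function name monitor_s1 is hypothetical, and FDR of course explores all traces rather than one.

```python
def monitor_s1(trace):
    """Return False at the first confirm that contradicts the membership set.

    A 'Join confirm' for a current member, or a 'Leave confirm' for a
    non-member, corresponds to the monitor sending 'err trace'.
    """
    members = set()
    for (i, x, d) in trace:
        if x == "Join confirm":
            if i in members:
                return False
            members.add(i)
        elif x == "Leave confirm":
            if i not in members:
                return False
            members.remove(i)
        # all other events leave the monitor state unchanged
    return True
```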
• Correctness checking for the requirement iv).
That is, the number of nodes which have joined the group does not exceed the maximum number k. This requirement can be specified using CSP as follows.
∀ tr ∈ traces(SYS) •
  #(tr ↾ {chn in.i.Join confirm.RtoL | i ∈ Node}) ≤
  #(tr ↾ {chn in.i.Leave confirm.RtoL | i ∈ Node}) + k   · · · (4-2)
Here, k is the maximum number of the nodes which join the group. To verify this rule, we constructed the trace checking model for this rule in CSP as follows.
Trace Checking Model
process
  MONITOR S 2(i) =
    if i ≥ 0 ∧ i ≤ k then
      chn in?j.x.d →
        if x = Join confirm
        then MONITOR S 2(i + 1)
        else
          if x = Leave confirm
          then MONITOR S 2(i − 1)
          else MONITOR S 2(i)
    else chn err!err exceed → Stop
  VERI MEM LIM = SYS |[ {|chn in|} ]| MONITOR S 2(0)
If SYS does not satisfy the requirement iv), then the checking process MONITOR S 2(0) will send out the message “err exceed”, and then stop.
Therefore, the entire checking process VERI MEM LIM will deadlock.
If SYS satisfies this requirement, then MONITOR S 2(0) will not stop, so VERI MEM LIM will not deadlock.
We translate this trace checking model into CSPM model(Appendix A) and then check the model using FDR. The checking result of the VERI MEM LIM using FDR is deadlock free.
So it is proved that SYS satisfies the requirement iv).
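The counting performed by MONITOR S 2 can be sketched in Python for a single trace. As an assumption of this sketch, events on “chn in” are tuples (node, message, direction) and the range test is applied right after each update rather than before the next event, which detects the same violations. The name monitor_s2 is hypothetical.

```python
def monitor_s2(trace, k):
    """Return False as soon as the number of confirmed members leaves 0..k."""
    count = 0
    for (j, x, d) in trace:
        if x == "Join confirm":
            count += 1
        elif x == "Leave confirm":
            count -= 1
        if not 0 <= count <= k:
            return False  # corresponds to the message 'err exceed'
    return True
```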
• Correctness checking for the requirement v).
A node can communicate data with SM only after joining the group.
This requirement can be specified using CSP as follows.
∀ tr ∈ traces(SYS), ∀ i ∈ Node, ∀ dt ∈ Data, ∀ dr ∈ Direction •   · · · (4-3)
  (t̅r̅)₀ = chn in.i.dt.dr ⇒
  (t̅r̅)′ ↓ chn in.i.Join confirm.RtoL = (t̅r̅)′ ↓ chn in.i.Leave confirm.RtoL + 1
In the specification (4-3), t̅r̅ denotes the reverse of tr, (t̅r̅)₀ denotes the head of the reverse, and (t̅r̅)′ denotes the tail of the reverse.
To verify this rule, we constructed the trace checking model for this rule in CSP as follows.
Trace Checking Model
process
  MONITOR S 3(a) =
    chn in?i.x.d →
      if x = Join confirm
      then MONITOR S 3(a ∪ {(i, Join confirm)})
      else
        if x = Leave confirm
        then MONITOR S 3(a \ {(i, Join confirm)})
        else
          if x ∈ Data then
            if (i, Join confirm) ∈ a
            then MONITOR S 3(a)
            else chn err!err data only after joining → Stop
          else MONITOR S 3(a)
  VERI DATA ONLY AFTER JOINING = SYS |[ {|chn in|} ]| MONITOR S 3(∅)
If SYS does not satisfy the requirement v), then checking process MONITOR S 3(∅) will send out the message “err data only after joining”, and then stop. Therefore, the entire checking process VERI DATA ONLY AFTER JOINING will deadlock.
If SYS satisfies this requirement, then MONITOR S 3(∅) will not stop, so VERI DATA ONLY AFTER JOINING will not deadlock.
We translate this trace checking model into CSPM model(Appendix A) and then check the model using FDR.
The checking result of the VERI DATA ONLY AFTER JOINING using FDR is deadlock free.
So it is proved that SYS satisfies the requirement v).
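A minimal Python sketch of the check made by MONITOR S 3 on one trace follows. It assumes that events on “chn in” are tuples (node, payload, direction) and that the set Data is passed in explicitly; the name monitor_s3 is hypothetical.

```python
def monitor_s3(trace, data):
    """Return False if a data event occurs for a node that is not joined."""
    joined = set()
    for (i, x, d) in trace:
        if x == "Join confirm":
            joined.add(i)
        elif x == "Leave confirm":
            joined.discard(i)
        elif x in data and i not in joined:
            return False  # corresponds to 'err data only after joining'
    return True
```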
• Correctness checking for the requirement vi).
Once the Session Manager receives data from OUTSIDE, it multicasts the data to all member nodes of the group.
This requirement can be specified using CSP as follows.
∀ tr ∈ traces(SYS), ∀ d1, d2 ∈ Data, ∀ s, t in tr •   · · · (4-4)
  (s ⌢ ⟨chn out.1.d1.RtoL⟩ ⌢ t ⌢ ⟨chn out.1.d2.RtoL⟩ = tr ∧
   t ↾ {chn out.1.x.RtoL | x ∈ Data} = ⟨⟩) ⇒
  (∀ i ∈ Node •
     s ↓ chn in.i.Join confirm.RtoL = s ↓ chn in.i.Leave confirm.RtoL + 1 ⇒
     t ↓ chn in.i.d1.RtoL = 1)
To verify this rule, we constructed the checking model for this rule in CSP as follows.
Trace Checking Model
process
  MONITOR S 4(Jnode, Mnode, y, flg is first) =
    chn in?i?x?d →
      if x = Join confirm
      then MONITOR S 4(Jnode ∪ {i}, Mnode ∪ {i}, y, flg is first)
      else
        if x = Leave confirm
        then MONITOR S 4(Jnode \ {i}, Mnode \ {i}, y, flg is first)
        else
          if x ∈ Data ∧ d = RtoL ∧ x = y
          then MONITOR S 4(Jnode, Mnode ∪ {i}, y, false)
          else MONITOR S 4(Jnode, Mnode, y, flg is first)
    □
    chn out?i?dta?d →
      if d = RtoL then
        if Jnode = Mnode ∨ flg is first = true
        then MONITOR S 4(Jnode, ∅, dta, false)
        else chn err!err multicast RtoL → Stop
      else MONITOR S 4(Jnode, Mnode, y, flg is first)
  VERI DATA MULTICAST =
    SYS |[ {|chn in, chn out|} ]| MONITOR S 4(∅, ∅, null, true)
If SYS does not satisfy the requirement vi), then checking process MONITOR S 4(∅,∅,null,true) will send out a message “err multicast RtoL”, and then stop. Therefore, the entire checking process VERI DATA MULTICAST will deadlock.
If SYS satisfies this requirement, then MONITOR S 4(∅,∅,null,true) will not stop, so VERI DATA MULTICAST will not deadlock.
We translate this trace checking model into CSPM model(Appendix A) and then check the model using FDR. The checking result of the VERI DATA MULTICAST using FDR is deadlock free.
So it is proved that SYS satisfies the requirement vi).
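The bookkeeping of MONITOR S 4 can be sketched in Python for one trace. The sketch assumes events are tuples (channel, node, payload, direction) with the channel given as the string "chn in" or "chn out", and that Data is passed in as a set; the name monitor_s4 is hypothetical. Here jnode plays the role of Jnode and mnode the role of Mnode, the members that have already received the current datum y.

```python
def monitor_s4(trace, data):
    """Return False if new data arrives from OUTSIDE before every member
    has received the previous datum."""
    jnode, mnode, y, first = set(), set(), None, True
    for (chan, i, x, d) in trace:
        if chan == "chn in":
            if x == "Join confirm":
                jnode.add(i)
                mnode.add(i)      # a new member is not owed the current datum
            elif x == "Leave confirm":
                jnode.discard(i)
                mnode.discard(i)
            elif x in data and d == "RtoL" and x == y:
                mnode.add(i)      # member i has received the current datum
        elif chan == "chn out" and d == "RtoL":
            if jnode == mnode or first:
                mnode, y, first = set(), x, False
            else:
                return False      # corresponds to 'err multicast RtoL'
    return True
```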
• Correctness checking for the requirement vii).
Once Session Manager receives data from the nodes which join the group, it forwards the data to outside. This requirement can be specified using CSP as follows.
∀ tr ∈ traces(SYS), ∀ i ∈ Node, ∀ d1, d2 ∈ Data, ∀ s, t in tr •   · · · (4-5)
  (s ⌢ ⟨chn in.i.d1.LtoR⟩ ⌢ t ⌢ ⟨chn in.i.d2.LtoR⟩ = tr ∧
   t ↾ {chn in.i.x.LtoR | x ∈ Data} = ⟨⟩) ⇒
  t ↓ chn out.i.d1.LtoR = 1
To verify this rule, we constructed the trace checking model for this rule in CSP as follows.
Trace Checking Model
process
  MONITOR S 5(i, y, flg is relayed) =
    chn out.i?x?d →
      if x = y ∧ d = LtoR
      then MONITOR S 5(i, y, true)
      else MONITOR S 5(i, y, flg is relayed)
    □
    chn in.i?dta?d →
      if d = LtoR ∧ dta ∈ Data then
        if flg is relayed = true
        then MONITOR S 5(i, dta, false)
        else chn err!err relay LtoR → Stop
      else MONITOR S 5(i, y, flg is relayed)
  VERI DATA RELAY =
    SYS |[ {|chn in, chn out|} ]| (||| i : Node • MONITOR S 5(i, null, true))
If SYS does not satisfy the requirement vii), then the checking process MONITOR S 5(i,null,true) will send out the message “err relay LtoR”, and then stop. Therefore, the entire checking process VERI DATA RELAY will deadlock.
If SYS satisfies this requirement, then MONITOR S 5(i,null,true) will not stop, so VERI DATA RELAY will not deadlock.
We translate this trace checking model into CSPM model(Appendix A) and then check the model using FDR. The checking result of the VERI DATA RELAY using FDR is deadlock free. So it is proved that SYS satisfies the requirement vii).
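The per-node flag of MONITOR S 5 can be sketched in Python for one trace. The sketch assumes events are tuples (channel, node, payload, direction) and keeps, for each node, the datum that has been sent but not yet seen on “chn out”; a node with such a pending datum corresponds to flg is relayed = false. The name monitor_s5 is hypothetical.

```python
def monitor_s5(trace, data):
    """Return False if a node sends new data before its previous data
    has been forwarded to OUTSIDE."""
    pending = {}  # node -> datum sent by the node and not yet relayed
    for (chan, i, x, d) in trace:
        if d != "LtoR" or x not in data:
            continue  # only node-to-OUTSIDE data events matter here
        if chan == "chn in":
            if i in pending:
                return False  # corresponds to 'err relay LtoR'
            pending[i] = x
        elif chan == "chn out" and pending.get(i) == x:
            del pending[i]    # the datum of node i has been relayed
    return True
```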
4.4.2 Simple Liveness checking
Above, we proved the safety properties of the system.
In addition, for the liveness of the system, we try to prove the following three properties.
First, can the number of member nodes eventually reach the maximum number k; in other words, can the group eventually be full? (We call this liveness possibility MEMBER FULL.)
Second, can each node eventually join the group? This property also means that each node which joins the group can eventually leave the group. (We call this liveness possibility EVENTUALLY JOIN.)
Third, can a member communicate data with SM at least once before leaving the group? (We call this liveness possibility AT LEAST ONCE DATA.)
• Proving the “MEMBER FULL” possibility liveness
The property that the group can eventually be full can be specified using CSP as follows.
∃ tr ∈ traces(SYS) •
  #(tr ↾ {chn in.i.Join confirm.RtoL | i ∈ Node}) =   · · · (4-6)
  #(tr ↾ {chn in.i.Leave confirm.RtoL | i ∈ Node}) + k
Instead of proving the specification (4-6) directly, we try to prove the safety of its negation (4-7).
∀ tr ∈ traces(SYS) •
  #(tr ↾ {chn in.i.Join confirm.RtoL | i ∈ Node}) <   · · · (4-7)
  #(tr ↾ {chn in.i.Leave confirm.RtoL | i ∈ Node}) + k
Because we have already proved (4-2), the operator “≠” obtained by negating (4-6) becomes the operator “<” in the specification (4-7).
If we fail to prove the safety of (4-7), then we can say that (4-6) is true.
The specification (4-7) can be modeled using CSP as follows.
Trace Checking Model
process
  MONITOR L 1(i) =
    if i = k
    then chn err!err full → Stop
    else
      chn in?j.x.d →
        if x = Join confirm
        then MONITOR L 1(i + 1)
        else
          if x = Leave confirm
          then MONITOR L 1(i − 1)
          else MONITOR L 1(i)
  VERI MEM FULL = SYS |[ {|chn in|} ]| MONITOR L 1(0)
If the specification (4-7) holds, then MONITOR L 1(0) will not stop, so the entire checking process VERI MEM FULL will not deadlock.
By contrast, if the specification (4-7) does not hold, then MONITOR L 1(0) will send out the message “err full” and stop, so VERI MEM FULL will deadlock.
We translate this trace checking model VERI MEM FULL into a CSPM model (Appendix A) and then check the model using FDR. The result is that VERI MEM FULL deadlocks with the message “err full”.
So (4-7) is invalid, which means that (4-6) is true.
We can say that the group can be full eventually.
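The argument "the negated safety property fails, so a witness trace exists" amounts to a reachability question. The following Python sketch illustrates it on an abstraction that we assume here for illustration only: the state is the set of members, a “Join confirm” adds a non-member while fewer than k nodes are members, and a “Leave confirm” removes a member. The function names are hypothetical, and this is not what FDR computes internally.

```python
from collections import deque


def reachable_member_sets(nodes, k):
    """Breadth-first search over the membership sets reachable from the empty group."""
    start = frozenset()
    seen = {start}
    queue = deque([start])
    while queue:
        j = queue.popleft()
        successors = []
        if len(j) < k:
            successors += [j | {i} for i in nodes if i not in j]  # Join confirm
        successors += [j - {i} for i in j]                        # Leave confirm
        for s in successors:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return seen


def group_can_be_full(nodes, k):
    """True if some reachable state has exactly k members, as in (4-6)."""
    return any(len(j) == k for j in reachable_member_sets(nodes, k))
```

With three nodes and k = 2 a full group is reachable; with a single node and k = 2 it is not, and in that case the negated safety property would hold.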
• Proving the “EVENTUALLY JOIN” possibility liveness
The property that each node can join the group eventually can be modeled using CSP as follows.
∀ i ∈ Node, ∃ tr ∈ traces(SYS )•
tr ↓ chn in.i.Join confirm.RtoL > 0 · · · (4 − 8)
Instead of proving the specification (4-8), we are going to try to prove the safety of its negation (4-9).
∃ i ∈ Node, ∀ tr ∈ traces(SYS )•
tr ↓ chn in.i.Join confirm.RtoL = 0 · · · (4 − 9)
If we fail to prove the safety of (4-9), then we can say that (4-8) is true.
The specification (4-9) can be modeled using CSP as follows.
Trace Checking Model
process
  MONITOR L 2(s) =
    if s = Node
    then chn err!err join liveness → Stop
    else
      chn in?j.x.d →
        if x = Join confirm
        then MONITOR L 2(s ∪ {j})
        else MONITOR L 2(s)
  VERI CANNOT JOIN EXIST = SYS |[ {|chn in|} ]| MONITOR L 2(∅)
If the specification (4-9) holds, then MONITOR L 2(∅) will not stop, so the entire checking process VERI CANNOT JOIN EXIST will not deadlock.
By contrast, if the specification (4-9) does not hold, then MONITOR L 2(∅) will send out the message “err join liveness” and stop, so VERI CANNOT JOIN EXIST will deadlock.
We translate this trace checking model VERI CANNOT JOIN EXIST into CSPM model(Appendix A) and then check the model using FDR.
The result is that VERI CANNOT JOIN EXIST deadlocks with the message “err join liveness”.
So the specification (4-9) is invalid which means that (4-8) is true.
We can say that each node can join the group eventually.
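What MONITOR L 2 accumulates can be sketched in Python on a single trace: the set of nodes that have received “Join confirm” so far. The witness trace below is hand-written under the assumption of three nodes and k = 2 (so one node must leave before the third can join); FDR finds such a trace automatically. The function name is hypothetical.

```python
def all_nodes_confirmed(trace, nodes):
    """True if every node receives 'Join confirm' somewhere in the trace."""
    joined = {j for (j, x, d) in trace if x == "Join confirm"}
    return joined == set(nodes)


# Hand-written witness for Node = {1, 2, 3} and k = 2.
witness = [
    (1, "Join confirm", "RtoL"),
    (2, "Join confirm", "RtoL"),
    (1, "Leave confirm", "RtoL"),
    (3, "Join confirm", "RtoL"),
]
```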
• Proving the “AT LEAST ONCE DATA” possibility liveness
The property that a member can do the data communication with SM at least once before leaving the group can be modeled using CSP as follows.
∀ i ∈ Node, ∃ tr ∈ traces(SYS), ∀ s, t in (tr ↓ chn in.i) •   · · · (4-10)
  (s ⌢ ⟨Join confirm.RtoL⟩ ⌢ t ⌢ ⟨Leave confirm.RtoL⟩ = (tr ↓ chn in.i) ∧
   t ↾ {Join confirm.RtoL, Leave confirm.RtoL} = ⟨⟩) ⇒
  t ↾ {x.y | x ∈ Data, y ∈ Direction} ≠ ⟨⟩
The specification (4-10) can be modeled using CSP as follows.
Trace Checking Model
process
  MONITOR L 3(communi set) =
    if communi set = Node
    then chn err!at least once data communi → Stop
    else
      chn in?n?x?d →
        if x ∈ Data
        then MONITOR L 3(communi set ∪ {n})
        else MONITOR L 3(communi set)
  VERI AT LEAST ONCE DATA COMMUNI = SYS |[ {|chn in|} ]| MONITOR L 3(∅)
If the specification (4-10) holds, then MONITOR L 3(∅) will send out the message “at least once data communi” and stop, so the entire checking process VERI AT LEAST ONCE DATA COMMUNI will deadlock.
By contrast, if the specification (4-10) does not hold, then MONITOR L 3(∅) will not stop, so VERI AT LEAST ONCE DATA COMMUNI will be deadlock free.
We translate this trace checking model VERI AT LEAST ONCE DATA COMMUNI into a CSPM model (Appendix A) and then check the model using FDR.
The result is that VERI AT LEAST ONCE DATA COMMUNI deadlocks, sending out the message “at least once data communi”, which means that the specification (4-10) holds.
We can say that a member can do data communication with SM at least once before leaving.
4.5 Mutation test
So far, we have checked all the safety and liveness properties of the system in order to verify it.
The method that we mainly use to verify that a system Sys satisfies a property Spec is the trace monitor strategy, in which the deadlock checking result of Sys in parallel with a monitor tells us whether Sys satisfies Spec, that is, Sys ⊒ Spec.
Of course, in cases where the trace monitor strategy is inadequate, we use the refinement checking functions of FDR directly; there our target is to model Spec as a CSP process and then check the refinement Sys ⊒ Spec in FDR.
In both cases it is vitally important to gain confidence that the trace monitor or the model reflecting Spec is correct, that is, that the test cases are correct.
In the case of trace monitors, we make a simple check to see whether our monitors can in fact detect errors in the specification: if they returned the same result whatever the system did, they would not be much use!
For example, we tried a small change to SYS: in SM we changed the condition
#Jnode ≥ k to #Jnode > k
to check whether the trace monitor designed for the safety property that the number of nodes which join the group does not exceed the limit k behaves as we expect, namely that in this case VERI MEM LIM deadlocks, implying that Sys does not satisfy this safety property.
The result is that the trace monitor discovers this specification error: it shows that in this case Sys does not satisfy the safety property by making the process VERI MEM LIM deadlock.
This is an example of the principle of mutation testing being applied to formal verification.
In mutation testing small changes in a program are introduced in order to see if the test cases discover them by reporting an error.
If no error is reported, either the change makes no difference to the correct functionality of the program or, since the functionality is now wrong, the test cases are inadequate.
Thus it is a procedure for checking the effectiveness of a set of test cases.
Here we are checking the effectiveness of a set of correctness conditions and associated monitors.
Similar and more extensive mutation checks could be done for all the specifications.
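The mutation described above can be reproduced in a small Python sketch. It is only an illustration under our own assumptions: the session manager is reduced to its join decision, every node requests to join once, and the limit check is a plain counter in the spirit of the monitor for requirement iv). The names join_all and within_limit are hypothetical; the parameter full stands for the comparison applied to (#Jnode, k), which is ≥ in the original and > in the mutant.

```python
import operator


def join_all(nodes, k, full=operator.ge):
    """Replies of the session manager when each node in turn requests to join."""
    jnode, trace = set(), []
    for i in nodes:
        if i in jnode:
            reply = "Joined already"
        elif full(len(jnode), k):
            reply = "Join refusal"
        else:
            reply = "Join confirm"
            jnode.add(i)
        trace.append((i, reply, "RtoL"))
    return trace


def within_limit(trace, k):
    """Counter monitor: False as soon as more than k nodes are confirmed members."""
    count = 0
    for (_node, x, _direction) in trace:
        if x == "Join confirm":
            count += 1
        elif x == "Leave confirm":
            count -= 1
        if count > k:
            return False
    return True
```

With three nodes and k = 2, the original condition keeps the count within the limit, while the mutant with operator.gt confirms a third member and the counter monitor reports the violation.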
5 Protocol considering the simple packet loss
Here we design and verify a simple protocol which models the situation in which the group contains one loser that sometimes discards data packets sent through it.
This simple protocol models a reliable multicast communication system in an unreliable network (Figure 7).
5.1 Requirements of the system
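[Figure: the nodes i, j, k (Node) are connected to Loser through Chn_node, and Loser is connected to SM through Chn_sm; the diagram is labelled SYSTEM.]
Figure 7: The physical topology of the protocol.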
This protocol must satisfy the following requirements.
i) There is one group communication system which consists of one session manager (SM), one Loser and several member nodes.
ii) SM sends dta1 to all nodes.
iii) Each node sends Acks to SM only after it receives dta1.
iv) When SM has received Acks from all nodes, the system is OK.
v) Loser relays the messages between the nodes and SM, but discards (does not relay) every third message among all the messages it receives.