Protocol Anomaly Detection and Verification
4.3 SDL Modeling For Prototyping Packet Verifier
The purposes of Packet Verifier are to validate compliance with standards and to detect protocol anomalies. Packet Verifier checks the protocol header of each packet, verifies the packet size, checks the TCP/UDP header length, verifies the TCP flags and all other packet parameters, verifies the TCP protocol type, and analyses the TCP header and its flags. The goal of using the Specification and Description Language (SDL) [CCITT(1992)] is not to define a formal description of the TCP verification model, but rather to provide some assurance that the TCP verification model under development is complete and performs the functions that were intended. Using SDL allowed us to locate errors in the requirements of Packet Verifier.
The specification was first made by hand (Figure 4.10) and then expressed in SDL to accomplish the requirements of the TCP verification model. SDL is an International Telecommunication Union (ITU) standard based on the concept of a system of Communicating Extended Finite State Machines (CEFSMs) [Hopcroft and Ullman(1979)]. To understand how SDL works on top of the CEFSM, it is necessary to address the dynamic semantics of the finite state machine, SDL’s underlying model, and the generation of the TCP verification model using SDL. Rapidly developing a model in this way was useful for testing and validating the behaviour of the verification model under development. The process uncovered various ambiguities, unspecified transitions, and a deadlock within the draft verification model, which helped to ensure that at least the errors found were fixed and the fixes applied to development.
4.3.1 Dynamic Semantics Of Finite State Machines
SDL is based on the concept of CEFSMs, which communicate with each other and their common environment by signals in an asynchronous manner via possibly delaying communication paths. These signals are buffered on arrival at a process.
A finite state machine (FSM) is defined as a 4-tuple < S, s0, E, f >, where S is a set of states, s0 is an initial state, E is a set of events with their parameter lists, and f is a state transition relation. However, the construction of an FSM is limited by the state-explosion problem. An extended finite-state machine (EFSM) addresses this problem by introducing variables in addition to the explicit states of the process instance. These variables act as implicit states, since each of them can take on a number of values.
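As an illustration of the 4-tuple, the following minimal Python sketch stores f as a dictionary from (state, event) pairs to next states. The two-state `door` example and all identifiers are hypothetical and are not part of the Packet Verifier model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class FSM:
    states: FrozenSet[str]                    # S
    initial: str                              # s0
    events: FrozenSet[str]                    # E (parameter lists omitted)
    transitions: Dict[Tuple[str, str], str]   # f: (state, event) -> next state

    def run(self, inputs: Iterable[str]) -> str:
        """Consume the inputs one by one and return the final state."""
        state = self.initial
        for event in inputs:
            # A KeyError here means f is not defined for this (state, event).
            state = self.transitions[(state, event)]
        return state


# Hypothetical example: a door that opens on "push" and closes on "pull".
door = FSM(
    states=frozenset({"closed", "open"}),
    initial="closed",
    events=frozenset({"push", "pull"}),
    transitions={("closed", "push"): "open", ("open", "pull"): "closed"},
)

final_state = door.run(["push", "pull", "push"])
```

Every distinguishable situation needs its own entry in `states` here, which is the root of the state-explosion problem mentioned above.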
An EFSM is an FSM extended with variables. It is defined as a 5-tuple < S, s0, E, f, V >, where S, s0, E, and f are as in the FSM and V is a set of local variables along with their types and initial values, if any. Each state in an EFSM is defined by a set of variables, including the state name. A transition T of an EFSM becomes [< s, v1, . . . , vn > + input∗, task∗; output∗ + < s', v1', . . . , vn' >], where s and s' are the names of states, < v1, v2, . . . , vn > and < v1', v2', . . . , vn' > are the values of the extended variables before and after the transition, n is the number of variables, “+” means coexistence, “;” means a sequence of events such as tasks and outputs, and “[,]” denotes a sequenced pair. The difference between an EFSM and an FSM is that an EFSM associates each transition not only with input and output actions but also with assignment actions and conditions [Wang and Liu(1993)].
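The role of conditions and assignment actions can be sketched in Python as follows. This is only an illustration under assumed names (`Transition`, `EFSM`, `fire`, the counter example); it shows how one variable replaces a chain of explicit counting states.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vars = Dict[str, int]


@dataclass
class Transition:
    source: str
    event: str
    guard: Callable[[Vars], bool]    # condition over the variables V
    action: Callable[[Vars], Vars]   # assignment action producing v1', ..., vn'
    target: str


class EFSM:
    def __init__(self, initial: str, variables: Vars, transitions: List[Transition]):
        self.state = initial
        self.vars = dict(variables)
        self.transitions = transitions

    def fire(self, event: str) -> bool:
        """Take the first enabled transition; return False if none is enabled."""
        for t in self.transitions:
            if t.source == self.state and t.event == event and t.guard(self.vars):
                self.vars = t.action(self.vars)
                self.state = t.target
                return True
        return False


LIMIT = 3  # hypothetical bound; a plain FSM would need one state per count

counter = EFSM(
    initial="counting",
    variables={"n": 0},
    transitions=[
        Transition("counting", "tick", lambda v: v["n"] + 1 < LIMIT,
                   lambda v: {"n": v["n"] + 1}, "counting"),
        Transition("counting", "tick", lambda v: v["n"] + 1 >= LIMIT,
                   lambda v: {"n": v["n"] + 1}, "done"),
    ],
)

results = [counter.fire("tick") for _ in range(4)]
```

The fourth `tick` is not accepted because no transition is defined for the `done` state.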
A communicating extended finite-state machine (CEFSM) combines the definition of an EFSM with signals [Jan Ellsberger and Sarma(1997)]. The presence of signals implies that channels exist. A CEFSM is defined as a 6-tuple < S, s0, E, f, V, X >, where S, s0, E, f, and V are as in the EFSM and X is a set of signals. In a CEFSM, signals carry information from within the CEFSM to other automata, some of which may be located in the environment of the system. The signals account for the observable behaviour, which is more important for a specification than the actual model. In SDL, CEFSM processes use signals to communicate with other CEFSMs and with the environment.
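A minimal Python sketch of two communicating machines is given below. The `Process` class, the REQ/RSP signals, and the single-peer wiring are assumptions made for illustration. Signals are buffered in a FIFO queue on arrival, and a signal with no matching transition is discarded without a state change, which is a modelling choice of this sketch.

```python
from collections import deque


class Process:
    """CEFSM sketch: a state, local variables, and a buffered input queue."""

    def __init__(self, name, initial, transitions):
        self.name = name
        self.state = initial
        self.vars = {"consumed": 0}      # V: one illustrative local variable
        self.inbox = deque()             # signals are buffered on arrival
        self.transitions = transitions   # (state, signal) -> (next state, output)
        self.peer = None                 # destination of output signals

    def step(self):
        """Consume at most one buffered signal; return True if one was consumed."""
        if not self.inbox:
            return False
        signal = self.inbox.popleft()
        # Unmatched signals are dropped and the state is kept (sketch assumption).
        next_state, output = self.transitions.get(
            (self.state, signal), (self.state, None))
        self.state = next_state
        self.vars["consumed"] += 1
        if output is not None and self.peer is not None:
            self.peer.inbox.append(output)
        return True


server = Process("server", "idle", {("idle", "REQ"): ("busy", "RSP")})
client = Process("client", "waiting", {("waiting", "RSP"): ("done", None)})
server.peer, client.peer = client, server

server.inbox.append("REQ")   # a signal arriving from the environment

# Run until neither process has a buffered signal left.
while server.step() or client.step():
    pass
```

After the loop, `server` has moved to `busy` and `client` to `done`, each having consumed exactly one signal.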
4.3.2 SDL’s Underlying Model
The language SDL is intended for the formal specification of complex, event-driven, real-time, and interactive applications involving many concurrent activities that communicate using discrete signals. It is especially well suited for the specification of communication protocols, reactive systems such as switches, routers and distributed systems.
SDL has been designed for the specification and description of the behaviour of such systems, i.e., the internetworking of the system and its environment. SDL allows the hierarchical description of systems. The description starts from a construction called system, into which functional blocks are inserted. A block is a component composed of one or more processes and/or other blocks, and the processes in a block are connected by signal routes. A process contains sequential behaviour, while concurrency is modelled by a set of processes. Each process is a CEFSM. These machines or processes run in parallel. They are independent of each other and communicate with discrete messages, called signals.
A process can also send signals to and receive signals from the environment of the system.
The behaviour of a state machine is characterized by a set of transitions. A transition to another state or to the same state occurs whenever an input is consumed. When a process is in a state, it accepts input. This input can be a signal received at the input port or a timer. A transition terminates when the process enters a new state.
A CEFSM enables decisions to be made within transitions based on the value of a variable, so that the state that follows the consumption of a specific input is not determined solely by the current state and the input.
SDL supports two equivalent notations: the graphical notation (SDL-GR) and the textual notation (SDL-PR). SDL-GR is a standardized graphical representation of the system. SDL elements such as system, block, process, and signal are drawn using standardized graphical symbols. SDL-PR is a textual phrase representation of the SDL system; in other words, it is the SDL source code.
4.3.2.1 Process Model
The ITU-T Z.100 standard defines the underlying model of SDL as a CEFSM (Communicating Extended Finite State Machine), and every process is a CEFSM. For each process, a finite number of states, inputs and outputs determine its behaviour. Non-determinism makes it possible to represent spontaneous transitions, which are transitions that occur without any signal causing them. This is useful for describing unpredictable system characteristics. In SDL, only one input signal can be consumed and evaluated at each instant. This means that each consumed input signal corresponds to one state transition in an SDL description.
4.3.2.2 Communication Model
The concurrency model used in SDL allows processes to operate independently and asynchronously. There is no guaranteed relative ordering of operations in distinct processes, except the ordering created by explicit synchronization among processes through the use of shared signals. Shared signal events are therefore the means by which a precise ordering of events in distinct processes can be achieved.
The communication between processes is reliable. It is assured that the receiving process will consume every signal produced by a sender process. However, it is not guaranteed that the order in which signals are generated by all processes is the same as the order in which they are consumed. This model is adequate for describing events without precise ordering, such as systems whose normal flow can be interrupted. Handshaking or unbounded queues (in practice, bounded queues) are used to implement the communication model. In both cases, each SDL state results in a set of protocol communication signals and area overhead to implement the protocol. This characteristic may cause a large communication overhead, which can penalize the implementation.
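The two properties described above (every signal is eventually consumed, but the global order across senders is not fixed) can be illustrated with a small Python simulation. The `deliver` function, the sender names `P1` and `P2`, and the use of a seeded random scheduler to stand in for delaying channels are all assumptions of this sketch; it additionally assumes that each individual channel is FIFO.

```python
import random
from collections import deque


def deliver(channels, seed):
    """Drain per-sender FIFO channels into one receiver in an arbitrary interleaving.

    channels maps a sender name to the list of signals it emitted, in order.
    The seeded scheduler stands in for unpredictable channel delays.
    """
    rng = random.Random(seed)
    pending = {name: deque(signals) for name, signals in channels.items()}
    received = []
    while any(pending.values()):
        # Pick any sender that still has a signal in flight.
        ready = sorted(name for name, queue in pending.items() if queue)
        name = rng.choice(ready)
        received.append((name, pending[name].popleft()))
    return received


sent = {"P1": ["a1", "a2", "a3"], "P2": ["b1", "b2"]}
trace = deliver(sent, seed=0)

# Signals of one sender, in the order the receiver consumed them.
from_p1 = [signal for name, signal in trace if name == "P1"]
```

Different seeds give different interleavings of `P1` and `P2`, but `from_p1` always equals the order in which `P1` sent its signals, and no signal is lost.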
4.3.3 Generating the Specification
The TCP verification model is specified as a CEFSM and is presented in SDL in this section. As defined above, a CEFSM is a 6-tuple < S, s0, E, f, V, X >:
• S is a set of states
• s0 is an initial state
• E is a set of events with their parameter lists
• f is a state transition relation
• V is a set of local variables along with their types and initial values, if any
• X is a set of signals
For a state, an input event, and a predicate composed of a subset of V, the state transition relation f has a next state, a set of output events and their parameters, and an action list describing how the local variables are updated.
The purpose of SDL in this project is to verify whether the simplified TCP verification model follows the standard TCP transitions. To do this, the simplified TCP verification model (Figure 4.10) was converted into an SDL specification. The CEFSM of the simplified TCP verification model is as follows:
• S = {listen, syn rcvd, ack wait, established, closing, close wait 1, close wait 2, closed}
• s0 = listen
• E = {send(Vi, Xi), recv(Vi, Xi), timeout(Vi)}
• f : {f (Si, Ei, Vi) → (Si+1, Vi+1, Ei+1) }
• V = {tcp id, tcp seq, tcp id seq}
• X = { ACK, SYN, FIN, RST, SYNACK, ACKFIN}
In this SDL specification, the TCP flags PSH and URG are not included. Timeouts, flag checking, and packet-sequence checking should also be dealt with in the low-level implementation.
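For reference, the components of this tuple other than f can be written down as plain Python data, together with a few consistency checks. The `CEFSM` container, the underscore spelling of the identifiers, the zero initial values of V, and the `check` helper are assumptions made for this sketch and are not taken from the SDL specification itself.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class CEFSM(NamedTuple):
    S: FrozenSet[str]    # states
    s0: str              # initial state
    E: FrozenSet[str]    # event kinds (parameter lists omitted)
    V: Dict[str, int]    # local variables with assumed initial values
    X: FrozenSet[str]    # signals


tcp_model = CEFSM(
    S=frozenset({"listen", "syn_rcvd", "ack_wait", "established",
                 "closing", "close_wait_1", "close_wait_2", "closed"}),
    s0="listen",
    E=frozenset({"send", "recv", "timeout"}),
    V={"tcp_id": 0, "tcp_seq": 0, "tcp_id_seq": 0},  # initial values assumed
    X=frozenset({"ACK", "SYN", "FIN", "RST", "SYNACK", "ACKFIN"}),
)


def check(model: CEFSM) -> List[str]:
    """Return a list of problems found in the tuple (empty if none)."""
    problems = []
    if model.s0 not in model.S:
        problems.append("initial state is not in S")
    for flag in ("PSH", "URG"):
        if flag in model.X:
            problems.append(flag + " is excluded from this specification")
    return problems


problems = check(tcp_model)
```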
4.3.4 SDL Creation based on the TCP Verification Model
To detect packet fragmentation, the SDL specification can recall the packet sequence and the proper flag, while the low-level implementation cooperates with this SDL specification by handling the other flag combinations and timeouts. Cinderella SDL [Cinderella(2003)] was used to build the SDL specification. Figure 4.12 shows part of the StateTransition process built in SDL. The complete SDL-GR and SDL-PR of the proposed TCP verification model can be found in Appendix A.
f is the state transition relation. It represents how to move from the current state to a new state given a certain action if any.
Note that ‘, ,’ in the position of the variables V means that no variable is changed, and ‘{ }’ in the position of the events E means that no specific event is required.
Figure 4.12: System of the TCP Protocol State Machine
• LISTEN State.
f(listen, recv(tcp id, SYN), tcp id seq = 0) → (syn rcvd, tcp id seq = tcp seq, send(tcp id, SYNACK))
Figure 4.13: LISTEN State of the TCP Protocol State Machine
• SYN RCVD State.
f(syn rcvd, recv(tcp id, RST), tcp id seq != 0) → (listen, tcp id seq = 0, {}),
f(syn rcvd, send(tcp id, SYNACK), tcp id seq != 0) → (ack wait, , {})
Figure 4.14: SYN RCVD State of the TCP Protocol State Machine
• ACK WAIT State.
f(ack wait, recv(tcp id, ACK), tcp id seq != 0) → (established, tcp id seq = tcp seq, {}),
f(ack wait, recv(tcp id, ACKFIN), tcp id seq != 0) → (close wait 2, tcp id seq = tcp seq, {}),
f(ack wait, timeout(tcp id), tcp id seq != 0) → (closed, tcp id seq = 0, {F}),
f(ack wait, recv(tcp id, FIN), tcp id seq != 0) → (closing, , send(tcp id, ACK))
Figure 4.15: ACK WAIT State of the TCP Protocol State Machine
• CLOSING State.
f(closing, recv(tcp id, ACK), tcp id seq != 0) → (close wait 1, tcp id seq = tcp id seq, {}),
f(closing, timeout(tcp id), tcp id seq != 0) → (closed, tcp id seq = 0, {F})
Figure 4.16: CLOSING State of the TCP Protocol State Machine
• CLOSE WAIT 2 State.
f(close wait 2, recv(tcp id, ACK), tcp id seq != 0) → (closed, tcp id seq = 0, {F}),
f(close wait 2, timeout(tcp id), tcp id seq != 0) → (closed, tcp id seq = 0, {F})
Figure 4.17: CLOSE WAIT 2 State of the TCP Protocol State Machine
• ESTABLISHED State.
f(established, recv(tcp id, RST), tcp id seq != 0) → (closed, tcp id seq = 0, {F}),
f(established, recv(tcp id, SYN), tcp id seq != 0) → (closed, tcp id seq = 0, {F}),
f(established, recv(tcp id, FIN), tcp id seq != 0) → (close wait 2, , send(tcp id, ACK)),
f(established, timeout(tcp id), tcp id seq != 0) → (closed, tcp id seq = 0, {F})
Figure 4.18: ESTABLISHED State and CLOSE WAIT 1 State of the TCP Protocol State Machine
• CLOSE WAIT 1 State.
f(close wait 1, send(tcp id, ACK), tcp id seq != 0) → (closed, tcp id seq = 0, {F})
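The transitions listed above can be collected into a table and exercised with a short Python sketch. Several choices here are assumptions: the identifiers are spelled with underscores, the `{F}` marker is stored as an opaque string because its meaning is not defined in this section, `tcp_seq` is passed in as a parameter of each event, and the helpers `step`, `run`, and `reachable` are hypothetical names. The guards follow the listing: `tcp_id_seq = 0` in the listen state and `tcp_id_seq != 0` everywhere else.

```python
SET, RESET, KEEP = "set", "reset", "keep"  # tcp_id_seq = tcp_seq / = 0 / unchanged
F = "F"                                    # opaque marker from the listing

# (state, event kind, signal) -> (next state, variable update, output)
TABLE = {
    ("listen", "recv", "SYN"): ("syn_rcvd", SET, "SYNACK"),
    ("syn_rcvd", "recv", "RST"): ("listen", RESET, None),
    ("syn_rcvd", "send", "SYNACK"): ("ack_wait", KEEP, None),
    ("ack_wait", "recv", "ACK"): ("established", SET, None),
    ("ack_wait", "recv", "ACKFIN"): ("close_wait_2", SET, None),
    ("ack_wait", "timeout", None): ("closed", RESET, F),
    ("ack_wait", "recv", "FIN"): ("closing", KEEP, "ACK"),
    ("closing", "recv", "ACK"): ("close_wait_1", KEEP, None),
    ("closing", "timeout", None): ("closed", RESET, F),
    ("close_wait_2", "recv", "ACK"): ("closed", RESET, F),
    ("close_wait_2", "timeout", None): ("closed", RESET, F),
    ("established", "recv", "RST"): ("closed", RESET, F),
    ("established", "recv", "SYN"): ("closed", RESET, F),
    ("established", "recv", "FIN"): ("close_wait_2", KEEP, "ACK"),
    ("established", "timeout", None): ("closed", RESET, F),
    ("close_wait_1", "send", "ACK"): ("closed", RESET, F),
}


def step(state, tcp_id_seq, kind, signal=None, tcp_seq=0):
    """Apply f once; return (next state, tcp_id_seq, output) or None if unspecified."""
    guard_ok = (tcp_id_seq == 0) if state == "listen" else (tcp_id_seq != 0)
    entry = TABLE.get((state, kind, signal))
    if entry is None or not guard_ok:
        return None
    next_state, update, output = entry
    if update == SET:
        tcp_id_seq = tcp_seq
    elif update == RESET:
        tcp_id_seq = 0
    return next_state, tcp_id_seq, output


def run(events, state="listen", tcp_id_seq=0):
    """Replay (kind, signal, tcp_seq) events and return the visited states."""
    visited = [state]
    for kind, signal, tcp_seq in events:
        result = step(state, tcp_id_seq, kind, signal, tcp_seq)
        if result is None:
            visited.append("unspecified")
            break
        state, tcp_id_seq, _ = result
        visited.append(state)
    return visited


def reachable(start="listen"):
    """Return every state reachable from start when guards are ignored."""
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for (source, _, _), (target, _, _) in TABLE.items():
            if source == current and target not in seen:
                seen.add(target)
                frontier.append(target)
    return seen


# Hypothetical session: handshake, then a FIN from the peer and the final ACK.
session = [("recv", "SYN", 100), ("send", "SYNACK", 100), ("recv", "ACK", 101),
           ("recv", "FIN", 102), ("recv", "ACK", 103)]
visited = run(session)
sinks = {t[0] for t in TABLE.values()} - {k[0] for k in TABLE}
```

Replaying a packet that has no entry in the table, such as an ACK in the listen state, ends the trace with `unspecified`; this is the kind of gap a low-level implementation would have to treat as an anomaly. The `reachable` and `sinks` helpers give a rough, guard-free check for unreachable states and for states without outgoing transitions.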