3.3.1 Reverse Engineering LTS Model From Low-Level Traces
Walkinshaw et al. [28] used dynamic analysis to generate a list of execution traces that can serve as input to grammar inference techniques. These low-level traces must first pass through an abstraction process to obtain traces at a higher level of abstraction. They integrated the reverse-engineering technique embodied in QSM into a testing framework. Their approach consists of four activities:
1. Dynamic analysis: This activity generates a collection of system execution traces, each of which is a sequence of method calls.
2. Abstraction: This activity defines a function that takes the low-level traces obtained in activity 1 as input and returns equivalent sequences of functions at a higher level of abstraction as output.
3. Trace abstraction: The abstraction function from activity 2 is applied to the set of traces derived in activity 1. It returns a finite set of abstract function sequences, which is passed as input to the next activity.
4. QSM: In this activity QSM is applied to the abstract function sequences. Walkinshaw et al. [28] improved the QSM algorithm by modifying the question generator and by adding a facility for supplying negative sequences, which eliminate invalid edges in the resulting machine.
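Activities 2 and 3 can be illustrated with a minimal sketch. The method names, the abstract labels, and the rule that unmapped calls are simply dropped are all assumptions made for illustration; they are not taken from [28].

```python
from typing import Dict, Iterable, Optional, Sequence, Set, Tuple

# Hypothetical mapping from low-level method calls to abstract functions.
# A value of None marks a call that is irrelevant at the abstract level.
ABSTRACTION: Dict[str, Optional[str]] = {
    "Document.read": "Load",
    "Buffer.insert": "Edit",
    "Document.write": "Save",
    "Window.dispose": "Close",
    "Logger.debug": None,
}


def abstract_trace(trace: Sequence[str]) -> Tuple[str, ...]:
    """Activity 2: map one low-level trace to an abstract function sequence.

    Calls that are unmapped, or mapped to None, are dropped (an assumption).
    """
    labels = (ABSTRACTION.get(call) for call in trace)
    return tuple(label for label in labels if label is not None)


def abstract_traces(traces: Iterable[Sequence[str]]) -> Set[Tuple[str, ...]]:
    """Activity 3: apply the abstraction to every trace.

    The result is a finite set, so low-level traces that differ only in
    irrelevant calls collapse into a single abstract sequence.
    """
    return {abstract_trace(trace) for trace in traces}


low_level_traces = [
    ["Document.read", "Logger.debug", "Buffer.insert", "Document.write"],
    ["Document.read", "Buffer.insert", "Logger.debug", "Document.write"],
    ["Document.read", "Window.dispose"],
]
abstract_set = abstract_traces(low_level_traces)
```

In this sketch the first two low-level traces differ only in the position of a logging call, so they yield the same abstract sequence, and the set passed on to QSM contains two sequences rather than three.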
As in the original QSM, Walkinshaw et al. [28] used EDSM to select the pair of states to merge. Their QSM framework [28] makes a slight modification to the membership query generator of the original QSM algorithm [36]. The improved generator derives membership queries from the merged graph, because new sequences can appear as a result of merging and determinization. It creates queries by concatenating the shortest prefix that leads to the red state with the suffixes of the merged state in the graph obtained after merging.
Example 3.5. Let us return to the example of the text editor in Figure 3.18. Merging states B and C results in a new machine, as illustrated in Figure 1.3, in which a new edge labelled save is added to the red state labelled BCG. The improved generator returns the following list of questions: Improved Queries = {⟨Load, Save⟩, ⟨Load, Edit, Close, Load⟩}.
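The prefix-suffix construction described above can be sketched as follows. This is an illustration under stated assumptions, not the generator of [28]: the machine is a hypothetical merged graph rather than the one in the figures, suffixes are bounded by a hypothetical `max_len` parameter so that loops do not produce infinitely many queries, and sequences whose answer is already known are filtered out.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Deterministic LTS: state -> {label: successor state}.
LTS = Dict[str, Dict[str, str]]
Seq = Tuple[str, ...]


def shortest_prefix(lts: LTS, initial: str, target: str) -> Seq:
    """Breadth-first search for a shortest label sequence from initial to target."""
    queue = deque([(initial, ())])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for label, successor in sorted(lts.get(state, {}).items()):
            if successor not in seen:
                seen.add(successor)
                queue.append((successor, path + (label,)))
    raise ValueError(f"state {target!r} is unreachable from {initial!r}")


def suffixes(lts: LTS, state: str, max_len: int) -> List[Seq]:
    """All non-empty label sequences of length <= max_len that leave `state`."""
    result: List[Seq] = []
    frontier = [(state, ())]
    for _ in range(max_len):
        next_frontier = []
        for current, path in frontier:
            for label, successor in sorted(lts.get(current, {}).items()):
                extended = path + (label,)
                result.append(extended)
                next_frontier.append((successor, extended))
        frontier = next_frontier
    return result


def generate_queries(lts: LTS, initial: str, merged_state: str,
                     known: Set[Seq], max_len: int = 2) -> Set[Seq]:
    """Concatenate the shortest prefix to the merged state with its suffixes."""
    prefix = shortest_prefix(lts, initial, merged_state)
    candidates = {prefix + suffix for suffix in suffixes(lts, merged_state, max_len)}
    return candidates - known  # ask only about sequences not yet classified


# Hypothetical graph after a merge: state "BC" has gained a Save edge.
merged = {
    "A": {"Load": "BC"},
    "BC": {"Edit": "BC", "Save": "D", "Close": "A"},
    "D": {},
}
already_known = {("Load", "Edit"), ("Load", "Close")}
queries = generate_queries(merged, "A", "BC", already_known, max_len=1)
```

With the suffix bound set to 1, the only sequence not already covered by the known traces is the one that exercises the newly introduced Save edge, which is the kind of question the merged graph makes visible.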
3.3.2 Reverse Engineering LTS Model Using LTL Constraints
Walkinshaw and Bogdanov [77] proposed a technique that uses temporal constraints in the model inference process. The main reason for introducing LTL constraints into DFA inference is to reduce the reliance upon execution traces.
Their technique [77] allows LTL constraints to be supplied alongside the list of traces used to infer a state machine. A model checker is then used to ensure that the hypothesis machine does not violate any temporal rule. If the hypothesis machine violates a defined rule, the model checker generates counterexamples, which are fed back to the learner so that inference can start again.
This technique [77] can be run in either a passive or an active manner. In passive learning, the developer provides the LTL constraints up front, alongside the traces. The inference process starts by generating an APTA from the provided positive and negative traces. Pairs of states are then selected iteratively using the EDSM learner with the red-blue framework, and the pair with the highest score is picked for merging. Once a hypothesis machine is obtained after merging a pair of states, it is passed to the model checker to ensure that it does not violate the LTL constraints [77]. If any of the provided LTL properties is violated, the model checker returns a counterexample and the inference process is restarted [77].
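The restart-on-counterexample loop can be sketched as below. Several simplifications are assumed: the `infer` callback is a hypothetical stand-in for the EDSM learner, the checker handles only one fixed constraint pattern (label `second` must never immediately follow label `first`, i.e. G(first -> X !second)) rather than arbitrary LTL, and the counterexample is recorded as a negative trace before inference restarts.

```python
from collections import deque
from typing import Callable, Dict, Optional, Set, Tuple

LTS = Dict[str, Dict[str, str]]  # state -> {label: successor state}
Seq = Tuple[str, ...]


def find_counterexample(lts: LTS, initial: str,
                        forbidden: Tuple[str, str]) -> Optional[Seq]:
    """Return a shortest path in which forbidden[1] directly follows forbidden[0].

    Toy replacement for a model checker: it searches (state, last label)
    pairs breadth-first and returns None if the constraint holds.
    """
    first, second = forbidden
    queue = deque([(initial, None, ())])
    seen = {(initial, None)}
    while queue:
        state, last, path = queue.popleft()
        for label, successor in sorted(lts.get(state, {}).items()):
            extended = path + (label,)
            if last == first and label == second:
                return extended
            if (successor, label) not in seen:
                seen.add((successor, label))
                queue.append((successor, label, extended))
    return None


def infer_with_constraint(infer: Callable[[Set[Seq], Set[Seq]], LTS],
                          positive: Set[Seq], negative: Set[Seq],
                          initial: str,
                          forbidden: Tuple[str, str]) -> Tuple[LTS, Set[Seq]]:
    """Re-run the learner until its hypothesis satisfies the constraint."""
    negative = set(negative)  # work on a copy of the caller's set
    while True:
        hypothesis = infer(positive, negative)
        counterexample = find_counterexample(hypothesis, initial, forbidden)
        if counterexample is None:
            return hypothesis, negative
        if counterexample in negative:
            # The learner ignored a trace it was already given; stop looping.
            raise RuntimeError("learner accepts a known negative trace")
        negative.add(counterexample)  # restart with the new negative trace
```

The guard against a repeated counterexample is a design choice of this sketch: it prevents an endless loop when the learner passed in does not honour its negative traces.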
Walkinshaw and Bogdanov [77] also showed that the QSM learner can benefit from the integration of LTL constraints. As in the passive case described above, learning starts by building an APTA from the sequences and then merges states iteratively, calling the model checker to find any contradiction with the LTL constraints. When the model checker returns no counterexamples, the active algorithm checks the correctness of a merge of two states by asking queries in the same manner as the QSM learner. This differs from passive learning in that it continues to merge states when the model checker returns no counterexamples [77].
An advantage of the active learning strategy is that the QSM learner attempts to find undiscovered sequences by asking queries. Moreover, new LTL properties can be added to help confirm or reject new scenarios that appear during the inference process [77].
Walkinshaw and Bogdanov [77] stated that LTL constraints are very helpful in reducing the number of traces required to generate the exact machine, and that without such constraints a considerable number of traces is required to infer an accurate model. However, identifying LTL constraints is itself a barrier, because it requires effort and a large number of traces [77]. The drawback of inferring a state-machine model using LTL constraints is that it still relies upon the developer to provide reasonable constraints, which requires additional effort [77].
Walkinshaw and Bogdanov [77] showed that the number of membership queries can be reduced with the aid of LTL constraints. In summary, if a large number of constraints is supplied alongside the traces, a large number of queries can be avoided during the inference process [77].