
CHAPTER 5 WORD LEVEL FEATURE DISCOVERY TO ENHANCE

5.5 Simulation Guided Weakest Precondition Computation to Discover Word Level Features

5.5.1 Representing RTL as CDFGs

In this section, we introduce the data structures used in the simulation guided weakest precondition computation that discovers word level features. We first use a Verilog parser to transform the Verilog design into a control data flow graph (CDFG). Figure 5.3 shows the CDFG of the motivating Verilog example in Figure 5.1. There are three kinds of nodes in a CDFG: a branch node (e.g., b1) corresponds to a branch statement in RTL; an assignment node (e.g., b2) corresponds to an assignment statement in RTL; and a merge node (e.g., b6) corresponds to the end of a branch.
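The three node kinds can be sketched as a small data structure. This is only an illustration: the class names, the `successors` field, and the edge list for b1 to b7 are assumptions reconstructed from the node roles and concrete paths quoted in this section, not the parser's actual representation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class NodeKind(Enum):
    BRANCH = "branch"  # a branch statement in RTL
    ASSIGN = "assign"  # an assignment statement in RTL
    MERGE = "merge"    # the end of a branch


@dataclass
class CdfgNode:
    name: str
    kind: NodeKind
    successors: List[str] = field(default_factory=list)


def build_first_process() -> Dict[str, CdfgNode]:
    """Assumed shape of the first always process (b1..b7)."""
    nodes = [
        CdfgNode("b1", NodeKind.BRANCH, ["b2", "b3"]),
        CdfgNode("b2", NodeKind.ASSIGN, ["b7"]),
        CdfgNode("b3", NodeKind.BRANCH, ["b4", "b5"]),
        CdfgNode("b4", NodeKind.ASSIGN, ["b6"]),
        CdfgNode("b5", NodeKind.ASSIGN, ["b6"]),
        CdfgNode("b6", NodeKind.MERGE, ["b7"]),
        CdfgNode("b7", NodeKind.MERGE, []),  # assumed to close the outer branch
    ]
    return {n.name: n for n in nodes}


def is_path(graph: Dict[str, CdfgNode], names: List[str]) -> bool:
    """Return True if every consecutive pair of names is joined by a CDFG edge."""
    return all(b in graph[a].successors for a, b in zip(names, names[1:]))
```

With this shape, a recorded simulation path is just a list of node names that forms a walk through the graph.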

A multiple-cycle path in RTL is a path that is executed across multiple cycles. The Verilog program is unrolled, and the variables in each cycle are annotated with the corresponding cycle index. Each path corresponds to a set of assignment statements and conditional expressions. The path condition for an assignment statement is the conjunction of all conditional expressions leading to the execution of that assignment statement on the path. The CDFG records multi-cycle paths during simulation. Figure 5.3 shows two concrete paths, in cycle 1 and cycle 2. The concrete path in cycle 1 is b1 − b3 − b4 − b6 − b7 in the first always process and b8 − b10 − b12 − b13 − b16 − b18 in the second always process. The concrete path in cycle 2 is b1 − b3 − b5 − b6 − b7 in the first process and b8 − b10 − b12 − b13 − b15 − b17 in the second process. These paths are used to guide the weakest precondition computation.

The use-definition (UD) chain of a variable points to all statements that assign it. UD-chains are used to compute the weakest precondition and to track the variables in the logic cone of the target. Figure 5.3 shows the UD-chain for variable id_insn in b13. Statements in b2, b4 and b5 define this variable. Note that a non-blocking assignment ("<=") in a clock triggered process means the assigned value is used in the next cycle.
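A UD-chain can be viewed as a map from a variable to the nodes that assign it. The sketch below is a minimal illustration under that assumption; `build_ud_chains` and the `some_other_reg` entry are hypothetical and are included only to show the grouping.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (assignment node, assigned variable). The id_insn entries follow the text;
# the last entry is a hypothetical extra variable, not taken from Figure 5.3.
ASSIGNMENTS: List[Tuple[str, str]] = [
    ("b2", "id_insn"),
    ("b4", "id_insn"),
    ("b5", "id_insn"),
    ("bX", "some_other_reg"),
]


def build_ud_chains(assignments: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group assignment nodes by the variable they define."""
    chains: Dict[str, List[str]] = defaultdict(list)
    for node, var in assignments:
        chains[var].append(node)
    return dict(chains)
```

Looking up `id_insn` in the result returns the three defining nodes that a use in b13 would point to.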

5.5.2 Weakest Precondition Computation in RTL

In the example shown in Figure 5.1, we assume the postcondition predicate is id_insn[31:26] = `OR32_ADDI. We backward substitute the variables used in the postcondition with their definitions. There are three definitions, in b2, b4 and b5. We must simultaneously consider the path conditions for the variables used in the postcondition predicate. The resulting weakest precondition is computed as follows:

Example (1): static weakest precondition computation

wp(T, id_insn[31:26] = `OR32_ADDI)
  = ((rst ∨ flushpipe) ⇒ 6'h5 = `OR32_ADDI)
  ∧ (¬(rst ∨ flushpipe) ∧ ¬id_freeze ⇒ if_insn[31:26] = `OR32_ADDI)
  ∧ (¬(rst ∨ flushpipe) ∧ id_freeze ⇒ id_insn[31:26] = `OR32_ADDI)
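The structure of Example (1) is one implication per definition: the path condition of the definition implies the postcondition with the defined value substituted in. The sketch below builds that conjunction as text. It is an illustration only: `static_wp` and `post` are hypothetical names, the predicates are kept as strings rather than parsed expressions, and the reset value is written as 6'h5.

```python
from typing import Callable, List, Tuple


def static_wp(defs: List[Tuple[str, str]], post: Callable[[str], str]) -> str:
    """Conjoin (path condition => substituted postcondition) over all definitions."""
    clauses = [f"({cond} ⇒ {post(rhs)})" for cond, rhs in defs]
    return "\n∧ ".join(clauses)


# (path condition, value assigned to id_insn[31:26]) for b2, b4 and b5.
ID_INSN_DEFS: List[Tuple[str, str]] = [
    ("(rst ∨ flushpipe)", "6'h5"),
    ("¬(rst ∨ flushpipe) ∧ ¬id_freeze", "if_insn[31:26]"),
    ("¬(rst ∨ flushpipe) ∧ id_freeze", "id_insn[31:26]"),
]


def post(value: str) -> str:
    """Postcondition with the substituted value in place of id_insn[31:26]."""
    return f"{value} = `OR32_ADDI"


if __name__ == "__main__":
    print(static_wp(ID_INSN_DEFS, post))
```

Every definition contributes a clause regardless of whether it is ever exercised, which is why the static result grows with the number of assignments.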

We employ the RTL weakest precondition to derive word level features from the conditional expressions in RTL. To guarantee that the resulting word level features are in terms of primary inputs, we set k as the mining window length and use all word level conditional expressions as postcondition predicates. These conditional expressions are within both the logic cone of the given target and the mining window. If postcondition P is in cycle i within the mining window, wp^{i−1} is computed.

5.5.3 Simulation Guided wp Computation

Statically computing the weakest precondition generates very complex and unreadable predicates. Assignments to the same variable on different paths are all considered in the weakest precondition computation, and the path condition for each assignment is also included in the resulting expression. In example (1), the path condition for if_insn[31:26] = `OR32_ADDI is ¬(rst ∨ flushpipe) ∧ ¬id_freeze.

In addition, the path conditions for different variables used in postcondition P may conflict. If multiple variables are used in the postcondition predicate, the path conditions for the assignments to these variables are conjoined. However, static wp computation is unaware of the satisfiability of the resulting condition. In other words, the conjoined path conditions for different variables may be infeasible.

A further problem arises when computing wp^k for large k. We transitively track the definitions of all variables used in postcondition predicates until primary inputs or constants are reached, so the resulting weakest precondition is prone to blow-up. In example (1), to compute wp^1 we must find the definitions of the id_insn used in b5, since it is not in terms of primary inputs. There are three definitions of it in the previous cycle. As a result, 9 paths are taken into account.
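One way to read the count of 9 is as three candidate definitions in each of two unrolled cycles. The sketch below enumerates those combinations; it is an assumption about how the figure is reached, and `static_paths` is a hypothetical helper that ignores whether a combination is feasible.

```python
from itertools import product
from typing import List, Tuple

# Nodes that define id_insn in one cycle, as listed in the text.
DEFS: List[str] = ["b2", "b4", "b5"]


def static_paths(defs: List[str], cycles: int) -> List[Tuple[str, ...]]:
    """All combinations of one definition per unrolled cycle."""
    return list(product(defs, repeat=cycles))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, len(static_paths(DEFS, k)))
```

Under this reading the number of combinations is 3 to the power of the number of cycles, so it grows exponentially with the window length.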

We use a dynamic, simulation guided weakest precondition computation to replace the static computation. The RTL design is first simulated using either directed or random tests. All concrete paths are recorded during the simulation. We limit the backward substitution to the concrete simulation path. In this way, we can disregard the path conditions in the wp computation, since there is only one assignment to any variable used in postcondition P along the concrete simulation path.

In the example in Figure 5.3, the concrete simulation paths in cycles 1 and 2 are shown. Given the postcondition predicate P: id_insn[31:26] = `OR32_ORI in cycle 2, we want to use the simulation guided method to discover word level features. The definition of id_insn on the concrete path is in statement b4. Using substitution, we discover the word level feature if_insn[31:26] = `OR32_ORI. The word level feature discovered by simulation guided wp computation is simple and readable.
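The substitution along a concrete path can be sketched as a lookup: find the single assignment to the variable on the previous cycle's recorded path and replace the variable with its right-hand side. This is a minimal sketch under stated assumptions: `guided_wp`, `ASSIGNS`, and `TRACE` are hypothetical names, only the first always process is modelled, and the bit slice [31:26] is left implicit.

```python
from typing import Dict, List, Tuple

# node -> (assigned variable, right-hand side); values follow Example (1).
ASSIGNS: Dict[str, Tuple[str, str]] = {
    "b2": ("id_insn", "6'h5"),
    "b4": ("id_insn", "if_insn"),
    "b5": ("id_insn", "id_insn"),
}

# Concrete paths of the first always process, per simulation cycle.
TRACE: Dict[int, List[str]] = {
    1: ["b1", "b3", "b4", "b6", "b7"],
    2: ["b1", "b3", "b5", "b6", "b7"],
}


def guided_wp(var: str, cycle: int, trace: Dict[int, List[str]]) -> Tuple[str, int]:
    """Replace var at `cycle` by its definition on the previous cycle's path.

    Returns (expression, cycle in which the expression is evaluated). Because a
    non-blocking assignment takes effect in the next cycle, the definition is
    looked up one cycle earlier.
    """
    prev = cycle - 1
    for node in trace[prev]:
        entry = ASSIGNS.get(node)
        if entry is not None and entry[0] == var:
            return entry[1], prev
    return var, prev  # no assignment on the path: the register holds its value
```

No path condition appears in the result, because the recorded path already fixes which assignment executed.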

We simulate the RTL design using directed or random tests to guide the wp^i computation. The simulation path may span millions of cycles, which is much larger than the mining window length len. However, the concrete paths used in the wp^i computation should span at most len cycles. We resolve this problem by shifting the mining window during the simulation. Initially, simulation cycles 1 to len are in the mining window. Then cycles 2 to len + 1 form the new mining window. In this way, the mining window is shifted by one cycle every simulation cycle. The concrete paths in every mining window can be used to guide the wp^i computation.
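The window shifting described above can be sketched as a generator over cycle indices. `mining_windows` is a hypothetical helper, and cycles are numbered from 1 as in the text.

```python
from typing import Iterator, List


def mining_windows(num_cycles: int, length: int) -> Iterator[List[int]]:
    """Yield each window of `length` consecutive cycles, shifted by one cycle."""
    for start in range(1, num_cycles - length + 2):
        yield list(range(start, start + length))


if __name__ == "__main__":
    for window in mining_windows(num_cycles=4, length=2):
        print(window)
```

Each yielded window selects the slice of the recorded concrete path that guides one round of wp computation.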

We set the mining window length to 2 in Figure 5.3, and the word level target is alu_op = `ALU_OR. There are several conditional expressions within both the mining window and the logic cone of alu_op = `ALU_OR. Only id_insn⟨2⟩[31:26] = `OR32_ORI (b13 and b16) is at word level. The remaining conditional expressions flushpipe⟨1⟩ (b1), flushpipe⟨2⟩ (b10), id_freeze⟨1⟩ (b3), id_freeze⟨2⟩ (b10), and ex_freeze⟨2⟩ (b12) are selected as bit level features. Recall that wp^1(id_insn⟨2⟩[31:26] = `OR32_ORI) = (if_insn⟨1⟩[31:26] = `OR32_ORI). When the mining window shifts to simulation cycle 2, the word level predicate in cycle 3 is the same as that in cycle 2. In this case, the definition in b5 is used: wp^1(id_insn⟨2⟩[31:26] = `OR32_ORI) = (id_insn⟨1⟩[31:26] = `OR32_ORI). We can see that the discovered word level features do not suffer from the blow-up problem even if we increase k in the wp^k computation. It should be noted that the concrete cycle numbers express the relative cycle order within the mining window. They are replaced with the X operator if assertions are expressed in LTL.
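The replacement of relative cycle numbers by the X operator can be sketched as follows. This assumes that relative cycle 1 is the current state and that each later cycle adds one X; `to_ltl` is a hypothetical helper, and predicates are plain strings.

```python
def to_ltl(atom: str, cycle: int) -> str:
    """Prefix an atom with one X (next) operator per cycle after the first."""
    if cycle < 1:
        raise ValueError("relative cycle index starts at 1")
    return "X " * (cycle - 1) + f"({atom})"


if __name__ == "__main__":
    print(to_ltl("if_insn[31:26] = `OR32_ORI", 1))
    print(to_ltl("id_insn[31:26] = `OR32_ORI", 2))
```

Because the indices are relative to the window, the same conversion applies to every window position.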

Simulation is not exhaustive, so it cannot cover all feasible paths reaching postcondition P. However, finding a complete set of predicates as features for mining is not required in the context of assertion generation, because assertion mining does not try to extract the complete function of the given target. In addition, our method cannot guarantee that the extracted word level features are in terms of every primary input within the target's logic cone. In this situation, we simply treat each bit of the input variables as a bit level feature.

In document Harmonizing data mining and static analysis to tackle hardware and system level verification (Page 108-111)