• No results found

4.4 Extracting testing knowledge

4.4.1 Extracting features from a simulation trace

Suppose we want to understand what conditions cause the branch predictor to mispredict. The first step is to extract all the information relevant to branch prediction from a simulation trace. This trace is generated by simulating an assembly program on an RTL, which in our case is a processor core. We use randomly generated assembly programs containing roughly 10,000 instructions each. RTL verification is done by co-simulating an RTL with its architectural model. The output of an RTL simulation, therefore, includes a cycle accurate log of all the architectural state changes. All state values measurements are done at the time of instruction retirement.

Fig. 4.4 shows us how a KOS simulation trace looks like. All the information regarding an instruction and the state changes caused by it are summarized in a block. The first line of the block indicates the cycle in which the instruction is re- tired. This is followed by several lines describing the various operations performed as a result of executing the instruction. To identify the nature of the operation being described, each line is assigned a key. For example, information about the instruction itself is indicated by the key Ex. As shown in the figure, this line indi- cates the core on which the instruction was executed, the instruction count in the

RTL Cycle number 13121

P000: 0000000116: Ex: 0000:F000:000000000000C1C3 (00000000FFFFC1C3) OP: MOV EAX,EDI P000: -- Mode: Rl16

P000: -- Opcode Bytes: 66 89 F8

P000: -- prefix[66] opcode1[89] modrm[f8]

P000: -- RAX = 00000000000C0000/000000008000C000 P000: -- RIP = 000000000000C1C3/000000000000C1C6 P000: -- Written: RAX RIP

RTL Cycle number 13122

P000: 0000000117: Ex:0000:F000:000000000000C1C6 (00000000FFFFC1C6) OP: ADD EAX,0x01000218 P000: -- Mode: Rl16

P000: -- Opcode Bytes: 66 05 18 02 00 01

P000: -- prefix[66] opcode1[05] imm1[18 02 00 01] P000: -- RAX = 000000008000C000/000000008100C218 P000: -- RFLAGS = 00000006/00000086

P000: -- RIP = 000000000000C1C6/000000000000C1CC P000: -- Written: RAX RFLAGS RIP

Time of retirement

Core: Instruction #: Key: TR Selector: CS Selector: Effective RIP (Linear RIP) Instruction

Register value changes

Summary

Figure 4.4: RTL simulation trace showing two instruction retirements test, Task Register (TR) selector, Code Segment (CS) selector, effective address, linear address and the instruction itself. In this case, the 116th instruction(OP: MOV EAX, EDI) is executed on Core 0. It was retired on the 13121th cycle. Table walks, code reads and memory reads are some other examples of processor operations. In Fig. 4.4, there is only one keyed line each in the two blocks. This is followed by a summary of the instruction. The first three lines indicate the user mode, opcode bytes and instruction bytes description. The rest of summary has all the register value changes and a final line that lists out all the registers which were written to.

Our goal here is to find the relationship between an event and the specific instruction/instruction sequence responsible for triggering it. Because we treat an event as a function of the state and the instruction triggering it, each data sample should include information about the preceding instructions in addition to itself. It allows us to uncover the correct sequence of instructions that can act as a trigger. Since instruction execution is out of order, microarchitectural events can also be influenced by instructions following a trigger. Even though they appear after a triggering instruction, they can be executed before the trigger. Fig. 4.5

illustrates how a dataset is created from an instruction sequence. ..

(0000000026E87E3A) PUSH ECX (0000000026E87E3B) BT EAX,0x02 (0000000026E87E3F) JB $+0x00000010 (0000000026E87E45) MOV EBX,0x15B88022 (0000000026E87E4F) MOV EDX,ECX

.. ..

(0000000026E87E84) XOR EAX,EAX (0000000026E87E86) MOV AL,[EDX + ECX] (0000000026E87E89) CMP EAX,0x00000020 (0000000026E87E8E) JL $+0x00000055 (0000000026E87EE3) SHL EBX,0x02

(0000000026E87EE6) ADD [EBP + EBX + 0x04],EAX (0000000026E87EEA) MOV EAX,0xFFFEFFFF .. Sample n Sample n + 1 Triggering instruction Triggering instruction True False ..

Figure 4.5: Use of a moving window of size 5, to generate a dataset for modeling branch mispredictions from a simulation trace. Annotation values are indicated in red.

To generate a data sample, we use a moving window that slides over the instruction sequence beginning with the first instruction. The size of the moving window is empirically set to an odd integer value w = 2p + 1 depending on the event being modeled. In the illustration, a window size of w = 5 has been used. A wider window takes the effect of more instructions into picture when describing an event. A narrow window localizes the effect. The first and the last 2p instructions in the test are ignored as they do not constitute a full window. If necessary, they can be included by using NOP instructions as padding.

In the example described by Fig. 4.5, features are extracted from a window only if the p+1th (triggering) instruction is predicted by the branch prediction unit (BPU). This includes all conditional and unconditional branches, and jumps and calls within the current code segment (near JMP and near CALL). Control transfer instructions such as far JMP, far CALL and RET instructions are ignored as they are not predicted. All other instructions that do not modify program control flow are also ignored. Thus, features are extracted from instructions specific to the target behavior. In cases where the target behavior is not instruction specific, feature extraction is performed for all instances of the moving window.

Features collected from each instance of the moving window is used to create a single data sample. For each instruction within the window, features that are rele- vant to branch prediction are extracted. We use domain knowledge to extract the

right features. For example, the type of a branch instruction is important (condi- tional or unconditional) but the user mode at the time of execution is irrelevant. Other features relevant to branch prediction are:

• Branch location

• Branch target

• Branch outcome - Taken or not taken

• Opcode of the branching instruction

Each moving window instance is also annotated with a binary value indicating whether or not the modeled event was successfully excited by the triggering in- struction, which is the (p + 1)th instruction. In our example, a label ’1’ (boolean True) indicates that the instruction was mispredicted. And a label ’0’ (boolean False) implies that an instruction was correctly predicted by the branch predic- tor. These annotation values are generated by tapping into the appropriate RTL signals. Whenever the target event is triggered, the next retired instruction is annotated with a value of ’1’. For example, if two branch instructions retire on the 200th and 205th cycles respectively, and a branch is mispredicted on the 203rd cycle, the second instruction is annotated with a value of ’1’.