
9.5 Examples for Annotations to Generate Finite Sequences

Two examples follow of how a description in LLS is annotated to generate the acyclic finite sequences for symbolic simulation described in section 4.1.3.

Microprogram-architecture example

Fig. 9.6 and 9.7 demonstrate how the user can indicate the completion of an instruction in the implementation of Example 4.2 (section 4.1.3). The same annotations are necessary for the verification of the second example in section 7.2.2. This example checks the equivalence of a structural description of an architecture with microprogram control and the corresponding behavioral specification. No cycle equivalence is given. Therefore, the sequences to be compared are the complete executions of one instruction.

Specification:

    behavioral specification

Implementation:

    instr_fetched←0; mad←2;
    annotated implementation
    instr_fetched←1;
    annotated implementation
    · · ·
    annotated implementation

Annotated implementation (one replicate):

    if mad=2 and instr_fetched
    then STALL;
    else structural description of implementation

Fig. 9.6: Sequences to be compared for microprogram example

Depending on the instruction, the execution of an instruction in the implementation of this microprogram architecture takes 8 to 10 cycles. Therefore, the description of the implementation is replicated 10 times, according to the maximum number. The completion of an instruction has to be defined beforehand by an annotation. Only the annotation of one replicate is shown on the right-hand side of Fig. 9.6; the other copies (annotated implementation) are identical.4 Initially, instr_fetched is cleared. Each instruction starts with the microprogram counter mad=2, which is reached again after the previous instruction terminates. instr_fetched is set after fetching the first instruction. The if-then-else clause evaluating instr_fetched prevents fetching an additional instruction if the first instruction takes fewer than 10 cycles, i.e., if mad=2 is reached again. In this case the then-branch with the STALL is taken, i.e., the register values are not changed in the remaining cycles. A replication of the behavioral specification is not necessary since it comprises one complete instruction.

4 Many different annotations are possible to achieve the same result as in Fig. 9.6. For

Fig. 9.7 describes the annotations added to the LLS description of the implementation. The design is described in the segment body of La. The sequence to simulate is given in the first two lines of the annotated implementation. The segment La is used 10 times since this is the maximum number of cycles for the execution of an instruction. The auxiliary register instr_fetched is introduced to account for instructions that take fewer than 10 cycles. It is cleared in L_init and set in L_mark to indicate whether an instruction has been started or not.

Implementation before annotations:

    La: structural description

Implementation after annotations:

    Segments to simulate:
    L_init, La, L_mark, La, La, La, La, La, La, La, La, La

    L_init: instr_fetched←0; mad←2;
    L_mark: instr_fetched←1;
    La:     if (mad=2) and instr_fetched then STALL;
            else structural description

Fig. 9.7: Annotations to generate the sequence to be simulated
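The annotation scheme of Fig. 9.7 can be illustrated with a small Python sketch. It is not LLS and not the tool's implementation: `structural_step` is a hypothetical stand-in for one cycle of the structural description, and the fields `cycles_left` and `work` are invented here only to make the effect of the STALL guard observable. The segment names and the guard condition are taken from the figure.

```python
MAX_CYCLES = 10  # maximum number of cycles per instruction, as stated in the text


def segments_to_simulate(max_cycles):
    """Segment sequence of Fig. 9.7: La appears max_cycles times in total."""
    return ["L_init", "La", "L_mark"] + ["La"] * (max_cycles - 1)


def structural_step(state):
    """Hypothetical stand-in for one cycle of the structural description."""
    nxt = dict(state)
    nxt["cycles_left"] -= 1
    nxt["work"] += 1  # models register changes caused by the instruction
    # mad returns to 2 once the instruction has terminated
    nxt["mad"] = 2 if nxt["cycles_left"] == 0 else state["mad"] + 1
    return nxt


def run_segment(name, state):
    if name == "L_init":
        return {**state, "instr_fetched": False, "mad": 2}
    if name == "L_mark":
        return {**state, "instr_fetched": True}
    if name == "La":
        if state["mad"] == 2 and state["instr_fetched"]:
            return dict(state)  # STALL: register values are not changed
        return structural_step(state)
    raise ValueError(f"unknown segment: {name}")


def simulate(instruction_length):
    """Run the annotated sequence for an instruction of the given length."""
    state = {"cycles_left": instruction_length, "work": 0,
             "mad": 0, "instr_fetched": True}
    for name in segments_to_simulate(MAX_CYCLES):
        state = run_segment(name, state)
    return state


for length in (8, 9, 10):
    print(length, simulate(length)["work"])
```

In this toy model an 8-cycle instruction executes eight structural cycles and then stalls for the remaining two replicates, so no second instruction is started.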

DLX-Example

Section 7.2.1 gives experimental results for the verification of a structural DLX description designed at Darmstadt University of Technology against a description of the DLX instruction set. Section 4.1.3 describes how to generate, for pipelined systems in general, the two finite sequences to be compared according to the approach of [BD94]. The annotations required for symbolic simulation of the given DLX example are discussed in the following.

The specification consists of flushing the pipeline followed by one serial execution. The implementation comprises fetching an instruction in the inner body of the pipeline loop followed by flushing the pipeline. Flushing the structural processor description is not automatic, unlike for the behavioral descriptions presented in section 7.1, since the different states of the pipeline are not described separately.

132 CHAPTER 9 Appendix

Only one structural description is given, which subsumes all pipeline states. The number of cycles to simulate symbolically for flushing depends on possible stalls. Nine false negatives occurred due to incorrect flushing. Such cases are fairly hard to anticipate, but the equivalence checker identified the cases that had not been considered, and correcting the flushing was simple. Note that the designer needed no insight into the verification process, only into his own design.

The improvements led to the flushing scheme sketched below. Four cycles are required to flush a 5-stage pipeline without stalls.

Example 9.4

Fig. 9.8 shows one of the cases with two load-interlocks, where flushing takes more than 4 cycles.

[Pipeline diagram. Instructions: LOAD R4,(400)R3; LOAD R3,(400)R2; LOAD R2,(400)R1. Stage rows: MEM WB; / EX MEM WB; / ID / EX MEM WB]

Fig. 9.8: Flushing with load-interlocks

Flushing can take up to 7 cycles. Therefore, generating the specification consists of linking the following segments:

• setting the stall-register and clearing the branch-flag if no branch is in the EX-stage, see below;

• 7 times the structural pipelined description, and

• the sequential (behavioral) description of the instruction set.

The branch-flag is set iff a branch terminating the ID-stage is taken, i.e., it can only be set if the operation in the EX-stage is a branch. Otherwise an impossible initial state is assumed, which leads to a false negative. Note that the necessity of this additional annotation was detected automatically, i.e., the designer got the hint by the false negative.
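The composition of the specification sequence can be sketched in Python as follows. The segment names and the state fields (`stall`, `branch_flag`, `ex_is_branch`) are hypothetical and chosen only for illustration. The sketch also assumes one particular reading of the first list item: the stall-register is set unconditionally, while the branch-flag is cleared only if no branch is in the EX-stage.

```python
FLUSH_CYCLES = 7  # worst-case number of flushing cycles stated in the text


def init_annotation(state):
    """Set the stall register; clear the branch flag unless EX holds a branch.

    Assumption: only the clearing of the branch flag is conditional.
    """
    new_state = {**state, "stall": True}
    if not state["ex_is_branch"]:
        new_state["branch_flag"] = False
    return new_state


def specification_segments():
    """Annotation, then 7 pipeline cycles, then the sequential description."""
    return (["init_annotation"]
            + ["structural_pipeline"] * FLUSH_CYCLES
            + ["sequential_instruction_set"])


print(specification_segments())
print(init_annotation({"stall": False, "branch_flag": True,
                       "ex_is_branch": False}))
```

With the flag cleared whenever the EX-stage holds no branch, the impossible initial state described above (branch-flag set without a branch in EX) is excluded from the symbolic simulation.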

One instruction is fetched before flushing in the implementation. But this instruction need not be fetched in the first cycle. There might be a stall due to a load-interlock or a taken branch, which delays the instruction fetch. Therefore, the worst-case number of cycles to simulate is 9.

Example 9.5

Fig. 9.9 gives an example where fetching one instruction and flushing afterwards takes 9 cycles. The branch is taken.

[Pipeline diagram. Instructions: ADD · · ·; BEQZ R3,· · ·; LOAD R3,(400)R2; LOAD R2,(400)R1. Stage entries: MEM WB / EX MEM WB / / / ID / IF ID EX MEM WB]

Fig. 9.9: Worst case number of cycles for fetching one instruction and flushing

The cycle in which the instruction is fetched and flushing has to begin must be determined. No instruction is fetched during a load-interlock. Furthermore, an instruction fetched is not executed after a taken branch. Therefore, an annotation is required after each of the first cycles, which sets the stall-register only if no taken branch or jump is in the EX-stage and no load-interlock occurred. An instruction fetched is not squeezed at least after three cycles. Flushing can begin at the latest after 5 cycles. The implementation consists of linking:

• clearing the branch-flag if no branch is in the EX-stage;

• 5 times

– the structural pipelined description followed by

– an annotation setting the stall-input if there was no taken branch, jump, or load-interlock;5

• 4 times the structural pipelined description.
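The implementation sequence listed above can be sketched in Python in the same spirit. Segment names and state fields (`taken_branch`, `jump`, `load_interlock`, `stall`) are hypothetical. The sketch only shows how the segments are linked and that the number of simulated pipeline cycles adds up to the worst case of 9 stated in the text.

```python
ANNOTATED_CYCLES = 5  # pipeline cycles that are each followed by an annotation
PLAIN_CYCLES = 4      # remaining pipeline cycles without annotation


def stall_annotation(state):
    """Set the stall input unless a taken branch, jump or load-interlock occurred."""
    blocked = state["taken_branch"] or state["jump"] or state["load_interlock"]
    if blocked:
        return dict(state)  # the instruction fetch is delayed; try again later
    return {**state, "stall": True}


def implementation_segments():
    """Link the segments of the implementation sequence in order."""
    segments = ["clear_branch_flag_annotation"]
    for _ in range(ANNOTATED_CYCLES):
        segments += ["structural_pipeline", "stall_annotation"]
    segments += ["structural_pipeline"] * PLAIN_CYCLES
    return segments


segments = implementation_segments()
print(segments.count("structural_pipeline"))  # 5 + 4 = 9 pipeline cycles
```

Placing the annotation after each of the first five cycles lets flushing start in whichever of those cycles the instruction is actually fetched, without the designer fixing that cycle in advance.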

5It is not necessary to test all these conditions in each of the 5 cycles. Therefore, the actual
