A majority of testing occurred out of system and was done to verify functionality of the GTMR structure and memory interfaces. Exhaustive testing of all XUM softcore instructions was not conducted; the softcore was assumed to function properly since no major changes were applied to it. Instead, the primary test program was a slightly modified version of the UART demo program provided in the original source code for the XUM softcore, which outputs a character every half second and echoes back any characters received. To minimize hardware differences between the flight board and development board, the Genesys 2 development board was used. This board features a Kintex-7 FPGA of the same size and speed grade as that of the flight hardware, so its performance should be similar. It also contains Flash, DDR3, and SD card memories on board, so we could verify those interfaces. Because a majority of the testing occurred on the Genesys 2 development board and not flight hardware, the communications interface primarily used was XUM’s UART module directly connected to a personal computer (PC) vice the NPSAT PC-104 interface. Internal logic analyzers, which communicate via the Joint Test Action Group (JTAG) signal lines between the PC and FPGA, were used extensively to both debug and verify the design.

1. Fault Injection Testing

Fault testing was performed by manipulating the system clock lines. By forcing one of the clocks to run at a different speed from the other two, we observed the effect that an SET on an individual system clock would have on the softcore. Two results can occur from an SET on a system clock: the affected system progresses to its next state earlier than it should, or parts of the affected system slip into a metastable state and either progress to the next state or miss several clock cycles. Testing of SEUs on individual registers is not necessary, as a system-wide effect essentially causes an SEU in every register of the affected system.

The case where an SET causes the affected system to progress to the next state is shown in Figure 19. This is simulated by running one of the system clocks at twice the frequency of the other two system clocks. As can be seen at sample-time two, all three systems are synchronized. There is a slight propagation delay in the incoming instruction seen in systems zero and one; however, by sample-time three those outputs become stable. At sample-time four, system clock zero experiences its SET. We see that system zero has transitioned to its next state based on its vote for the program counter (PC). While systems one and two maintain that the PC should be 0000000Ch, system zero’s vote has changed to 0000020Ch. Though system zero’s vote changed, the majority vote of all PCs, which drives follow-on combinational logic in all three systems, remains 0000000Ch. At sample-time six, we see all the systems resynchronize and agree that the PC should be 0000020Ch, system zero having been fully restored.
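
The voting step described above can be illustrated with a minimal sketch. It assumes a bitwise 2-of-3 majority function, which is a common way to build a TMR voter; the text does not state whether the GTMR voter is bitwise or word-wise, and the function name `majority_vote` is hypothetical. The PC values are those reported for sample-time four.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three equally wide words (assumed voter form)."""
    return (a & b) | (a & c) | (b & c)


# PC votes of systems zero, one, and two at sample-time four.
pc_votes = [0x0000020C, 0x0000000C, 0x0000000C]

voted_pc = majority_vote(*pc_votes)
print(f"{voted_pc:08X}h")  # prints 0000000Ch
```

Because two of the three inputs agree, the voted value equals the agreeing value regardless of what the faulted system reports.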

The case where an SET causes the affected system’s PC to go into a simulated metastable state and not progress is shown in Figure 20. This is accomplished by running one of the system clocks at one quarter the speed of the other two system clocks. At the rising edge of system clocks zero and one, the votes for the PC are updated, and since they are the same, the true vote sent to all follow-on logic in all three systems is the correct value. Due to the simulated metastable state of system two’s PC, its vote cannot be relied upon, and in this example it is frozen at the value it held when it entered this simulated metastable state. This does not necessarily have to be the case; in a true metastability, the register could capture new values sporadically. Regardless, at sample-time 268, system two’s PC registers can be seen to have recovered, and system two is once again synchronized with the other two systems.
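
The two clock-fault experiments can be approximated with a small behavioral model. This is a sketch under stated assumptions, not the hardware design: each system holds one PC register, all next-state logic is fed by the voted PC, and `next_pc` is a hypothetical stand-in that simply advances by four bytes (the captured traces include a jump from 0000000Ch to 0000020Ch, which is not modeled here). Clock periods are expressed in arbitrary fine time steps, with a nominal period of four steps.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority (assumed voter form)."""
    return (a & b) | (a & c) | (b & c)


def next_pc(pc):
    """Hypothetical next-state logic: sequential fetch of 4-byte instructions."""
    return pc + 4


def run(periods, steps):
    """Simulate three PC registers whose clocks have the given periods.

    Returns a list of (step, register values, voted PC) tuples.
    """
    regs = [0, 0, 0]
    trace = []
    for t in range(steps):
        voted = majority_vote(*regs)      # value seen by all next-state logic
        for i, period in enumerate(periods):
            if t % period == 0:           # rising edge on system i's clock
                regs[i] = next_pc(voted)
        trace.append((t, tuple(regs), majority_vote(*regs)))
    return trace


fast = run([2, 4, 4], 16)    # system zero clocked at twice the nominal rate
slow = run([4, 4, 16], 20)   # system two clocked at one quarter the nominal rate

print(fast[2][1])    # (8, 4, 4): system zero has advanced early
print(slow[12][1])   # (16, 16, 4): system two is frozen at a stale value
print(slow[16][1])   # (20, 20, 20): system two reloads from the voted PC
```

In both runs the voted PC follows the fault-free sequence at every step, and the faulted register is overwritten from the voted value at its next common clock edge, which mirrors the resynchronization seen in Figures 19 and 20.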

Fault injection into configuration memory was not tested. The primary method for verifying the fault tolerance of the configuration memory would have been to subject the device to radiation testing. Once again, due to the extremely limited development time, we were not able to conduct radiation testing, and this needs to be verified on orbit. Since the SEM module does produce error signals, we will be able to differentiate errors resulting from configuration memory upsets from those originating in user memory.

Figure 19. Waveform Capture Demonstrating GTMR Correction of a Fault Induced by a Fast Clock

Figure 20. Waveform Capture Demonstrating GTMR Correction of a Fault Induced by a Slow Clock

2. Performance Testing

Periodically throughout development, waveforms were captured to determine how quickly data were being passed between the various levels of memory in an effort to determine where the softcore could be optimized. As shown previously in Figures 19 and 20, we were able to provide the XUM processors with new instructions every single clock cycle from the L1 cache, which is the optimal case. The penalty incurred for an L1 cache miss is depicted in Figure 21. By counting the number of samples for which Inst_Ready was low and dividing by four, since the 200 MHz sample rate is four times the 50 MHz system clock, we determined the number of CPU clock cycles the processor was stalled. Inst_Ready was low for samples 376 to 442, 76 total samples. This translates to a miss penalty of 19 processor clock cycles. During these 19 clock cycles, we transferred 1024 bits of data, or 32 instructions, to the L1 cache. Finally, the miss penalty for the L2 cache is shown in Figure 22. As with the L1 cache, we determined the total miss penalty by observing the number of samples for which Inst_Ready is low, which in this case is exactly the miss penalty in clock cycles because the sample rate was reduced to 50 MHz, matching the system clock. We see the total L2 cache miss time is approximately 26,500 clock cycles, a majority of which is spent with the SD card either internally locating and preparing the data (CurrentState = 0Ch) or shifting the data block to the FPGA (CurrentState = 0Dh). The L2 miss penalty could be further reduced if the 4-bit SD protocol were used vice the 1-bit SPI protocol; however, this only reduces the portion of the penalty incurred during CurrentState = 0Dh and saves approximately 3,000 clock cycles. In the case where a write-back must be performed prior to allocation of new data, these penalties essentially double.
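
The arithmetic behind these miss penalties can be reproduced with a short calculation. This is a sketch that uses only the figures quoted above (76 samples, 200 MHz and 50 MHz sample rates, a 50 MHz system clock, 1024 bits per L1 fill, roughly 26,500 and 3,000 cycles for the L2 case); the 32-bit instruction width is inferred from 1024 bits being equated to 32 instructions, and all names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 50_000_000   # system (CPU) clock
INSTRUCTION_BITS = 32          # inferred: 1024 bits correspond to 32 instructions


def stall_cycles(samples_low, sample_rate_hz, clock_hz=SYSTEM_CLOCK_HZ):
    """Convert logic-analyzer samples with Inst_Ready low into CPU clock cycles."""
    samples_per_cycle = sample_rate_hz // clock_hz
    return samples_low // samples_per_cycle


def cycles_to_ns(cycles, clock_hz=SYSTEM_CLOCK_HZ):
    """Duration of a number of clock cycles in nanoseconds."""
    return cycles * 1_000_000_000 // clock_hz


# L1 miss: 76 samples captured at 200 MHz.
l1_cycles = stall_cycles(76, 200_000_000)
l1_instructions = 1024 // INSTRUCTION_BITS
print(l1_cycles, cycles_to_ns(l1_cycles), l1_instructions)  # 19 380 32

# L2 miss: captured at 50 MHz, so one sample equals one clock cycle.
l2_cycles = stall_cycles(26_500, 50_000_000)
l2_cycles_4bit = l2_cycles - 3_000   # estimated saving from the 4-bit SD protocol
print(l2_cycles, cycles_to_ns(l2_cycles) // 1000, l2_cycles_4bit)  # 26500 530 23500
```

At 50 MHz, the L1 miss penalty therefore corresponds to about 380 ns and the L2 miss penalty to about 530 µs.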

Figure 21. Waveform Capture of an L1 Cache Miss with a Sample Rate of 200 MHz

Figure 22. Waveform Capture of an L2 Cache Miss with a Sample Rate of 50 MHz
