
4.5.2.3. Single event functional interrupt (SEFI)

SEFIs represent the most disruptive form of non-destructive SEE. Although this type of anomaly had previously been predicted for space environments (Koga et al., 1985), the term single event functional interrupt (SEFI) was first mentioned

1996). SEFI is defined as any non-destructive failure mode that leads to the malfunction (or interruption of normal operation) of part or all of the device (Bougerol et al., 2008). This definition contrasts with that of certain authors, who define SEFI as the cause of a higher error rate than expected due to uniformly distributed upsets (Crain et al., 1999; LaBel et al., 1996).

The causes and effects of SEFIs vary with the type of component and the technology used. In general, a SEFI is linked to an upset (SET or SEU) in a control area that configures a specific function, and it leads to the loss of that function. In contrast to SEUs and SETs, which may or may not affect the operation of the device, every type of SEFI leads directly to a malfunction. Figure 4-12 shows an SET in combinational logic that is not filtered out by the logical and electrical masking mechanisms (as in Figure 4-13) and that propagates to a register in a control area within the latch window (as in Figure 4-14). If the affected register is being used by a vital part of the system software, a SEFI can take place.
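The chain of conditions described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not a model taken from the cited work: the function names, the timing parameters and the criterion "the pulse overlaps the setup/hold window around the clock edge" are all hypothetical choices made for the example.

```python
def is_latched(pulse_start, pulse_width, clock_edge, setup, hold):
    """Simplified latch-window check (assumption: any overlap counts).

    The transient is treated as captured if it overlaps the interval
    [clock_edge - setup, clock_edge + hold]. All times share one unit.
    """
    window_start = clock_edge - setup
    window_end = clock_edge + hold
    pulse_end = pulse_start + pulse_width
    return pulse_start < window_end and pulse_end > window_start


def may_cause_sefi(logically_masked, electrically_masked, latched,
                   is_control_register):
    """An SET can only lead to a SEFI if it survives all three masking
    mechanisms and is captured by a register in a control area."""
    return (not logically_masked
            and not electrically_masked
            and latched
            and is_control_register)


# Hypothetical numbers: clock edge at t = 10.0, setup = hold = 0.1.
latched = is_latched(pulse_start=9.8, pulse_width=0.5,
                     clock_edge=10.0, setup=0.1, hold=0.1)
print(may_cause_sefi(False, False, latched, is_control_register=True))
```

Whether a captured upset actually becomes a SEFI still depends on how the affected register is used, which this sketch reduces to a single boolean.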

Table 4-5. Classification of SEFI

| Name | Also called | Typical effect | Recovery procedure | Technology affected | Examples |
|---|---|---|---|---|---|
| Logic SEFI | Address error, recoverable burst error, temporary block error | Reading/writing of the wrong row or column; 512-8k addresses in error | Rewriting of the right value | Complex memories such as SDRAM | Fuse latch upsets (SEFLUs) |
| Soft SEFI | Resettable SEFI | Functionality loss of up to a full memory bank | Refresh cycles | FPGA, microprocessors, complex memories | Stuck block errors |
| Hard SEFI | Permanent SEFI, Reboot SEFI | Complete loss of functionality | Complete power cycle of the device | FPGA, microprocessors, complex memories | Events that induce data and functionality loss that cannot be recovered |

As microcircuits become more complex, they also become more susceptible to SEFIs. Affected devices include SDRAMs with complex internal architecture, such as state machines (Harboe-Sorensen et al., 2007), FLASH memories (Irom and Nguyen, 2007; Nguyen et al., 1999; Oldham et al., 2008), FPGAs (Czajkowski et al., 2006) and microprocessors (Czajkowski et al., 2005). Depending on their cause, consequences and recovery procedure, SEFIs can be classified as logic, soft or hard (see Table 4-5).

Logic SEFIs (Bougerol et al., 2008): in memories, these are also called “address error”, “recoverable burst error” (R. Ladbury et al., 2006) or “temporary block error”, and they mainly comprise row and column errors. The upset of a row or column register leads to the reading or writing of the wrong row/column. This type of SEFI typically causes between X and 8k addresses in error, where X is the number of addresses per row/column (Bougerol et al., 2008). Rewriting the correct values is used to recover functionality (Schagaev and Buhanova, 2001).

Examples of logic SEFIs are “fuse latch upsets”, also called SEFLUs (Bougerol et al., 2011, 2010), which lead to the wrong addressing of a whole row/column. Because manufacturers are experiencing an increasing number of defective cells, they add spare cells and subject them to reliability tests. If a cell is found defective during those tests, fuse latches are used to disable the particular row/column. The typical signature of a fuse latch upset is a number of addresses in error that is a multiple of X, where X is the number of addresses belonging to a column/row.
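The "multiple of X" signature lends itself to a simple check when post-processing test data. The following is a minimal sketch, assuming the failing addresses have already been collected into a list. The function name and the value X = 512 used in the example are illustrative assumptions, and a matching count is only consistent with a fuse latch upset, not proof of one.

```python
def fuse_latch_signature(error_addresses, addresses_per_line):
    """Return the number of whole rows/columns implied by the error count,
    or None if the count is not a non-zero multiple of X.

    addresses_per_line is X, the number of addresses per row/column.
    """
    if addresses_per_line <= 0:
        raise ValueError("addresses_per_line must be positive")
    n_errors = len(set(error_addresses))  # ignore duplicate reports
    if n_errors == 0 or n_errors % addresses_per_line != 0:
        return None
    return n_errors // addresses_per_line


# Hypothetical run: 1024 distinct failing addresses with X = 512.
print(fuse_latch_signature(range(1024), addresses_per_line=512))
```

A real analysis would also check that the failing addresses share a row or column index rather than rely on the count alone.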

Soft SEFIs, also called “resettable SEFIs” (Bougerol et al., 2008; Lawrence, 2007), are due to upsets in the device configuration area and usually induce the functionality loss of several thousand addresses, up to a full memory bank. Reconfiguring the device with a mode register set command can be used to recover the functionality (but not the data). Examples are “block SEFIs”, also called “stuck block errors”, observed in the IBM Luna-ES rev C during heavy ion testing (“NASA/GSFC Landsat-7 Project Office, Private

specific value. Since simple writing was not sufficient, device refresh cycles were used to clear the problem. SEUs in selected areas of an FPGA, such as the JTAG bit serial configuration port, can make reconfiguration impossible.

Hard SEFIs (Bougerol et al., 2010; Harboe-Sorensen et al., 2007), also called “reboot SEFIs” (Bougerol et al., 2008), “permanent SEFIs” (Slayman, 2005), “non resettable errors” (Lawrence, 2007, p. 512) or “persistent non recoverable errors” (R. Ladbury et al., 2006), can be induced by different phenomena and lead to the complete loss of memory functionality. Possible causes of this type of catastrophic SEFI are upsets in the internal state machine or counter registers, or the activation of special modes. An example is an SEU in one of the power-on reset registers, which can lead to the removal of the entire configuration area. A complete power cycle of the device is required as a recovery procedure.
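Because the three classes differ in the recovery procedure they need, an event can be classified during testing by escalating through those procedures and noting which one restores functionality. The sketch below illustrates that idea only. The callables `rewrite`, `reconfigure`, `power_cycle` and `is_functional` are hypothetical hooks into a test bench, and the "unrecovered" outcome is an assumption added for completeness, not part of the classification in the text.

```python
from typing import Callable


def classify_by_recovery(rewrite: Callable[[], None],
                         reconfigure: Callable[[], None],
                         power_cycle: Callable[[], None],
                         is_functional: Callable[[], bool]) -> str:
    """Try the mildest recovery first and escalate.

    Returns "logic" if rewriting the data suffices, "soft" if a
    reconfiguration or refresh is needed, "hard" if only a power cycle
    helps, and "unrecovered" if nothing restores functionality.
    """
    ladder = (("logic", rewrite),
              ("soft", reconfigure),
              ("hard", power_cycle))
    for label, action in ladder:
        action()
        if is_functional():
            return label
    return "unrecovered"


# Hypothetical device that only recovers after reconfiguration.
state = {"ok": False}
result = classify_by_recovery(
    rewrite=lambda: None,
    reconfigure=lambda: state.update(ok=True),
    power_cycle=lambda: None,
    is_functional=lambda: state["ok"],
)
print(result)
```

Escalating from the least to the most intrusive action keeps the classification conservative: an event is only labelled hard once the milder procedures have been tried and have failed.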

Fortunately, the probability of SEFI is low compared to other types of SEEs (Slayman, 2005). The reasons for that are:

• The ratio of the periphery logic area to the memory array area is very low.
• The critical charge for logic gates is usually higher than for SRAM cells.
• Most of the periphery logic is combinational, and therefore less susceptible to upsets due to the three inherent masking mechanisms.

SEFIs can also be classified as high current SEFIs if they involve a certain increase in current (Koga et al., 2001a, 2001b).

In addition to SEFIs in complex memories, energetic particles can also strike other circuits, such as the error detection and correction mechanisms, and thereby affect the functioning of the whole circuit. In FPGAs, SEFIs can cause the device to stop functioning normally and therefore require a power reset in order to resume normal operation. In microprocessors, SEFIs can induce upsets in the program counter, illegal branching and jumps to undefined states.
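The effect of a program counter upset can be illustrated with a small fault-injection sketch. It is a toy model under stated assumptions: a 32-bit program counter, a single contiguous valid code region, and the hypothetical helper names `flip_bit` and `is_illegal_jump`. No particular processor is implied.

```python
def flip_bit(value, bit, width=32):
    """Return value with one bit inverted, emulating a single upset."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside register width")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def is_illegal_jump(pc, code_start, code_end):
    """True if the program counter lies outside [code_start, code_end)."""
    return not (code_start <= pc < code_end)


# Hypothetical code region 0x0000-0x7FFF, program counter at 0x1000.
pc = 0x1000
for bit in (2, 20):
    upset_pc = flip_bit(pc, bit)
    print(hex(upset_pc), is_illegal_jump(upset_pc, 0x0000, 0x8000))
```

In this toy model a flip in a low-order bit keeps execution inside the code region, where it may still cause illegal branching, while a flip in a high-order bit sends the program counter to an address with no valid code.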