
Chapter 5 Research Implementation

5.3 Adaptive Error Compensation in Subsequent Addition

We have implemented a modification to the approach described in Algorithm 1. We refer to this approach as adaptive error compensation in subsequent addition (AECSA). In this design, we calculate the errors during summation and, where possible, apply a correction as well. If the correction cannot be applied because the corresponding error is not yet available, we accumulate the error instead. Once the set has been completely reduced, we add the final accumulated error term to the result. This approach is depicted in Figure 5.7.

Figure 5.6 shows the pipeline required for implementing one iteration of the compensated summation algorithm, which compensates for the error in the subsequent addition. In this implementation, the error term e calculated in the previous iteration is required as an input to the first adder. Due to the depth of the adder network, this error cannot be made available every cycle. The error will not be available for each addition even if we replace the two adders with one custom adder that generates the error. The values from different sets can be scheduled in an interleaved manner, but this would require a lot of on-chip resources, complex control logic and upfront processing of the data.

Figure 5.6: Error Compensation in Subsequent Addition
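The two-adder iteration described above can be modelled in software. The following is a minimal sketch, not the hardware design itself: it assumes that the error-generating adder behaves like Knuth's branch-free TwoSum, and the names `two_sum`, `compensated_sum` and `naive_sum` are illustrative only.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b = s + err exactly.

    Assumption: stands in for the custom adder that also outputs the error.
    """
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def compensated_sum(values):
    """Sequential model: the error of one addition corrects the next input."""
    s = 0.0
    e = 0.0
    for x in values:
        y = x + e             # first adder: apply the previous error to the input
        s, e = two_sum(s, y)  # second adder: accumulate and produce a new error
    return s, e               # e is the residual error of the last addition


def naive_sum(values):
    """Plain left-to-right floating-point accumulation, for comparison."""
    t = 0.0
    for x in values:
        t += x
    return t


if __name__ == "__main__":
    data = [1.0] + [1e-16] * 10
    print(naive_sum(data), compensated_sum(data)[0])
```

In this model the error is always available for the next addition, which is exactly what the pipelined adder cannot guarantee. The first adder also rounds, so a large input can still absorb a small error term.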

Figure 5.8 shows an overview of AECSA. In this design, we require two reduction circuits but, unlike AEC, the application of rules in ERC depends on the availability of errors to VRC. Further, since error compensation may take place during the accumulation of values, two sequentially connected adders are required in VRC. The first adder performs error correction by adding the error to an input value, while the second adder adds the output of the first adder to the second input value. The second adder also generates the error term.

Figure 5.7: Adaptive Error Compensation in Subsequent Addition

the input values. If |a| > |b|, a and b are swapped. Thus, the overall adder pipeline depth in VRC is more than twice that in AEC and the original reduction circuit. Here, we increase the size of the input FIFO but keep the number of buffers the same as in the original reduction circuit in order to keep the control logic simple.

As discussed previously, due to the depth of the adder pipeline and simultaneous accumulations from the same set, a situation may arise in which the error cannot be compensated using the first adder. In such a case, we accumulate the error using ERC. Errors can also be supplied from ERC to VRC in order to maximize the chances of error compensation in VRC. After set reduction, the final error term from ERC is added to the summation to obtain the final result.
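The adaptive policy can be illustrated with a small software model. This is a sketch under stated assumptions rather than a description of the circuit: `can_compensate` is a hypothetical predicate standing in for "the error is available to the first adder in this cycle", ERC is reduced to a single running accumulator `erc_acc`, and `two_sum` is assumed to model the error-generating adder.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b = s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def aecsa_model(values, can_compensate):
    """Software model of adaptive compensation for a single set.

    can_compensate(k) -> bool is hypothetical: True when the pending error
    can be fed to the first adder for the k-th input.
    """
    s = 0.0        # running sum (VRC)
    pending = 0.0  # error produced by the most recent addition
    erc_acc = 0.0  # errors that could not be applied (stand-in for ERC)
    for k, x in enumerate(values):
        if can_compensate(k):
            y = x + pending     # first adder: correct the incoming value
        else:
            erc_acc += pending  # error not usable now: accumulate it instead
            y = x
        s, pending = two_sum(s, y)
    erc_acc += pending          # the last error has no subsequent addition
    return s + erc_acc          # final adder: sum plus accumulated error


if __name__ == "__main__":
    # With compensation never possible, every error goes through erc_acc.
    print(aecsa_model([1e16, 1.0, -1e16], lambda k: False))
```

Whichever path each error takes, it is either applied to a later input or kept in the accumulator, so no error term is dropped before the final addition.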

The major difference between AEC and AECSA is that in AEC the rules in ERC are independent of conditions in VRC, while in AECSA a check for a set ID match is performed and, if a match is found, that error value is supplied to VRC. In other words, if the set ID of an error term in ERC is equal to the set ID being input to the adder pipeline in VRC, the corresponding value from ERC is supplied to VRC for error compensation.

In ERC, if the conditions for a rule are satisfied but the set ID of the error source matches in VRC, then that rule is not applied. In such a case, some other rule is applied. Error accumulation in ERC is performed only when it is not possible to supply the errors to VRC as input. Thus, error input to VRC is prioritized. The rules in ERC have been modified in accordance with this policy.

Figure 5.8: Module for AECSA

There are four sources of the input error in VRC: the output of the custom floating-point adder in VRC, the FIFO output in ERC, the buffers in ERC and the output of the adder in ERC. When applying Rule 1, Rule 2, Rule 3 and Rule 4, the set ID of each error output is compared with the adder input set ID. If a match is found, the respective error term is supplied as input. If the error output from the custom adder in VRC serves as the input error, this error term is not supplied to ERC and the err_valid line is de-asserted. If the error term comes from ERC, it is invalidated in ERC and is not considered for accumulation. The rule to be applied in ERC depends on whether the corresponding error term has been invalidated or not. For example, if in ERC the set ID of the adder output matches the set ID of one of the buffers (Rule 1), but at the same time the set ID in VRC is the same as the set ID of the adder output in ERC, then Rule 1 is not applied in ERC and the error term from the output of the adder pipeline in ERC is supplied to VRC. In such a case, some other rule is applied in ERC.

Algorithm 6 describes the control logic in VRC. It is evident that the number of compare operations per cycle is significantly higher and that the application of a rule in ERC depends on the logic decision from VRC. This poses a timing challenge, and the performance in terms of operating frequency is adversely affected. Also, the rules in ERC have been modified to accommodate the condition for a potential set ID match in VRC. The control logic for ERC is shown in Algorithm 7.
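The selection among the four error sources can be sketched as a priority search over candidates with matching set IDs. This is an illustrative simplification, following the order of the checks in Algorithm 6 (VRC adder, ERC input FIFO, ERC adder output, ERC buffers). The type `ErrTerm`, the function `select_error` and the source labels are hypothetical names, and the rule_5_or_6_out condition is folded into the `valid` flag of the VRC adder error.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ErrTerm:
    set_id: int
    value: float
    valid: bool = True  # cleared once the term has been consumed by VRC


def select_error(adder_set: int,
                 vrc_err: Optional[ErrTerm],
                 erc_fifo: Optional[ErrTerm],
                 erc_adder_out: Optional[ErrTerm],
                 erc_bufs: List[ErrTerm]) -> Tuple[float, str]:
    """Pick the error term fed to the first adder in VRC.

    Returns (value, source). If no valid term matches the set ID,
    0.0 is supplied, as in the final else branches of Algorithm 6.
    """
    candidates = [("vrc_adder", vrc_err),
                  ("erc_fifo", erc_fifo),
                  ("erc_adder", erc_adder_out)]
    candidates += [(f"erc_buf{n}", b) for n, b in enumerate(erc_bufs)]
    for name, term in candidates:
        if term is not None and term.valid and term.set_id == adder_set:
            term.valid = False  # invalidate: ERC must not accumulate it too
            return term.value, name
    return 0.0, "none"


if __name__ == "__main__":
    bufs = [ErrTerm(1, 0.5), ErrTerm(2, 0.25)]
    print(select_error(2, None, ErrTerm(3, 0.1), ErrTerm(2, 0.125), bufs))
```

Clearing `valid` on the chosen term plays the role of the invalidation in ERC: a rule in ERC that would have used that term must then fall through to another rule.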

In the original reduction circuit, we have three counters: the first counts up when a new value arrives in the reduction circuit, the second counts down when two values are supplied to the adder, and the third counts down when a set has been reduced. Since ERC also supplies error terms back to VRC, the number of valid error terms decreases even when there is no reduction. In order to account for the error terms supplied to VRC from ERC and keep track of the number of errors belonging to a particular set in ERC, we need a fourth counter. This counter counts down whenever an error term is supplied to VRC from ERC. In the absence of this counter, the error reduction will not be correct. The example in Figure 5.9 depicts the working of the counters in ERC. The subscripts in the example represent the counter that is activated. Thus, when two error terms are added in ERC, counter 1 is activated. Similarly, if a value is supplied from ERC to VRC from the input FIFO, the buffers or the output of the adder, counter 4 is activated.

In ERC, as in AEC, the error values belonging to a particular set are not contiguous; hence we need to check the status of the set in VRC. This can be checked using the valid_out signal when an error term is supplied to ERC. If the valid_out signal is asserted, then the main set has been completely reduced and the last error value has been supplied to ERC. This signal is synchronized with the input FIFO in ERC and is stored in a dual-ported memory in ERC. In order to assert valid_out_err in ERC, this signal must be asserted and the sum of the outputs of the three counters must be 1.
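The counter bookkeeping and the completion check can be modelled per set ID as below. This is a rough sketch with hypothetical names (`ErcSetCounters` and its methods). It uses the four-counter sum that appears in the ERC control logic of Algorithm 7, and treats the stored valid_out flag as a set of completed set IDs.

```python
from collections import defaultdict


class ErcSetCounters:
    """Per-set bookkeeping of how many error terms are still live in ERC."""

    def __init__(self):
        self.c1 = defaultdict(int)  # +1 for each error term arriving in ERC
        self.c2 = defaultdict(int)  # -1 when two terms are sent to the adder
        self.c3 = defaultdict(int)  # -1 when the set's final error leaves ERC
        self.c4 = defaultdict(int)  # -1 when a term is handed back to VRC
        self.vrc_done = set()       # sets whose valid_out flag has been stored

    def on_arrival(self, set_id):
        self.c1[set_id] += 1

    def on_pair_added(self, set_id):
        self.c2[set_id] -= 1        # two terms become one

    def on_supplied_to_vrc(self, set_id):
        self.c4[set_id] -= 1        # term leaves ERC without any reduction

    def on_set_reduced(self, set_id):
        self.c3[set_id] -= 1

    def on_vrc_valid_out(self, set_id):
        self.vrc_done.add(set_id)

    def active(self, set_id):
        return (self.c1[set_id] + self.c2[set_id]
                + self.c3[set_id] + self.c4[set_id])

    def valid_out_err(self, set_id):
        # Final error is ready: VRC finished the set, one term remains.
        return set_id in self.vrc_done and self.active(set_id) == 1


if __name__ == "__main__":
    c = ErcSetCounters()
    for _ in range(4):
        c.on_arrival(0)
    c.on_supplied_to_vrc(0)
    print(c.active(0))
```

Without the fourth counter, the term handed back to VRC would still be counted as live, and the sum would never reach 1 for that set.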

Algorithm 6 AECSA VRC Rules

if ∃n : buf_n.set = adderOut.set then    ▷ Rule 1
    rule1 = 1
    addIn1 = adderOut
    addIn2 = buf_n
    if input.valid then
        buf_n = input
    end if
else if ∃i, j : buf_i.set = buf_j.set then    ▷ Rule 2
    rule2 = 1
    addIn1 = buf_i
    addIn2 = buf_j
    if input.valid then
        buf_i = input
    end if
    if numActive(adderOut.set) = 1 then
        result = adderOut
    else
        buf_i = adderOut
    end if
else if input.valid then    ▷ Rule 3
    if input.set = adderOut.set then
        rule3 = 1
        addIn1 = input
        addIn2 = adderOut
    end if
else if input.valid then    ▷ Rule 4
    if ∃n : buf_n.set = input.set then
        rule4 = 1
        addIn1 = input
        addIn2 = buf_n
        if numActive(adderOut.set) = 1 then
            result = adderOut
        else
            buf_n = adderOut
        end if
    end if
else if input.valid then    ▷ Rule 5
    addIn1 = input
    addIn2 = 0
    if numActive(adderOut.set) = 1 then
        result = adderOut
    else
        if ∃n : buf_n.valid = 0 then
            buf_n = adderOut
        else
            ERROR
        end if
    end if
else    ▷ Rule 6
    addIn1 = adderOut
    addIn2 = 0
end if

if rule1 OR rule2 OR rule3 OR rule4 then
    if not(adderOut.rule_5_or_6_out) then
        addErrIn = addererrOut
        errInput.disable = 1
    else if errInput.set = adderOut.set then
        addErrIn = errInput
        errInput.errEn = 1
    else if errAdderOut.set = adderOut.set then
        addErrIn = errAdderOut
        errAdderOut.errEn = 1
    else if ∃n : errBuf_n.set = adderOut.set then
        addErrIn = errBuf_n
        errBuf_n.errEn = 1
    else
        addErrIn = 0.0
    end if
else
    addErrIn = 0.0
end if

if count1 + count2 + count3 = 1 then
    redCktOut.valid = 1
    redCktOut.set = adderOut.set
    redCktOut = adderOut
end if

if not(errInput.disable) or not(adderOut.rule_5_or_6_out) then
    redCktOut.Err = adderOut.Err
    redCktOut.errValid = adderOut.valid
    redCktOut.errSet = adderOut.set
end if

Algorithm 7 AECSA ERC Rules

if ∃n : errBuf_n.set = errAdderOut.set and
        not(errBuf_n.errEn or errAdderOut.errEn) then    ▷ Rule 1
    rule1 = 1
    errAddIn1 = errAdderOut
    errAddIn2 = errBuf_n
    if errInput.valid then
        errBuf_n = errInput
    end if
else if ∃i, j : errBuf_i.set = errBuf_j.set and
        not(errBuf_i.errEn or errBuf_j.errEn) then    ▷ Rule 2
    rule2 = 1
    errAddIn1 = errBuf_i
    errAddIn2 = errBuf_j
    if errInput.valid then
        errBuf_i = errInput
    end if
    if numActive(adderOut.set) = 1 then
        result = errAdderOut
    else
        buf_i = errAdderOut
    end if
else if errInput.valid then    ▷ Rule 3
    if errInput.set = errAdderOut.set and
            not(errInput.errEn or errAdderOut.errEn) then
        rule3 = 1
        addIn1 = errInput
        addIn2 = errAdderOut
    end if
else if errInput.valid then    ▷ Rule 4
    if ∃n : errBuf_n.set = errInput.set and
            not(errInput.errEn or errBuf_n.errEn) then
        rule4 = 1
        errAddIn1 = errInput
        errAddIn2 = errBuf_n
        if numActive(errAdderOut.set) = 1 then
            result = errAdderOut
        else
            buf_n = errAdderOut
        end if
    end if
else if errInput.valid then    ▷ Rule 5
    errAddIn1 = errInput
    errAddIn2 = 0
    if numActive(errAdderOut.set) = 1 then
        result = errAdderOut
    else
        if ∃n : errBuf_n.valid = 0 then
            errBuf_n = errAdderOut
        else
            ERROR
        end if
    end if
else    ▷ Rule 6
    errAddIn1 = errAdderOut
    errAddIn2 = 0
end if

if errCount1 + errCount2 + errCount3 + errCount4 = 1 and vrc.set.done = 1 then
    errOut.valid = 1
    errOut.set = adderOut.set
    errOut = adderOut
end if

Also, the summation of the main set is stored in a dual-ported memory. Once the error terms belonging to a set have been reduced, the final sum is calculated by adding the error term from the adder to the main set sum from the memory. For this, another floating-point adder is required.

In AECSA, the overall behavior of VRC does not change. Rules are applied in the original order of priority even when an error term is not available from any of the sources.