ISSN: 2005-4238 IJAST 465
Copyright ⓒ 2019 SERSC
ERROR ANALYSIS FOR DWPT HARDWARE REALIZATION USING APPROXIMATION
G. Renuka1, V. Usha Shree2, P. Chandrasekhar Reddy3
1PhD Scholar, JNTUH, Hyderabad, Telangana.
2Professor & Principal, JBREC, Hyderabad.
3Professor of Co-ordination, JNTUH College of Engineering, Hyderabad.
ABSTRACT
This paper presents an error analysis for a discrete wavelet packet transform (DWPT) architecture based on approximate computation. The multiplier-adder unit of the DWPT is replaced by a shifter-adder unit to reduce the complexity of the parallel structures. Shift-add register (SAR) approximate arithmetic architectures are proposed as a low-complexity, multiplier-less substitute for existing multiplier-based designs for realizing the multilevel DWPT.
Keywords: Adder, approximate multiplier, hardware circuits, shift register
I. INTRODUCTION
The DWT requires many arithmetic operations for its execution. Because the number of calculations is large [1], it is necessary to understand the hardware requirements of such complex calculations. There are generally two methods for performing the DWT. The first is the convolution-based method, in which filter-bank structures are implemented [2]. The second is the lifting scheme, which reduces the number of computations and avoids the use of multiple operations. In the DWT only the low-frequency portion of the spectrum is decomposed further, whereas in the DWPT the high-frequency portion of the spectrum is decomposed as well. The benefit of the DWPT is that the optimum representation of the signal can be chosen according to some criterion. In image compression, the best representation is the one that uses the fewest coefficients.
The canonic signed digit (CSD) representation [3] makes it possible to perform multiplications with fewer adders. It is an alternative representation of a binary number that uses fewer non-zero digits (1 and -1). An important application of CSD is multiplication, where fewer additions and subtractions are needed to produce the product. V. Gupta et al. (2013) [4] proposed approximate adders with reduced logic complexity at the transistor level. K.-J. Cho et al. (2004) gave an error correction method for the fixed-width modified Booth multiplier [5]. H. R. Mahdiani et al. (2010) presented a new method of bio-inspired approximate hardware arithmetic with improvements in cost and performance [6]. H. A. Moghaddam et al. (2015) [7] proposed an approximate multiplier that can trade off accuracy and energy at design time for DSP and recognition applications. On the basis of this survey of approximate computation, an approximate SAR unit for the DWPT using the CSD representation is proposed.
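The CSD idea described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `to_csd` and `csd_value` are hypothetical, the recoding used is the standard non-adjacent-form procedure, and the scaling by 2^11 assumes the 11 fractional bits that appear in Table 1.

```python
def to_csd(value):
    """Return the CSD digits (-1, 0, +1) of a non-negative integer, MSB first.

    The result has no two adjacent non-zero digits, which is what keeps the
    number of additions and subtractions in a constant multiplication low.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []  # built least-significant digit first
    while value != 0:
        if value & 1:
            # value mod 4 == 1 gives digit +1, value mod 4 == 3 gives digit -1
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits[::-1] or [0]


def csd_value(digits):
    """Evaluate a list of CSD digits (MSB first) back to an integer."""
    result = 0
    for d in digits:
        result = 2 * result + d
    return result


# 7 = 0111 in binary (three non-zero digits) becomes 8 - 1 (two non-zero digits).
print(to_csd(7))

# Assumption: a coefficient is scaled by 2**11 before recoding.
# 0.46875 * 2**11 = 960 = 2**10 - 2**6, i.e. 2**-1 - 2**-5 as a fraction.
print(to_csd(int(0.46875 * 2**11)))
```

Each non-zero digit costs one shifted addition or subtraction, so the second example needs a single subtraction instead of the four additions implied by the plain binary form of 960.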
II. MATHEMATICAL FORMULATION FOR CONVOLUTION BASED DWPT
To assess the error performance of the proposed SAR unit in applications such as image processing, 64 input samples taken from the 512 × 512 pixel Lena image are considered.
In the level-1 DWPT, the input sequence x(n) is decomposed into a low-pass sub-band ul(n) and a high-pass sub-band uh(n). Using the convolution scheme, the low-pass and high-pass components of the level-1 DWPT are calculated as
$$u_l[n] = \sum_{i=0}^{K-1} h(i)\, x(2n-i) \tag{1}$$
$$u_h[n] = \sum_{i=0}^{K-1} g(i)\, x(2n-i) \tag{2}$$
where h(i) and g(i), for 0 ≤ i ≤ K-1, are the coefficients of the low-pass and high-pass FIR filters of length K. The level-1 sub-bands ul(n) and uh(n) can be further decomposed at level 2 to generate four sub-bands, low-low (ull), low-high (ulh), high-low (uhl) and high-high (uhh), which are calculated using the convolution scheme as:
$$u_{ll}[n] = \sum_{i=0}^{K-1} h(i)\, u_l(2n-i) \tag{3}$$
$$u_{lh}[n] = \sum_{i=0}^{K-1} g(i)\, u_l(2n-i) \tag{4}$$
$$u_{hl}[n] = \sum_{i=0}^{K-1} h(i)\, u_h(2n-i) \tag{5}$$
$$u_{hh}[n] = \sum_{i=0}^{K-1} g(i)\, u_h(2n-i) \tag{6}$$
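Equations (1) to (6) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `analysis_step`, the zero-padding of samples before the start of the sequence, and the two-tap coefficients used in the example are all assumptions made only to keep the arithmetic easy to follow.

```python
def analysis_step(x, h, g):
    """Apply one convolve-and-decimate step, u[n] = sum_i f(i) * x(2n - i).

    Returns the low-pass and high-pass sub-bands, each half as long as x.
    """
    def sample(k):
        # Assumption: samples outside the sequence are treated as zero.
        return x[k] if 0 <= k < len(x) else 0.0

    half = len(x) // 2
    low = [sum(h[i] * sample(2 * n - i) for i in range(len(h))) for n in range(half)]
    high = [sum(g[i] * sample(2 * n - i) for i in range(len(g))) for n in range(half)]
    return low, high


# Hypothetical two-tap filters, chosen only for readability.
h = [0.5, 0.5]
g = [0.5, -0.5]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

ul, uh = analysis_step(x, h, g)      # equations (1) and (2)
ull, ulh = analysis_step(ul, h, g)   # equations (3) and (4)
uhl, uhh = analysis_step(uh, h, g)   # equations (5) and (6)
print(ul, uh)
print(ull, ulh, uhl, uhh)
```

The same function is reused at every level, which mirrors the way each level of the DWPT applies the filter pair to every sub-band of the previous level.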
Each sub-band of the level-2 decomposition is further decomposed into low-pass and high-pass sub-bands at level 3. The resulting eight sub-bands, low-low-low (ulll), low-low-high (ullh), low-high-low (ulhl), low-high-high (ulhh), high-low-low (uhll), high-low-high (uhlh), high-high-low (uhhl) and high-high-high (uhhh), are calculated as:
$$u_{lll}[n] = \sum_{i=0}^{K-1} h(i)\, u_{ll}(2n-i) \tag{7}$$
$$u_{llh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{ll}(2n-i) \tag{8}$$
$$u_{lhl}[n] = \sum_{i=0}^{K-1} h(i)\, u_{lh}(2n-i) \tag{9}$$
$$u_{lhh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{lh}(2n-i) \tag{10}$$
$$u_{hll}[n] = \sum_{i=0}^{K-1} h(i)\, u_{hl}(2n-i) \tag{11}$$
$$u_{hlh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{hl}(2n-i) \tag{12}$$
$$u_{hhl}[n] = \sum_{i=0}^{K-1} h(i)\, u_{hh}(2n-i) \tag{13}$$
$$u_{hhh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{hh}(2n-i) \tag{14}$$
Similarly, in the level-4 decomposition each sub-band of the level-3 decomposition is further decomposed into low-pass and high-pass sub-bands to obtain 16 sub-bands: low-low-low-low (ullll), low-low-low-high (ulllh), low-low-high-low (ullhl), low-low-high-high (ullhh), low-high-low-low (ulhll), low-high-low-high (ulhlh), low-high-high-low (ulhhl), low-high-high-high (ulhhh), high-low-low-low (uhlll), high-low-low-high (uhllh), high-low-high-low (uhlhl), high-low-high-high (uhlhh), high-high-low-low (uhhll), high-high-low-high (uhhlh), high-high-high-low (uhhhl) and high-high-high-high (uhhhh). They are calculated as:
$$u_{llll}[n] = \sum_{i=0}^{K-1} h(i)\, u_{lll}(2n-i) \tag{15}$$
$$u_{lllh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{lll}(2n-i) \tag{16}$$
$$u_{llhl}[n] = \sum_{i=0}^{K-1} h(i)\, u_{llh}(2n-i) \tag{17}$$
$$u_{llhh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{llh}(2n-i) \tag{18}$$
$$u_{lhll}[n] = \sum_{i=0}^{K-1} h(i)\, u_{lhl}(2n-i) \tag{19}$$
$$u_{lhlh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{lhl}(2n-i) \tag{20}$$
$$u_{lhhl}[n] = \sum_{i=0}^{K-1} h(i)\, u_{lhh}(2n-i) \tag{21}$$
$$u_{lhhh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{lhh}(2n-i) \tag{22}$$
$$u_{hlll}[n] = \sum_{i=0}^{K-1} h(i)\, u_{hll}(2n-i) \tag{23}$$
$$u_{hllh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{hll}(2n-i) \tag{24}$$
$$u_{hlhl}[n] = \sum_{i=0}^{K-1} h(i)\, u_{hlh}(2n-i) \tag{25}$$
$$u_{hlhh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{hlh}(2n-i) \tag{26}$$
$$u_{hhll}[n] = \sum_{i=0}^{K-1} h(i)\, u_{hhl}(2n-i) \tag{27}$$
$$u_{hhlh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{hhl}(2n-i) \tag{28}$$
$$u_{hhhl}[n] = \sum_{i=0}^{K-1} h(i)\, u_{hhh}(2n-i) \tag{29}$$
$$u_{hhhh}[n] = \sum_{i=0}^{K-1} g(i)\, u_{hhh}(2n-i) \tag{30}$$
As shown in the tree diagram of Fig. 1, each filtering unit (FU) decomposes the input sequence of a particular level into a pair of sub-bands (low-pass and high-pass). FU-1 receives the input sequence {x(n)} and calculates the low-pass {ul(n)} and high-pass {uh(n)} sub-band components of the level-1 decomposition. These two components are then processed separately by a pair of filtering units (FU-21 and FU-22). FU-21 processes the intermediate data sequence {ul(n)} and produces a pair of low-pass and high-pass sub-band components {ull(n), ulh(n)}, while FU-22 processes the intermediate data sequence {uh(n)} and produces another pair of low-pass and high-pass sub-band components {uhl(n), uhh(n)}. In level 3, the sub-band components {ull(n), ulh(n), uhl(n), uhh(n)} are processed separately by four identical filtering units (FU-31, FU-32, FU-33, FU-34), which produce the components of eight sub-bands {ulll(n), ullh(n), ulhl(n), ulhh(n), uhll(n), uhlh(n), uhhl(n), uhhh(n)}.
Similarly, the level-3 sub-band components are processed separately by eight filtering units (FU-41, FU-42, FU-43, FU-44, FU-45, FU-46, FU-47, FU-48) to produce the components of the 16 oriented selective sub-bands {ullll(n), ulllh(n), ullhl(n), ullhh(n), ulhll(n), ulhlh(n), ulhhl(n), ulhhh(n), uhlll(n), uhllh(n), uhlhl(n), uhlhh(n), uhhll(n), uhhlh(n), uhhhl(n), uhhhh(n)} of the level-4 DWPT.
Fig. 1: Generalized block diagram of 4-level DWPT
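The tree of filtering units in Fig. 1 can be mimicked in software by applying one filtering step to every sub-band of the previous level. The following Python sketch is illustrative only: `analysis_step` and `dwpt` are hypothetical names, samples before the start of a sequence are assumed to be zero, the 64-sample input is synthetic, and the coefficients are the approximated decimal values listed in Table 1.

```python
def analysis_step(x, h, g):
    """One filtering unit: return (low, high), each half the input length."""
    def sample(k):
        # Assumption: samples outside the sequence are treated as zero.
        return x[k] if 0 <= k < len(x) else 0.0

    half = len(x) // 2
    low = [sum(h[i] * sample(2 * n - i) for i in range(len(h))) for n in range(half)]
    high = [sum(g[i] * sample(2 * n - i) for i in range(len(g))) for n in range(half)]
    return low, high


def dwpt(x, h, g, levels):
    """Return the final-level sub-bands and the number of filtering units per level."""
    bands = {"u": list(x)}
    units_per_level = []
    for _ in range(levels):
        next_bands = {}
        for name, data in bands.items():
            low, high = analysis_step(data, h, g)
            next_bands[name + "l"] = low    # low-pass branch
            next_bands[name + "h"] = high   # high-pass branch
        units_per_level.append(len(bands))  # one filtering unit per input band
        bands = next_bands
    return bands, units_per_level


# Approximated DB-4 coefficients (decimal values from Table 1).
H = [0.46875, 0.75, 0.1875, -0.1328125]
G = [-0.1328125, -0.1875, 0.75, -0.46875]

signal = [float(k % 16) for k in range(64)]  # hypothetical 64-sample input
bands, units = dwpt(signal, H, G, levels=4)
print(units)        # filtering units used at levels 1 to 4
print(list(bands))  # sub-band names in tree order
```

With 64 input samples and four levels, the sketch yields 16 sub-bands of 4 samples each, using 1, 2, 4 and 8 filtering units at successive levels.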
III. PROPOSED APPROXIMATE SAR DESIGN
We use the canonical signed digit (CSD) representation of the fixed-point Daubechies 4-tap (DB-4) filter coefficients and approximate these values so that the multiplication operations can be replaced by shift-add operations. The relative error of the approximated low-pass and high-pass filter coefficients with respect to the fixed-point values is well below unity, as shown in Table 1. The CSD representations of the approximated DB-4 filter coefficients are used to compute the sub-band outputs of the DWPT.
Table 1: Error estimation of Daub-4 low-pass and high-pass filter coefficients

| Filter coefficient | Fixed-point binary CSD value (A) | Fixed-point decimal value (A) | Approximated binary CSD value (B) | Approximated decimal value (B) | Relative error (A-B)/A |
|---|---|---|---|---|---|
| h(0) | 0.10000100101 | 0.482910156 | 0.10001000000 | 0.4687500 | 0.029 |
| h(1) | 1.00101010001 | 0.83642578125 | 1.01000000000 | 0.7500000 | 0.103 |
| h(2) | 0.01001010101 | 0.224609375 | 0.01010000000 | 0.1875000 | 0.165 |
| h(3) | 0.00100001001 | -0.12939453125 | 0.00100010000 | -0.1328125 | 0.026 |
| g(0) | 0.00100001001 | -0.12939453125 | 0.00100010000 | -0.1328125 | 0.026 |
| g(1) | 0.01001010101 | -0.22412109375 | 0.01010000000 | -0.1875000 | 0.165 |
| g(2) | 1.00101010001 | 0.83642578125 | 1.01000000000 | 0.7500000 | 0.103 |
| g(3) | 0.10000100101 | -0.48291015625 | 0.10001000000 | -0.4687500 | 0.029 |
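The relative-error column of Table 1 can be reproduced with a short Python check. This is a sketch under stated assumptions: the dictionary and function names are hypothetical, the error is taken as a magnitude (the table lists positive values), and the signed power-of-two sums are read off the shift-add relations (31), (33), (35) and (37) rather than from the table itself.

```python
# Decimal values of the low-pass coefficients as listed in Table 1.
fixed = {"h(0)": 0.482910156, "h(1)": 0.83642578125,
         "h(2)": 0.224609375, "h(3)": -0.12939453125}
approx = {"h(0)": 0.46875, "h(1)": 0.75,
          "h(2)": 0.1875, "h(3)": -0.1328125}

# Assumed signed power-of-two form of each approximated coefficient.
shift_add_form = {
    "h(0)": 2**-1 - 2**-5,
    "h(1)": 1 - 2**-2,
    "h(2)": 2**-2 - 2**-4,
    "h(3)": -(2**-3 + 2**-7),
}


def relative_error(a, b):
    """Magnitude of (A - B) / A for fixed-point value A and approximation B."""
    return abs((a - b) / a)


errors = {name: round(relative_error(fixed[name], approx[name]), 3) for name in fixed}
for name in fixed:
    print(name, approx[name], shift_add_form[name], errors[name])
```

Each approximated coefficient needs at most two non-zero signed digits, which is why a single addition or subtraction per tap is enough in the shift-add form.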
In Table 1, each ‘1’ shown in bold face represents a subtraction, while each normal ‘1’ represents an addition when the constant multiplication is performed by the shift-add method. The products of the input samples with the approximate CSD values are expressed as the following shift-add operations, derived from Table 1.
$$u_{l1} = x(2n)\, h(0) = \big[x(2n) - (x(2n) \gg 4)\big] \gg 1 \tag{31}$$
$$u_{h1} = x(2n)\, g(0) = -\big[x(2n) + (x(2n) \gg 4)\big] \gg 3 \tag{32}$$
$$u_{l2} = x(2n-1)\, h(1) = x(2n-1) - (x(2n-1) \gg 2) \tag{33}$$
$$u_{h2} = x(2n-1)\, g(1) = \big[-x(2n-1) + (x(2n-1) \gg 2)\big] \gg 2 \tag{34}$$
$$u_{l3} = x(2n-2)\, h(2) = \big[x(2n-2) - (x(2n-2) \gg 2)\big] \gg 2 \tag{35}$$
$$u_{h3} = x(2n-2)\, g(2) = x(2n-2) - (x(2n-2) \gg 2) \tag{36}$$
$$u_{l4} = x(2n-3)\, h(3) = -\big[x(2n-3) + (x(2n-3) \gg 4)\big] \gg 3 \tag{37}$$
$$u_{h4} = x(2n-3)\, g(3) = \big[-x(2n-3) + (x(2n-3) \gg 4)\big] \gg 1 \tag{38}$$
The low-pass and high-pass outputs of the DB-4 filter are calculated as:
$$u_l(n) = u_{l1} + u_{l2} + u_{l3} + u_{l4} \tag{39}$$
$$u_h(n) = u_{h1} + u_{h2} + u_{h3} + u_{h4} \tag{40}$$
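The shift-add relations (31) to (40) translate directly into integer arithmetic. The Python sketch below is a behavioural illustration, not the hardware description: the function name `sar_unit` is hypothetical, and the test samples are chosen as multiples of 256 so that no bits are lost in the right shifts and the result can be compared exactly with multiplication by the approximated coefficients of Table 1.

```python
# Approximated DB-4 coefficients (decimal values from Table 1).
H_APPROX = (0.46875, 0.75, 0.1875, -0.1328125)
G_APPROX = (-0.1328125, -0.1875, 0.75, -0.46875)


def sar_unit(x0, x1, x2, x3):
    """Shift-add evaluation of one output pair.

    x0, x1, x2, x3 stand for x(2n), x(2n-1), x(2n-2), x(2n-3) as integers.
    """
    ul1 = (x0 - (x0 >> 4)) >> 1       # eq. (31)
    uh1 = -((x0 + (x0 >> 4)) >> 3)    # eq. (32)
    ul2 = x1 - (x1 >> 2)              # eq. (33)
    uh2 = (-x1 + (x1 >> 2)) >> 2      # eq. (34)
    ul3 = (x2 - (x2 >> 2)) >> 2       # eq. (35)
    uh3 = x2 - (x2 >> 2)              # eq. (36)
    ul4 = -((x3 + (x3 >> 4)) >> 3)    # eq. (37)
    uh4 = (-x3 + (x3 >> 4)) >> 1      # eq. (38)
    ul = ul1 + ul2 + ul3 + ul4        # eq. (39)
    uh = uh1 + uh2 + uh3 + uh4        # eq. (40)
    return ul, uh


samples = (256, 512, 768, 1024)  # hypothetical inputs, multiples of 256
ul, uh = sar_unit(*samples)

# Reference: direct multiplication by the approximated coefficients.
ul_ref = sum(c * x for c, x in zip(H_APPROX, samples))
uh_ref = sum(c * x for c, x in zip(G_APPROX, samples))
print(ul, ul_ref)
print(uh, uh_ref)
```

For arbitrary integer inputs the right shifts discard low-order bits, so the shift-add result can differ slightly from the exact product; this truncation comes in addition to the coefficient approximation itself.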
A shift-add register (SAR) is designed using the shift-add relations of equations (31) to (40) to compute the partial filter outputs {ul1, ul2, ul3, ul4} and {uh1, uh2, uh3, uh4} of the low-pass and high-pass DB-4 filters.
The internal structure of the SAR, shown in Figure 2, comprises 8 adders and 24 shift operations. The partial results corresponding to the low-pass and high-pass filter outputs are added separately in two tree adders (TAs) to produce the filter outputs {ul(n), uh(n)}.
Figure 2: Proposed SAR internal architecture using approximate shift-add operation.
IV. SIMULATION RESULTS AND DISCUSSION
To assess the effect of the filter-coefficient error on the filter output, the 4-level DWPT is computed in MATLAB using both the 12-bit approximated CSD values of the Daub 4-tap filter and the 12-bit fixed-point values. The Lena image is used as the input signal. The mean square error (MSE) of the sub-band components {ul, uh, ull, ulh, uhl, uhh, ulll, ullh, ulhl, ulhh, uhll, uhlh, uhhl, uhhh, ullll, ulllh, ullhl, ullhh, ulhll, ulhlh, ulhhl, ulhhh, uhlll, uhllh, uhlhl, uhlhh, uhhll, uhhlh, uhhhl, uhhhh} of the 4-level DWPT obtained with the 12-bit approximated CSD values is computed with respect to the corresponding sub-band values obtained with the fixed-point filter coefficients.
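The MSE comparison described above can be written as a short Python sketch. The function name `mean_square_error` and the sample values are hypothetical; they only stand in for one sub-band computed once with the fixed-point coefficients and once with the approximated coefficients.

```python
def mean_square_error(reference, approximate):
    """Mean of the squared differences between two equal-length sequences."""
    if len(reference) != len(approximate):
        raise ValueError("sequences must have the same length")
    return sum((r - a) ** 2 for r, a in zip(reference, approximate)) / len(reference)


# Hypothetical sub-band samples, not taken from the paper's results.
fixed_point_band = [10.0, 12.0, 14.0, 16.0]
approximated_band = [9.0, 12.0, 15.0, 14.0]

mse = mean_square_error(fixed_point_band, approximated_band)
print(mse)
```

Applying the same function to each of the 30 sub-bands gives one error figure per sub-band, which is the form in which the results are tabulated.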
Fig. 3: Level-1 fixed-point output
Fig. 3 shows the fixed-point output of level 1: ul and uh, each with 32 output samples for an input of 64 samples from the Lena image.
Fig. 4: Level-1 approximate-value output
Fig. 4 shows the approximate-value output of level 1: ul and uh, each with 32 output samples for an input of 64 samples from the Lena image.
Fig. 5: Level-2 fixed-point output
Fig. 5 shows the fixed-point output of level 2: ull, ulh, uhl and uhh, each with 16 output samples for an input of 32 samples.
Fig. 6: Level-2 approximate-value output
Fig. 6 shows the approximate-value output of level 2: ull, ulh, uhl and uhh, each with 16 output samples for an input of 32 samples.
Fig. 7: Level-3 fixed-point output
Fig. 7 shows the fixed-point output of level 3: ulll, ullh, ulhl, ulhh, uhll, uhlh, uhhl and uhhh, each with 8 output samples for an input of 16 samples.
Fig. 8: Level-3 approximate-value output
Fig. 8 shows the approximate-value output of level 3: ulll, ullh, ulhl, ulhh, uhll, uhlh, uhhl and uhhh, each with 8 output samples for an input of 16 samples.
Fig. 9.1: Level-4 fixed-point output
Fig. 9.1 shows the fixed-point output of level 4 for ullll, ulllh, ullhl, ullhh, ulhll and ulhlh, each with 4 output samples for an input of 8 samples.
Fig. 9.2: Level-4 fixed-point output
Fig. 9.2 shows the fixed-point output of level 4 for ulhhl, ulhhh, uhlll, uhllh, uhlhl and uhlhh, each with 4 output samples for an input of 8 samples.
Fig. 9.3: Level-4 fixed-point output
Fig. 9.3 shows the fixed-point output of level 4 for uhlhl, uhlhh, uhhll, uhhlh, uhhhl and uhhhh, each with 4 output samples for an input of 8 samples.
Fig. 10.1: Level-4 approximate-value output
Fig. 10.1 shows the approximate-value output of level 4 for ullll, ulllh, ullhl, ullhh, ulhll and ulhlh, each with 4 output samples for an input of 8 samples.
Fig. 10.2: Level-4 approximate-value output
Fig. 10.2 shows the approximate-value output of level 4 for ulhhl, ulhhh, uhlll, uhllh, uhlhl and uhlhh, each with 4 output samples for an input of 8 samples.
Fig. 10.3: Level-4 approximate-value output
Fig. 10.3 shows the approximate-value output of level 4 for uhlhl, uhlhh, uhhll, uhhlh, uhhhl and uhhhh, each with 4 output samples for an input of 8 samples.
Table 2 gives the error estimation of the approximated four-level DWPT sub-band outputs. The average fixed-point sub-band output is given in column 2, the average approximated sub-band output in column 3, and the estimated mean square error of the sub-band components in column 4. The approximated filter coefficients introduce a small amount of error in the sub-band outputs of the DWPT with respect to the values obtained with the 12-bit filter coefficients; the level-wise average errors listed in Table 2 range from 0.55 at levels 1 and 2 to 0.93 at level 4.
Table 2: Error estimation of approximated four-level DWPT sub-band output

| Four-level filter sub-band output | Average fixed-point sub-band output | Average approximated-point sub-band output | Average relative error of approximated-point over fixed-point filter output | Average error of approximated filter output (per level) |
|---|---|---|---|---|
| Level-1: ul | 121.55 | 83.9435 | 0.30 | 0.55 |
| Level-1: uh | 0.960 | -1.7468 | 0.81 | |
| Level-2: ull | 99.8329 | 80.8981 | 0.18 | 0.55 |
| Level-2: ulh | 2.0087 | -0.9254 | 0.53 | |
| Level-2: uhl | 2.6373 | -0.3507 | 0.86 | |
| Level-2: uhh | 0.2919 | 0.47814 | 0.63 | |
| Level-3: ulll | 61.6708 | 44.6588 | 0.27 | 0.75 |
| Level-3: ullh | 5.1155 | 1.5258 | 0.70 | |
| Level-3: ulhl | 0.2546 | -1.0614 | 3.16 | |
| Level-3: ulhh | 13.7550 | 10.6235 | 0.22 | |
| Level-3: uhll | 1.9662 | 0.2528 | 0.87 | |
| Level-3: uhlh | 3.0962 | 2.1459 | 0.30 | |
| Level-3: uhhl | 3.9368 | 3.1699 | 0.19 | |
| Level-3: uhhh | 2.2272 | 1.5796 | 0.29 | |
| Level-4: ullll | -0.0538 | -0.05235 | 0.02 | 0.93 |
| Level-4: ulllh | -0.2011 | -0.1847 | 0.08 | |
| Level-4: ullhl | -0.2011 | -0.1847 | 0.08 | |
| Level-4: ullhh | -0.750 | -0.652 | 0.13 | |
| Level-4: ulhll | -0.2011 | -0.1847 | 0.08 | |
| Level-4: ulhlh | -0.750 | -0.652 | 0.13 | |
| Level-4: ulhhl | -0.750 | -0.652 | 0.13 | |
| Level-4: ulhhh | -2.80 | -2.301 | 0.17 | |
| Level-4: uhlll | 0.00085 | 0.0026 | 2.05 | |
| Level-4: uhllh | 0.003175 | 0.00915 | 1.87 | |
| Level-4: uhlhl | 0.0032 | 0.00915 | 1.87 | |
| Level-4: uhlhh | 0.0119 | 0.03225 | 1.66 | |
| Level-4: uhhll | 0.0032 | 0.00915 | 1.87 | |
| Level-4: uhhlh | 0.0119 | 0.03225 | 1.66 | |
| Level-4: uhhhl | 0.0119 | 0.03225 | 1.66 | |
| Level-4: uhhhh | 0.044 | 0.11385 | 1.50 | |

V. CONCLUSION
This paper proposes an approximate SAR unit that can be used to trade off area and delay against quality in error-resilient DSP systems. The SAR (shift-add register) unit for DWPT processing is built using the canonical signed digit (CSD) technique. An error analysis of the approximated SAR for the DWPT is carried out over four levels, and the MSE with respect to the fixed-point values is found to be very small. These approximate circuits are well suited to applications in which exact accuracy is not a requirement.
REFERENCES
1. Noor Mahammad Sk and Mohamed Asan Basiri M, “An efficient VLSI architecture for lifting based 1D/2D DWT,” Microprocessors and Microsystems, 2016.
2. T. Srinivasulu and G. Kiran Maye, “A survey on VLSI architectures for wavelets,” IJERA, vol. 7, issue 10, Oct. 2017.
3. Sujatha, Pushpawati Changlekar, and P. Anita, “Implementation of binary canonic signed digit multiplier using ASIC,” IJERA, vol. 3, issue 1, 2013.
4. D. Mohapatra, V. Gupta, Raghunathan, and K. Roy, “Low-power digital signal processing using approximate adders,” IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 32, no. 1, pp. 124–137, Jan. 2013.
5. K.-J. Cho, K.-C. Lee, J.-G. Chung, and K. K. Parhi, “Design of low-error fixed-width modified Booth multiplier,” IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522–531, May 2004.
6. H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas, “Bio-inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 4, pp. 850–862, Apr. 2010.
7. S. Narayanamoorthy, Z. Liu, H. A. Moghaddam, T. Park, and N. S. Kim, “Energy-efficient approximate multiplication for digital signal processing and classification applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 6, pp. 1180–1184, Jun. 2015.
8. G. Zervakis, K. Pekmestzi, D. Soudris, S. Xydis, and K. Tsoumanis, “Design-efficient approximate multiplication circuits through partial product perforation,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 24, no. 10, pp. 3105–3117, Oct. 2016.
9. A. Momeni, J. Han, P. Montuschi, and F. Lombardi, “Design and analysis of approximate compressors for multiplication,” IEEE Trans. Comput., vol. 64, no. 4, pp. 984–994, Apr. 2015.
10. James Garland and David Gregg, Trinity College Dublin, “Low complexity multiply accumulate unit for weight-sharing convolutional neural networks,” IEEE Computer Architecture Letters, DOI 10.1109/LCA.2017.2656880, Dec. 2014.
11. J. Liang, J. Han, and F. Lombardi, “New metrics for the reliability of approximate and probabilistic adders,” IEEE Trans. Comput., vol. 63, no. 9, pp. 1760–1771, Sep. 2013.
12. K.-J. Cho, K.-C. Lee, J.-G. Chung, and K. K. Parhi, “Design of low-error fixed-width modified Booth multiplier,” IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522–531, May 2004.
13. B. K. Mohanty and Vikas Tiwari, “Modified probabilistic estimation bias formulation for hardware efficient fixed-width Booth multiplier,” Circuits, Systems & Signal Processing, Springer, Impact Factor: 1.99, vol. 33, issue 12, pp. 3981–3994, July 2014.
14. E. J. King and E. E. Swartzlander, Jr., “Data-dependent truncation scheme for parallel multipliers,” in Proc. 31st Asilomar Conf. Signals, Circuits Syst., Nov. 1998, pp. 1178–1182.
15. C. Liu, J. Han, and F. Lombardi, “A low-power, high-performance approximate multiplier with configurable partial error recovery,” in Proc. Conf. Exhibit., 2014, pp. 1–4.