ERROR ANALYSIS FOR DWPT HARDWARE REALIZATION USING APPROXIMATION


ISSN: 2005-4238 IJAST 465


G. Renuka¹, V. Usha Shree², P. Chandrasekhar Reddy³

¹PhD Scholar, JNTUH, Hyderabad, Telangana.

²Professor & Principal, JBREC, Hyderabad.

³Professor of Co-ordination, JNTUH College of Engineering, Hyderabad.

ABSTRACT

This paper presents an error analysis of a discrete wavelet packet transform (DWPT) architecture that uses an approximate computation methodology. The multiplier-adder unit of the DWPT is replaced by a shifter-adder unit to decrease the complexity of parallel structures. Shift-add register (SAR) approximate arithmetic architectures are proposed as a low-complexity, multiplier-less design that can substitute for existing multiplier-based designs in the realization of multilevel DWPT.

Keywords: Adder, approximate multiplier, hardware circuits, shift register

I. INTRODUCTION

The discrete wavelet transform (DWT) requires many mathematical operations for its execution. Because the number of calculations is large [1], it is necessary to know the hardware requirements of such compound calculations. There are generally two methods for performing the DWT. The first is the convolution-based method, in which filter-bank structures are implemented [2]. The second is the lifting scheme, which decreases the number of computations and avoids the use of multiple operations. In the DWT, only the low-frequency portion of the spectrum is decomposed, whereas in the discrete wavelet packet transform (DWPT) the high-frequency portion of the spectrum is decomposed further as well. The benefit of the DWPT is that the optimum representation of the signal can be chosen according to some criterion. In image compression, the best representation is the one that uses the fewest coefficients.

The canonic signed digit (CSD) representation [3] makes it possible to perform multiplications with fewer adders. It is an alternative representation of a binary number that uses fewer nonzero digits (1 and −1). One important application of CSD is multiplication, where it needs fewer additions and subtractions to produce the product. V. Gupta et al. (2013) [4] proposed logic-complexity reduction at the transistor level. K.-J. Cho et al. (2004) gave an error-correction method for the fixed-width modified Booth multiplier [5]. H. R. Mahdiani et al. (2010) presented a bio-inspired approach to approximate hardware arithmetic with improvements in cost and performance [6]. H. A. Moghaddam et al. (2015) [7] proposed an approximate multiplier that can trade off accuracy and energy at design time for DSP and recognition applications. On the basis of this survey of approximate computation, an approximate SAR unit for the DWPT using the CSD representation is proposed.
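As a rough illustration of why CSD reduces the adder count, the sketch below recodes a non-negative integer into signed digits {−1, 0, +1} with no two adjacent nonzero digits. The function name to_csd and the least-significant-first digit order are choices made for this example only; they are not taken from the paper.

```python
def to_csd(value):
    """Recode a non-negative integer into signed digits (-1, 0, +1).

    Digits are returned least significant first. The recoding used here is the
    standard non-adjacent form, in which no two neighbouring digits are nonzero.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []
    while value != 0:
        if value & 1:
            # +1 when value % 4 == 1, -1 when value % 4 == 3
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


# 15/32 = 0.46875. Plain binary 01111 has four nonzero digits, while the
# signed-digit form is 16 - 1, i.e. two nonzero digits and one subtraction.
digits = to_csd(15)
print(digits)                                            # [-1, 0, 0, 0, 1]
print(bin(15).count("1"), sum(1 for d in digits if d))   # 4 2
```

Each nonzero digit costs one shifted add or subtract in a constant multiplier, so fewer nonzero digits translate directly into fewer adders.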

II. MATHEMATICAL FORMULATION FOR CONVOLUTION BASED DWPT

To assess the error performance of the proposed SAR unit in applications such as image processing, 64 input samples taken from the 512 × 512 Lena image are considered.

In the level-1 DWPT, the input sequence x(n) is decomposed into a low-pass sub-band ul(n) and a high-pass sub-band uh(n). Using the convolution scheme, the low-pass and high-pass components of the 1-level DWPT are calculated as


u_{l}[n] = \sum_{i=0}^{K-1} h(i) \cdot x(2n - i)    (1)

u_{h}[n] = \sum_{i=0}^{K-1} g(i) \cdot x(2n - i)    (2)

where h(i) and g(i), for 0 ≤ i ≤ K−1, are the coefficients of the low-pass and high-pass FIR filters of length K. The low-pass sub-band ul(n) and the high-pass sub-band uh(n) of level 1 can be further decomposed in level 2 to generate four sub-bands, low-low (ull), low-high (ulh), high-low (uhl) and high-high (uhh), which are calculated using the convolution scheme as:

u_{ll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{l}(2n - i)    (3)

u_{lh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{l}(2n - i)    (4)

u_{hl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{h}(2n - i)    (5)

u_{hh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{h}(2n - i)    (6)

Each sub-band of the level-2 decomposition is further decomposed into low-pass and high-pass sub-bands in level 3. The resulting eight sub-bands, low-low-low (ulll), low-low-high (ullh), low-high-low (ulhl), low-high-high (ulhh), high-low-low (uhll), high-low-high (uhlh), high-high-low (uhhl) and high-high-high (uhhh), are calculated as:

u_{lll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{ll}(2n - i)    (7)

u_{llh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{ll}(2n - i)    (8)

u_{lhl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{lh}(2n - i)    (9)

u_{lhh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{lh}(2n - i)    (10)

u_{hll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{hl}(2n - i)    (11)

u_{hlh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{hl}(2n - i)    (12)

u_{hhl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{hh}(2n - i)    (13)

u_{hhh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{hh}(2n - i)    (14)

Similarly, in the level-4 decomposition each sub-band of the level-3 decomposition is further decomposed into low-pass and high-pass sub-bands to obtain 16 sub-bands, named low-low-low-low (ullll), low-low-low-high (ulllh), low-low-high-low (ullhl), low-low-high-high (ullhh), low-high-low-low (ulhll), low-high-low-high (ulhlh), low-high-high-low (ulhhl), low-high-high-high (ulhhh), high-low-low-low (uhlll), high-low-low-high (uhllh), high-low-high-low (uhlhl), high-low-high-high (uhlhh), high-high-low-low (uhhll), high-high-low-high (uhhlh), high-high-high-low (uhhhl) and high-high-high-high (uhhhh). They are calculated as:

u_{llll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{lll}(2n - i)    (15)

u_{lllh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{lll}(2n - i)    (16)

u_{llhl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{llh}(2n - i)    (17)

u_{llhh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{llh}(2n - i)    (18)

u_{lhll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{lhl}(2n - i)    (19)

u_{lhlh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{lhl}(2n - i)    (20)

u_{lhhl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{lhh}(2n - i)    (21)

u_{lhhh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{lhh}(2n - i)    (22)

u_{hlll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{hll}(2n - i)    (23)

u_{hllh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{hll}(2n - i)    (24)

u_{hlhl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{hlh}(2n - i)    (25)

u_{hlhh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{hlh}(2n - i)    (26)

u_{hhll}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{hhl}(2n - i)    (27)

u_{hhlh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{hhl}(2n - i)    (28)

u_{hhhl}[n] = \sum_{i=0}^{K-1} h(i) \cdot u_{hhh}(2n - i)    (29)

u_{hhhh}[n] = \sum_{i=0}^{K-1} g(i) \cdot u_{hhh}(2n - i)    (30)
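The recursion in equations (1)-(30) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the zero-padding rule for samples with a negative index, and the placeholder input sequence are assumptions, since the boundary handling and the actual image samples are not specified here. The coefficient values are the fixed-point decimal values listed in Table 1.

```python
def analysis_step(x, h, g):
    """One filtering unit.

    Returns (low, high) with low[n] = sum_i h[i] * x[2n - i] and
    high[n] = sum_i g[i] * x[2n - i], as in equations (1) and (2).
    """
    def filt(coeffs):
        out = []
        for n in range(len(x) // 2):
            acc = 0.0
            for i, c in enumerate(coeffs):
                k = 2 * n - i
                if k >= 0:  # assumed boundary rule: x(k) = 0 for k < 0
                    acc += c * x[k]
            out.append(acc)
        return out
    return filt(h), filt(g)


def dwpt(x, h, g, levels):
    """Full packet tree: every sub-band is split again at every level."""
    nodes = {"u": list(x)}
    for _ in range(levels):
        children = {}
        for name, seq in nodes.items():
            low, high = analysis_step(seq, h, g)
            children[name + "l"] = low   # e.g. "ul" -> "ull"
            children[name + "h"] = high  # e.g. "ul" -> "ulh"
        nodes = children
    return nodes


# Fixed-point DB-4 coefficient values as listed in Table 1.
H = [0.482910156, 0.83642578125, 0.224609375, -0.12939453125]
G = [-0.12939453125, -0.22412109375, 0.83642578125, -0.48291015625]

# Placeholder 64-sample input (not the Lena data used in the paper).
samples = [float((37 * n) % 256) for n in range(64)]

bands = dwpt(samples, H, G, levels=4)
print(len(bands), len(bands["ullll"]))  # 16 4
```

With 64 input samples, each level halves the sub-band length (32, 16, 8, 4) while doubling the number of sub-bands (2, 4, 8, 16).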

As shown in the tree diagram of Fig. 1, each filtering unit (FU) decomposes the input sequence of a particular level into a pair of sub-bands (low-pass and high-pass). FU-1 receives the input sequence {x(n)} and calculates the low-pass {ul(n)} and high-pass {uh(n)} sub-band components of the level-1 decomposition. These two components are then processed separately by a pair of filter units (FU-21 and FU-22). FU-21 processes the intermediate data sequence {ul(n)} and produces a pair of low-pass and high-pass sub-band components {ull(n), ulh(n)}, while FU-22 processes the intermediate data sequence {uh(n)} and produces another pair of low-pass and high-pass sub-band components {uhl(n), uhh(n)}. In level 3, the sub-band components {ull(n), ulh(n), uhl(n), uhh(n)} are processed separately by four identical filtering units (FU-31, FU-32, FU-33, FU-34), which produce the components of eight sub-bands {ulll(n), ullh(n), ulhl(n), ulhh(n), uhll(n), uhlh(n), uhhl(n), uhhh(n)}.

Similarly, the level-3 sub-band components are processed separately by eight filtering units (FU-41, FU-42, FU-43, FU-44, FU-45, FU-46, FU-47, FU-48) to produce the components of the 16 sub-bands {ullll(n), ulllh(n), ullhl(n), ullhh(n), ulhll(n), ulhlh(n), ulhhl(n), ulhhh(n), uhlll(n), uhllh(n), uhlhl(n), uhlhh(n), uhhll(n), uhhlh(n), uhhhl(n), uhhhh(n)} of the 4-level DWPT.

Fig. 1: Generalized block diagram of 4-level DWPT

III. PROPOSED APPROXIMATE SAR DESIGN

The fixed-point Daubechies 4-tap (DB-4) filter coefficients are expressed in canonical signed digit (CSD) form and then approximated so that the multiplication operations can be replaced by shift-add operations. The relative error of the approximated low-pass and high-pass filter coefficients with respect to the fixed-point values is well below unity, as shown in Table 1. The CSD representations of the approximated DB-4 filter coefficients are used to compute the sub-band outputs of the DWPT.

Table 1: Error estimation of Daub-4 low-pass and high-pass filter coefficients

| Filter coefficient | Fixed-point binary CSD value | Fixed-point decimal value (A) | Approximated binary CSD value | Approximated decimal value (B) | Relative error (A − B)/A |
|---|---|---|---|---|---|
| h(0) | 0.10000100101 | 0.482910156 | 0.10001000000 | 0.4687500 | 0.029 |
| h(1) | 1.00101010001 | 0.83642578125 | 1.01000000000 | 0.7500000 | 0.103 |
| h(2) | 0.01001010101 | 0.224609375 | 0.01010000000 | 0.1875000 | 0.165 |
| h(3) | 0.00100001001 | -0.12939453125 | 0.00100010000 | -0.1328125 | 0.026 |
| g(0) | 0.00100001001 | -0.12939453125 | 0.00100010000 | -0.1328125 | 0.026 |
| g(1) | 0.01001010101 | -0.22412109375 | 0.01010000000 | -0.1875000 | 0.165 |
| g(2) | 1.00101010001 | 0.83642578125 | 1.01000000000 | 0.7500000 | 0.103 |
| g(3) | 0.10000100101 | -0.48291015625 | 0.10001000000 | -0.4687500 | 0.029 |
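The relative-error column of Table 1 can be re-derived from the two decimal columns. The short sketch below does this for the low-pass coefficients; it takes the magnitude |A − B|/|A|, which is an assumption made here because the table lists positive errors even for the negative coefficient h(3). The dictionary and function names are illustrative only.

```python
# (fixed-point value A, approximated value B) as listed in Table 1
TABLE1_LOWPASS = {
    "h(0)": (0.482910156, 0.46875),
    "h(1)": (0.83642578125, 0.75),
    "h(2)": (0.224609375, 0.1875),
    "h(3)": (-0.12939453125, -0.1328125),
}


def relative_error(a, b):
    """Magnitude of the relative error of b with respect to a."""
    return abs(a - b) / abs(a)


for name, (a, b) in TABLE1_LOWPASS.items():
    print(name, round(relative_error(a, b), 3))
# h(0) 0.029
# h(1) 0.103
# h(2) 0.165
# h(3) 0.026
```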

In Table 1, each '1' shown in bold face represents a subtraction, while each normal '1' represents an addition when the constant multiplication is performed by the shift-add method. The products of the input samples with the approximated CSD values, expressed as shift-add operations derived from Table 1, are as follows.

u_{l1} = x(2n) \cdot h(0) = [x(2n) - (x(2n) \gg 4)] \gg 1    (31)

u_{h1} = x(2n) \cdot g(0) = -[x(2n) + (x(2n) \gg 4)] \gg 3    (32)

u_{l2} = x(2n - 1) \cdot h(1) = x(2n - 1) - (x(2n - 1) \gg 2)    (33)

u_{h2} = x(2n - 1) \cdot g(1) = [-x(2n - 1) + (x(2n - 1) \gg 2)] \gg 2    (34)

u_{l3} = x(2n - 2) \cdot h(2) = [x(2n - 2) - (x(2n - 2) \gg 2)] \gg 2    (35)

u_{h3} = x(2n - 2) \cdot g(2) = x(2n - 2) - (x(2n - 2) \gg 2)    (36)

u_{l4} = x(2n - 3) \cdot h(3) = -[x(2n - 3) + (x(2n - 3) \gg 4)] \gg 3    (37)

u_{h4} = x(2n - 3) \cdot g(3) = [-x(2n - 3) + (x(2n - 3) \gg 4)] \gg 1    (38)

The low-pass and high-pass outputs of the DB-4 filter are calculated as:

u_{l}(n) = u_{l1} + u_{l2} + u_{l3} + u_{l4}    (39)

u_{h}(n) = u_{h1} + u_{h2} + u_{h3} + u_{h4}    (40)

A shift-add register (SAR) is designed using the shift-add relations of equations (31)-(40) to compute the partial filter outputs {u_l1, u_l2, u_l3, u_l4} and {u_h1, u_h2, u_h3, u_h4} of the low-pass and high-pass DB-4 filters.

The internal structure of the SAR is shown in Figure 2; it comprises 8 adders and 24 shift operations. The partial results corresponding to the low-pass and high-pass filter outputs are added separately in two tree adders (TAs) to produce the filter outputs {ul(n), uh(n)}.
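A behavioural sketch of equations (31)-(40) on integer samples is given below. It is not the hardware design itself: the function name sar, the use of Python's arithmetic right shift (which rounds toward minus infinity), and the choice to negate after shifting in equations (32) and (37) are assumptions made for illustration.

```python
def sar(x0, x1, x2, x3):
    """Shift-add evaluation of equations (31)-(40).

    x0..x3 are the integer samples x(2n), x(2n-1), x(2n-2), x(2n-3).
    Returns the pair (u_l(n), u_h(n)).
    """
    ul1 = (x0 - (x0 >> 4)) >> 1       # x * 0.46875     (31)
    uh1 = -((x0 + (x0 >> 4)) >> 3)    # x * -0.1328125  (32)
    ul2 = x1 - (x1 >> 2)              # x * 0.75        (33)
    uh2 = (-x1 + (x1 >> 2)) >> 2      # x * -0.1875     (34)
    ul3 = (x2 - (x2 >> 2)) >> 2       # x * 0.1875      (35)
    uh3 = x2 - (x2 >> 2)              # x * 0.75        (36)
    ul4 = -((x3 + (x3 >> 4)) >> 3)    # x * -0.1328125  (37)
    uh4 = (-x3 + (x3 >> 4)) >> 1      # x * -0.46875    (38)
    # The two final sums play the role of the tree adders TA-1 and TA-2.
    return ul1 + ul2 + ul3 + ul4, uh1 + uh2 + uh3 + uh4


# Inputs that are multiples of 256 make every shift exact, so the result
# equals the product with the approximated coefficients of Table 1.
print(sar(256, 256, 256, 256))  # (326, -10)
```

For inputs that are not multiples of 256 the right shifts discard low-order bits, so the integer result can differ slightly from the exact product with the approximated coefficients.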


Figure 2: Proposed SAR internal architecture using approximate shift-add operation.

IV. SIMULATION RESULTS AND DISCUSSION

To assess the effect of the filter-coefficient error on the filter output, the 4-level DWPT is calculated in MATLAB using both the 12-bit approximated CSD values of the Daub 4-tap filter and the 12-bit fixed-point values. The Lena image is used as the input signal. The mean square error (MSE) of the sub-band components {ul, uh, ull, ulh, uhl, uhh, ulll, ullh, ulhl, ulhh, uhll, uhlh, uhhl, uhhh, ullll, ulllh, ullhl, ullhh, ulhll, ulhlh, ulhhl, ulhhh, uhlll, uhllh, uhlhl, uhlhh, uhhll, uhhlh, uhhhl, uhhhh} of the 4-level DWPT obtained with the 12-bit approximated CSD values is computed with respect to the corresponding sub-band values obtained with the fixed-point filter coefficients.
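The comparison described above can be sketched as follows for a single low-pass sub-band. The simulation in the paper was run in MATLAB; this Python fragment only illustrates the idea. The MSE is taken as the usual mean of squared differences, which is an assumption because the formula is not written out here, and the input sequence is a placeholder rather than Lena image data.

```python
def lowpass(x, h):
    """u_l[n] = sum_i h[i] * x[2n - i], with x(k) = 0 assumed for k < 0."""
    return [
        sum(c * x[2 * n - i] for i, c in enumerate(h) if 2 * n - i >= 0)
        for n in range(len(x) // 2)
    ]


def mse(reference, approximate):
    """Mean of the squared sample-wise differences."""
    if len(reference) != len(approximate):
        raise ValueError("sequences must have the same length")
    return sum((r - a) ** 2 for r, a in zip(reference, approximate)) / len(reference)


# Decimal coefficient values as listed in Table 1.
H_FIXED = [0.482910156, 0.83642578125, 0.224609375, -0.12939453125]
H_APPROX = [0.46875, 0.75, 0.1875, -0.1328125]

# Placeholder 64-sample input (not the Lena data used in the paper).
samples = [float((37 * n) % 256) for n in range(64)]

ul_fixed = lowpass(samples, H_FIXED)
ul_approx = lowpass(samples, H_APPROX)
print(mse(ul_fixed, ul_approx))
```

The same two functions can be applied to any other sub-band by feeding the parent sub-band in place of samples.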


Fig 3: Level-1 fixed-point output

Fig 3 shows the level-1 fixed-point outputs ul and uh, each with 32 samples, for a 64-sample input taken from the Lena image.

Fig 4: Level-1 approximate-value output

Fig 4 shows the level-1 approximate-value outputs ul and uh, each with 32 samples, for a 64-sample input taken from the Lena image.

Fig 5: Level-2 fixed-point output

Fig 5 shows the level-2 fixed-point outputs ull, ulh, uhl and uhh, each with 16 samples, for a 32-sample input.

Fig 6: Level-2 approximate-value output

Fig 6 shows the level-2 approximate-value outputs ull, ulh, uhl and uhh, each with 16 samples, for a 32-sample input.

Fig 7: Level-3 fixed-point output

Fig 7 shows the level-3 fixed-point outputs ulll, ullh, ulhl, ulhh, uhll, uhlh, uhhl and uhhh, each with 8 samples, for a 16-sample input.

Fig 8: Level-3 approximate-value output

Fig 8 shows the level-3 approximate-value outputs ulll, ullh, ulhl, ulhh, uhll, uhlh, uhhl and uhhh, each with 8 samples, for a 16-sample input.

Fig 9.1: Level-4 fixed-point output

Fig 9.1 shows the level-4 fixed-point outputs ullll, ulllh, ullhl, ullhh, ulhll and ulhlh, each with 4 samples, for an 8-sample input.

Fig 9.2: Level-4 fixed-point output

Fig 9.2 shows the level-4 fixed-point outputs ulhhl, ulhhh, uhlll, uhllh, uhlhl and uhlhh, each with 4 samples, for an 8-sample input.

Fig 9.3: Level-4 fixed-point output

Fig 9.3 shows the level-4 fixed-point outputs uhlhl, uhlhh, uhhll, uhhlh, uhhhl and uhhhh, each with 4 samples, for an 8-sample input.

Fig 10.1: Level-4 approximate-value output

Fig 10.1 shows the level-4 approximate-value outputs ullll, ulllh, ullhl, ullhh, ulhll and ulhlh, each with 4 samples, for an 8-sample input.

Fig 10.2: Level-4 approximate-value output

Fig 10.2 shows the level-4 approximate-value outputs ulhhl, ulhhh, uhlll, uhllh, uhlhl and uhlhh, each with 4 samples, for an 8-sample input.

Fig 10.3: Level-4 approximate-value output

Fig 10.3 shows the level-4 approximate-value outputs uhlhl, uhlhh, uhhll, uhhlh, uhhhl and uhhhh, each with 4 samples, for an 8-sample input.

Table 2 gives the error estimation of the approximated four-level DWPT sub-band outputs. For each sub-band, the table lists the average fixed-point output, the average approximated output and the estimated mean square error of the sub-band components. The approximated filter coefficients introduce a small amount of error in the sub-band outputs of the DWPT with respect to the values corresponding to the 12-bit filter coefficients.

Table 2: Error estimation of approximated four-level DWPT sub-band output

| Four-level filter | Sub-band output | Average fixed-point sub-band output | Average approximated-point sub-band output | Average relative error of approximated-point over fixed-point filter output | Average error of approximated filter output |
|---|---|---|---|---|---|
| Level-1 | ul | 121.55 | 83.9435 | 0.30 | 0.55 |
| | uh | 0.960 | -1.7468 | 0.81 | |
| Level-2 | ull | 99.8329 | 80.8981 | 0.18 | 0.55 |
| | ulh | 2.0087 | -0.9254 | 0.53 | |
| | uhl | 2.6373 | -0.3507 | 0.86 | |
| | uhh | 0.2919 | 0.47814 | 0.63 | |
| Level-3 | ulll | 61.6708 | 44.6588 | 0.27 | 0.75 |
| | ullh | 5.1155 | 1.5258 | 0.70 | |
| | ulhl | 0.2546 | -1.0614 | 3.16 | |
| | ulhh | 13.7550 | 10.6235 | 0.22 | |
| | uhll | 1.9662 | 0.2528 | 0.87 | |
| | uhlh | 3.0962 | 2.1459 | 0.30 | |
| | uhhl | 3.9368 | 3.1699 | 0.19 | |
| | uhhh | 2.2272 | 1.5796 | 0.29 | |
| Level-4 | ullll | -0.0538 | -0.05235 | 0.02 | 0.93 |
| | ulllh | -0.2011 | -0.1847 | 0.08 | |
| | ullhl | -0.2011 | -0.1847 | 0.08 | |
| | ullhh | -0.750 | -0.652 | 0.13 | |
| | ulhll | -0.2011 | -0.1847 | 0.08 | |
| | ulhlh | -0.750 | -0.652 | 0.13 | |
| | ulhhl | -0.750 | -0.652 | 0.13 | |
| | ulhhh | -2.80 | -2.301 | 0.17 | |
| | uhlll | 0.00085 | 0.0026 | 2.05 | |
| | uhllh | 0.003175 | 0.00915 | 1.87 | |
| | uhlhl | 0.0032 | 0.00915 | 1.87 | |
| | uhlhh | 0.0119 | 0.03225 | 1.66 | |
| | uhhll | 0.0032 | 0.00915 | 1.87 | |
| | uhhlh | 0.0119 | 0.03225 | 1.66 | |
| | uhhhl | 0.0119 | 0.03225 | 1.66 | |
| | uhhhh | 0.044 | 0.11385 | 1.50 | |

V. CONCLUSION

This paper proposes an approximate SAR unit that can be used efficiently to trade off area and delay against quality in error-resilient DSP systems. The SAR (shift-add register) unit for DWPT processing is built using the canonical signed digit (CSD) technique. An error analysis of the approximated SAR for the DWPT is carried out over four levels, and the MSE with respect to the fixed-point values is small. These approximate circuits are well suited to applications where accuracy is not a strict requirement.

REFERENCES

1. Noor Mahammad Sk. and Mohamed Asan Basiri M., “An efficient VLSI architecture for lifting based 1D/2D DWT,” Microprocessors and Microsystems, 2016.

2. T. Srinivasulu and G. Kiran Maye, “A survey on VLSI architectures for wavelets,” IJERA, vol. 7, no. 10, Oct. 2017.

3. Sujatha, Pushpawati Changlekar, and P. Anita, “Implementation of binary canonic signed digit multiplier using ASIC,” IJERA, vol. 3, no. 1, 2013.

4. V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, “Low-power digital signal processing using approximate adders,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 32, no. 1, pp. 124–137, Jan. 2013.

5. K.-J. Cho, K.-C. Lee, J.-G. Chung, and K. K. Parhi, “Design of low-error fixed-width modified Booth multiplier,” IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522–531, May 2004.

6. H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas, “Bio-inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 4, pp. 850–862, Apr. 2010.

7. S. Narayanamoorthy, H. A. Moghaddam, Z. Liu, T. Park, and N. S. Kim, “Energy-efficient approximate multiplication for digital signal processing and classification applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 6, pp. 1180–1184, Jun. 2015.

8. G. Zervakis, K. Pekmestzi, D. Soudris, S. Xydis, and K. Tsoumanis, “Design-efficient approximate multiplication circuits through partial product perforation,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 24, no. 10, pp. 3105–3117, Oct. 2016.

9. A. Momeni, J. Han, P. Montuschi, and F. Lombardi, “Design and analysis of approximate compressors for multiplication,” IEEE Trans. Comput., vol. 64, no. 4, pp. 984–994, Apr. 2015.

10. J. Garland and D. Gregg, “Low complexity multiply accumulate unit for weight-sharing convolutional neural networks,” IEEE Computer Architecture Letters, Dec. 2014, DOI 10.1109/LCA.2017.2656880.

11. J. Liang, J. Han, and F. Lombardi, “New metrics for the reliability of approximate and probabilistic adders,” IEEE Trans. Comput., vol. 63, no. 9, pp. 1760–1771, Sep. 2013.

12. K.-J. Cho, K.-C. Lee, J.-G. Chung, and K. K. Parhi, “Design of low-error fixed-width modified Booth multiplier,” IEEE Trans. VLSI Syst., vol. 12, no. 5, pp. 522–531, May 2004.

13. B. K. Mohanty and V. Tiwari, “Modified probabilistic estimation bias formulation for hardware efficient fixed-width Booth multiplier,” Circuits, Systems & Signal Processing, Springer, vol. 33, no. 12, pp. 3981–3994, July 2014.

14. E. J. King and E. E. Swartzlander, Jr., “Data-dependent truncation scheme for parallel multipliers,” in Proc. 31st Asilomar Conf. Signals, Circuits Syst., Nov. 1998, pp. 1178–1182.

15. C. Liu, J. Han, and F. Lombardi, “A low-power, high-performance approximate multiplier with configurable partial error recovery,” in Proc. Conf. Exhibit. 2014, pp. 1–4.
