
DESIGN AND VERIFICATION OF FAST 32 BIT BINARY FLOATING POINT MULTIPLIER BY INCREASING SPEED OF MANTISSA MULTIPLICATION


Academic year: 2020


Available Online at www.ijpret.com 955

INTERNATIONAL JOURNAL OF PURE AND

APPLIED RESEARCH IN ENGINEERING AND

TECHNOLOGY

A PATH FOR HORIZING YOUR INNOVATIVE WORK

DESIGN AND VERIFICATION OF FAST 32 BIT BINARY FLOATING POINT

MULTIPLIER BY INCREASING SPEED OF MANTISSA MULTIPLICATION

EKTA1, VINOD2, PRANALI3

1. Faculty of Electronics and Communication Engineering Department, IBSS College of Engineering, Amravati.

2. Faculty of Electronics and Communication Engineering Department, Manoharbhai Patel Institute of Engineering and Technology, Bhandara.

3. Faculty of Electronics Engineering Department, Radhikatai Pandav Polytechnic College, Nagpur.

Accepted Date: 05/03/2015; Published Date: 01/05/2015


Abstract: Floating point is the standard number format described by IEEE 754. In computing, floating point represents real numbers that span a wide range of values. Floating point operations are hard to implement on a Field Programmable Gate Array (FPGA). On the other hand, the floating point multiplier is one of the most useful modules in areas such as DSP, image processing, and the arithmetic units of microprocessors. The multiplier is at the heart of many applications, such as FFT and DFT units. Most applications need arithmetic operations that are both very fast and accurate. Many algorithms, such as Booth's algorithm, the Wallace tree, and the Dadda multiplier, are used to multiply binary numbers. This paper presents a fast single precision binary floating point multiplier. To improve the speed of the multiplier, we minimize the delay of the arithmetic operations at every stage of mantissa multiplication. This is done by using the Dadda algorithm for mantissa multiplication. The Dadda algorithm needs more slice area, but it increases the speed of the system. In addition, instead of AND gates, a shifting technique is used to form the partial product matrix, which is a new concept in the Dadda algorithm. This technique reduces the hardware requirement, which lowers the slice area and improves the speed of multiplication. All basic modules of this binary floating point multiplier are written in Verilog hardware description language (HDL) and targeted to the Virtex5 FPGA family in Xilinx ISE 13.1. The design is simulated using ModelSim 6.3f. The synthesis results are compared with those of a previously designed multiplier unit. Our binary single precision floating point multiplier achieves a maximum frequency of 550.5 MHz with an area of 619 slices and 566 LUT–flip-flop pairs on a Virtex5 xc5vlx20t-2ff323 FPGA.

Keywords: Mantissa, Dadda Algorithm, Fast Multiplier, Binary Multiplication

Corresponding Author: MS. EKTA

Access Online On:

www.ijpret.com

How to Cite This Article:

Ekta, IJPRET, 2015; Volume 3 (9): 955-964


INTRODUCTION

In the 1980s, the IEEE standardized the representation of real numbers in computing as the floating point number format, known as IEEE 754. Floating point arithmetic is an interesting subject for many researchers. This is not surprising, because floating point is used in almost every computer system, and almost every machine language supports floating point data types. Intel’s 80486 was the first processor in the x86 family with a built-in floating point unit. Computers from PCs and laptops to supercomputers have floating point arithmetic units, accelerators, compiler support, etc., and virtually every operating system must respond to floating point exceptions such as underflow and overflow. Various algorithms have been proposed for floating point number systems. Which algorithm to use depends on the requirements of the system: some algorithms compute faster, whereas others need more area to implement. In most DSP applications the multiplier has great significance, and it is required to compute very fast.

This paper reports on a fast binary floating point multiplier. A Dadda algorithm for mantissa multiplication is discussed and then designed to increase the speed of calculation. Finally, the designed multiplier unit is targeted to an FPGA, and the synthesis results are compared with the results of an existing system.

Floating point numbers attempt to represent real numbers with uniform accuracy. A unique way to represent a real number x is in the form

$$x = a \times b^{n}$$

Where,

‘n’ is chosen so that ‘a’ falls within a defined range of values and

‘b’ is usually implicit in the data type and it is often equal to 2 (for binary).

S: Sign bit (0 for a positive and 1 for a negative number)

E: Exponent, which is 8 bits wide (bits 23 to 30)

F: Mantissa or fraction, which is 23 bits wide (bits 0 to 22)

Bit:    31 | 30 ... 23    | 22 ... 0
Field:  S  | E (Exponent) | F (Mantissa or Fraction)

Fig.1. IEEE 754 single precision floating point format

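The equivalent real number of a normalized value is given by

$$V = (-1)^{S} \times 1.F \times 2^{E - \mathrm{Bias}}$$

Where,

1.F is the significand formed by the implicit leading 1 followed by the 23 fraction bits F,

And Bias = 127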
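The field layout and the value formula above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `decode_single` and `value_of` are hypothetical, the input is assumed to be a normalized number (no zeros, subnormals, infinities, or NaNs), and the bit pattern used is operand ‘a’ from the simulation example later in the paper.

```python
BIAS = 127

def decode_single(bits: int):
    """Split a 32-bit single precision word into its (S, E, F) fields."""
    sign = (bits >> 31) & 0x1        # bit 31
    exponent = (bits >> 23) & 0xFF   # bits 30..23
    fraction = bits & 0x7FFFFF       # bits 22..0
    return sign, exponent, fraction

def value_of(bits: int) -> float:
    """Value of a normalized number: (-1)^S * 1.F * 2^(E - Bias)."""
    s, e, f = decode_single(bits)
    significand = 1 + f / 2**23      # implicit leading 1 plus the fraction
    return (-1) ** s * significand * 2.0 ** (e - BIAS)

a = int("01000001110011001100110011001101", 2)
print(decode_single(a))   # (0, 131, 5033165)
print(value_of(a))        # approximately 25.6
```

The exponent field 131 corresponds to a true exponent of 131 - 127 = 4, so the value is 1.6 × 2^4.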

Both algorithms consist of the same three stages, but the rules for reduction in the second stage are different. Because of this, the adders have different delays and different areas. Barry Fagin and Cyril Renard give the strengths and weaknesses of FPGAs for floating point arithmetic in Reference [5]. They also compare different adders and multipliers on the basis of area, delay, and performance. In Reference [6], an efficient implementation of a floating point multiplier is designed in VHDL. It achieves a maximum frequency of 301.114 MHz with an area of 604 slices.

In this paper we design a fast floating point multiplier using an updated Dadda algorithm. The design achieves a maximum frequency of 550.5 MHz with an area of 619 slices.

FLOATING POINT MULTIPLIER ALGORITHM

The floating point multiplier is one of the most widely used units in industry. Different techniques and algorithms are used to design multipliers. The main aim is to minimize the delay, latency, area, and power consumption, and to maximize the accuracy and speed of the multiplication.

Normally the following steps are necessary to multiply two floating point numbers.

1. Multiplying the significands, i.e., M1 × M2.

2. Placing the binary point in the result.

3. Adding the exponents and subtracting the bias, i.e., E1 + E2 - Bias.

4. Obtaining the sign, i.e., S1 XOR S2.

5. Normalizing the result, i.e., obtaining a 1 at the MSB of the significand product.

6. Rounding the result to fit in the available bit format.

7. Checking for underflow or overflow conditions.
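The seven steps can be traced in software with the following Python sketch. It is a behavioural model under stated assumptions, not the authors' Verilog: the name `fp32_multiply` is hypothetical, both inputs are assumed to be normalized, the implicit leading 1 is prepended to each mantissa (so the product here is up to 48 bits wide), rounding is done by truncation, and an out-of-range exponent simply raises an error.

```python
BIAS = 127

def fp32_multiply(a: int, b: int) -> int:
    """Multiply two normalized single precision bit patterns (truncating)."""
    s1, e1, m1 = a >> 31, (a >> 23) & 0xFF, (a & 0x7FFFFF) | 0x800000
    s2, e2, m2 = b >> 31, (b >> 23) & 0xFF, (b & 0x7FFFFF) | 0x800000

    product = m1 * m2              # step 1: 24 x 24 bit significand product
    exponent = e1 + e2 - BIAS      # step 3: add exponents, subtract bias
    sign = s1 ^ s2                 # step 4: sign is S1 XOR S2

    # Steps 2 and 5: the product of two values in [1, 2) lies in [1, 4).
    # If bit 47 is set the product is >= 2, so move the binary point one
    # place to the left and increment the exponent.
    if product & (1 << 47):
        product >>= 1
        exponent += 1

    fraction = (product >> 23) & 0x7FFFFF   # step 6: drop leading 1, truncate

    if not 0 < exponent < 255:     # step 7: overflow / underflow check
        raise OverflowError("exponent out of range for single precision")
    return (sign << 31) | (exponent << 23) | fraction

a = int("01000001110011001100110011001101", 2)
b = int("01000010000110111001100110011010", 2)
print(format(fp32_multiply(a, b), "032b"))
# 01000100011110001111010111000011
```

The two operands are the ones used in the simulation example of the experimental section, and the printed word matches the multiplication output reported there.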

The Dadda multiplier is a hardware multiplier design invented by computer scientist Luigi Dadda in 1965. It is similar to the Wallace multiplier, but it is slightly faster and requires fewer gates. The Dadda multiplier multiplies two binary numbers in three steps:

1. Generate the partial products by multiplying (logical AND) each bit of one operand with each bit of the other.

2. Reduce the number of partial products to two layers by using full and half adders.

3. Group the wires in two numbers, and add them with a conventional adder.

Unlike Wallace multipliers, which reduce as much as possible on each layer, Dadda multipliers do as few reductions as possible. Because of this, Dadda multipliers have a less expensive reduction phase, but the numbers may be a few bits longer, thus requiring a slightly bigger adder. In the Dadda algorithm, the following procedure is used to reduce the partial products to two rows.

1. Take $d_1 = 2$ as the minimum reduced height and let $d_{j+1} = \lfloor 1.5\, d_j \rfloor$, where $d_j$ is the height of the partial product matrix at the jth stage from the end. Repeat this calculation until the largest required height is obtained, i.e., find the largest j such that at least one column of the original partial product matrix has more than $d_j$ bits.

2. In the jth stage from the end, apply full and half adders (as needed) to reduce the column heights until no column has more than $d_j$ bits for that stage.

3. Decrement j and repeat step 2 until the height of each column is reduced to two.

This procedure reduces the height of the partial product matrix to two, after which the two rows are added using an appropriate adder. The values of $d_j$ are 2, 3, 4, 6, 9, ….
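The reduction schedule can be explored with a small Python sketch that tracks only column heights, not actual bit values. It is a rough model under stated assumptions: the names `dadda_targets` and `dadda_reduce` are hypothetical, the operands are n × n bits, and columns are processed from the least significant side with carries passed to the next column within the same stage.

```python
def dadda_targets(n: int) -> list:
    """Stage heights d_j smaller than n, largest first (d_1 = 2, d_{j+1} = floor(1.5 d_j))."""
    d = [2]
    while d[-1] * 3 // 2 < n:
        d.append(d[-1] * 3 // 2)
    return d[::-1]

def dadda_reduce(n: int):
    """Count the full and half adders used to reduce an n x n matrix to two rows."""
    # Column heights of the partial product matrix, plus one spare top column.
    heights = [min(c + 1, n, 2 * n - 1 - c) for c in range(2 * n - 1)] + [0]
    full = half = 0
    for d in dadda_targets(n):
        carries = 0                      # carries arriving from the previous column
        for c in range(len(heights)):
            h = heights[c] + carries
            carries = 0
            while h > d:
                if h == d + 1:           # one bit too many: a half adder is enough
                    half += 1
                    h -= 1
                else:                    # otherwise a full adder removes two bits
                    full += 1
                    h -= 2
                carries += 1             # each adder sends one carry to column c + 1
            heights[c] = h
    return full, half, heights

print(dadda_targets(24))   # [19, 13, 9, 6, 4, 3, 2]
print(dadda_reduce(4))     # (3, 3, [1, 2, 2, 2, 2, 2, 2, 0])
```

For a 24-bit operand the sequence gives seven reduction stages, and the 4 × 4 case shows the matrix ending with at most two bits in every column.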

MAIN BLOCKS OF FLOATING POINT MULTIPLIER

The block diagram of the floating point multiplier is given in “figure 2”. The main sub-blocks of the floating point multiplier are the sign calculator block, the exponent calculator block, the mantissa multiplier block, and the normalization unit.

A. Sign calculator block:-

This block is used to find the sign of the result. A two-input XOR gate is used. If both inputs are the same, the output is 0, i.e., a positive sign, and if the inputs are different, the output is 1, i.e., a negative sign.

B. Exponent calculator block:-

This block is used to find the exponent of the output. An 8-bit adder is used. First the exponents of the two input numbers are added, and then the bias is subtracted from the sum. The subtraction is done by adding the 2’s complement of the bias. In this block the bias is constant, and in our case it is 127.

C. Mantissa multiplier unit:-

In this block fixed point multiplication is done. Multiplication can be carried out with different algorithms, but in our project we used the Dadda algorithm. Here, instead of AND gates, a shifting algorithm is used to generate the partial product matrix. The partial products are then reduced to two rows with the help of adders, and a final addition gives the mantissa multiplication result, which is 46 bits wide.

D. Normalization unit:-

This unit is used to normalize the result. A normalized number has a leading ‘1’ immediately to the left of the binary point in the 46-bit mantissa multiplication result. The exponent is adjusted according to the position of the binary point. The leading ‘1’ is dropped from the result, and the remaining bits after the binary point are truncated to 23 bits, which form the mantissa of the final result.

If the binary point is shifted to the left, the exponent increases, and if it is shifted to the right, the exponent decreases. The amount by which the exponent increases or decreases equals the number of positions by which the binary point is shifted. Rounding of the result is performed by this truncation.
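Returning to the exponent calculator block, subtracting the bias by adding its 2’s complement can be sketched in Python as follows. This is only an illustration under stated assumptions: the name `exponent_sum` is hypothetical, and the arithmetic is kept to 8 bits as described in the text, so the result is taken modulo 256 and detecting overflow or underflow would need extra bits that are not modelled here.

```python
BIAS = 127
WIDTH = 8                      # adder width taken from the text
MASK = (1 << WIDTH) - 1

def exponent_sum(e1: int, e2: int) -> int:
    """E1 + E2 - Bias, computed as E1 + E2 + (2's complement of Bias)."""
    neg_bias = (~BIAS + 1) & MASK        # 2's complement of 127 in 8 bits = 129
    return (e1 + e2 + neg_bias) & MASK   # carries out of bit 7 are discarded

print(exponent_sum(131, 132))   # 136
```

The inputs 131 and 132 are the biased exponents of the two operands in the simulation example, and 136 is the biased exponent of the reported result.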

Fig.2. Block diagram of floating point multiplier

Shifting Algorithm:-

First we concatenate 23 zeros to the extreme left of the mantissa of both input operands. The next step is to multiply (logical AND) all bits of operand 1 with the first bit of operand 2. Instead of AND gates, we use a simple mechanism to get the partial product. If the first bit of operand 2 is logic high, i.e., 1, then operand 1 is copied down; otherwise, if the first bit of operand 2 is logic low, i.e., 0, a row of all zeros is copied down. This is the first row of the partial product matrix. Now consider the second bit of operand 2: if it is ‘1’, operand 1 is shifted left by 1 position and copied down; if it is ‘0’, a row of all zeros is again copied down. This is the second row of the partial product matrix. In this way the partial product matrix is formed. Row-wise addition is then done to get the final mantissa result. This is nothing but a fixed point multiplier.
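The copy-or-zero mechanism described above can be modelled in a few lines of Python. This is a behavioural sketch under stated assumptions: the names `partial_products` and `shift_multiply` are hypothetical, Python integers stand in for the zero-extended operands, and a plain sum replaces the Dadda reduction and final adder.

```python
def partial_products(op1: int, op2: int, width: int = 24) -> list:
    """Build the partial product rows by copy-or-zero instead of AND gates."""
    rows = []
    for i in range(width):
        if (op2 >> i) & 1:
            rows.append(op1 << i)   # bit i of operand 2 is 1: copy operand 1, shifted by i
        else:
            rows.append(0)          # bit i of operand 2 is 0: copy a row of zeros
    return rows

def shift_multiply(op1: int, op2: int, width: int = 24) -> int:
    """Add the rows; in hardware this sum is done by the reduction tree."""
    return sum(partial_products(op1, op2, width))

print(partial_products(0b101, 0b011, width=3))   # [5, 10, 0]
print(shift_multiply(0b101, 0b011, width=3))     # 15
```

Each row equals operand 1 ANDed with one bit of operand 2 and shifted into position, so the sum of the rows is the ordinary integer product.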

EXPERIMENTAL RESULT

All basic modules of the multiplier are written in Verilog HDL. With the help of these basic modules, our proposed floating point multiplier is designed in the same language. The whole design is synthesized using Xilinx ISE 13.1 and simulated in ModelSim 6.3. The design is targeted to a Virtex 6 family FPGA, device xc6vlx240t-1ff1156, and compared with a previously designed module. “Figure 3” shows the block view of the designed multiplier unit.

It has two 32-bit operands, ‘a’ and ‘b’, along with a clock, and one 32-bit output, which is the final result of the multiplication. This block contains many sub-blocks, such as XOR gates, half adders, full adders, and shifters. The multiplier gives its output in a single T state. The waveform in “figure 4” shows the simulation result of the multiplication of two numbers. Here we take two random 32-bit floating point numbers as input, and the unit gives a 32-bit output.

Fig.3. Block view of designed unit

Input:

a = 32'b01000001110011001100110011001101

b = 32'b01000010000110111001100110011010

Multiplication Output =

32'b01000100011110001111010111000011
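The example above can be cross-checked against a software reference using Python's standard `struct` module. This is a sketch with hypothetical helper names (`bits_to_float`, `float_to_bits`); note that `struct` rounds to nearest when packing to single precision, which may differ from the truncation used in the design for other inputs.

```python
import struct

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def float_to_bits(x: float) -> int:
    """Pack a Python float to single precision and return its bit pattern."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

a = int("01000001110011001100110011001101", 2)
b = int("01000010000110111001100110011010", 2)
out = int("01000100011110001111010111000011", 2)

# The operands decode to roughly 25.6 and 38.9, and the output to roughly 995.84.
print(bits_to_float(a), bits_to_float(b), bits_to_float(out))
print(float_to_bits(bits_to_float(a) * bits_to_float(b)) == out)   # True
```

For this pair of operands the reference product and the reported output agree bit for bit.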

The comparison of slice utilization, flip-flop and LUT pairs, and frequency is given in table 1. It is found that our design achieves a maximum frequency of 861 MHz in the best case with 433 occupied slices. Compared to the previously implemented multiplier, this multiplier has better results in terms of speed and slice area.

Parameter                      Proposed Unit    Multiplier unit in “Ref.[1]”
Used Slice                     619              1149
Used Flip flop and LUT pair    1230             1146
Max. Frequency in Best Case    550.5 MHz        527 MHz

TABLE 1. AREA AND FREQUENCY COMPARISON BETWEEN PROPOSED UNIT AND DESIGNED UNIT IN “REF [1]”

CONCLUSION AND FUTURE WORK

This paper presents a fast floating point multiplier that supports the IEEE 754 single precision binary floating point format. The design is targeted to a Virtex5 xc5vlx20t-2ff323 FPGA and compared with a previously designed module. It gives its output in only one T state and achieves a maximum frequency of 550.5 MHz with an area of 619 slices. To reduce delay and area, we used a shifting algorithm instead of AND gates for partial product formation. We plan to extend this work to an FFT unit.

ACKNOWLEDGMENTS

We express our deep sense of gratitude to Prof. P. M. Palsodkar and Dr. Pravin Dakhole, Department of Electronics Engineering, YCCE, Nagpur for lending every support at every stage of this project work. We are indebted to their esteemed guidance, constant encouragement and fruitful suggestions from the beginning to the end of this project. Their trust and support inspired us in the most important moments of making right decisions. We would like to thank our parents, who always encouraged us in the successful completion of our work.

REFERENCES

2. Anna Jain, Baisakhy Dash, Ajit Kumar Panda, Muchharla Suresh, “FPGA Design of a Fast 32-bit Floating Point Multiplier Unit”, International Conference on Devices, Circuits and Systems (ICDCS), Coimbatore, PP. 545-547, IEEE, 2012.

3. Loucas Louca, Todd A. Cook, and William H. Johnson, “Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs”, Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM’96), PP. 107-116, 1996.

4. Whytney J. Townsend, Earl E. Swartzlander, Jr., Jacob A. Abraham “A Comparison of Dadda and Wallace multiplier delays”, Advanced Signal Processing Algorithms, Architectures, and Implementations XIII, SPIE , San Diego, CA, PP. 552-560, August-2003.

5. Barry Fagin, Cyril Renard, “Field Programmable Gate Arrays and Floating Point Arithmetic”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 2, No. 3, PP. 365-367, September 1994.

6. Mohamed Al-Ashrafy, Ashraf Salem, Wagdy Anis, “An Efficient Implementation of Floating Point Multiplier”, Saudi International Electronics, Communication and Photonics Conference (SIECPC), Riyadh, PP. 1-5, 24-26 April 2011.
