• No results found

Novel Dsp Accelarator Architecture Based On Carry Save Arithmetic

N/A
N/A
Protected

Academic year: 2020

Share "Novel Dsp Accelarator Architecture Based On Carry Save Arithmetic"

Copied!
6
0
0

Loading.... (view fulltext now)

Full text

(1)

Available online: https://edupediapublications.org/journals/index.php/IJR/ P a g e | 2018

Novel Dsp Accelarator Architecture Based On Carry Save

Arithmetic

Katabathini Venkata Ramarao, Rajesh Kanuganti & N.Chandrashekhar

1M-Tech, Dept. of ECE,Khammam Institute of Technology and Science, Khammam.

2Associate Professor, Dept. of ECE,Khammam Institute of Technology and Science, Khammam.

3Associate Professor& HOD,Dept. of ECE,Khammam Institute of Technology and Science,

Khammam. Abstract

In the virtual signal processing (DSP) area,

Hardware acceleration is proved an

exceptionally promising implementation

strategy. Instead of adopting a monolithic software-specific integrated circuit layout technique, a unique accelerator structure comprising bendy arithmetic components that guide the execution of a massive set of operation templates discovered in DSP kernels is introduced. One of its essential peculiarity is to enable computations to be aggressively performed with deliver-shop (CS) formatted statistics. Incorporation of Error Tolerant Adder is some otherspeciality. Advanced arithmetic layout ideas, i.E., recoding techniques, and more desirable arithmetic components are utilized permitting CS optimizations to be executed in a bigger scope than in previous approaches.

Keywords: Arithmetic optimizations, deliver-shop (CS) shape, data path synthesis, Error

Tolerant Adder, bendy accelerator, operation chaining.

1. INTRODUCTION

Modern embedded systems goal excessive-stop application domain names requiring efficient implementations of computationally intensive virtual sign processing (DSP) features. The incorporation of heterogeneity via specialized hardware accelerators improves performance

and reduces strength intake although

application-precise incorporated circuits

(ASICs) shape the best acceleration solution in phrases of overall performance and electricity, their inflexibility results in expanded silicon complexity, as multiple instantiated ASICs are had to accelerate various kernels. Many researchers have proposed the use of

area-specific coarse-grained reconfigurable

(2)

High-Available online: https://edupediapublications.org/journals/index.php/IJR/ P a g e | 2019 performance flexible facts paths havebeen

proposed to successfully map primitive or chained operations located in the preliminary statistics-float graph (DFG) of a kernel. The templates of complex chained operations are both extracted directly from the kernel’s DFG [10] or specified in a predefined behavioral template library [4], [6], [7]. Design selections at the accelerator’s statistics course especially affect its efficiency. Existing works on coarse-grained reconfigurable information paths in particular take advantage of structure-degree optimizations, egg.Expanded education-stage parallelism (ILP) [2]–[5], [7]. The domain-precise architecture technology algorithms of [5] and [9] range the sort and range of

computation devices accomplishing a

customized design structure. In [2] and [4], bendy architectures have been endorse dexploiting ILP and operation chaining. Recently, Ansaloni et al. [8]adopted aggressive operation chaining to enable the computation of whole sub expressions the use of more than one ALUs with heterogeneous arithmetic

capabilities. The aforementioned

reconfigurable architectures exclude arithmetic optimizations in the course of the architectural synthesis and take into account them only at the inner circuit shape of primitive components, egg.Adders, in the course of the

logic synthesis. However, studies sports have proven that the arithmetic optimizations at better abstraction stages than the structural circuit one drastically effect on the information direction performance.

In, timing-driven optimizations based totally on bring-store (CS)arithmetic were accomplished on the post-Register Transfer Level(RTL) layout degree. In, common sub expression elimination

in CS computations is used to optimize linear DSP circuits. Verma et al. Developed transformation strategies on the utility’s DFG to maximise the use of CS arithmetic earlier the actual information path synthesis. The aforementioned CS optimization procedures target inflexible information path, i.E., ASIC, implementations. Recently, Xydis et al. [6], [7] proposed a bendy architecture combining the ILP and pipelining strategies with the CS-conscious operation chaining. However, all the aforementioned answers characteristic an inherent drawback, i.E., CS optimization is

bounded to merging handiest

(3)

time-Available online: https://edupediapublications.org/journals/index.php/IJR/ P a g e | 2020 ingesting carry propagations. In this quick, we

endorse a excessive-overall performance architectural scheme for the synthesis of bendy hardware DSP accelerators with the aid of combining optimization strategies from each the architecture and mathematics stages of abstraction. We present an adaptable data route layout that exploits CS stepped forward layouts of tied operations. The proposed structure includes adaptable computational devices (FCUs), which permit the execution of a significant association of operation formats discovered in DSP quantities. The proposed quickening agent engineering conveys average gains of up to 61.Ninety one% in territory delay object and fifty four.Forty three% in

energy consumption contrasted with

circumstance of-craftsmanship adaptable

information paths [4], [7],sustaining skill ability toward scaled innovations.

2. DSP acceleration:

Table II suggests the theoretically estimated values for the execution latency and region complexity of the DSP kernels mapped onto the examined architectures. The analysis is primarily based at the unit gate version as in Section IV. Regarding each the execution latency and the location complexity and thinking about all of the DSP kernels, the

proposed FCU-based totally structure

outperforms those constructed at the FCC and the RAU. As expected, the timing constraints and the effects of mobile sizing implied with the aid of the Design Compiler synthesis tool, in some cases result in inconsistencies between the experimental and the theoretical research, e.G., in Table I, the latency of ELLIPTIC kernel on FCC is extra efficient than the only on RAU, but in Table II, the RAU-based ELLIPTIC kernel outperforms the only based at the FCC. In any case, both the experimental and theoretical analysis indicated that the proposed method paperwork the maximum green architectural solution

3.CARRY-SAVE ARITHMETIC

(4)

Available online: https://edupediapublications.org/journals/index.php/IJR/ P a g e | 2021 CS optimization procedures have restricted

impact on DFGs dominated by means of multiplications, e.G., filtering DSP programs. 4.EXPERIMENTAL RESULTS

Schematic

Technological schematic

Simulation

Design summery 5.CONCLUSION

(5)

Available online: https://edupediapublications.org/journals/index.php/IJR/ P a g e | 2022 enabling excessive levels of computational

density to be carried out. Theoretical and experimental analyses have proven that the proposed solution forms an green design tradeoff factor handing over optimized

latency/vicinity and electricity

implementations. 6.REFERENCE

[1] P. Ienne and R. Leupers, Customizable Embedded Processors: Design Technologies and Applications. San Francisco, CA, USA: Morgan Kaufmann, 2007.

[2] P. M. Heysters, G. J. M. Smit, and E. Molenkamp, “A flexible and energy-efficient coarse-grained reconfigurable architecture for mobile systems,” J. Super comput., vol. 26, no. 3, pp. 283–308, 2003.

[3] B. Mei, S. Vernalde, D. Verkest, H. D. Man, and R. Lauwereins,“ADRES: An architecture with tightly coupled VLIW processor and coarse-grained reconfigurable matrix,” in Proc. 13th Int. Conf. Field Program.Logic Appl., vol. 2778. 2003, pp. 61– 70.

[4] M. D. Galanis, G. Theodoridis, S. Tragoudas, and C. E. Goutis,“A high-performance data path for synthesizing DSP kernels,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 6,pp. 1154– 1162, Jun. 2006.

[5] K. Compton and S. Hauck, “Automatic design of reconfigurable domain specific flexible cores,” IEEE Trans. Very Large Scale Integr. (VLSI)Syst., vol. 16, no. 5, pp. 493– 503, May 2008.

[6] S. Xydis, G. Economakos, and K.

Pekmestzi, “Designing coarse-grain

reconfigurable architectures by in lining flexibility into custom arithmeticdata-paths,” Integr., VLSI J., vol. 42, no. 4, pp. 486–503, Sep. 2009.

[7] S. Xydis, G. Economakos, D. Soudris, and K. Pekmestzi, “High performance and area efficient flexible DSP data path synthesis,” IEEE Trans.Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 3, pp. 429–442,Mar. 2011. [8] G. Ansaloni, P. Bonzini, and L. Pozzi, “EGRA: A coarse grained reconfigurable architectural template,” IEEE Trans. Very Large ScaleIntegr. (VLSI) Syst., vol. 19, no. 6, pp. 1062–1074, Jun. 2011.

[9] M. Stojilovic, D. Novo, L. Saranovac, P. Brisk, and P. Ienne, “Selective flexibility:

Creating domain-specific reconfigurable

arrays,” IEEE Trans.Comput.-Aided Design Integr. Circuits Syst., vol. 32, no. 5, pp. 681– 694,May 2013.

(6)

Available online: https://edupediapublications.org/journals/index.php/IJR/ P a g e | 2023 Design Autom.Electron. Syst., vol. 7, no. 4, pp.

605–627, Oct. 2002. Authors Profile

KATABATHINIVENKATA RAMARAO

He is pursuing M-Tech (VLSI system design)

Degree from Khammam Institute of

Technology and Science(JNTU

HYDERABAD), Ponnekal, Khammam, in (2015-2017) 2017, and B-Tech Degree from Khammam Institute of Technology and Science (JNTU HYDERABAD), ponnekal , Khammam,in ( 2011-2015)2015 , all in Electronics and Communication engineering.

RAJESH KANUGANTI

He is hailed from KHAMMAM (Dist.) born on 23rd Aug 1984. He received B-Tech in Electronics and Communication Engineering

from JNTU, Hyderabad, AP. He received M-Tech in E.I.E from Andhra University, Visakhapatnam, AP, and India. His research interests include Fuzzy logic system used in Signal processing and Embedded Systems Design, Optoelectronics in MEMS. He has published 04 International Journal & 06 National Conference. Presently he is working as Associate Professor in Khamma Institute of

Technology and Science(KITS),

Khammam,Telangana,India. He is having 9 years’ experience in teaching field.

N.CHANDRASHEKHAR

References

Related documents

[14] in a study to investigate the influence of HRIS on performance of Kenyan public universities found that HRIS application on recruitment and

Roman women, gender identity, social identity, economy, epigraphy, Roman literature,.. working women,

12.4.4.4 Plot of Bisphenol A adsorption isotherms by chemically activated sludge (Chapter 7) 417 12.4.4.5 Parameter fitted to Freundlich, Langmuir and Temkin model equations of

Double basin active solar still coupled with evacuated glass tube solar collector and use of glass stray inside the solar still, to reduce heat loss of upper portion

Assessment of approximate chemical formula and energy content by Modified Dulong formula of municipal solid waste of Allahabad city.. Vijai Krishna 1* & Sadhana

Comparison results between 72-layer and 256-layer HSV features in terms of precision (top), recall (middle) and F-score (bottom). The retrieval time when applying each method

These two genres are: (1) the music videos mentioned above, and (2) news broadcasts from American television. These videos are discussed more in Chapter 4. Firmly planted within

The integro-differentiation fractional order, which is either real or complex depending on the generation of the control design, permits parameterization of the open-loop