Low Power Clock Gating Method with Subword based Signal Range Matching Technique


Academic year: 2022


A. Ranga Nayakulu1* and K. Satya Prasad2

1KITS, Markapur – 523316, Andhra Pradesh, India; [email protected]

2JNTUCEK, Kakinada - 533003, Andhra Pradesh, India

Abstract

Low power VLSI is a key technology area enabling battery-powered applications. The research presented here develops a clock gating scheme for low power VLSI based on signal range comparison. In the first stage, the work is applied to FIR architectures. In the second stage, circuit-level simulation is carried out to study the usability of the proposed technique in both ASIC and FPGA applications. Low power validation of the proposed clock gating scheme at circuit level is simulated in 130 nm technology using SPICE tools. Analyses are carried out to study the effectiveness of the clock gating scheme with respect to the presence of information in a given 2’s complement signal. In the final stage, to demonstrate an application of the clock gating scheme, the FIR filter is extended to realize a real-time signal correlator that includes a Finite State Machine (FSM). All filter and correlator blocks are first simulated in ModelSim and the results are verified. Xilinx ISE tools are used to verify the synthesis aspects. Power analysis is carried out using the Xilinx XPower tool. The transposed FIR is more suitable for VLSI implementation and demonstrates a power saving of 47% compared with the non-clock-gated scheme. In the circuit simulation results, it is observed that a 21% power saving is possible at each subword register stage with the proposed clock gating scheme when the No-Information (NOI) input signal condition has 50% probability. The subword clock gated correlator shows a power saving of 34% under the given signal conditions. The research demonstrates an improved clock gating mechanism suitable for most DSP applications. The proposed scheme can be more effective when the signal has low amplitude or in narrow-band signal applications. The work finds application in several future low power VLSI applications.

Keywords: Correlator, Dynamic Power, Subword Register, SPICE Power Analysis

1. Introduction

Low power VLSI is a key technology area affecting the growth of personal mobile communication, portable battery-powered devices and several wireless equipment industries. Complementary Metal–Oxide–Semiconductor (CMOS) technology has become the default choice for low power VLSI design. Power in CMOS VLSI circuits falls into two categories: static and dynamic.

Static power depends on leakage current, while dynamic power is due to transients (Ptransient) and the capacitive load (Pcap), as described by the equations below.

$P_{static} = I_{static}\,V_{dd}$ (1)

where Istatic is the static current flowing through the device and Vdd is the supply voltage. The major advantage of CMOS VLSI circuits is low static power. However, in present-generation nanometer devices, reduced gate oxide thickness increases the probability of tunneling, and the increased tunneling results in larger leakage currents. Several researchers have proposed methods to decrease leakage power. With the growing computing demand of future rich applications, the dynamic power component is becoming dominant. This paper presents a novel clock gating technique to reduce dynamic power. The work continues the research published in1-3.

The primary source of dynamic power consumption is switching activity at both input and output. This power is needed to charge and discharge the capacitance at the output gate. In a CMOS circuit, dynamic power consumption is the sum of Ptransient and Pcap. Ptransient is the power consumed in toggling the device from the ‘0’ state to the ‘1’ state and vice versa. Pcap is due to charging and discharging of the capacitive load.

Dynamic Power = PCap + PTransient

$P_{Dynamic} = (C_L + C)\,V_{dd}^{2}\,f\,\alpha$ (2)

where PCap is the power consumed by the total capacitive load, PTransient is the power consumed due to transients, CL is the load capacitance, C is the internal capacitance of the gate, α is the activity factor and f is the frequency.

$P_{Total} = P_{static} + P_{Dynamic}$ (3)

From Equation (3), the total power PTotal is the sum of PDynamic and Pstatic. Since dynamic power is the crucial element in CMOS VLSI circuits, several researchers have analyzed ways of reducing it. Clock gating is very effective here: it disables registers and, consequently, the associated combinational logic, which lowers dynamic power. The several types of clock gating techniques are discussed in the next section.
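Equations (1) to (3) can be evaluated numerically to see how the activity factor drives dynamic power. The following is a minimal sketch; the function names and the operating point (capacitances, supply voltage, frequency) are hypothetical values chosen for illustration and do not come from the paper.

```python
def static_power(i_static, vdd):
    """Equation (1): P_static = I_static * Vdd."""
    return i_static * vdd


def dynamic_power(c_load, c_internal, vdd, freq, activity):
    """Equation (2): P_dynamic = (C_L + C) * Vdd^2 * f * alpha."""
    return (c_load + c_internal) * vdd ** 2 * freq * activity


def total_power(i_static, c_load, c_internal, vdd, freq, activity):
    """Equation (3): P_total = P_static + P_dynamic."""
    return static_power(i_static, vdd) + dynamic_power(
        c_load, c_internal, vdd, freq, activity
    )


# Hypothetical operating point (illustrative values only).
C_LOAD = 10e-15      # load capacitance in farads
C_INTERNAL = 5e-15   # internal gate capacitance in farads
VDD = 1.2            # supply voltage in volts
FREQ = 100e6         # clock frequency in hertz

p_ungated = dynamic_power(C_LOAD, C_INTERNAL, VDD, FREQ, activity=0.5)
p_gated = dynamic_power(C_LOAD, C_INTERNAL, VDD, FREQ, activity=0.25)
print(f"ungated: {p_ungated:.3e} W, gated: {p_gated:.3e} W")
```

Because Equation (2) is linear in α, halving the activity factor of a node (for example by gating its clock half of the time) halves the dynamic power of that node in this model.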

2. Clock Gating

2.1 Different Clock Gating Methods

This section presents the clock gating schemes currently employed in Electronic Design Automation (EDA) tools from different vendors. The tools (synthesis, place and route, and clock tree insertion) accept user primitives along with the user design and build clock trees with gating applied at various stages.

The clock gating methods5 can be classified into the following groups.

• Synthesis-based methods are widely used in present-day EDA tools; the clock enables are synthesized from the logic of the underlying system. Their limitation is the increased redundancy in the logic added for clock gating.

• Data-driven clock gating compares data values to decide the clock gating signal. Several methods are emerging in this category that achieve higher power savings. These algorithms adopt signal processing methods for generating the clock enable signal. The present research develops a method in this category.

• Auto-gated flip-flops are simple to design but yield only small power savings in most applications.

2.2 Overview of the Proposed Clock Gating Scheme

Among the different clock gating schemes, some are suitable for signal processing applications. These techniques offer the opportunity to exploit additional properties of the signal to generate the clock gating signal. The research in5 proposes Look-Ahead Clock Gating (LACG). In this method, the enable signal for the clock input of each FF is computed one cycle ahead of time, based on the present-cycle values of the FFs on which it depends. The work presented here uses a similar approach and adds subword partitioning: each register is divided into sections so that the clock gating scheme can be applied effectively.

2.3 Literature Survey - Clock gating at RTL

The present paper continues the research reported in1-3. The work in1 illustrates the principle of subword-based clock gating in a typical Register Transfer Level (RTL) design flow. Signed signal representation in the 2’s complement number system is assumed, which applies to most DSP algorithms. Depending on the signal value, the number of information-carrying bits can be computed, that is, whether the magnitude bits of a particular subword of a given word carry information. Power analysis is carried out on a clock-gated, subword-based Multiply Accumulate (MAC) block. The work is extended in2 with a transistor-level design of the proposed clock gating scheme. A gate-level optimal No-Information (NOI) detector logic is implemented, and this NOI logic is used to gate the clock. Power optimization levels under different signal value probabilities are discussed.

The research published in6 presents clock gating for the Floating-Point Unit (FPU) of a microprocessor. The FPU is made inactive for application programs that do not require floating point computations. Similarly, several blocks of a microprocessor are used less than 50% of the time6 and can be kept disabled. A few other methods that take advantage of inactive blocks are described in7. The basic principle is to hold the clock at zero for inactive blocks. An extension of these principles to synthesis-level clock gating is presented in8.

If a specific register stage is not active for a certain period, the related clock buffer should be disabled. The corresponding combinational block is thereby turned off using a clock gating signal. This principle is shown in Figure 1. The method is popular in RTL designs for decreasing power consumption. In the figure, clock buffer 1 is not clock-gated, so its clock signal CLK1 is always applied to the n-bit register, whereas local buffer 2 is clock-gated. When the CG_sel signal is ‘0’, CLK2 is disabled, which removes the power consumed by CLK2. In addition, the dynamic power consumed by block B, which follows the clock-gated registers, is saved while CLK2 is disabled. For fine-grained clock gating control, an enable signal can be used at each register stage.

Figure 1. RTL circuit non clock gated and clock gated register stages.
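The behaviour of the gated stage in Figure 1 can be sketched as a cycle-by-cycle model. This is a minimal behavioural illustration, not the circuit from the paper: the function name `simulate_gated_register` and the list-based signal representation are assumptions, and CLK2 is modelled simply as CLK AND CG_sel.

```python
def simulate_gated_register(data_in, cg_sel, initial=0):
    """Model a register whose clock is CLK2 = CLK AND CG_sel.

    data_in and cg_sel hold one value per clock cycle. The register captures
    its input only in cycles where CG_sel is 1; otherwise it holds its value
    and receives no clock edge.
    Returns (register output per cycle, number of clock edges delivered).
    """
    q = initial
    outputs = []
    clock_edges = 0
    for d, sel in zip(data_in, cg_sel):
        if sel:
            q = d              # clock edge reaches the register
            clock_edges += 1
        outputs.append(q)      # gated cycles leave q unchanged
    return outputs, clock_edges


outputs, edges = simulate_gated_register([1, 2, 3, 4], [1, 0, 0, 1])
print(outputs, edges)
```

In this run only two of the four clock edges reach the register, and its output stays constant during the gated cycles, so the logic it drives (block B in Figure 1) sees no input transitions in those cycles.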

In the auto-gated FF technique, the clock frequency and period constraints at every stage are treated as requirements. Based on these, the blocks for which the clock can be disabled are identified9.

Figure 2. Two types of non Clock gating circuits (a) Without enable (b) With enable.

There are a few other types of clock gating, which are discussed in this section. In RTL designs, digital circuits always contain some redundant computations. By disabling the clock, power can be minimized at the respective stages10. In11, two types of non-CG circuits are presented: Figure 2(a) shows a circuit with no enable signal, while Figure 2(b) shows a circuit with an enable signal.

For the configuration without an enable, researchers developed the bus-specific clock gating (BSC)12, TCG13 and OBSC14 techniques. These mechanisms reduce power consumption by taking the switching activity of signals into account. For digital circuits that use an enable signal, techniques such as Local-Explicit Clock Gating (LECG), Enhanced Clock Gating (ECG), Waste-Toggle-Rate-based (WTR) clock gating11 and Single Comparator-Based Clock Gating (SCCG)15 are suitable. The principle of Bus-Specific Clock gating (BSC) is shown in Figure 3. Here12, the input and output of each individual Flip-Flop (FF) are fed to an XOR gate. If the two bits are equal, the XOR produces ‘0’, which disables the clock to save power.

Figure 3. Clock gating technique - Bus specific clock gating.
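The XOR-based enable described for BSC can be written out in a few lines. The sketch below is illustrative only: the function name `bsc_clock_enables` and the bit-list representation are assumptions, and the model covers only the per-flip-flop comparison, not the clock tree of Figure 3.

```python
def bsc_clock_enables(d_bits, q_bits):
    """Per-flip-flop clock enable for bus-specific clock gating.

    Each flip-flop's input bit D is XORed with its current output bit Q.
    A result of 0 (bits equal) means the clock edge would not change the
    stored value, so the clock for that flip-flop can be disabled.
    """
    return [d ^ q for d, q in zip(d_bits, q_bits)]


enables = bsc_clock_enables([1, 0, 1, 1], [1, 1, 1, 0])
print(enables)
```

Only the flip-flops whose enable is 1 need a clock edge in that cycle; the others already hold the value they would capture.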

The Enhanced Clock Gating (ECG) technique, which combines the BSC and LECG gating techniques, is shown in Figure 4. The gated clock of a register is disabled as long as its toggling would result in a wasted transition. In contrast to bus-specific clock gating, Threshold-based Clock Gating (TCG) gates all the flip-flops without considering signal activities; this data-driven clock gating method was proposed in13 to improve on BSC.

Figure 4. Enhanced clock gating circuit schematic.


A fine-grained, activity-driven clock gating method is used in Optimized Bus-Specific Clock gating (OBSC), which is clearly explained in14. In contrast with TCG, the OBSC technique determines which flip-flops should be grouped and gated by considering the logical association of each FF14. The OBSC method in particular proposes a single comparator-based solution for clock gating to enhance ECG.

The work published in16 designs the decoder and encoder blocks of a communication system with a clock gating scheme for power optimization without compromising system performance. Extending the basic clock gating principle, many researchers have also attempted circuit-level techniques. An adaptive pulse-triggered flip-flop (PTFF) method is presented in17; that work reports a PTFF with dynamic power optimization and acceptable timing characteristics.

Several other researchers18-21 worked on variants of these clock gating methods for developing DSP applications on FPGAs. The work in22 discusses fine-grained dynamic clock gating for an LDPC decoder. For streaming applications, the methods in23 offer solutions based on asynchronous queued blocks. The work in24 combines three methods (data-driven, auto-gated flip-flops and synthesis-based gating) to reduce clock switching power.

A clock gating scheme suited to network processing applications is given in25. Sequential Equivalence Checking (SEC) based clock gating methods, with suitable theorems, are described in26. Different clock gating schemes, such as latch, flip-flop, AND gate and mux based schemes, are discussed along with their results in27,28. The research in29 presents dynamic power reduction by reducing the circuit area of a three-bit full adder. A low power shift register built from DFFs that integrates clock gating and power gating is presented in30; two schemes, Optimized Bus-Specific Clock Gating (OBSC) and Run Time Power Gating (RTPG), are discussed using Tanner tools.

Design methodologies with clock gating for ASICs are discussed in31,32. The work in33 uses merge and split clock gating concepts to achieve low power dissipation. The work in34 discusses a Reduced Instruction Set Architecture (RISA) that handles multiple interrupts with a clock gating technique; this technique attempts to reduce the power spent in establishing serial communication.

The research in35 presents a low power flip-flop based design with clock gating. The work in36 presents a software-controlled Adaptive Clock Gating (ACG) technique; by disabling the ACG of an IP block, both dynamic power and leakage power are controlled. The work in37 gives a general discussion of non-clock-gated versus clock-gated designs.

Clock gating in computing systems such as microprocessors must be distinguished from clock gating in the data paths of signal processing algorithms. Circuits implementing signal processing algorithms offer greater scope for clock gating based power optimization, which is presented in detail in the next section.

2.4 Proposed Clock Gated DFF - Circuit Level Implementation

There are different types of clock gating schemes, each with its merits and demerits. The methods for designing low power circuits with clock gating differ for combinational logic and sequential logic. In the present work, clock gating is performed with a novel no-information detection scheme, which is more effective in signal processing applications. The main advantage of this method is its low area overhead compared with a typical comparator-based clock gating scheme. In the present implementation each register is divided into subwords, each of which is independently controlled. The gate-level implementation of the enable generation circuit is given in Figure 5.

Figure 5. Clock gate control signal generation.

The enable output of the above circuit is the control input of the clock gating AND gate. The high-level schematic of the clock-gated register is shown in Figure 6. Gate-Diffusion Input (GDI) multiplexers and a master-slave flip-flop are the building blocks of this circuit design. The detailed functionality and simulation of this clock-gated circuit are given in2.


Figure 6. High level schematic for clock gated register.

3. Architecture of Clock Gated Subword Register

This section illustrates the high-level architecture of the clock-gated subword register. The clock gating scheme proposed with the subword register can be used as an additional optimization on top of circuit-level low power techniques. Hence the proposed technique can yield further optimization compared with existing circuit-level clock gating principles.

The principle for generating the clock gating signal in the proposed method differs from that used by other researchers. The clock gating signal is generated by comparing the sign bit with the magnitude bits. Consider a 2’s complement signal of n+1 bits, as shown in Figure 7. Figure 8 shows the subword representation of the signal, with p subwords of m (= n/p) bits each. Depending on the dynamic range of the signal, the presence of information in a particular subword is decided1. An enable for clock gating is generated for each subword. The signal X represented3 with n+1 bits can be described as in Equation (4). The subword-based representation of the same signal is given in Equation (5).

$X = \{\, b_n,\ b_{n-1},\ b_{n-2},\ \ldots,\ b_1,\ b_0 \,\}$ (4)

$X = \{\, b_n,\ \{b_{m-1}, \ldots, b_0\}_{p-1},\ \ldots,\ \{b_{m-1}, \ldots, b_0\}_{1},\ \{b_{m-1}, \ldots, b_0\}_{0} \,\}$ (5)

Figure 7. Signed number represented in 2’s complement representation.

Figure 8. Subword representation of 2s complement signal.
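The partition in Equations (4) and (5) can be illustrated with a short sketch that splits an (n+1)-bit 2's complement value into its sign bit and p subwords of m = n/p bits. The function name `split_subwords` and the example word size (n = 8, p = 2) are assumptions made for illustration.

```python
def split_subwords(value, n, p):
    """Split an (n+1)-bit 2's complement value as in Equation (5).

    Returns (sign_bit, subwords), where subwords is ordered from the most
    significant subword (index p-1) down to the least significant (index 0),
    each m = n / p bits wide.
    """
    if n % p != 0:
        raise ValueError("n must be divisible by p")
    m = n // p
    word = value & ((1 << (n + 1)) - 1)   # 2's complement encoding in n+1 bits
    sign = (word >> n) & 1                # b_n
    mask = (1 << m) - 1
    subwords = [(word >> (i * m)) & mask for i in range(p - 1, -1, -1)]
    return sign, subwords


print(split_subwords(5, n=8, p=2))    # small positive value
print(split_subwords(-3, n=8, p=2))   # small negative value
```

For small magnitudes the upper subword contains only copies of the sign bit (all zeros for +5, all ones for -3), which is the property the proposed scheme exploits.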

The NOI (No Information) logic of the ith subword (0 ≤ i ≤ p−1) is set to ‘1’ if all the bits {bm-1, ……… b0} of that subword are the same as the sign bit and if all the more significant jth subwords (i+1 ≤ j ≤ p−1) also have NOI logic ‘1’.

The logic in Figure 5 is an optimal circuit implementation of the clock gating signal. Figure 9 shows the four-bit subword register.

Figure 9. 4 bit register with clock gating.
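The NOI rule stated above can be expressed as a small software model. This is a sketch of the rule as worded in the text, not of the gate-level circuit in Figure 5; the function name `noi_flags`, the variable `higher_noi` and the example word size (n = 8, p = 2) are assumptions.

```python
def noi_flags(value, n, p):
    """Return the NOI flag of every subword of an (n+1)-bit 2's complement value.

    flags[i] is 1 when all bits of subword i equal the sign bit and every
    more significant subword also has NOI = 1. Index 0 is the least
    significant subword.
    """
    if n % p != 0:
        raise ValueError("n must be divisible by p")
    m = n // p
    word = value & ((1 << (n + 1)) - 1)
    sign = (word >> n) & 1
    mask = (1 << m) - 1
    sign_fill = mask if sign else 0       # subword made only of sign bits
    flags = [0] * p
    higher_noi = 1                        # nothing above the top subword
    for i in range(p - 1, -1, -1):        # walk from MSB subword to LSB subword
        subword = (word >> (i * m)) & mask
        flags[i] = 1 if (higher_noi and subword == sign_fill) else 0
        higher_noi = flags[i]
    return flags


print(noi_flags(5, n=8, p=2))     # upper subword carries no information
print(noi_flags(100, n=8, p=2))   # both subwords carry information
```

A subword whose flag is 1 holds only sign extension, so its register stage is a candidate for clock gating; the chaining through `higher_noi` ensures that a lower subword is never flagged while a higher one still carries information.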


The results for the four-bit subword register are presented in2. The inputs (d3, d2, d1, d0) are the input signals to the subword register. Sign is the MSB of the entire signal. The enable generated by the (i+1)th subword stage is referred to as En_MSSW. The outputs (q3, q2, q1, q0) are the outputs of the clock-gated subword register.

The results of 130 nm CMOS circuit simulation with SPICE tools are presented in2. A power saving of 21% is observed at each subword register stage with the proposed clock gating scheme when the input signal is in the no-information (NOI) condition with 50% probability.

4. Low Power FIR Filter with Proposed Clock Gating Scheme

Here an FIR filter architecture is used to demonstrate the proposed clock gating technique, because FIR filters are widely used in implementing many signal processing algorithms. Fourier transforms and correlations use MAC operations similar to those in an FIR filter. Hence the clock gating scheme is implemented for the FIR filter.

Figure 10(a) shows the implementation of the basic (direct form) FIR filter and Figure 10(b) shows the transposed form. The transposed FIR filter is suitable38 for high speed, and this implementation employs pipelining by incorporating registers in the adder path. The enable generation logic is optimized: one enable logic is used in place of N enable generation circuits, which reduces the overhead of enable generation in the proposed architecture. The entire design is coded in VHDL in a generic style so that the number of subwords and the number of bits in each subword can be changed. Detailed power analysis is given in the following sections.

Figure 10. FIR filter architecture (a) Direct form FIR (b) Transposed form FIR with pipelining.
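The relation between the two structures in Figure 10 can be checked with a small software model. The sketch below implements the textbook direct and transposed forms with integer arithmetic; it is not the authors' VHDL, it adds no extra pipeline latency, and the function names and the test coefficients are assumptions.

```python
def direct_form_fir(samples, coeffs):
    """Direct form: y[n] = sum_k h[k] * x[n - k], using a tapped delay line."""
    delay_line = [0] * len(coeffs)
    outputs = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]
        outputs.append(sum(h * d for h, d in zip(coeffs, delay_line)))
    return outputs


def transposed_form_fir(samples, coeffs):
    """Transposed form: every tap multiplies the current input, and the
    registers sit in the adder chain between taps."""
    taps = len(coeffs)
    regs = [0] * (taps - 1)               # registers in the adder path
    outputs = []
    for x in samples:
        products = [h * x for h in coeffs]
        outputs.append(products[0] + (regs[0] if regs else 0))
        # Each register captures its tap product plus the next register's value.
        regs = [
            products[k + 1] + (regs[k + 1] if k + 1 < len(regs) else 0)
            for k in range(taps - 1)
        ]
    return outputs


x = [1, 2, 3, 4, 5]
h = [1, 2, 3]
print(direct_form_fir(x, h))
print(transposed_form_fir(x, h))
```

Both functions produce the same output sequence. In the transposed form every adder is separated from the next by a register, which is why that structure lends itself to pipelining and to register-level clock gating.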

5. Low Power Correlator with Proposed Clock Gating Scheme

To illustrate the low power technique on a complete DSP algorithm, a correlator is implemented with the subword register scheme3. The proposed method falls under the category of serial-parallel architectures. FPGAs, ASICs and DSPs provide high speed MAC39 blocks for very high speed implementation of filtering. To take advantage of these high speed data paths, the proposed architecture uses the transposed FIR filter as its basic block. Equation (6) gives the correlation function computed by the correlator. Figure 11 shows the architecture of the correlator implemented using the transposed FIR architecture.

(6)

Figure 11. Correlator architecture. (Block and signal labels in the figure: X (2D-array), Y-samples, Transposed FIR, Pipe line delay, FSM Controller, Correlation index Counter, Corr-start, Corr-ver-val, Corr-vec-addr, Corr-out-valid.)

Transposed FIR structures provide pipeline stages between adder stages, which enables high speed designs. Hence the delay of the transposed FIR architecture is N samples.

The enable signal correlation-start is used to start computing the correlation function. The first samples of X and Y are applied on the rising edge of the correlation-start signal. The pipeline delay block delays the correlation-start signal to match the delay of the transposed FIR filter and produces the correlation-out-valid signal. The correlation-out-valid signal stays high for 2M+1 samples, during which the correlation vector address, in the range –M to +M, and the corresponding correlation value are generated.

All of this control is achieved through a state machine controller, which performs the correlation and generates the necessary control signals for the data path; its details are given in3.
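The 2M+1 correlation values indexed from –M to +M can be illustrated in software. Since the correlation equation is not legible in this copy, the sketch below assumes the standard cross-correlation sum R[k] = Σ x[n]·y[n+k]; the function name, the test sequence and the choice M = 15 are hypothetical.

```python
def cross_correlation(x, y, max_lag):
    """Cross-correlation R[k] = sum_n x[n] * y[n + k] for k in [-M, +M].

    Samples outside the range of y are treated as zero. The result maps each
    correlation index k to its value, giving 2M + 1 entries in total.
    """
    result = {}
    for k in range(-max_lag, max_lag + 1):
        acc = 0
        for n, xn in enumerate(x):
            idx = n + k
            if 0 <= idx < len(y):
                acc += xn * y[idx]
        result[k] = acc
    return result


# Hypothetical test signals: y is x delayed by 10 samples.
x = [3, -1, 4, 1, -5, 9, 2, -6] + [0] * 24
y = [0] * 10 + x[:22]

corr = cross_correlation(x, y, max_lag=15)
peak_lag = max(corr, key=corr.get)
print(peak_lag, corr[peak_lag])
```

With y equal to x delayed by 10 samples, the maximum of the correlation falls at index 10, which mirrors the functional check described for the hardware correlator.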

6. Simulation Results

6.1 Functional Validation of Clock Gated FIR Architecture

The subword clock gating enabled register is verified for correct functionality using the VHDL simulation tool ModelSim. Figure 12 shows the simulation results. The en signal can be observed going to zero under low-amplitude conditions.

Figure 12. Simulation results for register with subword based clock gating.

Following the register-level verification of clock gating, and to ensure that the basic functionality is not disturbed, the top-level FIR module is simulated with VHDL test benches. The test bench consists of a sine wave source plus additive white Gaussian noise. Figure 13 shows the simulation results. The results show that the filter removes the noise and passes the sine wave component. This demonstrates that subword partitioning for clock gating does not affect the main functionality.

Figure 13. FIR filter simulation results.

6.2 Simulation of Clock Gated Correlator Architecture

The correlator is also verified for correct functionality using VHDL simulation tools. The test bench feeds two signals with a given delay of 10 cycles between them, as shown in Figure 14. The correlation function (Y) produces its maximum at an index value of 10, validating the functionality.

Figure 14. Simulation results for subword clock gated correlator for delay value 10.

7. Synthesis and Power Analysis

The previous section presented the functional verification and validation of the proposed clock gating based subword architectures. This section presents the FPGA synthesis of these architectures and the power analysis carried out at both circuit level and FPGA level.

7.1 Power Analysis of Subword Register at Circuit Level - SPICE Simulation

Table 1 shows the power saving at different percentages of NOI input combinations2. NOI probability conditions from 0 to 50% are shown for both positive and negative number combinations. Averaging the two cases at 50% NOI probability gives a power saving of 21%. Note that this is the power saving in the four-bit subword register stage only.

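Table 1. Power analysis

| En_MSSW | Sign | d3d2d1d0 | NOI (% of sim time) | Power (µW) | Power saving (%) |
|---|---|---|---|---|---|
| 0 | 0 | 0000 | 0 | 7.6678 | 0 |
| 0 | 1 | 1111 | 0 | 8.2305 | 0 |
| 0 | 0 | 0000 | 10 | 7.1352 | 7 |
| 0 | 1 | 1111 | 10 | 7.9601 | 3 |
| 0 | 0 | 0000 | 20 | 6.6732 | 13 |
| 0 | 1 | 1111 | 20 | 7.7083 | 6 |
| 0 | 0 | 0000 | 30 | 6.3007 | 18 |
| 0 | 1 | 1111 | 30 | 7.5991 | 7 |
| 0 | 0 | 0000 | 40 | 5.9276 | 22 |
| 0 | 1 | 1111 | 40 | 7.3222 | 11 |
| 0 | 0 | 0000 | 50 | 5.5224 | 28 |
| 0 | 1 | 1111 | 50 | 7.0674 | 14 |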

In signal processing applications with low amplitude and narrow bandwidth, the most significant subwords will mostly be in the NOI condition. Hence the 50% assumption is reasonable for the envisaged usage of the proposed clock gating scheme.
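The 21% figure can be re-derived from the power column of Table 1. The sketch below takes the positive-number and negative-number power values from the table, uses the 0% NOI row of each case as the baseline, and averages the two savings; the function and variable names are assumptions.

```python
# Power in microwatts from Table 1, keyed by NOI percentage of simulation time.
POWER_POSITIVE = {0: 7.6678, 10: 7.1352, 20: 6.6732, 30: 6.3007, 40: 5.9276, 50: 5.5224}
POWER_NEGATIVE = {0: 8.2305, 10: 7.9601, 20: 7.7083, 30: 7.5991, 40: 7.3222, 50: 7.0674}


def saving_percent(baseline, gated):
    """Relative power saving in percent with respect to the baseline power."""
    return (baseline - gated) * 100.0 / baseline


def average_saving(noi_percent):
    """Average of the positive-number and negative-number savings."""
    positive = saving_percent(POWER_POSITIVE[0], POWER_POSITIVE[noi_percent])
    negative = saving_percent(POWER_NEGATIVE[0], POWER_NEGATIVE[noi_percent])
    return (positive + negative) / 2.0


for noi in sorted(POWER_POSITIVE):
    print(noi, round(average_saving(noi), 2))
```

At 50% NOI the two cases give savings of roughly 28% and 14%, whose mean is about 21%, in line with the value quoted in the text.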

7.2 Power Analysis with FPGA Tools

The Xilinx XPower tool is used for power analysis. The post Place and Route (P&R) netlist is timing-simulated, and the resulting waveform data are exported to XPower to generate the power analysis.

Since I/O power is not part of the clock gating scheme, it is neglected in the power analysis. Only the clock, logic, signal and DSP power are summed to obtain the total dynamic power. Figures 15 and 16 show the power analysis for the basic FIR filter architecture without and with clock gating, respectively. Since the basic FIR is not a pipelined design, clock gating is not effective.

Figures 17 and 18 show the XPower analysis for the transposed FIR filter architecture without and with clock gating, respectively. Since the transposed form is pipelined, the advantage of clock gating can be seen.

Figures 19 and 20 show the XPower analysis for the correlator architecture without and with clock gating, respectively.

The results for the above three modules are summarized in Table 2. The power saving is computed using the equation below.

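$\text{Power Saving (\%)} = \dfrac{P_{\text{without CG}} - P_{\text{with CG}}}{P_{\text{without CG}}} \times 100$ (6)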

Figure 15. XPower analysis of basic FIR filter without clock gating.

Figure 16. XPower analysis of basic FIR filter with clock gating.

Figure 17. XPower analysis of transposed FIR filter without clock gating.

Figure 18. XPower analysis of transposed FIR filter with clock gating.

Figure 19. XPower analysis of correlator without clock gating.

Figure 20. XPower analysis of correlator with clock gating.

Table 2. Power saving comparison

| S. No. | Application type | Dynamic power without clock gating | Dynamic power with clock gating | Power saving (%) |
|---|---|---|---|---|
| 1 | Basic FIR filter | 10 mW | 11 mW | No power saving |
| 2 | Transposed FIR filter | 21 mW | 11 mW | 47.61 |
| 3 | Correlator | 139 mW | 92 mW | 34.53 |

Even though the basic FIR architecture consumes less power than the transposed FIR, it cannot be used because of its non-pipelined38 design. In VLSI implementations, only the transposed FIR or its variants are preferred. Hence the power saving of the clock-gated transposed FIR is considered here. Based on the XPower results, the dynamic power of the transposed FIR filter is 21 mW without clock gating and 11 mW with clock gating, so the power saving is (21 − 11) × 100 / 21 ≈ 47%.

For the correlator, substituting 139 mW and 92 mW for the designs without and with clock gating, the power saving is (0.139 − 0.092) × 100 / 0.139 ≈ 34%.
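The same calculation can be written as a small helper and applied to the FIR entries of Table 2. This is an illustrative sketch; the function name `power_saving_percent` is an assumption, and the inputs are the rounded dynamic power values from the table.

```python
def power_saving_percent(p_without_cg, p_with_cg):
    """Power saving in percent: (P_without - P_with) * 100 / P_without.

    Both arguments must be in the same unit (for example mW). A negative
    result means the gated design consumes more power than the ungated one.
    """
    return (p_without_cg - p_with_cg) * 100.0 / p_without_cg


transposed_fir_saving = power_saving_percent(21, 11)   # mW, from Table 2
basic_fir_saving = power_saving_percent(10, 11)        # mW, from Table 2
print(round(transposed_fir_saving, 2), round(basic_fir_saving, 2))
```

The negative value for the basic FIR reflects the "No power saving" entry in Table 2: in the non-pipelined design the gating logic adds power without removing any.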

8. Conclusion

This research work develops a new clock gating scheme based on range matching of the signal value. The 2’s complement signal is divided into subwords, and each subword is gated with a separate control signal. The area-efficient enable generation logic is implemented at gate level using the relationship between adjacent subwords. Both circuit-level SPICE simulations and RTL VHDL simulations are carried out to verify the correctness of the proposed scheme. Circuit-level power computation tools are used to demonstrate the power reduction for a 4-bit subword in 130 nm ASIC technology. The FPGA tool XPower is used to demonstrate the developed scheme for FIR filters and the correlator. The work demonstrates the novel principle of a subword-based clock gating scheme in comparison with other clock gating methods43 and its application to DSP architectures. The results of the present research are compared with the work of other researchers and summarized in Table 3.

Table 3. Comparison with other clock gating methods

| Ref | Principle of power optimization | % of power saving in ref work | Comparison with present work |
|---|---|---|---|
| [5] | Look-Ahead Clock Gating (LACG): computes the clock enable signals of each FF one cycle ahead of time. | 22.6% clock power saving | The present work uses a similar approach and additionally uses the subword approach for further power saving. |
| [40] | Local Explicit Clock Gating (LECG), Enhanced Clock Gating (ECG), Waste-Toggle-Rate-based (WTR) clock gating and the Single Comparator-based Clock Gating (SCCG) techniques. | 17.67% maximum power saving | By considering the optimization possible in signal processing applications, the subword scheme proves more efficient. |
| [41] | Huddle-based Distributed Register (HDR) | 14.9% maximum power saving | Hence higher power optimization is also demonstrated. |
| [42] | Waste Toggling Rate (WSR) | 35.84% maximum power saving (more suitable for a processor's ALU and highly pipelined datapaths) | This subword-based technique allows detecting the waste toggling rate even at subword level. The FIR filter with subword clock gating has 47% power saving. |


9. References

1. Ranganayakulu A, Satya Prasad K. Subword partition based data driven clock gating scheme for low power VLSI design. International Journal of Computer Applications. 2014 Dec; 108(14):1-6.

2. Ranganayakulu A, Prasad KS. Sub word Partitioning and signal value based clock gating scheme for low power VLSI applications. International Journal of Current Engineering and Technology. 2015 Jun; 5(3):1-9.

3. Ranganayakulu A, Prasad KS. Low power correlator using signal range and sub word based clock gating scheme. International Journal of Hybrid Information Technology. 2016; 9(3):159-70.

4. Oklobdzija VG. Digital System Clocking – High-Performance and Low-Power Aspects. New York, NY, USA: Wiley; 2003.

5. Wimer S, Albahari A. A look-ahead clock gating based on auto-gated flip-flops. IEEE Transactions on Circuits and Systems I: Regular Papers. 2014 May; 61(5):1465-72.

6. Brooks D, Bose P, Schuster S, Jacobson H, Kudva P, Buyuktosunoglu A, Zyuban V, Gupta M, Cook P. Power-aware microarchitecture: Design and modeling challenges for next-generation microprocessors. IEEE Micro. 2010 Nov-Dec; 20(6).

7. Shen W, Cai Y, Hong X, Hu J. Activity and register placement aware gated clock network design. Proceedings of ISPD; 2008. p. 182–9.

8. Design notes Synopsys Design Compiler. 2013. Available from: www.synopsys.com

9. FPGA-based Prototyping Methodology Manual. Synopsys, Inc.; 2011.

10. Pedram M, Abdollahi A. Low-power RT-level synthesis techniques: A tutorial. IEE Proceedings on Computers and Digital Techniques; 2005 May 6; 152(3):333-43.

11. Li L, Choi K. Selective clock gating by using wasting toggle rate. IEEE International Conference on Electro/Information Technology; Windsor, ON. 2009. p. 399-404.

12. Lang T, Musoll E, Cortadella J. Individual flip-flops with gated clocks for low power datapaths. IEEE TCAS-II: Analog and Digital Signal Processing. 1997; 44(6):507-16.

13. Bonanno A, Bocca A, Macii A, Macii E, Poncino M. Data driven clock gating for digital filters. Integrated Circuit and System Design Power and Timing Modelling, Optimization and Simulation. 2003; 96-105.

14. Li L, Choi K. Activity-driven optimized bus-specific-clock-gating for ultra-low-power smart space applications. IET Commun. 2011; 5(17):2501–8.

15. Wang W, Tsao YC, Choi K, Park S, Chung MK. Pipeline power reduction through single comparator-based clock gating. International SoC Design Conference (ISOCC); 2009. p. 480-3.

16. Sahni K, Rawat K, Pandey S, Ahmad Z. Power optimization of communication system using clock gating technique. 5th International Conference on Advanced Computing and Communication Technologies (ACCT); 2015 Feb 21-22. p. 375-8.

17. Kavali K, Rajendar S, Naresh R. Design of low power adap- tive pulse triggered flip-flop using modified clock gating scheme at 90 nm Technology.

18. Gupta N. Clock power analysis of low power clock gated arithmetic logic unit on different FPGA. International Con- ference on Computational Intelligence and Communica- tion Networks (CICN); 2014 Nov 14-16. p. 913-6.

19. Anand N, Joseph G, Oommen SS. Performance analysis and implementation of clock gating techniques for low power applications. International Conference on Science Engineering and Management Research (ICSEMR); 2014 Nov 27-29. p. 1-4.

20. Warrier R, Vun CH, Zhang W. A low-power pipelined MAC architecture using Baugh-Wooley based multiplier. IEEE 3rd Global Conference on Consumer Electronics (GCCE);

2014 Oct 7-10. p. 505-6.

21. Wimer S, Koren I. Design flow for flip-flop grouping in data-driven clock gating. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2014 Apr; 22(4):771-8.

22. Park YS, TaoY, Zhang Z. A fully parallel non-binary LDPC decoder with fine-grained dynamic clock gating. IEEE Journal of Solid-State Circuits. 2015 Feb; 50(2):464-75.

23. Bezati E, Brunet SC, Mattavelli M, Janneck JW. Coarse grain clock gating of streaming applications in program- mable logic implementations. Proceedings of the Electron- ic System Level Synthesis Conference (ESLsyn); CA. 2014 May 31-June 1. p. 1-6.

24. Murlidharan V, Vignesh M, Varatharaj M. Clock gating based on auto-gated flip-flops. IEEE Transactions on Circuits and Systems I: Regular Papers. 2014; 61(5):1465-72.

25. Kulkarni R, Kulkarni SY. Implementation of clock gating technique and performing power analysis for processor engine (ALU) in network processors. International Conference on Electronics and Communication Systems (ICECS); 2014 Feb 13-14. p. 1-5.

26. Savoj H, Berthelot D, Mishchenko A, Brayton R. Combinational techniques for sequential equivalence checking. Formal Methods in Computer-Aided Design (FMCAD); 2010 Oct 20-23. p. 145-9.

27. Chaudhary H, Goyal N, Sah N. Dynamic power reduction using clock gating: A review. IJECT. 2015; 6(1):1-5.

28. Prakash NR, Akash. Clock gating for dynamic power reduction in synchronous circuits. 2013; 4(5):1-4.

29. Kaushik PG, Gulhane SM, Khan AR. Dynamic power reduction of digital circuits by clock gating. International Journal of Advancements in Technology. 2013 Mar; 4(1).

30. Koteswara Rao D, Pale TR. Low power register design with integration clock gating and power gating. International Journal of Application or Innovation in Engineering and Management. 2014; 3(10):1-6.

31. Emnett F, Biegel M. Power Reduction through RTL Clock Gating. San Jose: SNUG; 2000. p. 1-11.

32. Kathuria J, Ayoubkhan M, Noor A. A review of clock gating techniques. MIT International Journal of Electronics and Communication Engineering. 2011; 1(2):1-9.

33. Hariharan K, Jaya Kumar C. Clock gating for low power circuit design by merge and split methods. IOSR Journal of Engineering. 2012; 2(4):1-5.

34. Kamaraju M, Chinavenkateswararao G. Low power reduced instruction set architecture using clock gating technique. International Journal of VLSI Design and Communication Systems (VLSICS). 2013; 4(5):1-17.

35. Strollo AGM, Napoli E, De Caro D. New clock-gating techniques for low-power flip-flops. Proceedings of International Symposium on Low Power Electronics and Design (ISLPED); 2000. p. 114-9.

36. Chang X, Zhang M, Zhang G, Zhang Z, Wang J. Adaptive clock gating technique for low power IP core in SoC design. IEEE International Symposium on Circuits and Systems; 2007. p. 2120-3.

37. Kshirsagar S, Mali MB. A review of clock gating techniques in low power applications. International Journal of Innovative Research in Science, Engineering and Technology. 2015; 4(6):1-5.

38. Aksoy L, Lazzari C, Costa E, Flores P, Monteiro J. Design of digit-serial FIR filters: Algorithms, architectures and a CAD tool. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2013 Mar; 21(3):498-51.

39. Warrier R, Vun CH, Zhang W. A low-power pipelined MAC architecture using Baugh-Wooley based multiplier. IEEE 3rd Global Conference on Consumer Electronics (GCCE); 2014 Oct 7-10. p. 505-6.

40. Zhang Y, et al. Automatic register transfer level CAD tool design for advanced clock gating and low power schemes. International SoC Design Conference (ISOCC); Jeju Island. 2012. p. 21-4.

41. Akasaka H, Yanagisawa M, Togawa N. Energy-efficient high-level synthesis for HDR architectures with clock gating. International SoC Design Conference (ISOCC); Jeju Island. 2012. p. 135-8.

42. Li L, Choi K. Selective clock gating by using wasting toggle rate. IEEE International Conference on Electro/Information Technology (EIT’09); 2009. p. 399-404.

43. Ravi S, Sinha S, Adithyan R, Kittur HM. Design and analysis of clock gating elements. Indian Journal of Science and Technology. 2016 Feb; 9(5). DOI: 10.17485/ijst/2016/v9i5/87184
