Figure 3.2: Simulated unwanted glitch at different logic depths in a chain of inverters with FTL implementation.
between M1 and the nMOS PDN because all inputs to nMOS transistors are reset to logic “0” during the reset period. The first generation of FTL exhibits many shortcomings, including excessive power dissipation and reduced noise margin. To mitigate these problems, we propose a new high-performance logic style that we call constant-delay (CD) logic. CD logic provides a local window technique and a self-reset circuit that enable robust logic operation with minimized power consumption while maintaining FTL’s speed advantage. The characteristic that most distinguishes CD logic from previously proposed logic styles is that its delay is, to a first-order approximation, not affected by the logic expression. Unlike SCL, CD logic does not require complementary signals and can be easily integrated with static and dynamic domino logic. Also, unlike pseudo-nMOS, CD logic does not suffer from constant static power dissipation. Furthermore, the clock timing requirement of CD logic is not as stringent as that of OPL. CD logic can achieve robust operation with optimal performance as long as the CLK signal arrives earlier than the input signals. This thesis will demonstrate that CD logic 1) has the potential to outperform other logic styles with better energy efficiency, making it particularly suitable for high-performance digital blocks; and 2) is robust under extreme process, voltage, and temperature (PVT) variations.


Domino logic is attractive for high-speed circuits: it is 1.5 to 2x faster than static CMOS and is therefore widely used in high-performance microprocessors. It poses several challenges, however: monotonicity, leakage, charge sharing, and noise. In Fig. 2, X is a dynamic node that holds its value as charge on the node, and sub-threshold leakage may eventually disturb that charge. A keeper circuit is used to counter the charge leakage, and a foot transistor is used to increase the performance of the circuit.

of these test vectors on the chip for BIST is not possible. In such conditions, DFT modifications are required to effectively reduce the size of the test vector set. Such modifications degrade the performance of embedded array multipliers. Applying highly regular test patterns, which are efficiently generated on the chip without any DFT modifications, to embedded array multipliers yields an efficient BIST scheme with very high fault coverage. BIST design methods for iterative logic arrays (ILAs) have been proposed in [18, 19]. A customized Test Pattern Generator (TPG) needs to be designed for the method suggested in [20]. Again, this is applicable only to one-dimensional ILAs and requires further modifications for two-dimensional ILAs, which increases hardware complexity. A test methodology for carry-propagate array multipliers is studied and explained in [18]. Test pattern generation and DFT for iterative logic arrays have also been proposed based on graph labeling techniques. Tree multipliers are a better choice for performance-oriented designs. The testability of tree multipliers is on par with that of array multipliers, as both admit the same BIST methodology [17]. Tree-like parallel multipliers are considered better than array multipliers but are generally avoided because of the lower regularity of their structure [10]. A linear programming approach to test pattern generation is proposed in [20], in which the best possible test vectors are selected from the n-detection test set, which is derived from the weighted defect part level estimation model. Output deviations are used to rank patterns in [21].

The Parallel Asynchronous Self-Timed Adder (PASTA) performs multi-bit binary addition based on a recursive formulation. PASTA does not need any carry chain propagation. The design includes a completion detection unit and attains logarithmic performance without any speedup circuitry or look-ahead scheme. The PASTA implementation does not suffer from high fan-out limitations. Asynchronous logic requires high fan-in gates, which are realized by connecting transistors in parallel. The simulations were performed in Cadence using a 180 nm technology.
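
The recursive formulation mentioned above can be illustrated in software. The following is a minimal behavioral sketch, not the PASTA circuit itself: it assumes the usual half-adder recursion in which the sum and carry words are recombined until no carry remains. The function name `recursive_add` and the returned iteration count are illustrative choices, not taken from the source.

```python
def recursive_add(a, b):
    """Add two non-negative integers by repeated half-adder steps.

    Each pass updates every bit position in parallel:
      sum   <- sum XOR carry
      carry <- (sum AND carry) shifted left by one
    The loop stops when no carry is left, which plays the role of
    completion detection in this software model.
    """
    s = a ^ b              # initial bitwise sum without carries
    c = (a & b) << 1       # initial carries, moved to the next bit
    iterations = 0
    while c:
        s, c = s ^ c, (s & c) << 1
        iterations += 1
    return s, iterations


if __name__ == "__main__":
    total, steps = recursive_add(5, 3)
    print(total, steps)    # prints "8 3"
```

The number of passes depends on the operands (it follows the longest carry run), which is why a self-timed design can finish early on most inputs.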

Current sequential circuit designs do not focus on maintaining the state of the flip-flop when moving to sleep mode. However, it is important for flip-flops to maintain their state while in sleep mode and to retrieve it after leaving the sleep state. This requires special data retention circuitry, which should neither increase leakage during sleep mode nor degrade performance in active mode. It is also essential that it reuse the circuitry and control signals of the existing design for storing the data [2]. This keeps the circuit simple, as no additional circuitry or control signals are required, and reduces the extra capacitive load on the critical path, thereby making the circuit faster.

Neural networks, also known as artificial neural networks (ANNs), are information processing systems whose design is inspired by studies of the human brain's ability to learn from observations and to generalize by abstraction [2]. Because neural networks can be trained to learn arbitrary nonlinear input/output relationships from corresponding data, they are used in a number of areas such as pattern recognition, speech processing [2][4], control, biomedical engineering, and RF and microwave design. Recently, ANNs have been applied to CMOS analog circuit design and optimization problems as well. Neural networks are first trained to model the electrical behavior of passive and active components and circuits [2][7]. These trained neural networks, often referred to as neural-network models, can then be used in high-level simulation and design, providing fast and accurate answers to the task they have learnt. Neural networks are effective and efficient alternatives to conventional methods such as numerical modeling methods, which can be highly computationally expensive, analytical methods, which can be difficult to obtain for newly developed devices, or empirical modeling solutions, whose range and accuracy can be limited [2][6]. Neural network

However, the chief disadvantage of the domino dynamic logic circuit is its excessive power dissipation, owing to switching activity and clock load. To offset the excessive power dissipation of dynamic logic, current design methodologies trade power for performance in the delay-critical sections of the circuit [13]. This can be achieved through a combination of dynamic and static circuit designs and the use of dual supply voltages and dual-Vt transistors. Domino and dynamic logic suffer chiefly from charge sharing and race problems [14]. To resolve these drawbacks, we propose a new MT-CMOS dynamic logic circuit that uses multi-threshold MOS transistors to reduce power dissipation and FTL logic to increase the operating speed of arithmetic circuits [15]. The proposed circuit addresses these issues and minimizes wasted power. In this paper, we propose a low-power, high-performance Ripple Carry Adder (RCA) logic structure using new MT-CMOS domino logic blocks [16].

Abstract: IC development is nowadays a huge industry. There is an almost infinite number of consumer products, such as mobile phones, processors, televisions, cameras, refrigerators, ovens and cars, that in one way or another use custom IC components. Integrated circuits can provide anything from analog-to-digital conversion to digital filtering and much more. A digital integrated circuit can be manufactured with a number of different approaches, but they all contain the same basic steps. It all starts with transistors, wiring and everything else that makes up the circuit being placed in a layout, designed in a CAD (Computer Aided Design) tool, and ends with that layout being physically created on a chip. The way these layouts are created differs depending on design requirements. A standard cell library contains a collection of components that are standardized at the logic or functional level, and consists of cells or macro-cells based on a unique layout. The economic and efficient accomplishment of an IC design depends heavily upon the choice of the library. Therefore, it is important to build a library that fulfills the design requirements. A library of logic cells is the set of building blocks for the ASIC design flow. The library is typically called a standard cell library because of its common interface implementation and regular structure. The library provides the functional building blocks used for synthesis and a layout representation of the cells for place-and-route. It is very important to note that the process of HDL synthesis limits the choice of logic cells to those found in the library provided. This guarantees that a physical or layout representation of the cells exists when the design is implemented using place-and-route tools. One way to understand the required layout characteristics of standard cells is to understand their history and the reasons behind their development.

Nowadays, the demand for low-power VLSI with improved performance and efficiency calls for more effective design at the layout, logic, technology, and architecture levels [4]. When designing combinational logic circuits, one has to aim for low power consumption, high performance, small area, high operating speed, and above all high efficiency. The parameters affecting these goals include short-circuit and leakage currents, transition activity, and switching capacitance. Choosing a proper circuit design and an efficient implementation method therefore becomes vital, all the more so because these factors restrict the formulation of universal rules for optimal logic styles. Investigations mainly consider full-adder, full-subtractor, and multiplier circuits used in

tightly related to the circuit working temperature. Hence, low power consumption is a zero-order constraint for most ICs manufactured today. In fact, higher performance-per-watt is the new mantra of microprocessor chip manufacturers. In order to achieve high density and high performance, CMOS feature size and threshold voltage have been scaled down for decades. Because of this trend, transistor leakage power has increased exponentially. The reduction of the supply voltage is dictated by the need to keep the electric field constant across the ever-shrinking gate oxide. Unfortunately, to keep transistor speed (proportional to the transistor “on” current) acceptable, the threshold voltage must be reduced too, which results in an exponential increase of the “off” transistor current, i.e. the current constantly flowing through the transistor even when it should be “non-conducting”.
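
The exponential dependence of the “off” current on threshold voltage can be made concrete with a first-order subthreshold model. The sketch below is illustrative only: the model I_off = I0 · 10^(−Vth/S), the reference current `i0`, and the 100 mV/decade subthreshold swing are assumptions chosen for the example, not values from the text.

```python
def off_current(vth, i0=1e-7, swing=0.1):
    """First-order subthreshold leakage estimate in amperes.

    vth   : threshold voltage in volts
    i0    : assumed drain current at Vgs = Vth (hypothetical value)
    swing : assumed subthreshold swing in volts per decade
    """
    return i0 * 10 ** (-vth / swing)


if __name__ == "__main__":
    # With a 100 mV/decade swing, every 100 mV cut in Vth
    # multiplies the off current by roughly ten.
    for vth in (0.5, 0.4, 0.3, 0.2):
        print(f"Vth = {vth:.1f} V -> I_off = {off_current(vth):.2e} A")
```

Under these assumptions, lowering the threshold voltage from 0.4 V to 0.3 V raises the leakage by about one order of magnitude.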

The FFT processor consists of flip-flop, MUX, SRAM, and radix-2 blocks. Here, the flip-flop block is replaced by the modified DOMS-FF, which makes the computation much faster than in the existing system. As a result, the FFT processor can be used for high-speed, low-power applications. Twiddle factors are fetched and stored in SRAM, and the stored data is used for subsequent iterations.

The second adder uses complementary pass transistor logic (CPL) and has 32 transistors with swing restoration. Most CPL gates incur interconnection complexity at the layout level, with an accompanying increase in power and delay. For low-power applications, Pass Transistor Logic (PTL) is the most suitable technique, and an explanation was given in . The advantage of PTL is that either PMOS or NMOS alone is enough to implement a complete design [6]-[8], so the number of transistors decreases and input loads are smaller, especially for NMOS networks. PTL also eliminates short-circuit energy dissipation.

In addition to the area requirement, the achieved error rate performance is an important metric of PHY layer ASICs. Therefore, in this section we compare the error rate performance of the proposed PHY layer implementations. In particular, we compare the error rate performance of the soft-output MMSE detection, the soft-output STS SD with two and with five cores, and the LRALD based PHY layer ASICs. The error rate of the hard-output MMSE detection based PHY layer ASIC is of less interest, as the areas of the soft-output and the hard-output MMSE detector based PHY layer ASICs are similar and the error rate of the soft-output MMSE detector based implementation will always be superior to that of the hard-output MMSE detector based PHY layer. Furthermore, the error rate performance of the PHY layer ASIC with ten sphere cores is disregarded, as the area of that implementation is far larger than that of the other implementations, and the error rate performance improvement of five sphere cores over two sphere cores is only significant for high modulation orders and a large number of spatial streams. Hence, for transmissions with fewer than four streams, the PHY layer ASIC with ten sphere cores does not provide any error rate performance gain compared to the other STS SD based PHY layer implementations. However, the implementation with ten sphere cores may improve the error rate performance for transmissions with four streams, 64-QAM modulation and a code rate of 5/6, compared to implementations STS1 and STS2. The error rate performance was evaluated with Monte-Carlo simulations by transmitting frames with a 1000-byte payload using the HT-MF frame format in the 40 MHz (108 OFDM tones) transmission mode over a TGnD channel model. A regular GI of 0.8 µs is used.
Independent of the number of streams for the transmission, the transmitter always uses four transmit antennas and spatial expansion to radiate the signal. Similarly, the receiver always uses four receive chains to receive the transmission. The modulation, the number of streams, and the code rate are altered according to the MCSs used for evaluation. The code itself is always generated with a convolutional code with constraint length seven and the generator polynomial [133_o 171_o]
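
To make the code parameters concrete, here is a minimal sketch of a rate-1/2 convolutional encoder with constraint length seven and octal generators 133 and 171. The bit ordering (newest input bit in the most significant register position) and the function name `conv_encode` are assumptions of this example, and puncturing to higher rates such as 5/6 is not modeled.

```python
def conv_encode(bits, generators=(0o133, 0o171), constraint_length=7):
    """Rate-1/len(generators) convolutional encoder.

    The shift register holds the current input bit and the previous
    constraint_length - 1 bits. Each output bit is the parity of the
    register contents masked by one generator polynomial.
    """
    state = 0
    out = []
    for b in bits:
        # Shift the register and place the new bit at the top position.
        state = (state >> 1) | (b << (constraint_length - 1))
        for g in generators:
            out.append(bin(state & g).count("1") % 2)
    return out


if __name__ == "__main__":
    # Feeding a single 1 followed by zeros reads out the generator taps.
    coded = conv_encode([1, 0, 0, 0, 0, 0, 0])
    print(coded[0::2])  # first output stream:  [1, 0, 1, 1, 0, 1, 1]
    print(coded[1::2])  # second output stream: [1, 1, 1, 1, 0, 0, 1]
```

The impulse response reproduces the binary expansions of 133 and 171 (octal), which is a quick sanity check on the tap connections.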


1. Introduction
The rapidly increasing complexity of VLSI systems has made it necessary to pay ever more attention to design issues that affect energy consumption. One of the original motivations for CMOS technology was its low energy consumption, and today, there are still no alternatives that approach it in energy efficiency. Nevertheless, energy consumption is more and more often the factor that limits the performance of contemporary CMOS systems.

In this paper, the author presents novel architectures and designs of high-speed, low-power 3-2, 4-2 and 5-2 compressors capable of operating at ultra-low voltages. The power consumption, delay and area of these new compressor architectures are compared with those of existing and recently proposed compressor architectures and are shown to be better. The proposed architecture emphasizes the use of multiplexers in arithmetic circuits, which results in a high-speed and efficient design. In the proposed architecture these outputs are utilized more efficiently than in existing designs to improve the performance of the compressors. In the proposed novel architectures of 3-2, 4-2 and 5-2 compressors, the author replaces some XOR blocks with MUX blocks. Because the select bits are available before the input bits arrive, the transistors have already finished switching by then, so the overall delay in the critical path is reduced when MUX blocks are used. The paper concludes by comparing the proposed 3-2, 4-2 and 5-2 compressor architectures with the conventional architectures.
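
The idea of substituting MUX blocks for XOR blocks can be sketched at the logic level for the simplest case, a 3-2 compressor (a full adder). This is a generic textbook-style decomposition used for illustration, not the author's circuit. The helper names `mux` and `compressor_3_2` are hypothetical.

```python
def mux(sel, d0, d1):
    """2-to-1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def compressor_3_2(a, b, c):
    """3-2 compressor built from one XOR and two multiplexers.

    p     = a XOR b
    sum   = p XOR c, realized as a MUX that picks p or NOT p using c
    carry = c when p is 1, otherwise a
    """
    p = a ^ b
    s = mux(c, p, 1 - p)
    carry = mux(p, a, c)
    return s, carry


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                s, carry = compressor_3_2(a, b, c)
                print(a, b, c, "->", "sum", s, "carry", carry)
```

For every input combination, `2 * carry + s` equals `a + b + c`, so the MUX-based form is functionally identical to the XOR/AND/OR form.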

consequently time and cost of development. The automatic regeneration of logic levels provides high noise immunity and low sensitivity to variations in transistor characteristics. This yields accurate and reliable data and signal processing.
Digital fuzzy processors are generally designed for multipurpose applications in order to appeal to as many potential customers as possible. They should thus implement a large and varied set of fuzzy operators, membership functions and inference rules. This makes them rather efficient for a wide range of applications, provided that appropriate programming is possible (which presupposes appropriate internal or external memory). On the other hand, digital realization has some disadvantages. Complex representations of fuzzy vectors and parallel structures are required to obtain accurate and fast processing. Unfortunately, digital implementations of common fuzzy operations rapidly lead to complicated, enormous VLSI circuits.


In order to reduce dynamic power, an alternative to the traditional power reduction techniques, named adiabatic switching, has been proposed in recent years. In this approach, the node capacitances are charged and discharged in such a way that only a small amount of energy is wasted and the energy stored on the capacitors is recovered. The various adiabatic circuits proposed in the literature can be grouped into two fundamental classes: fully adiabatic circuits, and quasi-adiabatic or partial energy recovery circuits. Circuits of the first class can, under particular working conditions, consume asymptotically zero energy per operation, but their large area and design complexity make them uncompetitive with traditional CMOS. Circuits of the second class are designed to recover a large portion of the energy stored in the circuit node capacitances. Their energy loss is a drawback, but it permits a good trade-off involving circuit complexity and hence area occupation. In this paper, different circuits are presented, and conventional CMOS adder circuits [1][2][4][5][6], 2PASCAL [10], and adiabatic families [3][7][8][9] are compared. In this work, we analyze the performance of conventional and adiabatic adder circuits in terms of power consumption.
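
The energy argument behind adiabatic switching can be illustrated with the standard first-order expressions: abruptly charging a capacitance C to voltage V through a switch dissipates CV²/2, while charging it with a linear ramp of duration T through resistance R dissipates approximately (RC/T)·CV² when T is much larger than RC. The component values in the sketch below are hypothetical and chosen only to show the scaling.

```python
def conventional_energy(c, v):
    """Energy dissipated when a capacitance is charged abruptly to v."""
    return 0.5 * c * v ** 2


def adiabatic_energy(r, c, v, t_ramp):
    """Approximate dissipation for ramp charging, valid for t_ramp >> r*c."""
    return (r * c / t_ramp) * c * v ** 2


if __name__ == "__main__":
    # Hypothetical values: 1 kOhm path, 10 fF node, 1 V swing, 1 ns ramp.
    r, c, v, t_ramp = 1e3, 10e-15, 1.0, 1e-9
    e_conv = conventional_energy(c, v)
    e_adia = adiabatic_energy(r, c, v, t_ramp)
    print(f"conventional: {e_conv:.2e} J")
    print(f"adiabatic:    {e_adia:.2e} J")
    print(f"ratio:        {e_conv / e_adia:.1f}")
```

With these numbers RC is 10 ps, one hundredth of the ramp time, and the ramp-charged node dissipates about fifty times less energy. Slowing the ramp further reduces the loss proportionally, which is the sense in which the dissipation is asymptotically zero.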

ABSTRACT
The objective of this project is to design high-performance arithmetic circuits that are faster and consume less power using a new dynamic CMOS logic family, and to analyze its performance for sequential circuits and the effects of cascading. This new dynamic logic family is known as feedthrough logic. It has two basic structures: high speed (HS0) and low power (LP0). It allows a computational block to begin evaluation before its inputs are valid and to complete a quick final evaluation as soon as they are. This dynamic logic family is best suited to arithmetic circuits because their critical path is made of a long chain of cascaded inverting gates, and the major advantage of this logic, higher speed, is observed upon cascading. We compare 4-bit and 16-bit ripple carry adders in domino logic with those built from the two basic structures. Experimental results show that the low-power structure provides a smaller power-delay product than domino logic.


Abstract- Addition is the most fundamental arithmetic operation used in digital data path logic systems. Arithmetic and logic units and microprocessors are examples where arithmetic operations are needed for processing data and for calculating addresses, respectively. There are different architectures for building an adder circuit, for example: 1) carry look-ahead adder (CLA), 2) carry propagate adder (CPA), 3) carry save adder (CSA), and 4) carry select adder (CSLA). Among these architectures, the CSLA performs addition rapidly and is used for fast addition in many data processors. From observation of the carry select adder architecture, we can see that there is scope for modification to significantly reduce the area and power consumed by the circuit. In this work, we propose a simple and efficient modification to the gate-level structure of the CSLA. Based on this, 16- and 32-bit square root CSLAs (SQRT CSLA) have been developed and compared with the regular structure. The proposed design has reduced area and power consumption compared to the regular structure, with a slight increase in delay. The proposed design is evaluated on delay, area, and power metrics. The results show that the proposed CSLA design is better than the regular SQRT CSLA. Index Terms- Area and energy efficient, CSLA, arithmetic operations, SQRT CSLA, data path logic systems.
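
For readers unfamiliar with the carry select principle, the following behavioral sketch shows a regular CSLA with uniform block size. Each block computes its sum twice, once for carry-in 0 and once for carry-in 1, and a multiplexer picks the right result when the real carry arrives. It does not model the proposed gate-level modification or the square-root block sizing, and the function names and default widths are illustrative assumptions.

```python
def ripple_add(a_bits, b_bits, cin):
    """Ripple carry addition of two little-endian bit lists."""
    s, c = [], cin
    for a, b in zip(a_bits, b_bits):
        s.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return s, c


def carry_select_add(a, b, width=16, block=4):
    """Behavioral model of a regular carry select adder."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    carry, out = 0, []
    for lo in range(0, width, block):
        blk_a = a_bits[lo:lo + block]
        blk_b = b_bits[lo:lo + block]
        # Both candidate results are computed without waiting for carry.
        s0, c0 = ripple_add(blk_a, blk_b, 0)
        s1, c1 = ripple_add(blk_a, blk_b, 1)
        # The incoming carry acts as the multiplexer select signal.
        out += s1 if carry else s0
        carry = c1 if carry else c0
    total = sum(bit << i for i, bit in enumerate(out))
    return total, carry


if __name__ == "__main__":
    total, cout = carry_select_add(40000, 30000)
    print(total, cout)  # 16-bit sum and carry-out
```

The duplicated ripple adders per block are the source of the area and power overhead that the abstract's modification targets.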
