Chapter 9 Conclusions and Future work
9.3 Applications and future work.
There is no reason why Asynchrobatic Logic could not be commercialised. Obvious wide-data-path applications for Asynchrobatic Logic include Very Long Instruction Word (VLIW), Floating-Point, Single Instruction Multiple Data (SIMD), Vector and cryptographic processors. In the field of cryptography, as well as obvious targets like block ciphers, there is potential to implement data-paths to allow processing for Elliptic Curve Cryptography (ECC). These use Galois Field (GF) arithmetic over prime bit-widths, for example GF(2^173) [Leun03]. Another potential benefit of using these systems for cryptography is a reduction in side-channel information leakage. It is conjectured that the asynchronous nature of operations will make it harder to derive information from circuit timing, and that reversible (adiabatic) operation can reduce susceptibility to Differential Power Analysis (DPA) attacks [Thap06]. Furthermore, the dual-rail nature of the logic, the lower power consumption and what is effectively power-supply damping caused by the tank capacitors should also combine to further reduce this technology’s susceptibility to power analysis. This dual-rail implementation may also make the circuit more tamper resistant, because attempting to inject data could cause the dual-rail wiring to enter an invalid state, allowing this unauthorised access to be detected.
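The tamper-detection idea in the preceding paragraph can be illustrated with a small behavioural sketch. It is not taken from the thesis: the rail ordering, the use of both-rails-low as an idle state, and the names `RailState`, `decode_rail_pair`, `encode_word` and `tamper_detected` are all assumptions made for illustration. The only property it relies on is that a dual-rail pair with both rails asserted carries no legal value.

```python
from enum import Enum
from typing import List, Tuple


class RailState(Enum):
    NULL = "null"        # assumed idle state: both rails low, no data present
    ZERO = "zero"        # only the "false" rail is high
    ONE = "one"          # only the "true" rail is high
    INVALID = "invalid"  # both rails high: not a legal dual-rail code


def decode_rail_pair(true_rail: int, false_rail: int) -> RailState:
    """Classify one (true, false) rail pair under the assumed encoding."""
    if true_rail and false_rail:
        return RailState.INVALID
    if true_rail:
        return RailState.ONE
    if false_rail:
        return RailState.ZERO
    return RailState.NULL


def encode_word(value: int, width: int) -> List[Tuple[int, int]]:
    """Encode an unsigned integer as dual-rail pairs, least significant bit first."""
    return [((value >> i) & 1, 1 - ((value >> i) & 1)) for i in range(width)]


def tamper_detected(pairs: List[Tuple[int, int]]) -> bool:
    """Report whether any bit position holds the illegal both-high code."""
    return any(decode_rail_pair(t, f) is RailState.INVALID for t, f in pairs)


word = encode_word(0b1010, 4)
# Hypothetical fault: the "true" rail of bit 0 is forced high while the
# "false" rail is still asserted, producing the illegal both-high code.
faulty = list(word)
faulty[0] = (1, faulty[0][1])
print(tamper_detected(word), tamper_detected(faulty))
```

In this model a legitimately encoded word never triggers the check, whereas forcing a single rail high on a bit that holds the opposite value does. Whether a real injection attack would behave this way depends on the circuit and is not something the sketch can establish.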
The register-file structure presented by Moon et al. [Moon98] could be converted to Asynchrobatic operation. This would allow the efficient implementation of systems that require register-style storage. Whilst the availability of register-files is not absolutely essential, as they can be implemented by the feedback of reused values around a loop with a multiplexer (MUX) to allow new data to be written, this functional block would be one of the most desirable to implement, as the suggested alternative is nowhere near as efficient as a randomly addressable register-file. Register-file design does bring some additional complications, as the Static RAM (SRAM) cell would need to be margined to ensure that its contents can be reliably written, stored and read. Register-files are essential components in most processors.
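The loop-with-MUX alternative mentioned above can be sketched as a simple behavioural model. This is an illustrative assumption rather than a description of any circuit in the thesis: the class name `LoopStorage`, the `step` interface and the one-word-per-loop organisation are hypothetical, and timing, power-clock phases and handshaking are not modelled.

```python
def mux(select: int, when_zero: int, when_one: int) -> int:
    """Two-input multiplexer: pass when_one if select is set, else when_zero."""
    return when_one if select else when_zero


class LoopStorage:
    """One word of storage held by recirculating it around a feedback loop."""

    def __init__(self, width: int, initial: int = 0) -> None:
        self.mask = (1 << width) - 1
        self.value = initial & self.mask

    def step(self, write_enable: int, new_data: int = 0) -> int:
        # Each step models one trip around the loop: the MUX either feeds the
        # previous value back in (hold) or admits new data (write).
        self.value = mux(write_enable, self.value, new_data & self.mask)
        return self.value


reg = LoopStorage(width=8)
reg.step(write_enable=1, new_data=0xA5)  # write a new value into the loop
held = reg.step(write_enable=0)          # value is recirculated unchanged
print(hex(held))
```

In this model every stored word has to travel around its loop on every step, whether or not it is being accessed. That may be one way to read the efficiency gap the text describes relative to a randomly addressable register-file, although the sketch itself says nothing about energy or area.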
The power consumption of the asynchronous controller probably has potential for further optimisation. It could be reduced, for example, by using a lower-power logic style than standard static CMOS; further work could consider using sub-threshold, current-mode circuits.
With the demonstrated potential to implement fully reversible circuits using this technology, further research into implementing more complex reversible gates would be useful. Demonstrating more complex (and more useful) reversible gates is likely to generate interest in PFAL-based Asynchrobatic Logic among those researching reversible computation.
As noted previously, this work approached the idea from a position of greater strength in adiabatic logic than in asynchronous logic. Climbing the steeper learning curve for asynchronous logic has hinted at further areas of crossover that may be exploitable in the future. There is a substantial tranche of work in asynchronous logic that is predicated upon the use of Differential Cascode Voltage Switch Logic (DCVSL) circuitry. Given that the adiabatic logic circuits are also predicated, albeit in a different way, on DCVSL circuits, there may be further exploitable potential in this area. The best example of this is at the control/data-path interface, where the result of a data-path operation is a single bit that must influence the control structure. If this single bit cannot travel with other data in the wide data-path, then driving it with an Asynchrobatic power-clock would not be the most efficient method for its propagation, and moving it into a purely asynchronous domain is likely to be more efficient.
The majority of the work performed in the evaluation of Asynchrobatic Logic was conducted using sub-micron, rather than deep sub-micron or nanometre, processes. However, there is evidence from simulations that adiabatic circuits will operate when implemented in deep sub-micron processes, and no reason to expect this situation to change for nanometre processes. In fact, the availability of devices with different threshold voltages (VT) provides further optimisation potential for all parts of the design.
Given the extensive list of different adiabatic logic families that have been proposed, and which are catalogued in the appendix, it would be a worthwhile exercise to systematically compare the power, area and performance of a large number of these for several benchmark tests under defined process conditions on a variety of different CMOS processes. These benchmarking tests would ideally use components that are of use in production systems like arithmetic units or parts of cryptographic systems, rather than having tests based upon buffers or inverters.
The choice to limit decision-tree depth to four NMOS devices is based upon the Electrical Rule Checker (ERC) limits that are imposed on standard static CMOS to prevent problems caused by CMOS switches being resistive and non-ideal. However, due to the different nature of operation of Asynchrobatic Logic and the underlying adiabatic logic families in its data-path, it may be possible to waive this arbitrary limit in some situations. Further analysis of how this affects performance would be necessary before doing so.
Finally, other design methods for obtaining more optimal implementations of functions should be investigated as and when they are published.