We presented the design and evaluation of a many-core Recognition and Mining processor using spin-based memory technologies. We developed models of STT-MRAM and DWM memories, which were used to perform an evaluation of, and architectural exploration for, the RM processor. Our results demonstrate that spin-based memory has great potential for improving the performance of Recognition and Mining, and perhaps other parallel data-intensive workloads.
5. TAPESTRI: DESIGN OF DWM TAPES WITH SHIFT-BASED WRITE
As described in earlier chapters, spin-based memories offer considerable benefits in terms of density and leakage power. However, the use of an MTJ to perform writes in STT-MRAM and DWM results in high write energy and latency. Our analysis indicates that a 1 MB STT-MRAM cache requires 1.8X more write energy and 3.5X more write time than an SRAM cache of the same capacity. Further, the high write current required by MTJ-based writes demands large access transistors, which compromise the density benefits, and it aggravates the possibility of dielectric breakdown, leading to reliability concerns. In addition, the conflicting requirements of read and write operations impose stringent design constraints, resulting in reduced stability and increased read/write failures under variations.
Many previous efforts have proposed optimizations of MTJ-based writes, including different variants of STT-MRAM, hybrid caches, and volatile STT-MRAM designs, as described in Chapter 2. Although these proposals have resulted in notable improvements, there is still a significant gap between the write energy and latency of SRAM and those of spin-based memory.
In this chapter, we bring a new insight to address the challenge of write energy and latency in spintronic memory design: domain wall motion, which was originally proposed for performing shift operations in DWM, offers a fast, energy-efficient alternative for performing writes. The concept of domain wall motion is fundamentally different from the MTJ-based write mechanism used in both STT-MRAM and traditional DWM designs. This write mechanism has been experimentally demonstrated to be more efficient than MTJ-based write in terms of energy and latency, as well as more scalable for nanoscale magnets with perpendicular magnetic anisotropy (PMA) [21].
Our proposal, which we call tapestri (TAPE with ShifT based wRIte), leverages the above insight to achieve fast, energy-efficient write operations in spintronic memories. We propose the design of two different bit-cells, 1bitDWM and MultibitDWM, which are optimized for the differing requirements of the different levels of the cache hierarchy.
• 1bitDWM is a bit-cell that is designed to optimize performance. It retains all the benefits of STT-MRAM and can match SRAM in write efficiency. Moreover, unlike conventional DWM bit-cells, it does not require any shift operations. This allows it to be used in L1 cache, where spin memories have conventionally not been used due to the high write latency/energy.
• MultibitDWM is a bit-cell that is designed to maximally utilize the density benefits of domain wall memory. It achieves much higher density than STT-MRAM (and 1bitDWM) by storing multiple bits in a single cell. However, this design, in general, requires shift operations to be performed on a cell before read/write accesses.
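The shift requirement of a multi-bit cell can be illustrated with a minimal sketch. It is a toy behavioral model, not the MultibitDWM design itself: the class name `MultibitTape`, the single fixed access port, and the rule that moving between bit positions costs one shift per position are all assumptions made for illustration.

```python
class MultibitTape:
    """Toy multi-bit cell: a tape of bits with one fixed access port.

    Assumption (not from the text): a bit can be read or written only
    when it is aligned with the port, and alignment costs one shift
    per bit position moved.
    """

    def __init__(self, num_bits: int) -> None:
        self.bits = [0] * num_bits
        self.aligned = 0      # index of the bit currently under the port
        self.shift_count = 0  # total single-position shifts performed

    def _align(self, index: int) -> None:
        """Shift the tape until bit `index` sits under the access port."""
        if not 0 <= index < len(self.bits):
            raise IndexError("bit index out of range")
        self.shift_count += abs(index - self.aligned)
        self.aligned = index

    def read(self, index: int) -> int:
        self._align(index)
        return self.bits[index]

    def write(self, index: int, value: int) -> None:
        self._align(index)
        self.bits[index] = 1 if value else 0


tape = MultibitTape(8)
tape.write(5, 1)   # 5 shifts to bring bit 5 under the port
tape.read(5)       # already aligned, no additional shifts
tape.read(2)       # 3 shifts back toward bit 2
print(tape.shift_count)
```

In this model, consecutive accesses to the same bit need no shifts, while accesses to distant bits pay a cost proportional to the distance. A single-bit cell corresponds to `num_bits = 1`, for which the shift count always stays at zero.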
The proposed bit-cells also feature decoupled read-write paths that use an MTJ for reads and shifts for writes. This enables independent optimization of read and write operations, resulting in improved read/write stability along with fast, energy-efficient read and write operations. Also, the access transistors in 1bitDWM can be minimum-sized, which enables it to achieve density benefits similar to those of a 1T-1R STT-MRAM bit-cell.
The rest of this chapter is organized as follows. Section 5.1 describes the shift-based write mechanism. Section 5.2 presents the proposed 1bitDWM and MultibitDWM designs. Section 5.3 compares the characteristics of standalone caches designed using the proposed bit-cells with those of SRAM and STT-MRAM caches. Section 5.4 concludes the chapter.
5.1 Shift-based write
The write operation in spintronic memories is typically performed by injecting current into an MTJ, which switches the nanomagnet through a mechanism called Spin-Transfer Torque (STT), as shown in Figure 5.1a. To achieve a successful write, a current of appropriate magnitude must be passed through the MTJ for a sufficient duration. This write mechanism leads to high write energy and write latency, which is a major challenge in designing spin-based memories.
Fig. 5.1.: Different write mechanisms in spintronic memories. (a) MTJ-based write: fixed layer, tunneling oxide, and free layer, with write currents Iwrite0 and Iwrite1. (b) Shift-based write: fixed domains and free domain, with write currents IWrite0 and IWrite1.
However, a recent development in DWM technology [21] has eliminated this inefficiency. It has been experimentally shown that domain wall motion can also be used to perform fast, energy-efficient writes in DWMs. This property, often referred to as shift-based write, is illustrated in Figure 5.1b. The structure for write operations consists of a ferromagnetic wire with three domains: two fixed domains and a free domain. The magnetizations of the two fixed domains are set to up-spin and down-spin, respectively, during fabrication. The magnetization of the free domain, which is sandwiched between the fixed domains, can be changed by shifting the magnetization of one of the fixed domains into it, which is done by applying a current pulse in the appropriate direction.
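The three-domain structure described above can be captured in a small behavioral sketch. This is an illustrative model under stated assumptions, not a device simulation: the class name `ShiftWriteCell`, the direction labels `"left_to_right"` and `"right_to_left"`, and the mapping from pulse direction to the fixed domain that gets shifted in are hypothetical conventions chosen for the example.

```python
from dataclasses import dataclass

UP, DOWN = "up", "down"


@dataclass
class ShiftWriteCell:
    """Toy three-domain wire: fixed domain | free domain | fixed domain."""

    left_fixed: str = UP     # set during fabrication, never changes
    right_fixed: str = DOWN  # set during fabrication, never changes
    free: str = UP           # the stored value

    def write(self, pulse_direction: str) -> None:
        # Assumption: a "left_to_right" pulse shifts the left fixed
        # domain's magnetization into the free domain, and a
        # "right_to_left" pulse shifts in the right fixed domain's.
        if pulse_direction == "left_to_right":
            self.free = self.left_fixed
        elif pulse_direction == "right_to_left":
            self.free = self.right_fixed
        else:
            raise ValueError(f"unknown pulse direction: {pulse_direction!r}")


cell = ShiftWriteCell()
cell.write("right_to_left")
print(cell.free)   # free domain now copies the right fixed domain
cell.write("left_to_right")
print(cell.free)   # free domain now copies the left fixed domain
```

The sketch highlights the point made in the text: the written value is selected only by the direction of the current pulse along the wire, and the fixed domains themselves are never modified by a write.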