• No results found

Computing for Data-Intensive Applications:

N/A
N/A
Protected

Academic year: 2021

Share "Computing for Data-Intensive Applications:"

Copied!
11
0
0

Loading.... (view fulltext now)

Full text

(1)

Computing for Data-Intensive Applications:

Beyond CMOS and Beyond Von-Neumann

Workshop on Memristive systems for Space applications European Space Agency, ESTEC 30 April 2015, Noordwijk, Netherlands

Said Hamdioui

Computer Engineering Delft University of Technology The Netherlands

© Said Hamdioui , Computer Engineering, TUDelft

The big picture

2 • High cost • Reduced reliability • Saturated Clk • Higher power

Technology

Computers

Applications

• Storage • Computing efficiency What is the solution?
(2)

© Said Hamdioui , Computer Engineering, TUDelft

Contents

Technology

• The good, the bad and the challenging

Computing

• The good, the bad and the challenging

Future computers: possible scenarios

Toward a new computing paradigm: CIM Architecture

• Technology, potential & open questions

Conclusion

3

© Said Hamdioui , Computer Engineering, TUDelft

Technology…….

The good

Density: double

• Dimensions reduce by 30%

• WxL: 1x1 0.7x0.7 = 0.5

• Reduced IC cost

Performance: 43% increase

• Gate delay reduced by 30%

• C=(0.7*0.7)/0.7= 0.7 • freq =1/0.7 = 1.43

• uP freq ~ doubled every generation (till 2004)

Power

• Constant voltage scaling

• Power= C x V2x f = 1

• Constant field scaling

• Power= C x V2x f= 0.5

4

[source: Mark Horowitz, Stanford Uni] [Groeseneken, ESSSDERC’10] Vdd=1 Vdd=0.7 Vdd=1

(3)

© Said Hamdioui , Computer Engineering, TUDelft

Technology…….

The bad

Normal lifetime ~ 3 to 15 years Wearout Infant mortality ~ 1 to 20 weeks Failure rate Time Technology Scaling: Increasing transient errors Increasing wearout failures Burn-in out Of steam?

Components are becoming

Unreliable

FIT

[Ref: NASA]

Ref. H-Y. Kang, 2005

Higher failure rate & reduced life time

5

© Said Hamdioui , Computer Engineering, TUDelft

Technology…….

The challenging (long term)

High static leakage (volatile)

Unreliable components

• Expensive solutions

• Economically not affordable

Complex manufacturing

• Low yield • High cost

• Limited scalability • Comes at additional cost

Less/no economical benefit

=> need of new device technologies

6

Source: ITRS + SEMATEC

END of CMOS scaling

(4)

© Said Hamdioui , Computer Engineering, TUDelft

Computing…….

The good

Early computers

Core

DRAM µProc: ~55%/year (2X/1.5yr) DRAM: 7%/year (2X/10yrs) CPU-Mem Perfor. Gap (grows 50%/year)

Core

DRAM Cache 1 core with embedded cache Ref: Intel

Core n

DRAM L1 Cache

Core 1

L1 Cache L2 Cache

Multi/many core with shared L2

7

[source: Mark Horowitz, Stanford Uni]

TUDelft/ Computer Engineering

Computing…….

The bad

• High energy consumption

• Dominated by com & memory • 70 to 90% for data-ints. appl

• Limited data-bandwidth

• Communication bottleneck • Stored program principle

• Complex programmability

• Memory coherency

• programmability overheard (scheduling policies, priorities, mapping, ..)

• Reduced / saturated performance

• Enhancement based on expensive on chip memory (~70% of area)

• Requires LD & ST: killers of overall perf

Chip-level energy trends

Source: S. Borkar, “Exascale Computing - a fact or a fiction?,” IPDPS, 2013

HPC system-level power break-down

Source: R. Nair, “Active Memory Cube,” 2ndWorkshop on Near-Data Processing, 2014

2012 2018

(5)

TUDelft/ Computer Engineering

Computing…….

The bad

Ref: Jorn W. Janneck, Computing in the age of parallelism: Challenges and opportunities, Multicore Day, Sept 2013

9

TUDelft/ Computer Engineering

Computing…….

The bad

Ref: Jorn W. Janneck, Computing in the age of parallelism: Challenges and opportunities, Multicore Day, Sept 2013

Intel desktop processor

Speed-up is no longer the result of a faster clock…

Instead, we get more

parallel devices

…...

BUT:

hard to scale!!

(6)

TUDelft/ Computer Engineering

Computing…….

The challenging

Extremely data intensive application (Big Data problems)

• Economics, business, science, social media, healthcare, etc. • Data storage and analysis

E.g., 1PB@1000MB/sec = 12,5 days!

• Assume transfer rate of Front Side Bus at 1000MB/s

Speed information increase exceed Moore’s law

Data size has already surpassed the capabilities of today’s

computation architectures

• All suffer from communication and memory bottleneck

=> need of new architectures

Core

Memory

11

© Said Hamdioui , Computer Engineering, TUDelft

Future computers: possible scenarios

Near memory computing

• Bring computation closer to data

• From computer-centric to data-centric model • Old concept, but renewed interest due to:

• Technology trends • Big data

• New technologies (3D)

HP: Machine

• Discard conv. computer model? • Flatten complex data hierarchies • Bring processing closer to the data • Electronics for computing, Photons for

communications and ions for storage

Source: HP

(7)

© Said Hamdioui , Computer Engineering, TUDelft

Toward a new computing paradigm

How to address: memory wall/ communication and big data?

Core n DRAM L1 Cache Core 1 L1 Cache L2 Cache

• Integrate storage and computation in the same physical location

• Significantly reduces communication/ memory bottleneck

• Use non-volatile technology

• Practically zero leakage

• Support massive parallelism

• Crossbar architecture

Core +

Data memory

External Memory

????

[Source: S. Hamdioui, et.al, DATE 2015] 13

© Said Hamdioui , Computer Engineering, TUDelft

Toward a new computing paradigm

CIM architecture: Computing in Memory

Communication & Control

Co mmunicatio n & Co ntr ol Top electrode Bottom electrode Junction with switching material

Crossbar topology

• Dense, non-volatile two terminal device at each junction

No/ limited memory wall

• Data will be loaded only at the first time

Scalability at low cost

• Significant reduction in area

• Nanomtric device dimension (below 10 nm)

Core +

data memory

External Memory

[Source: S. Hamdioui, et.al, DATE 2015] 14

(8)

© Said Hamdioui , Computer Engineering, TUDelft

Toward a new computing paradigm:

CIM technology

Requirements for switching device

• Dual functionality

• Realize both memory and logic functions • Low energy consumption

• Low/zero leakage: Non-volatility; • Reduce the overall power consumption • Scalability/ Nanometric dimensions

• Realize extreme density at low price and reduce area • CMOS compatibility

• Enhance manufacturing at low cost

• Two terminal device with metal/insulator/metal structure • Realize the crossbar architecture

• Good endurance & Good Reliability

• Solution

Emerging resistive switching devices = Memristors*

* L.Chua, “Resistance Switching Memories Are Memristors”, 2014, Memristor Networks page 21-51, Springer

15

© Said Hamdioui , Computer Engineering, TUDelft

CIM technology

: Memristor

Invention

• 1971: Leon Chua proposed • resistor with memory

• “hysteretic” current-voltage

characteristic

Demonstration

• 2008, HP manufactured memristor • Analysed different properties

• integration density, leakage,..

Potential applications

• Memory • Logic • Etc

(9)

© Said Hamdioui , Computer Engineering, TUDelft

CIM potential

Examples

• Healthcare: DNA sequencing

• we assume we have 200 GB of DNA data to be compared to • A healthy reference of 3GB for 50% coverage**

• Mathematic: 106parallel additions

Assumptions

• Conventional architecture

• FinFET 22nm multi-core implementation, with scalable number of

clusters, each with 32 ALU (e.g comparator)

• 64 clusters; each cluster share a 8KB L1 cache • CIM architecture

• Memristor 5nm crossbar implementation

• The crossbar size equals to total cache size of CMOS computer [**E. A. Worthey, Current Protocols in Human Genetics, 2001]

[Source: S. Hamdioui, et.al, DATE 2015] 17

© Said Hamdioui , Computer Engineering, TUDelft

CIM potential

Metrics

• Energy-delay/operation

• Computing efficiency : number of operations per required energy • Performance area : number of operations per required area

Results

> x100

> x100

> x100

[Source: S. Hamdioui, et.al, DATE 2015]

106 additions 1.5043e-18 9.25702-21 6.5226e+9 3.9063e+12 5.1118e+09 4.9164e+12 Key drives: Reduced memory bottleneck, non-volatile

technology, massive parallelism

(10)

© Said Hamdioui , Computer Engineering, TUDelft

CIM Partners

19

© Said Hamdioui , Computer Engineering, TUDelft

8. Conclusion

Constant voltage scaling (CMOS)

• Complex faults & unreliable components • Higher leakage: volatile technology

• Increasing cost: Lower yield, limited scalability, … • Higher business pressure

=> limits the applicability & benefitability

Long term:

Alternative device technologies;

E.g., Memristor

Short term:

Not only new tools/ IC flow required,

but also innovations in DFX/BISX both for manufacturing and

online/in field testing, monitoring, characterization, recovery, ….

(11)

© Said Hamdioui , Computer Engineering, TUDelft

8.Conclusion

Von-Neumann based computers

• Memory & communication bottleneck • Complex progammability of multi-cores • Higher power consumption

• Big data & data-intensive applications

=> Unable to solve (today) and future big data problems

Long term:

Alternative architecture, beyond Von Neumann

E.g., CIM architecture

Short term:

Specialization: application-specific accelerators (reduced prog)

Near memory computing, accelerator around memories

(data-centric model)

21

Computing for Data-Intensive Applications:

Beyond CMOS and Beyond Von-Neumann

Workshop on Memristive systems for Space applications European Space Agency, ESTEC 30 April 2015, Noordwijk, Netherlands

Said Hamdioui

Computer Engineering Delft University of Technology The Netherlands

References

Related documents

The purpose of this study is to explore human capital productivity strategies used by THL business leaders in southern Nigeria that have improved employee productivity. This

Your child/student is invited to be in a research study of middle school student understanding and perception of differentiated instruction. This study is being conducted in an

Without a subject guide, I was unable to consider any collections that did not have these keywords but may have included archival materials created by LGBT individuals.. Most

PPE placed on the market must satisfy the basic health and safety requirements laid down in the Regulation on PPE and must not endanger the safety of users, other individuals,

The synorogenic strata on the west side of the basin closest to the Beartooth Range, also called proximal deposits, are variously mapped as upper Paleocene Fort Union Formation

Early season damage to plant terminals and young fruit may stimulate branch and leaf development improving canopy light interception and carbon gain (Sadras, 1996b).. An improved

Aside from its role in the synthesis of the sex hormones, acetyl-CoA, of which Coenzyme-A is the important component, it is also important in fatty acid metabolism as an acyl

Connecting services for advanced computing systems applications in healthcare data storage, private cloud solution to support.. cash forecasting spreadsheet clothing with