SOC architecture and design
• system-on-chip (SOC)
– processors: become components in a system
• SOC covers many topics
– processor: pipelined, superscalar, VLIW, array, vector
– storage: cache, embedded and external memory
– interconnect: buses, network-on-chip
– impact: time, area, power, reliability, configurability
– customisability: specialized processors, reconfiguration
– productivity/tools: model, explore, re-use, synthesise, verify
– examples: crypto, graphics, media, network, comm, security
– future: autonomous SOC, self-optimising/verifying design
• our focus
[Figure: iPhone SOC with a 1 GHz ARM Cortex A8; labelled blocks include I/O and Processor]
Basic system-on-chip model
AMD’s Barcelona Multicore Processor
[Figure: die layout showing Cores 1 to 4, one 512KB L2 cache per core, a 2MB shared L3 cache, and the Northbridge]
• 4 out-of-order cores
• 1.9 GHz clock rate
• 65nm technology
• 3 levels of caches
• integrated Northbridge
SOC vs processors on chip
• with lots of transistors, designs move in 2 ways:
– complete system on a chip
– multi-core processors with lots of cache
                System on chip                     Processors on chip
processor       multiple, simple, heterogeneous    few, complex, homogeneous
cache           one level, small                   2-3 levels, extensive
memory          embedded, on chip                  very large, off chip
functionality   special purpose                    general purpose
interconnect    wide, high bandwidth               often through cache
Processor types: overview
Processor type   Architecture / Implementation approach
SIMD             Single instruction applied to multiple functional units
Vector           Single instruction applied to multiple pipelined registers
VLIW             Multiple instructions issued each cycle under compiler control
Superscalar      Multiple instructions issued each cycle under hardware control
Processors for SOCs
SOC                                  Basic ISA     Processor description
Freescale e600: signal processing    PowerPC       Superscalar with vector extension
ClearSpeed CSX600: general           Proprietary   Array processor with 96 processing elements
PlayStation 2: gaming                MIPS          Pipelined with 2 vector coprocessors
ARM VFP11: general                   ARM           Configurable vector coprocessor
Sequential and parallel machines
• basic single stream processors
– pipelined: overlap operations in basic sequential
– superscalar: transparent concurrency
– VLIW: compiler-generated concurrency
• multiple streams, multiple functional units
– array processors
– vector processors
• multiprocessors
Pipelined processor
Instruction #1   IF ID AG DF EX WB
Instruction #2      IF ID AG DF EX WB
Instruction #3         IF ID AG DF EX WB
Instruction #4            IF ID AG DF EX WB
                 Time -->
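The overlap in the pipeline diagram above can be checked with a little arithmetic. This is a minimal sketch, assuming an ideal pipeline in which every stage takes one cycle and nothing stalls; the function names are illustrative and do not come from the slides.

```python
STAGES = ["IF", "ID", "AG", "DF", "EX", "WB"]  # stage names from the slide


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles if each instruction finishes every stage before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles for an ideal pipeline: fill it once, then one completion per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def schedule(n_instructions):
    """Map instruction number -> {stage: cycle}, issuing one instruction per cycle."""
    return {
        i + 1: {stage: i + s + 1 for s, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    }


if __name__ == "__main__":
    print(sequential_cycles(4))   # 24
    print(pipelined_cycles(4))    # 9
    print(schedule(4)[4]["WB"])   # 9: instruction #4 writes back in cycle 9
```

For the four instructions drawn on the slide, the ideal pipeline needs 9 cycles rather than 24; real pipelines fall short of this whenever hazards force stalls.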
Superscalar and VLIW processors
[Figure: instructions #1 to #6, each passing through stages IF ID AG DF EX WB, with more than one instruction issued per cycle]
• superscalar: hardware for parallelism control
• VLIW: parallelism under compiler control
Array processors
• perform op if condition = mask
• operand can come from neighbour
• instruction format: mask | op | dest | sr1 | sr2
• one instruction drives n PEs, each with memory; neighbour communications
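The masked, broadcast style of execution described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `PE` class, the `broadcast` function, the register-list layout, and the choice of the left neighbour with wraparound are all hypothetical, not taken from any particular array processor.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


class PE:
    """One processing element with its own registers and a condition flag."""

    def __init__(self, regs, condition):
        self.regs = list(regs)
        self.condition = condition


def broadcast(pes, mask, op, dest, sr1, sr2, sr2_from_neighbour=False):
    """Apply one instruction to every PE whose condition equals the mask."""
    n = len(pes)
    # Snapshot the registers so every PE reads pre-instruction values.
    snapshot = [list(pe.regs) for pe in pes]
    for i, pe in enumerate(pes):
        if pe.condition != mask:
            continue  # masked-off PEs do nothing for this instruction
        a = snapshot[i][sr1]
        # Assumption: the neighbour is the PE to the left, with wraparound.
        src = (i - 1) % n if sr2_from_neighbour else i
        b = snapshot[src][sr2]
        pe.regs[dest] = OPS[op](a, b)


if __name__ == "__main__":
    pes = [PE([i, 10, 0], condition=i % 2) for i in range(4)]
    broadcast(pes, mask=1, op="add", dest=2, sr1=0, sr2=1)
    print([pe.regs[2] for pe in pes])  # [0, 11, 0, 13]
    broadcast(pes, mask=0, op="add", dest=2, sr1=0, sr2=0, sr2_from_neighbour=True)
    print([pe.regs[2] for pe in pes])  # [3, 11, 3, 13]
```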
Vector processors
• vector registers, eg 8 sets x 64 elements x 64 bits
• vector instructions: VR3 = VR2 VOP VR1
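A vector instruction of the form VR3 = VR2 VOP VR1 can be modelled as one operation applied across all elements of two vector registers. The sketch below uses the register file size from the slide (8 registers of 64 elements, 64 bits each); the `VectorUnit` class and the unsigned wraparound to 64 bits are assumptions made for illustration.

```python
import operator

N_REGS, N_ELEMS, ELEM_BITS = 8, 64, 64  # sizes from the slide's example
ELEM_MASK = (1 << ELEM_BITS) - 1


class VectorUnit:
    """Toy vector register file with element-wise vector operations."""

    def __init__(self):
        self.vr = [[0] * N_ELEMS for _ in range(N_REGS)]

    def vop(self, dest, src2, src1, op):
        """VR[dest] = VR[src2] op VR[src1], applied element by element."""
        # Assumption: results wrap to 64 bits as unsigned values.
        self.vr[dest] = [
            op(a, b) & ELEM_MASK
            for a, b in zip(self.vr[src2], self.vr[src1])
        ]


if __name__ == "__main__":
    vu = VectorUnit()
    vu.vr[1] = list(range(N_ELEMS))
    vu.vr[2] = [100] * N_ELEMS
    vu.vop(3, 2, 1, operator.add)   # VR3 = VR2 + VR1
    print(vu.vr[3][:4])             # [100, 101, 102, 103]
```

A single call to `vop` stands in for 64 scalar operations, which is where a vector processor gets its throughput.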
Memory addressing: three levels
(each segment contains pages for a program/process)
User view of memory: addressing
• a program: process address (offset + base + index)
– virtual address: from page address and process/user id
• segment table: process base and bound (for each process)
– system address: process base + page address
• pages: active localities in main/real memory
– virtual address: page table lookup to physical address
– page miss: virtual pages not in page table
• TLB (translation look-aside buffer): recent translations
– TLB entry: corresponding real and (virtual, id) address
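The translation steps listed above can be sketched as a small model: check the segment bound, add the process base, then look up the page in the TLB and fall back to the page table. This is a simplified illustration, not a description of any real MMU; the `Translator` class, the 4 KB page size, the dictionary-based tables, and the unbounded TLB are all assumptions.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageMiss(Exception):
    """Raised when a virtual page is not in the page table."""


class Translator:
    def __init__(self, segment_table, page_table):
        self.segment_table = segment_table  # process id -> (base, bound)
        self.page_table = page_table        # (process id, virtual page) -> real page
        self.tlb = {}                       # recent translations (unbounded here)
        self.tlb_hits = 0

    def translate(self, pid, process_address):
        # Segment table: find the process and check its bound.
        base, bound = self.segment_table[pid]
        if not 0 <= process_address < bound:
            raise ValueError("address outside segment bound")
        system_address = base + process_address
        page, offset = divmod(system_address, PAGE_SIZE)
        key = (pid, page)  # virtual page tagged with the process id
        if key in self.tlb:
            self.tlb_hits += 1
            real_page = self.tlb[key]
        else:
            if key not in self.page_table:
                raise PageMiss(key)
            real_page = self.page_table[key]
            self.tlb[key] = real_page  # remember the translation
        return real_page * PAGE_SIZE + offset


if __name__ == "__main__":
    t = Translator({7: (0x10000, 0x3000)}, {(7, 16): 3, (7, 17): 5})
    print(hex(t.translate(7, 0x1234)))  # 0x5234 (page table lookup)
    print(hex(t.translate(7, 0x1238)))  # 0x5238 (TLB hit)
    print(t.tlb_hits)                   # 1
```

A real TLB has a fixed number of entries and a replacement policy; the dictionary here only shows why a repeated access to the same page avoids the page table lookup.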
TLB and Paging: Address translation
[Figure: address translation; the process base (find process) forms the system address, the virtual address is looked up in the page table (find page), and the TLB holds recent translations]
SOC interconnect
• interconnecting multiple active agents requires
– bandwidth: capacity to transmit information (bps)
– protocol: logic for non-interfering message transmission
• bus
– AMBA (Adv. Microcontroller Bus Architecture) from ARM, widely used for SOC
– bus performance: can determine system performance
• network on chip
– array of switches
– statically switched: eg mesh
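To make the bandwidth requirement concrete, the peak capacity of a bus in bits per second is its width multiplied by its transfer rate. The sketch below is a back-of-the-envelope aid only: the function names, the 64-bit width, the 200 MHz clock, and the equal sharing among agents are hypothetical, and protocol overhead and arbitration are ignored.

```python
def peak_bandwidth_bps(width_bits, clock_hz, transfers_per_cycle=1):
    """Peak bus bandwidth in bits per second, ignoring protocol overhead."""
    return width_bits * clock_hz * transfers_per_cycle


def fair_share_bps(total_bps, n_agents):
    """Bandwidth per agent if n active agents share the bus equally."""
    return total_bps / n_agents


if __name__ == "__main__":
    total = peak_bandwidth_bps(width_bits=64, clock_hz=200_000_000)
    print(total)                     # 12800000000
    print(fair_share_bps(total, 4))  # 3200000000.0
```

Dividing one bus among more agents shrinks each agent's share, which is one way bus performance ends up limiting system performance.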
Design cost: product economics
• increasingly product cost determined by
– design costs, including verification
– not marginal cost to produce
• manage complexity in die technology by
– engineering effort
– engineering cleverness
• design effort
– often dictated by
[Figure: design time and effort against design complexity; labels include basic physical tradeoffs and processors]
Cost: product program vs engineering
[Figure: product cost breakdown into manufacturing costs, engineering, and marketing, sales and administration, grouped as fixed costs and variable costs. Engineering costs: chip design, CAD support, software, verify & test. Also labelled: mask costs, labor costs.]
Example: two scenarios
• fixed costs Kf, support costs 0.1 x function(n), and variable costs Kv x n, so total cost is their sum: Kf + 0.1 x function(n) + Kv x n
• design gets more complex, while production costs decrease
– Kf increases while Kv decreases
– if same price, requires higher volumes to break even
• when compared with 1995, in 2015
– Kf increased by 10 times
More recent: higher NRE
[Figure: 2015 compared with 1995]
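The break-even argument in the two scenarios can be sketched numerically, assuming total cost is the sum of the fixed, support and variable terms listed above. The support term is left out here because function(n) is not specified, and every figure in the example (price, Kf, Kv) is hypothetical; only the tenfold increase in Kf comes from the slides.

```python
import math


def break_even_volume(price, kf, kv):
    """Smallest volume n at which revenue price*n covers Kf + Kv*n.

    Assumption: the support term 0.1 x function(n) is ignored.
    """
    if price <= kv:
        raise ValueError("price must exceed the variable cost per unit")
    return math.ceil(kf / (price - kv))


if __name__ == "__main__":
    # Hypothetical figures: same price, Kf ten times higher, Kv lower.
    n_1995 = break_even_volume(price=30, kf=1_000_000, kv=20)
    n_2015 = break_even_volume(price=30, kf=10_000_000, kv=10)
    print(n_1995)  # 100000
    print(n_2015)  # 500000
```

With these made-up numbers the lower variable cost only partly offsets the higher fixed cost, so the same price needs five times the volume to break even.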