
Inside Tiger Lake:

Intel’s Next Generation Mobile Client CPU

Xavier Vera

Principal Engineer, Client Performance Architecture Intel Corporation

(3)

Tiger Lake Architecture Goals

Greater than generational performance in CPU

Disruptive performance in integrated graphics

Scalable AI for emerging client workloads

Increased memory & fabric efficiency for high bandwidth

Best in class set of IPs throughout the SoC

Continue to advance Intel’s SoC Security

... all at the same power envelopes, with increased power efficiency


(4)

Outline

Intel 10nm SuperFin Process Technology

SoC Architecture

Willow Cove Core

Xe Graphics

SoC Uncore, IO and Display

Power Management

(5)

New High Performance Transistor (XTOR)

Additional Gate Pitch: higher drive current

Enhanced Epitaxial Source/Drain: lower resistance, increased strain

Improved Gate Process: higher channel mobility

(6)

Improved Metal Stack

Innovation Across The Entire Process Stack, From Channel To Interconnects

Novel Thin Barrier: reduces via resistance by 30%

Super MIM Capacitor: >5x increase in MIM capacitance

Thin layers of different Hi-K materials, each just a few Angstroms thick, stacked in a repeating

(7)

Architecture and Design Improvements

(8)

Introducing Tiger Lake

[Block diagram: Tiger Lake CPU. Each block is marked as new over Ice Lake, upgraded over Ice Lake, or the same as Ice Lake.]

Blocks: 4x Willow Cove Core, Coherent Fabric, Last Level Cache (LLC; 12MB, non-inclusive), Graphics and Media, GNA 2.0, IPU6, Display Engine, Display IO, Non-Coherent Fabric, DDRIO, PCIe Gen4, Thunderbolt 4.0, USB 4.0, DP 1.4, Type-C, OPIO, Power Management, FIVR, SVID, Clock, Fuse, Debug, JTAG, SGX

Memory (DDRIO):

• DDR4: up to 3200MT/s, max capacity 64GB

• LPDDR4x (initial): up to 4267MT/s, max capacity 32GB

• LPDDR5 (future): up to 5400MT/s, max capacity 32GB

• Total Memory Encryption (AES-XTS, 128b key)

Other callouts:

• 12MB non-inclusive LLC

• IO caching directly into LLC

• Non-inclusive 1.25MB MLC

• S0ix power enhancements

• Autonomous fabric frequency control

• Increased operating frequency at given voltage

• Telemetry Aggregator

• Low Power Closed Chassis Debug

• Remote debug and triage

• Volume Management Device (VMD)

• Up to 6 camera sensors

• Hardware implemented pipeline

• Video up to 4K90 (initial 4K30)

• Up to 8K display

• 4 pipes

• Support for 64GB/s of read bandwidth

• Optimized for no-device-attached standby power

• First time on mobile

(9)

Willow Cove Core

Built on the Sunny Cove architectural foundation

Redesigned fundamental circuits to take advantage of SuperFin technology

Redesigned caching architecture with a larger, non-inclusive 1.25MB MLC

Control Flow Enforcement technology to help

(10)

The Result

[Figure: frequency versus voltage curves for Sunny Cove and Willow Cove]

Large Frequency Gains

Greater Dynamic Range

(14)

Graphics

Large improvements in performance per watt efficiency

Up to 96EUs with increased capabilities

3.8MB L3 cache

Increased bandwidth to LLC and Memory

(15)

Fabrics and Memory

Coherent Fabric

• 2x increase in coherent fabric bandwidth

– Dual ring microarchitecture

• 50% increase in LLC size, now non-inclusive

• IO caching

Memory

• More efficient memory bandwidth for graphics and cores

– Support for up to ~86GB/s of memory bandwidth

– Deeper, narrower dual memory controller for higher efficiency

• Architectural support for LP4x-4267 and DDR4-3200 (initial), and up to LP5-5400
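
The ~86GB/s figure is consistent with the theoretical peak of LP5-5400, if one assumes a 128-bit total memory interface. The bus width is an assumption here (it is not stated on the slide), and the function name is illustrative. A minimal sketch of the arithmetic:

```python
# Theoretical peak memory bandwidth = transfer rate x bytes per transfer.
# ASSUMPTION: a 128-bit total memory interface; the slide does not state the width.

def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int = 128) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9

# Transfer rates (MT/s) taken from the slide.
MEMORY_CONFIGS = {"DDR4-3200": 3200, "LP4x-4267": 4267, "LP5-5400": 5400}

if __name__ == "__main__":
    for name, rate in MEMORY_CONFIGS.items():
        print(f"{name}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

These are upper bounds; sustained bandwidth in practice is lower and depends on the access pattern.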

(16)

Display

Connectivity

The future calls for more displays, higher resolutions and better image quality

– 8K displays

Added a dedicated fabric path (DIP) to memory to maintain quality of service

– Up to 64GB/s of isochronous bandwidth

External display connectivity through DP/HDMI protocols
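
To give a sense of scale for the isochronous bandwidth figure, the sketch below estimates the read bandwidth of a single uncompressed display stream. The 60Hz refresh rate and 4 bytes per pixel are assumptions chosen for illustration; the slide specifies neither, and real pipelines may use other formats or compression.

```python
# Rough read bandwidth for one uncompressed display stream.
# ASSUMPTIONS: 60 Hz refresh and 4 bytes per pixel; neither appears on the slide.

def stream_bandwidth_gbs(width: int, height: int, refresh_hz: float,
                         bytes_per_pixel: int = 4) -> float:
    """Return the scan-out bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return width * height * refresh_hz * bytes_per_pixel / 1e9

ISOCH_LIMIT_GBS = 64.0  # "up to 64GB/s" from the slide

if __name__ == "__main__":
    bw_8k = stream_bandwidth_gbs(7680, 4320, 60)
    share = bw_8k / ISOCH_LIMIT_GBS
    print(f"8K60 stream: {bw_8k:.2f} GB/s ({share:.0%} of the isochronous limit)")
```

Under these assumptions, a single 8K stream uses only a fraction of the stated limit, which leaves headroom for multiple displays and other isochronous traffic.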

(17)

IO

• PCIe Gen4 on CPU for low-latency, high-bandwidth device access to memory

– Full 8GB/s bandwidth to memory

– ~100ns less latency when attached to CPU vs. PCH

• Integrated Thunderbolt 4 and USB4 support

– Up to 40Gb/s bandwidth on each port

– USB tunneling

• Integrated display output via Type-C

– DP alternate mode/tunneling over Thunderbolt

– DP-in ports for discrete graphics card display output to mux over Type-C port
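
The 8GB/s figure lines up with a PCIe Gen4 x4 link. The lane count is an assumption (the slide does not state it); the 16 GT/s per-lane rate and 128b/130b encoding come from the PCIe 4.0 specification. A short sketch of the arithmetic, with the Thunderbolt port rate converted to bytes for comparison:

```python
# Per-direction link bandwidth for PCIe, before protocol overhead.
# ASSUMPTION: a x4 link; the slide gives only the 8GB/s total.

def pcie_bandwidth_gbs(gt_per_s: float, lanes: int,
                       encoding_efficiency: float = 128 / 130) -> float:
    """Return per-direction bandwidth in GB/s after line encoding."""
    return gt_per_s * lanes * encoding_efficiency / 8  # 8 bits per byte

def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert a raw link rate from Gb/s to GB/s."""
    return gbit_per_s / 8

if __name__ == "__main__":
    print(f"PCIe Gen4 x4: {pcie_bandwidth_gbs(16, 4):.2f} GB/s per direction")
    print(f"Thunderbolt 4 port: {gbit_to_gbyte(40):.1f} GB/s raw")
```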

(18)

Power Management

Moved extensive logic to gated power domains

Increased FIVR efficiency

Deeper package C state turning off all clocks in CPU

Hardware-based save and restore logic

Autonomous DVFS in coherent fabric and memory subsystem to optimize frequency and voltage based on bandwidth and

(19)

PM - DVFS

Goal is to adapt frequencies (and voltages) to required performance level

CORE

FABRIC
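
Why voltage is scaled together with frequency can be illustrated with the standard first-order model of dynamic CMOS power, P ≈ C·V²·f. This model is a textbook approximation and is not stated on the slide; the scale factors below are illustrative, not Tiger Lake measurements.

```python
# First-order dynamic power model: P is proportional to C * V^2 * f.
# ASSUMPTION: textbook approximation; ignores leakage and other static power.

def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Return dynamic power relative to a baseline where both scales are 1.0."""
    return voltage_scale ** 2 * frequency_scale

if __name__ == "__main__":
    # Lowering frequency alone saves power linearly...
    print(f"f x0.8 only:      {relative_dynamic_power(1.0, 0.8):.3f}")
    # ...while lowering voltage along with it saves roughly cubically.
    print(f"V x0.8 and f x0.8: {relative_dynamic_power(0.8, 0.8):.3f}")
```

Under this model, a domain that can drop both voltage and frequency when demand is low saves considerably more power than one that only lowers its clock.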

(20)

PM - DVFS

Workload is in a core centric phase with few requests

Core increases frequency to match required performance

Fabric and memory reduce frequency to match low number of requests

CORE

FABRIC

(21)

PM - DVFS

Workload enters a phase where requests go out to fabric and hit the large LLC

Core reduces the frequency, while fabric increases it to match the increasing number of requests

Memory stays at low frequency since requests are served from cache

CORE

FABRIC

(22)

PM - DVFS

Workload enters a phase where data requests miss in the LLC and go to memory

Core keeps reducing frequency

Memory increases frequency to match the increasing demand

Fabric can reduce frequency

CORE

FABRIC

(23)

PM - DVFS

Working set is within the core caches: core raises its frequency to the maximum, while fabric and memory stay at the minimum frequency needed to deliver the few requests

CORE

FABRIC

MEMORY
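
The phase walk-through on the preceding slides can be sketched as a toy policy that maps two workload metrics to a frequency level per domain. Everything here is hypothetical: the metric names, the demand formulas, and the thresholds are invented for illustration and do not describe Intel's actual hardware algorithm.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    name: str
    fabric_requests: float  # hypothetical: fraction of peak request rate leaving the core, 0..1
    llc_miss_rate: float    # hypothetical: fraction of those requests that miss the LLC, 0..1

def level(demand: float, low: float = 0.2, high: float = 0.6) -> str:
    """Map a 0..1 demand estimate to a coarse frequency level (thresholds are made up)."""
    if demand < low:
        return "low"
    if demand < high:
        return "mid"
    return "high"

def dvfs_targets(phase: Phase) -> dict:
    """Pick a frequency level for each domain from the phase metrics."""
    core_demand = 1.0 - phase.fabric_requests                           # core-bound share
    fabric_demand = phase.fabric_requests * (1.0 - phase.llc_miss_rate)  # served by LLC
    memory_demand = phase.fabric_requests * phase.llc_miss_rate          # served by DRAM
    return {
        "core": level(core_demand),
        "fabric": level(fabric_demand),
        "memory": level(memory_demand),
    }

PHASES = [
    Phase("core centric", fabric_requests=0.05, llc_miss_rate=0.1),
    Phase("LLC hits", fabric_requests=0.7, llc_miss_rate=0.05),
    Phase("LLC misses", fabric_requests=0.9, llc_miss_rate=0.8),
]

if __name__ == "__main__":
    for phase in PHASES:
        print(phase.name, dvfs_targets(phase))
```

With these made-up inputs, the three phases reproduce the qualitative behavior described on the slides: the core runs fast only while the working set is core-resident, the fabric speeds up when requests hit the LLC, and memory speeds up when they miss.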

(24)

Tiger Lake SoC Architecture

Leveraging process technology improvements, the Tiger Lake SoC architecture delivers significant advancements across a wide set of SoC IPs, with:

More than a generational increase in CPU performance in the Willow Cove CPU core

Massive improvements in graphics power efficiency in the Xe-LP graphics IP

Improved fabric and memory to deliver more bandwidth efficiently

(25)

Legal Disclaimers

Capacitance and resistance measurements and CPU utilization rates are based on silicon projections and preliminary development board measurements as of March 2020. They may not reflect all publicly available updates and are subject to change.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks.

Refer to https://software.intel.com/articles/optimization-notice for more information regarding performance and optimization choices in Intel software products.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation. No product or component can be absolutely secure. Intel contributes to the development of benchmarks by participating in, sponsoring, and/or contributing technical support to various benchmarking groups, including the BenchmarkXPRT Development Community administered by Principled Technologies.

Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
