(1)

LTE Layer 1 Software on the MSC8156 DSP Built on StarCore® Technology

(2)

Agenda

• Introduction
  - Broadband Wireless Technology Timelines
  - 3G Evolution – from Thin to Thick Data Pipe
  - Multicore DSP Roadmap based on StarCore®

• LTE standard overview
  - LTE overview
  - SC-FDMA and OFDMA
  - LTE L1 Channel Overview
  - Multi User (MU) - MIMO

• Software overview
  - LTE Layer 1 Software Components
  - Algorithms
  - L1 Matlab Reference Model
  - Uplink Processing Chain
  - Manager API Example
  - MAPLE Abstraction Layer

• Implementation proposal on MSC8156
  - MSC8156 device overview
  - Performance Analysis Methodology
  - Use case definition & System Architecture

• Summary

(3)
(4)

Broadband Wireless Technology Timelines

Source: Rysavy Research

Note: Throughput rates are peak network rates. Radio channel bandwidths are indicated. Dates refer to initial network deployment, except 2006, which shows the technologies available that year.

Timeline axis: 2006, 2007, 2008, 2009, 2010, 2011

3GPP GSM EDGE Radio Access Network Evolution
• EDGE: DL 474 kbps, UL 474 kbps
• Evolved EDGE: DL 1.1 Mbps, UL 947 kbps

3GPP UMTS Radio Access Network Evolution
• HSDPA: DL 14.4 Mbps, UL 384 kbps, in 5 MHz
• HSDPA/HSUPA: DL 14.4 Mbps, UL 5.76 Mbps, in 5 MHz
• Rel 7 HSPA+: DL 28 Mbps, UL 11.5 Mbps, in 5 MHz
• Rel 8 HSPA+: DL 42 Mbps, UL 11.5 Mbps, in 5 MHz

3GPP Long Term Evolution
• LTE 2X2 MIMO: DL 173 Mbps, UL 58 Mbps, in 20 MHz
• LTE 4X4 MIMO: DL 326 Mbps, UL 86 Mbps, in 20 MHz

Fixed WiMAX

Mobile WiMAX Evolution
• Wave 1: DL 23 Mbps, UL 4 Mbps, 10 MHz 3:1 TDD
• Wave 2: DL 46 Mbps, UL 4 Mbps, 10 MHz 3:1 TDD
• IEEE 802.16m

CDMA2000 Evolution
• EV-DO Rev 0: DL 2.4 Mbps, UL 153 kbps, in 1.25 MHz
• EV-DO Rev A: DL 3.1 Mbps, UL 1.8 Mbps, in 1.25 MHz
• EV-DO Rev B: DL 14.7 Mbps, UL 4.9 Mbps, in 5 MHz
• UMB 2X2 MIMO: DL 140 Mbps, UL 34 Mbps, in 20 MHz
• UMB 4X4 MIMO: DL 280 Mbps, UL 68 Mbps, in 20 MHz

(5)

3G Evolution – from Thin to Thick Data Pipe

• Increasing flexibility for data rates and bandwidth
• Algorithm differentiation and flexibility require high-performance multicore DSPs for programmability, combined with integrated or attached accelerators for cost and power efficiency

3G LTE Significantly Outperforms 3G Standards
• WCDMA: 0.5 Mbps at 5 MHz
• HSDPA: up to 14 Mbps DL at 5 MHz
• HSUPA: up to 5 Mbps UL at 5 MHz
• HSPA+: up to 42 Mbps DL at 5 MHz
• 3G-LTE: 300+ Mbps DL at 20 MHz

Timeline axis: 2003 – 2004, 2005 – 2006, 2007 – 2008, 2009 – 2010, 2011 – 2012
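The comparison above mixes 5 MHz and 20 MHz channels, so it can help to normalize the quoted peak rates by bandwidth. The following is a minimal sketch, not part of the original deck: the struct and function names are illustrative assumptions, the figures are the peak downlink rates quoted on the slide, and "300+" is taken as 300 Mbps (a lower bound).

```c
#include <stddef.h>

/* One entry of the comparison: peak downlink rate and channel bandwidth. */
typedef struct {
    const char *name;
    double peak_mbps;     /* peak downlink rate in Mbps, as quoted on the slide */
    double bandwidth_mhz; /* channel bandwidth in MHz */
} radio_tech_t;

/* Peak figures from the slide; these are not sustained throughputs. */
static const radio_tech_t k_techs[] = {
    { "WCDMA",  0.5,   5.0  },
    { "HSDPA",  14.0,  5.0  },
    { "HSPA+",  42.0,  5.0  },
    { "3G-LTE", 300.0, 20.0 }, /* slide says "300+", so this is a lower bound */
};

/* Peak spectral efficiency in bit/s/Hz: Mbps divided by MHz. */
double peak_spectral_efficiency(const radio_tech_t *t)
{
    return t->peak_mbps / t->bandwidth_mhz;
}
```

With these inputs, 3G-LTE comes out at 15 bit/s/Hz or more against 8.4 bit/s/Hz for HSPA+, so the gain is not only due to the wider channel.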

(6)

Multicore DSP Roadmap based on StarCore®

Chart: performance (vertical axis) against time (2004 – 2005, 2006 – 2007, 2008, 2009). Device status legend: Production, Sampling, Alpha Sampling. The devices are binary code compatible.

MSC8112/3
• Tri & dual core
• 400/300-MHz SC140 StarCore cores
• 8 (16-bit) GMACs
• 1.4 Mbyte RAM
• 90nm

MSC8122
• Quad core
• 500-MHz SC140
• 8 (16-bit) GMACs
• 1.4 Mbyte RAM
• 90nm

MSC8126
• Quad core
• 500-MHz SC140
• 8 (16-bit) GMACs
• Integrated Turbo & Viterbi COPs
• 1.4 Mbyte RAM
• Ethernet, Serial
• 90nm

MSC8144/E
• Quad core
• 1-GHz SC3400 cores
• 16 (16-bit) GMACs
• 10.5 Mbyte RAM
• Dual 1G Ethernet (SGMII)
• ATM/Utopia
• Integrated Security Accel.
• Serial RapidIO™ port x4 (3.125 Gbaud)
• 90nm

MSBA8100
• Accelerator device for 3G-LTE, TDD-LTE, WiMAX, TD-SCDMA, 3GPP, 3GPP2
• Turbo, Viterbi, FFT, DFT
• 512 KB internal RAM
• DDR2
• PCI
• Dual Serial RapidIO™ ports x4 (3.125 Gbaud)
• Companion for MSC8144
• 90nm

MSC8156
• Multicore DSP SoCs
• Next generation StarCore core
• Positioned for 3G-LTE, TDD-3G-LTE, WiMAX, TD-SCDMA, 3GPP, 3GPP2
• Next generation process technology

MSC81xx
• Future

Other chart labels: Future StarCore DSP; 2008 intro; Enabled with Advanced BaseBand Accelerators

(7)
(8)

LTE overview

LTE facts

                      3GPP LTE (ongoing)          WiMAX 802.16e
Base standard         Currently v8.5.0            IEEE® 802.16e-2005
Duplex method         FDD/TDD                     TDD (FDD optional)
Downlink              OFDMA                       OFDMA
Uplink                SC-FDMA                     OFDMA
Channel BW (MHz)      1.25, 2.5, 5, 10, 15, 20    5, 7, 8.75, 10 (1.25~20 opt)
Frame size            10 ms TDD                   5 ms TDD
Modulation DL         QPSK/16QAM/64QAM            QPSK/16QAM/64QAM
Modulation UL         QPSK/16QAM/64QAM            QPSK/16QAM
Channel Coding DL     Turbo / CC                  Turbo / CC
Channel Coding UL     Turbo / CC                  Turbo / CC
Throughput (DL/UL)    100/50 Mbps (20 MHz)        ~40 shared (10 MHz, TDD)

(9)

Single Carrier FDMA (SC-FDMA) and OFDMA

• The symbol mapping in OFDM happens in the frequency domain. In SC-FDMA, the symbol mapping is done in the time domain.
• Appropriate subcarrier mapping in the frequency domain allows control of the PAPR.
• SC-FDMA enables frequency-domain equalizer approaches like OFDMA.

Block diagrams:
• OFDMA (downlink): X(k) (frequency domain) → S/P → IFFT → P/S → Cyclic Prefix → x(n) (time domain)
• SC-FDMA (uplink): symbols (time domain) → DFT → X(k) (frequency domain) → Subcarrier Mapping → IFFT → P/S → Cyclic Prefix → x(n) (time domain)
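The subcarrier mapping step that sits between the DFT and the IFFT in the SC-FDMA transmitter can be sketched as below. This is an illustrative sketch only: `cplx_t` and `map_subcarriers` are assumed names, not part of the software described in this deck. A spacing of 1 gives localized (contiguous) mapping, a larger spacing gives distributed mapping; either way the unused IFFT inputs are zeroed.

```c
#include <stddef.h>

typedef struct { float re, im; } cplx_t;

/*
 * Place m DFT-precoded symbols onto an n-point IFFT input grid.
 * spacing == 1 : localized mapping (contiguous subcarriers)
 * spacing  > 1 : distributed (interleaved) mapping
 * Returns 0 on success, -1 if the allocation does not fit the grid.
 */
int map_subcarriers(const cplx_t *dft_out, size_t m,
                    cplx_t *ifft_in, size_t n,
                    size_t first, size_t spacing)
{
    if (m == 0 || spacing == 0 || first + (m - 1) * spacing >= n)
        return -1;

    /* Unused subcarriers carry no energy. */
    for (size_t k = 0; k < n; k++) {
        ifft_in[k].re = 0.0f;
        ifft_in[k].im = 0.0f;
    }

    /* Copy each DFT output bin onto its assigned subcarrier. */
    for (size_t k = 0; k < m; k++)
        ifft_in[first + k * spacing] = dft_out[k];

    return 0;
}
```

For example, mapping 3 bins with `first = 2, spacing = 1` on an 8-point grid fills subcarriers 2, 3 and 4 and leaves the rest at zero.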

(10)

LTE L1 Channel Overview

Diagram: mapping of transport channels and control information (DL: DL-SCH, PCH, BCH, MCH, CFI, HI, DCI; UL: UL-SCH, RACH, UCI) onto physical channels (DL: PDSCH, PBCH, PMCH, PCFICH, PHICH, PDCCH; UL: PUSCH, PRACH, PUCCH), split into data and control.

Dir  Transport Channel  Physical Channel  Usage                                                                   Coding
DL   DL-SCH             PDSCH             DL data channel                                                         Turbo 1/3
DL   PCH                PDSCH             Paging channel for call initialization                                  Turbo 1/3
DL   BCH                PBCH              Broadcast channel for general cell information                          Conv. 1/3
DL   MCH                PMCH              Multicast channel                                                       Turbo 1/3
DL   CFI                PCFICH            Control format indicator, encodes the number of DL-CCH OFDMA symbols    Block Code 1/16
DL   HI                 PHICH             HARQ feedback channel                                                   Repet. 1/3
DL   DCI                PDCCH             DL control channel with subframe scheduling information                 Conv. 1/3
UL   UL-SCH             PUSCH             UL data channel                                                         Turbo 1/3
UL   RACH               PRACH             Random access channel for UE connection init                            64 ZC signatures
UL   UCI                PUCCH             UL control channel for CQI and HARQ feedback                            Reed-Muller encoding
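In software, the channel overview above is naturally held as a small lookup table. The sketch below is one way to do that; the struct layout and the function name `find_l1_channel` are assumptions made for illustration, and the entries simply restate the slide.

```c
#include <stddef.h>
#include <string.h>

/* One row of the L1 channel overview. */
typedef struct {
    const char *dir;        /* "DL" or "UL" */
    const char *transport;  /* transport channel or control information */
    const char *physical;   /* physical channel it is mapped onto */
    const char *coding;     /* channel coding as listed on the slide */
} l1_channel_t;

static const l1_channel_t k_l1_channels[] = {
    { "DL", "DL-SCH", "PDSCH",  "Turbo 1/3" },
    { "DL", "PCH",    "PDSCH",  "Turbo 1/3" },
    { "DL", "BCH",    "PBCH",   "Conv. 1/3" },
    { "DL", "MCH",    "PMCH",   "Turbo 1/3" },
    { "DL", "CFI",    "PCFICH", "Block Code 1/16" },
    { "DL", "HI",     "PHICH",  "Repet. 1/3" },
    { "DL", "DCI",    "PDCCH",  "Conv. 1/3" },
    { "UL", "UL-SCH", "PUSCH",  "Turbo 1/3" },
    { "UL", "RACH",   "PRACH",  "64 ZC signatures" },
    { "UL", "UCI",    "PUCCH",  "Reed-Muller encoding" },
};

/* Returns the table row for a transport channel name, or NULL if unknown. */
const l1_channel_t *find_l1_channel(const char *transport)
{
    size_t count = sizeof k_l1_channels / sizeof k_l1_channels[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(k_l1_channels[i].transport, transport) == 0)
            return &k_l1_channels[i];
    }
    return NULL;
}
```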

(11)

Multi User (MU) – MIMO

• LTE Uplink: classic SIMO or MU-MIMO for enhanced data rates
• MU-MIMO: several users are transmitting data simultaneously onto the same frequencies.
• MIMO Decoder: Tx user streams are demultiplexed at the Equalization stage.
• Equalization based on MMSE, IRC or iterative cancellations (SIC)

Figure: SISO/SIMO (1xM) with a single transmitter Tx, channel h and receiver Rx; MIMO (2xM) with transmitters Tx1 and Tx2 sending x1 and x2, channel coefficients h11, h12, ..., hm2, received signals y1, y2, ..., ym and receivers Rx1, Rx2.
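For the 2xM case with M = 2 receive antennas, the MMSE equalization named above can be written in closed form as x_hat = (H^H H + sigma^2 I)^-1 H^H y. The sketch below is a floating-point illustration under stated assumptions: the names `cplx_t` and `mmse_equalize_2x2` are invented here, the channel matrix is stored row-major with h[2*r + u] being the coefficient from user u to antenna r, and it is not the fixed-point DSP implementation this deck refers to.

```c
typedef struct { double re, im; } cplx_t;

static cplx_t c_mul(cplx_t a, cplx_t b)
{
    cplx_t r = { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
    return r;
}
static cplx_t c_add(cplx_t a, cplx_t b) { cplx_t r = { a.re + b.re, a.im + b.im }; return r; }
static cplx_t c_conj(cplx_t a) { cplx_t r = { a.re, -a.im }; return r; }
static double c_norm2(cplx_t a) { return a.re * a.re + a.im * a.im; }

/*
 * Two-user, two-antenna MMSE equalizer.
 * h: 2x2 channel, row-major (h[2*r + u] = user u to antenna r)
 * y: received samples on the two antennas
 * Returns 0 on success, -1 if the regularized Gram matrix is singular.
 */
int mmse_equalize_2x2(const cplx_t *h, const cplx_t *y,
                      double noise_var, cplx_t *x_hat)
{
    /* Matched filter output z = H^H y. */
    cplx_t z0 = c_add(c_mul(c_conj(h[0]), y[0]), c_mul(c_conj(h[2]), y[1]));
    cplx_t z1 = c_add(c_mul(c_conj(h[1]), y[0]), c_mul(c_conj(h[3]), y[1]));

    /* Gram matrix G = H^H H + noise_var * I (Hermitian, so g10 = conj(g01)). */
    double g00 = c_norm2(h[0]) + c_norm2(h[2]) + noise_var;
    double g11 = c_norm2(h[1]) + c_norm2(h[3]) + noise_var;
    cplx_t g01 = c_add(c_mul(c_conj(h[0]), h[1]), c_mul(c_conj(h[2]), h[3]));

    double det = g00 * g11 - c_norm2(g01); /* real for a Hermitian matrix */
    if (det <= 0.0)
        return -1;

    /* x_hat = G^-1 z with G^-1 = (1/det) * [[g11, -g01], [-g10, g00]]. */
    cplx_t t0 = c_mul(g01, z1);
    cplx_t t1 = c_mul(c_conj(g01), z0);
    x_hat[0].re = (g11 * z0.re - t0.re) / det;
    x_hat[0].im = (g11 * z0.im - t0.im) / det;
    x_hat[1].re = (g00 * z1.re - t1.re) / det;
    x_hat[1].im = (g00 * z1.im - t1.im) / det;
    return 0;
}
```

With `noise_var` set to zero this reduces to zero-forcing and recovers the transmitted symbols exactly for an invertible channel; a positive `noise_var` trades residual inter-user interference against noise enhancement.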

(12)
(13)

LTE Layer 1 Software Components

• Application Layer: Integration and application scheduler functionality.
• LTE Library: The OS-independent implementation of the LTE Layer 1 functionality.
• Multicore Framework: Responsible for memory management, multicore communication and low-level resource scheduling.
• MAPLE Abstraction Layer: Thin layer that abstracts OS implementation details for controlling the MAPLE HW accelerator (Multi Accelerator Platform for BaseBand, details in next slides).
• Coherency Abstraction Layer: Function library with services to handle the coherency management.
• IF1 and IF4 interfaces: Cover the protocol and LTE-specific aspects of the interface with PQ and FPGA.
• Operating System: Operating system services and driver-level support for device peripheral access.

Diagram: software stack with the blocks Application Layer, Framework, IF1, LTE SP Lib, IF4, MAPLE Abstraction, Software Coherency Abstraction and Operating System.

(14)

L1 Matlab Reference Model

• Maintaining an LTE Matlab model for fast algorithm validation:
  - Channel estimator
  - MIMO detector
  - Equalization
  - Modulation Demapper
  - HARQ Combining
• The Matlab model also serves as Golden Reference:
  - DSP C code included through MEX files
  - Ability to generate test vectors
  - High simulation speed: between 10 kbps and 150 kbps
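Using the Matlab model as a golden reference usually comes down to comparing the fixed-point DSP output against a generated test vector. The helper below is a hypothetical sketch of such a check: the name `first_mismatch_q15`, the 16-bit sample format and the LSB tolerance are assumptions, since the deck does not describe the comparison procedure.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compare a device-under-test output against a golden reference vector.
 * Samples are 16-bit fixed point; tol_lsb is the allowed absolute
 * difference in least significant bits (0 means bit-exact).
 * Returns the index of the first mismatching sample, or -1 if all match.
 */
long first_mismatch_q15(const int16_t *dut, const int16_t *golden,
                        size_t n, int tol_lsb)
{
    for (size_t i = 0; i < n; i++) {
        int diff = (int)dut[i] - (int)golden[i];
        if (diff < 0)
            diff = -diff;
        if (diff > tol_lsb)
            return (long)i;
    }
    return -1;
}
```

A tolerance of 0 suits stages that must be bit-exact (for example CRC or de-interleaving), while a small non-zero tolerance is a common choice when the reference runs in floating point.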

(15)

Uplink Processing Chain

Processing stages, in chain order:

Per receive antenna (RxAnt0, RxAnt1):
• CP Removal
• FFT
• Guard Removal
• Ref Vec Correlation
• IDFT
• User Path Separation
• DFT

Combined over antennas:
• SNR estimation
• Matrix interpolation
• MMSE Equalization

Per user:
• IDFT
• DemodMapping
• DeScrambling
• Chnl De-Interleav.
• DC Demux
• Code Block Deconcatenation

Per code block (repeated for each code block):
• Sub-block De-interl
• Soft Combining
• Turbo Decoding
• CB CRC Check

Per transport block:
• CB DeSegmentation
• TB CRC Check

Managers covering the chain:
• SBL1_PHULSPM_RSP
• SBL1_PHULSPM_PRB
• SBL1_PHULSPM_VRB
• SBL1_PHULSPM_DCdemux
• SBL1_PHULSPM_CBP
• SBL1_PHULSPM_TBP
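The CB CRC Check and TB CRC Check stages at the end of the chain use the 24-bit CRCs of 3GPP TS 36.212 (gCRC24A for transport blocks, gCRC24B for code blocks). The bitwise routine below is a reference-style sketch rather than the implementation described in this deck, where CRC work is handled by the MAPLE-B accelerator. The function name and the byte-oriented, MSB-first interface are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC24A_POLY 0x864CFBu /* gCRC24A, transport block CRC (x^24 term implicit) */
#define CRC24B_POLY 0x800063u /* gCRC24B, code block CRC (x^24 term implicit) */

/*
 * Bitwise CRC-24 over whole bytes, MSB first, zero initial state,
 * no final XOR. Returns the 24 parity bits in the low bits of the result.
 */
uint32_t crc24(const uint8_t *data, size_t len, uint32_t poly)
{
    uint32_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 16;
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x800000u)
                crc = (crc << 1) ^ poly;
            else
                crc <<= 1;
            crc &= 0xFFFFFFu; /* keep the register at 24 bits */
        }
    }
    return crc;
}
```

Because the register starts at zero and there is no final XOR, running the same routine over the payload followed by its three parity bytes yields zero for an error-free block, which is how a receiver-side check can be expressed.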

(16)

Manager API Example

INT32 SBL1_PHULSPM_PRB(
    SPM_PRB_DYNAMIC_T      *spm_prb_dynamic,       /* (1) */
    void                   *spm_prb_static,        /* (2) */
    SPM_PRB_CTRL_DYNAMIC_T *spm_prb_ctrl_dynamic,  /* (3) */
    SYS_CONFIG_T           *sys_config,            /* (4) */
    SWC_T                  *swc_handler,           /* (5) */
    type_dftpe_mal          maple_handler)         /* (6) */

(1) SPM_PRB_DYNAMIC_T
• pointers to the buffers specific to the current manager call instance.

(2) SPM_PRB_STATIC_T
• pointers to the buffers common to all the instances of this manager.

(3) SPM_PRB_CTRL_DYNAMIC_T
• control parameters that are specific to the current manager call instance (the number of allocations, number of codeblocks…)

(4) SYS_CONFIG_T
• system configuration parameter information, set up at system initialization (sector bandwidth, the number of antennas, etc.)

(5) SWC_T
• pointers to the software coherency functions (cache flush and cache invalidate).

(6) type_dftpe_mal
• pointers to the MAPLE abstraction functions for IDFT / DFT / TurboDecoder / ViterbiDecoder
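To make the roles of the six arguments concrete, here is a host-side sketch of how such a manager might be wired up. Everything except the function name, the type names and the parameter order is hypothetical: the deck does not show the struct layouts, so the fields below, the `host_*` stubs and the body of the manager are invented purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t INT32;

/* Hypothetical layouts: the slide names these types but not their fields. */
typedef struct { int16_t *in_buf; int16_t *out_buf; } SPM_PRB_DYNAMIC_T;
typedef struct { uint32_t num_samples; } SPM_PRB_CTRL_DYNAMIC_T;
typedef struct { uint32_t num_rx_antennas; } SYS_CONFIG_T;
typedef struct {
    void (*cache_flush)(const void *addr, size_t len);
    void (*cache_invalidate)(const void *addr, size_t len);
} SWC_T;
typedef struct {
    int (*dft)(const int16_t *in, int16_t *out, size_t n);
} type_dftpe_mal;

/* Host stand-ins so the sketch runs without DSP hardware. */
static int g_flush_calls, g_invalidate_calls;
static void host_flush(const void *a, size_t n) { (void)a; (void)n; g_flush_calls++; }
static void host_invalidate(const void *a, size_t n) { (void)a; (void)n; g_invalidate_calls++; }
static int host_dft(const int16_t *in, int16_t *out, size_t n)
{
    memcpy(out, in, n * sizeof *in); /* placeholder: copies, does not transform */
    return 0;
}

/* Illustrative body only; the real manager is not shown in the deck. */
INT32 SBL1_PHULSPM_PRB(SPM_PRB_DYNAMIC_T *spm_prb_dynamic,
                       void *spm_prb_static,
                       SPM_PRB_CTRL_DYNAMIC_T *spm_prb_ctrl_dynamic,
                       SYS_CONFIG_T *sys_config,
                       SWC_T *swc_handler,
                       type_dftpe_mal maple_handler)
{
    (void)spm_prb_static;
    (void)sys_config;
    if (!spm_prb_dynamic || !spm_prb_ctrl_dynamic || !swc_handler)
        return -1;

    size_t n = spm_prb_ctrl_dynamic->num_samples;
    size_t bytes = n * sizeof(int16_t);

    /* Make core-written input visible to the accelerator. */
    swc_handler->cache_flush(spm_prb_dynamic->in_buf, bytes);
    if (maple_handler.dft(spm_prb_dynamic->in_buf, spm_prb_dynamic->out_buf, n) != 0)
        return -2;
    /* Drop stale cached copies before the core reads the result. */
    swc_handler->cache_invalidate(spm_prb_dynamic->out_buf, bytes);
    return 0;
}
```

Passing the coherency and MAPLE functions in as handles, as the slide describes, is what lets the same manager code run against host stubs like these or against the real drivers.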

(17)

MAPLE Abstraction Layer

MAPLE Abstraction Layer (MAL) is responsible for the encapsulation of the MAPLE interaction based on SDOS drivers.

The goal is to keep the SP Lib independent of the underlying OS while allowing a close integration of the MAPLE accelerator in the processing chains.

MAL API covers:
• MAPLE init functionality for LTE mode
• FFTPE, DFTPE and TVPE drivers configured for LTE operation
• Callback functions

Figure: timing diagrams for Master, Slave and PE in the two MAL modes.
• Blocking: Dispatch & post msg; Get msg & FMWK sched; Manager execution; Idle (MAPLE Abstraction polling mechanism) during PE execution; Manager execution; FMWK sched & Post msg #0.
• Non-Blocking: Dispatch & post msg; Get msg & FMWK sched; Manager execution; PE execution; Manager execution; FMWK sched; ISR post msg #1; FMWK sched & Post msg #0.

Figure: layering of Application layer, L1 SW Manager, SP Kernel 1, SP Kernel 2, MAPLE Abstraction and MAPLE Drivers, with the MAPLE Call Back Process and MAPLE Feedback paths.
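The difference between the blocking and non-blocking modes can be sketched with a simulated processing element (PE). The names `mal_pe_t`, `mal_submit_blocking`, `mal_submit_nonblocking` and `mal_pe_service` are hypothetical and are not the MAL API; `mal_pe_service` stands in for the hardware completing a job and raising its interrupt.

```c
#include <stddef.h>

typedef void (*mal_callback_t)(void *ctx);

/* Simulated processing element with at most one job in flight. */
typedef struct {
    int busy;               /* 1 while a job is pending */
    mal_callback_t on_done; /* completion callback, may be NULL */
    void *ctx;              /* user context passed to the callback */
} mal_pe_t;

/* Stand-in for the hardware finishing a job and the ISR running. */
void mal_pe_service(mal_pe_t *pe)
{
    if (!pe->busy)
        return;
    pe->busy = 0;
    if (pe->on_done)
        pe->on_done(pe->ctx);
}

/* Non-blocking: queue the job and return; the callback fires later. */
int mal_submit_nonblocking(mal_pe_t *pe, mal_callback_t cb, void *ctx)
{
    if (pe->busy)
        return -1; /* PE already occupied */
    pe->busy = 1;
    pe->on_done = cb;
    pe->ctx = ctx;
    return 0;
}

/* Blocking: queue the job and poll until the PE reports completion. */
int mal_submit_blocking(mal_pe_t *pe)
{
    if (mal_submit_nonblocking(pe, NULL, NULL) != 0)
        return -1;
    while (pe->busy)
        mal_pe_service(pe); /* on hardware this would poll a status flag */
    return 0;
}

/* Example callback: counts completed jobs. */
void count_completion(void *ctx)
{
    ++*(int *)ctx;
}
```

In the blocking case the calling core idles while the PE runs; in the non-blocking case the core is free to run another manager, and the completion callback is where the framework would post the follow-up message.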

(18)
(19)

MSC8156E – Broadband Wireless DSP

6 SC3850 Core Subsystems (up to 6 GHz/48 GMACS), each with:
• SC3850 DSP core at up to 1 GHz (8 GMACs 16b or 8b)
• 512 Kbyte unified L2 cache / M2 memory
• 32 Kbyte I-cache, 32 Kbyte D-cache, WBB, WTB, MMU, PIC

Internal/External Memories/Caches
• 1056 KByte M3 shared memory (SRAM)
• Two DDR 2/3 64-bit SDRAM interfaces at up to 800 MHz

CLASS – Chip-Level Arbitration & Switching Fabric
• Non-blocking, fully pipelined, low latency
• Full fabric 12 masters to 8 slaves, up to 512 Gbps throughput

MAPLE-B Baseband Accelerator
• Turbo/Viterbi Decoder up to 160/115 Mbps, supporting 3G-LTE, 802.16, 3G, CDMA2K standards
• FFT/DFT accelerator up to 280/180 Msps DFT

Security Engine (Talitos 3.1)
• Data and Code Protection (AES, SHA, Kasumi, SNOW3G)

High Speed Interconnects
• Dual 4x/1x Serial RapidIO at 1.25/2.5/3.125 Gbaud
• PCI-e 4x/1x

Dual RISC QUICCEngine® supporting
• Dual SGMII/RGMII Gigabit Ethernet ports
• Eth. L1 Protocols, Talitos control and sRIO offload

TDM Highway
• 1024 ch., 400 Mbps, divided into 4 ports of 256

DMA Engine
• 16 bi-directional channels w/ external req/ack
• 8 hardware semaphores

Other Peripheral Interfaces
• SPI, UART, I2C, 32 GPIO, 16 Timers, 96 KB boot ROM, JTAG/SAP, 8 WDT

Technology
• 45nm SOI, 1V core, 2.5, 1.8/1.5V I/O
• FCBPGA (29x29) 1mm pitch, RoHS

Block diagram: six SC3850 core subsystems (each with 32 KB L1 I-cache, 32 KB L1 D-cache and 512 KB unified M2/L2) attached to the CLASS non-blocking switch fabric, together with the 1056 KB shared memory, two DDR 2/3 memory controllers, the DMA engine, the MAPLE-B baseband accelerator, the security processing engine, H/W semaphores, I²C, UART, GPIOs, the 4-port TDM highway, and the on-chip network with two SerDes x4 blocks (2x SRIO 4x/1x, 1x PCIe 4x/1x, 2x Gigabit Ethernet, SPI).

Status: Alpha sampling now

(20)

MSC8156 MAPLE-B Throughput & Compliance Data

Table columns: Technology, Accel., Standard Compliance, Data Rates, Comments

3G-LTE, TDD-LTE

  Turbo
  • Standard compliance: 3G-LTE (Evolved UTRA) turbo decoding as specified in 3GPP TS 36.212, section 5.1.2.2
  • Data rates: up to 160 Mbps (8 iterations); up to 200 Mbps (6 iterations)
  • Comments: Max Log Map or Linear Log Map (MAX*); support for Rate De-Matching (sub-block de-interleaving and de-interlacing); CRC calculation

  Viterbi
  • Standard compliance: 3G-LTE (Evolved UTRA) channel decoding as specified in 3GPP TS 36.212, section 5.1.2.1
  • Data rates: up to 100 Mbps (K=7 with tail biting)
  • Comments: Multi-iteration decoding

  FFT/DFT
  • Standard compliance: FFT sizes 128, 256, 512, 1024, 2048 points; DFT sizes: variable-length DFT/IDFT processing of the form 2^k·3^m·5^n·12, up to 1536 points
  • Data rates: FFT up to 280 Mega samples/sec; DFT up to 175 Mega samples/sec
  • Comments: Advanced scaling options; guard band insertion in iFFT

  CRC
  • Standard compliance: Transport and Code Block CRC for UL and DL
  • Data rates: up to 12 Gbps
  • Comments: CRC check or insertion

WiMAX

  Turbo
  • Standard compliance: WiMAX OFDMA turbo decoding as specified in IEEE® 802.16™-2005 standard
  • Data rates: up to 156 Mbps (8 iterations); up to 195 Mbps (6 iterations)
  • Comments: Max Log Map or Linear Log Map (MAX*); support for Rate De-Matching (sub-block de-interleaving and de-interlacing)

  Viterbi
  • Standard compliance: WiMAX OFDMA turbo decoding as specified in IEEE® 802.16™-2005 standard
  • Data rates: up to 100 Mbps (K=7 with tail biting)
  • Comments: Multi-iteration decoding

  FFT
  • Standard compliance: FFT sizes 128, 256, 512, 1024, 2048 points
  • Data rates: FFT2048 up to 280 Mega samples/sec; FFT1024 up to 350 Mega samples/sec
  • Comments: Advanced scaling options; guard band insertion in iFFT

  CRC
  • Standard compliance: PHY Burst CRC for UL and DL
  • Data rates: up to 12 Gbps
  • Comments: CRC check or insertion

HSPA+

  Turbo
  • Standard compliance: 3GPP turbo decoding as specified in 3GPP TS 25.212, section 4.2.3.2
  • Data rates: up to 131 Mbps (8 iterations); up to 165 Mbps (6 iterations)
  • Comments: Max Log Map or Linear Log Map (MAX*); support for EDCH Rate De-Matching

  Viterbi
  • Standard compliance: 3GPP Viterbi decoding as specified in 3GPP TS 25.212, section 4.2.3.1
  • Data rates: up to 115 Mbps (K=9 zero tail)

Programming Model (ALL)
• Buffer descriptors paradigm for allocation of data and control parameters
• Sharing of MAPLE-B modules in multiple devices using SRIO
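The DFT size rule quoted in the table (lengths of the form 2^k·3^m·5^n·12, up to 1536 points) is easy to check before a job is queued. The function below is a small illustrative sketch; `is_supported_dft_size` is an assumed name and not part of the MAPLE-B driver interface.

```c
#include <stdbool.h>

/*
 * Returns true if n has the form 2^k * 3^m * 5^n * 12 with
 * k, m, n >= 0 and does not exceed 1536 points.
 */
bool is_supported_dft_size(unsigned n)
{
    if (n == 0 || n > 1536 || n % 12 != 0)
        return false;

    n /= 12; /* strip the mandatory factor of 12 */
    while (n % 2 == 0) n /= 2;
    while (n % 3 == 0) n /= 3;
    while (n % 5 == 0) n /= 5;

    /* Anything left over is a prime factor other than 2, 3 or 5. */
    return n == 1;
}
```

For example, 1200 (= 12·2²·5²) and 1536 (= 12·2⁷) pass, while 84 (= 12·7) is rejected because of the factor 7.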

(21)

Performance Analysis Methodology

Development steps used to perform capacity analysis on the MSC8156:

1. Detailed analysis of the 3GPP LTE standards

2. LTE Matlab model:
   • Validation of full LTE chain functionality, uplink and downlink
   • Research, development and performance validation of algorithms (Channel Estimation, MIMO Equalizer, RACH, HARQ Combining)

3. StarCore C code implementation:
   • Code generation
   • Fixed-point validation, feeding the code back into Matlab with MEX files
   • Target cycle measurements on the MSC8156 simulator and ADS boards
   • Code optimization

4. Real-time integration:
   • Integration of all LTE Layer 1 Software Components
   • Validation of real-time throughput and latency
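The target cycle measurements in step 3 feed a simple budget calculation: an LTE subframe lasts 1 ms, so a core clocked at 1 GHz has one million cycles per subframe. The sketch below shows that arithmetic; the function names are assumptions and the example numbers are illustrative, not measurements from this deck.

```c
#include <stdint.h>

/* Cycles one core can spend in a single 1 ms LTE subframe. */
uint64_t cycles_per_subframe(uint64_t core_hz)
{
    return core_hz / 1000u;
}

/*
 * Average core load in percent when a workload of measured_cycles
 * per subframe is spread evenly over num_cores cores.
 */
double core_load_percent(uint64_t measured_cycles, uint64_t core_hz,
                         unsigned num_cores)
{
    double budget = (double)cycles_per_subframe(core_hz) * (double)num_cores;
    return 100.0 * (double)measured_cycles / budget;
}
```

As an illustration, a workload measured at 2.5 million cycles per subframe and spread over 5 cores at 1 GHz gives a 50% average load; loads above 100% mean the chain cannot run in real time on that core count.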

(22)

Use case definition & System Architecture

Figure: system architecture block diagram showing two MSC8156 DSPs (each with DDR memory, GbE and sRIO ports), an sRIO switch with 4x links, a CPRI-sRIO bridge towards the antennas (Ant.), a QorIQ processor with DDR2/DDR3, FLASH, LB, GbE and GbE PHYs, and the backplane.

LTE - FDD:
• 20 MHz
• 4x4 MIMO DL, up to 300 Mbps
• 2x4 MIMO UL, MMSE MIMO Equalizer, up to 150 Mbps
• 1 device DL, 1 device UL (+remote TVPE of DL device)
• 1 master core per device for I/O and application scheduling

(23)

Use case definition & System Architecture

DL Device:
• Cores: From transport block encoding down to physical channel mapping

(24)

Use case definition & System Architecture

UL Device:
• Cores: Channel estimation, MIMO Detector, Equalization, LLR calculation.
• MAPLE: FFT, IDFT, Turbo Dec.

(25)

Summary

The LTE standard is designed for high throughput with 4x4 MIMO OFDMA in Downlink and 2x4 MIMO SC-FDMA in Uplink. This poses real challenges:
• High signal processing complexity for advanced algorithms
• Complex system SW architecture in a multicore environment
• Very high data rates and low latency systems

Freescale LTE Layer 1 enablement software components include:
• An Advanced Signal Processing library with key algorithms
• A Multicore Framework
• All needed application & abstraction layers for a smooth integration

Freescale DSP MSC8156:
• The MAPLE-B baseband accelerator, together with the advanced StarCore cores, is a key factor for current and future baseband system designs with very high data throughput and low latency requirements

(26)

Q&A

Thank you for attending this presentation. We’ll now take a few moments to review the audience questions, and then we’ll begin the question and answer session.
