
DynaCORE Coprocessor


Academic year: 2021


DynaCORE

Dynamically Reconfigurable Coprocessor for Network Processors

Carsten Albrecht, Roman Koch, Christoph Osterloh, Thilo Pionteck, Erik Maehle

Institut für Technische Informatik

Universität zu Lübeck

Head: Prof. Dr.

Overview

Introduction

System Architecture
• Key Components
• Internal Interconnect

Runtime-Adaptive Network-on-Chip
• Architecture
• Buffer Sizes

Fault Tolerance
• Fault Scenarios
• Stepwise Procedure

Modelling DynaCORE
• Principles
• DynaCORE Model
• Simulation

Runtime Reconfiguration
• Point of Reconfiguration
• Technical Aspects

Evaluation and Demonstrator

Publications

Introduction (1/2)

Situation: in-transit packet processing in edge routers

Processing tasks
• Header processing
• Payload processing

Introduction (2/2)

DynaCORE = Dynamically adaptable COprocessor based on Reconfiguration

Reconfigurable hardware accelerator for payload processing

Allows flexible adaptation to changes in network traffic profile
→ Dynamic partial reconfiguration of FPGA

Combination of
• Network processor (e.g. FlexPath NP) → header processing
• DynaCORE (in Xilinx Virtex-4 FX) → payload processing

Loose coupling
• Gigabit Ethernet
• Suitable for various network processors

System Architecture (1/3)

[Figure: block diagram of the system architecture. Labels: Interface; Type H, Type S, Type 0, Application specific; Hardware Assist 1 to 4; Transmit-Unit; Receive-Unit; Dispatcher; Reconfiguration Manager (HW + SW); static partition; dynamic partition.]

System Architecture (2/3)

Components in the Static Partition

Transmit Unit
• Send processed packets back to NP

Receive Unit/Dispatcher
• Recognise requested type of processing
• Assign packets to suitable hardware assists
• Report to reconfiguration manager in case of unassignable packets

Reconfiguration Manager
• Implemented as software running on embedded PowerPC
• Collect utilisation information from hardware assists, decide when and how to reconfigure
• Control actual process of reconfiguration, i.e. send configuration data to reconfiguration logic

Reconfiguration Control Logic
• Write configuration data to FPGA-internal configuration access port (ICAP)

Software-based Hardware Assist
• Backup processing unit
• Utilises additional hard-wired PowerPC cores (UltraController II)

[Figure: static partition, including the I/O interface.]
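The dispatcher behaviour listed above can be illustrated with a minimal Python sketch. It is not the DynaCORE implementation: the class names, the packet representation as a dictionary with a `type` field, and the round-robin choice among hardware assists of the same type are all assumptions made for illustration.

```python
from collections import defaultdict


class ReconfigurationManager:
    """Hypothetical stand-in that only records reports from the dispatcher."""

    def __init__(self):
        self.unassignable = defaultdict(int)  # processing type -> report count

    def report_unassignable(self, processing_type):
        self.unassignable[processing_type] += 1


class Dispatcher:
    """Assigns packets to hardware assists (HAs) by requested processing type."""

    def __init__(self, manager):
        self.manager = manager
        self.assists = defaultdict(list)    # processing type -> list of HA ids
        self.next_index = defaultdict(int)  # processing type -> round-robin counter

    def register(self, ha_id, processing_type):
        """Make an HA available for one processing type."""
        self.assists[processing_type].append(ha_id)

    def dispatch(self, packet):
        """Return the id of the chosen HA, or None if no HA can serve the packet."""
        ptype = packet["type"]
        candidates = self.assists.get(ptype)
        if not candidates:
            # No suitable HA is configured: report to the reconfiguration manager.
            self.manager.report_unassignable(ptype)
            return None
        # Round-robin among HAs of the same type (an assumed policy).
        index = self.next_index[ptype] % len(candidates)
        self.next_index[ptype] += 1
        return candidates[index]
```

In this sketch the reports collected by the manager could serve as one input to the decision of when and how to reconfigure.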

System Architecture (3/3)

Hardware Assists
• Actual payload processing modules
• Equipped with universal, algorithm-independent interface
• Embedded off-the-shelf IP cores

Switches
• Forward packets from static partition to HAs and back

Runtime-Adaptive Network-on-Chip (1/2)

CoNoChi = Configurable Network on Chip

NoC architecture for runtime reconfigurable FPGAs

Virtual cut-through switches with four equal full-duplex links (16 bit)

Low hardware overhead compared to other NoCs

Switches not needed for a certain setting of processing units can be removed from the network

Low latency

Support for QoS

Physical and logical addresses

• Physical addresses: refer to specific switches at specific locations within the NoC topology

• Logical addresses: refer to processing entities inside hardware modules

[Figure: interface and hardware assist, annotated with their logical address (log add) and physical address (phy add).]
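The split between the two address kinds can be sketched as a lookup table. The following Python fragment is only an illustration under assumptions: the class `AddressTable`, its method names and the use of grid coordinates as physical addresses are hypothetical and are not taken from CoNoChi.

```python
class AddressTable:
    """Maps logical addresses (processing entities) to physical addresses (switches)."""

    def __init__(self):
        self._logical_to_physical = {}

    def bind(self, logical, physical):
        """Record that a processing entity is reachable via the given switch."""
        self._logical_to_physical[logical] = physical

    def unbind(self, logical):
        """Forget an entity, e.g. after its module has been removed."""
        self._logical_to_physical.pop(logical, None)

    def resolve(self, logical):
        """Return the physical switch address, or None if the entity is not configured."""
        return self._logical_to_physical.get(logical)
```

With such a table, a sender keeps addressing the same logical entity while a reconfiguration only changes the binding, for example `bind("crypto", (2, 1))` after a module has moved to another tile.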

Runtime-Adaptive Network-on-Chip (2/2)

Topology Adaptation

Network topology can be adapted at runtime

Coarse-grained tile

Merging/separation of neighbouring tiles

Provides space for modules of varying complexity

[Figure: NoC topology with interfaces and hardware assists HA 5 and HA 6.]

Fault Tolerance (1/3)

Fault scenarios:

User data
• Non-permanent fault
• Huge hardware effort to detect and correct
• Tolerated by application area

Processing units and infrastructure
• Device degradation: fault in hardware structure
• Single-Event Functional Interrupts (SEFIs): bitflip in configuration data

Approach: Combination of

Configuration readback
• Slow (33 ms for one tile)
• Does not detect hardware faults

Test packets
• Do not cover all faults

Alive messages
• Missing alive message indicates problem

Permanent faults → need to be corrected

Fault Tolerance (2/3)

Fault detection
• Alive messages
• Test packets
• Periodic configuration readback

Fault localization and correction
• Stepwise procedure using test packets
• Test against different assumptions
• SEU in control registers → tile reset
• SEFI → rewriting the configuration
• Permanent hardware fault → reorganization

Fault Tolerance (3/3)

Example: no alive message from switch 1

1. Identification of faulty segment

Identify path under test

Known by the reconfiguration manager

Send test packets to all switches along the path under test

If a test packet does not return correctly, the faulty segment has been identified


2. Assumption: SEU in control registers of switches or routing tables

Reset switches in affected section

Send new routing tables

Repeat test


3. Assumption: SEFI

Readback configuration data for each tile and compare with reference

In case of mismatch, reconfigure tile

If tile contains a switch, send new routing tables

Repeat test

Otherwise: permanent hardware error → reorganize system

The procedure takes time and does not cover all fault scenarios, yet it is hardware-efficient
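The stepwise procedure can be summarised in a short Python sketch. It simplifies the slides to a single faulty switch per run, and all four callbacks are hypothetical hooks into the platform, not functions of the actual system.

```python
def localize_and_correct(path, send_test_packet, reset_and_reroute,
                         readback_matches, reconfigure_tile):
    """Escalate through the assumptions SEU, SEFI and permanent fault.

    All callbacks are hypothetical hooks into the platform:
      send_test_packet(switch)  -> True if the test packet returns correctly
      reset_and_reroute(switch) -> reset the switch and send new routing tables
      readback_matches(switch)  -> True if the configuration readback equals the reference
      reconfigure_tile(switch)  -> rewrite the configuration of the tile
    Returns (faulty_switch, action), where action names the step that ended the search.
    """
    # Step 1: find the first switch on the path whose test packet fails.
    faulty = next((s for s in path if not send_test_packet(s)), None)
    if faulty is None:
        return None, "no fault found"

    # Step 2: assume an SEU in control registers or routing tables.
    reset_and_reroute(faulty)
    if send_test_packet(faulty):
        return faulty, "reset"

    # Step 3: assume an SEFI, i.e. corrupted configuration data.
    if not readback_matches(faulty):
        reconfigure_tile(faulty)
        reset_and_reroute(faulty)  # a rewritten switch needs its routing tables again
        if send_test_packet(faulty):
            return faulty, "reconfigured"

    # Remaining assumption: permanent hardware fault, the system must be reorganized.
    return faulty, "reorganize"
```

The order of the steps follows their cost: a reset is cheap, a configuration readback is slow (33 ms for one tile according to the slides), and reorganization is the last resort.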

Modelling DynaCORE (1/4)

Dynamically Structured Discrete Event-Based System Network (DSDEVN)

Extends discrete-event based system (DEVS) formalism

States of controller χ can again be models

"Simple" DEVS simulator sufficient for simulation of DSDEVN

DynaCORE Model:

$\mathrm{DSDEVN}_\Delta = \langle X_\Delta, Y_\Delta, \chi, M_\chi \rangle$

Δ identifies DynaCORE

$X_\Delta$, valid inputs of the system, and $Y_\Delta$, outputs of the system: messages received from and sent to the NP

χ: DynaCORE-specific controller

Modelling DynaCORE (2/4)

Controller Description as DEVS:

$M_\chi = \langle X_\chi, S_\chi, Y_\chi, \delta_{\mathrm{int},\chi}, \delta_{\mathrm{ext},\chi}, \lambda_\chi, \tau_\chi \rangle$

$X_\chi$: Set of valid controller inputs

$S_\chi$: Controller state space

$Y_\chi$: Set of valid controller outputs

$\delta_{\mathrm{int},\chi}$: State transition function for internal events, including "timeouts"

$\delta_{\mathrm{ext},\chi}$: State transition function for external events

$\lambda_\chi$: Output function

$\tau_\chi$: Timeout function (assigns a timeout value to states from $S_\chi$)

Controller States

Include information on system configuration, i.e. configured HAs

Contain, in turn, models of system components active in respective state
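To make the roles of the seven tuple elements concrete, the following Python sketch shows a toy atomic DEVS model and a simulator loop for a single model. It is not the DynaCORE controller: the state (a set of configured HA types plus one pending request), the fixed timeout standing in for a reconfiguration time, and all names are assumptions. The dynamic-structure aspect, where states contain component models, is left out.

```python
import math


class ControllerModel:
    """Toy atomic DEVS model M = <X, S, Y, delta_int, delta_ext, lambda, tau>."""

    def __init__(self, configured, timeout=1.0):
        # State S: the configured HA types plus a pending reconfiguration request.
        self.configured = set(configured)
        self.pending = None
        self.timeout = timeout

    def tau(self):
        """Timeout function: time until the next internal event in the current state."""
        return self.timeout if self.pending is not None else math.inf

    def delta_ext(self, elapsed, x):
        """External transition: input x requests processing of type x."""
        if x not in self.configured:
            self.pending = x

    def output(self):
        """Output function (lambda), evaluated just before an internal transition."""
        return ("reconfigured", self.pending)

    def delta_int(self):
        """Internal transition: the timeout expired, the requested HA is configured."""
        self.configured.add(self.pending)
        self.pending = None


def simulate(model, events):
    """Run the model over time-ordered (time, input) events; return timed outputs."""
    outputs, now = [], 0.0  # `now` is the time of the last transition
    for t, x in events:
        # Internal events that are due no later than the next input fire first.
        while now + model.tau() <= t:
            now += model.tau()
            outputs.append((now, model.output()))
            model.delta_int()
        model.delta_ext(t - now, x)
        now = t
    # Drain the internal events that remain after the last input.
    while model.tau() != math.inf:
        now += model.tau()
        outputs.append((now, model.output()))
        model.delta_int()
    return outputs
```

For example, a model created with `ControllerModel({"A"}, timeout=2.0)` ignores a request for type A and answers a request for type B two time units later with a `reconfigured` output.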

Modelling DynaCORE (3/4)

Structure of SystemC Simulation Model

Simulation Stimulus and Output

[Figure: input data rate, output data rate and reconfiguration over time; bandwidth axis in Mbit/s, ranging from 0 to 1400; annotation: input burst.]

Modelling DynaCORE (4/4)

Influence of Buffer Sizes

[Figure: latency [ms] over switch buffer size and NoC-interface buffer size, both in packets.]

[Figure: data rate, packet loss (ratio) and latency (time [ms]) over buffer size [#packets].]

Low impact of buffer sizes between NoC and HA

Large switch buffers:

• Only little advantage for latency

• Increased packet loss in case of reconfiguration

Runtime Reconfiguration (1/3)

Configuration State Space

Three modules

Three types of HA

Possible transitions between configurations

Transition costs (number of HAs to be reconfigured)

[Figure: configuration state graph over the configurations {A B C}, {A B B}, {A C C}, {B B B}, {A A A}, {A A B}, {A A C}, {C C C}, with edges weighted by transition costs of 1 to 3.]
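The transition cost can be computed directly from two configurations. The Python sketch below assumes that a configuration is a multiset of HA types, so the position of a module does not matter; this reading and the function names are assumptions of the sketch.

```python
from collections import Counter
from itertools import combinations_with_replacement


def transition_cost(current, target):
    """Number of HAs to reconfigure when moving between two configurations.

    Configurations are treated as multisets of HA types, so module positions
    are ignored (an assumption of this sketch).
    """
    kept = Counter(current) & Counter(target)  # HAs that can stay as they are
    return len(current) - sum(kept.values())


def configurations(types="ABC", modules=3):
    """All configurations of `modules` slots over the given HA types."""
    return list(combinations_with_replacement(types, modules))
```

For example, `transition_cost("ABC", "ABB")` yields 1 and `transition_cost("AAA", "BBB")` yields 3. For three modules and three types, `configurations()` enumerates ten multisets.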

Runtime Reconfiguration (2/3)

Reduced Configuration State Space

Transition cost limited

[Figure: reduced configuration state graph over ten configurations of the HA types A, B and C, with transitions numbered 1 to 30.]

Reconfiguration Trigger

Configurable per-HA utilisation threshold exceeded multiple times in sequence

[Figure: monitoring data over time against threshold T, with the controller switching between the states Sχu and Sχv.]
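The trigger condition can be sketched as a small counter. The Python class below is a hypothetical illustration: the class and parameter names and the reset of the counter after firing are assumptions, not details given on the slides.

```python
class ReconfigurationTrigger:
    """Fires when utilisation exceeds a threshold several samples in a row."""

    def __init__(self, threshold, required_in_sequence):
        self.threshold = threshold
        self.required = required_in_sequence
        self.streak = 0  # consecutive samples above the threshold

    def update(self, utilisation):
        """Feed one monitoring sample; return True when the trigger fires."""
        if utilisation > self.threshold:
            self.streak += 1
        else:
            self.streak = 0  # the sequence is broken
        if self.streak >= self.required:
            self.streak = 0  # start counting again after firing (assumed)
            return True
        return False
```

One such object per HA would make the threshold configurable per HA. Requiring several samples in sequence keeps a single short burst from causing a reconfiguration.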

Runtime Reconfiguration (3/3)

Technical Aspects

Merging and Separating Tiles

Changes number and shapes of partially reconfigurable regions

Different sets of bus macros

[Figure: Scenario 1 and Scenario 2. Labels: static elements in original design as part of hard macro; bus macros.]

Reconfiguration Speed

Evaluation/Demonstrator (1/2)

Evaluation/Demonstrator (2/2)

FlexPath NP
• NP with reconfigurable data-path
• Virtex-4 FX 60

DynaCORE
• Reconfigurable processing modules (HAs)
• Virtex-4 FX 60

Analysis, visualisation

Publications

[PKA09] Pionteck, T.; Koch, R.; Albrecht, C.; Maehle, E.: A Design Technique for Adapting Number and Boundaries of Reconfigurable Modules at Runtime. International Journal of Reconfigurable Computing, vol. 2009, Article ID 942930, Hindawi Publishing Corporation, New York 2009

[PAK08a] Pionteck, T.; Albrecht, C.; Koch, R.; Maehle, E.: Adaptive Communication Architectures for Runtime Reconfigurable System-on-Chips. Parallel Processing Letters, 2008

[AFK09] Albrecht, C.; Foag, J.; Koch, R.; Maehle, E.; Pionteck, T.: DynaCORE – Dynamically Reconfigurable Coprocessor for Network Processors. To Appear: Dynamically Reconfigurable Systems Architectures: Design Methods and Applications, Springer, 2009

[AKP09] Albrecht, C.; Koch, R.; Pionteck, T.; Glösekötter, P.: Towards a Flexible Fault-Tolerant System-on-Chip. 22nd International Conference on Architecture of Computing Systems - Workshop Proceedings, 83-90, VDE Verlag GmbH, Berlin 2009

[KAP09] Koch, R.; Albrecht, C.; Pionteck, T.: Adaptive Health Monitoring in a Reconfigurable Network-on-Chip. Workshop on Diagnostic Services in Network-on-Chips (DSNOC), Nice 2009

[AOP08] Albrecht, C.; Osterloh, Ch.; Pionteck, T.; Koch, R.; Maehle, E.: An Application-Oriented Synthetic Network Traffic Generator. European Conference on Modelling and Simulation 2008, 299-305, ECMS, Nicosia, Cyprus 2008

[ARK08] Albrecht, C.; Roß, P.; Koch, R.; Pionteck, T.; Maehle, E.: Performance Analysis of Bus-Based Interconnects for a Run-Time Reconfigurable Co-Processor Platform. PDP 08, 200-205, IEEE Computer Society, Toulouse, France 2008

[AWP08] Albrecht, C.; Werner, M.; Pionteck, T.; Fuchsen, R.; Koch, R.; Maehle, E.: WCET Determination Tool for Embedded Systems Software. SIMUTools08 Proceedings, 1, ICST, Marseille, France 2008

[PAK08] Pionteck, T.; Albrecht, C.; Koch, R.; Brix, T.; Maehle, E.: Design and Simulation of Runtime Reconfigurable Systems. IEEE Workshop on Design and Diagnostics of Electronic Circuits and Systems (DDECS 2008), 2008

[PAK08b] Pionteck, T.; Albrecht, C.; Koch, R.; Maehle, E.: Performance and Reliability Monitoring in Network-on-Chips. To Appear: Workshop on Diagnostic Services in Network-on-Chips (DSNOC), 2008

[PAK08c] Pionteck, T.; Albrecht, C.; Koch, R.; Maehle, E.: On the Design Parameters of Runtime Reconfigurable Systems. Accepted for: International Conference on Field Programmable Logic and Applications (FPL 2008), Heidelberg, Germany 2008

[AKP07] Albrecht, C.; Koch, R.; Pionteck, T.; Maehle, E.: Simulation System for Run-Time Reconfigurable Networks-on-Chip. Proceedings of the 6th EUROSIM Congress on Modelling and Simulation, ARGESIM - ARGE Simulation News, Wiedner Hauptstrasse 8-10, 1040 Vienna 2007

[APK07] Albrecht, C.; Pionteck, T.; Koch, R.; Maehle, E.: Modelling Tile-Based Run-Time Reconfigurable Systems Using SystemC. European Conference on Modelling and Simulation 2007, Prague, Czech Republic 2007

Summary

DynaCORE-specific aspects:

Interconnect performance analysis

• Bus versus NoC

• Based on a formally derived simulation model

Synthetic traffic generator

Performance enhancement compared to software-based systems

Proof of concept by means of demonstrator

• In cooperation with FlexPath / TU Munich

Universal aspects:

SystemC simulation methodology for runtime reconfigurable systems

• SystemC kernel does not need to be adapted

Reconfiguration Management
