• No results found

A Generic Network Interface Architecture for a Networked Processor Array (NePA)

N/A
N/A
Protected

Academic year: 2021

Share "A Generic Network Interface Architecture for a Networked Processor Array (NePA)"

Copied!
18
0
0

Loading.... (view fulltext now)

Full text

(1)

A Generic Network Interface

Architecture for a Networked

Processor Array (NePA)

Seung Eun Lee, Jun Ho Bahn,

Yoon Seok Yang, and Nader Bagherzadeh

EECS @ University of California, Irvine

(2)

Outline

Introduction

Network-on-Chip

Related Works

Generic Network Interface

– Related Works

– Networked Processor Array (NePA) Architecture

– Generic Network Interface

– Programming Sequence

– Modular Wrapper for a Slave IP Core

– Case Studies: Memory/ Turbo Decoder IP Cores

(3)

Introduction

0 40 80 120 160 200 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 Year Gate Dela y in Tim e s (ps) 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 W ir e D e lay in Tim es (ps)

Gate Delay (HP) Gate Delay (LOP) Gate Delay (LSTP) Global Wire Delay 1 Global Wire Delay 2 Metal1 Wire Delay 1 Metal1 Wire Delay 2 Int. Wire Delay 1 Int. Wire Delay 2

In 2018, the interconnection delay is estimated to be 1000 times greater than gate delay [ITRS]

The interconnection network among multiple IPs becomes another challenging issue in System-on-Chip (SoC) design

(4)

Introduction (cont’d)

Current Trends in VLSI Technology

– Requirements

• Computation intensive applications

• Highly integrated + low power

– Increasing # of computing resources in SoC

• CPUs, DSPs, ASPs

– System platforms

• MPSoC (Multi Processor System-on-Chip) or CMP (Chip Multi Processor)

• Homogeneous/Heterogeneous processors

• Similarity in a small scale distributed computer system

(5)

Network-on-Chip Interconnection

Interconnect Network Switch Link Network Interface CPU RAM I/O co-Processor Communication ROM DSP Peripheral

The use of switching based technology

Have been extensively used for computer network Communication between IPs can be packet based

The key efficiency of NoC

(6)

Network-on-Chip (cont’d)

Network-on-Chip (NoC) Architecture

– Network-like interconnection

• Insertion of routers

• Shortened wiring requirement

• Alleviating scalability and freedom from the limitation of complex wiring

– Difference from computer network technology (Internet TCP/IP)

• Simple and light-weight modification

• low power requirement for mobile applications

• Performance and cost

Different interface specification of integrated components

raise a considerable difficulty for adopting NoC

(7)

Generic Network Interface

The reuse of IP cores in plug-and-play manner can be

achieved by using a generic network interface (NI)

– Reduce design time of new system

– Translate packet-based communication into a higher level protocol

– Decouple computation from communication

(8)

Related Works

Different Packetization strategy

– Software library, on-core and off-core implementation

– A hardware wrapper implementation has the lowest area overhead and latency

NI for standard Interface such as OCP, DTL and AXI

– Improve reuse of IP cores

– Performance is penalized because of increasing latency

Generic architecture and automatic generation of

interface

– Existing researches limit the embedded IP cores to CPU (ARM7 and MC68000)

The designs of wrapper for application specific cores still

lack generic aspects

(9)

NePA Architecture

System Platform

MS Host I/F (HI) IP1 Memory Station (MS) Host I/F (HI) Processing Element (PE) Processing Element (PE) Processing Element (PE) Processing Element (PE) Host I/F (HI) Memory Station (MS) Memory Station (MS) Host I/F (HI) PE Router Processor Core (ARM™ / MIPS etc) Program RAM Data RAM NI Memory Station (MS) IP2 IP3 IP4 IPx Router Specific IP (FFT, Viterbi or Turbo coder) Network Interface Router Data RAM NI Memory Controller

(10)

NePA Architecture:

High-Performance Router Architecture

Interconnect throughout FIFO between neighboring PEs

– Simple Interconnect Wiring

Minimal (shortest) adaptive routing

– Livelock-free

Point-to-point single or block transfer Two disjoint sub-networks for the west-to-east and east-to-west traffics

– Network avoids a cyclic dependency

– Resulting in deadlock-freedom

Prioritized packet delivery

W Input N1 Input S1 Input IntR Input S1 O utput E Output N1 Output W Input S2 Input N 2 Input IntL Inpu t N2 Output W output S2 Output N W S E N S N1 N2 W INT E S1 S2 Int R Int L INT Internal Router Right Router Left Router

(11)

Network Interface Prototype:

Packetization Unit

Build the packet header and converts the data into flits

– Header builder: form the head flit based on the information provided by registers

– DMA controller: generate control over the address and read signal for the internal memory automatically

(12)

Network Interface Prototype:

Depacketization Unit

Receive data from interconnection network

– Flit Controller: select head flit from a packet and pass it to the header parser

– Header parser: extract control information from the head flit and assert an interrupt signal to the OpenRISC core

– DMA controller: writes the body flit data into the internal memory automatically

(13)

Network Interface Prototype:

Programming Sequence

Sending SINGLE Packet

– All required parameters are set to the associated registers

– Writing command register generate a complete packet

Sending BLOCK packet

sDataReg represents the number of data

sReadAddrReg indicates the start address of data in memory

Receiving SINGLE/BLOCK packets

– Parameters are accessed by interrupt service routine

Accessing rDataReg completes the procedures for current packet

– For BLOCK packet, OpenRISC sets the corresponding write address (wWriteAddrReg) for internal memory access

(14)

Generic Network Interface

Modification of Packet for NI access

A slave IP core is not able to write registers in a current NI

These registers are accessed by other cores using the

network

– Opcode and Operand of an instruction are located at Tag and Data field in the SINGLE packet

Type field indicates that packet contains an instruction for NI

– Instruction decoder in the header parser fetches opcode and operand from a packet

(15)

Generic Network Interface

Modular Wrapper for a slave IP core

Un-buffered Mode: data is exchanged in data stream without intermediate buffer

(16)

Generic Network Interface

Modular Wrapper for a Memory

Maintain data and shared among a number of PEs.

Assume synchronous SRAM model

Wrapper design

– Core type is slave IP

– There is no control signals for initialization or status monitoring

– Data interface is realized in the un-buffered mode removing the FIFOs between NI and memory

Programming sequence

– The base address is set to the desired value using SINGLE packet

– Sending BLOCK packet stores data into memory

(17)

Generic Network Interface

Modular Wrapper for a Turbo Decoder

Stand-alone turbo decoder operating block by block

process

Wrapper design

– Core type is slave IP

– There are six signals that are used for initialization and mode selection

– Data interface adopts buffered mode, inserting FIFOs between NI and the core

Programming sequence

– Before starting turbo decoding, it is initialized by sending packet which accesses the input control signals

– Data is sent to the core using BLOCK packet

– When decoding of one block is completed, NI start to send a packet to the other node automatically

(18)

Summary

Introduced Networked Processor Array (NePA)

Proposed network interface architecture for OpenRISC

core

Classified the possible IP cores for processing elements

Proposed a modular wrapper for an embedded IP cores

– Allocation table was used for the configuration of the modular wrapper

– Programming model was presented

– Case studies in memory and turbo decoder cores demonstrated feasibility and efficiency of the proposal

References

Related documents

The main issues in this study that different from the previous researches is to test reclaimed water samples in different percentage of blending with potable water 20%,50% and

When the Lord appears, when we no longer live but Christ lives in us, we are able to bring an offering in righteousness, one that God will accept. The

In this paper, we prove coincidence point and common fixed point results of two, three and four self mappings in normal cone metric type spaces.. The results presented in this

• If this is successful, airlines should experience lower training costs for lower-time pilots because the baseline of knowledge and skills will be

Employees, representatives and agents of the Hospital are expected to be honest in their communications with patients and their families, attorneys, staff members, auditors and

There is a bit of history to this report, we contracted a Developer who had been working for the company for a while to use the Google Adword API to get to the stage where we can

Average monthly stratospheric mountain wave activity in percent of time for the winter months December, January, February, and March (white, black, grey, and white, respectively)