Academic year: 2021


(1)

Command and Control of a Massively Parallel GALS Environment

Cameron Patterson

Supervisor: Steve Furber

SpiNNaker Team, APT Group, University of Manchester, UK.

(2)

Management

SpiNNaker

– ASIC for modelling Artificial Neural Networks
– Utilises Asynchronous Interconnects
  • On-chip System and Communication NoCs
  • Off-chip inter-connects to other SpiNNaker chips
– Up to 20 cores per chip
– Each core simulates up to 1,000 neurons
– System Scales to >1M
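The per-chip figures above give a simple upper bound on capacity. The sketch below works it through in Python, assuming every one of the 20 cores simulates the full 1,000 neurons. The one-million-neuron input is only an illustrative number chosen here; it is not a reading of the truncated ">1M" line.

```python
import math

CORES_PER_CHIP = 20        # "up to 20 cores per chip" (from the slide)
NEURONS_PER_CORE = 1_000   # "up to 1,000 neurons" per core (from the slide)


def max_neurons_per_chip() -> int:
    """Upper bound, assuming all cores are used for neural simulation."""
    return CORES_PER_CHIP * NEURONS_PER_CORE


def min_chips_for(neurons: int) -> int:
    """Smallest number of chips that could hold `neurons` at that bound."""
    return math.ceil(neurons / max_neurons_per_chip())


print(max_neurons_per_chip())    # 20000
print(min_chips_for(1_000_000))  # 50 (illustrative input)
```

In practice the real figure would be lower if any cores are reserved for non-neural work such as monitoring.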

(3)

Management

Resource Contention

– In the brain there are billions of neurons, massive interconnectivity, but biological operation is slow
  • Computing technology is fast, so we can multiplex
  • Contention is mapped by anticipated statistics
  • Machine is homogeneous
  • However, the behaviour of the NN is organic: we don’t know where ‘hot-spots’ will form

(4)

Proposed Research (1 of 3)

System Command and Control

– Many thousands of components in a large system:
  – Chips, RAMs, Links, Ethernets
– They could go wrong: ‘Fault Alerting’
– Other facets of system management:
  • Capacity Management
  • Accounting
  • Performance Analysis
  • Security

(5)

Proposed Research (2 of 3)

Real-time Software Monitoring

– Neural Software running in biological time
– The system doesn’t have to be like a ‘batch system’ / ‘black box’
– Customers may be neuro-scientists or psychologists
– Which parts of the artificial brain are ‘lit-up’
– Save this activity for later timeline analysis

* Picture from: Powell K: Economy of the Mind. PLoS Biol 1/3/2003: e77. Rewarding the Brain – Ventral midbrain activity

(6)

Proposed Research (3 of 3)

In-Flight Reconfiguration Management

– Dealing with the consequences of a h/w or s/w issue
  – A core is overloaded
  – A chip is failing
  – A link is congested
– Take automated remedial action
  – Remap neurons to different fascicle processors
  – Re-routing of data around the fault in the machine
  – Turning on QoS
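The "remap neurons" action can be sketched as a small rebalancing routine. This is only an illustration in Python under stated assumptions: the per-core load is a plain neuron count, the capacity limit is a single number, and the greedy "move the surplus to the least-loaded core" policy is chosen here for simplicity. The function name `remap_overloaded` and the core labels are hypothetical.

```python
def remap_overloaded(load, capacity):
    """Greedy sketch: move surplus neurons off overloaded cores.

    load     -- dict mapping a core label to its neuron count
    capacity -- assumed maximum neurons a core should hold
    Returns (moves, new_load), where moves is a list of
    (source_core, destination_core, neurons_moved) tuples.
    """
    load = dict(load)  # work on a copy; the caller's dict is untouched
    moves = []
    for src in sorted(load):
        while load[src] > capacity:
            others = [c for c in load if c != src]
            if not others:
                raise RuntimeError("no other core to move neurons to")
            # Pick the least-loaded other core (label breaks ties).
            dst = min(others, key=lambda c: (load[c], c))
            spare = capacity - load[dst]
            if spare <= 0:
                raise RuntimeError("no spare capacity left in the machine")
            n = min(spare, load[src] - capacity)
            load[src] -= n
            load[dst] += n
            moves.append((src, dst, n))
    return moves, load


moves, after = remap_overloaded(
    {"core0": 1200, "core1": 400, "core2": 900}, capacity=1000
)
print(moves)  # [('core0', 'core1', 200)]
```

A real remapping would also have to move routing entries and synaptic data along with the neurons; the sketch only covers the placement decision.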

(7)

Infrastructure

To retrieve the information from the system:

– Two components:
  – Protocol to get/set data
  – Data stored
– Common solution used for network attached systems:
  • SNMP *
  • MIB *

• Research attempting implementation of SNMP & MIB on SpiNNaker

* J Case, M Fedor, M Schoffstall, and C Davin. RFC 1067 A Simple Network Management Protocol (SNMP), 1988
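The two components, a store of named values and a protocol to get/set them, can be illustrated with a toy in-memory model in Python. This is not an SNMP implementation and uses no SNMP library: the class `ToyMib`, the enterprise number 99999 and the three object identifiers are all made up for the example. The one property it does mirror is that objects are addressed by OIDs and walked in lexicographic order.

```python
import bisect


class ToyMib:
    """Toy data store addressed by OID tuples (illustrative only)."""

    def __init__(self):
        self._values = {}       # OID tuple -> current value
        self._writable = set()  # OIDs that accept a set operation

    def register(self, oid, value, writable=False):
        oid = tuple(oid)
        self._values[oid] = value
        if writable:
            self._writable.add(oid)

    def get(self, oid):
        return self._values[tuple(oid)]  # KeyError if no such object

    def set(self, oid, value):
        oid = tuple(oid)
        if oid not in self._writable:
            raise PermissionError(f"{oid} is read-only or unknown")
        self._values[oid] = value

    def get_next(self, oid):
        """Return the (oid, value) that follows `oid` lexicographically."""
        ordered = sorted(self._values)
        i = bisect.bisect_right(ordered, tuple(oid))
        if i == len(ordered):
            return None  # walked off the end of the MIB
        return ordered[i], self._values[ordered[i]]


BASE = (1, 3, 6, 1, 4, 1, 99999)  # hypothetical private subtree
mib = ToyMib()
mib.register(BASE + (1, 1, 0), 20)                # cores on the chip
mib.register(BASE + (1, 2, 0), 37)                # core utilisation, percent
mib.register(BASE + (2, 1, 0), 0, writable=True)  # QoS switch

mib.set(BASE + (2, 1, 0), 1)           # "turn on QoS"
print(mib.get(BASE + (2, 1, 0)))       # 1
print(mib.get_next(BASE))              # first object under BASE
```

Python compares tuples element by element, which matches OID ordering, so `get_next` can be used to walk the whole store from `BASE`.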

(8)

Infrastructure

SpiNNaker is a very large scale system

It may therefore require many command and control machines to monitor the whole system

• Functions can be split into management domains, e.g.:
  • Function: Capacity, Faults, Software Visualisation etc.
  • Type: processor utilisation, memory use, link capacity
  • Location: cluster of chips by
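Splitting work into management domains amounts to grouping the same stream of readings by a chosen key. The Python sketch below shows that grouping under assumptions made here: the `Reading` record, its field names and the sample labels are hypothetical, and the location is treated as an opaque cluster label because the slide's own definition is cut off.

```python
from collections import defaultdict
from typing import NamedTuple


class Reading(NamedTuple):
    function: str   # e.g. "capacity", "faults"
    metric: str     # e.g. "processor_utilisation", "link_capacity"
    location: str   # opaque cluster label (hypothetical)
    value: float


def split_by_domain(readings, key):
    """Group readings by one field: 'function', 'metric' or 'location'."""
    domains = defaultdict(list)
    for r in readings:
        domains[getattr(r, key)].append(r)
    return dict(domains)


readings = [
    Reading("capacity", "processor_utilisation", "cluster-A", 0.81),
    Reading("capacity", "memory_use", "cluster-B", 0.40),
    Reading("faults", "link_capacity", "cluster-A", 0.05),
]

by_function = split_by_domain(readings, "function")
by_location = split_by_domain(readings, "location")
print(sorted(by_function))   # ['capacity', 'faults']
print(sorted(by_location))   # ['cluster-A', 'cluster-B']
```

Each resulting group could then be handed to a separate command and control machine, so that no single machine has to monitor the whole system.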

(9)

Issues to Overcome

SpiNNaker resources are limited

– Small instruction memory

– Restricted I/O via Ethernet links

– Want to limit all but essential load to maximise neural computation

The management system domain will be large.

A scalable solution that minimises use of system resources is required

– Hierarchical System of both NMSs and agents
  • AgentX * permits master/slave agent relationship – delegates the collection and data store for the end systems

* M Daniele, B Wijnen, M Ellison, and D Francisco. RFC 2741 Agent Extensibility (AgentX) Protocol, 2000
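The delegation idea can be sketched as a master agent that forwards each request to whichever subagent registered the matching OID subtree. The Python below is a simplified model, not the AgentX wire protocol: `MasterAgent`, the enterprise number 99999 and the dictionary-backed subagents are assumptions for illustration, and longest-prefix matching stands in for the full rules on overlapping registrations.

```python
class MasterAgent:
    """Toy master agent: dispatches gets to registered subtree handlers."""

    def __init__(self):
        self._subtrees = {}  # OID prefix tuple -> handler(oid) callable

    def register(self, prefix, handler):
        self._subtrees[tuple(prefix)] = handler

    def get(self, oid):
        oid = tuple(oid)
        matches = [p for p in self._subtrees if oid[:len(p)] == p]
        if not matches:
            raise KeyError(oid)  # nobody is responsible for this OID
        best = max(matches, key=len)  # most specific registration wins
        return self._subtrees[best](oid)


BASE = (1, 3, 6, 1, 4, 1, 99999)  # hypothetical private subtree

# Two subagents, each holding the data for its own part of the machine.
cluster_a = {BASE + (1, 1): 12}
cluster_b = {BASE + (2, 1): 7}

master = MasterAgent()
master.register(BASE + (1,), cluster_a.__getitem__)
master.register(BASE + (2,), cluster_b.__getitem__)

print(master.get(BASE + (1, 1)))  # 12
print(master.get(BASE + (2, 1)))  # 7
```

The NMS only ever talks to the master, while collection and storage stay with the subagents, which is the property the slide relies on for scaling.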

(10)

Proposal

• Combine ‘system’ and ‘neural’ management data into a single standardised command/control SNMP framework
• Offload as much processing as possible from the SpiNNaker machine
• A protocol translator is therefore proposed to provide facilities for both
  – To the SpiNNaker side, it looks like a low cost native host
  – To the NMS, it looks like a standard SNMP agent
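A minimal sketch of the translator's two faces, in Python: one side ingests a compact binary report, the other answers OID lookups from a cache. Everything specific here is an assumption for illustration, including the 8-byte report layout, the metric codes, the enterprise number 99999 and the class name `ProtocolTranslator`. The actual SpiNNaker-side protocol is not described in these slides.

```python
import struct

# Hypothetical report layout, little-endian, 8 bytes in total:
# chip id (uint16), core id (uint8), metric code (uint8), value (uint32)
REPORT = struct.Struct("<HBBI")
BASE = (1, 3, 6, 1, 4, 1, 99999)  # hypothetical private subtree


class ProtocolTranslator:
    def __init__(self):
        self._cache = {}  # OID tuple -> last reported value

    def ingest(self, packet: bytes):
        """SpiNNaker-facing side: decode one cheap native report."""
        chip, core, metric, value = REPORT.unpack(packet)
        oid = BASE + (metric, chip, core)
        self._cache[oid] = value
        return oid

    def snmp_get(self, oid):
        """NMS-facing side: answer a get from the cached data."""
        return self._cache[tuple(oid)]


pt = ProtocolTranslator()
packet = REPORT.pack(3, 7, 2, 950)  # chip 3, core 7, metric 2, value 950
oid = pt.ingest(packet)
print(len(packet))       # 8
print(pt.snmp_get(oid))  # 950
```

The machine only has to emit a fixed-size record, while OID bookkeeping and request handling happen on the translator, which matches the aim of offloading processing from SpiNNaker.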

(11)

Visualisation

What might the management systems look like?

(12)

Work so Far

• Design of low processing cost IP compatible protocol for the SpiNNaker Ethernet links
  – Permits routing: collaboration with off-site partners, and resiliency
• Test Software – Doughnut Hunter
  – Neural Network Application
  – Implemented on test systems
  – Validation of IP protocol
• Diagnostic testing and specification
  – Existing Test Chip
  – Proposals for next iteration of chip design

(13)

Future Work

• Implementing the SNMP and Protocol Translator system for SpiNNaker
  – Devising a MIB for hardware and software
  – Software creation, optimisation, comparison with other solutions
  – Topological testing – placing the P.T. internally to the system
• Creation of Neural Visualisation
  – Explore standard tools vs. bespoke
  – Extend P.T. to provide standard neural imaging output format
• In-Flight System reconfiguration
  – Use management output to command/control the system
  – Look at automation of re-routing around hot spots and failures without requiring the system to stop

(14)

Conclusions

The Protocol Translator is a promising way to offload management functions while still providing a standard SNMP interface to an NMS

Using the same framework to support both hardware and software visualisation also appears to be a valid approach

The low cost IP protocol developed has already been validated as a low-cost way to significantly improve network functionality

(15)

Questions ?
