How safe is your software?

(1)

T4

Thursday, Dec 7, 2000

(2)

How safe is your software?

Johan Hedberg, SP

Lars Strandén, SP

SP Swedish National Testing and Research Institute

johan.hedberg@sp.se

(3)

PALBUS

• Swedish joint research project between Industry, University and Research Institutes

• How to design dependable distributed control systems

• How to validate/verify systems to reach dependability

• Present useful methods for validation/verification

(4)

Background

• Trend toward increased use of distributed control systems in safety-critical applications

• Important to achieve a certain level of dependability in such systems

• Increased industry interest concerning dependability

(5)

Background, cont.

• Started in August 1999 and will end in April 2001

(6)

Application domain

• System

• Distributed control system

• Safety

• Embedded system

• Real-time

(7)

System overview

[Figure: Overview of the distributed system. Four nodes (node1 to node4) are connected to a common communication bus. Each node has an input (inparameter_node1 to inparameter_node4) and an output (outparameter_node1 to outparameter_node4) and contains the blocks Adjustment of format, Filtering, Calculation, and Protocol generator / Protocol degenerator.]

(8)

Definitions & standards

• According to Laprie's definitions

• Dependability
- availability
- reliability
- safety
- security

• Relevant standards

[Figure: The chain ... → FAILURE → FAULT → ERROR → FAILURE → FAULT → ...
Fault: adjudged or hypothesized cause (also called bug, defect, mistake, etc.). Activation (internal) or occurrence (external) of a fault leads to an error.
Error: system internal effect.
Failure: user perceived effect; deviation of delivered service from compliance to system specification.]

(9)

New types of errors

• Node error

• Bus error

• Timing error

• Data consistency error

• Initialization & restart error

• Babbling idiot error

• Configuration error

[Figure: Four nodes: Node A, Node B, Node C and Node D.]

(10)

New types of errors, cont.

• New error types require new fault detection and fault handling methods

• The distribution of computing gives new ways to detect and handle faults

(11)

Time triggered/event triggered

• Bus access techniques

• Delay

• Jitter

• Scheduling

• Fault detection and fault handling

• Acceptance

• How do the protocols handle errors related to distributed systems?
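Delay and jitter can be made concrete with a small calculation. The sketch below is only an illustration: it assumes that jitter means the difference between the largest and the smallest observed delay (one common definition; the slides do not give one), and the timestamps and function names are invented for the example.

```python
def delays(sent_us, received_us):
    """Per-message delay: receive time minus send time (microseconds)."""
    return [received - sent for sent, received in zip(sent_us, received_us)]


def peak_to_peak_jitter(delay_values):
    """Jitter taken as the spread between the worst and best delay (assumed definition)."""
    return max(delay_values) - min(delay_values)


# Hypothetical timestamps for four messages sent every 1000 microseconds.
sent_us = [0, 1000, 2000, 3000]
received_us = [120, 1150, 2110, 3180]

observed = delays(sent_us, received_us)
print(observed, peak_to_peak_jitter(observed))
```

With these invented numbers the delays vary between 110 and 180 microseconds, so the jitter under this definition is 70 microseconds.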

(12)

Bus access techniques

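[Figure: Each node's sending time slot in a time-triggered protocol. Along an axis of increasing time the slots repeat in a fixed order: node1, node2, node3, node4, node1, node2, node3, node4.]

[Figure: Bitwise arbitration mechanism. Node 1, node 2 and node 3 are connected to a common communication bus and send the bit patterns 001101, 001011 and 001010.]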
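The two bus access techniques on this slide can be sketched in a few lines. This is a minimal illustration, not the behaviour of any specific protocol: the function names, the 10 ms slot length and the assignment of the three bit patterns to node 1, node 2 and node 3 in the order shown are assumptions, and the arbitration part assumes a CAN-style bus where a 0 bit is dominant and overrides a 1 bit.

```python
def slot_owner(time_ms, slot_ms, nodes):
    """Time-triggered access: the sender is determined by the current time alone."""
    return nodes[(time_ms // slot_ms) % len(nodes)]


def arbitrate(identifiers):
    """Bitwise arbitration, assuming 0 is dominant and all identifiers
    are unique bit strings of equal length."""
    contenders = dict(identifiers)
    width = len(next(iter(contenders.values())))
    for bit in range(width):
        # The bus carries 0 if any remaining contender sends 0 at this position.
        bus_level = min(ident[bit] for ident in contenders.values())
        # A node that sent 1 but reads back 0 stops transmitting.
        contenders = {name: ident for name, ident in contenders.items()
                      if ident[bit] == bus_level}
    (winner,) = contenders  # exactly one node remains when identifiers are unique
    return winner


nodes = ["node1", "node2", "node3", "node4"]
print(slot_owner(25, 10, nodes))
print(arbitrate({"node1": "001101", "node2": "001011", "node3": "001010"}))
```

Under these assumptions the time slot at 25 ms belongs to node3, and the node sending 001010 wins arbitration because it is the last one still sending a dominant bit where the others differ.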

(13)

Design principles for dependable systems

• Focus on design principles specific for distributed control systems

• Utilize the protocol optimally to reach as high dependability as possible

• How to improve the

(14)

Validation & verification methods

• Focused on methods related to the ”distribution”

• Divided into the following groups:
- formal methods
- analysis
- test

(15)

Fault tree applied to distributed control systems

[Figure: Fault tree. Top event: Safety related failure in the system. Events shown: Failure in software; Failure in hardware (EMC, humidity, temperature, vibration); Implementation of software does not fulfil the specification; Specification has not considered all possible risks; Failures related to the distribution of nodes, at system level, application level, communication controller level and bus level.]
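Once the gates of a fault tree are fixed, checking whether the top event occurs is mechanical. The sketch below uses the event names from the slide, but the arrangement of the events and the choice of OR gates everywhere are assumptions made for illustration; the original figure may be structured differently. The class `Event` and its method `occurs` are hypothetical names.

```python
class Event:
    """A fault tree node: a basic event (no children) or an OR gate over its children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def occurs(self, active):
        """Return True if this event occurs, given the set of active basic events."""
        if not self.children:
            return self.name in active
        # Assumed OR gate: any occurring child triggers the parent event.
        return any(child.occurs(active) for child in self.children)


hardware = Event("Failure in hardware", [
    Event("EMC"), Event("Humidity"), Event("Temperature"), Event("Vibration"),
])
distribution = Event("Failures related to the distribution of nodes", [
    Event("System level"), Event("Application level"),
    Event("Communication controller level"), Event("Bus level"),
])
software = Event("Failure in software", [
    Event("Implementation of software does not fulfil the specification"),
    Event("Specification has not considered all possible risks"),
    distribution,
])
top = Event("Safety related failure in the system", [software, hardware])

print(top.occurs({"Bus level"}))
print(top.occurs(set()))
```

With OR gates throughout, a single active basic event such as a bus level failure is enough to reach the top event; a real analysis would replace gates with AND where redundancy or fault handling is in place.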

(16)

Structure of a distributed control system

• What do we mean by ”system aspects”?

• All levels must be considered to reach dependability

[Figure: Levels shown: System, Application, Bus, Communication controller.]

(17)

Prototype system

• CAN based system

• Available for all participants in the project

• Used by the participants to be able to analyze and test ideas and methods developed in the project

(18)

Applying analysis & test methods

• Application dependent

• ”Practical” implementation of analysis methods described in the project

• Quality of developed methods

• New methods

(19)

Trends

• Increased system complexity

• More frequent use of COTS

• Less embedded systems

(20)

Conclusions

• Distributed control systems give new possibilities to supervise the behaviour of the application

• Decide as early as possible in the development phase whether a distributed architecture should be used

• Present validation/verification methods to handle software related errors in distributed control systems

• Indicate a certain level of dependability in software by

(21)

PALBUS information

PALBUS results are available at the following address:

(22)

Thursday 7 December 2000 T4

How safe is your software? Johan Hedberg

Johan Hedberg received his Master of Science in Electrical Engineering in 1999 from Chalmers University of Technology in Gothenburg, Sweden. His thesis, ”Implementation of a Distributed Control Application Based on the TTP/C Architecture”, was carried out at Volvo Technological Development. At SP he has continued to work with distributed systems in a research project financed by NUTEK (The Swedish National Board for Industrial and Technical Development). The purpose of this project is to identify methods for evaluating the dependability of distributed control systems. He works at the SP section of Software & Safety.

Lars Strandén received his Master of Engineering Physics in 1976 from University of Technology in Uppsala, Sweden. He has worked with real-time embedded systems and large radar applications written in Ada at Ericsson Microwave Systems. He worked as a technical specialist concerning software development methods and received his Licentiate of Engineering in 1998 from Chalmers University of Technology in Gothenburg, Sweden. After that he worked for two years as a computer system consultant and now works at the SP section of Software & Safety. His interests focus on dependable software within real-time embedded applications, and he is currently involved in a research project concerning