• No results found

Towards Verification and Validation for Increased Autonomy

N/A
N/A
Protected

Academic year: 2021

Share "Towards Verification and Validation for Increased Autonomy"

Copied!
34
0
0

Loading.... (view fulltext now)

Full text

(1)

Dimitra Giannakopoulou

D

Dimitra Giannakopoulou

Towards  Verifica.on  and  Valida.on  for  

Increased  Autonomy  

 

(2)
(3)

Safe  Air  Transporta.on  

§  Air  traffic  opera.ons  are  expected  to  increase  significantly.  

Automa.on  must  maintain  or  exceed  current  safety  standards  

§  Separa.on  Assurance  –  algorithms  and  systems  gradually  taking  the  

role  of  air-­‐traffic  controllers  to  enable  reduced  aircra8  separa.on  

(4)

ACAS  X  –  a  completely  new  paradigm  

dhint

hrel

dhown

§  40  secs  from  Near-­‐Mid-­‐Air  Collision  (NMAC)  

§  state  variables  

–  hrel  :  rela.ve  al.tude,  in  [-­‐1000…1000]  !  

–  dhown  /  dhint  :  ownship  /  intruder  climb  

rates,  in  [-­‐2500…2500]!/min    

§  aprev  /  sRA:  advisory  issued  by  ACAS  X  in  

previous  sec  /  current  pilot  state,  both  in      

{COC,  CL/DES1500,  SCL/SDES1500,  SCL/SDES2500  

§  update  and  advisory  frequency  is  set  to  1  sec  

§  discre7za7on  resolu7on  n  for  a  variable  V  

means  that  V  is  discre.zed  to  n  points  above   and  n  points  below  0  within  its  interval  of   values.  For  example,  discre.za.on  resolu.on   of  10  for  hrel  means:  

 {-­‐1000,  -­‐900,  -­‐800,  …  ,  0,  100,  …,  900,  1000}  

  ownship

(5)

ACAS  X  –  goals  

stateA

stateB stateC stateE

CL 1500 ft/min COC P = 0.4 P = 0.6 stateD

P = …

§  NMAC  (near-­‐mid-­‐air  collision  

§  Alert  (from  COC  to  advisory)  

§  Strengthening  (strengthen  advisory)  

§  Reversal  (e.g.  climb  to  descend)  

§  COC  (clear  of  conflict)  

(6)

deploying  ACAS  X  as  a  look  up  table    

each grid-point in the look up table has corresponding costs for each advisory; ACAS X returns the advisory with the smallest cost.

(7)
(8)
(9)

ACAS-­‐X  

(10)

would  you  trust  ACAS  X  in  a  flight?  

did they pick the right costs?

how likely is an NMAC? does the model make sense?

(11)

ACAS-­‐X  

(12)

discre.za.on  resolu.on  (dh

own

,  dh

int

,  h

rel

)  

con.nuous    

model  

genera+on  

model  

evalua+on  

model  

look-­‐up  table  

model    

checking  

DR

G

 

DR

E  

(higher  resolu7on  

than  DR

G

)  

DRG  :  model  discre.za.on  resolu.on  for  look-­‐up  table  genera.on;                              baseline  [KC  2011]  resolu.on  is  (dhown=10,  dhint=10,  hrel=10)   DRE  :  model  discre.za.on  resolu.on  to  model  check  look-­‐up  table  

(13)

effects  of  resolu.on  DR

E

 on  model  checking  

 

§  we    compute  P(NMAC)  of  the  baseline  look  up  table  deployed  in  

models  that  are  obtained  through  discre.za.on  with  varying   resolu.ons  DRE  (dhown,  dhint,  hrel)  

 

–  ALL  varies  climb  rates  and  rela.ve  al.tude  in  DRE:  (20,  20,  20),  (30,  30,  30),  …  

–  climb  varies  climb  rates  only  in  DRE:  (20,  20,  10),  (30,  30,  10),  …  

–  alt  varies  rela.ve  al.tude  only  in  DRE:  (10,  10,  20),  (10,  10,  30),  …  

(14)

0 0.0002 0.0004 0.0006 0.0008 0.001 0.0012 0.0014 0.0016 10 15 20 25 30 35 40 45 50 P(NM A C) resolution Height Climbing Rate All

§  P(NMAC)  decreases  with  higher  evalua.on  resolu.ons  

§  rela.ve  al.tude  discre.za.on  is  indica.ve  

model checking resolution DRE

effects  of  resolu.on  DR

E

 on  model  checking  

alt

climb ALL

(15)

probabilis.c  model  checking  DR

E

 =  (50,  50,  100)  

 

allows  precise  automated  analysis  of  probabilis.c  proper.es  

expressed  in  a  formal  logic  such  as  PCTL;  generates  encounters  that   exhibit  property-­‐related  behaviors  

 

§  what  is  the  probability  of  NMAC?      (P=?[F  NMAC])  2.5  x  10-­‐4  

§  what  if  pilot  responds  immediately?  

                                                                                           (P=?(F  NMAC  |  Gaprev  =  sRA))  2.3  x  10-­‐8    

 

§  what  is  the  probability  of  a  split  advisory?      1.8  x  10-­‐3  

         P=?[  F  (!COC  ∧ P=1[X  COC]  ∧  P>0  [F  !COC]  )]  

§  split  advisories  are  harder  to  directly  take  into  account  during  look  

(16)

split  advisory  encounter  

§  sensible  behavior  by  ACAS  X    

(reward  for  COC  +  cost  of  alert)  <  cost  of  reversal  (“sneaky”  reversals)        

-3000 -2000 -1000 0 1000 2000 3000 0 5 10 15 20 25 30 35 40 v a lu es time h0 h1 h 0 1 2 3 4 5 6 dhint dhown DES1500 SDES2500 COC va lu es time to LHS CLI1500 COC hrel

(17)
(18)

weights  vs.  performance  

§  tune  look-­‐up  tables  based  on  minimum  acceptable  performance  

–  determinis.c  look-­‐up  tables  based  on  weights  form  a  convex  Pareto  front;  we   implement  algorithms  that  approximate  it  above  target  performance  

-0.62 -0.6 -0.58 -0.56 -0.54 -0.52 -0.5 -0.0007 -0.0006 -0.0005 -0.0004 -0.0003 -0.0002 -Al erts -P(NMAC) Controllers Target First Point

(19)

discre.za.on  resolu.on  (dh

own

,  dh

int

,  h

rel

)  

con.nuous    

model  

genera+on  

model  

evalua+on  

model  

look-­‐up  table  

model    

checking  

DR

G

 

DR

E  

(higher  resolu7on  

than  DR

G

)  

DRG  :  model  discre.za.on  resolu.on  for  look-­‐up  table  genera.on;                              baseline  [KC  2011]  resolu.on  is  (dhown=10,  dhint=10,  hrel=10)   DRE  :  model  discre.za.on  resolu.on  to  model  check  look-­‐up  table  

Pareto  

(20)

synthesizing  bemer  tables  in  higher  DR

G

   

§  ques.on:  what  is  the  effect  of  resolu.on  discre.za.on  DRG  on  look-­‐

up  table  synthesis?    

§  experiment  set  up:  

–  evaluate  baseline  (dhown=10,  dhint=10,  hrel=10)  [KC  2011]    in  new  DRG   –  use  result  as  target  for  synthesis  

§  how  we  vary  resolu.ons  DRG  (dhown,  dhint,  hrel)  

–  ALL  varies  climb  rates  and  rela.ve  al.tude  in  DRG:  (20,  20,  20),  (30,  30,  30),  …  

–  climb  varies  climb  rates  only  in  DRG:  (20,  20,  10),  (30,  30,  10),  …  

–  alt  varies  rela.ve  al.tude  only  in  DRG:  (10,  10,  20),  (10,  10,  30),  …  

–  note  that  alt  results  in  the  smallest  look  up  tables  −  in  terms  of  numbers  of  

states  −  for  each  value  increase  

 

(21)

table  synthesis  at  different  resolu.ons  

recommenda.on:  (10,  10,  30),  or  (n,  n,  3*n)   0 5e-05 0.0001 0.00015 0.0002 0.00025 0.0003 0.00035 0.0004 10 15 20 25 30 35 40 45 50 P(NM A C) resolution Height Climbing Rate All generation resolution DRG climb alt ALL
(22)

verifica.on  achievements  

§  we  could  not  use  off-­‐the-­‐shelf  tools,  so  we  built  VeriCA  toolset   -  our  tools  support  models  wrimen  in  Java    

-  we  customized  verifica.on  and  synthesis  algorithms  for  ACAS  X  needs

 

§  we  analyzed  ACAS  X  version  that  we  reproduced  based  on:    

Kochenderfer,  M.  J.,  and  Chryssanthacopoulos,  J.  P.  Robust  airborne  collision  avoidance   through  dynamic  programming.  Project  Report  ATC-­‐371,  Massachusems  Ins.tute  of   Technology,  Lincoln  Laboratory,  2011.  

 

§  ETAPS  2014  EASST  best  paper  award  

–  Chris.an  von  Essen,  Dimitra  Giannakopoulou:  Analyzing  the  Next  Genera+on  

Airborne  Collision  Avoidance  System,  TACAS  2014.    

§  FAA  /  NASA  Ames  Interagency  Agreement  for  ACAS  X  and  VeriCA  

(23)
(24)

comparing  model  to  real  world  

State machine model with probabilistic transitions (MDP) is used to generate onboard look-up table. The MDP is not present in the onboard system.

t

We defined Conformance Relations that compare MDP model to flight data

state estimate at time t on look-up table

(25)

comparing  model  to  real  world  

State machine model with probabilistic transitions (MDP) is used to generate onboard look-up table. The MDP is not present in the onboard system.

t

t+1sec

We defined Conformance Relations that compare MDP model to flight data

state estimate at time t on look-up table

state estimate at time t+1 on look-up table

(26)

comparing  model  to  real  world  

State machine model with probabilistic transitions (MDP) is used to generate onboard look-up table. The MDP is not present in the onboard system.

t

t+1sec

We defined Conformance Relations that compare MDP model to flight data

MDP model state transitions

full conformance:

ACAS X states contained in MDP states

(27)

comparing  model  to  real  world  

State machine model with probabilistic transitions (MDP) is used to generate onboard look-up table. The MDP is not present in the onboard system.

t

t+1sec

We defined Conformance Relations that compare MDP model to flight data

MDP model state transitions

full conformance:

ACAS X states contained in MDP states

MDP model state transitions

non-conformance:

ACAS X and MDP states do not intersect

(28)

data  genera.on  and  analysis  

Data  genera7on:  Non-­‐conforming  encounters  are  rare  in  test  data  of   the  ACAS  X  distribu.on.  We  used  a  reinforcement  learning  framework   to  target  genera.on  of  non-­‐conforming  simulated  encounters.  

     

Data  Analysis:  The  intruder  climb   rate  has  been  iden.fied  as  a  

common  factor  for  divergence  

across  the  data.  Further  analysis  is   needed.  

Open  ques7on:  Does  non-­‐ conformance  imply  poten.al  

(29)

ACAS-­‐X  

(30)

ACAS-­‐X  

(31)

§  formula.on  of  requirements  is  harder  –  autonomy-­‐specific?  

–  op.miza.on,  adap.ve  and  learning    algorithms   –  example:  loss  of  separa.on,  ACAS  X    

 

(32)

generate  strategic  secondary  aircra8  

no picked resolution is allowed to cause a more imminent secondary conflict

C1-2 T2 T1 C1-3 T3 T1

(33)

§  formula.on  of  requirements  is  harder  –  autonomy-­‐specific?  

–  op.miza.on,  adap.ve  and  learning    algorithms   –  example:  separa.on  assurance,  ACAS  X    

§  need  for  advanced  tes.ng  infrastructures  

–  test  case  genera.on  for  stress-­‐tes.ng  and  requirements  coverage   –  examples:  ACAS  X,  separa.on  assurance,  autonomous  vehicles    

§  V&V  at  run.me,  including  requirements  

–  ACAS  X  (error  predic.on  with  sta.s.cal  learning)   –  separa.on  assurance  

§  trust  

–  extensive  verifica.on  

–  explana.on  of  decision-­‐making  algorithms  

(34)

1.  Dimitra  Giannakopoulou,  David  H.  Bushnell,  Johann  Schumann,  Heinz  Erzberger,   Karen  Heere:  Formal  tes.ng  for  separa.on  assurance.  Ann.  Math.  Ar.f.  Intell.   63(1):  5-­‐30  (2011)  

 

2.  Dimitra  Giannakopoulou,  Falk  Howar,  Malte  Isberner,  Todd  Lauderdale,  Zvonimir   Rakamaric,  Vishwanath  Raman:  Taming  test  inputs  for  separa.on  assurance.  ASE   2014.  

 

3.  Marko  Dimjasevic,  Dimitra  Giannakopoulou:  Test-­‐Case  Genera.on  for  Run.me  

Analysis  and  Vice  Versa:  Verifica.on  of  Aircra8  Separa.on  Assurance.  ISSTA2015.   4.  Chris.an  von  Essen,  Dimitra  Giannakopoulou:  Probabilis.c  verifica.on  and  

synthesis  of  the  next  genera.on  airborne  collision  avoidance  system.  STTT  18(2):   227-­‐243  (2016).  Extended  version  of  TACAS  2014  paper  awarded  ETAPS 2014 EASST best paper.

 

5.  Dimitra  Giannakopoulou,  Dennis  Guck,  Johann  Schumann:  Exploring  Model  

Quality  for  ACAS  X.  FM  2016.  

References

Related documents

Similarly these normalized rank-1 CP matrices together with the normalized extremely bad matrices constitute the extreme points of ( 23 ).. We prove

If breastfeeding by itself doesn’t effectively remove the thickened inspissated milk, then manual expression of the milk, or the use of an efficient breast pump after feeds will

Priest: Almighty God have mercy on you, forgive you all your sins through our Lord Jesus Christ, strengthen you in all goodness, and by the power of the Holy Spirit keep you

Based on 4 months of qualitative fieldwork in Myanmar in 2017 –2018, this article explores how civil society organisations, international non- governmental organisations,

Particulars Dr. Land was Revalued by Rs.10000, effect of which was not given in books of accounts. Sundry Creditors Bills Payable Bank Overdraft Prov. During the year ended

Beginning upon the Letter Effective Date and terminating at the expiration of thirty (30) days after Seller’s delivery to Buyer of the Letter Inspection Documents, Buyer,

Cannabis to describe term used to spell it uses the terms are described as policy in the success or system.. In

The 17-item Natsal-SF is positively associated with the Female Sexual Function Index-6 (B = 0.572) and Brief Sexual Function Questionnaire for men (B = 0.705); it can