5 Reachability-Time Games
5.1. Introduction
5.1.1. Definition
DEFINITION 5.1.1 (Reachability-Time Games). A reachability-time game on a timed automaton is a tuple $(\Gamma, \mathrm{RT}_{\mathrm{Min}}, \mathrm{RT}_{\mathrm{Max}})$, where:
– $\Gamma = (\mathcal{T}, L_{\mathrm{Min}}, L_{\mathrm{Max}})$ is a timed game automaton such that $\mathcal{T} = (L, C, S, A, E, \delta, \xi, F)$ is a timed automaton, $L_{\mathrm{Min}}$ is the set of locations controlled by player Min, and $L_{\mathrm{Max}}$ is the set of locations controlled by player Max;
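– $\mathrm{RT}_{\mathrm{Min}} : \mathrm{Runs} \to \mathbb{R}$ and $\mathrm{RT}_{\mathrm{Max}} : \mathrm{Runs} \to \mathbb{R}$ are payoff functions, which for every run of the timed automaton return the amount that player Min loses and the amount that player Max wins, respectively. The functions $\mathrm{RT}_{\mathrm{Min}}$ and $\mathrm{RT}_{\mathrm{Max}}$ are defined in the following way: for a run $r = \langle s_0, (a_1, t_1), s_1, (a_2, t_2), \ldots \rangle \in \mathrm{Runs}$ we have
\[
\mathrm{RT}_{\mathrm{Min}}(r) = \mathrm{RT}_{\mathrm{Max}}(r) =
\begin{cases}
\sum_{i=1}^{\mathrm{Stop}(r)} t_i & \text{if } \mathrm{Stop}(r) < \infty, \\
\infty & \text{otherwise.}
\end{cases}
\]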
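Since the functions $\mathrm{RT}_{\mathrm{Min}}$ and $\mathrm{RT}_{\mathrm{Max}}$ are equal, we write $\mathrm{RT} : \mathrm{Runs} \to \mathbb{R}$ for this function.
We define $Q_{\mathrm{Min}} = \{(\ell, \nu) \in Q : \ell \in L_{\mathrm{Min}}\}$, $Q_{\mathrm{Max}} = Q \setminus Q_{\mathrm{Min}}$, $S_{\mathrm{Min}} = S \cap Q_{\mathrm{Min}}$, $S_{\mathrm{Max}} = S \setminus S_{\mathrm{Min}}$, $\mathcal{R}_{\mathrm{Min}} = \{[s] : s \in Q_{\mathrm{Min}}\}$, and $\mathcal{R}_{\mathrm{Max}} = \mathcal{R} \setminus \mathcal{R}_{\mathrm{Min}}$.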
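The strategies of player Min and player Max are defined as usual (see Section 3.4.2). We write $\Sigma_{\mathrm{Min}}$ for the set of strategies for player Min, and $\Sigma_{\mathrm{Max}}$ for the set of strategies for player Max. We write $\Pi_{\mathrm{Min}}$ and $\Pi_{\mathrm{Max}}$ for the sets of positional strategies for player Min and for player Max, respectively. The reachability-time payoff function $\mathrm{RT} : \mathrm{Runs} \to \mathbb{R}$ naturally gives rise to the function $\mathrm{RT} : S \times \Sigma_{\mathrm{Min}} \times \Sigma_{\mathrm{Max}} \to \mathbb{R}$ in the following way. For strategies $\mu \in \Sigma_{\mathrm{Min}}$ and $\chi \in \Sigma_{\mathrm{Max}}$ of the respective players and a state $s \in S$ we have
\[
\mathrm{RT}(s, \mu, \chi) = \mathrm{RT}(\mathrm{Run}(s, \mu, \chi)).
\]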
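As a small illustration of the payoff definition, the following sketch computes RT for a finite prefix of a run. The encoding of a timed action as a pair `(action_label, delay)`, the function name `reachability_time`, and the use of `None` for Stop(r) = ∞ are assumptions made for this example only; they are not part of the formal development.

```python
import math
from typing import Optional, Sequence, Tuple

# Hypothetical encoding: a timed action is a pair (action label, delay).
TimedAction = Tuple[str, float]


def reachability_time(timed_actions: Sequence[TimedAction],
                      stop: Optional[int]) -> float:
    """Return RT(r) for a run whose first timed actions are `timed_actions`.

    `stop` plays the role of Stop(r): the index of the first final state
    on the run, or None if no final state is ever reached.
    """
    if stop is None:
        return math.inf
    if stop > len(timed_actions):
        raise ValueError("prefix too short to reach the stopping index")
    # Sum the delays t_1, ..., t_Stop(r); the action labels do not matter.
    return sum(delay for _, delay in timed_actions[:stop])


run = [("a", 0.5), ("b", 1.25), ("a", 2.0)]
print(reachability_time(run, 2))     # 1.75
print(reachability_time(run, None))  # inf
```

Only the delays up to the first visit to a final state contribute, so the third timed action of `run` is ignored when `stop` is 2.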
5.1.2. Value of Reachability-Time Game
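If player Min uses the strategy $\mu \in \Sigma_{\mathrm{Min}}$ and player Max uses the strategy $\chi \in \Sigma_{\mathrm{Max}}$ then player Min loses the value $\mathrm{RT}(s, \mu, \chi)$ and player Max wins the value $\mathrm{RT}(s, \mu, \chi)$. In a reachability-time game player Min is interested in minimising the value she loses and player Max is interested in maximising the value he wins. We define the upper value $\overline{\mathrm{Val}}(s)$ and the lower value $\underline{\mathrm{Val}}(s)$ of the reachability-time game at the state $s \in S$ by
\[
\overline{\mathrm{Val}}(s) = \inf_{\mu \in \Sigma_{\mathrm{Min}}} \sup_{\chi \in \Sigma_{\mathrm{Max}}} \mathrm{RT}(\mathrm{Run}(s, \mu, \chi)),
\quad \text{and} \quad
\underline{\mathrm{Val}}(s) = \sup_{\chi \in \Sigma_{\mathrm{Max}}} \inf_{\mu \in \Sigma_{\mathrm{Min}}} \mathrm{RT}(\mathrm{Run}(s, \mu, \chi)).
\]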
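From Proposition 1.2.4 the inequality $\underline{\mathrm{Val}}(s) \le \overline{\mathrm{Val}}(s)$ always holds. A reachability-time game is determined if for every $s \in S$, the lower and upper values at $s$ are equal to each other; then we say that the value $\mathrm{Val}(s)$ exists and $\mathrm{Val}(s) = \underline{\mathrm{Val}}(s) = \overline{\mathrm{Val}}(s)$.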
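The relation between the lower and upper values can be illustrated on a finite stand-in for the game. In the sketch below, each row of a payoff matrix represents a hypothetical strategy of player Min and each column a hypothetical strategy of player Max, so that inf and sup become `min` and `max`. The matrices and function names are invented for illustration and do not come from a timed automaton.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def upper_value(payoff: Matrix) -> float:
    """inf over Min's rows of the sup over Max's columns."""
    return min(max(row) for row in payoff)


def lower_value(payoff: Matrix) -> float:
    """sup over Max's columns of the inf over Min's rows."""
    return max(min(column) for column in zip(*payoff))


# Lower and upper values differ: this finite game is not determined.
gap = [[3.0, 1.0],
       [0.0, 2.0]]
print(lower_value(gap), upper_value(gap))        # 1.0 2.0

# The entry 3.0 is largest in its row and smallest in its column,
# so both values coincide and the value exists.
saddle = [[2.0, 3.0],
          [1.0, 4.0]]
print(lower_value(saddle), upper_value(saddle))  # 3.0 3.0
```

In both matrices the lower value does not exceed the upper value, in line with the inequality above; equality is what determinacy adds.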
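For strategies $\mu \in \Sigma_{\mathrm{Min}}$ and $\chi \in \Sigma_{\mathrm{Max}}$, we define
\[
\mathrm{Val}^{\mu}(s) = \sup_{\chi \in \Sigma_{\mathrm{Max}}} \mathrm{RT}(\mathrm{Run}(s, \mu, \chi)),
\quad \text{and} \quad
\mathrm{Val}^{\chi}(s) = \inf_{\mu \in \Sigma_{\mathrm{Min}}} \mathrm{RT}(\mathrm{Run}(s, \mu, \chi)).
\]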
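We say that a strategy $\mu \in \Sigma_{\mathrm{Min}}$ or $\chi \in \Sigma_{\mathrm{Max}}$, respectively, is optimal if for every $s \in S$, we have $\mathrm{Val}^{\mu}(s) = \mathrm{Val}(s)$ or $\mathrm{Val}^{\chi}(s) = \mathrm{Val}(s)$, respectively. For an $\varepsilon > 0$, we say that a strategy $\mu \in \Sigma_{\mathrm{Min}}$ or $\chi \in \Sigma_{\mathrm{Max}}$ is $\varepsilon$-optimal if for every $s \in S$, we have $\mathrm{Val}^{\mu}(s) \le \mathrm{Val}(s) + \varepsilon$ or $\mathrm{Val}^{\chi}(s) \ge \mathrm{Val}(s) - \varepsilon$, respectively. Note that if a game is determined then for every $\varepsilon > 0$, both players have $\varepsilon$-optimal strategies.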
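We say that a reachability-time game is positionally determined if for every $s \in S$, we have
\[
\inf_{\mu \in \Pi_{\mathrm{Min}}} \sup_{\chi \in \Sigma_{\mathrm{Max}}} \mathrm{RT}(\mathrm{Run}(s, \mu, \chi))
= \mathrm{Val}(s) =
\sup_{\chi \in \Pi_{\mathrm{Max}}} \inf_{\mu \in \Sigma_{\mathrm{Min}}} \mathrm{RT}(\mathrm{Run}(s, \mu, \chi)).
\]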
Note that if the reachability-time game is positionally determined then for every $\varepsilon > 0$, both players have positional $\varepsilon$-optimal strategies. […] and Theorem 5.4.12) yield a constructive proof of the following fundamental result for reachability-time games.
THEOREM 5.1.2 (Positional determinacy). Reachability-time games are positionally determined.
5.1.3. Optimality Equations
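Let $\Gamma$ be a timed game automaton, and let $T : S \to \mathbb{R}$ and $D : S \to \mathbb{N}$. We write $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma)$, and we say that $(T, D)$ is a solution of optimality equations $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma)$, if for all $s \in S$, we have:
– if $D(s) = \infty$ then $T(s) = \infty$; and
– if $s \in F$ then $(T(s), D(s)) = (0, 0)$; and
– if $s \in S_{\mathrm{Min}} \setminus F$ then
\[
T(s) = \inf_{a,t} \{ t + T(s') : s \xrightarrow{a}_{t} s' \}, \quad \text{and}
\]
\[
D(s) = \min \bigl\{ 1 + d' : T(s) = \inf_{a,t} \{ t + T(s') : s \xrightarrow{a}_{t} s' \text{ and } D(s') = d' \} \bigr\};
\]
– if $s \in S_{\mathrm{Max}} \setminus F$ then
\[
T(s) = \sup_{a,t} \{ t + T(s') : s \xrightarrow{a}_{t} s' \}, \quad \text{and}
\]
\[
D(s) = \max \bigl\{ 1 + d' : T(s) = \sup_{a,t} \{ t + T(s') : s \xrightarrow{a}_{t} s' \text{ and } D(s') = d' \} \bigr\}.
\]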
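To see how the four conditions interact, the following sketch checks a candidate pair (T, D) against the optimality equations on a small finite graph. This is a simplification and an assumption of the example: in a timed game automaton the infimum and supremum range over uncountably many timed actions, whereas here each state has finitely many moves, so they reduce to `min` and `max`. The state names, the edge encoding `(delay, successor)`, and the function `satisfies_oe` are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple

Edge = Tuple[float, str]  # (delay t, successor s'); hypothetical encoding


def satisfies_oe(edges: Dict[str, List[Edge]], min_states: Set[str],
                 final: Set[str], T: Dict[str, float],
                 D: Dict[str, float]) -> bool:
    """Check a candidate (T, D) on a finite graph.

    Assumes every non-final state has at least one outgoing edge, and uses
    exact comparison, which is adequate for exactly representable delays.
    """
    for s in edges:
        if D[s] == math.inf and T[s] != math.inf:
            return False
        if s in final:
            if (T[s], D[s]) != (0, 0):
                return False
            continue
        # Cost t + T(s') and distance D(s') of every available move.
        moves = [(t + T[s2], D[s2]) for t, s2 in edges[s]]
        pick = min if s in min_states else max
        best = pick(cost for cost, _ in moves)
        if best != T[s]:
            return False
        # D(s) is 1 + d', optimised over the moves that attain T(s).
        if D[s] != pick(1 + d for cost, d in moves if cost == best):
            return False
    return True


edges = {
    "m": [(1.0, "f"), (0.25, "x")],  # controlled by Min
    "x": [(0.5, "f"), (2.0, "m")],   # controlled by Max
    "f": [],                         # final state
}
T = {"m": 1.0, "x": 3.0, "f": 0.0}
D = {"m": 1, "x": 2, "f": 0}
print(satisfies_oe(edges, {"m"}, {"f"}, T, D))  # True
```

In this example Min moves from `m` straight to the final state, while Max at `x` prefers the detour through `m`, which costs 2.0 + 1.0 = 3.0 rather than 0.5.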
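LEMMA 5.1.3 ($\varepsilon$-Optimal strategies from optimality equations). If $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma)$, then for all $s \in S$, we have $\mathrm{Val}(s) = T(s)$ and for every $\varepsilon > 0$, both players have positional $\varepsilon$-optimal strategies.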
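PROOF. We show that for every $\varepsilon > 0$, there exists a positional strategy $\mu_{\varepsilon} : S_{\mathrm{Min}} \to A \times \mathbb{R}_{\oplus}$ for player Min, such that for every strategy $\chi$ for player Max, if $s \in S$ is such that $D(s) < \infty$, then we have $\mathrm{RT}(\mathrm{Run}(s, \mu_{\varepsilon}, \chi)) \le T(s) + \varepsilon$. The proof that for every $\varepsilon > 0$ there exists a positional strategy $\chi_{\varepsilon} : S_{\mathrm{Max}} \to A \times \mathbb{R}_{\oplus}$ for player Max, such that for every strategy $\mu$ for player Min, if $s \in S$ is such that $D(s) < \infty$ then we have $\mathrm{RT}(\mathrm{Run}(s, \mu, \chi_{\varepsilon})) \ge T(s) - \varepsilon$, is similar and omitted. The proof that if $D(s) = \infty$ then player Max has a strategy to prevent ever reaching a final state is routine and omitted as well. Together, these facts imply that $T$ is equal to the value function of the reachability-time game, and that the positional strategies $\mu_{\varepsilon}$ and $\chi_{\varepsilon}$, defined in the proof below for all $\varepsilon > 0$, are $\varepsilon$-optimal.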
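For $\varepsilon' > 0$, $T : S \to \mathbb{R}$, and $s \in S_{\mathrm{Min}} \setminus F$, we say that a timed action $(a, t) \in A \times \mathbb{R}_{\oplus}$ is $\varepsilon'$-optimal for $(T, D)$ in $s$ if $s \xrightarrow{a}_{t} s'$, and
\[
D(s') \le D(s) - 1, \quad \text{and} \tag{5.1.1}
\]
\[
t + T(s') \le T(s) + \varepsilon'. \tag{5.1.2}
\]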
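Observe that for every state $s \in S_{\mathrm{Min}} \setminus F$ and for every $\varepsilon' > 0$, there is an $\varepsilon'$-optimal timed action for $(T, D)$ in $s$ because $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma)$. Moreover, we have that for every $s \in S_{\mathrm{Max}} \setminus F$ and timed action $(a, t)$, such that $s \xrightarrow{a}_{t} s'$, we have
\[
D(s') \le D(s) - 1, \quad \text{and} \tag{5.1.3}
\]
\[
t + T(s') \le T(s). \tag{5.1.4}
\]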
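Let $\varepsilon > 0$; we define $\mu_{\varepsilon} : S_{\mathrm{Min}} \to A \times \mathbb{R}_{\oplus}$ by setting $\mu_{\varepsilon}(s)$, for every $s \in S_{\mathrm{Min}}$, to be a timed action which is $\varepsilon'(s)$-optimal for $(T, D)$ in $s$, where $\varepsilon'(s) > 0$ is sufficiently small (to be determined later). Let $\chi$ be an arbitrary strategy for player Max and let $r = \mathrm{Run}(s, \mu_{\varepsilon}, \chi) = \langle s_0, (a_1, t_1), s_1, (a_2, t_2), \ldots \rangle$. Let $N = \mathrm{Stop}(r)$. Our goal is to prove that $\mathrm{RT}(r) \le T(s) + \varepsilon$, i.e., that $T(s) \ge \sum_{k=1}^{N} t_k - \varepsilon$.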
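For every state $s \in S$ such that $D(s) < \infty$, define $\varepsilon'(s) = \varepsilon \cdot 2^{-D(s)}$. Note that if we add the left- and right-hand sides of the inequalities (5.1.2) or (5.1.4), respectively, for all states $s_i$, and $\varepsilon'(s_i)$-optimal timed actions $\mu_{\varepsilon}(s_i)$ if $s_i \in S_{\mathrm{Min}}$, where $i = 0, 1, \ldots, N-1$, then we get
\[
T(s) = T(s_0) \ge \sum_{k=1}^{N} t_k - \sum_{k=0}^{N-1} \varepsilon'(s_k) \ge \sum_{k=1}^{N} t_k - \varepsilon.
\]
The first inequality holds by $T(s_N) = T(s_{\mathrm{Stop}(r)}) = 0$, and the second inequality holds because
\[
\sum_{k=0}^{N-1} \varepsilon'(s_k) = \sum_{k=0}^{N-1} \bigl( \varepsilon \cdot 2^{-D(s_k)} \bigr) \le \varepsilon \cdot \sum_{d=1}^{\infty} 2^{-d} \le \varepsilon,
\]
where the first inequality follows by (5.1.1) and (5.1.3): $D$ strictly decreases along $r$, so the exponents $D(s_k)$, for $k < N$, are pairwise distinct positive integers.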
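The summation step of the proof can be spelled out as a telescoping sum. The derivation below is a sketch using the notation of the proof. It makes two added assumptions: the values $T(s_k)$ are taken to be finite, and the slack $\varepsilon'(s_k)$ is also charged at Max states, where (5.1.4) holds without slack, which only weakens the bound.

```latex
% Per-step bound for k = 0, ..., N-1:
% (5.1.2) if s_k is a Min state, (5.1.4) if s_k is a Max state.
\[
t_{k+1} + T(s_{k+1}) \le T(s_k) + \varepsilon'(s_k)
\]
% Rearranging and summing over k, the T-terms telescope.
\[
T(s_0) - T(s_N)
  = \sum_{k=0}^{N-1} \bigl( T(s_k) - T(s_{k+1}) \bigr)
  \ge \sum_{k=1}^{N} t_k - \sum_{k=0}^{N-1} \varepsilon'(s_k)
\]
% Since s_N is final, T(s_N) = 0, which gives the first inequality of the proof.
```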
It may be worth noting that if the finite values of the function $D$ are bounded, i.e., if $B < \infty$, where $B = \sup_{s \in S} \{ D(s) : D(s) < \infty \}$, then in the above proof it is sufficient to define $\varepsilon'(s) = \varepsilon / B$ for all $s \in S$, which gives arguably more realistic, “physically implementable” $\varepsilon$-optimal strategies.
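The two choices of tolerance can be compared numerically. The sketch below assumes a run along which $D$ takes the strictly decreasing values 4, 3, 2, 1 before a final state is reached, with $\varepsilon = 1$ and $B = 4$; these numbers and the function names are invented for illustration. The uniform choice works because $D$ drops by at least one per step, so a run from a state with finite $D$ visits at most $B$ non-final states.

```python
from typing import List, Sequence


def exponential_budget(eps: float, depths: Sequence[int]) -> List[float]:
    """Tolerances eps * 2^(-D(s_k)) along a run."""
    return [eps * 2.0 ** (-d) for d in depths]


def uniform_budget(eps: float, depths: Sequence[int], bound: int) -> List[float]:
    """Tolerances eps / B along a run, where B bounds the finite values of D."""
    return [eps / bound for _ in depths]


depths = [4, 3, 2, 1]  # hypothetical values D(s_0), ..., D(s_{N-1})
exp_tol = exponential_budget(1.0, depths)
uni_tol = uniform_budget(1.0, depths, bound=4)

print(sum(exp_tol), min(exp_tol))  # 0.9375 0.0625
print(sum(uni_tol), min(uni_tol))  # 1.0 0.25
```

Both budgets stay within $\varepsilon$ in total, but the smallest tolerance required of player Min is 0.25 under the uniform choice, compared with 0.0625 under the exponential one, and the gap widens exponentially as $D$ grows.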