Relating plays in the game and executions in the system

(1) S∩ Q

#₌_{∅ and O ⊆ Obs}

S ,_{→ hS, Oi} (2) S∩ Q

#_{6= ∅ and O ⊆ Obs is passive}

S ,→ hS, Oi (3) β ∈ O ∧ S 0₌_s0_{∈ Q ∪ Q}# _: _{∃s ∈ S s.t. s}₌∗β∗_{⇒ s}0 _{6= ∅} hS, Oi,→ Sβ 0 (4) s∈ S s.t. ActA#(s, O) =∅ hS, Oi ,→ stop(s) (5) s_{∈ S s.t. s ⇑} hS, Oi ,→ div(s) (6) s∈ S s.t. choice O is blocking for s in A

hS, Oi ,→ fail(s)

(7) s∈ S s.t. ActA#(s)6= ∅ and ActA#(s, O) =∅ hS, Oi ,→ fail(s)

Figure 5.1: Edge relation in the complete-information game the vertices_{hS, Oi ∈ V}env. If there exists ans ∈ S with s

∗β∗

=_{⇒ s}0 for someβ _{∈ O, then} there exists an edge_{hS, Oi} ,_{→ S}β 0 _{to some}_S0 _{∈ V}

ctr. Otherwise, no states ∈ S has any

enabled visible actions. If there is a states ∈ S that has no enabled invisible actions as well,ActA#(s, O) = ∅ and there exists an edge hS, Oi ,→ stop(s). Otherwise, all states

s_{∈ S have at least one enabled invisible action. As S is closed under invisible moves and} non-empty, this can only be the case if alls ∈ S are divergent and thus there exist edges hS, Oi ,→ div(s).

Remark 5.5. To simplify the presentation, the edge relation (Figure 5.1) allows the first player to select a choiceO ⊆ Obs, even though it might violate one or more of the admissibility conditions, i.e., wherehS, Oi has an edge to a fail(s) vertex. In practice, when building the reachable fragment of the game graph for the subsequent algorithmic analysis, these choices are ultimately not legal for the first player, as he has to maintain admissibility. These vertices can thus be directly pruned from the game graph when generating the successor vertices for the vertices in_Vctr. As shown in Remark 2.7, there always exists an

admissible choice, therefore eachS ∈ Vctr has at least one outgoing edge even when the

non-admissible choices are pruned.

5.2 Relating plays in the game and executions in the system

A (partial) play ofG = (V, ,→) denotes any finite or infinite sequence η = v0 ,→ v1,→ v2 ,→ . . .

5.2. Relating plays in the game and executions in the system Chapter 5. Game-based synthesis

withvi ∈ V and respecting the edge relation ,→, starting in the initial vertex v0 = [Q0]∗.

Note thatv2i∈ (Vctr∪ Vterm) and v2i+1 ∈ Venvfori > 0. A play is maximal if it is either

finite and ending in a vertex without outgoing edges or is infinite. As only theVterm-vertices

are terminal, the infinite maximal plays have the form η = S0 ,→ hS0, O1i

β1

,→ S1 ,→ hS1, O2i β2

,→ . . ., while the finite maximal plays have the form

η = S0 ,→ hS0, O1i β1

,_{→ . . .},β_{→ S}n n,→ hSn, On+1i ,→ v,

withv _{∈ V}term. We denote bylast(η) the last vertex in a play. The observation of a play

obs(η) denotes the sequence of observations βiof the moveshSi−1, Oii βi

,→ Si. A play for

an observationσ _{∈ Obs}∞is a playη such that obs(η) = σ. Given a decision function d, a d-play for an observation σ = β1β2. . . is a partial play η as above with Oi+1= d(β1. . . βi).

The fundamental properties of the powerset construction can be rephrased for our purposes as follows:

Lemma 5.6. Letd be a decision function that respects suspend (cf. Def. 2.6). Then, for every finite observationσ = β1. . . βnthat is bothd-schedulable andA-schedulable, there

exists a unique partiald-play η = S0,→ hS0, d(ε)i

β1

,→ . . .β,n−1→ hSn−1, d(β1. . . βn−1)i βn

,→ Sn

forσ with Sn ∈ Vctr. Furthermore, for all statess ofA#,s∈ Sniff there exists an initial

d-execution π inA ending in state s|Awithobs(π) = σ and s ∈ Q# ifβn ∈ Obs# and

s∈ Q if βn∈ Obs/ #. Hence,Sn⊆ Q#ifβn∈ Obs#andSn∩ Q#=∅ if βn∈ Obs/ #.

Proof. The proof is by induction on the length of the observation.

First, we consider the empty observation σ = ε. The play η = S0, with S0 = [Q0]∗,

is the uniqued-play for observation ε. By the definition of A#_, _[Q

0]∗ ∩ Q# = ∅. Let

s_{∈ S}0 = [Q0]∗ andq = s|A = s. Then, there exists a state q0 ∈ Q0such thatq0 ∗

=_{⇒ q,} inducing an initialA-execution π = q0

α1

−→ . . . αm

−−→ qm ending inq = qm. As all actions

αiare invisible,αi+1∈ ActA(qi, d(ε)) for all i < m and π is a d-execution for observation

ε ending in state q = s_|A. Vice versa, for every initiald-execution π = q0 α1

−→ . . . αm

−−→ qm

with the empty observationobs(π) = ε, all αiare invisible (i.e.,αi ∈ Act/ vis), and therefore

q0 ∗

=_{⇒ q}mandqm ∈ [Q0]∗.

In the step of induction, we regard an observation of lengthn+1, say σ = σ0βn+1where

σ0 = β1. . . βn. Let η0 = S0 ,→ hS0, d(ε)i β1

,→ S1 ,→ . . . βn

,→ Sn be a d-play for σ0 =

β1. . . βnending in vertexSn. Ifσ0ends with an observableβn∈ Obs#of a suspend action,

Chapter 5. Game-based synthesis 5.2. Relating plays in the game and executions in the system

suspend, i.e.,d(σ0_{) is passive, rule (2) then yields the unique edge S}

n,→ hS, d(σ0)i. In the

other case, ifσ0 _{does not end with}_β

n ∈ Obs# thenSn ∩ Q# = ∅ and rule (1) yields

the unique edgeSn ,→ hSn, d(σ0)i. By applying rule (3), we obtain the unique edge and

successor

hSn, d(σ0)i βn+1

,→ Sn+1

as follows. Asσ is d-schedulable, βn+1∈ d(σ0). It remains to show that Sn+1is nonempty.

There exists, asσ is both d-schedulable and_{A-schedulable, an initial d-execution} π = q0 α1 −→ . . . αm −−→ qm αm+1 −−−→ qm+1

inA such that obs(αm+1) = βn+1andobs(α1. . . αm) = σ0. Let

π#= s0 α1 −→ . . . αm −−→ sm αm+1 −−−→ sm+1

be the corresponding execution in A#_{, with} _q

i = si|A for all i 6 m + 1. As

obs(α1. . . αm) = σ0 it follows that sm ∈ Sn. But thensm ∗βn+1∗

=⇒ sm+1. Hence, Sn+1

is non-empty. We can therefore extendη0to the unique playη for σ with η = S0 ,→ hS0, d(ε)i β1 ,_{→ S}1,→ . . . βn ,_{→ S}n,→ hSn, d(σ0)i βn+1 ,_{→ S}n+1.

Pick some statesn+1∈ Sn+1. Then, there existssn∈ Snsuch thatsn ∗βn+1∗ =⇒ sn+1inA#, inducing an execution π0 = qn αm+1 −−−→ . . . αm+k −−−→ qn+1

from stateqn= sn|Ato stateqn+1 = s|AinA with observation obs(π0) = βn+1. Assn∈

Sn, there exists an initiald-execution π for σ0 inA, ending in state qn. The concatenation

ofπ and π0_{yields an initial}_{d-execution for σ that ends in q}

n+1= sn+1|A.

Suppose now that π = q0 α1 −→ . . . αm −−→ qm αm+1 −−−→ qm+1 αm+2 −−−→ . . . −−−→ qαm+k m+k is a

finite, initiald-execution for σ, with obs(α1. . . αm) = σ0,obs(αm+1) = βn+1andαm+i ∈/

Actvisfor1 < i6 k. Let π#be the correspondingd-execution inA#, with

π#= s0 α1 −→ . . . αm −−→ sm αm+1 −−−→ sm+1 αm+2 −−−→ . . . αm+k −−−→ sm+k

andsi|A = qi for all 0 6 i 6 m + k. As obs(α1. . . αm) = σ0,sm ∈ Sn. Furthermore

sm ∗βn+1∗

=⇒ sm+ifor all16 i 6 k and therefore in particular sm+k∈ Sn+1. Additionally, if

βn+1∈ Obs#thensm+i ∈ Q#, ifβn+1∈ Obs/ #thensm+i ∈ Q for all 1 6 i 6 k.

The unique vertex givend and σ. Given a decision function d that respects suspend and a finite observationσ that is d-schedulable andA-schedulable, we denote by Sd,σ the last

5.2. Relating plays in the game and executions in the system Chapter 5. Game-based synthesis

d-plays ending in _Venv. By Lemma 5.6, for a decision function d that respects suspend

and for each finite observationσ = β1. . . βnthat is bothd-schedulable andA-schedulable,

there exists a uniqued-play

η = S0,→ hS0, d(ε)i β1

,→ . . .,β→ Sn n,→ hSn, d(σ)i

forσ as well that ends in a vertex_hSn, d(σ)i ∈ Venv.

Infinite d-plays. Using the arguments from [RCDH07], we can show a correspondence between the existence of infinited-executions and d-plays:

Lemma 5.7. Letd be a decision function that respects suspend. Then, for every infinite observationσ = β1β2. . . that is d-schedulable there exists a unique maximal infinite d-play

η = S0,→ hS0, d(ε)i β1 ,→ S1 ,→ hS1, d(β1)i β2 ,→ S2 ,→ hS2, d(β2)i β3 ,→ . . .

of alternating_Vctr- andVenv-vertices forσ iff σ isA#-schedulable, i.e., there exists an infi-

nited-execution π inA#_with_{obs(π) = σ.}

Proof.

“⇐=” Given an infinite d-path π with infinite observation obs(π), we obtain η by the re- peated application of Lemma 5.6.

“=_{⇒” Let d-play be a maximal, infinite play} η = S0,→ hS0, d(ε)i β1 ,_{→ S}1 ,→ hS1, d(β1)i β2 ,_{→ S}2 ,→ hS2, d(β2)i β3 ,_{→ . . .} with observationobs(η) = β1β2. . ..

We have to show the existence of an infinited-execution in_A#_with_{obs(π) = obs(η). By}

the construction ofG, all Siare non-empty fori> 0. We construct a directed acyclic graph

G where the nodes have the form hs, ii with i > 0 and s ∈ Si. In addition, we have a

special root noder. The sons of the root are the nodes_{hs, 0i, where s ∈ S}0. The sons of an

inner nodehs, ii are the nodes ht, i + 1i where t_{∈ S}i+1ands

∗βi∗

=_{⇒ t.}

G is finitely branching and infinite, as all Si are nonempty and for all t ∈ Si+1 there

exists an s _{∈ S}i such that s ∗βi∗

=_{⇒ t. By K¨onig’s Lemma, G has an infinite path} hs0, 0i hs1, 1i hs2, 2i . . ., corresponding to an infinite sequence

s0 ∗β1∗ =⇒ s1 ∗β2∗ =⇒ s2 ∗β3∗ =⇒ . . .

Chapter 5. Game-based synthesis 5.2. Relating plays in the game and executions in the system

The reachability of vertices of the form stop(s) and div(s) corresponds to the possibility of finite behavior and divergence (Lemma 5.8) and the fail(s) vertices can be used to exactly characterize the admissible decision functions (Lemma 5.9).

Lemma 5.8 (d-plays, stop(s) and div(s)). Let d be a decision function that respects suspend, σ be a finite observation that is both d-schedulable and _{A-schedulable and let} η = S0 ,→ . . . ,→ Sn ,→ hSn, d(σ)i be the unique finite d-play for σ ending in a vertex in

Venv. Then, for all statess ofA#:

(a) _hSn, d(σ)i ,→ stop(s) with s ∈ Sniff there exists an initial, finited-path

π = q0 α1

−→ . . . αm

−−→ qmwithobs(π) = σ and qm= s|A.

(b) _hSn, d(σ)i ,→ div(s) for some s ∈ Sniff there exists an initial, infinited-path

π = q0 α1

−→ . . . αm

−−→ qm αm+1

−−−→ qm+1. . . with obs(π) = σ, such that qm = s|Aand

αi ∈ Act/ visfor alli > m.

Proof.

ad (a). First, we show that if hSn, d(σ)i ,→ stop(s) then there exists an initial d-path

ending in stateq = s_|A. By Lemma 5.6 and the fact thats ∈ Sn, there exists an initial

d-execution π = q0 α1

−→ . . . αn

−−→ qninA with qn= s|Aandobs(π) = σ. The edge relation

yields thatActA#(s, d(σ)) =∅. In case s ∈ Q,

ActA#(s, d(σ)) = ActA(q, d(σ)) =∅.

In cases_{∈ Q}#_,_{σ ends with an observation in Obs}

#andd(σ) is passive. Therefore,

ActA#(s, d(σ)) = Act_A#(s)∩ Act_ctr= ActA(q)∩ Actctr= ActA(q, d(σ)) =∅.

Hence,π is an initial d-path, as in both cases ActA(q, d(σ)) = ∅. Vice versa, if π is an

initiald-path π inA with obs(π) = σ ending in q = s|A for some state s ofA#, then

s∈ Snby Lemma 5.6. Asπ is a d-path, ActA(q, d(σ)) =∅. Then

∅ = ActA(q, d(σ))⊇ ActA#(s, d(σ)).

Hence, there is the edgehSn, d(σ)i ,→ stop(s).

ad (b). IfhSn, d(σ)i ,→ div(s) then s is divergent in A#, and subsequentlyq = s|AinA

is divergent as well. By Lemma 5.6 and the fact thats_{∈ S}n, we obtain an initiald-execution

π0 = q0 α1

−→ . . . αm

−−→ qm

inA ending in qm = s|A. Becauseqmis divergent, there exists an infinite execution

π00_{= q} m αm+1 −−−→ qm+1 αm+2 −−−→ . . . in A

5.2. Relating plays in the game and executions in the system Chapter 5. Game-based synthesis π = q0 α1 −→ . . . αm −−→ qm αm+1 −−−→ qm+1 αm+2 −−−→ . . .

with αi ∈ Act/ vis for alli > m. As obs(π) = σ, π is an initial d-execution and as π is

infinite,π is an initial d-path in_{A. To prove the other direction, let} π = q0 α1 −→ . . . αm −−→ qm αm+1 −−−→ qm+1. . .

be an initial, infinited-path with obs(π) = σ, such that qm = s|A for some states ofA#

and, for alli > m, αi ∈ Act/ vis. By Lemma 5.6,s∈ Sn. Asqmis divergent inA, s is also

divergent inA#_{. This proves the existence of the edge}_hS

n, d(σ)i ,→ div(s).

We now relate the admissibility ofd (when not considering Fair[A]) with the reachability of fail(s)-vertices.

Lemma 5.9 (Admissibility andd-plays). A decision function d that respects suspend is not admissible with respect to conditions (A1), (A2) and (A3) of Definition 2.6 iff there exists an observationσ that is both d-schedulable and_{A-schedulable such that there is some} s_{∈ Q ∪ Q}#_{and a}_d-play

η = S0 ,→ . . . ,→ Sn,→ hSn, d(σ)i ,→ fail(s)

forσ. Hence, d is admissible iff there is no d-play that reaches a vertex fail(s)_{∈ V}term.

Proof.

“_{⇐=”: By Lemma 5.6 and the fact that s ∈ S}n, there exists an initiald-execution

π = q0 α1

−→ . . . αm

−−→ qminA with qm = s|A.

AshSn, d(σ)i ,→ fail(s), the condition of rule (6) or the condition of rule (7) in Figure 5.1

applies. Let us first consider the case that the edge to the fail(s) vertex is due to rule (6). Note that the transformation from_{A to A}#does not change the enabledness of the uncontrollable actionsActvis\ Actctr, i.e.,

ActA#(s, O)∩ (Act_vis\ Act_ctr) = ActA(qm, O)∩ (Actvis\ Actctr).

Asd(σ) is blocking for s inA#_{it is thus likewise blocking for}_q

minA. The choice d(σ)

thus violates condition (A1) of admissibility for the executionπ.

In the other case, if the edge to the fail(s) vertex arises by rule (7), then the condition ActA#(s) 6= ∅ and Act_A#(s, d(σ)) = ∅ is satisfied. If last(σ) ∈ Obs_# then, by the

construction of_A#,ActA#(s)⊆ Act_vis\ Act_ctrandd(σ) is blocking, i.e., the condition of

rule (6) applies as well. On the other hand, iflast(σ)_{6∈ Obs}#then

s /_{∈ Q}#_,_Act

Chapter 5. Game-based synthesis 5.2. Relating plays in the game and executions in the system

AsActA(qn, d(σ)) =∅ and ActA(qn)6= ∅ the choice d(σ) is not progressive for execution

π and state qn, therefored violates condition (A3) of admissibility.

“=_{⇒”: As d is not admissible but respects suspend, i.e., satisfies condition (A2), there exists} a finite, initiald-execution

π = q0 α1

−→ . . . αm

−−→ qmwithσ = obs(π)

such thatd(σ) violates condition (A1) or (A3) for state qm. Let

η = S0 ,→ hS0, d(ε)i . . . βn

,_{→ S}n,→ hSn, d(σ)i

be the uniqued-play for observation σ ending in_Venv and let s ∈ Sn be the state inA#

such thats_|A = qm. We first consider the case that condition (A1) is violated, i.e.,d(σ) is

blocking for stateqm. As noted above, the transformation ofA to A#does not change the

enabledness of the visible but uncontrollable actions and thusd(σ) is blocking for state s as well, yielding the edgehSn, d(σ)i ,→ fail(s).

On the other hand, if condition (A3) is violated, last(σ) 6∈ Obs# and the choiced(σ) is

not progressive forqm. This is the case ifActA qm, d(σ)

=∅ while ActA(qm)6= ∅. As

last(σ) /_{∈ Obs}#and thuss∈ Q we have:

Act_A#(s) = ActA(qm)6= ∅ and ActA#(s, d(σ)) = ActA(qn, d(σ)) =∅.

Therefore, the condition of rule (7) is satisfied, yielding the edge_hSn, d(σ)i ,→ fail(s).

In both cases we obtain thed-play S0 ,→ hS0, d(ε)i . . .

βn

,→ Sn,→ hSn, d(σ)i ,→ fail(s)

for observationσ ending in fail(s).

The uniqueness of thed-plays for a decision function (Lemma 5.6, Lemma 5.7) allow us to consider decision functions as deterministic game strategies for the complete-information game, i.e., where player 1 has complete information about the history of vertices in the play, in our case about the moves of player2 (the selection of the observable β and thus the Vctr-vertex for infinite plays).

LetΦ be an observation-based objective. We can reformulate Φ as three sets of observations Accinf,AccstopandAccdiv, with

Accinf ={σ ∈ Obsω : there exists a path π∈ Φ with σ = obs(π)},

Accstop={σ ∈ Obs∗ : there exists a finite path π∈ Φ with σ = obs(π)},

5.2. Relating plays in the game and executions in the system Chapter 5. Game-based synthesis

We then say a decision functiond that respects suspend and Fair[_{A], i.e., such that all d-} schedulable observations areFair[A]-schedulable wins an initial, maximal d-play η in G with observationσ = obs(η) for objective Φ iff (1) σ is infinite and σ∈ Accinfor (2)σ is

finite, the last vertex ofη is a stop(s)-vertex and σ_{∈ Acc}stopor (3)σ is finite, the last vertex

ofη is a div(s)-vertex and σ _{∈ Acc}div. We sayd is winning in the complete-information

gameG if all initial, maximal d-plays are winning.

We can now combine the previous lemmas to show the equivalent result to [Rei84, RCDH07] in our setting, that the transformation from the partial-information game to the complete-information game preserves winning.

Lemma 5.10. Letd be a decision function that respects suspend and let Φ be an observation-

In document Compositional Synthesis and Most General Controllers (Page 57-74)