3.2 Probabilistic Graph Programming
3.2.2 Existence of a Markov Chain
In this section, we describe how a P-GP 2 program can be interpreted in the context of a Markov chain. We assume a discrete time model for P-GP 2 as we are only concerned with the step-wise operation of a graph program, rather than a specific modelling domain.
Under this assumption, a rule-set applied to a graph using probabilistic syntax induces a first-order Markov chain. A Markov chain is a model in probability theory where there are transitions between states in a countable set S occurring with fixed probabilities [173, 202]. It can be viewed as a Markov process (see Definition 8) over a discrete, countable state space.
Definition 8. (Markov process) [173, 202].
A Markov process is a stochastic process $X = (X_0, X_1, X_2, \ldots, X_n)$ consisting of a sequence of random variables where, for each random variable $X_i$ at time $i$, all future states are conditionally dependent on the current state and independent of previous states:
\[ \Pr(X_{i+1} = x \mid X_0 = x_0, X_1 = x_1, X_2 = x_2, \ldots, X_i = x_i) = \Pr(X_{i+1} = x \mid X_i = x_i). \tag{3.11} \]
Definition 9. (Markov chain) [173, 202].
A Markov chain is a Markov process $X = (X_0, X_1, X_2, \ldots, X_n)$ on a countable state space $S$, such that each random variable $X_i$ at time $i$ is a probability distribution over $S$.
Fixed probabilities mean that the probability of transitioning from one state to another depends only on the current state. The transition probabilities can be represented as a $|S| \times |S|$ transition matrix $Q$ where, for any two states $s, s' \in S$, $Q(s, s')$ is the probability of transitioning from state $s$ to state $s'$. The behaviour of the process can then be simulated by repeatedly multiplying the initial distribution $X_0$, a vector of size $|S|$ describing a probability distribution over $S$, by $Q$; after $n$ steps this gives the vector $X_n$ containing as elements the probabilities $X_n(s)$ of being in a state $s \in S$. If $S$ is countable but infinite, there may be no natural representation for $Q$.
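The repeated multiplication described above can be sketched for a finite state space as follows. This is a minimal illustration only: the three-state matrix is a made-up example, and the helper names `step` and `distribution_after` are hypothetical, not part of P-GP 2.

```python
def step(dist, Q):
    """One step of the chain: the row vector `dist` multiplied by the matrix Q."""
    n = len(Q)
    return [sum(dist[s] * Q[s][t] for s in range(n)) for t in range(n)]


def distribution_after(x0, Q, n):
    """Distribution X_n obtained by multiplying X_0 by Q a total of n times."""
    dist = list(x0)
    for _ in range(n):
        dist = step(dist, Q)
    return dist


# Hypothetical 3-state chain (each row sums to 1); state 2 is absorbing.
Q = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
x0 = [1.0, 0.0, 0.0]  # all probability mass on state 0
x2 = distribution_after(x0, Q, 2)
print(x2)  # [0.25, 0.5, 0.25]
```

Plain lists keep the sketch dependency-free; it only applies when S is finite and small enough to enumerate.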
For a rule-set R applied probabilistically to graph G, the induced Markov chain’s state space S is the set of all graphs reachable by repeatedly applying R to G, given by
S = {H | G ⇒∗ H}. (3.12)
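When the reachable set happens to be finite, it can in principle be enumerated by a breadth-first search from G. The sketch below assumes a hypothetical `successors` function that returns the graphs obtainable in one step, already reduced to one hashable representative per isomorphism class; how such representatives are computed is not addressed here. The `limit` parameter is also an assumption of this sketch, added because S may be infinite.

```python
from collections import deque


def reachable_states(start, successors, limit=10_000):
    """Enumerate every state reachable from `start` (including `start` itself)."""
    seen = {start}
    queue = deque([start])
    while queue:
        if len(seen) > limit:
            raise RuntimeError("state space exceeds limit; it may be infinite")
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Toy stand-in for a rule-set: a state is the number of unmarked nodes,
# and the only (hypothetical) rule marks one node.
def toy_successors(k):
    return [k - 1] if k > 0 else []


S = reachable_states(3, toy_successors)
print(sorted(S))  # [0, 1, 2, 3]
```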
For any probabilistic call to rule-set $R$ and input graph $G$, the implied state space must be a subset of the set $\mathcal{G}$ of all possible host graphs considered up to isomorphism: $S \subset \mathcal{G}$. As $\mathcal{G}$ is countable, $S$ must always be countable. The induced transition matrix $Q$ is defined according to the possible transitions between pairs of graphs $A, B \in S$ and their associated fixed probabilities, specified by $P_{A^R}$:
\[ Q(A, B) = \sum_{(r, g) \in A^R \,\mid\, A \Rightarrow_{r,g} B',\; B' \cong B} P_{A^R}(r, g). \tag{3.13} \]
Informally speaking, the transition matrix entry for the transition between graphs $A$ and $B$ is the total probability of $A$ being transformed into $B$ in a single step by probabilistically executing $R$ on $A$ using any of the matches in $A^R$.
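The summation over matches with isomorphic results can be illustrated with a small sketch. It assumes that each match has already been paired with its probability and with a label standing in for the isomorphism class of the graph it produces; the function name `transition_row` and the labels "B" and "C" are hypothetical.

```python
from collections import defaultdict


def transition_row(matches):
    """Row Q(A, .) built from the matches available in graph A.

    `matches` is a list of (probability, result_label) pairs, one per match
    (r, g); the label stands in for the isomorphism class of the result.
    """
    row = defaultdict(float)
    for prob, result in matches:
        row[result] += prob  # matches with isomorphic results are summed
    return dict(row)


# Hypothetical graph A with three matches, two of which give isomorphic results.
matches_in_A = [(0.25, "B"), (0.25, "B"), (0.5, "C")]
row = transition_row(matches_in_A)
print(row)  # {'B': 0.5, 'C': 0.5}
```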
The initial distribution $X_0$ is a trivial case: the probability of being in the initial state $G$, the host graph, when $R$ is called is 1. This means that the initial distribution is defined, for any graph $G' \in S$, as
\[ X_0[G'] = \begin{cases} 1 & \text{if } G' = G, \\ 0 & \text{otherwise.} \end{cases} \tag{3.14} \]
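As a quick check of how this initial distribution interacts with the transition matrix, the following worked step may help. It is a sketch using only the symbols defined above, with $X_1$ denoting the distribution after one step: the first step of the chain reads off the row of $Q$ belonging to $G$.

```latex
\[
  X_1[B] \;=\; \sum_{A \in S} X_0[A]\, Q(A, B) \;=\; Q(G, B)
  \qquad \text{for every } B \in S .
\]
```

Every term with $A \neq G$ vanishes because $X_0[A] = 0$ for those graphs.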
In special cases, it may be possible to consider the transition matrix Q explicitly for a probabilistic rule-set call and find the probabilities of its resultant graph accordingly, but more generally the input graph to a program is not known before run-time, preventing pre-computation of the state space S and therefore Q. In this case, a step-wise execution of a probabilistic rule-set call to produce a result graph can be seen as sampling from the Markov chain induced by the rule-set and host graph. The execution of a single probabilistic rule-set call in P-GP 2 corresponds to a single step of the induced Markov chain, whereas the as-long-as-possible call R! corresponds to simulation of the induced Markov chain until some absorbing state is reached (see [173] for more information).
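The distinction between a single probabilistic call and an as-long-as-possible call can be sketched as sampling code. This is an illustration under stated assumptions, not the P-GP 2 implementation: `transitions` is a hypothetical function returning the weighted successors of a state, a state with no successors is treated as absorbing, and `max_steps` is a safeguard added for this sketch.

```python
import random


def sample_step(state, transitions, rng):
    """One step of the chain; returns None when no transition is available."""
    options = transitions(state)
    if not options:
        return None
    states, probs = zip(*options)
    return rng.choices(states, weights=probs, k=1)[0]


def run_as_long_as_possible(state, transitions, rng, max_steps=10_000):
    """Sample steps until a state with no outgoing transition is reached."""
    for _ in range(max_steps):
        nxt = sample_step(state, transitions, rng)
        if nxt is None:
            return state
        state = nxt
    raise RuntimeError("no absorbing state reached within max_steps")


# Toy chain on integers: from k >= 2 move to k-1 or k-2 with equal probability.
def toy_transitions(k):
    if k >= 2:
        return [(k - 1, 0.5), (k - 2, 0.5)]
    if k == 1:
        return [(0, 1.0)]
    return []  # 0 has no outgoing transition


final = run_as_long_as_possible(5, toy_transitions, random.Random(0))
print(final)  # 0
```

Passing an explicit `random.Random` instance keeps individual runs reproducible, which is convenient when comparing sampled results.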
More generally we can consider P-GP 2 programs, rather than single probabilistic rule-set calls. The following sufficient conditions can be used to characterise a P-GP 2 program’s behaviour:
1. If a program is terminating and all rule-sets called by the program are (a) called as long as possible, and (b) confluent, then the program is deterministic.
2. If all rule-sets called by the program are either (a) called probabilistically, or (b) confluent and called as long as possible, then the program forms a Markov chain. The deterministic sub-components of the program may be treated as part of the probabilistic transitions of some previous probabilistic step.
If some rule-sets called by the program are called probabilistically but other rule-sets that are not confluent are called non-deterministically, then the program forms a Markov Decision Process (see [202]), with non-deterministic sub-components executed according to the implementation of the compiler. If there are no probabilistic rule-set calls in the program and some rule-sets that are not confluent are called non-deterministically, the program is in general non-deterministic.
To see that these conditions are sufficient, but not necessary:
1. Consider a program where there are non-confluent rule-sets called non-deterministically, but before each such rule-set call, a confluent rule-set is applied as long as possible which prevents any possible critical pairs of the non-confluent rule-set. Then the program is deterministic despite not meeting the above condition.
2. Consider a program with a loop (r1; [r2])!. Then there are examples of r1, r2 where the loop induces a Markov chain when considering resultant graphs up to isomorphism despite containing a non-deterministic rule-set call (the single call to r1) which is not executed as long as possible. See Figure 3.9 for such an example.
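The sufficient conditions and the two remaining cases described in this section can be summarised as a small decision procedure. The sketch below is illustrative only: the `Call` record and its three boolean fields are hypothetical, confluence and termination are taken as given inputs rather than checked, and the result reports only what the stated conditions guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical summary of one rule-set call in a program."""
    probabilistic: bool
    as_long_as_possible: bool
    confluent: bool


def classify(calls, terminating):
    """Classification guaranteed by the conditions stated in this section."""
    if terminating and all(c.as_long_as_possible and c.confluent for c in calls):
        return "deterministic"
    if all(c.probabilistic or (c.confluent and c.as_long_as_possible) for c in calls):
        return "Markov chain"
    any_probabilistic = any(c.probabilistic for c in calls)
    any_nonconfluent_nondet = any(
        not c.probabilistic and not c.confluent for c in calls
    )
    if any_nonconfluent_nondet:
        if any_probabilistic:
            return "Markov decision process"
        return "non-deterministic in general"
    return "not covered by the stated conditions"


program = [
    Call(probabilistic=True, as_long_as_possible=False, confluent=False),
    Call(probabilistic=False, as_long_as_possible=True, confluent=True),
]
print(classify(program, terminating=True))  # Markov chain
```

Because the conditions are sufficient but not necessary, a program such as the first counterexample above would be reported as "non-deterministic in general" even though it is in fact deterministic.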