
3.2 Probabilistic Graph Programming

3.2.1 Syntax and Semantics

We present a conservative extension to GP 2, P-GP 2, where a rule-set may be executed probabilistically by using additional syntax. Rules in the set are picked according to probabilities specified by the programmer, while the match of a selected rule is chosen uniformly at random. When the new syntax is not used, a rule-set is treated as non-deterministic and executed as in GP 2’s implementation [17]. This is preferable when executing confluent rule-sets where the discovery of all possible matches is expensive and unnecessary.

To formally describe probabilistic decisions in P-GP 2, we consider the application of a rule-set $R = \{r_1, \dots, r_n\}$ to some host graph $G$. The set of all possible rule-match pairs from $R$ in $G$, denoted by $G^R$, is given by

\[ G^R = \{(r_i, g) \mid r_i \in R \text{ and } G \Rightarrow_{r_i,g} H \text{ for some graph } H\}. \tag{3.1} \]

We make separate decisions for choosing a rule and a match. The first decision is to choose a rule, which is made over the subset of rules in $R$ that have matches in $G$, denoted by $R^G$, given by

\[ R^G = \{r_i \mid r_i \in R \text{ and } G \Rightarrow_{r_i,g} H \text{ for some match } g \text{ and graph } H\}. \tag{3.2} \]

Once a rule $r_i \in R^G$ is chosen, the second decision is to choose a match with which to apply $r_i$. The set of possible matches of $r_i$ in $G$, denoted by $G^{r_i}$, is given by

\[ G^{r_i} = \{g \mid G \Rightarrow_{r_i,g} H \text{ for some graph } H\}. \tag{3.3} \]

We assign a probability distribution (defined below) to $G^R$, which is used to decide particular rule executions. This distribution, denoted by $P_{G^R}$, has to satisfy

\[ P_{G^R} : G^R \to [0, 1], \quad \text{such that} \quad \sum_{(r_i, g) \in G^R} P_{G^R}(r_i, g) = 1, \tag{3.4} \]

where $[0, 1]$ denotes the real-valued (inclusive) interval between 0 and 1.

P-GP 2 allows the programmer to specify $P_{G^R}$ by rule declarations in which the rule can be associated with a real-valued positive weight. This weight is listed in square brackets after the rule’s variable declarations, as shown in Figure 3.1. This syntax is optional and if a rule’s weight is omitted, the weight is 1.0 by default. In the following, we use the notation $w(r)$ for the positive real value associated with any rule $r$ in the program.

grow_loop(n:int) [3.0]

[Diagram of the rule’s left-hand and right-hand graphs; not reproduced here.]

Figure 3.1: A P-GP 2 declaration of a rule with associated weight 3.0. The weight is indicated in square brackets after the variable declaration.

To indicate that the call of a rule-set $\{r_1, \dots, r_n\}$ should be executed probabilistically, the call is written with square brackets:

\[ [r_1, \dots, r_n]. \tag{3.5} \]

This includes the case of a probabilistic call of a single rule $r$, written $[r]$, which ignores any weight associated with $r$ and simply chooses a match for $r$ uniformly at random. Given a probabilistic rule-set call $R = [r_1, \dots, r_n]$, the probability distribution $P_{G^R}$ is defined as follows.

The summed weight of all rules with matches in $G$ is

\[ \sum_{r_x \in R^G} w(r_x), \tag{3.6} \]

and the weighted distribution over rules in $R^G$ assigns to each rule $r_i \in R^G$ the probability

\[ \frac{w(r_i)}{\sum_{r_x \in R^G} w(r_x)}. \tag{3.7} \]

The uniform distribution over the matches of each rule $r_i \in R^G$ assigns the probability $1/|G^{r_i}|$ to each match $g \in G^{r_i}$. This yields the definition of $P_{G^R}$ for all pairs $(r_i, g) \in G^R$ given by

\[ P_{G^R}(r_i, g) = \frac{w(r_i)}{\sum_{r_x \in R^G} w(r_x)} \times \frac{1}{|G^{r_i}|}. \tag{3.8} \]
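As an illustrative check of (3.8), using hypothetical numbers that do not come from the source: suppose exactly two rules have matches in $G$, namely $r_1$ with $w(r_1) = 3.0$ and two matches, and $r_2$ with $w(r_2) = 1.0$ and four matches. Then, for any match $g$ of $r_1$ and any match $g'$ of $r_2$,

\[ P_{G^R}(r_1, g) = \frac{3}{3 + 1} \times \frac{1}{2} = \frac{3}{8}, \qquad P_{G^R}(r_2, g') = \frac{1}{3 + 1} \times \frac{1}{4} = \frac{1}{16}, \]

and the total over all six pairs is $2 \times \tfrac{3}{8} + 4 \times \tfrac{1}{16} = 1$, as required by (3.4).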

In the implementation of P-GP 2, the probability distribution $P_{G^R}$ decides the choice of rule and match for $R = [r_1, \dots, r_n]$ (based on a random-number generator). This is correctly implemented by first choosing an applicable rule $r_i$ according to the weights and then choosing a match for $r_i$ uniformly at random. The set of all matches is computed at run-time using the existing search-plan method described in [15]; this is an implementation decision that is not intrinsic to the design of P-GP 2.
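The two-stage choice can be sketched in Python as follows. This is a minimal illustration and not the P-GP 2 implementation: the function names, the dictionary representation of weights and matches, and the example data are all assumptions made here, and matches are passed in as precomputed lists rather than found by a search plan.

```python
import random
from fractions import Fraction


def pair_distribution(weights, matches):
    """Return the probability of each (rule, match) pair, following Eq. (3.8).

    weights: dict mapping a rule name to its positive weight w(r)
    matches: dict mapping a rule name to the list of its matches in the host graph
    (both representations are hypothetical, chosen for this sketch)
    """
    applicable = [r for r in matches if matches[r]]        # rules with a match
    total = sum(Fraction(weights[r]) for r in applicable)  # summed weight, Eq. (3.6)
    return {
        (r, g): Fraction(weights[r]) / total / len(matches[r])
        for r in applicable
        for g in matches[r]
    }


def choose_rule_and_match(weights, matches, rng=random):
    """Sample a pair in two stages: weighted rule choice, then uniform match choice."""
    applicable = [r for r in matches if matches[r]]
    if not applicable:
        return None  # no rule applies; handling this case is left to the caller
    rule = rng.choices(applicable, weights=[weights[r] for r in applicable])[0]
    return rule, rng.choice(matches[rule])


# Hypothetical example: r3 has no matches, so its weight plays no role.
weights = {"r1": 3.0, "r2": 1.0, "r3": 5.0}
matches = {"r1": ["g1", "g2"], "r2": ["h1", "h2", "h3", "h4"], "r3": []}
dist = pair_distribution(weights, matches)
```

Sampling with `choose_rule_and_match` draws pairs with the probabilities held in `dist`, since the probability of a pair is the product of the rule probability and the uniform match probability.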

⟨Com⟩ ::= ⟨RuleSetCall⟩ | ⟨ProbRuleSetCall⟩ | ⟨GlobalProbRuleSetCall⟩ | ⟨ProcCall⟩
    | if ⟨ComSeq⟩ then ⟨ComSeq⟩ [else ⟨ComSeq⟩]
    | try ⟨ComSeq⟩ [then ⟨ComSeq⟩] [else ⟨ComSeq⟩]
    | ⟨ComSeq⟩ ‘!’
    | ⟨ComSeq⟩ or ⟨ComSeq⟩
    | ( ⟨ComSeq⟩ )
    | break | skip | fail

⟨ProbRuleSetCall⟩ ::= [ RuleId ] | [ [RuleId { , RuleId}] ]

⟨GlobalProbRuleSetCall⟩ ::= [[ RuleId ]] | [[ [RuleId { , RuleId}] ]]

Figure 3.2: The modified abstract syntax of P-GP 2’s programs (see Figure 2.10). ProbRuleSetCall denotes a probabilistic rule-set call and GlobalProbRuleSetCall denotes a global probabilistic rule-set call, each executed as outlined in the text.

We also add special syntax to allow a programmer to specify that a uniform distribution should be used across all matches for all rules of a rule-set. If the programmer uses the double square bracket syntax

\[ [[r_1, \dots, r_n]] \tag{3.9} \]

then we ignore rule weights and instead define $P_{G^R}$ by

\[ P_{G^R}(r_i, g) = \frac{1}{|G^R|}. \tag{3.10} \]

We refer to this as a ‘global’ probabilistic rule-set call.
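A global call can be sketched in Python by pooling all rule-match pairs and choosing one uniformly. As before, this is only an illustration under assumptions made here: the function names and the dictionary of precomputed matches are hypothetical and do not describe the actual implementation.

```python
import random
from fractions import Fraction


def all_pairs(matches):
    """Flatten a dict of rule -> list of matches into the list of (rule, match) pairs."""
    return [(r, g) for r, gs in matches.items() for g in gs]


def global_pair_distribution(matches):
    """Assign probability 1/|pairs| to every pair, following Eq. (3.10)."""
    pairs = all_pairs(matches)
    return {pair: Fraction(1, len(pairs)) for pair in pairs}


def choose_global(matches, rng=random):
    """Pick one (rule, match) pair uniformly at random; rule weights are ignored."""
    pairs = all_pairs(matches)
    if not pairs:
        return None  # no rule applies; handling this case is left to the caller
    return rng.choice(pairs)


# Hypothetical example: r1 has two matches and r2 has four.
matches = {"r1": ["g1", "g2"], "r2": ["h1", "h2", "h3", "h4"]}
global_dist = global_pair_distribution(matches)
# Total probability of applying r1 under the global call.
p_r1 = sum(p for (r, _), p in global_dist.items() if r == "r1")
```

Under a global call, the chance that a given rule is applied depends only on how many matches it has relative to the other rules, so a rule with many matches is favoured regardless of any declared weight.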

If a rule-set R is called using GP 2’s curly-bracket syntax, execution follows the GP 2 implementation [17]. Hence our language extension is conservative: existing GP 2 programs will execute exactly as before, because probabilistic behaviour is invoked only by the new syntax. P-GP 2 extends GP 2’s program grammar; Figure 3.2 gives the modified parts, which add probabilistic rule-set calls and global probabilistic rule-set calls.

As one final and relatively minor probabilistic extension to GP 2, we also introduce a new integer operator rand_int(a,b). This is called with integer arguments a and b and returns a random integer drawn uniformly from the inclusive interval [a, b]. This also requires a modification of GP 2’s grammar, in this case to the integer part of GP 2’s expression grammar. Figure 3.3 shows the updated integer grammar.

⟨Integer⟩ ::= Digit {Digit} | IVariable | ‘-’ ⟨Integer⟩
    | ⟨Integer⟩ ⟨ArithOp⟩ ⟨Integer⟩
    | (indeg | outdeg) ( Node )
    | length( (AVariable | SVariable | LVariable) )
    | rand_int( ⟨Integer⟩ , ⟨Integer⟩ )

Figure 3.3: The modified abstract syntax of P-GP 2’s expressions (see Figure 2.7). rand_int allows a programmer to sample a uniform distribution over the inclusive range of its two input integers.
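The intended behaviour of rand_int can be mimicked in Python for illustration. The sketch below is an assumption-laden stand-in, not P-GP 2 code: it relies on Python's `random.randint`, which includes both endpoints, and the text does not say what happens when a exceeds b, so here Python simply raises `ValueError` in that case.

```python
import random


def rand_int(a, b, rng=random):
    """Return an integer drawn uniformly from the inclusive range [a, b].

    Mirrors the described operator only; behaviour for a > b is not specified
    in the text, and this sketch lets random.randint raise ValueError.
    """
    return rng.randint(a, b)  # randint includes both a and b


# Draw repeatedly from [1, 3] with a seeded generator for reproducibility.
rng = random.Random(0)
draws = [rand_int(1, 3, rng) for _ in range(1000)]
```

Both endpoints can occur among the draws, which is the point of the inclusive range.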