3.2 Matching Semantics
3.2.4 Horizontal Matching
The patterns described so far process data sequences in a linear manner, removing zero or more elements but never adding any. That is, if $\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r, s_{out}, \sigma_{out} \rangle$, then there is an $s_r$ such that $\mathit{append}(s_r, s_{out}) = s_{in}$. The horizontal operators preserve this linear consumption property provided all their arguments fulfil it. This entails that the horizontal operators match their arguments in such a way that the result of one match is invisible to the next match. This is illustrated by Figure 3.2: the input elements a, b and c on level $L_0$ are transformed stepwise into the results x, y and z on $L_1$. The parsing processes (illustrated by numbered circles) have no access to the results on $L_1$. In general, all input is processed on the same horizontal level $L_n$.
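The linear consumption property can be checked mechanically. The following is a minimal Python sketch, not part of the formal semantics. It assumes that a pattern is modelled as a function from an input list and a store to either `None` (failure) or a triple `(result, remaining input, store)`; the names `atom` and `consumes_linearly` are hypothetical.

```python
from typing import Callable, Optional, Tuple

# Assumed encoding: a matcher returns (result, remaining input, store),
# or None when the match fails.
Outcome = Optional[Tuple[list, list, dict]]
Matcher = Callable[[list, dict], Outcome]


def atom(a) -> Matcher:
    """Hypothetical atom pattern: consumes one leading element equal to `a`."""
    def match(s_in: list, store: dict) -> Outcome:
        if s_in and s_in[0] == a:
            return [a], s_in[1:], store
        return None
    return match


def consumes_linearly(p: Matcher, s_in: list, store: dict) -> bool:
    """True if s_out is a suffix of s_in, i.e. append(s_r, s_out) == s_in."""
    outcome = p(s_in, store)
    if outcome is None:
        return True  # a failed match consumes nothing
    _, s_out, _ = outcome
    k = len(s_in) - len(s_out)  # length of the consumed prefix s_r
    return k >= 0 and s_in[k:] == s_out
```

A matcher that returned more input than it received would make the check return `False`, since no prefix $s_r$ could then satisfy the equation.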
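[Figure 3.2: Illustration of Horizontal Matching. The input elements a, b and c on level L0 are mapped by the parsing processes 1, 2 and 3 to the results x, y and z on level L1.]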
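[...] is matched to the input data sequence and each following pattern to the data sequence and bindings produced by the previous match. The results of all matches are combined into a sequence.

\[
\frac{\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r_p, s_p, \sigma_p \rangle \qquad \langle \sim(P), s_p, \sigma_p \rangle \xrightarrow{m} \langle r_P, s_{out}, \sigma_{out} \rangle \qquad \mathit{combine}(r_p, r_P) \mapsto r_{out}}{\langle \sim(p{::}P), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r_{out}, s_{out}, \sigma_{out} \rangle} \quad \textsc{SequenceSuccess}
\]

\[
\frac{}{\langle \sim(\epsilon), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle \epsilon, s_{in}, \sigma_{in} \rangle} \quad \textsc{SequenceEmpty}
\]

\[
\frac{\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \bot}{\langle \sim(p{::}P), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \bot} \quad \textsc{SequenceMatch}\bot
\]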
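The sequencing rules can be read as a recursive procedure. The sketch below is one possible Python rendering under stated assumptions: matchers are functions returning `(result, remaining input, store)` or `None` for failure, `combine` is approximated by list concatenation, and the names `atom` and `seq` are hypothetical. The case in which the head matches but the tail fails is not covered by a rule shown here; the sketch treats it as failure.

```python
from typing import Callable, List, Optional, Tuple

Outcome = Optional[Tuple[list, list, dict]]
Matcher = Callable[[list, dict], Outcome]


def atom(a) -> Matcher:
    """Hypothetical atom pattern: consumes one leading element equal to `a`."""
    def match(s_in: list, store: dict) -> Outcome:
        if s_in and s_in[0] == a:
            return [a], s_in[1:], store
        return None
    return match


def combine(r1: list, r2: list) -> list:
    """Assumption: results are flat lists, so combining is concatenation."""
    return r1 + r2


def seq(patterns: List[Matcher]) -> Matcher:
    """Sequencing: match the head, then the tail on what the head left over."""
    def match(s_in: list, store_in: dict) -> Outcome:
        if not patterns:                         # SequenceEmpty
            return [], s_in, store_in
        head = patterns[0](s_in, store_in)
        if head is None:                         # SequenceMatch (failure)
            return None
        r_p, s_p, store_p = head
        tail = seq(patterns[1:])(s_p, store_p)
        if tail is None:                         # assumed: tail failure fails the sequence
            return None
        r_P, s_out, store_out = tail
        return combine(r_p, r_P), s_out, store_out   # SequenceSuccess
    return match
```

For instance, `seq([atom("a"), atom("b")])` applied to `["a", "b", "c"]` consumes the first two elements and leaves `["c"]` as the remaining input.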
The sequencing operator creates a compound result from individual results by using the helper function combine defined in Section 3.2.1. The advantage of combining results in this way is that in most cases there is no need to express explicit flattening of results in the pattern expressions. If nesting is explicitly desired, it can be expressed using typed sequences, which are not automatically flattened. Therefore, combine can distinguish between an intended nesting and a nesting that is merely used to pass a sequence of results. Wrapping of a result into a typed sequence can be expressed using transformations, as defined in Section 3.4.
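The distinction between untyped and typed sequences can be illustrated with a small sketch. The definition of combine in Section 3.2.1 is not reproduced here, so the behaviour below is an assumption: untyped sequences are modelled as Python lists and spliced into the result, while anything else (atoms and typed sequences) is kept as a single element. The class `Typed` and the helper `as_items` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Typed:
    """Hypothetical stand-in for a typed sequence: items plus a type name."""
    items: Tuple[Any, ...]
    type: str


def as_items(r: Any) -> List[Any]:
    """An untyped sequence contributes its elements; anything else counts as one element."""
    return list(r) if isinstance(r, list) else [r]


def combine(r1: Any, r2: Any) -> List[Any]:
    """Assumed behaviour: splice untyped sequences, keep typed sequences intact."""
    return as_items(r1) + as_items(r2)
```

Under these assumptions, combining `["a", "b"]` with `["c", "d"]` gives a flat four-element list, whereas combining two `Typed` values gives a two-element list in which each nested sequence is still visible.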
For example, if a, b, c and d are atoms, matching the pattern $\sim([\sim([a, b]), \sim([c, d])])$ with input $[a, b, c, d]$ produces the intermediate results $[a, b]$ and $[c, d]$ for the two inner sequential patterns. However, the overall result is the sequence $[a, b, c, d]$, in which the nesting is invisible. A pattern that expresses result nesting explicitly is the following: $\sim([\sim([a, b]) \Rightarrow \tau([a, b], \mathit{list}), \sim([c, d]) \Rightarrow \tau([c, d], \mathit{list})])$. Matching it with the input from above produces the result $[\tau([a, b], \mathit{list}), \tau([c, d], \mathit{list})]$, in which the nesting is explicit.

Choice The choice operator $\mathit{or}$ matches its arguments in order until one of them succeeds. Changes made by unsuccessful matching attempts have no effect on the result, the data sequence or the store. Only the first successful match has an effect. Matching a choice pattern fails if none of the choices can be matched successfully.
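\[
\frac{\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r, s_{out}, \sigma_{out} \rangle}{\langle \mathit{or}(p{::}P), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r, s_{out}, \sigma_{out} \rangle} \quad \textsc{ChoiceSuccess}
\]

\[
\frac{\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \bot \qquad \langle \mathit{or}(P), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \mathit{result}}{\langle \mathit{or}(p{::}P), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \mathit{result}} \quad \textsc{ChoiceRecurse}
\]

\[
\frac{}{\langle \mathit{or}(\epsilon), s_{in}, \sigma_{in} \rangle \xrightarrow{m} \bot} \quad \textsc{ChoiceNone}\bot
\]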
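The choice rules correspond to a simple left-to-right loop. The following Python sketch is illustrative only: it assumes matchers are functions returning `(result, remaining input, store)` or `None` for failure, and that the store is a flat dictionary, so that handing each attempt a copy is enough to discard the changes of failed attempts. The names `atom`, `anything` and `choice` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Outcome = Optional[Tuple[list, list, dict]]
Matcher = Callable[[list, dict], Outcome]


def atom(a) -> Matcher:
    """Hypothetical atom pattern: consumes one leading element equal to `a`."""
    def match(s_in: list, store: dict) -> Outcome:
        if s_in and s_in[0] == a:
            return [a], s_in[1:], store
        return None
    return match


def anything(s_in: list, store: dict) -> Outcome:
    """Hypothetical wildcard: consumes any single leading element."""
    if s_in:
        return [s_in[0]], s_in[1:], store
    return None


def choice(patterns: List[Matcher]) -> Matcher:
    """Ordered choice: the first alternative that succeeds determines the outcome."""
    def match(s_in: list, store_in: dict) -> Outcome:
        for p in patterns:
            # Every attempt starts from the original input and a fresh copy
            # of the original store, so a failed attempt leaves no trace.
            outcome = p(s_in, dict(store_in))
            if outcome is not None:
                return outcome        # ChoiceSuccess: later alternatives are never tried
            # ChoiceRecurse: continue with the remaining alternatives
        return None                   # ChoiceNone: no alternative is left
    return match
```

With `choice([atom("b"), anything])`, an input starting with `b` is handled by the first alternative, and the wildcard is only tried for other inputs.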
The semantics of the choice operator defines a clear left-to-right order in which patterns are matched. Although definitions may be ambiguous, i.e., more than one pattern of the choice can match an input, the result is always unambiguously defined to be that of the first pattern that matches, starting from the left. For instance, if b is an atom, the pattern $\mathit{or}([b, \alpha])$ has a clear matching semantics even though both choices match a sequence that starts with b: choice $\alpha$ is only tried if matching b fails. Once the matching of one of the choices is successful, there will be no backtracking. This semantics is crucial when patterns are used to express computations and grammars (see Chapter 4). For example, putting the base case of a recursive definition before the recursive case in the choice pattern ensures that the recursive case is only tried if the base case fails. In the following, the alternative notation $p \mid \dots \mid p$ will be used where appropriate to denote the choice
Repetition The matching semantics of the pattern $p^*$ is to match $p$ repeatedly to the input until matching $p$ fails. The result is the combined result of all successful matches, or an empty result if there are no successful matches. This means that $p^*$ never fails.
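\[
\frac{\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r_p, s_p, \sigma_p \rangle \qquad \langle p^*, s_p, \sigma_p \rangle \xrightarrow{m} \langle r_{p^*}, s_{out}, \sigma_{out} \rangle \qquad \mathit{combine}(r_p, r_{p^*}) \mapsto r_{out}}{\langle p^*, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle r_{out}, s_{out}, \sigma_{out} \rangle} \quad \textsc{RepeatGreedy}
\]

\[
\frac{\langle p, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \bot}{\langle p^*, s_{in}, \sigma_{in} \rangle \xrightarrow{m} \langle \epsilon, s_{in}, \sigma_{in} \rangle} \quad \textsc{RepeatNoMatch}
\]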
The $*$ operator has greedy semantics, as it attempts to consume as many elements from the data sequence as possible. Care must be taken when using the operator with patterns that can succeed without consuming from the data sequence, as this leads to infinite regress, as in the case of the pattern $(\sim(\epsilon))^*$.
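Greedy repetition and the non-consumption hazard can be sketched in Python as follows. The encoding is an assumption: matchers return `(result, remaining input, store)` or `None` for failure, and combine is approximated by list concatenation. The guard that raises an error when a match succeeds without consuming input is a design choice of this sketch, not part of the semantics, which would simply not terminate in that case. The names `atom` and `star` are hypothetical.

```python
from typing import Callable, Optional, Tuple

Outcome = Optional[Tuple[list, list, dict]]
Matcher = Callable[[list, dict], Outcome]


def atom(a) -> Matcher:
    """Hypothetical atom pattern: consumes one leading element equal to `a`."""
    def match(s_in: list, store: dict) -> Outcome:
        if s_in and s_in[0] == a:
            return [a], s_in[1:], store
        return None
    return match


def star(p: Matcher) -> Matcher:
    """Greedy repetition: apply p until it fails; the repetition itself never fails."""
    def match(s_in: list, store_in: dict) -> Outcome:
        results: list = []
        s, store = s_in, store_in
        while True:
            outcome = p(s, store)
            if outcome is None:                   # RepeatNoMatch: stop, keep what was collected
                return results, s, store
            r_p, s_p, store_p = outcome           # RepeatGreedy: one more successful match
            if len(s_p) == len(s):
                # Guard added by this sketch: without it the loop would never end.
                raise ValueError("pattern under * succeeded without consuming input")
            results = results + r_p               # assumed combine: concatenation
            s, store = s_p, store_p
    return match
```

Applied to `["a", "a", "b"]`, `star(atom("a"))` consumes both leading `a` elements and stops at `b`; applied to `["b"]`, it succeeds with an empty result and leaves the input untouched.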