
IV. Rules, patterns and functions

4.2 Rules and patterns

To better understand functions in Mathematica, we need a good understanding of patterns and rule substitution. These two topics are just two facets of a single one, since it is the form of the pattern which determines when (on which class of objects) the associated rule will apply.

4.2.1 Rule, RuleDelayed, Replace and ReplaceAll commands

4.2.1.1 Basic rules and the Rule head (function)

It is very easy to set up a rewrite rule in Mathematica. For example, the following rule will replace a with b:

Clear[a, b, ourrule];

ourrule = a -> b

a -> b

The literal equivalent to the arrow (which represents the rule) is the Rule command. If we look at the FullForm of the <ourrule> variable, we see that Rule is used:

FullForm[ourrule]

Rule[a, b]

4.2.1.2 How to make the rule apply: ReplaceAll function

By itself, a rule that we define does not do anything useful. The command Rule by itself is just a named container of the left and right hand sides of the rule. It becomes useful when combined with another command, which actually performs the rule substitution in an expression. This command has the format ReplaceAll[expr, rules], and a shorthand notation: <expr /. rules>. More than one rule can be applied at once, in which case the rules should normally be placed into a simple (not nested) list - we will see such examples later. If the rules are placed in a nested list, Mathematica interprets them differently - see Mathematica Help for details, and also below.

This is, for instance, how our rule will work on some test expression:

Clear[f, a, b];

f[a] /. ourrule

f[b]

or, which is the same,

f[a] /. a -> b

f[b]

If we have a more complicated expression, where <a> occurs more than once, it will be replaced in all places (when we use /., or the ReplaceAll command):

Clear[f, g, h];

f[a, g[a, h[a]]] /. a -> b

f[b, g[b, h[b]]]

4.2.1.3 The order in which ReplaceAll tries rules on parts of expressions

Although this is not immediately obvious, often the rule application starts from the larger expression, and if it matches as a whole, then the subexpressions are not checked for further matches. This is so when the pattern (see discussion on patterns below) looks like h[x_] or similar. For example, in this case:

Clear[a, q];

{{{a}}} /. {x_} :> q[x]

q[{{a}}]

we will need to apply the rule several times to replace all the list braces with <q>-s:

{{{a}}} /. {x_} :> q[x] /. {x_} :> q[x]

q[q[{a}]]

{{{a}}} /. {x_} :> q[x] /. {x_} :> q[x] /. {x_} :> q[x]

q[q[q[a]]]

But not in this case - here the pattern is just the symbol List:

{{{a}}} /. List -> q

q[q[q[a]]]

This behavior is rather logical, but in cases when a different order of rule substitution is desired, techniques are available to change it. We will discuss them later (see section 5.2.4.2).

4.2.1.4 Associativity of rules

As the previous example may have suggested, the application of rules is left-associative, meaning that the expression <expr /. rule1 /. rule2> is legitimate: first the rule <rule1> (or rules, if this is a list of rules, see below) will be applied to <expr>, and then the rule(s) <rule2> will be applied to the result.
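A minimal sketch of what left-associativity means in practice, assuming the symbols a, b and c have no values. The names leftAssoc and rightGrouped are introduced here purely for illustration. The second input forces the other grouping with parentheses, to show that the two readings really differ:

```mathematica
Clear[a, b, c, leftAssoc, rightGrouped];

(* Parsed as ({a, b} /. a -> b) /. b -> c: first a becomes b, then every b becomes c *)
leftAssoc = {a, b} /. a -> b /. b -> c
(* {c, c} *)

(* Explicit right grouping: the second rule first rewrites the rule a -> b into a -> c *)
rightGrouped = {a, b} /. ((a -> b) /. (b -> c))
(* {c, b} *)
```

In the second input the inner replacement acts on the rule itself, which is an ordinary expression Rule[a, b], before anything is applied to the list.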

4.2.1.5 Locality of rules

It is very important that rules like the one above are local. This means that when the rule rewrites an object to which it applies into something else, it changes a copy of it, while the initial object remains unchanged. In particular, in the above example, the expression f[a] taken separately did not change:

f[a]

f[a]

This is the main difference between the rule and the function which performs a similar transformation - in the latter case a similar rule is defined globally (which means that it will be automatically tried by the kernel on any expression entered interactively or being otherwise evaluated). Essentially, this is the only fundamental difference between functions and lists of rules in Mathematica. For example, we can easily simulate the squaring function by a local rule:

Clear[f];

{f[x], f[y], f[elephant], f[3]} /. f[z_] :> z^2

{x^2, y^2, elephant^2, 9}

4.2.1.6 Delayed rules - the RuleDelayed function

I used this example to introduce two new ideas. First, we now have patterns on the left hand side of the rule - they are used to widen the class of objects to which the rule will apply. Second, we have used a new kind of rule (there are only two, and one we already considered before) - the one which uses the :> (colon - greater) sign instead of the -> one. The literal equivalent of this is the RuleDelayed[lhs, rhs] command:

RuleDelayed[a, b]

a :> b

As we can guess by the name, this corresponds to a delayed rule substitution - that is, the r.h.s. of the rule is evaluated only after the rule substitution takes place. Later we will consider in more detail the cases where Rule or RuleDelayed is more appropriate, but in general the situation here is similar to the one with the Set and SetDelayed assignment operators. This similarity is not accidental, but once again reflects the fact that the assignment operators in Mathematica are just operators which create global rules.

4.2.1.7 The difference between Rule and RuleDelayed

To illustrate the difference between Rule and RuleDelayed on one particular example, consider a model problem: given a list of elements, substitute every occurrence of the symbol <a> by a random number.

Here is our sample list:

Clear[sample, a, b, c, d, e, f, g, h];

sample = {d, e, a, c, a, b, f, a, a, e, g, a}

{d, e, a, c, a, b, f, a, a, e, g, a}

Now, here is the rule-based solution using Rule:

sample /. a -> Random[]

{d, e, 0.177741, c, 0.177741, b, f, 0.177741, 0.177741, e, g, 0.177741}

And here is the same using RuleDelayed:

sample /. a :> Random[]

{d, e, 0.655171, c, 0.432888, b, f, 0.564996, 0.30648, e, g, 0.872856}

We see that the numbers are the same in the first case and different in the second. This is because, in the first case, the r.h.s. of the rule has been evaluated before it was applied, once and for all. In the second case, the r.h.s. of the rule was re-evaluated every time the rule was applied. As a variation on the theme, suppose we want to substitute each <a> in the list by a list {a, num}, where <num> counts the occurrences of <a> in the list. Here is our attempt with Rule:

n = 1;

sample /. a -> {a, n++}

{d, e, {a, 1}, c, {a, 1}, b, f, {a, 1}, {a, 1}, e, g, {a, 1}}

Obviously this did not work as intended. And now with RuleDelayed:

n = 1;

sample /. a :> {a, n++}

{d, e, {a, 1}, c, {a, 2}, b, f, {a, 3}, {a, 4}, e, g, {a, 5}}

Clear[sample, n];

4.2.2 Rule substitution is not commutative

4.2.2.1 Lists of rules

When we have more than just one rule to be tried on an expression, we place the rules in a list. For example:

{a, b, c, d} /. {a -> 1, b -> 2, d -> 4}

{1, 2, c, 4}

For all the rules to be tried on an expression, the list of rules has to be a flat list, that is, all the list elements should have a head Rule or RuleDelayed. Supplying a nested list of rules to Replace or ReplaceAll is not an error, but is interpreted as if we want to try all the sublists of rules separately on several copies of our original expression:

{a, b, c, d} /. {{a -> 1, b -> 2}, {c -> 3, d -> 4}}

{{1, 2, c, d}, {a, b, 3, 4}}

As a side note, there is nothing special about lists of rules with respect to lists of other Mathematica objects:

FullForm[{a -> 5, a -> 6}]

List[Rule[a, 5], Rule[a, 6]]

4.2.2.2 Non-commutativity

The result of the rule substitution depends in general on the order in which the rules are stored in the list, as the following example illustrates.

Clear[a, f];

f[a] /. {a -> 5, a -> 6}

f[a] /. {a -> 6, a -> 5}

f[5]

f[6]

The reason for this is that once some rule has been applied to a given part of an expression, ReplaceAll goes to the next part of an expression and tries the rules on that next part. But even if we run ReplaceAll several times (there is a special command ReplaceRepeated related to this, which we will discuss shortly), the results will still be generally different for different orderings of rules in a list. This is because once a rule applies to a (part of) expression, this part is generally rewritten so that (some of) the rules in our list of rules which applied before will no longer apply, and vice versa.

In any case, our final conclusion is that the rule application is not commutative, and the order of rules in the rule list does matter in general. For an extreme example of this, we will soon consider a rule-based factorial function, where different rule ordering will result in infinite iteration.
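As a further hedged illustration of why the order matters, here is a small sketch with one specific and one general rule. The symbols one and other, and the variable names specificFirst and generalFirst, are hypothetical placeholders introduced only for this example:

```mathematica
Clear[f, x, one, other, specificFirst, generalFirst];

(* Specific rule first: f[1] is caught by the first rule, f[x] falls through to the second *)
specificFirst = {f[1], f[x]} /. {f[1] -> one, f[_] -> other}
(* {one, other} *)

(* General rule first: f[_] matches every f[...], so the rule f[1] -> one is never reached *)
generalFirst = {f[1], f[x]} /. {f[_] -> other, f[1] -> one}
(* {other, other} *)
```

For each part of the expression the rules are tried in the order in which they are listed, and the first one that matches wins.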

4.2.3 An interplay between rules and evaluation process

When working in Mathematica, it is important to remember that we never actually start from scratch, but always with a large built-in system of rules which connect together the built-in functions. This gives great flexibility in using these functions, since these system rules can be manipulated or overridden with user-defined ones. On the other hand, one has to be more careful, because the rules newly defined by the user (or function definitions and variable assignments, which are global rules) immediately start to interact with the built-in ones. The non-commutativity of rules mentioned above can make this interaction quite non-trivial. This often results in some unexpected or "erroneous" behavior, which many Mathematica users immediately proclaim as bugs, but which can be avoided just by getting a better understanding of how the system works.

4.2.3.1 When the rule applies, expression is evaluated

As one example, consider a gamma-function of the symbolic argument:

Clear[a];

Gamma[a]

Gamma[a]

Since the system does not know what <a> is, none of the rules associated with the gamma-function applies, and the input is just returned back. Let us now use the following rewrite rule:

Gamma[a] /. a -> 5

24

We see that as soon as the numerical (integer) value has been substituted, one of the built-in rules applied, producing the result. At the same time, for the number π (for instance), there is no rule which forces Mathematica to compute the numerical value, and thus we have:

Gamma[a] /. a -> Pi

Gamma[π]

If we want to compute a numerical value in this case, we can either do this:

Gamma[a] /. a -> N[Pi]

2.28804

Or, which is equivalent (with some tiny differences unimportant now), this:

N[Gamma[a] /. a -> Pi]

2.28804

(the N function computes a numerical value of its argument). What I want to stress is that the decision whether to keep say Gamma[5] as it is here or to substitute it by its numerical (well, integer) value is rather arbitrary in the sense that it is defined by certain (very sensible) Mathematica conventions but there is no first principle which tells which form of the answer is advantageous. In fact, in some cases I may wish to keep a Gamma[5] function in its unevaluated form. More generally, the whole advantage in using rule-based approach is that we don’t need a first principle to add rules for a new situation that we want to describe.

This means not that Mathematica is unpredictable, but that the programs we write should not depend on features that are defined purely by conventions. In particular, in Mathematica one should always assume that all expressions may evaluate to something else. Thus, if some expression has to be kept unevaluated for some time, the programmer has to take care of it. On the other hand, if some expression must be evaluated completely (say, to a numerical value), once again the programmer has to ensure it.

4.2.3.2 Evaluation affects applicability of rules

Consider now a different example:

{f[Pi], Sin[Pi], Pi^2}

{f[Pi], Sin[Pi], Pi^2} /. Pi -> a

{f[π], 0, π^2}

{f[a], 0, a^2}

Note that for the second element of the list, we would naively expect Sin[a] instead of 0 in the output when we apply the rule Pi -> a. The reason for this result can be understood easily, once we recall that the sign /. is just an abbreviation, and equivalently we can write the last input as

ReplaceAll[{f[Pi], Sin[Pi], Pi^2}, Pi -> a]

{f[a], 0, a^2}

Now we recall the general evaluation strategy in Mathematica, where the subexpressions are normally evaluated before the expression itself. This means that by the time the evaluation process reached the ReplaceAll command, our expression had already been transformed to the same form as the output of the first input (without rule substitution). Sin[Pi] evaluated to 0, and since 0 does not contain <Pi> any more and thus does not match the rule, no further substitution took place for this part of our expression. Once again, we can see the evaluation dynamics by using the Trace command:

Trace[{f[Pi], Sin[Pi], Pi^2} /. Pi -> a]

{{{Sin[π], 0}, {f[π], 0, π^2}}, {f[π], 0, π^2} /. π -> a, {f[a], 0, a^2}}
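If the substitution is supposed to happen before Sin[Pi] evaluates, the expression has to be protected from evaluation first. The following is only a sketch of one possible way to do this, using the built-in Hold and ReleaseHold; the variable name held is introduced here for illustration, and this is not necessarily the technique the later sections rely on:

```mathematica
Clear[a, held];

(* Hold keeps Sin[Pi] from evaluating, so the rule still finds Pi inside it *)
held = Hold[Sin[Pi]] /. Pi -> a
(* Hold[Sin[a]] *)

(* Releasing the hold lets evaluation proceed on the rewritten expression *)
ReleaseHold[held]
(* Sin[a] *)
```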

These examples may give an impression that Mathematica is unstable with respect to bugs related to rule orderings. While it is true that many non-trivial bugs in the Mathematica programs are related to this issue, there are also ways to avoid them. As long as the rule or list of rules are always correct in the sense that either they represent exact identities (say, mathematical identities), or one can otherwise be sure that in no actual situation they, taken separately, will lead to an incorrect result, it should be fine.

Bugs happen when rules considered "correct" give incorrect results in certain unforeseen situations, but this is also true for programs written within more traditional programming paradigms. Perhaps the real difference is that for more traditional programming techniques it is usually easier to restrict the program to "stay" in those "corners" of the problem parameter space where correct performance can be predicted or sometimes proven. I personally view the complications arising due to rule orderings as a (possibly inevitable) price to pay for having a very general approach to evaluation available.

4.2.4 Rules and simple (unrestricted) patterns

Let us give some examples of how rules work with the simplest patterns.

4.2.4.1 A simplest pattern and general pattern-matching strategy

The simplest pattern of all, which we have already seen before, is just a single underscore <_>, and has a literal representation Blank[]:

Blank[]

_

This pattern represents any Mathematica expression. Let us take some sample Mathematica expression:

x^y*Sin[z]

Now we will use our simplest pattern to replace any element by, say, a symbol <a>:

Clear[a, x, y, z, g, h];

{x, Sin[x], x^2, x*y, x + y, g[y, x], h[x, y, z], Cos[y]} /. _ -> a

a

This is not very exciting. What happened is that our entire expression (list) matched this pattern and then got replaced by <a>. Before we move forward, let me explain a bit how patterns work and why the substitution based on patterns is possible. The main ingredient for this is the uniform representation of Mathematica expressions by symbolic trees. Basically, when we try to match some (however complex) pattern with some expression, we are matching two trees. The tree that represents the pattern is also a legal Mathematica expression (patterns are as good Mathematica expressions as anything else), but with some branches or leaves replaced by special symbols like Blank[] (the underscore). For example:

FullForm[(_^_)*Sin[_]]

Times[Power[Blank[], Blank[]], Sin[Blank[]]]

This pattern tree (or just pattern) will match some expression <expr> if they are identical modulo some parts of <expr> which can be "fit" into the placeholders, such as Blank[], present in this pattern. In particular, the pattern above will match any expression which is a product of something to a power of something else, and a Sine of something.

4.2.4.2 Does the pattern match? The MatchQ function

There is a very useful command that allows one to check whether or not there is a match between a given expression and a given pattern: MatchQ. It takes as arguments an expression and a pattern and returns True when pattern matches and False otherwise. For example:

MatchQ[x^y*Sin[z], (_^_)*Sin[_]]

True

MatchQ[Exp[-x^2]^2*Sin[Cos[x - y]^2], (_^_)*Sin[_]]

True

MatchQ[x*Sin[z], (_^_)*Sin[_]]

False

It is important to understand that pattern-matching (for simple, or unrestricted, patterns) is based completely on the syntax, and not the semantics, of the expressions being matched.
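A few small checks may make the syntax-versus-semantics point more concrete. This is only an illustrative sketch, assuming x has no value; the variable names on the left are made up for this example. What matters in each case is the FullForm of the expression after evaluation, not its mathematical meaning:

```mathematica
Clear[x, sqrtMatches, reciprocalMatches, plainSymbolMatches];

(* Sqrt[x] is stored as Power[x, Rational[1, 2]], so it has the form _^_ *)
sqrtMatches = MatchQ[Sqrt[x], _^_]
(* True *)

(* 1/x is stored as Power[x, -1] *)
reciprocalMatches = MatchQ[1/x, _^_]
(* True *)

(* x equals x^1 mathematically, but its FullForm is just the symbol x *)
plainSymbolMatches = MatchQ[x, _^_]
(* False *)
```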

4.2.4.3 Pattern tags(names) and expression destructuring

Now, while there is some value in just establishing the fact that some expression matches a certain pattern, it becomes much more useful when we get access to the parts of this expression which match certain parts of the pattern, so that we can further process these parts. This is called expression destructuring, and is a very powerful pattern-related capability. For instance, in the above example we may wish to know which expressions were the base, the power and the argument of Sine. But to be able to do such destructuring, we need to somehow label the parts of the pattern. This is possible through the mechanism of pattern tags (or names): we attach some symbol to the pattern part, and then this symbol stores the corresponding part of the actual expression matched, ready for further processing. This is how, for example, we can insert tags in our pattern:

(base_^pwr_)*Sin[sinarg_]

The pattern tags cannot be composite expressions - only true symbols (with the head Symbol).

The presence of pattern tags does not change the matching in any way, it just gives us additional information. We will not obtain this information through MatchQ, however, since MatchQ just establishes the fact of the match. We will need a real rule application for that, since the rule will tell us what to do with these matched (sub)expressions. For example, here we will simply collect them in a list:

{x^y*Sin[z], Exp[-x^2]^2*Sin[Cos[x - y]^2]} /. (base_^pwr_)*Sin[sinarg_] -> {base, pwr, sinarg}

{{x, y, z}, {ℯ, -2 x^2, Cos[x - y]^2}}

What we just did is exactly a special case of destructuring. The parts that we tagged are now available for whatever operations we would like to perform.

So, to summarize: whenever a pattern contains a part which is a special symbol like Blank[] (there are a few more like it, we will cover them shortly), possibly with a pattern tag attached, this means that the actual expression matched can contain in this place a rather arbitrary subexpression (how arbitrary, depends on the particular special symbol used). However, the parts which do not contain these symbols (multiplication, Power and Sin in our example above), have to be present in exactly the same way in both pattern and the expression, for them to match.

One more important point about pattern tags is that two identical pattern tags (symbols) in different parts of a pattern cannot stand for different subexpressions in the expression we try to match. For example:

MatchQ[a^a, b_^b_]

True

MatchQ[a^c, b_^b_]

False
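The same constraint holds when the pattern is used in a rule rather than in MatchQ. Here is a small sketch, assuming f, a, b and x have no values; the symbol same and the variable result are hypothetical names used only for this illustration:

```mathematica
Clear[f, a, b, x, same, result];

(* f[x_, x_] requires both arguments to be identical, so only f[a, a] is rewritten *)
result = {f[a, a], f[a, b]} /. f[x_, x_] :> same[x]
(* {same[a], f[a, b]} *)
```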

4.2.4.4 Example: any function of a single fixed argument

The following pattern will match any function, but only when its single argument is literally <x>:

Clear[f, x];

f_[x]

It can be used, for instance, when we need to substitute the literal <x> by some value (say, 10) at every place where it appears as a single argument of some function. Consider now some list of expressions which we will use throughout this section to illustrate the construction of various patterns:

Clear[x, y, z, g, h, a];

{x, Sin[x], x^2, x*y, x + y, g[y, x], h[x, y, z], Cos[y]}

{x, Sin[x], x^2, x y, x + y, g[y, x], h[x, y, z], Cos[y]}

We now use our pattern:

{x, Sin[x], x^2, x*y, x + y, g[y, x], h[x, y, z], Cos[y]} /. {f_[x] -> f[10]}

{x, Sin[10], x^2, x y, x + y, g[y, x], h[x, y, z], Cos[y]}

The replacement happened only in the second element of the list. To understand this, we have to recall the tree-like nature of Mathematica expressions and also that rule application is based on expression syntax rather than semantics. Let us look at the FullForm of these expressions:
