Classifying points

Quadrants in the Euclidean plane are conventionally numbered counterclockwise from quadrant 1 (x and y positive) to quadrant 4 (x positive, y negative). The function pointLoc[{x, y}] will compute the classification of the point (x, y), according to Table 5.1.

Point                     Classification
(0, 0)                     0
y = 0 (on the x-axis)     -1
x = 0 (on the y-axis)     -2
Quadrant 1                 1
Quadrant 2                 2
Quadrant 3                 3
Quadrant 4                 4

Table 5.1: Quadrant classification

We will use this problem to illustrate the features covered in this chapter, by giving a number of different solutions, using multiclause function definitions with predicates, single-clause definitions with If and its relatives, and combinations of the two.

Perhaps the first solution that suggests itself is one that uses a clause for each of the cases above.

In[17]:= pointLoc[{0, 0}] := 0
         pointLoc[{x_, 0}] := -1
         pointLoc[{0, y_}] := -2
         pointLoc[{x_, y_}] := 1 /; x > 0 && y > 0
         pointLoc[{x_, y_}] := 2 /; x < 0 && y > 0
         pointLoc[{x_, y_}] := 3 /; x < 0 && y < 0
         pointLoc[{x_, y_}] := 4 (* /; x > 0 && y < 0 *)

It is a good idea to include the last condition as a comment, rather than as a condition in the code, because Mathematica would not realize that the condition has to be true at that point and would check it anyway.

We will use the following list of points as our test cases.

In[24]:= pts = {{0, 0}, {4, 0}, {0, 1.3}, {2, 4}, {-2, 4}, {-2, -4}, {2, -4}};

In[25]:= Map[pointLoc, pts]

Out[25]= {0, -1, -2, 1, 2, 3, 4}

Translated directly to a one-clause definition using If, this becomes:

In[26]:= pointLoc[{x_, y_}] :=
           If[x == 0 && y == 0, 0,
             If[y == 0, -1,
               If[x == 0, -2,
                 If[x > 0 && y > 0, 1,
                   If[x < 0 && y > 0, 2,
                     If[x < 0 && y < 0, 3, 4]]]]]]

In[27]:= Map[pointLoc, pts]

Out[27]= {0, -1, -2, 1, 2, 3, 4}

Actually, a more likely solution here uses Which.

In[28]:= pointLoc[{x_, y_}] := Which[
           x == 0 && y == 0, 0,
           y == 0, -1,
           x == 0, -2,
           x > 0 && y > 0, 1,
           x < 0 && y > 0, 2,
           x < 0 && y < 0, 3,
           True (* x > 0 && y < 0 *), 4]

In[29]:= Map[pointLoc, pts]

Out[29]= {0, -1, -2, 1, 2, 3, 4}

In[30]:= pointLoc[{-5, -9}]

Out[30]= 3

All of our solutions so far suffer from a certain degree of inefficiency, because of repeated comparisons of a single value with 0. Take the last solution as an example, and suppose the argument is (-5, -9). It will require five comparisons of -5 with 0 and three comparisons of -9 with 0 to obtain this result. Specifically:

1. evaluate x == 0; since it is false, the associated y == 0 will not be evaluated, and we next

2. evaluate y == 0 on the following line; since it is false, we next

3. evaluate x == 0 on the third line; since it is false, we next

4. evaluate x > 0 on the next line; since it is false, the associated y > 0 will not be evaluated, and we next

5. evaluate x < 0 on the next line; since it is true, we then evaluate

6. the y > 0 comparison, which is false, so we next

7. evaluate x < 0 on the next line; since it is true, we then evaluate y < 0, which is also true, so we return the answer 3.
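The tally in the steps above can be checked mechanically. What follows is only a minimal sketch, not code from the text: the names count, eq, gt, lt, and pointLocFlat are hypothetical, introduced here for illustration. The three helpers wrap ==, >, and < and increment a counter on every call, and pointLocFlat is the Which-based solution rewritten to use them.

```mathematica
(* Hypothetical instrumentation: each helper bumps count, then compares *)
count = 0;
eq[a_, b_] := (count++; a == b)
gt[a_, b_] := (count++; a > b)
lt[a_, b_] := (count++; a < b)

(* The flat Which solution, with every comparison routed through a helper *)
pointLocFlat[{x_, y_}] := Which[
  eq[x, 0] && eq[y, 0], 0,
  eq[y, 0], -1,
  eq[x, 0], -2,
  gt[x, 0] && gt[y, 0], 1,
  lt[x, 0] && gt[y, 0], 2,
  lt[x, 0] && lt[y, 0], 3,
  True, 4]

(* Reset the counter, classify the point, then read the counter *)
count = 0; {pointLocFlat[{-5, -9}], count}   (* {3, 8} *)
```

The result {3, 8} is the classification 3 reached after eight comparisons, five involving x and three involving y. The count relies on && and Which holding their arguments and stopping at the first decisive test.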

How can we improve this? By nesting conditional expressions inside other conditional expressions. In particular, as soon as we discover that x is less than, greater than, or equal to 0, we should make maximum use of that fact without rechecking it. That is what the following pointLoc function does.

In[31]:= pointLoc[{x_, y_}] := Which[
           x == 0, If[y == 0, 0, -2],
           x > 0, Which[y > 0, 1, y < 0, 4, True (* y == 0 *), -1],
           True, (* x < 0 *) Which[y < 0, 3, y > 0, 2, True (* y == 0 *), -1]]

Let us count up the comparisons for (-5, -9) this time: (i) evaluate x == 0; since it is false, we next (ii) evaluate x > 0; since it is false, we go to the third branch of the Which and evaluate True, which is, of course, true; then (iii) evaluate y < 0, which is true, and we return 3. Thus, we made only three comparisons, a substantial improvement.
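The same kind of check can be sketched for the nested version. As before, count, eq, gt, lt, and pointLocNested are hypothetical names used only for this illustration; the helpers count each comparison they perform.

```mathematica
(* Hypothetical instrumentation: each helper bumps count, then compares *)
count = 0;
eq[a_, b_] := (count++; a == b)
gt[a_, b_] := (count++; a > b)
lt[a_, b_] := (count++; a < b)

(* Nested conditionals: the sign of x is settled once, then only y is tested *)
pointLocNested[{x_, y_}] := Which[
  eq[x, 0], If[eq[y, 0], 0, -2],
  gt[x, 0], Which[gt[y, 0], 1, lt[y, 0], 4, True, -1],
  True, Which[lt[y, 0], 3, gt[y, 0], 2, True, -1]]

count = 0; {pointLocNested[{-5, -9}], count}   (* {3, 3} *)
count = 0; {pointLocNested[{2, -4}], count}    (* {4, 4} *)
```

For {-5, -9} the counter stops at three, matching the count in the text. Under this scheme no point needs more than four comparisons.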

When pattern matching is used, as in our first, multiclause solution, efficiency calculations are more difficult. It would be inaccurate to say that Mathematica has to compare x and y to 0 to tell whether the first clause applies; what actually happens is more complex. What is true, however, is that it will do the comparisons indicated in the last four clauses. So, even if we discount the first three clauses with argument (-5, -9), some extra comparisons are done. Specifically: (i) the comparison x > 0 is done; then, (ii) x < 0 and (iii) y > 0; then, (iv) x < 0 and (v) y < 0. This can be avoided by using conditional expressions within clauses.

In[32]:= pointLoc[{0, 0}] := 0
         pointLoc[{x_, 0}] := -1
         pointLoc[{0, y_}] := -2
         pointLoc[{x_, y_}] := If[x < 0, 2, 1] /; y > 0
         pointLoc[{x_, y_}] := If[x < 0, 3, 4] (* /; y < 0 *)

Now, no redundant comparisons are done. For (-5, -9), since y > 0 fails, the fourth clause is not used, so the x < 0 comparison in it is not done. Only the single x < 0 comparison in the final clause is done, for a total of two comparisons.

Having done all these versions of pointLoc, we would be remiss if we did not remind the reader of a basic fact of life in programming: your time is more valuable than your computer’s time. You should not be worrying about how slow a function is until there is a demonstrated need to worry. Far more important is the clarity and simplicity of the code, since this will determine how much time you (or another programmer) will have to spend when it comes time to modify it. In the case of pointLoc, we would argue that we got lucky and found a version (the final one) that wins on both counts (if only programming were always like that!).

Finally, a technical, but potentially important, point: Not all of the versions of pointLoc work exactly the same. The integer 0, as a pattern, does not match the real number 0.0, since they have different heads. Thus, using the last version as an example, pointLoc[{0.0,0.0}] returns 4.

In[37]:= pointLoc[{0.0, 0.0}]

Out[37]= 4
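The difference comes from pattern matching versus numerical comparison. In the small sketch below, pointLocWhich is a name chosen here for the Which-based version, so that it does not overwrite the multiclause definitions. The literal pattern 0 matches only the integer 0, whereas the test x == 0 compares numerical values.

```mathematica
{Head[0], Head[0.0]}   (* {Integer, Real} *)
MatchQ[0.0, 0]         (* False: the literal pattern 0 does not match 0.0 *)
0.0 == 0               (* True: Equal compares the numerical values *)

(* The Which-based version tests with ==, so it treats 0.0 like 0 *)
pointLocWhich[{x_, y_}] := Which[
  x == 0 && y == 0, 0,
  y == 0, -1,
  x == 0, -2,
  x > 0 && y > 0, 1,
  x < 0 && y > 0, 2,
  x < 0 && y < 0, 3,
  True, 4]

pointLocWhich[{0.0, 0.0}]   (* 0 *)
```

So the versions built on If and Which classify {0.0, 0.0} as the origin, while the versions that rely on the literal patterns {0, 0}, {x_, 0}, and {0, y_} do not.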

See Section 6.2 for a discussion of alternatives, which allow us to deal efficiently with these various cases.

Exercises

1. Using an If function, write a function gcd[m,n] that implements the Euclidean algorithm (see Exercise 10 of Section 5.2) for finding the greatest common divisor of m and n.

2. Use Piecewise to define the pointLoc function given in this section.

3. Extend pointLoc to three dimensions, following this rule: for point (x, y, z), if z ≥ 0, then give the same classification as (x, y), with the exception that zero is treated as a positive number (so the only classifications are 1, 2, 3, and 4); if z < 0, add 4 to the classification of (x, y) (with the same exception). For example, (1, 0, 1) is in octant 1, and (0, -3, -3) is in octant 8. pointLoc should work for points in two or three dimensions.

The use of rules to transform expressions from one form to another is one of the most powerful and useful tools available in the Mathematica programming language. The thousands of rules built in to Mathematica can be expanded limitlessly through the creation of user-defined rules. Rules can be created to change the form of expressions, to filter data based on some criteria, and can be set up to apply to broad classes of expressions or limited to certain narrow domains through the use of appropriate pattern matching techniques. These rules can perform many of the tasks normally associated with more traditional programming constructs, such as we have discussed in the chapters on procedural and functional programming. In this chapter we will discuss the structure and application of rules to common programming tasks and look at their application in some concrete examples.

6.1 Introduction

Users of Mathematica typically first encounter rules as the output to many built-in functions. For example, the Solve function returns its solutions as a list of rules.

In[1]:= soln = Solve[a x^2 + b x + c == 0, x]

Out[1]= {{x -> (-b - Sqrt[b^2 - 4 a c])/(2 a)}, {x -> (-b + Sqrt[b^2 - 4 a c])/(2 a)}}

They are also used to specify options for functions and replacement rules in many kinds of computations.

In[2]:= FactorInteger[5, GaussianIntegers -> True]

Out[2]= {{-I, 1}, {1 + 2 I, 1}, {2 + I, 1}}

In[3]:= StringReplace["acgttttccctgagcataaaaacccagcaatacg", {"ca" -> "CA", "tt" -> "TT"}]

Out[3]= acgTTTTccctgagCAtaaaaaccCAgCAatacg

When you define a function via an assignment such as the function f below, you are defining a rule that says whenever f is given an argument, it should be replaced with that argument squared. This rule will be applied automatically whenever you evaluate f[anything].

In[4]:= f[x_] := x^2

In[5]:= f[bob]

Out[5]= bob^2
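One way to see that such a definition really is stored as a rule is to ask for the rules attached to the symbol. The following is a small sketch using the built-in DownValues; it redefines f so that the block stands on its own.

```mathematica
ClearAll[f]
f[x_] := x^2     (* the definition is stored as a delayed rule for f *)

DownValues[f]    (* {HoldPattern[f[x_]] :> x^2} *)
f[3]             (* 9: the stored rule is applied automatically *)
```

The output of DownValues shows the left-hand side pattern f[x_] paired with the right-hand side x^2, which is the rule the evaluator applies whenever an expression of the form f[anything] is evaluated.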

On the other hand, you can set up rules to be applied on demand by using the replacement operator ReplaceAll, written in shorthand notation as /. . These rules can then be used to transform one expression into another. For example, the following rule is used to extract the real and imaginary parts of a complex number and convert it to an ordered pair.

In[6]:= 3 + 4 I /. Complex[a_, b_] -> {a, b}

Out[6]= {3, 4}

This rule reverses the elements in each ordered pair.

In[7]:= , 1 , , 2 , , 3 . x_, y_ y, x

Out[7]= 1, , 2, , 3,
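As a minimal sketch of how such a swapping rule behaves, the example below uses the placeholder symbols a, b, and c, which are chosen here purely for illustration. It also shows a case worth keeping in mind: /. tests the whole expression before its parts, so a list that itself has exactly two elements matches {x_, y_} at the top level.

```mathematica
pairs = {{a, 1}, {b, 2}, {c, 3}};
pairs /. {x_, y_} -> {y, x}
(* {{1, a}, {2, b}, {3, c}}: the outer list has three elements, so only the pairs match *)

(* A list of exactly two pairs matches {x_, y_} as a whole, so the pairs trade places *)
{{a, 1}, {b, 2}} /. {x_, y_} -> {y, x}
(* {{b, 2}, {a, 1}} *)

(* Replace with level specification {1} applies the rule only to the elements *)
Replace[{{a, 1}, {b, 2}}, {x_, y_} -> {y, x}, {1}]
(* {{1, a}, {2, b}} *)
```

This assumes that a, b, c, x, and y have no values assigned in the current session.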

And here is a rule that turns each of the superscripts in the polynomial below into a subscript.

In[8]:= poly Factor 1 x11

Out[8]= 1 x 1 x x2 _x3 _x4 _x5 _x6 _x7 _x8 _x9 _x10

In[9]:= ToBoxes[poly] /. SuperscriptBox -> SubscriptBox // DisplayForm

Out[9]//DisplayForm=

1 x 1 x x2 x3 x4 x5 x6 x7 x8 x9 x10

Rule-based programming is such a useful construct for manipulating lists and arbitrary expressions that no user of Mathematica should be without a working knowledge of this paradigm. This chapter gives a thorough introduction to pattern matching and then proceeds to rule-based programs, many of which were introduced earlier as functional or procedural programs.

6.2 Patterns

Blanks

When you make an assignment to a symbol, like x=4, you are making a rule that should be applied to the literal expression x. Loosely speaking, the rule says, replace x with the value 4 whenever x is encountered. We have seen that you can also define functions of one or more arguments that allow you to substitute arbitrary expressions for those arguments.

In[1]:= f[x_] := x + 1

The left-hand side of the above assignment is a pattern. It contains a blank (underscore) which can stand for any expression, not just the literal expression x.

In[2]:= f

Out[2]= 1

In[3]:= f[bob]

Out[3]= 1 + bob

While any specific expression can be pattern matched (because any object must match itself), we usually want to be able to pattern match large classes of expressions (for example, a sequence of expressions or expressions having Integer as the head). For this purpose, patterns are defined as expressions that may contain blanks. That is to say, a pattern may contain one of the following: a single (_) blank, a double (__) blank, or a triple (___) blank.

We will find it useful to identify the pattern to which an expression is matched (for example, on the left-hand side of a function definition) so that it can be referred to by name elsewhere (for example, on the right-hand side of the function definition). A pattern can be labeled by name_, or name__, or name___ (which can be read as “a pattern called name”) and the labeled pattern will be matched by the same expression that matches its unlabeled counterpart. The matching expression is given the name used in the labeled pattern.
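To make the three kinds of blank concrete: a single blank stands for exactly one expression, a double blank for a sequence of one or more expressions, and a triple blank for a sequence of zero or more. The sketch below uses the hypothetical function names countArgs and atLeastOne, introduced only for illustration, together with labeled patterns.

```mathematica
ClearAll[countArgs, atLeastOne]

(* args___ accepts any number of arguments, including none *)
countArgs[args___] := Length[{args}]
countArgs[]           (* 0 *)
countArgs[a, b, c]    (* 3 *)

(* args__ requires at least one argument *)
atLeastOne[args__] := {args}
atLeastOne[a, b]      (* {a, b} *)
atLeastOne[]          (* atLeastOne[] is returned unevaluated: nothing matches *)

(* Labels let the right-hand side refer to whatever was matched *)
{1, 2, 3} /. {first_, rest___} -> {rest, first}   (* {2, 3, 1} *)
```

In the last line, first is bound to 1 and rest to the sequence 2, 3, so the rule moves the first element to the end of the list.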

You can see what class of expressions matches a given pattern by using MatchQ. For example, this tests whether the symbol bob matches the pattern _; it does, because a single underscore can stand for any Mathematica expression.

In[4]:= MatchQ[bob, _]

Out[4]= True

This tests whether the number 3.14 matches any expression with head Real.

In[5]:= MatchQ[3.14, _Real]

Out[5]= True

Of course 3.14 does not match any expression with head Integer.

In[6]:= MatchQ[3.14, _Integer]

Out[6]= False

If you want to look at a list of expressions and see which ones match a particular pattern, you can use Cases. Cases[expr, patt] outputs those elements of expr that match the pattern patt. For example, the only two elements of the list below that have head Integer are 3 and 17. Notice the fourth element is a string.

In[7]:= Cases[{3, 3.14, 17, "3", 4 + 5 I}, _Integer]

Out[7]= {3, 17}

In[8]:= Cases[{3, 3.14, 17, "3", 4 + 5 I}, _String]

Out[8]= {3}

Remember that the OutputForm of strings is to display without the quote characters. If you want to check the structure of this last output, use FullForm or check its Head.

In[9]:= FullForm[%]

Out[9]//FullForm=

List["3"]

Here are some additional examples of pattern matching. This next example matches all those expressions with head g.

In[10]:= Cases[{g[x], f[x], g[h[x]], g[a, 0]}, _g]

Out[10]= {g[x], g[h[x]], g[a, 0]}

In the following example, the pattern {p_,q_} matches any list with two elements.

In[11]:= Cases[{{a, b}, {}, {1, 0}, {c, d, 3}}, {p_, q_}]

Out[11]= {{a, b}, {1, 0}}

Let us clear symbols we no longer need.

In[12]:= Clear[f]
