In the last chapter we reduced the amount of trial-and-error search in a problem by constructing equivalent state-action trees of reduced size. In this chapter, we discuss a very different way of reducing the number of state-action sequences that have to be searched before achieving the solution. The method has two parts: (a) defining an evaluation function over all states including the goal state and (b) choosing actions at any given state to achieve a next state with an evaluation closer to that of the goal. Picking an action on the basis of such a local evaluation of its consequences is known as hill climbing, since evaluation functions are frequently defined so that the goal state has the maximum value on some one-dimensional evaluation function. Figure 5-1 illustrates the application of state evaluation and hill climbing to the state-action tree for some unknown problem with a hypothetical evaluation function defined over each state. The value of the function for each state is written inside the circle for each node (state). This example arbitrarily uses an integer-valued evaluation function, with the beginning state having value 0, the goal state having value 10, and nongoal states having values intermediate between 0 and 10. Application of a one-step hill-climbing method to this state-action tree with this evaluation function yields the sequence of action choices shown by arrows in Fig. 5-1. You will note that hill climbing need not succeed in achieving the goal the first time, and this time it did not.
FIGURE 5-1
State-action tree with an integer-valued evaluation function defined over every state (node). One-step hill climbing results in the action sequence shown by the arrow. Note that, in this case, hill climbing does not achieve the goal state. (Labels in the original figure: beginning state, state level, goal state, hill-climbing result.)
Having failed to achieve the goal by hill climbing in the first attempt, there are many things you can do to achieve the goal, still using hill climbing. You could try choosing the action with the next-to-best value at one of the various nodes on the original hill-climbing path, use strict hill climbing at all other nodes, and see if you achieve the goal with any of these minimal violations of the general hill-climbing method. In the present instance, this minimal modification of hill climbing would succeed if you took the next-to-best action going from state level 0 to state level 1, because, from that point on, hill climbing results in an action sequence that achieves the goal.
Alternatively, you could try two-step hill climbing and choose the sequence of two actions at any given node that results in a node with the greatest value. This two-step hill climbing would produce the goal the first time in the problem shown in Fig. 5-1.
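The difference between one-step and two-step hill climbing can be sketched in a few lines of Python. The tree below is a small hypothetical one, not the tree of Fig. 5-1, and the names `TREE`, `best_value`, and `hill_climb` are illustrative assumptions. The sketch also takes one action at a time and then looks ahead again, which in this small tree gives the same path as committing to a whole two-action sequence.

```python
# Hypothetical tree (NOT the tree of Fig. 5-1): node -> (evaluation, child nodes).
TREE = {
    "start": (0, ["a", "b"]),
    "a": (4, ["a1", "a2"]),
    "b": (3, ["b1", "b2"]),
    "a1": (5, []),
    "a2": (6, []),
    "b1": (7, []),
    "b2": (10, []),  # goal state, evaluation 10
}


def best_value(node, depth):
    """Best evaluation found `depth` actions below node (or at a leaf reached sooner)."""
    value, children = TREE[node]
    if depth == 0 or not children:
        return value
    return max(best_value(child, depth - 1) for child in children)


def hill_climb(start, lookahead=1):
    """Take the action whose `lookahead`-step outlook is best; repeat until a leaf."""
    path = [start]
    while TREE[path[-1]][1]:
        children = TREE[path[-1]][1]
        path.append(max(children, key=lambda c: best_value(c, lookahead - 1)))
    return path


one_step = hill_climb("start", lookahead=1)
two_step = hill_climb("start", lookahead=2)
print(one_step, TREE[one_step[-1]][0])  # ends at a node with evaluation 6
print(two_step, TREE[two_step[-1]][0])  # ends at the goal, evaluation 10
```

In this hypothetical tree, one-step hill climbing is drawn toward the node valued 4 and ends short of the goal, while two-step hill climbing sees the goal beyond the node valued 3.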
Finally, you could question the evaluation function you had defined over the states in the problem. There is usually no way to be certain that you have defined the evaluation function that is ideal for representing progress in achieving the goal in any given problem. Sometimes the failure of hill climbing suggests that a reexamination of the (explicit or implicit) evaluation function is in order. Evaluation functions are generally not given in the problem (except in optimization problems), and so any evaluation function can be chosen to see if it works in conjunction with hill climbing (or some other problem-solving method) to produce the solution to the problem.
Sometimes when hill climbing is used in conjunction with a state evaluation function, a real-valued (numerical) evaluation is defined for each state. In other cases, you may have some ability to compare several states and judge which is closer to the goal, but no actual numbers are assigned to the states. Whether or not numbers are assigned to states, two states can have equivalent evaluations, in which case you cannot choose between them on the basis of the evaluation alone.
So far we have discussed problems with only a single-valued (one-dimensional) state-evaluation function, but there are also problems where the goal differs from the beginning state on several dimensions. In these cases, it is usually possible to make judgments regarding closeness to the goal on each of the dimensions separately, but there may be no single, necessarily optimal way to combine the evaluations on each separate dimension into a single overall evaluation of each state. Thus, you could have a vector-valued evaluation function assigned to each state, as shown in Fig. 5-2.
There are a number of hill-climbing options in regard to vector-valued evaluation functions, such as that shown in Fig. 5-2. You could try various alternation schemes - that is, hill climbing on one dimension for a while and then hill climbing on another dimension for a while. Obviously, when no improvement is possible on a particular dimension by any action that you could take from the node where you are currently located, you should hill-climb on a different dimension for at least that node. If you have reached the goal with respect to one dimension, you should also hill-climb on the other dimensions. In using these alternation schemes, it helps to keep records of the nodes where you could have chosen to improve on a different dimension than the one you did choose. When the first hill-climbing path through the state-action tree fails to produce the solution, these nodes where you had good alternative choices are the obvious places to back up to and start new paths.
FIGURE 5-2
State-action tree with a two-dimensional vector-valued evaluation function defined over every state (node). In this case, the goal state has the evaluation vector (5, 4), and the beginning state has the evaluation vector (0, 0). The path taken by a hill-climbing method depends on whether you hill-climb on weighted summed components or try some alternation scheme. In the former case, the exact weighting of the two component values is also important in determining the path taken by hill climbing. (Labels in the original figure: beginning state, state level, goal state.)
Another approach to multidimensional evaluation functions is to combine the values on the separate dimensions into a single overall value for each state. If there is some single most natural way to combine them, do it that way first; but remember that, no matter how natural the combination method might be, it could be the wrong way to combine the values on the different dimensions - that is, wrong for achieving the solution by one-dimensional hill climbing. If the originally chosen combination method fails to work, try some other method of combination, alternation schemes with the original multidimensional evaluation function, multistep hill climbing, defining a new evaluation function, or the like.
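To see how the choice of weights can change the path, consider the following minimal Python sketch. The two successor states and their evaluation vectors are hypothetical (they are not read from Fig. 5-2), and combining by weighted sum is only one of many possible combination methods. The names `SUCCESSORS`, `weighted_value`, and `choose` are assumptions made for illustration.

```python
# Hypothetical successors of a single node, each with a two-component
# evaluation vector. Larger components are assumed to be closer to the goal.
SUCCESSORS = {"left": (3, 0), "right": (1, 3)}


def weighted_value(vector, weights):
    """Combine a vector evaluation into one number by a weighted sum."""
    return sum(w * v for w, v in zip(weights, vector))


def choose(successors, weights):
    """One-step hill climbing on the combined (weighted-sum) evaluation."""
    return max(successors, key=lambda name: weighted_value(successors[name], weights))


print(choose(SUCCESSORS, (1, 1)))  # equal weights favor "right" (4 versus 3)
print(choose(SUCCESSORS, (2, 1)))  # doubling the first weight favors "left" (6 versus 5)
```

Neither weighting is correct in itself; each is simply a different evaluation function to try with hill climbing.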
APPLICATIONS
Examples of the use of state-evaluation functions and hill climbing abound in problem solving. For instance, when you plan a trip across the country on a map, you initially examine roads that go in nearly the right direction. The right direction is the direction that reduces the distance between where you are and where you are going at the fastest rate. Of course, choosing the road at the beginning of a trip that goes closest to the right direction may prove to be a bad choice. This road may eventually lead to a dead end or require you to go far out of the way to reach the goal. In addition, planning a trip on a map usually involves other considerations - speed, scenery, or other properties - besides finding the shortest road between the starting and ending points. These considerations place you in the position of doing hill climbing on a vector-valued evaluation function. Despite all these complications, experience suggests that hill climbing is a prominent method used in solving trip-planning problems with a map.
Pencil-and-paper maze problems are rather similar to trip-planning problems on a map, and people frequently use hill climbing in an attempt to solve them. However, challenging maze problems are usually deliberately constructed to frustrate a hill-climbing approach. Maze problems frequently require nonoptimal choices at early and middle stages of the solution and may even require detours (increases in the distance from the goal, as measured by the most obvious evaluation function of physical distance). On the other hand, maze problems usually do not involve considerations of road speed or scenic beauty.
Defining an explicit evaluation function and employing hill climbing is also useful in solving the one-heavy-coin problem discussed in Chapter 3:
You have a pile of 24 coins. Twenty-three of these coins have the same weight, and one is heavier. Your task is to determine which coin is heavier and to do so in the minimum number of weighings. You are given a beam balance (scale), which will compare the weight of any two sets of coins out of the total set of 24 coins.
A suitable evaluation function for solving this problem would be the number of coins whose classification as heavy or light is known. At the beginning of the problem, the value of the function is zero, since none of the 24 coins is known to be either heavy or light. In the goal state, the heavy-light classification of all 24 coins is known, so the value of the function is 24. Thus, a hill-climbing approach would choose an action at each node that maximized the number of coins whose heavy-light classification is known.
A very large number of alternative actions are present at each node. For example, at the first node, you might weigh any one of the coins against any two of the other coins. In general, you might weigh any set of m coins against any set of n coins, where m + n ≤ 24. The number of different pairs of sets of m and n coins that satisfy the restriction that m + n ≤ 24 is extremely large. However, the most elementary hill-climbing approach immediately rules out all actions that do not involve weighing two sets containing equal numbers of coins in the two pans of the beam balance. This exclusion reduces the number of alternative actions considerably.
Furthermore, using the method of defining equivalence classes of actions discussed in Chapter 4, note that, at the first node of the problem, you have no way to distinguish different subsets of i coins; thus, you must consider any two sets of i coins to be equivalent to each other (in their likelihood of containing the heavy coin). This consideration reduces the number of different alternative actions at the first node to 12 - that is, a set of 12 coins is weighed against a set of 12 coins, a set of 11 coins against another set of 11 coins, 10 against 10, and so on, down to 1 against 1.
If you explicitly inquire which of these 12 alternative actions results in the greatest number of known coins following the first weighing, you should be led to select the optimal action at the first node - that is, to weigh a set of 8 coins against another set of 8 coins, since this maximally increases the value of the evaluation function from 0 known coins to 16 known coins following the first weighing, whatever the outcome of the first weighing.
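The comparison of the 12 alternative first weighings can be checked with a short Python sketch. It assumes that the heavy coin is equally likely to be any of the 24 coins and scores each weighing by the expected number of known coins; the names `TOTAL`, `known_after`, and `best_k` are illustrative assumptions, not part of the problem statement.

```python
from fractions import Fraction

TOTAL = 24  # coins in the pile, exactly one of which is heavy


def known_after(k):
    """For a k-against-k first weighing, return
    (known coins if the balance tilts, known coins if it balances, expected known)."""
    # Tilt: the heavy coin is among the k coins in the low pan, so the other
    # coins are known light. With k == 1 the heavy coin itself is identified.
    known_tilt = TOTAL if k == 1 else TOTAL - k
    # Balance: all 2k weighed coins are known to be light.
    known_balance = 2 * k
    p_tilt = Fraction(2 * k, TOTAL)  # heavy coin is in one of the two pans
    expected = p_tilt * known_tilt + (1 - p_tilt) * known_balance
    return known_tilt, known_balance, expected


best_k = max(range(1, 13), key=lambda k: known_after(k)[2])
print(best_k, known_after(best_k))
```

Under these assumptions the eight-against-eight weighing comes out best, and it is the only one of the 12 that yields 16 known coins on either outcome.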
The same sort of evaluation function and hill-climbing approach can be used to solve more complex coin-weighing problems, such as those involving two heavy coins or one coin that might be either heavier or lighter than the other coins. When the coins are classified into three or more categories (for example, heavy, medium, and light), then it may be useful to use as an evaluation function the number of coin-classification pairings (for example, coin 1 is heavy, coin 2 is medium, coin 3 is light) that have been ruled out.
In all of the coin-weighing problems, from the simplest to the most complex, keep in mind that, after a given weighing, the value of the evaluation function may be different for the different outcomes of the weighing. In such cases, the value of the evaluation function for a particular weighing is usually best considered to be the expected value of the evaluation function across all different outcomes, where the value of the evaluation function for each outcome is weighted by the probability of obtaining that outcome. Thus, after the first weighing of eight coins against eight coins in the previously mentioned one-heavy-coin problem, the optimal choice in the second weighing is either to weigh two coins against two coins or three coins against three coins. In either case, the three outcomes of the weighing (tilt left, balance, tilt right) are not equally likely, nor does each outcome result in an equivalent increase in the number of known coins. For example, with the three-against-three weighing (out of the eight remaining coins), the probability of their balancing evenly is 1/4, while the probability of tilt left is 3/8, and the probability of tilt right is 3/8.
For simplicity, let us use as the evaluation function the number of unknown coins, where the goal state has a value of zero unknown coins. Thus, hill climbing, in this case, means attempting to minimize the value of the evaluation function. Using this evaluation function, the value of a balanced outcome in the three-against-three weighing is two remaining unknown coins, while the value of tilt left is 3 and the value of tilt right is also 3. The overall evaluation of the three-against-three weighing, then, is (3/8 · 3) + (3/8 · 3) + (1/4 · 2) = 11/4 = 2¾.
The three-against-three weighing produces the minimum expected value on the evaluation function. This fact can be seen by computing the expected value for the other three plausible weighings - namely, one against one, two against two, and four against four. The two-against-two weighing is almost as good as the three-against-three weighing, by this evaluation function. The two-against-two weighing has an expected value of (1/4 · 2) + (1/4 · 2) + (1/2 · 4) = 3. The four-against-four weighing has an expected value of (1/2 · 4) + (1/2 · 4) = 4. The one-against-one weighing has the poorest expected value of all - namely, (1/8 · 0) + (1/8 · 0) + (3/4 · 6) = 4½.
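These expected values can be recomputed with exact fractions. The sketch below assumes that, after the eight-against-eight weighing, the heavy coin is equally likely to be any of the eight remaining candidates, and that a single remaining candidate counts as zero unknown coins (it must be the heavy one). The names `CANDIDATES`, `unknown`, and `expected_unknown` are illustrative assumptions.

```python
from fractions import Fraction

CANDIDATES = 8  # coins that may still be the heavy one after the first weighing


def unknown(remaining):
    """Unknown coins when `remaining` candidates are left.
    A single remaining candidate is identified as the heavy coin."""
    return 0 if remaining <= 1 else remaining


def expected_unknown(k):
    """Expected number of unknown coins after weighing k candidates against k."""
    p_tilt_each = Fraction(k, CANDIDATES)  # heavy coin lies in one given pan
    p_balance = Fraction(CANDIDATES - 2 * k, CANDIDATES)
    return 2 * p_tilt_each * unknown(k) + p_balance * unknown(CANDIDATES - 2 * k)


for k in (1, 2, 3, 4):
    print(f"{k} against {k}: {expected_unknown(k)}")
```

The sketch gives 9/2, 3, 11/4, and 4 for the one-, two-, three-, and four-coin weighings, so the three-against-three weighing has the smallest expected number of unknown coins.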
In terms of achieving the goal of determining the one heavy coin out of 24 in the minimum number of weighings, either the three-against-three weighing or the two-against-two weighing is optimal on the second weighing. Thus, in this case, hill climbing is a successful problem-solving method, since it chooses one of the two actions that will lead to the goal with the minimum number of total actions (weighings).
Solving simple linear equations provides another example of the possibility of successful use of hill climbing in problem solving. Consider the linear equation 9x + 7 = 5x + 15 as the given, with an expression of the form x = ___ being the goal. The blank, ___, represents some currently unknown real number that constitutes the value of x in the solution to the equation.
Initially, we might define a four-valued vector evaluation function for this problem, consisting of the coefficients of the x and numerical terms on the left-hand side of the equation and the x and numerical terms on the right-hand side. For the linear equation above, then, the value of the evaluation function at the given state would be (9, 7, 5, 15). The value of the evaluation function for the goal state is (1, 0, 0, ___), where ___ again indicates that we do not currently know what real number is acceptable in this position. We might choose actions at each step designed to increase the number of terms of this four-valued vector evaluation function that are in agreement with the corresponding terms of the evaluation function for the goal. Thus, if we subtract 5x from both sides of the equation, the evaluation function is changed to (4, 7, 0, 15), which is known to disagree with the evaluation function for the goal in only the first two positions (the agreement of the value in the fourth position with the desired value in the goal expression cannot be determined). Subsequently, subtracting 7 from both sides of the equation changes the evaluation function to (4, 0, 0, 8), which disagrees with the goal expression in only one position (the first). Finally, dividing both sides of the equation by 4 has an evaluation function (1, 0, 0, 2), which is known to disagree with the evaluation function for the goal in zero positions. The state achieved at this point that includes the expression x = 2 constitutes the solution to the problem.
Rather than think of this at all in terms of a four-valued vector evaluation function, we can simply think of the number of "bad" terms in the expression. Initially there are three bad terms. After subtracting 5x from both sides of the equation (obtaining 4x + 7 = 15), there are only two known bad terms. After subtracting 7 from both sides (obtaining 4x = 8), there is only one known bad term. Finally, after dividing both sides of the equation by 4 (obtaining x = 2), there are no bad terms, and the problem is solved.
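The "bad terms" evaluation can be turned into a small hill-climbing sketch in Python. A state (a, b, c, d) stands for the equation ax + b = cx + d. The three actions, the function names, and the rule that ties are broken in the order the actions are listed are all assumptions made for this illustration; with a different tie-breaking order, strict hill climbing on this evaluation function can stall.

```python
from fractions import Fraction


def bad_terms(state):
    """Count positions known to disagree with the goal (1, 0, 0, ___)."""
    a, b, c, _ = state
    return (a != 1) + (b != 0) + (c != 0)


def subtract_x_term(state):
    a, b, c, d = state
    return (a - c, b, c - c, d)  # subtract cx from both sides


def subtract_constant(state):
    a, b, c, d = state
    return (a, b - b, c, d - b)  # subtract b from both sides


def divide_by_coefficient(state):
    a, b, c, d = state
    return state if a == 0 else (a / a, b / a, c / a, d / a)


# Listing order doubles as the tie-breaking order (an assumption of this sketch).
ACTIONS = [
    ("subtract x term", subtract_x_term),
    ("subtract constant", subtract_constant),
    ("divide by coefficient", divide_by_coefficient),
]


def hill_climb(state):
    """Repeatedly take the action that leaves the fewest bad terms."""
    steps = []
    while bad_terms(state) > 0:
        candidates = ((name, act(state)) for name, act in ACTIONS)
        name, nxt = min(candidates, key=lambda pair: bad_terms(pair[1]))
        if bad_terms(nxt) >= bad_terms(state):
            break  # no action improves the evaluation: hill climbing is stuck
        state = nxt
        steps.append(name)
    return state, steps


start = tuple(Fraction(v) for v in (9, 7, 5, 15))  # 9x + 7 = 5x + 15
final, steps = hill_climb(start)
print(steps, [str(v) for v in final])
```

With the listed tie-breaking order, the sketch passes through (4, 7, 0, 15) and (4, 0, 0, 8) and ends at (1, 0, 0, 2), that is, x = 2.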
It may be somewhat difficult for someone experienced in solving such simple linear equations to imagine that anyone actually uses this sort of evaluation function and hill climbing in order to solve so simple a problem. However, this approach could be used, and, very likely, many beginning algebra students unconsciously use just such a method in solving their initial linear-equation problems.
The more experienced linear-equation solver very likely thinks of the problem in terms of three subgoals, namely, getting all the x terms on the left side of the equation, getting all the numerical terms on the right side of the equation, and dividing through by the coefficient of the x term. However, this subgoal method (to be described in detail in the following chapter) uses the same sort of evaluation function as used by the hill-climbing approach to linear-equation problems.
Once you are an experienced solver of linear equations you probably never think of evaluation functions, subgoals, or hill climbing at all but simply solve the problem using the same type of action sequence you have used in solving other such problems - namely, subtract the x term on the right-hand side of the equation from the x term on the left-hand side of the equation, then subtract the numerical term on the left-hand side of the equation from the numerical term on the right-hand side of the equation, and finally divide through by the coefficient of the x term. (This problem-solving method, knowing how to solve a problem because you recognize its relationship to other problems you solved previously, will be discussed in Chapter 9.) Thus, there are many different problem-solving methods that can all