2.4.2 The Binary Search - Data Abstraction & Problem Solving with C++ Walls and Mirrors (6th Ed

Searching is an important task that occurs frequently. Often, searches are for a particular entry in an array. We now will examine a few searching problems that have recursive solutions. Our goal is to develop further your understanding of recursion.

This chapter began with an intuitive approach to a binary search algorithm by presenting, at a high level, a way to find a word in a dictionary. We now develop this algorithm fully and illustrate some important programming issues.

Recall the earlier solution to the dictionary problem:

search(aDictionary: Dictionary, word: string)
   if (aDictionary is one page in size)
      Scan the page for word
   else
   {
      Open aDictionary to a point near the middle
      Determine which half of aDictionary contains word
      if (word is in the first half of aDictionary)
         search(first half of aDictionary, word)
      else
         search(second half of aDictionary, word)
   }

Now alter the problem slightly by searching an array anArray of integers for a given value, the target. The array, like the dictionary, must be sorted, or else a binary search is not applicable. Hence, assume that

   anArray[0] ≤ anArray[1] ≤ anArray[2] ≤ … ≤ anArray[size - 1]

where size is the size of the array. A high-level binary search for the array problem is

binarySearch(anArray: ArrayType, target: ValueType)
   if (anArray is of size 1)
      Determine if anArray's value is equal to target
   else
   {
      Find the midpoint of anArray
      Determine which half of anArray contains target
      if (target is in the first half of anArray)
         binarySearch(first half of anArray, target)
      else
         binarySearch(second half of anArray, target)
   }
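Because a binary search applies only to sorted data, it can be useful to check that precondition before searching. The following is a minimal sketch, assuming the data is held in a plain int array. The helper name isSortedAscending is hypothetical and does not appear in the text; it relies on std::is_sorted from the standard <algorithm> header.

```cpp
#include <algorithm>  // std::is_sorted

// Hypothetical helper (not from the text): returns true when
// anArray[0] <= anArray[1] <= ... <= anArray[size - 1].
// An empty array (size == 0) counts as sorted.
bool isSortedAscending(const int anArray[], int size)
{
   // std::is_sorted checks for nondecreasing order over [begin, end)
   return std::is_sorted(anArray, anArray + size);
}
```

The check examines every entry, so it costs more than the search itself. It is better suited to debugging or to a one-time validation than to a test before every search.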

Question 4

In the previous definition of writeArrayBackward, why does the base case occur when the value of first exceeds the value of last?

Question 5

Write a recursive function that computes and returns the product of the first n ≥ 1 real numbers in an array.

Question 6

Show how the function that you wrote for the previous question satisfies the properties of a recursive function.

Question 7

Write a recursive function that computes and returns the product of the integers in the array anArray[first..last].

CHECK POINT

A binary search conquers one of its subproblems at each step

Recursion with Arrays 69

Although the solution is conceptually sound, you must consider several details before you can implement the algorithm:

1. How will you pass half of anArray to the recursive calls to binarySearch? You can pass the entire array at each call but have binarySearch search only anArray[first..last], that is, the portion anArray[first] through anArray[last]. Thus, you would also pass the integers first and last to binarySearch:

   binarySearch(anArray, first, last, target)

With this convention, the new midpoint is given by

   mid = (first + last) / 2

Then binarySearch(first half of anArray, target) becomes

   binarySearch(anArray, first, mid - 1, target)

and binarySearch(second half of anArray, target) becomes

   binarySearch(anArray, mid + 1, last, target)

2. How do you determine which half of the array contains target? One possible implementation of

   if (target is in the first half of anArray)

is

   if (target < anArray[mid])

However, there is no test for equality between target and anArray[mid]. This omission can cause the algorithm to miss target. After the previous halving algorithm splits anArray into halves, anArray[mid] is not in either half of the array. (In this case, two halves do not make a whole!) Therefore, you must determine whether anArray[mid] is the value you seek now, because later it will not be in the remaining half of the array. The interaction between the halving criterion and the termination condition (the base case) is subtle and is often a source of error. We need to rethink the base case.

3. What should the base case(s) be? As it is written, binarySearch terminates only when an array of size 1 occurs; this is the only base case. By changing the halving process so that anArray[mid] remains in one of the halves, it is possible to implement the binary search correctly so that it has only this single base case. However, it can be clearer to have two distinct base cases as follows:

• first > last. You will reach this base case when target is not in the original array.

• target == anArray[mid]. You will reach this base case when target is in the original array.

These base cases are a bit different from any you have encountered previously. In a sense, the algorithm determines the answer to the problem from the base case it reaches. Many search problems have this flavor.

4. How will binarySearch indicate the result of the search? If binarySearch successfully locates target in the array, it could return the index of the array value that is equal to target. Because this index would never be negative, binarySearch could return a negative value if it does not find target in the array.

The C++ function binarySearch that follows implements these ideas. The two recursive calls to binarySearch are labeled as X and Y for use in a later box trace of this function.

The array halves are anArray[first..mid-1] and anArray[mid+1..last]; neither half contains anArray[mid]

Determine whether anArray[mid] is the target you seek

/** Searches the array anArray[first] through anArray[last]
    for a given value by using a binary search.
 @pre  0 <= first, last <= SIZE - 1, where SIZE is the
    maximum size of the array, and anArray[first] <=
    anArray[first + 1] <= ... <= anArray[last].
 @post  anArray is unchanged and either anArray[index] contains
    the given value or index == -1.
 @param anArray  The array to search.
 @param first  The low index to start searching from.
 @param last  The high index to stop searching at.
 @param target  The search key.
 @return  Either index, such that anArray[index] == target, or -1. */
int binarySearch(const int anArray[], int first, int last, int target)
{
   int index;
   if (first > last)
      index = -1; // target not in original array
   else
   {
      // If target is in anArray,
      // anArray[first] <= target <= anArray[last]
      int mid = first + (last - first) / 2;
      if (target == anArray[mid])
         index = mid; // target found at anArray[mid]
      else if (target < anArray[mid])
         // Point X
         index = binarySearch(anArray, first, mid - 1, target);
      else
         // Point Y
         index = binarySearch(anArray, mid + 1, last, target);
   } // end if

   return index;
} // end binarySearch

Notice that if target occurs in the array, it must be in the segment of the array delineated by first and last. That is, the following is true:

   anArray[first] ≤ target ≤ anArray[last]

Figure 2-10 shows box traces of binarySearch when it searches the array containing 1, 5, 9, 12, 15, 21, 29, and 31. Notice how the labels X and Y of the two recursive calls to binarySearch appear in the diagram. Exercise 16 at the end of this chapter asks you to perform other box traces with this function.
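A caller usually wants to search an entire array, so the initial values of first and last can be supplied by a small wrapper. The following is a sketch only: searchWholeArray is a hypothetical name that does not appear in the text, and binarySearch is repeated in condensed form so that the example stands alone.

```cpp
// Condensed copy of binarySearch from the text, repeated here
// so that this sketch is self-contained.
int binarySearch(const int anArray[], int first, int last, int target)
{
   if (first > last)
      return -1;                          // target not in original array

   int mid = first + (last - first) / 2;
   if (target == anArray[mid])
      return mid;                         // target found at anArray[mid]
   else if (target < anArray[mid])
      return binarySearch(anArray, first, mid - 1, target);  // Point X
   else
      return binarySearch(anArray, mid + 1, last, target);   // Point Y
}

// Hypothetical wrapper (not from the text): searches all size entries
// of a sorted array. When size is 0, first > last holds immediately,
// so the result is -1.
int searchWholeArray(const int anArray[], int size, int target)
{
   return binarySearch(anArray, 0, size - 1, target);
}
```

With the eight-entry array of Figure 2-10, searchWholeArray(anArray, 8, 9) returns 2 and searchWholeArray(anArray, 8, 6) returns -1, in agreement with the two traces.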

Note:

When developing a recursive solution, you must be sure that the solutions to the smaller problems really do give you a solution to the original problem. For example, binarySearch works because each smaller array is sorted and the value sought is between its first and last values.
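Point 3 of the earlier list observes that a single base case suffices if anArray[mid] remains in one of the halves. The following sketch shows one way that variant might look; it is an illustration under stated assumptions, not the text's implementation. The name binarySearchOneBase is hypothetical, and the function assumes first <= last on entry, so the segment is never empty.

```cpp
// Hypothetical single-base-case variant (not from the text).
// Assumes first <= last and that anArray[first..last] is sorted.
int binarySearchOneBase(const int anArray[], int first, int last, int target)
{
   // Only base case: a segment of size 1
   if (first == last)
      return (anArray[first] == target) ? first : -1;

   // Because first < last here, first <= mid < last,
   // so both halves are strictly smaller than the segment
   int mid = first + (last - first) / 2;

   if (target <= anArray[mid])
      // anArray[mid] stays in the first half: anArray[first..mid]
      return binarySearchOneBase(anArray, first, mid, target);
   else
      // Second half: anArray[mid+1..last]
      return binarySearchOneBase(anArray, mid + 1, last, target);
}
```

Here the two halves do make a whole, so no separate equality test is needed. The cost is that the recursion always continues down to a segment of size 1, even when anArray[mid] already equals target.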

There is another implementation issue, one that deals specifically with C++, to consider. Recall that an array is never passed to a function by value and is therefore not copied. This aspect of C++ is particularly useful in a recursive function such as binarySearch. If the array anArray is large, many recursive calls to binarySearch might be necessary. If each call copied anArray, much memory and time would be wasted. On the other hand, because anArray is not copied, the function can alter the array's values unless you specify anArray as const, as was done for binarySearch.

Because an array argument is always passed by reference, a function can alter it unless you specify the array as const
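The effect of const on an array parameter can be seen in a small sketch. Both function names below are hypothetical and chosen only for illustration.

```cpp
// Hypothetical helper (not from the text). Without const, the function
// works on the caller's own entries, not on a copy, so this assignment
// changes the caller's array.
void zeroFirst(int anArray[])
{
   anArray[0] = 0;
}

// Hypothetical helper (not from the text). With const, the function can
// read the entries but cannot assign to them.
int readFirst(const int anArray[])
{
   // anArray[0] = 0;   // would not compile: anArray's entries are const
   return anArray[0];
}
```

After a call zeroFirst(data), the caller's data[0] is 0. A call readFirst(data) leaves data untouched, and the compiler enforces that guarantee.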

FIGURE 2-10 Box traces of binarySearch with anArray = <1, 5, 9, 12, 15, 21, 29, 31>: (a) a successful search for 9; (b) an unsuccessful search for 6

(a) target = 9
   first = 0, last = 7, mid = (0 + 7) / 2 = 3, target < anArray[3]: call X
   first = 0, last = 2, mid = (0 + 2) / 2 = 1, target > anArray[1]: call Y
   first = 2, last = 2, mid = (2 + 2) / 2 = 2, target = anArray[2]: return 2

(b) target = 6
   first = 0, last = 7, mid = (0 + 7) / 2 = 3, target < anArray[3]: call X
   first = 0, last = 2, mid = (0 + 2) / 2 = 1, target > anArray[1]: call Y
   first = 2, last = 2, mid = (2 + 2) / 2 = 2, target < anArray[2]: call X
   first = 2, last = 1, first > last: return -1

A box trace of a recursive function that has an array argument requires a new consideration. Because the array anArray is neither a value argument nor a local variable, it is not a part of the function's local environment, and so the entire array anArray should not appear within each box. Therefore, as Figure 2-11 shows, you represent anArray outside the boxes, and all references to anArray affect this single representation.

Represent reference arguments outside of the boxes in a box trace
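A box trace can also be produced mechanically by instrumenting the function. The sketch below is one possible approach, not something the text provides: binarySearchTraced is a hypothetical name, and the wording of each recorded line is an arbitrary choice. Each call records only its own local environment (first, last, and mid). The array and the trace are shared by all calls, much as the array sits outside the boxes.

```cpp
#include <string>
#include <vector>

// Hypothetical instrumented copy of binarySearch (not from the text).
// Each call appends one line to trace describing its box.
int binarySearchTraced(const int anArray[], int first, int last, int target,
                       std::vector<std::string>& trace)
{
   std::string box = "first = " + std::to_string(first) +
                     ", last = " + std::to_string(last);
   if (first > last)
   {
      trace.push_back(box + ", first > last, return -1");
      return -1;
   }

   int mid = first + (last - first) / 2;
   box += ", mid = " + std::to_string(mid);

   if (target == anArray[mid])
   {
      trace.push_back(box + ", target == anArray[mid], return " +
                      std::to_string(mid));
      return mid;
   }
   else if (target < anArray[mid])
   {
      trace.push_back(box + ", target < anArray[mid], call X");
      return binarySearchTraced(anArray, first, mid - 1, target, trace);
   }
   else
   {
      trace.push_back(box + ", target > anArray[mid], call Y");
      return binarySearchTraced(anArray, mid + 1, last, target, trace);
   }
}
```

For the array of Figure 2-10, a search for 6 records four lines, one per box, and a search for 9 records three.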

Note:

Notice that the C++ computation of the midpoint mid is

   int mid = first + (last - first) / 2;

instead of

   int mid = (first + last) / 2;

as the pseudocode would suggest. If you were to search an array of more than 2^30, or about 1 billion, elements, the sum of first and last could exceed the largest possible value of a 32-bit int, which is 2^31 - 1. The computation first + last would then overflow, typically to a negative integer, and result in a negative value for mid. If this negative value of mid were used as an array index, it would be out of bounds and cause incorrect results. For 0 <= first <= last, the computation first + (last - first) / 2 gives the same value as (first + last) / 2 and avoids this error.
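The difference between the two midpoint computations can be checked without triggering the overflow itself. The sketch below assumes that long long is wider than int, as it is on common platforms. Both helper names are hypothetical and do not appear in the text.

```cpp
#include <climits>  // INT_MAX

// Hypothetical helper (not from the text): true when first + last would
// exceed INT_MAX. The sum is formed in long long so that the check
// itself cannot overflow (assumes long long is wider than int).
bool sumWouldOverflow(int first, int last)
{
   return static_cast<long long>(first) + last > INT_MAX;
}

// Hypothetical helper (not from the text): the midpoint as computed in
// binarySearch. For 0 <= first <= last, last - first fits in an int,
// so no intermediate result overflows.
int safeMid(int first, int last)
{
   return first + (last - first) / 2;
}
```

For example, with a 32-bit int, first = 1500000000 and last = 2000000000 have a sum of 3500000000, which exceeds INT_MAX = 2147483647, yet safeMid(first, last) is 1750000000.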

In document Data Abstraction & Problem Solving with C++ Walls and Mirrors (6th Edition) (Page 95-99)