Insertion Sort by Diminishing Increment - Advanced Sorting Methods

2. SORTING

2.3. Advanced Sorting Methods

2.3.1 Insertion Sort by Diminishing Increment

A refinement of the straight insertion sort was proposed by D. L. Shell in 1959. The method is explained and demonstrated on our standard example of eight items (see Table 2.5). First, all items that are four positions apart are grouped and sorted separately. This process is called a 4-sort. In this example of eight items, each group contains exactly two items. After this first pass, the items are regrouped into groups with items two positions apart and then sorted anew. This process is called a 2-sort. Finally, in a third pass, all items are sorted in an ordinary sort or 1-sort.

One may at first wonder if the necessity of several sorting passes, each of which involves all items, does not introduce more work than it saves. However, each sorting step over a chain either involves relatively few items or the items are already quite well ordered and comparatively few rearrangements are required. It is obvious that the method results in an ordered array, and it is fairly obvious that each pass profits from previous passes (since each i-sort combines two groups sorted in the preceding 2i-sort). It is also obvious that any sequence of increments is acceptable, as long as the last one is unity, because in the worst case the last pass does all the work. It is, however, much less obvious that the method of diminishing increments yields even better results with increments other than powers of 2.

Initial keys     44 55 12 42 94 18 06 67
4-sort yields    44 18 06 42 94 55 12 67
2-sort yields    06 18 12 42 44 55 94 67
1-sort yields    06 12 18 42 44 55 67 94

Table 2.5 An Insertion Sort with Diminishing Increments.
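The three passes of Table 2.5 can be retraced with a short Python sketch. The function name h_sort is an assumption made for this illustration; it applies straight insertion to every chain of items lying h positions apart.

```python
def h_sort(a, h):
    """Sort, in place, every chain of items that lie h positions apart."""
    for i in range(h, len(a)):
        x = a[i]
        j = i - h
        # Shift larger items of the same chain h positions to the right.
        while j >= 0 and x < a[j]:
            a[j + h] = a[j]
            j -= h
        a[j + h] = x
    return a


keys = [44, 55, 12, 42, 94, 18, 6, 67]   # the standard example
for h in (4, 2, 1):
    h_sort(keys, h)
    print(f"{h}-sort yields", " ".join(f"{k:02d}" for k in keys))
```

Each printed line corresponds to one row of the table; the final 1-sort is an ordinary straight insertion sort that finds the items already close to their final places.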

The procedure is therefore developed without relying on a specific sequence of increments. The T increments are denoted by $h_0, h_1, \ldots, h_{T-1}$ with the conditions

$h_{T-1} = 1, \quad h_{i+1} < h_i$

The algorithm is described by the procedure ShellSort [2.11] for T = 4:

PROCEDURE ShellSort;
  CONST T = 4;
  VAR i, j, k, m: INTEGER;
    x: Item;
    h: ARRAY T OF INTEGER;
BEGIN
  h[0] := 9; h[1] := 5; h[2] := 3; h[3] := 1;
  FOR m := 0 TO T-1 DO
    k := h[m];  (* current increment *)
    FOR i := k TO n-1 DO
      x := a[i]; j := i-k;
      (* shift larger items of the same chain k positions to the right *)
      WHILE (j >= 0) & (x < a[j]) DO
        a[j+k] := a[j]; j := j-k
      END ;
      a[j+k] := x
    END
  END
END ShellSort
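Whether the additional passes save work can be checked by counting item moves. The following Python sketch is an illustration only: the function name shellsort, the move counter, and the test input of 100 items in reverse order are assumptions, not part of the procedure above.

```python
def shellsort(a, increments=(9, 5, 3, 1)):
    """Sort the list a in place; return the number of item shifts performed."""
    moves = 0
    for k in increments:
        for i in range(k, len(a)):
            x = a[i]
            j = i - k
            while j >= 0 and x < a[j]:
                a[j + k] = a[j]      # one shift within the current chain
                j -= k
                moves += 1
            a[j + k] = x
    return moves


data = list(range(100, 0, -1))               # 100 items in reverse order
straight = shellsort(data[:], (1,))          # a lone 1-sort is straight insertion
shell = shellsort(data[:], (9, 5, 3, 1))     # the increments used above
print("straight insertion:", straight, "shifts")   # 4950 = 100*99/2
print("diminishing increments:", shell, "shifts")
```

On this input the version with diminishing increments needs fewer shifts than straight insertion, because the early passes move items over long distances in a single step.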

Analysis of Shellsort. The analysis of this algorithm poses some very difficult mathematical problems, many of which have not yet been solved. In particular, it is not known which choice of increments yields the best results. One surprising fact, however, is that they should not be multiples of each other. This will avoid the phenomenon evident from the example given above in which each sorting pass combines two chains that before had no interaction whatsoever. It is indeed desirable that interaction between various chains takes place as often as possible, and the following theorem holds: If a k-sorted sequence is i-sorted, then it remains k-sorted. Knuth [2.8] indicates evidence that a reasonable choice of increments is the sequence (written in reverse order)

1, 4, 13, 40, 121, ...

where $h_{k-1} = 3h_k + 1$, $h_t = 1$, and $t = \lfloor \log_3 n \rfloor - 1$. He also recommends the sequence

1, 3, 7, 15, 31, ...

where $h_{k-1} = 2h_k + 1$, $h_t = 1$, and $t = \lfloor \log_2 n \rfloor - 1$. For the latter choice, mathematical analysis yields an effort proportional to $n^{1.2}$ required for sorting n items with the Shellsort algorithm. Although this is a significant improvement over $n^2$, we will not expound further on this method, since even better algorithms are known.

2.3.2 Tree Sort

The method of sorting by straight selection is based on the repeated selection of the least key among n items, then among the remaining n-1 items, etc. Clearly, finding the least key among n items requires n-1 comparisons, finding it among n-1 items needs n-2 comparisons, etc., and the sum of the first n-1 integers is $(n^2 - n)/2$. So how can this selection sort possibly be improved? It can be improved only by retaining from each scan more information than just the identification of the single least item. For instance, with n/2 comparisons it is possible to determine the smaller key of each pair of items, with another n/4 comparisons the smaller of each pair of such smaller keys can be selected, and so on. With only n-1 comparisons, we can construct a selection tree as shown in Fig. 2.3 and identify the root as the desired least key [2.2].

Fig. 2.3. Repeated selection among two keys

The second step now consists of descending along the path marked by the least key and eliminating it by successively replacing it by either an empty hole at the bottom, or by the item at the alternative branch at intermediate nodes (see Figs. 2.4 and 2.5). Again, the item emerging at the root of the tree has the (now second) smallest key and can be eliminated. After n such selection steps, the tree is empty (i.e., full of holes), and the sorting process is terminated. It should be noted that each of the n selection steps requires only log n comparisons. Therefore, the total selection process requires only on the order of $n \log n$ elementary operations in addition to the n steps required by the construction of the tree. This is a very significant improvement over the straight methods requiring $n^2$ steps, and even over Shellsort that requires $n^{1.2}$ steps. Naturally, the task of bookkeeping has become more elaborate, and therefore the complexity of individual steps is greater in the tree sort method; after all, in order to retain the increased amount of information gained from the initial pass, some sort of tree structure has to be created. Our next task is to find methods of organizing this information efficiently.

Keys in the selection tree of Fig. 2.3, level by level from the root:

06
12 06
44 12 18 06
44 55 12 42 94 18 06 67

Fig. 2.4. Selecting the least key

Fig. 2.5. Refilling the holes
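The two steps, building the selection tree and then repeatedly removing the least key, may be sketched in Python as follows. This is a minimal illustration under stated assumptions: the name tree_sort, the use of None to represent a hole, and the storage of the tree in a list of 2n-1 entries (inner nodes first, leaves last) are choices made here, not taken from the text.

```python
def tree_sort(keys):
    """Return (sorted keys, comparisons used to build the selection tree)."""
    n = len(keys)
    if n == 0:
        return [], 0
    # Entries 0 .. n-2 are inner nodes, entries n-1 .. 2n-2 are the leaves.
    tree = [None] * (n - 1) + list(keys)
    comparisons = 0

    def smaller(x, y):
        nonlocal comparisons
        if x is None:            # a hole never wins and costs no comparison
            return y
        if y is None:
            return x
        comparisons += 1
        return y if y < x else x

    # Step 1: construct the tree bottom-up; the root receives the least key.
    for i in range(n - 2, -1, -1):
        tree[i] = smaller(tree[2 * i + 1], tree[2 * i + 2])
    build_comparisons = comparisons

    result = []
    for _ in range(n):
        least = tree[0]
        result.append(least)
        # Step 2: descend along the path marked by the least key ...
        i = 0
        while i < n - 1:
            i = 2 * i + 1 if tree[2 * i + 1] == least else 2 * i + 2
        tree[i] = None           # ... leave a hole at the bottom ...
        while i > 0:             # ... and refill the path from the other branches.
            i = (i - 1) // 2
            tree[i] = smaller(tree[2 * i + 1], tree[2 * i + 2])
    return result, build_comparisons


print(tree_sort([44, 55, 12, 42, 94, 18, 6, 67]))
```

For the eight keys of the standard example the construction uses 7 = n-1 comparisons, and each removal touches only the nodes on one path from a leaf to the root.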

Of course, it would seem particularly desirable to eliminate the need for the holes that in the end populate the entire tree and are the source of many unnecessary comparisons. Moreover, a way should be found to represent the tree of n items in n units of storage, instead of in 2n - 1 units as shown above. These goals are indeed achieved by a method called Heapsort by its inventor J. Williams [2.14]; it is plain that this method represents a drastic improvement over more conventional tree sorting approaches. A heap is defined as a sequence of keys $h_L, h_{L+1}, \ldots, h_R$ ($L \geq 0$) such that

$h_i < h_{2i+1}$ and $h_i < h_{2i+2}$ for i = L ... R/2-1

If a binary tree is represented as an array as shown in Fig. 2.6, then it follows that the sort trees in Figs. 2.7 and 2.8 are heaps, and in particular that the element $h_0$ of a heap is its least element:

$h_0 = \min(h_0, h_1, \ldots, h_{n-1})$
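The heap condition can be stated as a small Python check. The function name is_heap is hypothetical, and the comparison is relaxed to "not greater than" so that equal keys are admitted; that relaxation is an assumption of this sketch, since the definition above uses the strict relation.

```python
def is_heap(h, L=0, R=None):
    """Check the heap condition on h[L..R]: no child key is smaller than its parent."""
    if R is None:
        R = len(h) - 1
    for i in range(L, R + 1):
        for j in (2 * i + 1, 2 * i + 2):     # the two children of i
            if j <= R and h[j] < h[i]:
                return False
    return True


print(is_heap([6, 42, 12, 55, 94, 18, 44, 67]))        # True
print(is_heap([44, 55, 12, 42, 94, 18, 6, 67]))        # False
print(is_heap([44, 55, 12, 42, 94, 18, 6, 67], L=4))   # True: no index in 4..7 has a child within range
```

Whenever the check succeeds with L = 0, the first element is the least key of the whole sequence.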

Fig. 2.6. Array viewed as a binary tree

Fig. 2.7. Heap with 7 elements


Fig. 2.8. Key 44 sifting through the heap

Let us now assume that a heap with elements $h_{L+1} \ldots h_R$ is given for some values L and R, and that a new element x has to be added to form the extended heap $h_L \ldots h_R$. Take, for example, the initial heap $h_1 \ldots h_6$ shown in Fig. 2.7 and extend the heap to the left by an element $h_0 = 44$. A new heap is obtained by first putting x on top of the tree structure and then by letting it sift down along the path of the smaller comparands, which at the same time move up. In the given example the value 44 is first exchanged with 06, then with 12, thus forming the tree shown in Fig. 2.8. We now formulate this sifting algorithm as follows: i, j are the pair of indices denoting the items to be exchanged during each sift step. The reader is urged to verify that the proposed method of sifting actually preserves the invariants that define a heap.
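The sifting step of this example can be traced with a short Python sketch. It assumes that the heap of Fig. 2.7 is stored as 42, 06, 55, 94, 18, 12 in positions 1 to 6, with the children of position i at 2i+1 and 2i+2; the Python function sift(a, L, R) is an illustration, not the author's listing.

```python
def sift(a, L, R):
    """Let a[L] sink into the heap a[L+1..R], which keeps its least key on top."""
    i, j = L, 2 * L + 1
    x = a[i]
    if j < R and a[j + 1] < a[j]:
        j += 1                      # j now denotes the smaller child
    while j <= R and a[j] < x:
        a[i] = a[j]                 # the smaller comparand moves up
        i, j = j, 2 * j + 1
        if j < R and a[j + 1] < a[j]:
            j += 1
    a[i] = x                        # x settles in its proper position


heap = [44, 42, 6, 55, 94, 18, 12]  # 44 placed on top of the assumed heap
sift(heap, 0, 6)
print(heap)                         # [6, 42, 12, 55, 94, 18, 44]
```

The key 44 passes 06 and then 12 on its way down, which agrees with the two exchanges described in the text.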

A neat way to construct a heap in situ was suggested by R. W. Floyd. It uses the sifting procedure shown in Program 2.7. Given is an array $h_0 \ldots h_{n-1}$; clearly, the elements $h_m \ldots h_{n-1}$ (with m = n DIV 2) form a heap already, since no two indices i, j are such that j = 2i+1 or j = 2i+2. These elements form what may be considered as the bottom row of the associated binary tree (see Fig. 2.6) among which no ordering relationship is required. The heap is now extended to the left, whereby in each step a new element is included and properly positioned by a sift. This process is illustrated in Table 2.6 and yields the heap shown in Fig. 2.6.

PROCEDURE sift(L, R: INTEGER);
  VAR i, j: INTEGER; x: Item;
BEGIN
  i := L; j := 2*i+1; x := a[i];
  (* let j denote the smaller of the two children of i *)
  IF (j < R) & (a[j+1] < a[j]) THEN j := j+1 END ;
  WHILE (j <= R) & (a[j] < x) DO
    a[i] := a[j]; i := j; j := 2*j+1;
    IF (j < R) & (a[j+1] < a[j]) THEN j := j+1 END
  END ;
  a[i] := x
END sift

44 55 12 42 | 94 18 06 67
44 55 12 | 42 94 18 06 67
44 55 | 06 42 94 18 12 67
44 | 42 06 55 94 18 12 67
06 42 12 55 94 18 44 67

Table 2.6 Constructing a Heap.

Consequently, the process of generating a heap of n elements $h_0 \ldots h_{n-1}$ in situ is described as follows:

L := n DIV 2;
WHILE L > 0 DO DEC(L); sift(L, n-1) END
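The construction can be replayed on the standard example with the following Python sketch. The name build_heap and the recording of a snapshot after every sift are assumptions made for illustration; the sift function is a direct transcription of a sift that keeps the least key on top, with children at 2i+1 and 2i+2.

```python
def sift(a, L, R):
    """Let a[L] sink into the heap a[L+1..R] (least key on top)."""
    i, j = L, 2 * L + 1
    x = a[i]
    if j < R and a[j + 1] < a[j]:
        j += 1
    while j <= R and a[j] < x:
        a[i] = a[j]
        i, j = j, 2 * j + 1
        if j < R and a[j + 1] < a[j]:
            j += 1
    a[i] = x


def build_heap(a):
    """Turn a into a heap in situ; return (L, snapshot) after every sift."""
    snapshots = []
    L = len(a) // 2                  # a[L .. n-1] is a heap already
    while L > 0:
        L -= 1
        sift(a, L, len(a) - 1)       # extend the heap to the left by a[L]
        snapshots.append((L, a[:]))
    return snapshots


keys = [44, 55, 12, 42, 94, 18, 6, 67]
for L, snap in build_heap(keys):
    # The bar marks the left end of the part that already forms a heap.
    print(" ".join(f"{k:02d}" for k in snap[:L]), "|",
          " ".join(f"{k:02d}" for k in snap[L:]))
```

The final snapshot is 06 42 12 55 94 18 44 67, the last row of Table 2.6.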

In order to obtain not only a partial, but a full ordering among the elements, n sift steps have to follow, whereby after each step the next (least) item may be picked off the top of the heap. Once more, the question arises about where to store the emerging top elements and whether or not an in situ sort would be possible. Of course there is such a solution: In each step take the last component (say x) off the heap, store the top element of the heap in the now free location of x, and let x sift down into its proper position. The necessary n-1 steps are illustrated on the heap of Table 2.7. The process is described with the aid of the procedure sift as follows:

R := n-1;
WHILE R > 0 DO
  x := a[0]; a[0] := a[R]; a[R] := x;
  DEC(R); sift(0, R)
END

06 42 12 55 94 18 44 67
12 42 18 55 94 67 44 | 06
18 42 44 55 94 67 | 12 06
42 55 44 67 94 | 18 12 06
44 55 94 67 | 42 18 12 06
55 67 94 | 44 42 18 12 06
67 94 | 55 44 42 18 12 06
94 | 67 55 44 42 18 12 06

Table 2.7 Example of a Heapsort Process.
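The sorting phase of Table 2.7 can be reproduced with a Python sketch that starts from the finished heap. The name sort_phase is hypothetical; sift is again the variant that keeps the least key on top, so the result comes out in descending order.

```python
def sift(a, L, R):
    """Let a[L] sink into the heap a[L+1..R] (least key on top)."""
    i, j = L, 2 * L + 1
    x = a[i]
    if j < R and a[j + 1] < a[j]:
        j += 1
    while j <= R and a[j] < x:
        a[i] = a[j]
        i, j = j, 2 * j + 1
        if j < R and a[j + 1] < a[j]:
            j += 1
    a[i] = x


def sort_phase(a):
    """Repeatedly move the top of the heap a to the right end of the array."""
    R = len(a) - 1
    while R > 0:
        a[0], a[R] = a[R], a[0]     # the top element goes to the freed slot
        R -= 1
        sift(a, 0, R)               # the former last element sinks into place
    return a


heap = [6, 42, 12, 55, 94, 18, 44, 67]
print(sort_phase(heap))             # [94, 67, 55, 44, 42, 18, 12, 6]
```

Each pass through the loop produces one row of the table, with the least remaining key deposited immediately to the right of the bar.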

The example of Table 2.7 shows that the resulting order is actually inverted. This, however, can easily be remedied by changing the direction of the ordering relations in the sift procedure. This results in the following procedure HeapSort. (Note that sift should actually be declared local to HeapSort.)

PROCEDURE sift(L, R: INTEGER);
  VAR i, j: INTEGER; x: Item;
BEGIN
  i := L; j := 2*i+1; x := a[i];
  IF (j < R) & (a[j] < a[j+1]) THEN j := j+1 END ;
  WHILE (j <= R) & (x < a[j]) DO
    a[i] := a[j]; i := j; j := 2*j+1;
    IF (j < R) & (a[j] < a[j+1]) THEN j := j+1 END
  END ;
  a[i] := x
END sift;

PROCEDURE HeapSort;
  VAR L, R: INTEGER; x: Item;
BEGIN
  L := n DIV 2; R := n-1;
  WHILE L > 0 DO DEC(L); sift(L, R) END ;
  WHILE R > 0 DO
    x := a[0]; a[0] := a[R]; a[R] := x;
    DEC(R); sift(L, R)
  END
END HeapSort

Analysis of Heapsort. At first sight it is not evident that this method of sorting provides good results. After all, the large items are first sifted to the left before finally being deposited at the far right. Indeed, the procedure is not recommended for small numbers of items, such as shown in the example. However, for large n, Heapsort is very efficient, and the larger n is, the better it becomes, even compared to Shellsort. In the worst case, there are n/2 sift steps necessary, sifting items through log(n/2), log(n/2 + 1), ... , log(n-1) positions, where the logarithm (to the base 2) is truncated to the next lower integer. Subsequently, the sorting phase takes n-1 sifts, with at most log(n-1), log(n-2), ... , 1 moves. In addition, there are n-1 moves for stashing the sifted item away at the right. This argument shows that Heapsort takes of the order of n×log(n) steps even in the worst possible case. This excellent worst-case performance is one of the strongest qualities of Heapsort.

It is not at all clear in which case the worst (or the best) performance can be expected. But generally Heapsort seems to like initial sequences in which the items are more or less sorted in the inverse order, and therefore it displays an unnatural behavior. The heap creation phase requires zero moves if the inverse order is present. The average number of moves is approximately n/2 × log(n), and the deviations from this value are relatively small.
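These statements can be explored by counting moves in an instrumented version of the procedure. The Python sketch below is an illustration under stated assumptions: the counters, the decision to count only the moves made inside sift (not the exchanges at the top of the heap), and the random test data are choices made here, and no particular count is claimed.

```python
import math
import random


def heapsort(a):
    """Sort a in place in ascending order; return (build_moves, sort_moves)."""
    moves = 0

    def sift(L, R):
        nonlocal moves
        i, j = L, 2 * L + 1
        x = a[i]
        if j < R and a[j] < a[j + 1]:
            j += 1                       # j denotes the larger child
        while j <= R and x < a[j]:
            a[i] = a[j]                  # one move inside sift
            moves += 1
            i, j = j, 2 * j + 1
            if j < R and a[j] < a[j + 1]:
                j += 1
        a[i] = x

    n = len(a)
    L, R = n // 2, n - 1
    while L > 0:                         # heap creation phase
        L -= 1
        sift(L, R)
    build_moves = moves
    while R > 0:                         # sorting phase
        a[0], a[R] = a[R], a[0]
        R -= 1
        sift(0, R)
    return build_moves, moves - build_moves


random.seed(1)
n = 1000
data = [random.randrange(10**6) for _ in range(n)]
build, sorting = heapsort(data)
print("moves inside sift:", build + sorting)
print("n/2 * log2(n)    :", round(n / 2 * math.log2(n)))
print("build moves for inverse order:", heapsort(list(range(n, 0, -1)))[0])   # 0
```

For an input in inverse order the creation phase reports zero moves, since such a sequence already satisfies the heap condition with the largest key on top.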

In document Algorithms and Data Structures Niklaus Wirth pdf (Page 48-53)