10. Experiments
10.2. Running time experiments
We compare QuickMergesort and QuickHeapsort with Mergesort (our own implementation, which is identical to our implementation of QuickMergesort except that it uses an external buffer of length n/2), Wikisort [38] (an in-place stable Mergesort based on [30]), std::stable_sort (a bottom-up Mergesort, from GCC version 4.8.4), InSituMergesort [15] (which is essentially QuickMergesort where the median is always used as the pivot), and std::sort (median-of-three Introsort, from GCC version 4.8.4).
[Figure 14 plots. Left: std dev comps / n versus number of elements n (2^10 to 2^20). Right: std dev time / n [ns] versus number of elements n (2^18 to 2^20). Series: QuickMergesort (mo3, α = 1/4), QuickMergesort (mo3, α = 1/2), QuickMergesort (mo-√n), QuickMergesort (no sampling, α = 1/2), QuickMergesort (mo3, α = 1), std::sort (no SIS), Quicksort.]
Figure 14: Standard deviation of the number of comparisons (left) and the running times (right). For the number of comparisons, median-of-√n QuickMergesort and QuickMergesort without pivot sampling are out of range.
All time measurements were repeated with the same 100 deterministically chosen seeds; the displayed numbers are the averages of these 100 runs. Moreover, for each time measurement, at least 128MB of data were sorted: if the array size is smaller, then for this time measurement several arrays were sorted and the total elapsed time was measured. The results for sorting 32-bit integers are displayed in Figure 15, Figure 16, and Figure 17, which all contain the results of the same set of experiments; we use three different figures because of the large number of algorithms and the different scales on the y-axes.
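The measurement protocol described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the benchmark code used for the experiments: the helper names `arrays_per_measurement` and `time_per_nlgn`, the use of the seeds 0..99, `std::mt19937`, and `std::chrono::steady_clock` are all hypothetical choices. The result is normalized by n lg n, matching the y-axes of the running time figures.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: number of arrays of n 32-bit integers needed so that
// one time measurement sorts at least min_bytes of data (128MB by default).
std::size_t arrays_per_measurement(std::size_t n,
                                   std::size_t min_bytes = std::size_t{128} << 20) {
    const std::size_t bytes_per_array = n * sizeof(std::int32_t);
    return std::max<std::size_t>(1, (min_bytes + bytes_per_array - 1) / bytes_per_array);
}

// Hypothetical helper: average running time per n lg n in nanoseconds for
// sorting random permutations of n integers (n >= 2), averaged over num_seeds
// runs. The seeds 0, 1, ..., num_seeds - 1 are an assumption of this sketch.
template <typename Sorter>
double time_per_nlgn(std::size_t n, Sorter sorter,
                     std::size_t min_bytes = std::size_t{128} << 20,
                     unsigned num_seeds = 100) {
    const std::size_t arrays = arrays_per_measurement(n, min_bytes);
    double total_ns = 0.0;
    for (unsigned seed = 0; seed < num_seeds; ++seed) {
        std::mt19937 rng(seed);
        // Prepare all inputs first so that only the sorting itself is timed.
        std::vector<std::vector<std::int32_t>> inputs(arrays, std::vector<std::int32_t>(n));
        for (auto& v : inputs) {
            std::iota(v.begin(), v.end(), 0);
            std::shuffle(v.begin(), v.end(), rng);
        }
        const auto start = std::chrono::steady_clock::now();
        for (auto& v : inputs) sorter(v.begin(), v.end());
        const auto stop = std::chrono::steady_clock::now();
        total_ns += std::chrono::duration<double, std::nano>(stop - start).count();
    }
    const double ns_per_array =
        total_ns / (static_cast<double>(num_seeds) * static_cast<double>(arrays));
    return ns_per_array / (static_cast<double>(n) * std::log2(static_cast<double>(n)));
}
```

For example, `time_per_nlgn(1 << 20, [](auto first, auto last) { std::sort(first, last); })` would time std::sort on arrays of 2^20 elements; for this size a single measurement sorts 32 arrays.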
Figure 15 compares different QuickMergesort variants to Mergesort and std::sort. In particular, we compare median-of-3 QuickMergesort with different values of α. While a smaller α was beneficial for the number of comparisons, it turns out that the opposite is the case for the running time: the variant with α = 1 is the fastest. Notice, however, that the difference is smaller than 1%. The reason is presumably that partitioning is faster than merging: for large α, the problem sizes sorted by Mergesort are reduced and more “sorting work” is done by the partitioning. As expected, our Mergesort implementation is faster than all QuickMergesort variants, because it can use simple moves instead of swaps. Except for small n, std::sort beats QuickMergesort. However, notice that for n = 2^28 the difference between std::sort and QuickMergesort without sampling is only approximately 5% and can thus most likely be bridged with additional tuning efforts (e.g. block partitioning [13]).
In Figure 16 we compare the QuickMergesort variants with base cases to QuickHeapsort and std::sort. While QuickHeapsort still has an acceptable speed for small n, it becomes very slow when n grows. This is presumably due to the poor locality of memory accesses in Heapsort. The variants of QuickMergesort with growing size base cases are always quite slow. This could be improved by sorting smaller base cases with the respective algorithm, but this opposes our other aim to minimize the number of comparisons. Only the version with constant size MergeInsertion base cases reaches a speed comparable to
[Figure 15 plot: time per n lg n [ns] versus number of elements n (2^10 to 2^28). Series: Mergesort, QuickMergesort (mo-√n), QuickMergesort (mo-√n, MI up to 9 Elem), QuickMergesort (mo3, α = 1), QuickMergesort (mo3, α = 1/2), QuickMergesort (mo3, α = 1/4), QuickMergesort (no sampling, α = 1/2), std::sort.]
Figure 15: Running times of QuickMergesort variants, Mergesort, and std::sort when sorting random permutations of integers.
[Figure 16 plot: time per n lg n [ns] versus number of elements n (2^10 to 2^28). Series: QuickHeapsort (mo3), QuickMergesort (mo-√n, IS base), QuickMergesort (mo-√n, MI base), QuickMergesort (mo-√n, MI up to 9 Elem), std::sort.]
Figure 16: Running times of QuickMergesort variants with base cases and QuickHeapsort when sorting random permutations of integers.
[Figure 17 plot: time per n lg n [ns] versus number of elements n (2^10 to 2^28). Series: In-situ Mergesort, Mergesort, QuickMergesort (mo3, α = 1), std::sort, std::stable_sort, Wikisort.]
Figure 17: Running times when sorting random permutations of integers.
Figure 17 shows median-of-3 QuickMergesort together with the other algorithms listed above. As we see, QuickMergesort beats the other in-place Mergesort variants InSituMergesort and Wikisort by a fair margin. However, be aware that QuickMergesort (as well as InSituMergesort) neither provides a guarantee for the worst case nor is a stable algorithm.
Other data types. While all the previous running time measurements were for sorting 32-bit integers, in Figure 18 we also tested two other data types: (1) 32-bit integers with a special comparison function which computes the logarithm of the operands before every comparison, and (2) pointers to records of 40 bytes which are compared by their first 4 bytes. Thus, in both cases, comparisons are considerably more expensive than for standard integers. Each record is allocated on the heap with new; since we do this in increasing order and only shuffle the pointers, we expect the records to reside in memory in close-to-sorted order. For both data types, QuickMergesort with constant size MergeInsertion base cases is the fastest (except when sorting pointers for very large n). This is plausible since it combines the best of two worlds: on the one hand, it has an almost minimal number of comparisons; on the other hand, it does not induce the additional overhead for growing size base cases. Moreover, the bad behavior of the other QuickMergesort variants (“without” base cases) is probably because we sort base cases up to 42 elements with StraightInsertionsort, incurring many more comparisons (which we did not count in Section 10.1).
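The two data types could be set up along the following lines. This is a sketch under assumptions rather than the code used in the experiments: the names `LogLess`, `Record`, `RecordPtrLess`, `make_shuffled_records`, and `free_records` are hypothetical, the record layout (a 4-byte key followed by 36 bytes of payload) is one possible reading of “compared by the first 4 bytes”, and the logarithm comparator assumes strictly positive keys.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Data type (1): 32-bit integers whose comparison first computes the
// logarithm of both operands. Assumes strictly positive values.
struct LogLess {
    bool operator()(std::int32_t a, std::int32_t b) const {
        return std::log(static_cast<double>(a)) < std::log(static_cast<double>(b));
    }
};

// Data type (2): 40-byte records compared through pointers by their key,
// which occupies the first 4 bytes (assumed layout).
struct Record {
    std::int32_t key;
    char payload[36];
};
static_assert(sizeof(Record) == 40, "record is expected to occupy 40 bytes");

struct RecordPtrLess {
    bool operator()(const Record* a, const Record* b) const { return a->key < b->key; }
};

// Allocate n records with new in increasing key order, then shuffle only the
// pointers, so the records themselves stay in close-to-sorted memory order.
std::vector<Record*> make_shuffled_records(std::size_t n, unsigned seed) {
    std::vector<Record*> ptrs;
    ptrs.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        ptrs.push_back(new Record{static_cast<std::int32_t>(i), {}});
    std::mt19937 rng(seed);
    std::shuffle(ptrs.begin(), ptrs.end(), rng);
    return ptrs;
}

// Release the records allocated by make_shuffled_records.
void free_records(std::vector<Record*>& ptrs) {
    for (Record* p : ptrs) delete p;
    ptrs.clear();
}
```

Either comparator can be passed to any comparison-based sorting routine, e.g. `std::sort(ptrs.begin(), ptrs.end(), RecordPtrLess{})`. In both cases a comparison costs noticeably more than a plain integer comparison (a logarithm evaluation or a pointer dereference), which is why algorithms with fewer comparisons benefit.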