Optimal median selection under memory access constraints


2029 www.ijarcet.org



Abstract

We consider the problem of selecting the element of rank k from a list of n unsorted elements when the cost of accessing a memory address is a function of that address, in contrast to the widely used RAM model. We study the problem under various access cost functions, as well as the general case of a continuous monotonically increasing function, and obtain optimal bounds under these constraints.

Index Terms—median, selection, time complexity, memory access.

I. INTRODUCTION

The kth order statistic, or selection, problem is the following: given a list of n elements, find the element of rank k, that is, the element that has k-1 elements not greater than it. When k = 1 the element is called the minimum, when k = n the maximum, and when k = n/2 the median. This problem can be solved in O(n) time in the worst case [1]. This is optimal, since all the data must be read to decide correctly whether the chosen element is indeed the kth element. However, if the data has been preprocessed, better times are possible. For example, if the data is sorted before the request for the kth element, the required element can be found in O(1) time. One can also use a data structure such as the binary heap [2]. The heap can be constructed in O(n) time as a preprocessing step. However, each extractMin requires O(lg n) time, so finding the median in this way may take O(n lg n) time. A number of implementations of solutions to the selection problem exist. One that is efficient on average is QuickSelect [3], which operates like the QuickSort algorithm but recurs on only one of the two sublists instead of both. Unfortunately, the worst-case running time of QuickSelect is O(n^2), which exceeds even the worst-case cost of sorting all the data with a good sorting algorithm. The first algorithm to achieve O(n) time in the worst case for the selection problem is due to [1] and is referred to as the median of medians.
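To illustrate the QuickSelect idea described above, the following Python sketch keeps only the sublist that can contain the answer after each partition. It is a minimal sketch and not the implementation of [3]; the function name quickselect, the random pivot rule and the three-way split are assumptions made for this example.

```python
import random


def quickselect(items, k, rng=None):
    """Return the element of rank k (1-indexed) in items.

    Illustrative sketch: expected linear time on the RAM model,
    quadratic time in the worst case.
    """
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    rng = rng or random.Random()
    candidates = list(items)
    while True:
        pivot = rng.choice(candidates)
        lower = [v for v in candidates if v < pivot]
        equal = [v for v in candidates if v == pivot]
        if k <= len(lower):
            # The answer lies among the elements smaller than the pivot.
            candidates = lower
        elif k <= len(lower) + len(equal):
            return pivot
        else:
            # Discard the lower part and the pivot copies, adjust the rank.
            k -= len(lower) + len(equal)
            candidates = [v for v in candidates if v > pivot]


print(quickselect([7, 2, 9, 4, 4, 1, 8], 4))  # prints 4
```

Only one of the two sides is kept after each partition, which is the difference from QuickSort mentioned above.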

The complexity analysis of these algorithms is based on the RAM model of computation, which ignores the memory hierarchy by treating the memory as composed of registers only. Such a model is unrealistic: an algorithm that is provably optimal on the RAM model is not necessarily optimal on real computers, because of the role played by the memory hierarchy. The cost incurred by the memory hierarchy is usually offset by bringing data from higher levels of memory to lower levels of memory in blocks. Moreover, effort is usually made to exploit both temporal and spatial locality, which need not be catered for on the Random Access Machine model.

In this paper, we consider a model of computation that takes account of the memory hierarchy and block transfer, while retaining the simplicity and ease of the RAM model.

II. MODEL OF COMPUTATION

The Hierarchical Memory Machine (HMM) of [4] is similar to the Random Access Machine (RAM) in that it consists of an infinite number of registers R1, R2, R3, ..., each with room for one integer. The operations on the HMM are the same as those on the RAM; however, the cost of memory access is different. While memory access on the RAM costs a single unit for every location accessed, on the HMM the cost of accessing a location is a function f of that location's address. The function f is assumed to be monotonically increasing. To cater for the block transfer of data from higher memory levels to lower memory levels, as happens in practice, the cost of moving a block of data of length l from locations [x-l, x] to locations [y-l, y] is f(max(x, y)) + l. In effect, the cost of moving a block from one place in memory to another is determined by the cost of accessing the farthest memory location plus a cost of one for every other member of the block to be copied.
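The cost rules just described can be sketched in Python as follows. The sketch assumes that the block-move cost has the form f(max(x, y)) + l, which is our reading of the verbal description (the cost of the farthest location plus one unit for every other member of the block). The names block_move_cost and elementwise_cost, and the linear access cost used in the usage lines, are illustrative assumptions.

```python
def block_move_cost(f, x, y, length):
    """Assumed cost of copying locations [x-length, x] to [y-length, y].

    One access to the farthest location, plus one unit for each of the
    other `length` members of the block.
    """
    return f(max(x, y)) + length


def elementwise_cost(f, x, length):
    """Cost of touching locations x-length .. x one at a time, for comparison."""
    return sum(f(address) for address in range(x - length, x + 1))


def linear(address):
    # Hypothetical access cost function f(x) = x, used only for this example.
    return address


print(block_move_cost(linear, 100, 10, 9))  # prints 109
print(elementwise_cost(linear, 100, 9))     # prints 955
```

With this cost function, bringing the ten locations 91..100 down as one block is much cheaper than accessing them individually, which is the effect block transfer is meant to capture.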

Definition [4]: HMM

The memory hierarchy can be modified so that it appears as layers of memory, where the access cost to any location within a layer is fixed according to the access cost function, as follows:

1. When the access cost function, , is a sub-linear monotonically increasing function: all memory locations from to , for some suitably chosen integer constant , have access cost . Similarly, all memory locations from to have access cost , and so on. In this modification it is possible to access any block of size in the layer whose locations are from to and bring it to lower memory locations at a constant cost per element. Similarly, one can define the access cost of higher memory locations, such as locations to , to be .

2. When the access cost function is a linear or super-linear monotonically increasing function: all memory locations from to have access cost . Similarly, all memory locations from to have access cost , and so on. Similarly, one can define the access cost of higher memory locations to to be .

3. We assume that only the first locations of the layer whose access cost is hold data; the other locations can be used as a “buffer” area.

It follows from the definition of the multi-layer memory (MLM) machine that it acts exactly like the HMM machine, except that each layer of memory has a uniform access cost for all the addresses within that layer. Moreover, the access cost to an address on the MLM machine is not asymptotically less than that of the corresponding address on the HMM machine, and thus a lower bound on the HMM machine can serve as a lower bound on the corresponding MLM machine. Similarly, an upper bound on the MLM machine can serve as an upper bound on the HMM machine. However, if we modify the access cost for addresses on the various layers in the MLM machine from to then the MLM machine can simulate the corresponding HMM machine, and a lower bound on this modified MLM machine can serve as a lower bound for an implementation on the corresponding HMM machine. We shall see an application of this idea later when we seek lower bounds on reading n memory locations.

Definition. We say that n items were read in time t if each of these items was moved to location 1 during the time t.

Lemma 1. The number of layers is a function that is asymptotically equal to the number of times we compose the function on values with itself before we get a value less than a constant of our choice that is greater than or equal to 1, and it grows asymptotically not more than for linear or sub-linear access cost functions.

Proof.

Given an input of size in the first locations, then according to the definition of the memory layers, one can move all the data to locations starting from in one block transfer. Now consider moving the last item of the data until it reaches the first location of memory but without

For a sub-linear function, is asymptotically not more than . It follows, therefore, that the total access cost is:

The exponent of the denominator for each term is the number of layers we have moved the data item to so far. It follows, therefore, that the number of layers is ≤ .
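Lemma 1 describes the number of layers as the number of times a function is composed with itself before the value falls below a chosen constant. The following Python sketch simply counts such iterations. Which function is iterated for each access cost is not recoverable from the text here, so the three functions used in the usage lines (base-2 logarithm, square root, halving) are assumptions chosen only to illustrate how differently such counts can grow; the name count_layers is hypothetical.

```python
import math


def count_layers(shrink, n, threshold=2):
    """Count how often `shrink` must be applied to n to reach `threshold` or less.

    `shrink` is an assumed stand-in for the function that Lemma 1 composes
    with itself; it must strictly decrease every value above the threshold.
    """
    layers = 0
    value = n
    while value > threshold:
        next_value = shrink(value)
        if next_value >= value:
            raise ValueError("shrink must strictly decrease its argument")
        value = next_value
        layers += 1
    return layers


n = 65536
print(count_layers(math.log2, n))          # prints 3
print(count_layers(math.sqrt, n))          # prints 4
print(count_layers(lambda v: v / 2, n))    # prints 15
```

For n = 65536 the three counts are 3, 4 and 15, showing how slowly the count grows under repeated logarithms or square roots compared with repeated halving.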

Corollary 2. The number of layers for access costs , and is asymptotically not more than , , for , and , for respectively, for some constant c > 0.

Proof.

For access cost the number of layers is . This follows from the fact that the access cost to the uppermost layer is , and when a block of that size is copied into the next layer the cost at that layer to each element is , and so on. Thus the cost of transfer is at most per transfer of data from one layer to the next. The transfer of data to lower memory layers stops when the size of a block reaches a constant > 0. It follows that the number of layers is which follows from the definition of , and the recurrence

and for the maximum

number of different layers a data item is moved.

For access cost , , we can move a block of size from the n input to the next lower layer at a cost of , which is O(1) per item moved. Similarly, we can move a block of size from the second memory layer to the immediately lower layer at a cost of O(1) per element moved, and so on until the cost of moving the block is itself constant. This can be described by the following recurrence for the maximum number of different layers:

and

In this case, after i block moves must have become a constant.

Now let , a constant >1. Thus

must be a constant.

Therefore, must be a

constant. It follows, therefore, that the number of layers

is .

For access cost the number of layers is i, where is is constant, from which it follows that

Corollary 3. n items can be read in time , , and , on the multi-layer memory machine with access cost and n, for constant 0 respectively.

Corollary 4. n operations can be performed on the following structures in time , , and , on the multi-layer memory machine with access cost and n, for constant 0 respectively:


b) Enqueue/Dequeue queue operations when starting from an empty queue.

While Corollaries 3 and 4 provide upper time bounds on reading and on stack and queue operations, we need a lower bound on these operations to prove the asymptotic optimality of our solution. To this end we make a minor modification to the multi-layer memory model introduced earlier, as follows:

Definition. A specially blocked multi-layer memory machine is a multi-layer memory machine with the exception that:

a) Access to a location in a memory layer in the specially blocked machine costs if the same location in the corresponding multi-layer memory machine costs

b) Data movement in the specially blocked machine can be made between immediately adjacent layers only.

Next we show that any sequence of data movements that requires time T on the multi-layer memory machine can be simulated in time on the specially blocked MLM machine.

Proposition 5. Given a function that is a sub-linear monotonically increasing continuous function, then , where , for constant , and any constant .

Proof. Follows from the fact that , since is a sub-linear function.

Proposition 6. The size of any layer i on the MLM machine is not less than the sum of the sizes of all layers of memory from location 1 up to but not including layer i.

Proof.

Let us first consider the case when the access cost of the MLM machine is linear or super-linear. In this case any layer with the access cost contains memory locations from to . The size of this layer is which is the sum of all memory locations from 1 to .

The argument made above, together with Proposition 5, takes care of the case when the access cost of the MLM machine is sub-linear.

Lemma 7. A sequence of data movements in an algorithm on the MLM machine with access cost that runs in time T can be simulated in time on a specially blocked MLM machine.

Proof.

Let us consider the block data movement from [x-l..x] to [y-l..y] on the MLM machine and, without loss of generality, assume that . Thus the cost of data transfer on the MLM machine is , where is the access cost of the layer in which location x resides.

Now consider the corresponding specially blocked MLM machine. The access cost to the layer (say ) in which location x resides is by definition. The cost of transfer would be: for moving the block to the immediately adjacent layer . Some of the data may remain in the same layer , and others may be moved to the buffer area of layer . Again, some of these data may be moved to the data area in layer , while the others may be moved to the buffer area of layer , and so on. The size of data moved in layer is less than and the rest were moved to layer . The size of data moved in layer is no more than and the rest would be moved to layer . It follows that data transferred within layer were moved only once, while data transferred within layer were moved twice: once from level to level and once within level . Similarly, data transferred within layer were moved at most 3 times: from layer to layer to layer and then within layer . Thus the overall cost of the data transfer is:

.

Corollary 8. A lower bound on the data movement for an algorithm implemented on a specially blocked MLM machine is also a lower bound on the MLM machine and the HMM machine with block transfer.

Proof. The corollary follows from Lemma 7 and the fact that the access cost to a layer on the specially blocked MLM machine is asymptotically less than the access cost on the corresponding MLM machine. Moreover, the MLM machine was defined so that the access cost to any location is less than or equal to that of the same location on the HMM with block transfer.

Corollary 9. Reading n data elements located in the first n locations of an HMM with block transfer, an MLM machine or a specially blocked MLM machine can be performed in time.

Proof.

The number of layers on the specially blocked MLM machine is , and for reading the elements in the top layer where the access cost is , and the size of the layer is , for constant , the number of layers each data element will pass through until it reaches location 1 is . Thus the overall cost is .

The data can be read by recursively transferring data items from the layer whose access cost is to the layer below it (i.e. the one with access cost ). Thus by definition of the number of layers this is equal to .

Corollary 10. n items can be read in time , , and by an MLM machine with access cost , , constant ,


III. THE SELECTION ALGORITHM

SELECT(i, n) {
  Step 1  Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
  Step 2  Recursively SELECT the median x of the group medians to be the pivot.
  Step 3  Partition around the pivot x. Let k = rank(x).
  Step 4  if i = k then return x
          elseif i < k then recursively SELECT the ith smallest element in the lower part
          else recursively SELECT the (i-k)th smallest element in the upper part
}
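The four steps above can be sketched as a runnable Python function. This is a minimal RAM-style sketch under stated assumptions: it uses Python lists and does not model memory access cost, it handles duplicate keys with a three-way partition (an addition not present in the pseudocode), and the name select is illustrative.

```python
def select(items, i):
    """Return the i-th smallest element (1-indexed) of items."""
    if not 1 <= i <= len(items):
        raise ValueError("i must be between 1 and len(items)")
    if len(items) <= 5:
        # Small inputs are solved directly ("by rote").
        return sorted(items)[i - 1]

    # Step 1: split into groups of 5 and take the median of each group.
    groups = [items[j:j + 5] for j in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 2: recursively select the median of the group medians as pivot.
    pivot = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around the pivot (three-way, to cope with duplicates).
    lower = [v for v in items if v < pivot]
    equal = [v for v in items if v == pivot]
    upper = [v for v in items if v > pivot]

    # Step 4: return the pivot or recur on the side that holds rank i.
    if i <= len(lower):
        return select(lower, i)
    if i <= len(lower) + len(equal):
        return pivot
    return select(upper, i - len(lower) - len(equal))


data = [(37 * j) % 101 for j in range(101)]  # a permutation of 0..100
print(select(data, 51))  # prints 50
```

Because the pivot always lands in the middle part, both recursive calls are made on strictly smaller lists, so the recursion terminates even when all keys are equal.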

Analysis of the algorithm

Access cost

.

The algorithm follows the steps of the technique of [1] for finding the kth element. In step 1, all the data is read and the median of each group of 5 elements is found. Reading all the data costs time for some constant . Once a group is in the lowest memory locations, the access cost is constant, and therefore finding the median of a constant number of elements takes constant time. A copy of the median can be added to a queue at a cost of time per element. Thus the whole step can be performed in time.

Step 2 recursively applies the select method to the list of medians obtained during step 1. If the worst-case running time of the method is , then this step requires at most in the worst case.

Step 3 partitions the list of n elements around the pivot found in step 2. Since reading or writing n elements can be performed in time in the worst case, it follows that this step can be performed in time in the worst case. Step 4 recurs on either the lower or the upper partition obtained in step 3. The maximum size of such a partition can be obtained as follows:

Each of the medians found in step 1 is greater than or equal to at least 3 of the 5 elements in its group. The median of these medians, found in step 2, is by definition greater than or equal to half of the medians found in step 1. Therefore, ignoring rounding, it must be greater than or equal to at least (1/2)(3/5)n = 3n/10 of the items of the original list; by the symmetric argument it is also less than or equal to at least 3n/10 of them. Thus the largest set to be searched recursively in step 4 is of size at most about 7n/10. Therefore, it follows that the running time of the select method is at most

Given that reading n items requires , it follows that this bound is optimal for the given memory access cost.
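The partition-size argument can also be checked empirically. The Python sketch below picks the pivot as in steps 1 and 2 and reports the size of the larger side of the partition; for distinct keys, the standard analysis of the technique of [1] bounds this by 7n/10 plus a small additive constant for rounding (6 is used here). For brevity the median of the group medians is found by sorting, which is a simplification of step 2 and not part of the algorithm, and the function names are hypothetical.

```python
import random


def median_of_medians_pivot(items):
    """Pivot as in steps 1 and 2: the (lower) median of the group medians."""
    groups = [items[j:j + 5] for j in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Simplification: sort the medians instead of calling SELECT recursively.
    return sorted(medians)[(len(medians) - 1) // 2]


def larger_side(items):
    """Size of the larger of the two partitions around the pivot."""
    pivot = median_of_medians_pivot(items)
    lower = sum(1 for v in items if v < pivot)
    upper = sum(1 for v in items if v > pivot)
    return max(lower, upper)


rng = random.Random(0)
for n in (50, 500, 5000):
    data = rng.sample(range(10 * n), n)  # n distinct keys
    assert larger_side(data) <= 7 * n / 10 + 6
```

The check passes for every input with distinct keys, not only for the random samples drawn here, since it restates the counting argument above with rounding taken into account.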

Access cost:

layers, and copying a block from higher layer to a lower layer can be performed at a cost of per element, it follows that this step can be performed in at most . Step 2 would recursively apply select method to the list of medians obtained during step 1. If the worst case running time of the method is , then this step would require at most

in the worst case.

Step 3 can be performed in time in the worst case, following the same argument made for the case when the access cost is .

Step 4, as stated earlier, recurs on either the lower or the upper partition obtained in step 3. The argument that the largest of these partitions is of size not more than about 7n/10 remains valid irrespective of the access cost. Therefore, it follows that the running time of the select method is at most

Reading n items on an HMM machine with the given access cost requires . It follows that this bound is optimal for the given memory access cost.

General case for the access cost

.

Now we can generalize the argument made in obtaining the asymptotic time complexity of the selection algorithm as follows:

Step 1 can be performed in time by corollary 9. Step 2 can be performed in time where the running time for the select is .

The partitioning of the list in step 3 can be viewed as a process of reading or push/pop operations and can be performed in time by corollary 9.

Step 4 can be performed in the order of operations.

Thus the time recurrence is

, for some

constant and .

Solving this recurrence we get . Thus we have:

Theorem 11. Given a set of numbers, the element of rank k can be found in in the worst case on an HMM machine with access cost

IV. CONCLUSION


REFERENCES

[1] Blum, M.; Floyd, R. W.; Pratt, V. R.; Rivest, R. L.; Tarjan, R. E. (August 1973). "Time bounds for selection". Journal of Computer and System Sciences 7 (4): 448–461.

[2] Atkinson, M.D., J.-R. Sack, N. Santoro, and T. Strothotte (1 October 1986). "Min-max heaps and generalized priority queues." Programming techniques and Data structures. Comm. ACM, 29(10): 996–1000.

[3] Hoare, C. A. R. (1961). "Algorithm 65: Find". Comm. ACM 4 (7): 321–322.

[4] Aggarwal, A.; Chandra, A. K.; Snir, M. (1987). "Hierarchical memory with block transfer". 28th Annual Symposium on Foundations of Computer Science, pp. 204–216.

[5] A. Aggarwal, B. Alpern, A. K. Chandra and M. Snir, "A Model for Hierarchical Memory", Proc. of the 19th Annual ACM Symposium on the Theory of Computing, pp. 305-314, 1987.

[6] J. W. Hong and H. T. Kung, "I/O Complexity: The Red-Blue Pebble Game", Proc. of the 13th Ann. ACM Symposium on Theory of Computing, pp. 326-333, 1981

[7] A. Schönhage, A Nonlinear Lower Bound for Random Access Machines Under Logarithmic Cost, 1984.

[8] Alok Aggarwal and Jeffrey S. Vitter, The input/output complexity of sorting and related problems, Comm. ACM 31(9): 1116-1127, 1988.

[9] C. A. R. Hoare, Quicksort, Computer Journal, 5:10–15, 1962.

[10] D. E. Knuth, Selected Papers on Analysis of Algorithms, CSLI Publications, 2000.