Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks

(1)

Energy-Efficient Data Gathering with Tour

Length-Constrained Mobile Elements in Wireless

Sensor Networks

Khaled Almi’ani Anastasios Viglas Lavy Libman

School of Information Technologies, University of Sydney, NSW 2006, Australia

{kalmiani,tasos,llibman}@it.usyd.edu.au

Abstract—Several studies in recent years have considered the use of mobile elements for data gathering in wireless sensor networks, so as to reduce the need for multi-hop forwarding among the sensor nodes and thereby prolong the network lifetime. Since, typically, practical constraints preclude a mobile element from visiting all nodes in the sensor network, the solution must involve a combination of a mobile element visiting a subset of the nodes (cache points), while other nodes communicate their data to the cache points wirelessly. This leads to the optimization problem of minimizing the communication distance of the sensor nodes, while keeping the tour length of the mobile element below a given constraint. Several algorithms in existing literature have tackled this problem by separating the construction of the mobile element tour from the computation of the forwarding trees to the cache points. In this paper, we propose a new algorithm that alternates between these phases and iteratively improves the outcome of each phase, based on the result of the other. We compare the resulting performance of our algorithm with that of previous work, and show that it closes a considerable portion of the gap from the theoretical optimal solution.

Keywords-component; Wireless sensor networks; data gathering; mobile sensing.

I. INTRODUCTION

Many typical applications of wireless sensor networks (WSNs) involve the collection of data obtained by sensor nodes at a pre-defined sink. This is normally achieved by wireless transmission of the data, possibly over several hops (especially in applications where the sensors are deployed in a hostile or hard-to-access environment). In many cases, the wireless communication results in a major energy expenditure that limits the operational lifetime of the network. Even worse, in multi-hop scenarios, the depletion of the sensors’ energy sources (such as batteries) is non-uniform, as nodes that are close to the sink are required to forward all the data traffic and are likely to be the first to run out of energy. Once these sensors fail, other nodes can no longer reach the sink, and the network ceases to operate even though ample energy remains at nodes further away from the sink. This common problem occurs largely independently of the communication protocols used in the network.

In general, the energy requirements due to the wireless transmission of data can be significantly reduced by the use of

Mobile Elements (MEs) [1, 2, 3]. An ME acts as “data transporter” that roams in the network and collects data from sensors via short-range communication, the energy cost of which is considerably lower. Thus, the network lifetime increases by avoiding multi-hop communication. The primary disadvantage of this approach is the increased latency of the data collection. Indeed, the typical speed of MEs can be about 0.1-2 m/s [4, 5], resulting in substantial travelling time for the ME and, correspondingly, delay in gathering the sensors’ data. In practice, often the ME tour length is bounded by a pre-determined time deadline, either due to timeliness constraints on the sensor data or a limit on the amount of energy available to the ME itself. A possible solution is to employ more than one ME; however, this solution is often impractical due to the high cost of MEs, and may not in fact help at all if some sensors are beyond reach due to ME battery limitations in the first place.

To address this problem, several proposals (most notably [3]) presented a hybrid approach, which combines multi-hop forwarding with the use of mobile elements. In this approach, some nodes act as caching points (CPs) for data from other sensors. When an ME reaches a CP, it polls all the data stored in the CP, including data originating in other sensors. The ME is thus able to gather data from the entire network without having to visit all the sensors physically. This hybrid approach gives rise to the following generic optimization problem: determine the set of CPs such that the ME tour length is below a given constraint, that minimizes (in some sense) the communication needs of the sensor nodes. The precise nature of the minimization objective can vary according to the application requirements, e.g. it can be based on the number of hops or actual (Euclidean) distance, and target either the total sum or the maximum thereof. In this work, following [3], we focus on the problem of minimizing the total number of forwarding hops from all sensors to their respective nearest cache points; in other words, we seek a solution where the forwarding requirements of all nodes are balanced as much as possible (subject to the constraint on the ME tour length). This is arguably the variation with the most practical importance, as each node’s hop count to the nearest CP is closely correlated to the energy consumption it imposes on its peers, and, ultimately, to the network lifetime.

(2)

This paper makes the following contributions. First, we formally define the Periodic Rendezvous Data Collection (PRDC) optimization problem and present an Integer Linear Programming (ILP) formulation for it. We then propose our new Cluster-Based (CB) algorithm, where the main idea is grouping the network into a number of balanced-size clusters, constructing the ME tour to involve only one CP from each cluster, and improving the solution iteratively by alternating between the ME tour building phase and the CP forwarding tree computation phase. We evaluate the algorithm experimentally on a wide range of practical scenarios, showing that it consistently outperforms the algorithm of [3] and is not very far from the optimum in small instances.

The rest of the paper is organised as follows. Section II provides a formal definition of the problem, and Section III presents the related work in this research area. Section IV presents an ILP formulation of the problem, which allows it to be solved efficiently for small instances. Section V presents the CB algorithm, which is then evaluated in Section VI. Finally, Section VII concludes the paper.

II. PROBLEM DEFINITION

An instance of the PRDC problem consists of an undirected graph = 〈, 〉, where is the set of vertices representing the locations of the sensors in the network, and is the set of edges that represents the communication network topology, i.e. , ∈ iff , are within each other’s communication range. A distinguished vertex ∈ denotes the location of the sink. In addition, the complete graph = 〈, ′〉 , where _{= × , represents the possible movements of the ME.} Each edge , ∈ ′ has a length , which represents the time needed by the ME to travel between sensor and . The data of all sensors must be uploaded to the ME periodically at least once in L time units, where L is determined from the application requirements and the sensors’ buffer size. In other words, we assume the ME conducts its tour periodically, with L being a constraint on the maximum tour length. In this paper, for simplicity, we assume that the ME travels at constant speed, and that, therefore, the travelling times between sensors () correspond directly to their respective Euclidean distances; however, this assumption is not essential to the solution algorithm and can be easily alleviated if necessary.

A solution to the PRDC problem consists of a tour (i.e. a path in ) that starts and ends in , where the length of the tour is bounded by L, such that the sum of hop-distances between every sensor and the tour is minimized. More formally, the problem can be stated as follows:

Minimize ∑_∈min_∈d( , ′)

s.t.: d( , ) is the hop-distance (i.e. number of edges on the shortest path) between , in ;

= 〈 ′= , ′ , ⋯ , ′||# , ′||= 〉; ∑||# ) ($,$%&) ≤ (.

III. RELATED WORK

There have been many proposals in recent literature that studied using mobile element(s) to reduce the energy consumption in a sensor network. Based on the categorization given in [3], we review three major approaches.

In a typical flat-topology network, the nodes around the sink are likely to be the first to die due to having to forward the data traffic from all other sensors. Based on this observation, several proposals [6-8] have investigated using mobile sink(s) to reduce the energy consumption in the network. By varying the path to the sink(s), the residual energy in the nodes becomes more evenly balanced throughout the network, leading to a higher network lifetime. However, in order to be effective, this technique requires the sink location (and routes thereto) to change regularly, which places a potentially prohibitive overhead on the nodes due to the frequent recomputation of the routes.

In the second approach, MEs travel across the network and gather each sensor’s data via single-hop, short-range communication. Essentially, computing efficient ME tours reduces to solving the Traveling Salesman Problem (TSP), possibly with additional constraints arising from buffer limitations in the sensor nodes. In [9-13], several heuristics have been proposed to that effect, so that each sensor is visited before its buffer is full. Although this approach substantially reduces the energy consumption by avoiding multi-hop communications, it incurs a high delay when the network area is large, because of the requirement that the MEs physically visit all sensor nodes.

Finally, the third approach is a hybrid that combines multi-hop forwarding with data collection by mobile elements. Our work falls into this category. Some earlier works, e.g. [14-16], assumed the mobile route to be predetermined and were mainly concerned with the timing of transmissions, aiming to minimize the need for in-network caching by timing the transmissions to coincide with the passing of the tour. In [17, 18], the minimum-energy Rendezvous Planning Problem (RPP) is introduced. This problem deals with determining the set of rendezvous points constructing the ME tour. In RPP, the goal is to minimize the Euclidean distance between the source nodes and the tour. This is in contrast with the PRDC problem we consider in this paper, where we aim to minimize the hop distance, as the Euclidean distance is not a reliable indicator of the true communication cost between nodes, e.g. due to the presence of physical obstacles. Furthermore, the approach of [17, 18] requires that a sensor is able to aggregate packets from multiple sources into a single transmission, which thereby limits the extent that it can be used in practice. From an algorithmic perspective, the solution in [17, 18] is obtained by first computing the maximal tour under the constraint, and subsequently building the forwarding trees around that tour. Such separation unnecessarily reduces the solution search space, whereas the algorithm we present in this paper repeats the tour building and forwarding tree computation phases in a way that allows the solution to be iteratively improved.

The PRDC problem shares some similarities with the Vehicle Routing Problem (VRP) [19]. Given a fleet of vehicles assigned to a depot, VRP deals with determining the fleet

(3)

routes to deliver goods from a depot to customers while minimizing the vehicles’ total travel cost. Among the VRP variations, the Vehicle Routing Problem with Time Windows (VRPTW) [20] is the closest to PRDC. In VRPTW, each customer must be visited by exactly one vehicle and within a pre-defined time interval. PRDC also shares some similarity with the Deadline Travelling Salesman Problem (Deadline-TSP) [21], which can be described as seeking the minimum tour length for a salesman to visit a set of cities, where each city must be visited before a pre-determined time deadline. In particular, the special case when all cites have the same deadline reduces Deadline-TSP to the well known Orienteering problem [22]; PRDC can thus be considered as a generalization of the Orienteering problem.

IV. INTEGER LINEAR PROGRAM FORMULATION Let * be a 0-1 integer variable for each , ∈ ′ such that *= 1 if the edge , ∈ ′ is included in the ME tour, and 0 otherwise. Also, let , be a 0-1 integer variable for , ∈ , such that ,= 1 if node is included in the tour and is responsible for storing the data of node , which is not included in the tour (i.e. is the root of the tree that contains ). Furthermore, let -. be a positive integer variable for each vertex / ∈ , showing the order in which the vertices are visited in the resulting tour. The Integer Linear Program for the PRDC problem can thus be stated as follows.

Objective function: 0123 4 * , ∙ + 78∙ 4 ,∙ , d ,

The first term of the objective function is the travelling time of the ME (travel cost), and the second term is the number of hops between the nodes not included in the tour and the tour (assignment cost). The coefficient 7₈ must be set to any value greater than max_,;<. Multiplying the second term by 7₈ ensures that the assignment cost is strictly prioritized over the travel cost. Constraints: ∑ *∈ − ∑ *∈ = 0 ∀@ ∈ (1) ∑ *∈ A = 1 (2) ∑ *∈ A = 1 (3) ∑,∈*∙ ≤ ( (4) ,+ ∑ *8∈ 8 ≤ 1 ∀1, @ ∈ (5) ∑ *∈ + ∑ ,∈ ≥ 1 ∀1 ∈ (6) -− -+2 ∙ * ≤ 2 − 1 ∀1 ∈ , ∀@ ∈ \D E (7) Constraint (1) ensures that the load at each node is balanced. Constraints (2)-(3) ensure that the tour starts and

ends at . Constraint (4) bounds the travelling time of the ME tour. Constraint (5) ensures that only the nodes included in the mobile tour are responsible for other nodes’ data. Together with Constraint (5), Constraint (6) ensures that each node either belongs to the mobile element tour or is assigned to a node on the tour, but not both. Constraint (7) prevents solutions that allow degenerate sub-tours, which are tours that are not connected to the origin ( ). This constraint is based on the Miller-Tucker-Zemlin [27] sub-tour elimination constraints, which have been widely used in the literature for ILP formulations of both the TSP and VRP problems.

V. ALGORITHMIC SOLUTION

Our algorithm is based on the insight that, in seeking to maximize the network lifetime, the problem of finding the most efficient ME tour and that of finding the best forwarding trees from sensors to the tour are not independent; the solution of each has a profound impact on the outcome of the other. Therefore, the two problems should ideally be solved jointly. Unfortunately, since a joint solution is intractable for all but the smallest instance sizes, we propose a Cluster-Based (CB) algorithm that iteratively obtains the mobile element tour and the routing trees, improving the solution in each iteration based on the outcome of the previous one. To achieve this, the algorithm attempts to partition the network into energy-aware clusters, such that in each cluster a single CP is eventually selected.

The CB algorithm consists of two phases: tour-building and final-tour-improvement. During the tour-building phase, the algorithm attempts to find a tour that visits as many clusters as possible, where clusters are obtained by partitioning the network into groups of approximately the same number of nodes. Such construction attempts to balance the forwarding traffic within the clusters. Once this tour is obtained, the final-tour-improvement phase proceeds to improve the quality of this tour.

Figure 1 outlines the structure of the CB algorithm. Here, lines 1-11 correspond to the tour-building phase; the second phase (final-tour-improvement) is captured in line 12 and is described in detail in subsection V.C. Essentially, the tour-building phase follows a process similar to a binary-search mechanism. In each round, the process chooses a number of clusters to be established (F) from the middle of the range of possible values so far, which initially is from 1 to the total number of sensors in the network. If the choice of F results in a tour that satisfies the length constraint L, then the lower half of the range is eliminated and the process is repeated; conversely, if the length constraint is not satisfied, the upper range is eliminated instead. The search terminates once the maximum number of clusters is found that is able to satisfy the tour length constraint.

We now elaborate on the details of the calculation performed in line 4, i.e. constructing a tour for a given number of clusters G. This construction involves two steps, namely: partitioning the network into clusters first, followed by finding a minimum-length tour that traverses one node in each cluster. These steps are described in further detail in the following subsections.

(4)

Input: (network topology graph), ′ (ME movements graph), ( (tour length constraint)

Output: H12IJ_LM/ (ME tour) 1 J ← 0, / ← 2

2 F ← O(/ − J) 2⁄ R 3 While / − J > 1

4 do LM/ ← T/1JU_M/(, ′,F, () 5 If tour travelling time ≤ L

6 Then J ← F 7 JIVL_ IJ1U_LM/ ← LM/ 8 If F = 2 9 Then break 10 Else / ← F 11 F ← O(/ + J) 2⁄ R 12 H12IJ_LM/ ← LM/_10WM X0X2L( IJ1U_LM/, FJ/VLXV) A. Clustering Step

The clustering step attempts to find a given number of clusters such that the sum of hop-distances among nodes belonging to the same cluster is minimized. Roughly speaking, in a network of homogeneous node density, this leads to a balanced number of nodes in the clusters. To that end, the clustering step works by bounding the distance between a node and its cluster centre node (F), defined as the node that has the minimum total hop-distance to all other nodes in the cluster.

Our clustering step is inspired by the well-known k-means clustering algorithm developed by MacQueen [28]. Figure 2 outlines the process of this step. At first, Fnodes are selected at random as the initial clusters centre nodes. Subsequently, every other node is assigned to its nearest cluster, based on the hop distance to the cluster’s centre node. Once all nodes have been thus assigned, the centre node for each cluster (i.e., the node that has the minimum total hop-distance to all other nodes in the cluster) is determined, and the process is repeated from the beginning based on the new F cluster centres. The clustering step terminates when the identity of the clusters’ centre nodes does not change between two consecutive iterations.

Input: (topology graph), F (number of clusters to be established) Output: a set of clusters

1 Randomly choose F nodes as initial cluster centres 2 do

3 for all nodes in

4 assign each node to the nearest cluster centre (in terms of hop-distance)

5 Recalculate the centre nodes of the resulting clusters 6 Until set of centre nodes is unchanged from previous iteration

B. Tour-finding Step

Once the clusters are determined in the clustering step, the next task is to find a tour that traverses precisely one node (the CP) from each cluster. Ideally, from a perspective of communication energy consumption, the CP of each cluster should be the cluster centre node, which has the minimum hop distance to all other nodes in the cluster. Unfortunately, this will usually lead to a tour with a much longer travelling time than many other possible tours. Rather, the objective of the tour-finding step is to determine the tour with the shortest overall length that visits exactly one node from each cluster. Indeed, even though such a tour will probably end up with sub-optimal CPs for the given set of clusters, it allows the overall algorithm to ultimately achieve a larger number of clusters while satisfying the tour length constraint. Arguably, the number of clusters has a greater impact on the overall quality of the solution than the choice of a particular CP within a cluster.

Based on the description above, it is clear that the problem of determining such a tour is essentially an instance of One-of-a-Set TSP [23]. Its solution consists of two parts, namely: nodes identification, during which the identity of the CPs is determined, and tour construction which finds the optimal tour among the chosen points. In our algorithm, the nodes identification simply iterates over the clusters and chooses the closest node to the set of CPs so far. Once the nodes are identified, the optimal tour connecting them is found, e.g., using Christofides algorithm [24]. Figure 3 outlines the process performed in this step.

Input: (topology graph), FJ/VLXV Output: YZV, LM/

1 YZV ← 2 LI[[XU ← ∅

3 While LI[[XU ≠ FJ/VLXV

4 find the node in DFJ/VLXVE\DLI[[XUE that is closest to any of the nodes in YZV

5 Add the cluster of the selected node to LI[[XU 6 Add the selected node to YZV

7 LM/ ← LM/_FM2VL/FL1M2(YZV)

C. Final Tour Improvement Phase

Finally, we describe a method to improve the quality of the tour in the first phase. At this stage, the network is partitioned to the maximum possible number of clusters such that the resulting tour does not violate the length constraint L. Nevertheless, the tour itself is the minimum-length one for this set of clusters, and typically, the travelling time of the tour at this stage is strictly less than L. This gap raises the possibility of amending the tour by choosing alternative CPs (closer to the respective cluster centres), as long as the tour length constraint remains satisfied, so as to reduce the total hop distance between CPs and other nodes in their respective clusters.

Figure 3. The tour-finding step

Figure 2. The clustering step

(5)

Figure 4 outlines the final tour improvement phase. For every cluster, unless the corresponding CP already happens to be the cluster’s centre node, an alternative CP is considered (denoted CP') which is the next node on the shortest path from CP to the cluster’s centre in . The algorithm computes how much the ME tour length would increase if CP were replaced by CP' in this cluster only, and commits the change to the alternative CP' for the cluster where the length increase would be minimal. The process is then repeated, until it is no longer possible to change the CP without violating the tour length constraint L.

Input: (tour), YZV (clusters’ caching points), ( (tour length constraint), clusters

Output: (tour to be assigned to the ME)

1 J ← find the closest cluster to the tour (the distance between the cluster centre node and the tour)

2 ′_←

3 While LI XJJ12[ L10X HM ′< L 4 do ← ′

5 If all clusters’ centre nodes are already the CPs, Then exit

6 For every cluster l where CP(J) is not the centre node, denote CP′(J) to be the next node on the shortest path from CP(J) to the cluster’s centre 7 Set J∗ to be the cluster where swapping CP(J) for

CP′(J) in T causes the minimal increase in the tour length of T

8 In ′ swap CP(J∗) for CP′(J∗) , update edges accordingly

9 Return (the final tour)

D. Complexity analysis

We consider the complexity of the CB algorithm. Each clustering procedure requires a(F ∙ 2 ∙ 1) steps, where 1 is the number of iterations required for the clustering algorithm to converge to a final solution. The number of such iterations depends on the number of clusters, initial choice of the cluster centers, and the node distribution. To clarify the relationship between these parameters, we ran the algorithm for varying numbers of nodes, clusters, and selections of initial centre nodes. Figures 5 (a) and (b) demonstrate the results. It can be seen that the average number of iterations is mainly influenced by the number of clusters. Finally, the tour finding step requires a(Fb) to construct a tour of F nodes.

VI. SIMULATION EVALUATION AND RESULTS To validate the performance of the CB algorithm, we have conducted an extensive set of experiments using the J-sim simulator for WSN [25]. Due to space constraints, we only present a sample of our results here, based on a few representative scenarios that are described henceforth. Unless mentioned otherwise, the network area is 160,000 0d. The radio parameters are set according to the MICAz data sheet [26], namely: the radio bandwidth is 250 eTWV , the

transmission power is 21 0f, the receiving power is 15 0f, and the initial battery power is 20 Joules. Each sensor node sends one packet per ME tour, where the packet has a fixed size of 100 bytes. Each experiment is an average of 10 different random topologies. We are interested in evaluating (1) network lifetime, (2) number of CPs , and (3) total size of routing trees.

The deployment of sensors is typically application-dependent and the evaluation results will depend on the deployment characteristics. In this evaluation, we consider the following scenarios:

• Uniform deployment: in this scenario, we assume that the nodes are uniformly deployed in a square area of 400 × 400 0d_.

• Multi-level: in this scenario, we divide the network into a 5 × 5 grid of squares, where each square is 80 × 80 0d_. Then, we randomly choose 10 of the squares, and in each one of those we fix the node density to be 5 times the density in the remaining squares.

To benchmark our algorithm, we compare it to the Rendezvous Design for Variable Tracks (RD-VT) algorithm presented in [18]. The RD-VT algorithm works by constructing the MST of the sensor network and traversing it in preorder, until a tour of the nodes covered so far can no longer be found without violating the length constraint. To ensure the fairness of the comparison, we use the Christofides algorithm [24], i.e. the same algorithm we use in our tour-finding step (see subsection V.B), to find a tour for a given set of nodes in every iteration of RD-VT as well. Eventually, each sensor is connected to the nearest point of the tour via the shortest path. A. Network lifetime

In this evaluation, we consider the 1% network lifetime metric, which is defined as the time until 1% of nodes run out of energy. For simplicity, we only account for the radio receiving and transmitting energy. Figures 6 (a) and (b) show

Figure 5(a). Average number of iterations vs number of nodes, number of clusters is 5 (experiment repeated 10 times).

Figure 5(b). Average number of iterations vs number of clusters, number of nodes is 1500 (experiment repeated 10 times).

6 6.5 7 7.5 500 700 900 1100 1300 1500 A V G n u m b e r o f it e ra ti o n s Number of nodes 0 5 10 15 2 3 4 5 6 A V G n u m b e r o f it e ra ti o n s Number of clusters

(6)

the results for both deployment scenarios as a function of the number of nodes (equivalently, network density), for a value of

L that is set to 0.15 ∙ V ∙ _k, where V = 1 0/V is the speed of the

mobile element, and _k is the length of the minimum spanning tree (MST) that connects all the nodes in each instance. It is clearly seen that CB consistently outperforms RD-VT due to the superior tour construction approach. In RD-VT, the solution is constructed by traversing the MST, without considering the density of nodes or their distribution. On the other hand, in CB, the solution is constructed by dividing the network into similarly sized clusters and building the tour to visit one node from each cluster; this results in a roughly balanced communication load of the clusters, and thereby increases the network lifetime. The improvement is more notable in the multi-level deployment scenario, where the density variations between different regions of the network are catered for in the CB approach but do not make a difference for RD-VT.

We proceed to show how the network lifetime depends on the value of the tour length constraint (. Figures 7(a) and (b) show the results for both deployment scenarios with 500 nodes. Here, the horizontal axis shows the value of L normalized as a fraction of V ∙ _k . We observe that, in the multi-level scenario, there is little difference in the performance of CB and RD-VT if the value of L is 0.06V ∙ _k or less. This effect is explained by the variation of node density. Indeed, in the multi-level scenario, there are some areas (specifically, 10 out of the 25 squares) where the node density is 5 times greater than the rest of the network. From the perspective of network lifetime, each one of these dense areas should have at least one CP of its own, to avoid having bottleneck nodes that funnel all the traffic from that area to a CP elsewhere. In turn, this requires the tour length constraint to be sufficiently generous. For L that is below 0.06V ∙ k, regardless of the method used to construct the tour, the number of CPs that can be included in the tour is insufficient to cover all the dense areas, resulting in bottleneck nodes near the CPs that deplete their energy quickly.

B. Number of CPs

To investigate the impact of the network density on the number of CPs each algorithm obtains, we ran the experiments while varying the number of nodes. Figures 8(a) and (b) show the results for the uniformly deployment and the multi-level scenarios, respectively. It can be seen that, for the same tour length, RD-VT includes more CPs than CB. As one would expect, this is due to RD-VT mechanism of traversing the tree. However, as previously shown in the discussion on network lifetime, the number of CPs in itself has no impact on the resulting performance; it is the location of the CPs that is the main factor that influences the network lifetime.

C. Size of routing trees

Figures 9(a) and (b) show the impact of the number of nodes on the total size of the routing trees (i.e. the sum of hop-distances from all sensor nodes to their respective CPs) that each algorithm obtains. It shows that although RD-VT obtained more CPs than CB, the latter nevertheless achieves smaller routing trees overall. This is because the tree traversal process used by RD-VT results in a set of CPs that includes many mutual neighbors, which is not useful when those nodes are used as roots for separate trees; in other words, many of the

resulting trees are very small, while only a limited number of CPs end up being roots of large trees (i.e. serve as cache points for a large number of sensors). On the other hand, the CPs selected by the CB algorithm are distributed more evenly across the network, resulting in trees whose size is better balanced and therefore lowering the total sum of hop-distances of all nodes.

We conclude the comparison between CB and RD-VT by considering a variation of the uniform deployment scenario where the area is rectangular rather than square, and obtain the performance of both algorithms as a function of the network aspect ratio (i.e. the ratio between the width and the height of the area), while keeping the area of the network constant. The

Figure 6(a): Network lifetime vs number of nodes for the uniform deployment scenario.

Figure 6(b): Network lifetime vs number of nodes for the multi-level deployment scenario.

Figure 7(a): Network lifetime vs L value for the uniform deployment scenario.

Figure 7(b): Network lifetime vs L value for the multi-level deployment scenario. 0 500 1000 1500 2000 2500 3000 500 600 700 800 900 1000 N e tw o rk li fe ft im e (m ) Number of nodes CB RD-VT 0 1000 2000 3000 4000 500 600 700 800 900 1000 N e tw o rk li fe ti m e ( m ) Number of nodes CB RD-VT 0 500 1000 1500 2000 2500 3000 0.02 0.04 0.06 0.08 0.1 N e tw o rk li fe ti m e ( m ) Normalized L value CB RD-VT 0 500 1000 1500 2000 2500 0.02 0.04 0.06 0.08 0.1 N e tw o rk li fe ti m e ( m ) Normalized L value CB RD-VT

(7)

result of this comparison is shown in Figure 10. It demonstrates that while reducing the aspect ratio (i.e. moving towards longer and thinner rectangles) adversely impacts the outcome of both algorithms, the CB algorithm is more sensitive to the network shape, to the point that RD-VT outperforms CB for very long rectangles. To explain this effect, consider the relationship between the aspect ratio and the network topology. As the aspect ratio is reduced, the topology (i.e. the graph ) becomes more “chain-like”, i.e. the average degree of nodes goes down as well. Consequently, the tree traversal method of RD-VT becomes more efficient, whereas on the other hand, clustering loses its advantage when the clusters effectively end up in a linear sequence rather than being distributed more evenly in the area.

D. Comparisons with the optimal solution

In this subsection, we compare the performance of the RD-VT and CB algorithms with the optimal solution, obtained by solving the Integer Linear Program (ILP) presented in Section IV using a commercial version of CPLEX and AMPL. Obviously, since the running time required to obtain the optimal solution increases exponentially with the number of sensors, comparison with the optimal result is only possible small-scale network instances. Therefore, in this subsection we limit the number of nodes to 100 or less. Due to the small number of nodes, the multi-level scenario suffers from a high volatility or “noise” that makes it difficult to obtain statistically reliable results. We therefore focus exclusively on the uniform deployment scenario for the rest of this section, with an area size reduced to 100 × 100 0d to compensate for the smaller number of nodes.

Figure 8(a): Number of caching points vs number of nodes for the uniform deployment scenario.

Figure 8(b): Number of caching points vs number of nodes for the multi-level deployment scenario.

Figure 9(a): Size of routing trees vs number of nodes for the uniform deployment scenario.

Figure 9(b): Size of routing trees vs number of nodes for the multi-level deployment scenario

0 50 100 150 200 500 600 700 800 900 1000 N u m b e r o f C P s Number of nodes RD-VT CB 0 50 100 150 500 600 700 800 900 1000 n u m b e r o f C P s Number of nodes RD-VT CB 0 1000 2000 3000 4000 500 600 700 800 900 1000 R o u ti n g t re e s si ze Number of nodes RD-VT CB 0 1000 2000 3000 4000 500 600 700 800 900 1000 R o u ti n g t re e s si ze Number of nodes CB RD-VT

Figure 10: Size of routing trees vs aspect ratio for the uniform deployment scenario.

Figure 11(a): Size of routing trees vs number of nodes for the uniform deployment scenario.

Figure 11(b): Network lifetime vs number of nodes for the uniform deployment scenario.

0 500 1000 1500 2000 2500 1 0.777 0.6 0.45 0.33 0.23 0.14 R o u ti n g t re e s si ze Aspect ratio CB RD-VT 0 20 40 60 80 100 50 60 70 80 90 100 R o u ti n g t re e z si ze Number of nodes Optimal CB RD-VT 200 400 600 800 1000 1200 1400 1600 50 60 70 80 90 100 N e tw o rk li fe ti m e ( s) Number of nodes Optimal CB RD-VT

(8)

Figures 11 (a) and (b) show how well the CB and RD-VT algorithms perform compared to the optimal solution in term of routing tree size and network lifetime, respectively. In Figure 11(a), we note that the gap between the outcome of CB and the optimal solution tends to remain nearly constant, which suggests that the number of nodes (i.e. the problem size) has little impact on the performance ratio between the two. Figure 11(b) shows that, for the entire range of problem sizes considered, the CB algorithm attains a network lifetime that is roughly mid-way between the other two, i.e. it consistently closes about one half of the gap between the result achieved by RD-VT and the ideal optimum.

VII. CONCLUSION

In this paper, we presented a novel cluster-based algorithm to find efficient tours for mobile elements used for data collection in WSN. Through a wide range of simulation scenarios, we showed that the proposed algorithm increases the resulting network lifetime significantly compared to the previously best known solution, and reduces roughly one half of the gap between the best previous solution and the theoretical optimum.

To further maximize the network lifetime, the heuristic used in our algorithm can be extended to take the nodes’ residual energy into account during the establishment of the tours. In addition, to cope with unexpected delays in the network, the heuristics can be modified in a straightforward manner to allow the mobile element to pause and wait at nodes along the tour, as long as the overall deadline constraint can be met. An interesting open problem that remains is how one can deal with networks where nodes have heterogeneous communication capabilities.

REFERENCES

[1] J. Butle, "Robotics and microelectronics: mobile robots as gateways into wireless sensor networks," Technology@Intel Magazine, May 2003. [2] A. LaMarca, W. Brunette, D. Koizumi, M. Lease, S. Sigurdsson,

K. Sikorski, D. Fox and G. Borriello, "Making sensor networks practical with robots," Intel Research, Technical Report IRS-TR-02-004, 2002. [3] E. Ekici, Y. Gu, and D. Bozdag, "Mobility-based communication in

wireless sensor networks," IEEE Communications Magazine, vol. 44, pp. 56-62, Jul 2006.

[4] K. Dantu, M. Rahimi, H. Shah, S. Babel, A. Dhariwal, and G. S. Sukhatme, "Robomote: enabling mobility in sensor networks," in Proceedings of the Fourth International Symposium on Information Processing in Sensor Network (IPSN), pp. 404-409, Apr 2005.

[5] W. Kaiser, G. Pottie, M. Srivastava, G. Sukhatme, J. Villasenor, and D. Estrin, "Networked infomechanical systems (NIMS) for ambient intelligence," Center for Embedded Networked Sensing UCLA, Technical Report 31, 2003.

[6] S. Gandham, M. Dawande, R. Prakash, and S. Venkatesan, "Energy efficient schemes for wireless sensor networks with multiple mobile base stations," in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM), pp. 377-381, Dec 2003.

[7] Z. M. Wang, E. Melachrinoudis, and C. Petrioli, "Exploiting sink mobility for maximizing sensor networks lifetime," in Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS), Jan 2005.

[8] K. Almi'ani, J. Taheri, and A. Viglas, "A data caching approach for sensor applications," in Proceedings of the 10th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), Dec 2009.

[9] A. Somasundara, A. Ramamoorthy, and M. B. Srivastava, "Mobile element scheduling with dynamic deadlines," IEEE Transactions on Mobile Computing, vol. 6, pp. 395-410, Apr 2007.

[10] A. Somasundara, A. Ramamoorthy, and M. B. Srivastava, "Mobile element scheduling for efficient data collection in wireless sensor networks with dynamic deadlines," in Proceedings of the 25th IEEE International Real-Time Systems Symposium (RTSS), pp. 296-305, Dec 2005.

[11] Y. Gu, D. Bozdag, E. Ekici, F. Ozguner, and C.-G. Lee, "Partitioning based mobile element scheduling in wireless sensor networks," in Proceedings of the Second Annual IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks (SECON), pp. 386-395, Sep 2005.

[12] K. Almi'ani, S. Selvadurai, and A. Viglas, "Periodic mobile multi-gateway scheduling," in Proceedings of the Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), pp. 195-202, Dec 2008.

[13] M. Ma and Y. Yang, "Data gathering in wireless sensor networks with mobile collectors," in Proceedings of IEEE International Symposium on Parallel and Distributed Processing (IPDPS), pp. 1-9, Apr 2008. [14] A. Somasundara, A. Kansal, D. Jea, D. Estrin, and M. Srivastava,

"Controllably mobile infrastructure for low energy embedded networks," IEEE Transactions on Mobile Computing, vol. 5, pp. 958-973, Aug 2006. [15] R. Shah, S. Roy, S. Jain, and W. Brunette, "Data mules: modeling a three-tier architecture for sparse sensor networks," in Proceedings of IEEE Workshop on Sensor Network Protocols and Applications (SNPA), pp. 30-41, May 2003.

[16] D. Jea, A. Somasundara, and M. Srivastava, "Multiple controlled mobile elements (data mules) for data collection in sensor networks," in Proceedings of the 5th IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS), pp. 30-41, Apr 2005.

[17] G. Xing, T. Wang, Z. Xie, and W. Jia, "Rendezvous planning in wireless sensor networks with mobile elements," IEEE Transactions on Mobile Computing, vol. 7, pp. 1430-1443, Aug 2008.

[18] G. Xing, T. Wang, W. Jia, and M. Li, "Rendezvous design algorithms for wireless sensor networks with a mobile base station," in Proceedings of the 9th ACM international symposium on Mobile ad hoc networking and computing (MobiHoc), pp. 231-240, Jun 2008.

[19] Toth and E. Vigo, "The vehicle routing problem," Society for Industrial and Applied Mathematics (SIAM), 2001.

[20] M. Solomon, "Algorithms for the vehicle routing and scheduling problems with time window constraints," Operations Research, vol. 35, pp. 254-265, 1987.

[21] N. Bansal, A. Blum, S. Chawla, and A. Meyerson, "Approximation algorithms for deadline-TSP and vehicle routing with time-windows," in Proceedings of the 36th ACM Symposium on Theory of Computing (STOC), pp. 621-636, Jun 2004.

[22] B. Golden, L. Levy, and R. Vohra., "The orienteering problem," Naval Research Logistics, vol. 34, pp. 307-318, 1987.

[23] J. Mitchell, "Geometric shortest paths and network optimization," in Handbook of Computational Geometry, Elsevier Science Publishers B.V. North-Holland, pp. 633-701, 1998.

[24] N. Christofides, "Worst-case analysis of a new heuristic for the traveling salesman problem," Graduate School of Industrial Administration, Carnegie-Mellon University, Technical Report 388, 1976.

[25] A. Sobeih, W.-P. Chen, J. Hou, L-C. Kung, N. Li, H. Lim, H.-Y. Tyan, and H. Zhang, "J-Sim: a simulation and emulation environment for wireless sensor networks," IEEE Wireless Communications Magazine, vol. 13, pp. 104-119, Aug 2006.

[26] J. Hill and D. Culler, "Mica: a wireless platform for deeply embedded networks," in IEEE Micro, vol. 22, pp. 12-24, Dec 2002.

[27] C.E. Miller, A.W. Tucker, and R.A. Zemlin, "Integer programming formulations and traveling salesman problems," Journal of the ACM (JACM), vol. 7, 1960, pp. 326-329.

[28] J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, 1966.