ISSN: 2348 - 8549
http://www.internationaljournalssrg.org
Page 48
Abstract
:
Wireless Sensor Networks (WSNs) have matured as the technology most suited for monitoring of the environment and data collection. But harnessing the powers of a WSN presents innumerable challenges. Issues related to networking (routing), storage, and transport of the huge data flooding the Network needs to be planned in advance before the data reaches the sink and one has to minimise creation of hotspot on the sensors around the sink. Salient aspects are analysed in this paper.
Keywords
:
DCS - Data Centric Storage QoS - Quality of Service
WSN – Wireless Sensor Networks GHT - Geographic Hash Table
GPSR – Greedy Perimeter Stateless Routing Sensornet – Network of Sensor Equipment/ Devices. Q-NiGHT - System based on geographical routing QoS.
I. INTRODUCTION
A sensornet is a distributed sensing network consisting of a large number of small devices, each
having some computational, storage and
communication capability. Such networks best operated in an unattended mode to record detailed information about their surroundings. They are thus well suited to applications such as location tracking and habitat monitoring, etc. Communication between nodes requires the expenditure of energy, a scarce commodity for most sensornets. Thus, making effective use of sensornet data needs scalable, self-organizing as well as energy-efficient data dissemination algorithms. Making effective use of the vast amounts of data gathered by large-scale sensor networks (sensornets) needs scalable, self-organizing, as well as energy-efficient data dissemination algorithms.
The requests for data are normally broadcast to all the sensors, since the sink can hardly know in advance the identity of the sensors that eventually manages to sends the relevant data that sink needs on regular basis.
Communication abstractions that are data-centric
become the best model, as the data are “named” and the communication abstractions refer to these names rather than to the
node network addresses. The directed diffusion data- centric routing scheme is most promising and energy-efficient data dissemination method for sensornet environments. The three canonical methods described in the previous section certainly do not exhaust the design space; combinations of them yield hybrid methods specialized for particular needs. For example : Data-Centric Storage for location guidance. For certain applications, one might combine LS and DCS by storing detailed event information locally and using DCS to inform a querier of an event’s location so that subsequent queries can be directed to the proper local store.
Data-Centric Storage for context. In the course of processing local data, nodes may find it useful to have some context about global parameters. For instance, data-centric storage could give nodes access to the number of other animals sighted when a node is trying to determine if a migration is underway using simple gadgets as shown below consisting of hardware available off the shelf :
Directed queries (geographically targeted). The canonical methods are designed for cases where one does not a priori know the event location. If one already knows the location of the event through out-of-band techniques, then one can direct queries to that location using geographic routing methods . This LS variant stores data locally, and queries are sent (at cost O(√n)) to the relevant locations to retrieve the desired data. It avoids the cost of flooding in the canonical LS approach, and the cost of storing each event in the canonical DCS approach. In-network storage and in particular Data Centric Storage (DCS) stores data on a set of sensors that depend on a meta-datum describing the data. DCS is a paradigm that is promising for Data Management in WSNs, since it addresses the problem of scalability (DCS employs unicast communications to manage WSNs), allows in-network data pre-processing and can mitigate hot-spots insurgence.
The use of DCS for Data Management in middleware for WSNs. Since WSNs can feature different paradigms for data routing (geographical routing and more
Data Centric Storage of Data Management in
Wireless Sensor Network
Dr Col (Retd) Narendra Kumar
#1#
ISSN: 2348 - 8549
http://www.internationaljournalssrg.org
Page 49
traditional tree routing), this thesis introduces twodifferent DCS protocols for these two kinds of WNSs. Q-NiGHT involves geographical routing and manages the quantity of resources assigned to the storage of different meta-data. It implements a load balance for storage of the data over the sensors in the WSN.
II. THE ROLE OF WSN
The design of WSN applications is challenging since they have to deal with their own business logic, and with the issues that naturally arise when WSNs are taken into account, such as network formation, data transport and data management, security and energy saving. This last requirement arises since nodes are typically battery-powered, and the energy budget of the nodes is limited, hence energy saving techniques are to be implemented to avoid energy starvation and the subsequent death of nodes. Dealing with these issues can be done either explicitly, thus adding complexity to the applications, or implicitly by means of a middleware, that is a software layer that abstracts from common issues of the WSNs.
III. FACTS OF WSN
In WSN, the particular routing protocol that is used becomes a middleware parameter and the application developer can switch between different routing mechanisms at network creation time, instead at the application design time. Distilling a given high-level behavior from a set of sensors is a challenging problem, especially when explicitly dealing with WSN issues, since the complexity can be overwhelming.
Complex scenarios can arise in WSN applications, and a number of theoretical and practical problems blossom when the level of the abstraction of the middleware is deeply pinpointed. Some of them are analyzed here. For example, if an application wants to deliver data to more than one sink, or if the sink is inaccessible for some time, it may be useful to cache data into the network before sending them to the user application which is essentially a scenario of Data Centric Storage (DCS). The paradigm states that data are stored inside the network, in a subset of the sensors that is dependent on a meta-datum that describes the data, and the sinks query the WSN to retrieve the data they are interested in. DCS in general is considered as a useful mechanism to decouple data production from data collection.
IV. APPLICATIONS OF WIRELESS SENSOR
NETWORKS
A large number of low-capacity sensors must be orchestrated into a common behaviour to give support to complex WSN applications. Despite the complexity that must be faced during the development, several kinds of applications are supported by WSNs, since they are flexible enough to be applied in diverse areas, such as: Critical Infrastructure Protection, Tracking. According to the European Commission, Critical Infrastructures consist of “those physical and information technology facilities, networks, services
and assets which, if disrupted or destroyed, would have a serious impact on the health, safety, security or economic well-being of citizens or the effective functioning of governments in the Member States. Home systems and health monitoring. A “smart house” is usually made up of several intelligent devices that can control home appliances or “feel” their surroundings, and it can feature devices that intermix control on “intelligent” furniture and that monitor user position and/or state. This kind of infrastructure can lead to applications supporting rehabilitation after illness, such as providing assistive technology for cognitive rehabilitation and independent aging. Their technique incorporates a node’s residual energy into the cost metric that is computed when determining what route to send packets on. The applications for such networks include inventory management, product quality monitoring, factory process monitoring, disaster
area monitoring, biometrics monitoring, and
surveillance.
V. PERFECT DATA CENTRIC STORAGE
A middleware based on DCS represent a novelty into the WSN scene that could expand current WSN capabilities. On the other hand, the most studied implementations of DCS have limitations:
The set of sensors that store the data depends on the topology of the WSN, and it is not selectable by the WSN application developer. The storage primitive of DCS assumes uniform distribution of the sensors, thus it is inefficient when the sensors have a different distribution. Dependability is ensured by means of pure replication, in the sense that all the sensors that store a given meta-datum, have to store a copy of all related data. Such issues have been addressed by introducing new mechanisms for the Data Management layer of WSN middleware. Dependability mechanisms used for WSNs, introduce a new way of exploiting erasure coding that can enrich the Data Management layer without increasing the complexity of DCS. A novel Data Management layer mechanism combines gossip routing and erasure coding to disseminate data in WSN efficiently. DCS systems can implement effective in-network data storage and retrieval, since they require only unicast communications. However existing approaches disregard issues related to load balancing of the sensors, and QoS. The work on Q-NiGHT addressed these issues in WSNs that use a geographical routing protocol. In particular, Q-NiGHT uses a rejection hashing technique that produces pairs of coordinates by taking into account the actual sensor density, and then relies on a novel dispersal technique that enforces QoS and provides load balance. Simulation results show that Q-NiGHT significantly balances the storage load on the sensors and it adapts to different sensors’ distributions. The Data Centric Storage systems can be combined with memory-efficient erasure codes. The use of erasure codes is however not immediate, since it requires the sensors to perform the encoding of the data before the storage
ISSN: 2348 - 8549
http://www.internationaljournalssrg.org
Page 50
(encoding that is not necessary in traditional DCS sincethey exploit pure replication). In fact it is necessary that each sensor be assigned with a coding parameter, and this assignment is critical from the point of view of correct data coding and decoding. For this reason a probabilistic model was proposed, and it allowed the estimation of correct coding and decoding probability; this model was adapted to estimate this probability in three cases of study: local storage, DCS-GHT, and Q-NiGHT.
From the analytical and simulative results, it emerged that correct coding/ decoding can be achieved with high probability with the three systems for
different network configurations. Further, the
application of erasure coding to spatial gossiping, showed that by using spatial gossip, it is possible to quickly establish an in-network erasure coding of the generated data. The gossip algorithm has a near linear message cost. Using erasure codes, generated data can be recovered from a subset of the nodes with high probability. The application of this storage scheme is useful to combat node failures and to help with efficient data collections with data mules. A variation of the basic gossip algorithm has data mules that can effectively reconstruct data as soon as it collects the code words. Futuristic DCS systems may take into account sensor distribution, as in the case of Q-NiGHT, or the on-the-fly estimation of the distribution of the sensors, can be exploited to better fashion or tailor the generalized hash function that associates meta-data to locations in the sensing area and to deal with topology changes due to sensors’ failures and/or sensors’ mobility
VI. THE DCS PROBLEM
We have argued for the utility of a DCS service for sensornets. Now we will define the data-centric storage problem in more detail: the storage abstraction DCS provides, the design goals a robust, scalable DCS
system must meet, and our geographic hashing
approach to DCS architecture that meets these design goals.
Storage abstraction
Like the many distributed hash table systems before it .DCS provides a (key, value)-based associative memory. Events are named with keys. Both the storage of an event and its retrieval are performed using these keys. DCS is naming-agnostic; any naming scheme that distinguishes events that users of the sensornet wish to identify distinctly suffices. The two operations DCS supports are:
Put(k, v) stores v (the observed data) according to the key k, the name of the data.
Get(k) retrieves whatever value is stored associated with key k.
Design criteria
The challenge in any design for a DCS system is to meet scalability and robustness criteria despite the
system’s fundamentally distributed nature. Sensornets represent a particularly challenging environment for a distributed storage system:
Node failures may be routine; exhaustion of battery power and permanent or transient failure in a harsh environment are problems in any realistic sensornet deployment.
Topology changes will be more frequent than on traditional wired networks. Node failures, node mobility, and received signal strength variations in real radio deployments each independently cause neighbor relationships among nodes to change over time.
System scale in nodes may be very great. Sensor nodes may be deployed extremely densely (consider the limit case of smart dust [1]), and may be deployed over a very wide physical region, such that the total number of devices participating in the DCS system may be on the order of 106 or more nodes.
Energy constraints will often be severe; nodes will operate from battery power. These challenges suggest several specific, important design criteria for ensuring scalability and robustness in the distributed storage system we envision:
Persistence: a (k, v) pair stored in the system must remain available to queries, despite sensor node failures and changes in the sensor network topology.
Consistency: a query for k must be routed correctly to a node where (k, v) pairs are currently stored; if this node changes (e.g., to maintain persistence after a node failure), queries and stored data must choose a new node consistently.
Scaling in database size: as the number of (k, v) pairs stored in the system increases, whether for the same or different ks, storage should not concentrate at any one node.
Scaling in node count: as the number of nodes in the system increases, the system’s total storage capacity should increase, and the communication cost of the system should not grow unduly. Nor should any node become a concentration point of communication.
Topological generality: the system should work well on a broad range of network topologies.
VII. GHT: A GEOGRAPHIC HASH TABLE
The DCS system architecture we describe in this paper to meet the above-enumerated design criteria is
GHT, a Geographic Hash Table. The core step in GHT
is the hashing of a key k into geographic coordinates. Both a Put() operation and a Get() operation on the same key k hash k to the same location. A key–value pair is stored at a node in the vicinity of the location to which its key hashes. Choosing this node consistently is central to building a GHT. If we assume a perfectly static network topology and a network routing system that can deliver packets to positions, such a GHT will cause storage requests and queries for the same k to be routed to the same node, and will distribute the storage request and query load for distinct k values evenly across the area covered by a network.
ISSN: 2348 - 8549
http://www.internationaljournalssrg.org
Page 51
The service provided by GHT is similar in characterto those offered by other distributed hash table systems [2,6,]. However, as is the case with those systems, much of the nuance to the GHT system design arises specifically to ensure robustness and scalability in the face of the many sorts of failures possible in a distributed system. GHT uses a novel perimeter refresh protocol to provide both persistence and consistency when nodes fail or move. This protocol replicates stored data for key k at nodes around the location to which k hashes, and ensures that one node is chosen consistently as the home node for that k, so that all storage requests and queries for k can be routed to that node. Yet the protocol is efficient; it typically uses highly local communication, especially on networks where nodes are deployed densely. By hashing keys, GHT spreads storage and communication load between different keys evenly throughout the sensornet. When many events with the same key are stored,
GHT avoids creating a hotspot of communication and storage at their shared home node by employing structured replication, whereby events that hash to the same home node can be divided among multiple mirrors.
VIII.
ALGORITHMSWe proceed now to describe the algorithms that comprise GHT. GHT is built atop GPSR [14–16], a geographic routing system for multi-hop wireless networks. After briefly reviewing the features of GPSR’s design relevant to GHT, we identify a previously unexploited characteristic of GPSR that allows all packets destined for an arbitrary location (unoccupied by a node) to be routed consistently to the same node in the vicinity of that location. GHT leverages this characteristic to route storage requests and queries for the same key to the same node, despite the ignorance of the hash function that maps keys into locations of the placement of nodes in the network. We then describe algorithms and implementations of the perimeter refresh protocol and structured replication, which allow GHT to achieve the DCS design criteria for scalability and robustness discussed in the previous section
GPSR
Under GPSR, packets are routed geographically. All packets are marked with the positions of their destinations. All nodes know their own positions, and the positions of the nodes a single hop away from them. Using only this local knowledge, GPSR can
route a packet to any connected destination. Performance of such systems are strained in actual scenario depending on the operational efficiency of hardware and algorithms that executes the onerous tasks of reducing the redundancies in the live-streaming data.
IX.
CONCLUSION
The live streaming Data received in a WSN is voluminous. The Nodes have to act as supervisors to reduce the redundancies in order to enhance their efficiency and longevity / survival of Sensors, and thereby the WSN . GHT works well in sensornets with stable and static nodes. But failures (and movement, in some cases) are inevitable, and thus the robustness of design against these factors becomes paramount. Various robustness tests, subject design to very harsh environments. Most generous run with failing nodes use mean cycle-times of the order of minutes, far worse than for most projected sensornet systems, yet, as long as the fraction of failing nodes is not overly high, or the cycle times are tens of minutes, the system performs fairly well. Similarly, if the extent and rate of movement in the mobile node case is significant; nodes rest only a minute between movements, and the movements are large excursions (half the size of the sensornet, on average), not slight adjustments. Here, too, availability remains high.
GHT algorithm replicates a key–value pair at nodes in the immediate vicinity of the home node. Localized replication of this form is of little use if all the nodes in an area fail at the same time (e.g., a fire destroys all nodes in a region). Resilience against these clustered failures could be provided by storing each event-multiple times at dispersed locations (using event-multiple hash functions). The detailed ns-2 simulations verify the correctness and robustness of the GHT system in a realistic wireless environment, including MAC-layer behavior, packet loss, node dynamics, etc. However, they are limited to system sizes of the order of 100 nodes. Less detailed simulations are used to compare the three canonical mechanisms – external storage (ES), local storage (LS), and data-centric storage (DCS) – in much larger systems.
By building special-purpose simulator that assumes that nodes are stable and stationary and that packet delivery to neighboring nodes is instantaneous and error-free. This simulator thus faithfully represents the packet generation and
forwarding behavior of the various canonical mechanisms.
This simulator may be used to examine the number and pattern of packet transmissions (as a measure of energy consumption) as the size and nature of the sensornet and workload vary, to illuminate the relative performance of the canonical data dissemination algorithms. We do not count PRP packets in these simulations.
The length of a perimeter is purely determined by the density of the network, and only vary system scale in nodes, not density, in the large-scale simulations. Two metrics may be used to evaluate the performance namely :
- total number of packets generated, - and the hotspot usage,
ISSN: 2348 - 8549
http://www.internationaljournalssrg.org
Page 52
The hotspot usage is the maximum number of packetstransmitted by any single node. Latency is not measured, as that is approximately the same across the algorithms, assuming that all other factors (such as the fidelity of the data) are held fixed across the various algorithms. The relevant system parameters are:
• n, the number of nodes in the system; • T, the number of event types;
• Q, the number of event types queried for; • Di, the number of detected events of event type i.
A typical WSN, may have T = 100 and Di = 100 for all i, then by varying n and Q three tests may be done . In test #1, hold n fixed (n = 10000) and vary Q. In test #2, set Q = 50 and vary n; or hold the system density fixed and increase the sensornet size as n increases. In Test #3, use parameters as in test #1, but employ further optimization of in-network aggregation. All these results when averaged across ten different topologies, with ten runs on each topology of these tests, shows the results pertaining to the ES, the LS, and the following three versions of DCS:
Normal DCS (N-DCS). A query returns a separate message for each detected event.
Summarized DCS (S-DCS). A query returns a single message regardless of the number of detected events. Structured Replication DCS (SR-DCS). An optimal level of SR provides a lower bound. Summarization may be feasible. In fact, Other Researchers have predicted analytically and verified in simulations of networks of 100,000 nodes that the cases where DCS offers reduced total network load and hotspot network usage as compared with external storage and local storage are appreciable. Analysis reveals that these benefits occur on sensornets comprised of large node populations, and where many events are detected, but not all event types are queried.
GHT leverages the GPSR geographic routing system to offer an efficient DCS service that maintains persistence of data when nodes fail or move, while scalably spreading the load of (key, value) pairs evenly
throughout the sensornet. Some aspects need further investigation. Foremost among these is the effect of varying node density.
GHT fundamentally requires that a node know it’s own geographic position. Under GHT, keys are uniformly hashed over the geographic space. Hence, as nodes are distributed increasingly non-uniformly, we expect the storage and forwarding load across nodes to be correspondingly skewed. Consequent performance implications and developing mechanisms that can adapt to such non-uniformity need further research work to achieve higher efficiency in the field
ACKNOWLEDGMENT
I acknowledge the Management of Swami Devi Dyal Group of Institutions for providing me the opportunity to write this technical paper and the resources to conduct the National Conference.
REFERENCES
[1] Y.F. Hu: Wireless Sensor Networks: a Survey on the State of the Art and the 802.15.4 and ZigBee Standards. In: Computer Communications, vol. 30, n. 7, pp. 1655-1695 (2007)
[2] E. Cayirci: A survey on sensor networks. In:IEEE Communications Magazine vol. 40, n. 8 (2002)
[3] S. Chessa: Sensor Network Standards. book chapter in: J. Zheng and A. Jamalipour, Wireless Sensor Networks: A Networking Perspective, Wiley-IEEE Press, ISBN: 978-0-470-16763-2, p. 407-431, September 2009
[4] P.A. Dinda: Archetype-based design: Sensor network programming for application experts, not just programming experts. In: IPSN ’09: Proceedings of the 2009 International Conference on Information Processing in Sensor Networks, p. 85-95 (2009) [5] R.K. Gupta: Programming Models for Sensor Networks: A Survey. In: ACM Transactions on Sensor Networks vol. 4, n. 2, 1-28, March 2008
[6] R. Buyya: Service Oriented Sensor Web. In: Sensor Networks and Configuration: Fundamentals, Standards, Platforms and Applications, Springer- Verlag, pp. 51-74 (2007)
[7] T. Braun: Programmability Models for Sensor Networks. In: First International Conference on Autonomous Infrastructure, Management and Security, AIMS 2007, p. 233, June 21-22, 2007
[8] S.K. Dasi: Middleware for wireless sensor networks: A survey. In: Journal of Computer Science and Technology 23(3) (May 2008) 305-326
[9] Middleware for wireless sensor networks: Challenges and Approaches. In: Seminar on Internetworking, Helsinki University of Technology, Finland, April 27th 2009.
[10] An Evaluation Framework for middleware approaches on Wireless Sensor Networks. In: Seminar on Internetworking, Helsinki University of Technology, Finland, April 27th 2009