An Integrated Data Management Framework of Wireless Sensor Network
for Agricultural Applications
1,2 Zhao Liang, 2 He Liyuan, 1 Zheng Fang, 1 Jin Xing
1 College of Science, Huazhong Agricultural University, Wuhan 430070, People's Republic of China, [email protected]
*2 Resources and Environmental Science, Huazhong Agricultural University, Wuhan 430070, People's Republic of China, [email protected]
Abstract
This paper proposes a framework that provides integrated sensor data management services for data collected from agricultural environments with WSN (Wireless Sensor Network) technology. Its main functions are analyzing sensor data, standardizing different data formats, and providing an intelligent diagnosis service and an event service based on a sensor data library and a crop knowledge database; these services guide agricultural production processes such as irrigation control or disease prevention. In addition, a fuzzy inference-based intelligent diagnosis method is developed to support more precise decision-making by agricultural producers, and a web-based remote login lets all users interact with the integrated service system. Unlike other platforms, the integrated framework provides transparent services to users rather than only displaying sensor data, which by itself is of little use.
Keywords: WSN, Agricultural Application, Intelligent Diagnosis, Fuzzy Inference
1. Introduction
WSN technology is increasingly common in all sectors because of its small node size and low cost. Typical applications include agricultural production process management, precision agriculture, optimization of plant growth, farmland monitoring and so on [1]. In these applications, the acquisition of farmland environmental parameters, such as air temperature, humidity, light intensity, wind speed and soil moisture, is an important foundation of agricultural practice and farmland information management. However, it is difficult to obtain these parameters continuously and quickly, and equally difficult for farmland managers and agricultural decision-makers to make accurate and timely decisions. With the maturity and popularity of WSN technology, a large number of heterogeneous wireless sensor nodes can be deployed in the fields and organized into a multi-hop intelligent network that acquires distributed farmland environmental information continuously and in time.
Although WSN can solve the problem of data acquisition, several issues remain in most applications: first, the collection and storage of data from heterogeneous sensor nodes; second, the conversion of raw data into meaningful information and decisions; and third, the maintenance, deployment and development of WSN applications, which are subject to specific constraints and have no universal solution [2]. Finding effective methods of data collection, processing, management and application is the key to solving these problems.
Research on sensor data mainly concentrates on the following areas: missing value imputation, fault management and outlier detection, and efficient collection of sensing data.
In [3] and [4], missing value estimation algorithms based on the multiple regression model and on spatio-temporal correlation are introduced, treating smoothly and non-smoothly changing sensing data separately. In [5], a fault detection method for WSNs based on principal component analysis is introduced, which offers low complexity, an efficient recursive implementation and good performance. Two new robust subspace tracking algorithms, the robust orthonormal projection approximation subspace tracking (OPAST) with rank-1 modification and the robust OPAST with deflation, are developed to reduce the computational complexity of eigendecomposition (ED) or singular value decomposition. Furthermore, new robust T2 score and SPE detection criteria with recursive update formulas are developed to improve robustness over their conventional counterparts and to facilitate online implementation of the proposed robust subspace ED and tracking algorithms. In [6], a fault detection method for greenhouse WSNs is introduced: the spatio-temporal correlation of the sample data is analyzed to establish a mathematical fault detection model, and a comprehensive algorithm is given to analyze the working status of the sensor nodes. In [7], a fault detection strategy is proposed that models a sensor node by a Takagi–Sugeno–Kang (TSK) fuzzy inference system (FIS) and a recurrent TSK-FIS (RFIS), where the sensor measurement of the node is approximated as a function of the real measurements of the neighboring nodes and the previously approximated value of the node itself. However, the data used by that method is generated by a mathematical formula, so the performance of the algorithm still needs to be verified on measured data sets. Similar to [7], the fuzzy logic based fault detection and management scheme proposed in [8] analyzes the possibility of sensor node failure from the hardware point of view, considering battery condition, sensor condition and receiver condition. In addition, a distributed fault detection algorithm for wireless sensor networks is presented in [9], based on comparisons between neighboring nodes and dissemination of the decision made at each node; a sliding window is employed to eliminate the delay involved in the time redundancy scheme. Effective collection algorithms for sensing data based on spatial correlation and spatio-temporal correlation are discussed in [10] and [11], which also address delay and the energy consumption of sensor nodes. To prolong the lifetime of sensor networks, many data compression methods have been proposed; for example, [12] proposes a new distributed data aggregation technique, the hybrid compression technique (HCT), which is based on the Voronoi diagram and considers the characteristics and location information of nodes in sensor networks.
In [13], an approximate data gathering technique called EDGES is presented that utilizes temporal and spatial correlations. A multiple model Kalman filter is used as the approximation approach to efficiently obtain sensor readings within a certain error bound. Moreover, data storage is a hot topic in heterogeneous WSN research. Some data storage technologies focus on efficient data storage and access, and some focus on other aspects, but the security of data storage in WSN is often ignored. In [14], a new secure data storage technology for heterogeneous WSNs is proposed that applies a multi-key mechanism to data storage on top of an efficient network hierarchy.
2. Architecture design of the proposed system
The architecture of the data management middleware is divided into five functions, namely, data collection, data preprocessing, data storage, service delivery, and Web service. The system hierarchy is shown in Fig.1.
Figure 1. Hierarchy Chart of the Proposed Framework
The specific data flow chart and structure are shown in Fig. 2.
[Diagram labels: data preprocessing layer, data storage layer, service delivery layer, Web service interface layer]
Figure 2. Data Flow Chart and Structure of the Proposed System
The main function of data collection is to interact with the heterogeneous WSN, to access sensor data continuously, and to manage the states of the various sensor networks.
Data preprocessing has two main functions: data estimation and data aggregation. The received sensor data may be discontinuous or even missing because of node faults or interference during transmission, which affects data analysis. To reduce the impact of missing data, the corresponding missing value estimation algorithms are used to predict missing values whenever they are detected in the sensor data delivered by the data acquisition engine. Data aggregation computes statistics of the sensor data, such as the average, maximum and minimum values, for each aggregation cycle.
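The two preprocessing functions can be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify its estimation algorithm here, so plain linear interpolation is swapped in as a stand-in, and the function names `fill_missing` and `aggregate`, the sample readings and the cycle length are all hypothetical.

```python
from statistics import mean


def fill_missing(readings):
    """Estimate None gaps by linear interpolation between known neighbours.

    Stand-in for the missing value estimation step (assumption: the paper's
    own algorithm is not reproduced). Gaps at either end copy the nearest
    known value.
    """
    known = [i for i, v in enumerate(readings) if v is not None]
    if not known:
        raise ValueError("no known readings to estimate from")
    filled = list(readings)
    for i, v in enumerate(readings):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = readings[right]
        elif right is None:
            filled[i] = readings[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = readings[left] + frac * (readings[right] - readings[left])
    return filled


def aggregate(readings, cycle_len):
    """Compute maximum, minimum and average for each aggregation cycle."""
    summary = []
    for start in range(0, len(readings), cycle_len):
        window = readings[start:start + cycle_len]
        summary.append({
            "MaxValue": max(window),
            "MinValue": min(window),
            "AvrValue": mean(window),
        })
    return summary


# Hypothetical temperature samples with three missing values.
temps = [24.0, None, 26.0, 27.0, None, None, 30.0, 29.0]
clean = fill_missing(temps)
summary = aggregate(clean, cycle_len=4)
print(summary)
```

Estimating before aggregating keeps every cycle the same length, so a dropped packet does not bias the per-cycle average toward the surviving samples.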
The main task of data storage is to store the heterogeneous sensor data produced by the data preprocessing layer in the sensor DB. In addition, crop models are stored in the crop knowledge DB. Each crop may have more than one crop model, and each model may have one or more rules. A data access controller is placed in the data storage layer to support various forms of queries.
Service delivery provides various query processing functions for sensor data and crop model data. This layer has two functions: intelligent diagnosis and real-time forecasting. An aggregation engine is called to make intelligent diagnosis possible: aggregated values are compared with the rules, and the intelligent service management module is notified in advance if a value equals or exceeds a threshold value.
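The comparison of aggregated values against rule thresholds might look like the sketch below. The `Rule` fields, the threshold values and the `print` call standing in for the notification are assumptions made for illustration; the paper does not give the rule format at this level of detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    sensor_type: str   # e.g. "humidity" (hypothetical field)
    aggregate: str     # "MaxValue", "MinValue" or "AvrValue"
    threshold: float   # trip point taken from the crop model


def triggered_rules(aggregated, rules):
    """Return the rules whose aggregated value equals or exceeds the threshold."""
    hits = []
    for rule in rules:
        value = aggregated.get(rule.sensor_type, {}).get(rule.aggregate)
        if value is not None and value >= rule.threshold:
            hits.append(rule)
    return hits


# Hypothetical aggregated values for one aggregation cycle.
aggregated = {
    "humidity": {"MaxValue": 96.0, "MinValue": 88.0, "AvrValue": 92.5},
    "temperature": {"MaxValue": 27.0, "MinValue": 21.0, "AvrValue": 24.0},
}
rules = [
    Rule("humidity", "AvrValue", 90.0),
    Rule("temperature", "MaxValue", 30.0),
]

alerts = triggered_rules(aggregated, rules)
for rule in alerts:
    # Stand-in for notifying the intelligent service management module.
    print(f"notify: {rule.sensor_type} {rule.aggregate} >= {rule.threshold}")
```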
The Web service is an open interface at the top layer of the system; it supports connections with outside users through browsing and query processing.
[Diagram labels: WSN interface, data preprocessing (aggregation, estimation), database controller (query, parse), service delivery (diagnosis, forecast), Web service interface; sensors, gateway, sensor data DB, crop knowledge DB; environment monitoring, feedback control, intelligent diagnosis, data query/visualization, real-time alerting]
3. Implementation of the proposed system
3.1. Data requirement
The main data repository of the proposed middleware is a relational database built on MySQL. Fig. 3 depicts part of the entity relationship diagram of the data management subsystem, which is composed of 9 tables [15].
Figure 3. Entity Relationship Diagram of Sensor Data Management System
The sensor table is the main table, since the sensor is the most basic data-generating device in the network. The table stores the sensor type, the data value, and other information such as the date. Each sensor belongs to a node, and each node belongs to a user-specific management zone, that is, a certain gateway, in which the coordinate stores the relative location within the zone. Because there are different sensor types in one zone, all sensor type information is stored in the sensortype table. Operation rules are described in the rule table, which contains the trip point values used by the crop model. In addition, diagnosis results are stored in the diagnosis table. In the diagram, Z denotes a zero-or-many relationship, P denotes a one-or-many relationship, and FK denotes a foreign key.
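The relationships described above (sensor belongs to node, node belongs to zone, sensor refers to a sensor type) can be illustrated with a reduced schema. This is a sketch only: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, it covers four of the nine tables, and every column name and sample row is hypothetical rather than taken from the paper's diagram.

```python
import sqlite3

SCHEMA = """
CREATE TABLE zone (
    zone_id   INTEGER PRIMARY KEY,
    gateway   TEXT NOT NULL
);
CREATE TABLE node (
    node_id   INTEGER PRIMARY KEY,
    zone_id   INTEGER NOT NULL REFERENCES zone(zone_id),
    coord_x   REAL,
    coord_y   REAL
);
CREATE TABLE sensortype (
    type_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE sensor (
    sensor_id INTEGER PRIMARY KEY,
    node_id   INTEGER NOT NULL REFERENCES node(node_id),
    type_id   INTEGER NOT NULL REFERENCES sensortype(type_id),
    value     REAL NOT NULL,
    read_date TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK links in SQLite
conn.executescript(SCHEMA)

# Hypothetical sample rows.
conn.execute("INSERT INTO zone VALUES (1, 'GW01')")
conn.execute("INSERT INTO node VALUES (1, 1, 0.0, 0.0)")
conn.execute("INSERT INTO sensortype VALUES (1, 'humidity')")
conn.executemany(
    "INSERT INTO sensor (node_id, type_id, value, read_date) VALUES (?, ?, ?, ?)",
    [(1, 1, 90.0, "2012-06-01"), (1, 1, 94.0, "2012-06-01")],
)

# Average reading per gateway and sensor type, following the FK chain.
rows = conn.execute(
    """
    SELECT z.gateway, t.name, AVG(s.value)
    FROM sensor s
    JOIN node n ON n.node_id = s.node_id
    JOIN zone z ON z.zone_id = n.zone_id
    JOIN sensortype t ON t.type_id = s.type_id
    GROUP BY z.gateway, t.name
    """
).fetchall()
print(rows)
```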
3.2. Intelligent Diagnosis Algorithm based on fuzzy inference
Generally, when temperature, humidity or illumination lies in a certain range, the crop either grows at its best or becomes vulnerable to certain plant diseases or insect pests. For example, paddy easily develops rice blast under suitable conditions of temperature, humidity, rain and fog. The suitable temperature for hypha growth is 8 ~ 37℃, and the optimum temperature is 26 ~ 28℃. The spore forming temperature is 10 ~ 35℃, with an optimum of 25 ~ 28℃ and a relative humidity above 90%, and the spores germinate when water is present for 6 ~ 8 hours.
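As a small worked example, the temperature ranges quoted above can be written as a crisp (non-fuzzy) check. The function name `grade` and its three labels are hypothetical; only the numeric ranges come from the text. The hard boundaries of such a check are one reason a fuzzy treatment is attractive for values near the edges.

```python
# Temperature ranges for rice blast quoted in the text (degrees Celsius).
HYPHA_SUITABLE = (8.0, 37.0)
HYPHA_OPTIMUM = (26.0, 28.0)
SPORE_SUITABLE = (10.0, 35.0)
SPORE_OPTIMUM = (25.0, 28.0)


def grade(value, suitable, optimum):
    """Crisp classification of one reading against a suitable/optimum range."""
    if optimum[0] <= value <= optimum[1]:
        return "optimum"
    if suitable[0] <= value <= suitable[1]:
        return "suitable"
    return "unsuitable"


print(grade(27.0, HYPHA_SUITABLE, HYPHA_OPTIMUM))
print(grade(30.0, SPORE_SUITABLE, SPORE_OPTIMUM))
print(grade(4.0, HYPHA_SUITABLE, HYPHA_OPTIMUM))
```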
How to accurately control and adjust environmental parameters so as to keep growth in the best state, or to make effective predictions, is a key question for crop growth management. To solve this problem, several aspects must be considered: first, how to use the sensor data to mine hidden knowledge; second, how to use and express crop models and expert knowledge; and third, which methods to use for prediction. In the proposed system, a fuzzy logic based system is used to diagnose and predict crop growth states in time. The process is as follows:
(1) Processing sensor data
a. Receive sensor data D from gateway;
b. Send D to Data Queue;
c. Receive data D from Data Queue and decode the data;
(2) Parse crop model
a. Query ModelID, CropID according to GWID and ModelType;
b. Get information of all rules according to ModelID, such as RuleType and other values;
(3) Aggregate sensor data
a. Compute the MaxValue, MinValue, and AvrValue of each sensor type in a time duration according to the user's requirement;
(4) Call fuzzy inference system
a. Determine fuzzy parameters, membership functions and fuzzy rules;
b. Input the data obtained in the third step to the fuzzy inference system;
(5) Return results to the proposed system
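The five steps above can be sketched end to end as follows. This is a minimal illustration, not the authors' Java implementation: the JSON packet format, the in-memory `MODELS` and `RULES` dictionaries standing in for the crop knowledge DB, the `TripPoint` key and the `placeholder_infer` function are all assumptions. The identifiers ModelID, CropID, GWID, ModelType, RuleType, MaxValue, MinValue and AvrValue follow the step list.

```python
import json
from queue import Queue
from statistics import mean

# Hypothetical stand-ins for the crop knowledge DB lookups.
MODELS = {("GW01", "disease"): {"ModelID": 1, "CropID": 10}}
RULES = {1: [{"RuleType": "humidity", "TripPoint": 90.0}]}


def diagnose(packets, gwid, model_type, fuzzy_infer):
    # (1) Process sensor data: enqueue raw packets, then dequeue and decode.
    data_queue = Queue()
    for packet in packets:
        data_queue.put(packet)
    readings = {}
    while not data_queue.empty():
        record = json.loads(data_queue.get())
        readings.setdefault(record["type"], []).append(record["value"])

    # (2) Parse crop model: find the model for this gateway, then its rules.
    model = MODELS[(gwid, model_type)]
    rules = RULES[model["ModelID"]]

    # (3) Aggregate sensor data per sensor type.
    aggregated = {
        sensor_type: {
            "MaxValue": max(values),
            "MinValue": min(values),
            "AvrValue": mean(values),
        }
        for sensor_type, values in readings.items()
    }

    # (4) Call the inference step and (5) return its result to the caller.
    return fuzzy_infer(aggregated, rules)


def placeholder_infer(aggregated, rules):
    # Placeholder only: reports which trip points the averages reach.
    return {
        r["RuleType"]: aggregated[r["RuleType"]]["AvrValue"] >= r["TripPoint"]
        for r in rules
    }


packets = [
    json.dumps({"type": "humidity", "value": 92.0}),
    json.dumps({"type": "humidity", "value": 96.0}),
    json.dumps({"type": "temperature", "value": 27.0}),
]
result = diagnose(packets, "GW01", "disease", placeholder_infer)
print(result)
```

Passing the inference step in as a callable keeps the data path independent of the inference engine, which loosely mirrors the paper's split between the Java middleware and the Matlab fuzzy system.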
3.3. Implementation of the proposed system
A simple disease probability monitoring and prediction subsystem is developed using Java and Matlab. The example is based on a rice blast prediction model. The input variables of the fuzzy system are the parameters of each rule, such as hypha growth temperature, spore forming temperature, humidity and so on, and the output of the fuzzy system is the occurrence probability of rice blast, classified into three levels: 0~50% is low, 50%~80% is medium, and 80%~100% is high.
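The three output levels can be expressed as a small helper. The text does not say which level the boundary values 50% and 80% belong to, so this sketch assumes they fall into the higher level; the function name `risk_level` is hypothetical.

```python
def risk_level(probability_percent):
    """Map a rice blast occurrence probability (in percent) to a level.

    Assumption: the boundary values 50 and 80 belong to the higher level.
    """
    if not 0.0 <= probability_percent <= 100.0:
        raise ValueError("probability must be between 0 and 100")
    if probability_percent < 50.0:
        return "low"
    if probability_percent < 80.0:
        return "medium"
    return "high"


for p in (19.1842, 56.0277, 93.6667):
    print(p, risk_level(p))
```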
The labels of input variables are as follows:
Hypha growth temperature = {Low, Optimum, High}
Spore growth temperature = {Low, Optimum, High}
Humidity = {Low, High}
Time duration = {Short, Optimum, Long}
The output labels are as follows:
Rice blast disease possibility = {Low, Medium, High}
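Enumerating the input label sets shows how many distinct rule antecedents are possible: 3 × 3 × 2 × 3 = 54, which matches the number of fuzzy rules the paper reports. The dictionary keys below are hypothetical identifiers; the labels are those listed above.

```python
from itertools import product

INPUT_LABELS = {
    "hypha_growth_temperature": ("Low", "Optimum", "High"),
    "spore_growth_temperature": ("Low", "Optimum", "High"),
    "humidity": ("Low", "High"),
    "time_duration": ("Short", "Optimum", "Long"),
}

# Every combination of one label per input variable is one rule antecedent.
antecedents = list(product(*INPUT_LABELS.values()))
print(len(antecedents))  # 3 * 3 * 2 * 3 = 54
```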
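Triangular and trapezoidal membership functions are selected to model the environmental parameters. The membership functions of hypha growth temperature, spore growth temperature and time duration are represented as follows [8]:

$$\mu_L(x) = \begin{cases} 1, & x \le a \\ \dfrac{b - x}{b - a}, & a < x < b \\ 0, & x \ge b \end{cases}$$

$$\mu_M(x) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x < b \\ 1, & b \le x \le c \\ \dfrac{d - x}{d - c}, & c < x < d \\ 0, & x \ge d \end{cases}$$

$$\mu_H(x) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x < b \\ 1, & x \ge b \end{cases}$$

The membership functions of humidity are presented as:

$$\mu_L(x) = \begin{cases} 1, & x \le a \\ \dfrac{b - x}{b - a}, & a < x < b \\ 0, & x \ge b \end{cases}$$

$$\mu_H(x) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x < b \\ 1, & x \ge b \end{cases}$$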
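The three membership shapes (a decreasing left shoulder for the low label, a trapezoid for the middle label, an increasing right shoulder for the high label) translate directly into code. The paper does not list its breakpoints a, b, c, d, so the values used in the example calls are assumptions borrowed from the hypha growth temperature ranges quoted earlier (8, 26, 28, 37 ℃); the function names are hypothetical.

```python
def mu_low(x, a, b):
    """Left shoulder: 1 up to a, falling linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def mu_mid(x, a, b, c, d):
    """Trapezoid: 0 outside (a, d), rising on (a, b), 1 on [b, c], falling on (c, d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def mu_high(x, a, b):
    """Right shoulder: 0 up to a, rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Assumed breakpoints for hypha growth temperature (degrees Celsius).
print(mu_low(17.0, 8.0, 26.0))              # halfway down the left shoulder
print(mu_mid(27.0, 8.0, 26.0, 28.0, 37.0))  # inside the flat top
print(mu_high(37.0, 28.0, 37.0))            # fully "High"
```

A triangular function is the special case b = c of the trapezoid.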
In total, 54 fuzzy rules are built to model the different conditions. A fuzzy rule is written as the following statement [7]:

$$R^{l}: \text{IF } x_1 \text{ is } B_1^{l} \text{ and } x_2 \text{ is } B_2^{l} \text{ and } \dots \text{ and } x_n \text{ is } B_n^{l} \text{ THEN } y \text{ is } y^{l}$$

where $R^{l}$ ($l = 1, 2, \dots, M$) denotes the $l$-th implication, $x_j$ ($j = 1, 2, \dots, n$) are the input environmental variables of the fuzzy logic system, $y^{l}$ is a singleton, and $B_j^{l}$ is the fuzzy membership function, which can represent the uncertainty in the reasoning. Part of the rules is shown in Table 1.
Table 1. Part of the Fuzzy Reasoning Rules

| Number | Hypha growth temperature | Spore growth temperature | Humidity | Time duration | Output |
|---|---|---|---|---|---|
| 1 | Low | Low | Low | Short | Low |
| 2 | Low | Optimum | High | Optimum | Medium |
| 3 | Low | High | High | Long | Low |
| 4 | Optimum | Low | High | Optimum | Medium |
| 5 | Optimum | Optimum | High | Optimum | High |
| 6 | Optimum | High | High | Optimum | Medium |
| 7 | High | Optimum | High | Optimum | Medium |
| 8 | High | High | Low | Short | Low |
| 9 | Low | Low | Low | Short | Low |
| 10 | Low | Optimum | High | Optimum | Medium |
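A rule base of this IF-THEN form with singleton outputs can be evaluated as sketched below. The paper does not state how "and" is combined or how the singletons are merged, so this sketch assumes the minimum for "and" and a firing-strength-weighted average of the singletons. The two-input rule subset, the singleton values (20 and 90) and all breakpoints are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def ramp_down(a, b):
    """Membership that is 1 below a and falls linearly to 0 at b."""
    return lambda x: min(1.0, max(0.0, (b - x) / (b - a)))


def ramp_up(a, b):
    """Membership that is 0 below a and rises linearly to 1 at b."""
    return lambda x: min(1.0, max(0.0, (x - a) / (b - a)))


def trapezoid(a, b, c, d):
    """Membership that rises on (a, b), is 1 on [b, c] and falls on (c, d)."""
    return lambda x: max(0.0, min((x - a) / (b - a), 1.0, (d - x) / (d - c)))


def infer(inputs, rules):
    """Evaluate rules of the form (antecedent memberships, singleton output).

    Assumptions: "and" is the minimum of the memberships, and the crisp
    output is the weighted average of the singletons of the fired rules.
    """
    num = den = 0.0
    for antecedents, y_l in rules:
        w = min(mu(x) for mu, x in zip(antecedents, inputs))
        num += w * y_l
        den += w
    if den == 0.0:
        raise ValueError("no rule fired for these inputs")
    return num / den


temp_optimum = trapezoid(8.0, 26.0, 28.0, 37.0)   # assumed breakpoints
hum_low = ramp_down(50.0, 90.0)                   # assumed breakpoints
hum_high = ramp_up(50.0, 90.0)

rules = [
    ((temp_optimum, hum_high), 90.0),  # IF temp is Optimum and humidity is High
    ((temp_optimum, hum_low), 20.0),   # IF temp is Optimum and humidity is Low
]

possibility = infer([27.0, 70.0], rules)
print(possibility)
```

With a temperature of 27 ℃ and 70% humidity both rules fire with strength 0.5, so the output sits halfway between the two singletons.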
The membership functions of the variable xi (where i = 4) are obtained through the fuzzy toolbox of Matlab. A group of 10 aggregated sensor data records is fed to the fuzzy system, and the fuzzy reasoning results are shown in Table 2.
Table 2. Part of the Fuzzy Reasoning Results

| Hypha growth temperature (℃) | Spore growth temperature (℃) | Humidity (%) | Time duration (hour) | Possibility (%) |
|---|---|---|---|---|
| 4 | 5 | 50 | 4 | 19.1842 |
| 15 | 20 | 50 | 4 | 21.1544 |
| 21 | 26 | 70 | 7 | 56.0277 |
| 27 | 26 | 95 | 7 | 93.6667 |
| 27 | 30 | 50 | 4 | 20.6033 |
| 35 | 30 | 95 | 7 | 55.7214 |
| 50 | 40 | 95 | 7 | 19.1842 |
| 50 | 40 | 50 | 4 | 19.1842 |
| 15 | 20 | 95 | 7 | 54.5700 |
| 4 | 5 | 99 | 7 | 19.1842 |
The results in the table show that time duration has the smallest impact on the results. When the hypha growth temperature and the spore growth temperature are in the suitable range, the probability of blast occurrence is higher than under other conditions, and when they are in the optimum range it is significantly higher still. These results give agricultural managers a reference for adjusting the environmental parameters and keeping crop growth in the optimal state. At the same time, because the correlation coefficients of the four factors with respect to the possibility result are not considered, four results are identical (19.1842), which makes it difficult to distinguish the key factor from the secondary ones.
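The observation about identical outputs can be checked directly against the values of Table 2. The snippet below only re-reads the published numbers; the variable names are hypothetical.

```python
from collections import Counter

# Rows of Table 2:
# (hypha temp C, spore temp C, humidity %, duration h, possibility %)
TABLE_2 = [
    (4, 5, 50, 4, 19.1842),
    (15, 20, 50, 4, 21.1544),
    (21, 26, 70, 7, 56.0277),
    (27, 26, 95, 7, 93.6667),
    (27, 30, 50, 4, 20.6033),
    (35, 30, 95, 7, 55.7214),
    (50, 40, 95, 7, 19.1842),
    (50, 40, 50, 4, 19.1842),
    (15, 20, 95, 7, 54.5700),
    (4, 5, 99, 7, 19.1842),
]

counts = Counter(row[-1] for row in TABLE_2)
value, occurrences = counts.most_common(1)[0]
same_output_inputs = [row[:4] for row in TABLE_2 if row[-1] == value]

print(value, occurrences)
for inputs in same_output_inputs:
    print(inputs)
```

The four input combinations that share this output differ widely in temperature, humidity and duration, which illustrates why the output alone does not reveal which factor dominates.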
The web service of this system is implemented with JSP (Java Server Pages). Users can log in to the system remotely and browse and query all the information, including historical sensor data, expert knowledge, intelligent diagnosis and other services. Fig. 4 shows the data statistics interface, and Fig. 5 shows the functions' UI of the integrated system, which displays real-time fluctuations of sensor data such as illumination, temperature and humidity collected from the environment; when an intelligent diagnosis produces a result, a warning is also displayed in the interface.
Figure 4. the Data Statistics Functions’ UI of the Integrated System
4. Conclusions and Future Work
In this paper, an agricultural application-oriented sensor data management middleware is developed, which efficiently processes sensor data collected from the environment and delivers combined services through the Web. Unlike other platforms, the integrated framework provides transparent services to users rather than only displaying sensor data, which by itself is of little use. The system runs a data analysis engine and a model parse engine. An intelligent diagnosis method based on fuzzy inference is developed to provide more precise information for agricultural management. In this way, raw sensor data sets are turned into meaningful knowledge.
In future studies, more models are needed to enrich the model DB and make the diagnosis more versatile. The expression of crop models and the associated expert knowledge also needs to be improved, and more effective and efficient algorithms will be developed for sensor data estimation and intelligent diagnosis. In addition, the correlation coefficients of the different factors can be considered in the fuzzy inference system to improve the accuracy of the results. Moreover, sensor data cannot simply be treated as discrete data; better methods for sensor data stream processing must be explored in future work.
5. References
[1] W.S. Lee, V.Alchanatis, C.Yang, M.Hirafuji, D.Moshou, C.Li, “Sensing Technologies for Precision Specialty Crop Production”, Computers and Electronics in Agriculture, vol.74, pp.2-33, 2010.
[2] Jeonghwang, H., Hyun, Y., “Study on the Context-Aware Middleware for Ubiquitous Greenhouses Using Wireless Sensor Networks”, Sensors, vol.11, pp. 4539-4561, 2011.
[3] Pan Liqiang, Li Jianzhong, “A Multiple-Regression-Model-Based Missing Values Imputation Algorithm in Wireless Sensor Network”, Journal of Computer Research and Development, pp. 2101-2110, 2009. (in Chinese)
[4] PAN Li Qiang, LI Jian-Zhong, LUO Ji Zhou, “A Temporal and Spatial Correlation Based Missing Values Imputation Algorithm in Wireless Sensor Networks”, CHINESE JOURNAL OF COMPUTERS, pp 1-11, 2010.(in Chinese)
[5] S. C. Chan, H. C. Wu, K. M. Tsui, “Robust Recursive Eigendecomposition and Subspace-Based Algorithms with Application to Fault Detection in Wireless Sensor Networks”, IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, pp 1703-1718, 2012.
[6] Zhang Rongbiao, Bai Bin, Li Kewei, et al, “Fault Diagnosis of the Greenhouse WSN Based on the Time Series and Space Series Analysis”, Transactions of the Chinese Society for Agricultural Machinery, pp. 155-179, 2009.
[7] S.A.Khan, B.Daachi, K.Djouani, “Application of fuzzy inference systems to detection of faults in wireless sensor networks”, Neurocomputing, pp. 111-120, 2012.
[8] P. Chanak, I. Banerjee, T. Samanta, et al, “FFMS: Fuzzy Based Fault Management Scheme in Wireless Sensor Networks”, Proc.of ICECCS 2012,CCIS, pp. 30-38.
[9] Myeong-Hyeon Lee, Yoon-Hwa Choi, “Fault detection of wireless sensor networks”,Computer Communications, pp. 3469-3475, 2008.
[10]Leandro A.Villas, Azzedine Boukerche, Daniel L. Guidoni,et al, “An energy-aware spatio-temporal correlation mechanism to perform efficient data collection in wireless sensor networks”, Computer Communications, pp. 1-13, 2012.
[11]Leandro A. Villas, Azzedine Boukerche, Horacio A.B.F. de Oliveira,et al, “A spatial correlation aware algorithm to perform efficient data collection in wireless sensor networks”. Ad Hoc Networks, pp. 1-17, 2011.
[12]Zaid A. Ali Al-Marhabi, LiRen Fa, FanZi Zeng, et al, “The Design and Evaluation of a Hybrid Compression Technique (HCT) for Wireless Sensor Network”. International Journal of Digital Content Technology and its Applications (JDCTA), pp.201-207, 2011.
[13]Jun-Ki Min, Chin-Wan Chung, “EDGES: Efficient data gathering in sensor networks using temporal and spatial correlations”. The Journal of Systems and Software, pp. 271-282, 2010.
[14]Fang Rui, “The Study of Security Data Storage Technology in Heterogeneous Wireless Sensors Network”. International Journal of Digital Content Technology and its Applications(JDCTA), pp.21-27, 2012.
[15]Emanuel, P., Miguel, A. F., Raul, M., et al, “An Autonomous Intelligent Gateway Infrastructure for in-field Processing in Precision Viticulture”, Computers and Electronics in Agriculture, pp. 176-187, 2011.