This thesis investigates the problems of mining sensor and mobility data in cyber-physical systems. Chapter 3 proposes a novel method, LiSM, to discover intruder trajectories from untrustworthy sensor data. The watching network is designed to detect intruder appearances, and the cone model is used to track their trajectories. The proposed algorithms are evaluated in extensive experiments on large datasets. LiSM outperforms the state-of-the-art methods on both detection and tracking tasks, with higher precision and recall. Chapter 4 investigates the problem of traveling companion discovery on trajectory data streams. The study proposes algorithms to efficiently generate companions from trajectory data. The traveling buddy model is designed to improve both the clustering and intersection processes for companion discovery. The proposed methods are extended to more complex scenarios for road companion and loose companion discovery. The proposed algorithms are evaluated in extensive experiments on both real and synthetic datasets. The buddy-based method is shown to be an order of magnitude faster than existing approaches in both Euclidean space and road networks. The buddy-based algorithm also outperforms its competitors in effectiveness, measured by precision and recall.
There are many interesting directions of future work in the line of cyber-physical data mining, such as combining CPS with social networks, developing novel mining functions on feature-rich movement data, and integrating the technology with real-world interdisciplinary applications.
• Combining CPS with social networks: Social network analysis has attracted much attention in recent years, and integrating CPS with social networks is an attractive direction. Combining physical data with social networks raises several novel problems, including: (1) indexing, searching, and querying CPS data via the social network structure; (2) addressing privacy and security issues when publishing CPS data on social networks; (3) discovering spatial-temporal patterns from CPS data with social networks, e.g., mining traveling patterns in a location-based social network. This line of study would enrich the research of both CPS and social networks.
• Mining feature-rich movement data: Movement data are usually collected together with rich feature information about the objects, e.g., a traveler's trajectory can be collected by a smart phone along with the user's profile, text messages, contacts, and so on. This information not only helps to understand the semantics and purpose of the movements, but also helps to improve many mining functions, such as target prediction, traveler clustering, and route recommendation.
• Integrating with real-world interdisciplinary applications: Information management and data mining on CPS represent an important research frontier in databases, data mining, sensor networks, and information technology. This technology has a wide range of applications across different domains, such as patient healthcare, battlefield surveillance, traffic monitoring, and other cases in science, engineering, education, society, and any field with massive, dynamic, heterogeneous, and interrelated physical and virtual data. It is important to integrate the algorithms with real applications to improve system performance.
Acknowledgements
The thesis work was supported in part by the U.S. Army Research Laboratory under Cooperative Agreement Nos. W911NF-09-2-0053 (NS-CTA) and W911NF-11-2-0086 (Cyber-Security), the U.S. Army Research Office under Cooperative Agreement No. W911NF-13-1-0193, DTRA, and U.S. National Science Foundation grants CNS-0931975, IIS-1017362, IIS-1320617, IIS-1354329. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.