Three approaches to missing attribute values are presented in a unified way. The main tool applied is the characteristic relation, a generalization of the indiscernibility relation. It is shown that all three approaches to missing attribute values may be described using the same idea of attribute-value blocks. Moreover, attribute-value blocks are useful not only for computing characteristic sets but also for computing characteristic relations, lower and upper approximations, and, finally, for rule induction. Additionally, attribute-value blocks make it easy to combine several strategies for handling missing attribute values within the same data set. Thus, the entire data mining process, from computing characteristic relations to rule induction, may be implemented using the same simple tool: attribute-value blocks.
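The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. It assumes the usual conventions for the three kinds of missing values: "?" marks a lost value (the case joins no block of that attribute), "*" marks a "do not care" condition (the case joins every block of that attribute), and "-" marks an attribute-concept value (the case joins the blocks of the values observed within its own concept). The toy table, the function names, and the treatment of an empty set of concept values (the whole universe) are assumptions made for illustration only. Only the concept variant of the lower and upper approximations is shown.

```python
# Symbols for the three kinds of missing values (assumed convention).
LOST, DO_NOT_CARE, ATTRIBUTE_CONCEPT = "?", "*", "-"
MISSING = {LOST, DO_NOT_CARE, ATTRIBUTE_CONCEPT}


def concept_values(cases, attribute, decision, case):
    """Specified values of `attribute` among cases sharing `case`'s decision."""
    return {c[attribute] for c in cases
            if c[decision] == case[decision] and c[attribute] not in MISSING}


def attribute_value_blocks(cases, attributes, decision):
    """Map each (attribute, value) pair to its block, a set of case indices."""
    blocks = {}
    for a in attributes:
        domain = {c[a] for c in cases if c[a] not in MISSING}
        for v in domain:
            blocks[(a, v)] = set()
        for i, c in enumerate(cases):
            if c[a] == LOST:
                targets = set()        # lost value: the case joins no block
            elif c[a] == DO_NOT_CARE:
                targets = domain       # "do not care": joins every block of a
            elif c[a] == ATTRIBUTE_CONCEPT:
                targets = concept_values(cases, a, decision, c)
            else:
                targets = {c[a]}       # specified value: joins its own block
            for v in targets:
                blocks[(a, v)].add(i)
    return blocks


def characteristic_sets(cases, attributes, decision, blocks):
    """Intersect, per case, the block sets selected by each attribute."""
    universe = set(range(len(cases)))
    result = []
    for c in cases:
        k = set(universe)
        for a in attributes:
            if c[a] in (LOST, DO_NOT_CARE):
                continue               # these attributes do not restrict k
            if c[a] == ATTRIBUTE_CONCEPT:
                values = concept_values(cases, a, decision, c)
                if values:             # assumption: no values -> no restriction
                    k &= set().union(*(blocks[(a, v)] for v in values))
            else:
                k &= blocks[(a, c[a])]
        result.append(k)
    return result


def concept_approximations(char_sets, concept):
    """Concept lower and upper approximations of a set of case indices."""
    lower = set().union(*(k for i, k in enumerate(char_sets)
                          if i in concept and k <= concept))
    upper = set().union(*(k for i, k in enumerate(char_sets) if i in concept))
    return lower, upper


# Hypothetical toy table mixing all three kinds of missing values.
cases = [
    {"temperature": "high",      "headache": "?",   "flu": "yes"},
    {"temperature": "*",         "headache": "yes", "flu": "yes"},
    {"temperature": "normal",    "headache": "no",  "flu": "no"},
    {"temperature": "-",         "headache": "no",  "flu": "no"},
    {"temperature": "very_high", "headache": "yes", "flu": "yes"},
    {"temperature": "high",      "headache": "yes", "flu": "no"},
]
ATTRIBUTES, DECISION = ["temperature", "headache"], "flu"

blocks = attribute_value_blocks(cases, ATTRIBUTES, DECISION)
char_sets = characteristic_sets(cases, ATTRIBUTES, DECISION, blocks)
flu_yes = {i for i, c in enumerate(cases) if c[DECISION] == "yes"}
lower, upper = concept_approximations(char_sets, flu_yes)
print(blocks, char_sets, lower, upper, sep="\n")
```

Because each kind of missing value only changes which blocks a case joins, the three strategies coexist in one table without any special handling, and every later step (characteristic sets, approximations) works on the blocks alone. A rule induction step would start from the same blocks but is omitted here.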