Fuzzy Logic for Rule Mining - Soft Computing Approaches for Rule Mining

Chapter 2: Data Mining Background and Preliminaries

2.6 Soft Computing Approaches for Rule Mining

2.6.3 Fuzzy Logic for Rule Mining

Zadeh first introduced the concept of fuzzy sets in 1965 [Zadeh 1965, 1971], and since then there has been tremendous interest in the subject because of its strong capability to model imprecise and qualitative knowledge. The fuzzy set is an extension of classical set theory. In classical set theory, or set theory for short, an item either belongs to a set or it does not; there is no concept of in between [Paul 1987]. To describe classical set theory mathematically, let $R$ be a set ($R \subset U$) and $x$ a member of $U$ ($x \in U$); then $x$ either belongs to $R$ or it does not, i.e. the membership function of $x$ is $f : \chi(x) \to \{0, 1\}$. Since the membership function of an object in classical set theory is either 1 or 0, such a set is also called a crisp set, and the corresponding logic crisp logic. Crisp logic is extended to fuzzy logic (FL) by introducing the idea of partial truth. In this logic, an object is a member of a set to some degree, and a logical proposition may hold true to some degree. For example, the statement “a person's height is 5.5 feet” can be translated in crisp logic as “the person is tall”, but in FL it can be viewed as “the person is reasonably tall”. Here, reasonably is a linguistic variable. Fuzzy sets and fuzzy logic were introduced to support such linguistic variables. In FL, reasonably corresponds to a membership value of tall at some point between 0 and 1 (e.g. 0.6). Unlike crisp logic, the fuzzy membership function of an object $x$ belonging to the set $R$ is not restricted to two values but can take any value from 0 to 1. It is expressed as $f : \mu(x) \to [0, 1]$. An example of a fuzzy membership function for tall is shown in Figure 2.11. It shows that a person with a height of 6 feet or over is fully tall, and a person with a height less than or equal to 5 feet is not tall. A person between 5 feet and 6 feet is tall with a confidence factor between 0 and 1. This kind of fuzzy membership function is of the linear type. However, many other shapes of membership function can be used in the fuzzification of variables.
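The difference between the crisp and the fuzzy membership functions for tall can be sketched as follows. This is only an illustration: the crisp threshold of 5.5 feet and the function names are assumptions, and the fuzzy function is a straight ramp between 5 and 6 feet. With this particular ramp a height of 5.5 feet maps to 0.5, so a different shape would be needed to obtain the illustrative value of 0.6 mentioned above.

```python
def crisp_tall(height_ft, threshold=5.5):
    """Crisp membership: 1 if the height reaches the (assumed) threshold, else 0."""
    return 1 if height_ft >= threshold else 0


def fuzzy_tall(height_ft, low=5.0, high=6.0):
    """Linear fuzzy membership: 0 at or below `low`, 1 at or above `high`."""
    if height_ft <= low:
        return 0.0
    if height_ft >= high:
        return 1.0
    # Straight-line ramp between the two breakpoints.
    return (height_ft - low) / (high - low)


for h in (4.8, 5.0, 5.25, 5.5, 5.75, 6.0, 6.3):
    print(f"{h:.2f} ft  crisp={crisp_tall(h)}  fuzzy={fuzzy_tall(h):.2f}")
```

The crisp function jumps from 0 to 1 at a single height, whereas the fuzzy function rises gradually over the interval from 5 to 6 feet.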

FL is very useful for rule mining because of its power to handle vague and imprecise information, which is very common in the practical rules found in real data sets. FL is also a vital technique for the human knowledge representation involved in rule mining methods [Maeda et al 1995]. For example, in a credit card dataset a rule mining system may give a rule in linguistic form, such as “if an account holder is young and income is low ...”, or a rule in interval form, such as “if an account holder's age is [20, 25] and income is [10K, 20K] and credit amount is [5K, 10K], then risk = high”. The first type of rule is the target of rule mining algorithms using FL.
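The contrast between the two rule types can be sketched in code. The intervals are those quoted in the text; the membership functions for young and low, their breakpoints, and the use of the minimum as the fuzzy AND are assumptions made purely for illustration.

```python
def falling(x, full, zero):
    """Membership that is 1 up to `full`, falls linearly, and is 0 from `zero` on."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def interval_rule(age, income, credit):
    """Crisp antecedent using the intervals quoted in the text."""
    return (20 <= age <= 25
            and 10_000 <= income <= 20_000
            and 5_000 <= credit <= 10_000)


def fuzzy_rule(age, income):
    """Degree to which 'age is young AND income is low' holds."""
    young = falling(age, full=25, zero=35)            # assumed shape
    low = falling(income, full=20_000, zero=30_000)   # assumed shape
    return min(young, low)                            # min as fuzzy AND (assumed)


# An account holder just outside the crisp intervals.
print(interval_rule(26, 21_000, 6_000))
print(round(fuzzy_rule(26, 21_000), 2))
```

An account holder aged 26 with an income of 21K falls outside the interval rule altogether, but still matches the linguistic rule to a high degree, which is closer to how a human expert would read the rule.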

Kuok et al. proposed a fuzzy rule mining technique in 1998 [Kuok et al 1998]. Like any general rule, a fuzzy rule is of the form “If X is A then Y is B”. Here X and Y are attributes of the dataset, whereas A and B are fuzzy sets that characterize X and Y respectively. Their technique is simple, but it needs the fuzzy sets, the quantitative attributes and their membership functions to be known a priori, and these may not be available at the time the algorithm is run. Wai-Chee et al. proposed a technique for finding the fuzzy sets for each quantitative attribute in a database using clustering techniques. They also used the same technique to find the corresponding membership function for each discovered fuzzy set, which avoids having to apply expert knowledge to rule mining [Wai-Chee et al 1998].
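To make the form “If X is A then Y is B” concrete, the sketch below computes a fuzzy support and a fuzzy confidence for one such rule. It uses a common product-based formulation and is not claimed to reproduce the exact measures defined by Kuok et al.; the records, the attribute names and the function names are all hypothetical.

```python
def fuzzy_support(records, items):
    """Average, over all records, of the product of the membership degrees of `items`.

    `records` is a list of dicts mapping (attribute, fuzzy set) pairs to degrees in [0, 1].
    """
    total = 0.0
    for rec in records:
        degree = 1.0
        for item in items:
            degree *= rec.get(item, 0.0)   # a missing pair counts as degree 0
        total += degree
    return total / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, relative to the antecedent alone."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / base


# Hypothetical, already fuzzified records.
records = [
    {("age", "young"): 0.9, ("risk", "high"): 0.8},
    {("age", "young"): 0.5, ("risk", "high"): 0.4},
    {("age", "young"): 0.0, ("risk", "high"): 0.1},
    {("age", "young"): 0.6, ("risk", "high"): 1.0},
]
antecedent = [("age", "young")]
consequent = [("risk", "high")]

print(round(fuzzy_support(records, antecedent + consequent), 2))
print(round(fuzzy_confidence(records, antecedent, consequent), 2))
```

The membership degrees are given directly here, which mirrors the requirement noted above that the fuzzy sets and their membership functions be available before mining starts.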

[Figure 2.11: Example fuzzy membership function for tall. Axes: Height (ft) and Membership.]

Wei and Chen demonstrated association rule mining using fuzzy taxonomic structures [Wei and Chen 1999]. A fuzzy taxonomic structure is a directed acyclic graph, each of whose arcs represents a fuzzy IS-A relationship with degree $\mu$ ($0 < \mu \le 1$). The DM measures support and confidence [Agarwal 1993] are taken into account when an item belongs to the taxonomic structure.

Au and Chan describe a novel algorithm named F-APACS for mining fuzzy association rules [Au and Chan 1997, 1998]. F-APACS uses adjusted difference analysis between the observed and expected frequencies of attributes to discover interesting associations among attributes. F-APACS divides the quantitative attributes of a dataset into linguistic terms. The division does not produce fixed-length intervals; rather, it defines a set of linguistic terms using mathematical functions over the domain of the attribute, which introduces fuzziness into the attributes. Based on the linguistic terms, F-APACS discovers fuzzy association rules, which are presented to human experts for interpretation. In F-APACS, fuzzy techniques remove the sharp boundaries between adjacent intervals of numerical attributes. This feature enables F-APACS to deal with missing values in databases. The main advantage of this technique is that it does not need any user-supplied threshold, which is often difficult to determine. Moreover, it allows both positive and negative association rules to be discovered. Positive association rules describe the presence of an attribute value, whereas negative association rules describe the absence of an attribute value in a database.

Another novel method of generating desirable fuzzy rules for classification problems is proposed in [Shen and Chouchoulas 2002]. This method integrates a potentially powerful fuzzy rule induction algorithm with a rough set-assisted feature reduction approach. This integration also maintains the underlying semantics of the feature set, which is important for any rule mining method. Their work has been applied to several real problem-solving tasks, and the results are encouraging.
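The adjusted difference analysis mentioned for F-APACS can be illustrated with the standard adjusted residual for a contingency table of observed counts. This is a sketch of that general statistic, not a reproduction of the F-APACS definitions, which should be taken from the cited papers. The example table and the cut-off of 1.96 (the conventional two-sided 95% critical value of the normal distribution) are assumptions.

```python
import math


def adjusted_differences(table):
    """Return the adjusted difference for every cell of a contingency table.

    `table[i][j]` is the observed (possibly fractional) count of records having
    value i of one attribute and value j of another. Every row and column total
    is assumed to be positive and smaller than the grand total.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            standardized = (observed - expected) / math.sqrt(expected)
            variance = (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            out_row.append(standardized / math.sqrt(variance))
        result.append(out_row)
    return result


def association_type(d, cutoff=1.96):
    """Label a cell as a positive, negative or absent association (assumed cut-off)."""
    if d > cutoff:
        return "positive"
    if d < -cutoff:
        return "negative"
    return "none"


# Hypothetical counts for two attributes with two linguistic terms each.
table = [[30, 10],
         [10, 30]]
for row in adjusted_differences(table):
    print([(round(d, 2), association_type(d)) for d in row])
```

Because the cut-off comes from a statistical test rather than from the user, a sketch of this kind needs no user-supplied support or confidence threshold, and the sign of the statistic separates positive from negative associations.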

In document Data Mining Using Neural Networks (Page 55-57)