Current Prediction Methods in Associative Classification

Prediction measures are used to judge the accuracy of the classifier produced once the rule generation and pruning phases are complete. Prediction is performed on test data in which the class label of each instance is treated as unknown. The test data is, in effect, data of the same form as the training data with the class values hidden. A prediction approach applies the generated rules to predict the class labels of the test instances and then computes the percentage of correct predictions. In AC, prediction methods fall into two main groups: Single Accurate Rule Prediction and Group of Rules Prediction. In the former, the prediction is based on the single rule with the highest precedence that applies to the test instance. The latter bases the prediction on multiple rules. Both approaches are discussed in the sections below.

2.10.1 Single Accurate Rule Prediction

Consider a set of classification rules R and a test instance test(i), where i ∈ {1, …, testdata.length}. Single rule prediction approaches take the highest-ranked rule in R, that is, the rule with the highest confidence, whose body matches the attribute values of test(i), and assign its class to test(i). When no rule in R covers test(i), the instance is allocated the default class. The default class is the majority class among the training instances that remain uncovered after the pruning process.
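The single rule procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of any of the cited algorithms: the Rule tuple, the covers helper and the use of support as a tie-breaker after confidence are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical rule representation: body maps attribute names to required values.
Rule = namedtuple("Rule", ["body", "label", "confidence", "support"])


def covers(rule, instance):
    """A rule covers an instance when every condition in its body matches."""
    return all(instance.get(attr) == val for attr, val in rule.body.items())


def predict_single_rule(rules, instance, default_class):
    """Return the class of the highest-ranked rule that covers the instance."""
    # Rank by confidence, then by support (assumed tie-breaker), highest first.
    ranked = sorted(rules, key=lambda r: (r.confidence, r.support), reverse=True)
    for rule in ranked:
        if covers(rule, instance):
            return rule.label
    # No rule matches, so fall back to the default class.
    return default_class


rules = [
    Rule({"outlook": "sunny"}, "no", 0.80, 0.20),
    Rule({"windy": "false"}, "yes", 0.90, 0.30),
]
print(predict_single_rule(rules, {"outlook": "sunny", "windy": "false"}, "yes"))
print(predict_single_rule(rules, {"outlook": "rainy", "windy": "true"}, "no"))
```

In the first call both rules cover the instance and the one with the higher confidence decides the class; in the second call no rule matches and the default class is returned.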

The AC algorithms that use the single rule approach for prediction include (Liu et al., 1998; Baralis and Torino, 2000; Wang et al., 2000; Tang and Liao, 2007; Baralis et al., 2004; Li et al., 2008; Niu et al., 2009; Kundu et al., 2008). It is a simple and effective approach to classification. Rules with high confidence values are used for prediction, and these contribute most to the classification of test instances. When the confidence of a rule is high, the likelihood that it classifies a test instance correctly is also high (Liu et al., 2003). However, this approach has drawbacks when data sets have an uneven class distribution (Liu et al., 2003; Li et al., 2001), and more than one rule with the same confidence in ‘R’ may match the test instance. These shortcomings led to techniques that group a small subset of rules that have the same confidence value but different class labels. Grouping rules into such a subset has shown better results. The next section explains the different ways in which this approach is used.

2.10.2 Group of Rules Prediction

To predict a test instance and decide which class value to allocate, multiple rules with nearly the same confidence values are matched against the conditions of the test instance. It is argued in (Vyas et al., 2008; Yin and Han, 2003; Antonie and Zaïane, 2002; Li et al., 2001) that basing the decision on a single rule gives poor results. The following sections review some AC techniques that use multiple rules in the prediction step.

2.10.2.1 Score based Prediction Methods

A score based prediction method is proposed in the CMAR algorithm (Li et al., 2001). It selects all the high-confidence rules that apply to a test instance and evaluates the correlation among the selected rules. The correlation estimates the strength of the bond between the rules and is calculated from the rule support and the class frequency by a method called weighted chi-square (Li, 2001).

CMAR finds the group of rules ‘Ck’ in the discovered classifier ‘C’ that cover the test instance test(i). If all the rules in ‘Ck’ predict the same class, that class is assigned to test(i). If the rules in ‘Ck’ point to different classes, CMAR divides the rules into groups according to their class, then measures and compares the strength of each group. The strength of a group is determined by the support of its rules and the correlation among them. The class of the group with the highest strength is allocated to test(i). Weighted χ² analysis (Li, 2001) measures how strongly positively correlated the rules in a group are, which in turn gives the strength of each group.
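The group comparison can be illustrated with a short Python sketch. It assumes the commonly cited form of the weighted chi-square measure, in which each rule contributes χ² · χ² / maxχ² to its class group and maxχ² is the largest χ² the rule could reach given its body and class counts. The function names, the tuple layout and the counts in the example are hypothetical, and every count is assumed to lie strictly between 0 and n so that no expected cell is zero.

```python
from collections import defaultdict


def chi_square(sup_p, sup_c, sup_pc, n):
    """Chi-square of the 2x2 table for rule body P and class c.

    sup_p: instances matching the body, sup_c: instances of class c,
    sup_pc: instances matching both, n: total training instances.
    """
    observed = [
        sup_pc, sup_p - sup_pc,
        sup_c - sup_pc, n - sup_p - sup_c + sup_pc,
    ]
    expected = [
        sup_p * sup_c / n, sup_p * (n - sup_c) / n,
        (n - sup_p) * sup_c / n, (n - sup_p) * (n - sup_c) / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def max_chi_square(sup_p, sup_c, n):
    """Upper bound on chi-square, reached when sup_pc = min(sup_p, sup_c)."""
    return chi_square(sup_p, sup_c, min(sup_p, sup_c), n)


def predict_weighted_chi_square(matching_rules, n):
    """matching_rules: (label, sup_p, sup_c, sup_pc) for rules covering the instance."""
    strength = defaultdict(float)
    for label, sup_p, sup_c, sup_pc in matching_rules:
        chi = chi_square(sup_p, sup_c, sup_pc, n)
        # Each rule adds its normalised chi-square to the group of its class.
        strength[label] += chi * chi / max_chi_square(sup_p, sup_c, n)
    best = max(strength, key=strength.get)
    return best, dict(strength)


# Hypothetical counts: one strong rule for "yes", two weaker rules for "no".
matching = [("yes", 30, 50, 27), ("no", 20, 50, 18), ("no", 25, 50, 20)]
label, strength = predict_weighted_chi_square(matching, n=100)
print(label, strength)
```

With these counts the single rule for "yes" outweighs the two rules for "no", which shows that the group strength depends on how strongly the rules correlate with their class and not only on how many rules a group contains.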

A related prediction method is introduced by (Dong et al., 1999), in which the final decision on the class label of a test instance is based on all the emerging patterns (EPs) of a class that cover the test instance. Given a test data set (test), every class present is assigned a score for each test instance. The score of a class is computed from the emerging patterns of that class that cover test(i), and the test instance is assigned to the class with the highest score.
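A minimal Python sketch of such a score is given below. It assumes that each emerging pattern contributes growth_rate / (growth_rate + 1) multiplied by its support in the class, which is the aggregate score usually associated with emerging-pattern classifiers; the text above does not state the formula, so this weighting, the data layout and the example values are assumptions. Growth rates are assumed to be finite.

```python
def aggregate_scores(instance_items, patterns_by_class):
    """Score each class from the emerging patterns that the instance contains.

    patterns_by_class maps a class label to a list of
    (itemset, growth_rate, support_in_class) tuples (hypothetical layout).
    """
    scores = {}
    for label, patterns in patterns_by_class.items():
        scores[label] = sum(
            gr / (gr + 1) * sup
            for items, gr, sup in patterns
            if items <= instance_items  # the pattern must be contained in the instance
        )
    return scores


instance = frozenset({"a", "b", "c"})
patterns = {
    "pos": [(frozenset({"a", "b"}), 4.0, 0.5), (frozenset({"d"}), 9.0, 0.6)],
    "neg": [(frozenset({"c"}), 1.5, 0.4)],
}
scores = aggregate_scores(instance, patterns)
predicted = max(scores, key=scores.get)
print(predicted, scores)
```

Only the patterns contained in the instance contribute, so the pattern {"d"} is ignored even though its growth rate is high.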

Experimental results have shown that the classification methods of (Dong et al., 1999) and (Vyas et al., 2008), which use correlation within a group of rules to predict the class label of a test instance, slightly increase prediction accuracy compared with single rule methods.

2.10.2.2 Laplace based Prediction Method

This prediction method is implemented by CPAR (Yin and Han, 2003) and also uses multiple rules for prediction. The groups of rules are compared using the Laplace expected error estimate (Clark and Boswell, 1991), and the expected accuracy of every rule is calculated in advance of classification. CPAR classifies a test instance as follows: 1) it selects all the rules in the classifier ‘C’ whose attribute values are satisfied by the test instance; 2) from the rules selected in step 1, it picks the best rules for each class (the best k rules in Yin and Han, 2003); 3) it compares the average expected accuracy of the best rules of each class and predicts the class with the highest average expected accuracy.
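The three steps can be sketched in Python as follows. The sketch assumes the usual Laplace estimate (n_correct + 1) / (n_covered + number_of_classes) and treats the number of best rules kept per class, k, as a parameter; with k = 1 it reduces to comparing a single best rule per class. The function names, the tuple layout and the counts are hypothetical, and step 1 (matching rules against the instance) is assumed to have been done already.

```python
from collections import defaultdict


def laplace_accuracy(n_correct, n_covered, n_classes):
    """Laplace expected accuracy of a rule.

    n_covered: training instances matching the rule body,
    n_correct: those that also belong to the rule's class.
    """
    return (n_correct + 1) / (n_covered + n_classes)


def predict_laplace(matching_rules, n_classes, k=5):
    """matching_rules: (label, n_correct, n_covered) for rules covering the instance."""
    by_class = defaultdict(list)
    for label, n_correct, n_covered in matching_rules:
        by_class[label].append(laplace_accuracy(n_correct, n_covered, n_classes))
    averages = {}
    for label, accuracies in by_class.items():
        # Step 2: keep the best k rules of the class.
        best = sorted(accuracies, reverse=True)[:k]
        # Step 3: average their expected accuracy.
        averages[label] = sum(best) / len(best)
    return max(averages, key=averages.get), averages


# Hypothetical counts for a two-class problem.
matching = [("yes", 18, 20), ("yes", 8, 10), ("yes", 1, 5), ("no", 9, 10)]
label, averages = predict_laplace(matching, n_classes=2, k=2)
print(label, averages)
```

Here the weakest "yes" rule is dropped because only the best two rules per class are kept, and the single "no" rule still wins because its expected accuracy exceeds the average of the two remaining "yes" rules.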
