Introducing Basic Terminologies and Technologies

1.4.1 Fuzzy Logic Model

Fuzzy Logic (FL) is a problem-solving control system methodology that lends itself to implementation in systems ranging from simple, small, embedded micro-controllers to large, networked, multi-channel PCs or workstation-based data acquisition and control systems. It can be implemented in hardware, software, or a combination of both. FL provides a simple way to arrive at a definite conclusion based upon vague, ambiguous, imprecise, noisy, or missing input information. FL's approach to control problems mimics how a person would make decisions, only faster. FL incorporates a simple, rule-based ‘IF X AND Y THEN Z’ approach to solving a control problem rather than attempting to model a system mathematically. The FL model is empirically based, relying on an operator's experience rather than their technical understanding of the system.
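The ‘IF X AND Y THEN Z’ form can be illustrated with a minimal sketch. It assumes the minimum operator for the fuzzy AND, which is a common choice but only one of several; the input names and their degrees of truth are hypothetical and are not taken from this work.

```python
def fuzzy_and(*degrees: float) -> float:
    """Fuzzy AND, taken here as the minimum of the input degrees (an assumed choice)."""
    return min(degrees)


# Hypothetical degrees of truth in [0, 1] for the two rule inputs.
url_is_suspicious = 0.8   # X
domain_is_young = 0.6     # Y

# IF X AND Y THEN Z: the rule fires to the degree that its weakest condition holds.
site_is_risky = fuzzy_and(url_is_suspicious, domain_is_young)   # Z
print(site_is_risky)
```

Under this choice of operator, a rule can never fire more strongly than its least certain condition.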

The fuzzy logic approach gives risk managers more information for assessing and identifying phishing website risk rates than the current qualitative approaches, because the risks are quantified based on a combination of historical data and expert input. Modelling techniques that can accommodate a combination of data and expert input are better suited for modelling phishing operational risks. Fuzzy logic has been used for decades in the computer sciences to embed expert input into computer models for a broad range of applications. It offers a promising alternative for measuring operational risks (Samir, 2003). The advantage of the fuzzy approach is that it enables the processing of vaguely defined variables, and of variables whose relationships cannot be expressed mathematically. Fuzzy logic can incorporate expert human judgment to define those variables and their relationships. The model can be closer to reality and more site-specific than some of the other methods (Mahant, 2004).

In contrast to the true or false world of Boolean logic, fuzzy logic techniques allow the use of degrees of truth to calculate results. They allow one to represent concepts that could be considered to be in more than one category. In other words, these techniques allow representation of overlapping and partial membership in sets or categories (Bridges and Vaughn, 2001).
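Partial and overlapping membership can be made concrete with a small sketch. It assumes triangular membership functions and three hypothetical risk categories on a 0 to 100 scale; the shapes and break points are chosen for illustration only and are not taken from this work.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree in a triangular fuzzy set: 0 at a, rising to 1 at b, back to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical overlapping categories for a risk score between 0 and 100.
CATEGORIES = {
    "low": (0, 25, 50),
    "medium": (25, 50, 75),
    "high": (50, 75, 100),
}


def memberships(score: float) -> dict:
    """Degree to which the score belongs to each category."""
    return {name: triangular(score, *abc) for name, abc in CATEGORIES.items()}


degrees = memberships(40)
print(degrees)   # the score belongs partly to "low" and partly to "medium"
```

A Boolean model would force the score 40 into exactly one category; here it belongs to two categories at once, to different degrees.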

The use of fuzzy logic in our work is justified because it tolerates imprecisely defined data, can model non-linear functions of arbitrary complexity, and can build on the experience of experts (Mahant, 2004).

1.4.2 Data Mining

Data Mining is the automated extraction of previously unrealized information from large data sources for the purpose of supporting actions. The rapid development of data mining has made available a wide variety of algorithms, drawn from the fields of statistics, pattern recognition, machine learning and databases. Fayyad, et al. (1998) define data mining as one of the main phases in Knowledge Discovery from Databases (KDD), which extracts useful patterns from data. The availability of high-speed computers, automated data collection tools and large memory capacities has made it possible to collect and store huge quantities of information. The process of extracting useful knowledge from this information is accomplished using data mining techniques (Fayyad, et al., 1998; Elmasri and Navathe, 1999).

Consider a retail store with a large collection of sales transactions and customer information. The marketing division at the store is promoting a new credit card in a new geographical area. Typical business decisions have to be made, such as how credit card limits are decided for each customer and how each customer’s total purchases contribute to the decision process. Finding associations between customers’ different features can help the managers in making business decisions. These associations are known as association rules, an example of which is: “55% of customers who buy crisps are likely to buy a soft drink as well; 4% of all database transactions contain crisps and a soft drink”.

“Customers who buy crisps” is known as the rule antecedent, and “buy a soft drink as well” is known as the rule consequent. The antecedent and consequent of an association rule each contain at least one item. The 55% in the association rule mentioned above represents the strength of the rule and is known as the rule’s confidence, whereas the 4% is a statistical significance measure, known as the rule’s support. In a credit card application, the store’s management is only interested in one class of association rules, where the rule’s consequent is related to whether a credit card should be offered. They would like to develop an automated computer system which analyses the customers’ different attributes in a certain geographical area to come up with a set of rules. These rules are then used to assess credit card applications for new customers by predicting the credit card attribute. The subset of the association rules which consider the credit attribute as the class attribute is known as Class Association Rules (CARs) (Liu, et al., 1998). An example of a CAR is: “60% of rows that contain incomes which exceed 25k have been granted a credit card; 4% of all rows contain incomes exceeding 25k”. Similar to the association rule approach, the 60% of the above CAR represents the confidence and the 4% denotes the support. The main difference between a CAR and an association rule is that the consequent of the CAR is only the class attribute, whereas in an association rule, the consequent could be multiple items (Freitas, 2000).
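Support and confidence for a rule such as “crisps → soft drink” can be computed directly from a list of transactions. The following is a minimal sketch; the five transactions are invented for illustration and do not reproduce the 55% and 4% figures quoted above, and the function names are hypothetical.

```python
# Hypothetical toy database: each transaction is the set of items bought together.
transactions = [
    {"crisps", "soft drink"},
    {"crisps", "soft drink", "bread"},
    {"crisps"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the proportion that also contain the consequent.

    Assumes the antecedent occurs at least once; otherwise the ratio is undefined.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


rule_support = support({"crisps", "soft drink"}, transactions)           # 2 of 5 transactions
rule_confidence = confidence({"crisps"}, {"soft drink"}, transactions)   # 2 of the 3 crisps buyers
print(rule_support, rule_confidence)
```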

1.4.3 Association Rule Mining

Association rule algorithms find correlations between features or attributes used to describe a data set. Association rule mining can be decomposed into two sub-tasks (Agrawal, et al., 1993; Agrawal and Srikant, 1994): (1) the discovery of all frequent itemsets (those whose support is above the minsupp threshold), and (2) for each frequent itemset Z found, the production of rules of the form X → (Z − X), X ⊂ Z, whose confidence is above the minconf threshold. The support of an itemset in association rule mining is defined as the proportion of transactions in the database that contain that itemset, and the confidence of a rule X → Z is defined as support(X ∪ Z)/support(X).
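The two sub-tasks can be sketched as follows. This is a simplified level-wise search written for readability, not the optimised Apriori procedure of Agrawal and Srikant (1994); the toy transactions, the threshold values and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database.
transactions = [
    {"crisps", "soft drink"},
    {"crisps", "soft drink", "bread"},
    {"crisps"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset, transactions):
    """Proportion of transactions containing the whole itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, minsupp):
    """Sub-task 1: find every itemset whose support is above minsupp, level by level."""
    items = sorted(set().union(*transactions))
    frequent = {}
    level = [frozenset([i]) for i in items]
    while level:
        scored = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in scored.items() if s > minsupp}
        frequent.update(kept)
        # Join surviving k-itemsets into candidate (k+1)-itemsets.
        level = list({a | b for a in kept for b in kept if len(a | b) == len(a) + 1})
    return frequent


def generate_rules(frequent, minconf):
    """Sub-task 2: for each frequent Z, emit X -> (Z - X) whose confidence is above minconf."""
    found = []
    for z, supp_z in frequent.items():
        for size in range(1, len(z)):
            for x in map(frozenset, combinations(z, size)):
                # Every subset of a frequent itemset is itself frequent, so the lookup succeeds.
                conf = supp_z / frequent[x]
                if conf > minconf:
                    found.append((x, z - x, supp_z, conf))
    return found


frequent = frequent_itemsets(transactions, minsupp=0.3)
mined = generate_rules(frequent, minconf=0.6)
for x, y, supp, conf in mined:
    print(sorted(x), "->", sorted(y), round(supp, 2), round(conf, 2))
```

Because a rule X → (Z − X) covers exactly the transactions that contain Z, its confidence reduces to support(Z)/support(X), which is what the inner loop computes.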

1.4.4 Traditional Classification Rule Mining

Given a training data set of historical transactions, the problem is to discover the CARs with significant supports and high confidences (attribute values that have frequencies above user-specified minimum support and minimum confidence thresholds). One subset of the generated CARs is chosen to build an automatic model (classifier) that can be used to predict the classes of previously unseen data. This approach, which uses association rule mining to build classifiers, is called "associative classification" (AC) (Liu, et al., 1998; Li, et al., 2001). Unlike classic classification approaches such as rule induction and decision trees, which usually construct small classifiers, AC explores all associations between attribute values and their classes in the training data set, and therefore tends to construct larger classifiers. AC methods aim to produce useful knowledge missed by traditional methods, which in turn should improve predictive accuracy within applications.
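How a chosen subset of CARs might act as a classifier can be sketched as follows. This is a deliberately simplified scheme (rank the rules by confidence, then support, and apply the first rule that matches, falling back to a default class); it is not the classifier-building procedure of Liu, et al. (1998), and the rules, attribute names and values are hypothetical.

```python
from typing import NamedTuple


class CAR(NamedTuple):
    antecedent: dict    # attribute name -> required value
    label: str          # class predicted by the rule
    support: float
    confidence: float


def rank(rules):
    """Order rules by confidence, then support, both descending (an assumed ranking)."""
    return sorted(rules, key=lambda r: (-r.confidence, -r.support))


def predict(row, ranked_rules, default):
    """Return the class of the first rule whose antecedent matches the row."""
    for rule in ranked_rules:
        if all(row.get(attr) == value for attr, value in rule.antecedent.items()):
            return rule.label
    return default


# Hypothetical rules, as if already mined from a training set.
classifier = rank([
    CAR({"income": "high"}, "grant", 0.04, 0.60),
    CAR({"income": "low", "employed": "no"}, "refuse", 0.10, 0.85),
])

print(predict({"income": "high", "employed": "yes"}, classifier, default="refuse"))
print(predict({"income": "low", "employed": "no"}, classifier, default="refuse"))
```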

1.4.5 Associative Classification Rule Mining

The AC approach was introduced by Ali, et al. (1997) to produce rules for describing relationships between attribute values and the class attributes, and not for prediction, which is the ultimate goal of classification. In 1998, AC was successfully employed to build classifiers by Liu, et al. (1998), and it later attracted many researchers (e.g. Li, et al., 2000; Dong, et al., 1999; Yin and Han, 2003) from the data mining and machine learning communities.

AC is a special case of association rule mining in which only the class attribute is considered in the rule’s consequent (Liu et al., 1998). For example, in a rule such as X → Y, Y must be a class attribute. Let us define the AC problem, where a training data set T has m distinct attributes A1, A2, …, Am and C is a list of class labels. The number of rows in T is denoted |T|. Attributes can be categorical (meaning they take a value from a finite set of possible values) or continuous (where they are real or integer). In the case of categorical attributes, all possible values are mapped to a set of positive integers. For continuous attributes, a discretisation method is first used to transform these attributes into categorical ones.
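The two preprocessing steps, mapping categorical values to positive integers and discretising continuous attributes, can be sketched as follows. The text does not name a specific discretisation method, so the sketch assumes fixed, hand-chosen cut points; the attribute values and function names are hypothetical.

```python
from bisect import bisect_left


def encode_categorical(values):
    """Map each distinct categorical value to a positive integer (1, 2, ...)."""
    codes = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    return [codes[v] for v in values], codes


def discretise(value, cut_points):
    """Map a continuous value to a 1-based interval index.

    cut_points must be sorted; a value equal to a cut point falls in the lower interval.
    """
    return bisect_left(cut_points, value) + 1


# Hypothetical attribute columns from a training data set T.
housing = ["owner", "renter", "owner"]
income = [18_000, 25_000, 30_000]

housing_codes, mapping = encode_categorical(housing)
income_bins = [discretise(v, cut_points=[25_000]) for v in income]

print(housing_codes, mapping)
print(income_bins)   # interval 2 holds incomes that exceed 25k
```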

An AC task is different from association rule mining. The most obvious difference is that AC considers only the class attribute in the rule’s consequent, whereas association rule mining allows multiple attribute values in the rule’s consequent.
