7.1 Conclusions
An AI-based hybrid system has been proposed for phishing website detection. Fuzzy logic has been combined with associative classification data mining algorithms to provide efficient techniques for building intelligent models that detect phishing websites. Empirical phishing case studies have been carried out to gather and analyze a range of phishing website features and patterns, together with the relations among them. Our experimental case studies point to the need for extensive educational campaigns about phishing and other security threats, since people become less vulnerable when they have a heightened awareness of the dangers of phishing. The case studies also suggest that a new approach is needed to design a usable model for detecting e-banking phishing websites, taking into account the user's knowledge, understanding, awareness and consideration of the phishing pointers located outside the user's centre of interest.
A fuzzy logic based detection model has been proposed using the four standard phases of fuzzy inference (fuzzification, rule evaluation, aggregation and defuzzification). Phishing website features and patterns are characterized as fuzzy variables with specific fuzzy sets. Fuzzy rules captured from human expert knowledge are processed by the fuzzy set operations in the inference engine for the final calculation of the phishing website detection rate. The results show the significance of the URL & Domain Identity criterion, represented by layer one, especially when compared with the other criteria and layers.
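The four phases named above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration rather than the model's actual configuration: the two input features (a URL score and an SSL score, both normalised to [0, 1]), the triangular membership functions, the nine rules, the min/max operators and the centroid defuzzifier are hypothetical choices in the style of a Mamdani system.

```python
# Hypothetical fuzzy sets on a normalised [0, 1] scale: (left foot, peak, right foot).
FUZZY_SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# Hypothetical expert rules: (URL-score set, SSL-score set) -> phishing-risk set.
RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "medium",
    ("low", "high"): "medium",
    ("medium", "low"): "medium",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "high",
}


def membership(x, fuzzy_set):
    """Triangular membership degree; a == b or b == c gives a flat shoulder."""
    a, b, c = FUZZY_SETS[fuzzy_set]
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def phishing_risk(url_score, ssl_score, steps=100):
    """Return a crisp risk value in [0, 1] for two normalised feature scores."""
    # Phase 1, fuzzification: map each crisp input to membership degrees.
    url = {name: membership(url_score, name) for name in FUZZY_SETS}
    ssl = {name: membership(ssl_score, name) for name in FUZZY_SETS}

    # Phase 2, rule evaluation: AND is taken as the minimum of the antecedents.
    # Phase 3, aggregation: rules sharing an output set are combined with max.
    strength = {name: 0.0 for name in FUZZY_SETS}
    for (url_set, ssl_set), risk_set in RULES.items():
        fired = min(url[url_set], ssl[ssl_set])
        strength[risk_set] = max(strength[risk_set], fired)

    # Phase 4, defuzzification: centroid of the clipped, aggregated output sets,
    # approximated on an evenly spaced grid over [0, 1].
    numerator = denominator = 0.0
    for i in range(steps + 1):
        y = i / steps
        mu = max(min(strength[name], membership(y, name)) for name in FUZZY_SETS)
        numerator += mu * y
        denominator += mu
    if denominator == 0.0:
        raise ValueError("no rule fired for these inputs")
    return numerator / denominator
```

Calling `phishing_risk(0.5, 0.5)` fires only the medium/medium rule, so the crisp output sits at the centre of the scale, while inputs at the two extremes pull the output toward the low or high end.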
An enhancement has been proposed that utilises supervised machine learning techniques to automate the fuzzy rule generation process, in order to reduce the reliance on human expert knowledge and increase the performance of the phishing detection system. In this investigation, we generated classification rules and investigated the predictive accuracy of five classifiers on a phishing data set. The classifiers were JRip (RIPPER), PART, PRISM, C4.5 Decision Tree (J48) and Classification Based on Association (CBA). By analyzing a large number of phishing pages, we built an associative classification model that uses the properties of a page (e.g., URL address length, SSL certificate, abnormal URL request, certification authority, etc.) to distinguish between phishing and legitimate website pages. We constructed a data set from 731 phishing websites, 711 suspicious websites and 1718 legitimate websites, in which 27 phishing features were used for training and testing. During training and testing we used 10-fold cross-validation to evaluate the error rate of all classifiers. The mined associative classification rules were then combined with the fuzzy logic inference engine to provide an efficient and competent technique for calculating the phishing website detection rate.
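The 10-fold cross-validation procedure mentioned above can be sketched as follows. This is a simplified illustration and not the evaluation code used in the study: the fold assignment (a seeded shuffle followed by round-robin splitting) and the function names are assumptions, and the majority-class predictor is only a placeholder standing in for a real classifier such as JRip or CBA. The label list is built solely from the class counts reported above.

```python
import random
from collections import Counter


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle the item indices and deal them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(labels, train_and_predict, k=10, seed=0):
    """Average the held-out error rate over k train/test splits."""
    fold_errors = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        predictions = train_and_predict(train, held_out)
        wrong = sum(1 for i, p in zip(held_out, predictions) if p != labels[i])
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / k


# Class labels only, using the data set sizes reported in the text.
labels = ["phishing"] * 731 + ["suspicious"] * 711 + ["legitimate"] * 1718


def majority_baseline(train_indices, test_indices):
    """Placeholder classifier: always predict the most common training label."""
    majority = Counter(labels[i] for i in train_indices).most_common(1)[0][0]
    return [majority] * len(test_indices)


baseline_error = cross_validated_error(labels, majority_baseline)
```

Any classifier can be evaluated the same way by passing a function that trains on `train_indices` and returns one prediction per entry of `test_indices`; the baseline merely gives a floor against which a rule-based classifier's error rate can be compared.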
We showed that fuzzy-based associative classification data mining solutions are quite effective for building detection solutions that protect users against phishing website attacks. We believe our model can be used to improve existing anti-phishing approaches which use an artificial intelligence heuristic page search. Using this approach automates the fuzzy rule generation process and reduces the human intervention needed to build an effective intelligent phishing detection model.
A browser-based plug-in phishing detection toolbar has been implemented using an intelligent heuristic approach. The toolbar extracts all the phishing website features and patterns. Validation of the extracted features has been integrated into the solution to effectively identify phishing, legitimate and suspicious websites. An intelligent pruning technique has been used to increase the performance of the phishing detection rate. The intelligent phishing detection toolbar reduces the need for human knowledge intervention in the detection of a phishing website. Our toolbar has been provided as an alternative to depending only on the black-list or white-list approach, by adopting a new fuzzy-based classification mining technique to detect phishing websites. The results of our testing and validation show that the proposed solution outperformed the existing detection toolbars in the accuracy, efficiency and speed of classifying and detecting phishing websites. It correctly classified approximately 92% of all tested websites.
The experimental results showed that both the false-positive rate and the miss rate are reasonably low. A comparative evaluation of the proposed scheme was presented, through a set of experiments, to demonstrate its merits and capabilities. The proposed intelligent system offers better performance than other existing tools and techniques.
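For reference, the two error measures named above can be computed as in the sketch below. It assumes the usual binary definitions, which are not spelled out in this chapter: the false-positive rate is the share of non-phishing sites flagged as phishing, and the miss rate is the share of phishing sites that go undetected. The function name and the toy label lists are hypothetical and are not experimental data.

```python
def detection_rates(actual, predicted, positive="phishing"):
    """Return accuracy, false-positive rate and miss rate for one positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # phishing site correctly flagged
        elif a == positive:
            fn += 1  # phishing site missed
        elif p == positive:
            fp += 1  # non-phishing site wrongly flagged
        else:
            tn += 1  # non-phishing site correctly passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "miss_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Toy labels for illustration only.
toy_actual = ["phishing", "phishing", "phishing", "legitimate",
              "legitimate", "legitimate", "legitimate", "phishing"]
toy_predicted = ["phishing", "phishing", "legitimate", "legitimate",
                 "legitimate", "phishing", "legitimate", "phishing"]
toy_rates = detection_rates(toy_actual, toy_predicted)
```

In a three-class setting such as the one used here (phishing, suspicious, legitimate), the same function treats every class other than `positive` as negative, which is one possible convention among several.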
Several contributions emerged from our research that can be useful to researchers interested in the field of internet security and online identity theft protection using artificial intelligence (AI). The following is a summary of the main contributions:
• Two phishing experiments, covering website phishing attack techniques and a phishing detection survey scenario, were conducted to cover all phishing approaches, motivations and deception behaviour techniques.
• 27 phishing features and patterns which characterize any phishing website were successfully extracted and divided into 6 criteria (categories) distributed over three layers, depending on attack type.
• A dynamic intelligent phishing website detection system has been proposed based on a specific AI supervised machine learning approach. The technique utilises fuzzy logic combined with simple associative classification data mining algorithms to process the phishing features and patterns and to extract classification rules in the data miner. The proposed system combines these techniques to automate fuzzy rule production: the extracted classification rules are implemented inside the fuzzy inference engine for the final phishing website detection.
• A web-based plug-in intelligent phishing website detection toolbar has been designed for testing and validation, using our integrated fuzzy-based classification mining model to demonstrate its feasibility, reliability and detection precision. The implementation was programmed in Java, and it successfully recognized and detected approximately 92% of the phishing websites selected from our test data subset, avoiding many misclassified websites and false phishing alarms.