Vol. 4, No. 10, October 2019
Abstract—Peer-to-peer (P2P) applications have changed the nature of Internet traffic. They consume high Internet bandwidth and degrade the performance of traditional Internet applications. Managing and monitoring Internet traffic is therefore an important part of network optimization. Port-based, payload-based, and transport-layer-based methods were developed in the past to detect and mitigate P2P traffic. Nevertheless, the performance of these methods did not meet expectations.
Machine Learning (ML) is one of the promising methods for identifying and mitigating Internet traffic. However, its classification accuracy is inconsistent, because it depends on how the relevant training datasets are generated and how features are selected. In this research, a technique that combines signature-based detection and ML is proposed to develop a model for online P2P traffic detection and mitigation. The proposed work can be employed to evaluate the robustness of the online P2P machine learning classifier based on real network traffic traces containing flows labelled by the SNORT tool and from special shared resources. Analysis and validation were carried out on traffic traces of University Technology Malaysia from 2011 and 2013. The results reveal that the proposed work spends less computation time on classification. The method gives 99.7% accuracy, which equals the classification performance attained for P2P using a deep packet inspector. The findings show that classifying network traffic at the flow level can differentiate P2P from non-P2P (nP2P) traffic with high confidence for online P2P mitigation.
Index Terms—P2P Traffic Flow, Traffic Classification and Mitigation, SNORT, Machine Learning.
Published on October 25, 2019.
H. A. Jamil is with University of Elimam Elmahdi, Kosti, White Nile State, Sudan (e-mail: [email protected]).
B. M Ali, Ahmed E. Osman, and Mosab Hamdan are PhD candidates with University Technology Malaysia, 81310 Skudai, Johor Bahru, Malaysia (e-mail: [email protected], [email protected] and [email protected]).
I. INTRODUCTION
The edge services platform and Peer-to-Peer (P2P) computing are recent developments in the architecture of the Internet. These two techniques are used to reduce network traffic. Traffic classification is used to enhance network performance with less bandwidth. Managing network traffic is one of the challenges in heterogeneous networks, and traffic detection and mitigation are useful tools for monitoring the network. There is a lack of substantial research on P2P traffic classification. Variation in traffic properties leads to a change in traffic performance [1-5]. The computation ability of an ML method depends on the dataset and features used in the training phase. This paper proposes a technique for online P2P Internet traffic detection and mitigation that is based on SNORT rules and ML.
The proposed model can be used in a variety of network locations and can efficiently identify the emergence of new P2P traffic applications. The aim of the research is to develop an effective classifier for P2P traffic data. State-of-the-art classifiers are compared with the proposed model in terms of accuracy and computation cost.
The paper is organized as follows. Section II reviews the existing literature on network traffic classification. Section III details the methodology of the research. Section IV discusses the experimental setup. Section V provides the results and analysis, and Section VI concludes the paper.
II. REVIEW OF LITERATURE
Packet payload, traffic characteristics, and flow statistics are the techniques used in existing detection methods [6-8]. The packet payload technique [9,10] is the most familiar and has shown effective results in P2P traffic classification. Its drawbacks are increased storage requirements and privacy concerns.
Its performance is also limited for encrypted and unknown traffic. The flow statistics approach depends on the behavioural pattern of the network. Due to the nature of the Internet, flow statistics are harder to obtain and can be obfuscated [11]. Attributes such as flow size and packet size are used in the existing methods. Techniques that do not rely on port numbers or protocol characteristics do not face the problems reported in [12–17]. Supervised and unsupervised learning are the popular concepts of ML.
The literature discusses the problems of supervised learning methods.
Zarei et al. [5] developed a P2P ML classifier that generated patterns from three classes of the dataset; the method was used to retrain an online P2P classifier. Moore et al. [11] developed a Naïve Bayes classifier to classify a complex dataset. Williams et al. [15] compared different ML methods for the classification of Internet traffic, and the comparison showed that C4.5 performed better than the other methods. Auld et al. [16] developed an ML-based Bayesian Neural Network (BNN) that produced optimal results. Ma et al. [17] obtained high-accuracy classification results with ML classifiers that used the features of the datasets, at the cost of more computation time. Erman et al. [18] compared the K-Means and DBSCAN algorithms, and the results showed that K-Means clusters Internet traffic better than DBSCAN.
Bernaille et al. [13] developed a clustering algorithm based on K-Means, which classified five bundles of Internet traffic data.
Online P2P Internet Traffic Classification and Mitigation Based on Snort and ML
Haitham A. Jamil, Bushra M. Ali, Mosab Hamdan, and Ahmed E. Osman
Internet service providers (ISPs) have increased bandwidth to overcome the network bandwidth problem.
An ISP has to choose a suitable technique to control P2P traffic.
Blocking P2P traffic, installing cache tools between domains and backbone links, shaping P2P traffic, and controlling P2P traffic are the four aspects of traffic control research. With bandwidth control, the aggregate bandwidth of each user is limited to a particular volume (usually in gigabytes) over a period of time. The bandwidth limit can also be used as the base breakpoint in a tiered pricing scheme, where the service provider offers varying levels of service to its users. P2P users who require more traffic pay a higher price for it. Hence, by charging different prices for each tier of service, service providers recover some of the additional costs incurred by heavy-traffic users. Although capping bandwidth discourages most users from exceeding the prescribed cap of n GB per month and reduces the associated P2P costs, it leaves the upstream traffic unchanged, which continues to congest the network and incur unnecessary costs. Moreover, this approach cannot deal specifically with P2P traffic; it deals with the aggregate bandwidth of a user, which includes both P2P and non-P2P traffic. These shortcomings make it difficult to regard bandwidth limitation as an effective P2P traffic mitigation strategy.
III. METHODOLOGY
In this section, the method for P2P traffic detection and mitigation is described in detail. The paper does not address the privacy concerns of processing packet payloads. Fig. 1 illustrates the framework for classification of P2P traffic.
A. SNORT
SNORT is a Network Intrusion Detection System (NIDS) designed in 1998. It is a popular NIDS that provides many features to restrict malicious activities on a network.
The modules of SNORT are the sniffer, packet logger, NIDS, and IPS. It is based on the payload of packets and searches for a given signature according to the application protocol.
SNORT rules are used to express the signatures. SNORT is significant for traffic detection because of its accuracy. The packet decoder, preprocessors, detection engine, and logging and alerting system are the parts of the SNORT architecture [19-21].
The packet decoder prepares the data for the detection engine. The preprocessors analyze the packets. The detection engine applies a set of rules to verify the packets, so SNORT rules are applied in the analysis of each packet, and the engine rejects the packets that do not match the rules. The output module generates the output. Logging and alerting are two separate subcomponents. The logging system generates logs in human-readable format or tcpdump format. The alert system configures the alerts that are sent to files. Fig. 2 shows the application of SNORT rules to detect a data signature from BitTorrent [22]. The following rule is used to extract the signature from the dataset.
Fig. 1. The proposed framework
alert udp $OUTPUT_NET any -> $IN_NET any
Fig. 2. Sample SNORT Rule
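To make the idea of payload signature matching concrete, the following minimal Python sketch labels a payload as BitTorrent when it starts with the standard BitTorrent handshake prefix (byte 19 followed by the string "BitTorrent protocol"). It is not a reproduction of the SNORT rule above, whose options are not listed in this paper; the names `SIGNATURES` and `label_payload` are hypothetical and used for illustration only.

```python
# Minimal sketch of payload signature matching (an illustration, not SNORT itself).
# Assumption: a payload is labelled by the first signature it starts with.
BITTORRENT_HANDSHAKE = b"\x13BitTorrent protocol"  # standard handshake prefix

# Hypothetical signature table: application name -> payload prefix.
SIGNATURES = {"BitTorrent": BITTORRENT_HANDSHAKE}


def label_payload(payload: bytes) -> str:
    """Return the application whose signature prefixes the payload, else 'unknown'."""
    for app, signature in SIGNATURES.items():
        if payload.startswith(signature):
            return app
    return "unknown"


if __name__ == "__main__":
    print(label_payload(b"\x13BitTorrent protocol" + bytes(8)))
    print(label_payload(b"GET / HTTP/1.1\r\n"))
```

A real rule set contains many signatures per protocol and matches at arbitrary offsets; the prefix test here only shows the principle of labelling traffic by payload content.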
B. Features of the traffic
The features of Internet traffic data can be either statistical or behavioral, and they are used to classify the traffic. Feature selection chooses optimal subsets of features and improves the performance of the classifier [23, 24]. Moore [16] proposed a feature extractor that uses 249 features from multiple packet headers; these features are described in [25]. Extracting features is a complex and difficult process. Correlation-based feature selection (CFS) measures the inter-correlation and usefulness of each feature. Consistency-based feature selection (CoFS) inspects subsets of features and chooses the optimal subset. Table I shows the feature subset collected through the fuzzy rough evaluator. The CFS and CoFS algorithms were used as optimal feature extractors.
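As an illustration of how CFS trades usefulness against inter-correlation, the sketch below computes the commonly used CFS merit of a subset of k features from the mean feature-class correlation and the mean feature-feature correlation. The paper does not state which formula or implementation was used, so this is an assumption; the function name `cfs_merit` is hypothetical.

```python
import math


def cfs_merit(k: int, mean_feature_class_corr: float, mean_feature_feature_corr: float) -> float:
    """CFS merit of a k-feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff).

    Assumption: correlations are already averaged over the subset.
    """
    return (k * mean_feature_class_corr) / math.sqrt(
        k + k * (k - 1) * mean_feature_feature_corr
    )


if __name__ == "__main__":
    # Four features, each moderately predictive of the class.
    print(cfs_merit(4, 0.5, 0.0))  # uncorrelated features
    print(cfs_merit(4, 0.5, 1.0))  # fully redundant features
```

The merit grows when features correlate with the class and shrinks when they correlate with each other, which is why redundant features are dropped from the subset.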
C. ML
ML is the study of techniques and statistical methods that enhance the performance of a task. ML algorithms are extensively used in real-time applications. Support Vector Machines (SVM) [26], C5.0, and Neural Networks (NN) are some of the ML algorithms used in online P2P classification. SVM is a supervised ML method that derives a mapping function from data; the mapping function is used to classify the data according to the labels. The resulting model represents the data used in the training phase. SVM has been used in the literature to classify online P2P traffic data [27].
C5.0 is one of the ML methods. It is a decision tree algorithm that uses entropy estimation.
The features of the dataset are used to derive patterns, which are matched with a target class. The J48 algorithm selects the appropriate feature by evaluating each node of the tree. It is an iterative method that derives features from larger subsets, and it is a familiar method in the classification of Internet traffic.
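The entropy estimate on which such decision trees rely can be sketched in a few lines of Python. The example computes the entropy of a set of class labels and the information gain of a candidate split. It is a simplified illustration: C4.5-style trees additionally normalise the gain by the split information, which is omitted here, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups if g)
    return entropy(labels) - remainder


if __name__ == "__main__":
    flows = ["P2P", "P2P", "nP2P", "nP2P"]
    # A split that separates the two classes perfectly.
    print(information_gain(flows, [["P2P", "P2P"], ["nP2P", "nP2P"]]))
```

At each node the tree picks the feature whose split gives the largest gain, which is the sense in which a node is "evaluated" to select a feature.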
NN is a popular technique in ML. Many real-time applications have implemented NN for optimal solutions. An NN requires training to produce good results. The study in [28] used an NN for P2P traffic classification and produced better classification results with less computation time.
IV. EXPERIMENTAL SETUP
This section describes the configuration of SNORT. We then demonstrate how online flow features can be chosen for fast computation. Finally, the factors that affect the accuracy of the ML classifier are presented.
A. SNORT configuration
In this subsection, we explain the Snort configuration. Here, the main task of SNORT is to capture and analyze traffic so that the performance of its primary function, classifying P2P traffic, can be compared. Consequently, we installed and operated the core Snort files together with an interface that offers general software services, a GUI, and the capability to produce a graphical representation of data. We used SNORT 2.9.2.3 as a dedicated NIDS with minimum configuration, and we configured logging and alerting as the output methods.
The default log location is /var/log/snort, while we configured the alerts to be saved to an alert file (/home/log/alert.csv) with the timestamp, sig. operator, sig. id, sig. rev, alert message, and other remaining attributes.
The CentOS command to execute Snort is as follows: "# snort -i eth4 -c /machine location". SNORT is configured using snort.conf, which stores the details of its attributes, and limits the traffic in the network. It captures the packet payloads and compares them with the rules in the database. Each flow is classified according to the match. We used the SNORT default rule set to evaluate the performance of SNORT in P2P classification [29-32].
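A short Python sketch shows how such an alert file could be turned into flow labels. The column order in `FIELDS` is an assumption: the paper names only the first five attributes, and the remaining columns (protocol, addresses, ports) are supposed here for illustration. The sample alert line and its signature id are invented, not taken from the traces.

```python
import csv
import io

# Assumed column order of the alert file; only the first five are named in the text.
FIELDS = ["timestamp", "sig_generator", "sig_id", "sig_rev", "msg",
          "proto", "src", "srcport", "dst", "dstport"]


def read_alerts(stream):
    """Yield one dict per alert row, skipping rows that are too short."""
    for row in csv.reader(stream):
        if len(row) >= len(FIELDS):
            yield dict(zip(FIELDS, row))


def alerted_flows(alerts):
    """Map each alerted (proto, src, srcport, dst, dstport) tuple to its alert message."""
    return {(a["proto"], a["src"], a["srcport"], a["dst"], a["dstport"]): a["msg"]
            for a in alerts}


if __name__ == "__main__":
    # Invented sample row, for illustration only.
    sample = io.StringIO(
        '10/01-12:00:00.000000,1,1000001,1,"P2P BitTorrent example",'
        'UDP,10.0.0.5,51413,192.0.2.7,6881\n'
    )
    for flow, msg in alerted_flows(read_alerts(sample)).items():
        print(flow, msg)
```

Flows whose tuple appears in the resulting dictionary would be labelled P2P and the rest nP2P, which is one way to build the labelled dataset used for training.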
B. Features of online traffic
The focus of the research is to reduce the set of features and enhance the classification of traffic data set.
[Figure: two-phase diagram. Moore's 248 features are used as testing data for three feature selection algorithms, each evaluated with an SVM classifier. The three optimal feature selection algorithms are then integrated into candidate features, on-line (7) and offline (20); the on-line features are extracted and passed to a J48 classifier model, which outputs P2P or nP2P.]
Feature selection algorithm         Features   Accuracy of P2P classification   Build time (s)
Consistency (algorithm_1)           8          91.7%                            2.01
Fuzzy-rough (algorithm_2)           7          90%                              19.8
Chi-Squared (algorithm_10)          12         97.5%                            27.53
Fig. 3. Illustration of extraction of online features
The process of collecting online features is simple and can be done before the completion of a flow. The Chi-Square and fuzzy-rough algorithms are employed in P2P traffic detection. The evaluation of features produces an optimal online P2P traffic classification. Fig. 3 illustrates the phases involved in the extraction. The feature subset of the traffic data used for classification is listed in Table I. The performance of J48 is measured using the optimal features.
TABLE I: PROPOSED FEATURE SUBSET FOR P2P DETECTION [33]
No   Feature        Description
1    Server Port    Destination port number
2    IAT            Inter-arrival time
3    data ip        Memory in bytes (IP packet)
4    data ip g→h    Memory in bytes (download)
5    IAT g→h        Inter-arrival time (download)
6    data ip g→h    Memory in bytes (upload)
7    class          Application class
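The features in Table I can be computed from the first packets of a flow. The Python sketch below is one possible reading of the table, under stated assumptions: inter-arrival time is taken as the mean gap between consecutive packets (the table does not say which statistic is used), each packet is described by a hypothetical `(timestamp, ip_bytes, direction)` tuple, and the dictionary keys are illustrative names.

```python
from statistics import mean


def online_features(packets, server_port):
    """Compute Table I style features for one flow.

    packets: list of (timestamp_seconds, ip_bytes, direction) tuples in arrival
    order, where direction is "up" or "down" (an assumed encoding).
    """
    times = [t for t, _, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]
    down_times = [t for t, _, d in packets if d == "down"]
    down_gaps = [b - a for a, b in zip(down_times, down_times[1:])]
    return {
        "server_port": server_port,
        "iat": mean(gaps) if gaps else 0.0,                # assumed: mean gap
        "data_ip": sum(size for _, size, _ in packets),
        "data_ip_down": sum(size for _, size, d in packets if d == "down"),
        "iat_down": mean(down_gaps) if down_gaps else 0.0,
        "data_ip_up": sum(size for _, size, d in packets if d == "up"),
    }


if __name__ == "__main__":
    flow = [(0.0, 100, "up"), (1.0, 200, "down"), (3.0, 300, "down")]
    print(online_features(flow, server_port=6881))
```

Because every quantity is a running sum or a gap between consecutive packets, the features can be updated packet by packet, before the flow completes.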
C. Online P2P ML algorithm
This subsection explains the algorithms used to classify the traffic data. First, a model is created based on three individual algorithms: SVM [26], decision tree (C4.5), and Artificial NN. These algorithms are used because they classify P2P traffic better than other algorithms; we investigated more than ten algorithms for this purpose. The classifiers are then measured in practice in terms of effectiveness and efficiency. Furthermore, a multi-classifier model is built and evaluated. Figs. 4 and 5 and Table II depict the process of the P2P ML classifier.
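The paper does not state how the individual classifiers are combined into the multi-classifier model. As a hedged illustration only, the sketch below shows one common combination rule, soft voting, in which the P2P probabilities reported by the individual models are averaged and thresholded. The function name, the model keys, and the 0.5 threshold are assumptions.

```python
def soft_vote(prob_p2p_by_model, threshold=0.5):
    """Average the P2P probabilities of several models and threshold the mean.

    prob_p2p_by_model: dict mapping a model name to its estimated P(P2P) for a flow.
    Returns (label, mean_probability).
    """
    mean_prob = sum(prob_p2p_by_model.values()) / len(prob_p2p_by_model)
    label = "P2P" if mean_prob >= threshold else "nP2P"
    return label, mean_prob


if __name__ == "__main__":
    print(soft_vote({"ann": 0.9, "j48": 0.6}))
    print(soft_vote({"ann": 0.2, "j48": 0.6}))
```

Other rules, such as majority voting or stacking, would fit the description equally well; the point is only that a disagreement between models is resolved by a fixed, cheap rule at classification time.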
[Figure: two pipelines. A. Off-line classifier: traffic traces are sniffed by Snort and split into flows; features are extracted and selected at the flow level as input to the ML classifier model; the output is validated using a DPI classifier and compared. B. On-line classifier: online traffic captured with tcpdump enters the classifier model at the flow level; the output is validated using a manual classifier and yields the controlled traffic (result).]
Fig. 4. Research process of on-line P2P classifier
[Flow chart: Start; capture data and save the captured data to file1; sniff using Snort and save the sniffed data to file2; create labeled data; open flow information; validation; apply the on-line feature selection algorithm; apply generated examples using machine learning; validation; understand the effective mitigation strategies for online traffic management; suggest a suitable level of BW management; valid?; overall system integration and evaluation; satisfied?; evaluation and results; End. Each decision has a "yes" branch that continues and a "no" branch that loops back.]
Fig. 5. Flow Chart
TABLE II: STEPS OF BUILDING THE MODEL
Algorithm 1: P2P classifier
Input
  P: {p1, p2, …, pn}, the packet traffic
  X: {f1, f2, …, fn}, the flow information
  D: {f1, f2, …, fn, c}, the labeled dataset
  F: F = Fsub0 ∪ Fsub1, where Fsub0 is the offline features and Fsub1 is the online features
Output
  Class of each flow (P2P or nP2P)
Steps
  System: sudo tcpdump tcp or udp        # capture packets
  For each packet
  {
    Extract packet-level information
    Read row
    Capture = "File1"                    # save captured data
  }
  Sniff "File1" and save the sniffed data to "File2"
  For each row                           # create labeled data
  {
    Add class c
    Read D = [f1, f2, …, fn, c]
  }
  Split flow information:
  For each file/packet
    If the packet (IP_src/IP_dst) belongs to an existing flow
      Add packet
    Else
      New flow
  Read D = [f1, f2, …, fn, c]; F = F0 ∪ F1   # array to save flow features
  Open online flow information
  Extract the online features
  Run the algorithm                      # array to save the P2P and nP2P flows
  Build the model
  {
    In: online features
    Out: class
    Classifier
  }
  Evaluation (check)
  {
    Write data
    Read P2P and nP2P classes
    Close
  }
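The flow-splitting step of Algorithm 1 can be sketched in Python as follows. The sketch groups packets by protocol and by the unordered pair of endpoints, so that both directions of a conversation fall into the same flow. Using ports and protocol in addition to the IP addresses is an assumption (the algorithm mentions only IP_src/IP_dst), and the packet dictionary keys are hypothetical.

```python
from collections import defaultdict


def flow_key(packet):
    """Direction-independent key: protocol plus the two sorted (ip, port) endpoints."""
    endpoints = sorted([(packet["src"], packet["sport"]),
                        (packet["dst"], packet["dport"])])
    return (packet["proto"], endpoints[0], endpoints[1])


def split_flows(packets):
    """Add each packet to its existing flow, or start a new flow for it."""
    flows = defaultdict(list)
    for packet in packets:
        flows[flow_key(packet)].append(packet)
    return dict(flows)


if __name__ == "__main__":
    packets = [
        {"proto": "tcp", "src": "10.0.0.5", "sport": 51413, "dst": "192.0.2.7", "dport": 6881},
        {"proto": "tcp", "src": "192.0.2.7", "sport": 6881, "dst": "10.0.0.5", "dport": 51413},
        {"proto": "udp", "src": "10.0.0.5", "sport": 40000, "dst": "192.0.2.9", "dport": 53},
    ]
    for key, members in split_flows(packets).items():
        print(key, len(members))
```

The per-flow packet lists produced this way are the input from which the online features are extracted before the model is built.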
V. RESULTS AND DISCUSSION
This section presents the traffic traces and the metrics used to evaluate the performance of the proposed and existing methods.
A. Data preprocessing
Datasets were downloaded from Universiti Teknologi Malaysia with proper permissions. Tables III to VI show the datasets of traffic flows with their sizes. Academic and affiliated college datasets are included in the downloaded dataset. The dataset contains 1834122 packets (approx. 15365 flows) from different parts of the network. The first three datasets were downloaded between July and October 2011, and WireShark was used to capture them [34]. Dataset 4 was downloaded in October 2012 using TCPDUMP and analyzed by SNORT [35]. The fifth dataset was combined traffic downloaded in November 2012. Table III shows the details of the datasets.
TABLE III: DATASETS
Dataset   Application   Packets   Flows
1         eMule         25970     415
2         PPlive        88925     861
3         BitTorrent    951749    12012
4         Mix           700969    1463
5         HTTP          66509     614
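The totals quoted in the text can be checked directly against the rows of Table III. The short Python sketch below sums the packet and flow columns; the variable names are illustrative.

```python
# Rows of Table III: (dataset number, application, packets, flows).
TABLE_III = [
    (1, "eMule", 25970, 415),
    (2, "PPlive", 88925, 861),
    (3, "BitTorrent", 951749, 12012),
    (4, "Mix", 700969, 1463),
    (5, "HTTP", 66509, 614),
]

total_packets = sum(row[2] for row in TABLE_III)
total_flows = sum(row[3] for row in TABLE_III)

print(total_packets, total_flows)  # 1834122 15365
```

Both sums agree with the 1834122 packets and 15365 flows reported for the combined dataset.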
The CAIDA datasets [36] contain active and passive values of Internet connections. The active values are used to determine latency, and the passive values are calculated by the network operators. Researchers can access the datasets with appropriate permission from the University of California.
The datasets of the years 2009 to 2013 can be downloaded with a secure login. The first quarter of the flow is generated through TCP trace. The first set is 897 MB, the next set is 1.11 GB, and the last set is 703 MB.
TABLE IV: THE TRACES OF CAIDA DATASETS [36]
Dataset
equinix-****.20130117-8***.UTC.anon
equinix-****.20130117-****.UTC.anon
equinix-****.20130221-****.UTC.anon
The University of Brescia (UNIBS) traces contain the flows captured between September and October 2009. The traces were generated by a collection of workstations. The edge router was connected to the Internet at 100 Mbps to download the traffic flow. A dedicated hard disk with an ATA controller was used to store the traces. The traces contain a total of 78998 flows: Web (61.2%), Mail (5.7%), P2P (32.9%), and others (0.2%). The first set is 317 MB, the second is 236 MB, and the last is 1.94 GB.
TABLE V: THE TRACES OF UNIBS DATA SETS [37]
Dataset
unibs20090930.anon
unibs20091001.anon
unibs20091002.anon
The Cambridge datasets [38] were downloaded from the Genome Campus network and were captured in August 2003. The University of Cambridge has provided the datasets for researchers. Apart from the datasets discussed above, these ten datasets were used in the research. They cover a wide range of TCP flows and are high-dimensional, with 248 features.
TABLE VI: THE SAMPLES OF THE CAMBRIDGE DATA SETS [38]
Dataset   Instances (flows)
1         24863
2         23801
3         22932
4         22285
5         21648
6         19384
7         55835
8         55494
9         66248
10        65036
B. Evaluation Metric
Accuracy and computation cost are the evaluation metrics applied to the proposed approach. Retrieval capacity is one of the features of the classifier. The True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) counts of a classifier reveal its performance. The times taken by the training and testing phases are also recorded to measure the computation time [39].
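The relation between the four counts and the reported rates can be written as a short Python sketch. The definitions are the standard ones (accuracy, true positive rate, false positive rate); the function name and the example counts are illustrative and are not taken from the experiments.

```python
def classification_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard rates derived from the confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tp_rate": tp / (tp + fn),   # share of P2P flows detected
        "fp_rate": fp / (fp + tn),   # share of nP2P flows wrongly flagged
    }


if __name__ == "__main__":
    # Illustrative counts: 100 P2P flows and 100 nP2P flows.
    print(classification_rates(tp=90, fp=5, tn=95, fn=10))
```

Computation cost is measured separately, by timing the training and testing phases, and is not derived from these counts.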
[Figure: campus network used for data preparation. Components shown: server, switch, hub, Packeteer, Jflow, main router, colleges router, admin router, colleges area, and academic area. Platform: CentOS, 2.4 GHz Intel CPUs, 1 GB DDR memory, Ultra320 SCSI drives, Snort.]
Fig. 6. Framework of Data Preparation
D. Comparison of Results
Table VII and Fig. 7 show the classification performance of the proposed approach. In the training phase, the benefit (accuracy) of the individual models is 98.8% for the neural network classifier and 98.74% for the decision tree, while in the testing phase it is 98.78% and 98.71%, respectively. The accuracy of the multi-classifier shows a significant improvement: the benefit is 99.72% for training and 99.72% for testing, and the cost is 0.28.
Table VIII presents the comparison between our proposed approach, the hybrid naïve Bayes tree (NBTree), and the port-based method. Our proposed approach has the lowest false positive rate, 0.28%. Moreover, our classifier speeds up classification compared with NBTree (416 s) and the port-based method (4 s) on the same dataset. This improvement results from using a multi-classifier and from reducing the number of features.
Table IX presents the validation of the proposed approach. The classification results show that the system provides higher accuracy than SNORT and individual ML classifiers. The approach is also able to detect the emergence of new P2P applications.
TABLE VII: THE EVALUATION RESULTS
Partition   Classifier   TP       FP
Training    ANN          98.43%   1.57%
Training    J48          98.58%   1.42%
Training    SVM          98.00%   2.00%
Training    ANN+J48      99.9%    0.1%
Testing     ANN          98.31%   1.69%
Testing     J48          98.46%   1.54%
Testing     SVM          97.90%   2.10%
Testing     ANN+J48      99.72%   0.28%
Fig. 7. Evaluation of ANN and C5.0 using Gains Chart
TABLE VIII: COMPARISON OF OUR METHOD, HYBRID NBTREE AND PORT-BASED METHOD
Methods        TP      FP      TN      FN     Time (s)
Port based     97.7%   5.2%    94.8%   2.3%   4.09
NBTree         99.5%   0.3%    99.7%   0.5%   416.32
Our proposal   99.7%   0.28%   97.2%   0.3%   1.94
TABLE IX: THE VALIDATION OF THE PROPOSED APPROACH
Methods           Our method   Snort   ML
Need port info    Yes          No      Yes
Accuracy          High         High    Fluctuant
Unknown P2P       Yes          No      Yes
Online learning   Yes          No      Yes
E. Mitigation strategy
The dynamic mitigation strategy is given in the following table:
TABLE X: THE DYNAMIC MITIGATION STRATEGY
Strategy 1
  Categorize the applications into:
    Critical applications
    Online (latency-sensitive) applications
    P2P applications or protocols
    Inbound P2P applications
    Outbound P2P applications/protocols
  Calculate bandwidth management statistics
  Create bandwidth classes
Dynamic strategy
  Set min bandwidth to critical applications
  Set med bandwidth to latency-sensitive applications
  Set max bandwidth to P2P applications
  Limit total P2P traffic to 30% of the link capacity
  Limit P2P outbound bandwidth to 24 kbps per user
  Limit offline P2P Internet traffic
  No limit on online P2P traffic
  Provide optional P2P bandwidth-on-demand to allow users more bandwidth
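Two of the limits in Table X, the 30% share of the link capacity and the 24 kbps per-user outbound cap, can be expressed as a small Python sketch. The sketch is a simplification under stated assumptions: it applies the 30% budget to outbound P2P traffic only, and it scales all users down proportionally when the budget is exceeded, a rule that the paper does not specify. The function names are hypothetical.

```python
P2P_LINK_SHARE = 0.30             # Table X: total P2P limited to 30% of link capacity
P2P_OUTBOUND_PER_USER_KBPS = 24   # Table X: per-user outbound P2P limit


def p2p_budget_kbps(link_capacity_kbps: float) -> float:
    """Aggregate bandwidth that P2P traffic may use on the link."""
    return link_capacity_kbps * P2P_LINK_SHARE


def allocate_outbound(demands_kbps: dict, link_capacity_kbps: float) -> dict:
    """Cap each user's outbound P2P demand, then fit the total into the budget.

    Assumption: when the capped total exceeds the budget, every user is
    scaled down by the same factor.
    """
    capped = {user: min(demand, P2P_OUTBOUND_PER_USER_KBPS)
              for user, demand in demands_kbps.items()}
    total = sum(capped.values())
    budget = p2p_budget_kbps(link_capacity_kbps)
    if total <= budget or total == 0:
        return capped
    scale = budget / total
    return {user: rate * scale for user, rate in capped.items()}


if __name__ == "__main__":
    print(allocate_outbound({"a": 100, "b": 10}, link_capacity_kbps=1000))
    print(allocate_outbound({"a": 100, "b": 10}, link_capacity_kbps=100))
```

In a deployment these limits would be enforced by the bandwidth management device, with the classifier output deciding which flows fall into the P2P class.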
VI. CONCLUSION
The growth of the Internet has changed the nature of network traffic. Managing network traffic improves network performance, and classifying traffic flows improves network efficiency. This paper has proposed a technique that uses SNORT rules and ML to detect and mitigate online P2P traffic flows. The experimental results confirm that the classifier achieves high accuracy at a low computation cost. The SVM classifier obtained 99.5% accuracy in less computation time than the other classifiers.
ACKNOWLEDGMENT
The authors would like to express their gratitude to the University of Cambridge, the University of Brescia, and CAIDA for sharing the datasets. We are thankful for the cooperation and support of the University of Elimam Elmahdi.
REFERENCES
[1] Jamil, H.A. and B. M Ali, Classifying Internet Traffic Using an Efficient Classifier. International Journal of Recent Technology and Engineering (IJRTE), 2019. 8(3).
[2] Jamil, H.A., Feature Selection and Machine Learning Classification for Live P2P Traffic. IJEOM, 2019.
[3] Abdalla, B.M.A., et al. Multi-stage Feature Selection for On-Line Flow Peer-to-Peer Traffic Identification. in Asian Simulation Conference. 2017. Springer.
[4] Jamil, H.A., A. Abdalla, and B. M K, Improving P2P Network Traffic Classification with ML multi-classifiers. International Journal of P2P Network Trends and Technology (IJPTT), 2014. 4(2).
[5] Ibrahim, H.A.H., S.M. Nor, and H.A. Jamil. Online hybrid internet traffic classification algorithm based on signature statistical and port methods to identify internet applications. in 2013 IEEE International Conference on Control System, Computing and Engineering. 2013. IEEE.
[6] Jamil, H.A., Detection and Mitigation Framework of Peer-to-Peer Traffic in Campus Networks. International Review on Computers and Software (I.RE.CO.S.), 2013. 8(8).
[7] O. Mula-Valls, "A practical retraining mechanism for network traffic classification in operational environments," Master Thesis in Computer Architecture, Networks and Systems, Universitat Politecnica de Catalunya, 2011.
[8] M. M. Hassan and M. Marsono, "A three-class heuristics technique: Generating training corpus for Peer-to-Peer traffic classification," in Internet Multimedia Services Architecture and Application (IMSAA), 2010 IEEE 4th International Conference on, 2010, pp. 1-5.
[9] H. Lu and C. Wu, "Identification of P2P traffic in campus network," 2010, pp. V1-21-V1-23.
[10] A. Moore and K. Papagiannaki, "Toward the accurate identification of network applications," Passive and Active Network Measurement, pp. 41-54, 2005.
[11] A. W. Moore and D. Zuev, "Internet traffic classification using bayesian analysis techniques," 2005, pp. 50-60.
[12] J. Erman, A. Mahanti, M. Arlitt, I. Cohen, and C. Williamson, "Offline/realtime traffic classification using semi-supervised learning," Performance Evaluation, vol. 64, pp. 1194-1213, 2007.
[13] L. Bernaille, R. Teixeira, I. Akodkenou, A. Soule, and K. Salamatian, "Traffic classification on the fly," ACM SIGCOMM Computer Communication Review, vol. 36, pp. 23-26, 2006.
[14] J. Erman, M. Arlitt, and A. Mahanti, "Traffic classification using clustering algorithms," in ACM SIGCOMM 2006 - Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, September 11, 2006 - September 15, 2006, Pisa, Italy, 2006, pp. 281-286.
[15] N. Williams, S. Zander, and G. Armitage, "A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification," ACM SIGCOMM Computer Communication Review, vol. 36, pp. 5-16, 2006.
[16] T. Auld, A. W. Moore, and S. F. Gull, "Bayesian neural networks for internet traffic classification," Neural Networks, IEEE Transactions on, vol. 18, pp. 223-239, 2007.
[17] Y. Ma, Z. Qian, G. Shou, and Y. Hu, "Study of information network traffic identification based on C4.5 algorithm," 2008, pp. 1-5.
[18] Y. Luo, "Survey on P2P traffic managements," vol. 145 AISC, ed. Bali, 2012, pp. 191-196.
[19] K. Salah and A. Kahtani, "Performance evaluation comparison of Snort NIDS under Linux and Windows Server," Journal of Network and Computer Applications, vol. 33, pp. 6-15, Jan 2010.
[20] K. Salah and F. Haidari, "Performance evaluation and comparison of four network packet rate estimators," Aeu-International Journal of Electronics and Communications, vol. 64, pp. 1015-1023, 2010.
[21] D. A. Carvalho, M. Pereira, and M. M. Freire, "Towards the Detection of Encrypted BitTorrent Traffic through Deep Packet Inspection," in Security Technology, ed: Springer, 2009, pp. 265-272.
[22] (2012). Emergingthreats (ET) Rules. Available: http://rules.emergingthreats.net/open/snort-2.9.0/rules/emerging-p2p.rules
[23] J.-j. Zhao, X.-h. Huang, Q. Sun, and Y. Ma, "Real-time feature selection in traffic classification," The Journal of China Universities of Posts and Telecommunications, vol. 15, Supplement, pp. 68-72, 2008.
[24] H. A. Jamil, R. Zarei, N. O. Fadlelssied, M. Aliyu, S. M. Nor, and M. N. Marsono, "Analysis of features selection for P2P traffic detection using support vector machine," in Information and Communication Technology (ICoICT), 2013 International Conference of, 2013, pp. 116-121.
[25] A. W. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Technical report, Intel Research, Cambridge, 2005.
[26] (2012). Support vector machines (SVM). Available: http://www.support-vector-machines.org
[27] R. Wang, Y. Liu, Y. Yang, and H. Wang, "A new method for P2P traffic identification based on support vector machine," Artificial Intelligence Markup Language. Egypt: IEEE Computer Society, pp. 58-63, 2006.
[28] A. Nogueira, P. Salvador, A. Couto, and R. Valadas, "Towards the On-line Identification of Peer-to-peer Flow Patterns," Journal of Networks, vol. 4, 2009.
[29] (2012). Peer-to-Peer rules for snort. Available: http://rules.emergingthreats.net/open/snort-2.9.0/rules/emerging-p2p.rules
[30] (2012). SOURCEfire. Available: http://www.sourcefire.com/security-technologies/snort/snort-rules
[31] (2013). SANS detecting-torrents-snort. Available: http://www.sans.org/reading-room/whitepapers/detection/detecting-torrents-snort-33144
[32] (2012). Snort community-rules. Available: http://www.snort.org/snort-rules
[33] H. A. Jamil, A. M, A. Hamza, S. M. Nor, and M. N. Marsono, "Selection of online Features for Peer-to-Peer Network Traffic Classification," in Recent Advances in Intelligent Informatics. vol. 235, ed: Springer International Publishing, 2014, pp. 379-390.
[34] (2010). Wireshark. Available: http://www.wireshark.org
[35] SNORT Network Intrusion Detection System. Available: www.snort.org
[36] (2013, 10 April 2013). The Cooperative Association for Internet Data Analysis. Available: http://www.caida.org/data
[37] (19 Nov). Università Brescia data sets. Available: http://www.ing.unibs.it/ntw/tools/traces/download/
[38] (18 nov 2012). Cambridge data sets. Available: http://www.cl.cam.ac.uk/research/srg/netos/nprobe/data/papers/sigmetrics/index.html
[39] H. L. Zhang, G. Lu, M. T. Qassrawi, Y. Zhang, and X. Z. Yu, "Feature selection for optimizing traffic classification," Computer Communications, vol. 35, pp. 1457-1471, Jul 1 2012.
Haitham Ahmed Jamil is an assistant professor in the Faculty of Computer Science and Information Technology at Elimam Elmahdi University. He received the B.Sc. and M.Sc. from University of Gezira in Sudan and the PhD from the Faculty of Electrical Engineering at University Technology Malaysia. His research interests include computer networks, network traffic classification, Peer-to-Peer computing, and optimization techniques.
Bushra Mohammed Ali is a PhD candidate at the Faculty of Engineering, University Technology Malaysia. He obtained the B.Sc. and M.Sc. from University of Gezira, Sudan. He is a lecturer at University of Kordofan. His research interests include computer architecture, network traffic classification and control, artificial intelligence, and optimization techniques.
Mosab Hamdan is a PhD student in the VECAD research group at University Technology Malaysia. He obtained a BSc in Electronic and Electrical Engineering from University of Science and Technology (Sudan) and an MSc in Computer Architecture and Networking from University of Khartoum (Sudan). His current research interests are Software Defined Networking, load balancing, network traffic classification, and future networks.
Ahmed Elhaj Osman is a PhD student in the School of Electrical Engineering (Faculty of Engineering) at University Technology Malaysia. He obtained his Bachelor's degree from University of Khartoum and an MSc in Computer Engineering and Networking from University of Gezira (Sudan). His current research interests are Software Defined Networking, network communication, and network disasters.