Online P2P Internet Traffic Classification and Mitigation Based on Snort and ML


Vol. 4, No. 10, October 2019



Abstract—Peer-to-peer (P2P) applications have changed the nature of Internet traffic. They consume a large share of Internet bandwidth and degrade the performance of traditional Internet applications. Managing and monitoring Internet traffic is therefore an important part of network optimization. Port-based, payload-based, and transport-layer-based methods were developed in the past to detect and mitigate P2P traffic. Nevertheless, the performance of these methods has not met expectations.

Machine Learning (ML) is one of the promising methods for identifying and mitigating Internet traffic. However, its classification accuracy is inconsistent, mainly because of how the training datasets are generated and how the features are selected. In this research, a technique that combines signature-based detection and ML is proposed to build a model for online P2P traffic detection and mitigation. The proposed work can be employed to evaluate the robustness of the online P2P machine learning classifier on real network traffic traces containing flows labelled by the SNORT tool and on traces from special shared resources. Analysis and validation were carried out on traffic traces of University Technology Malaysia. The traffic was captured in 2011 and 2013. The results reveal that the proposed work spends less computation time on classification. The method achieves 99.7% accuracy, which equals the classification performance attained for P2P using a deep packet inspector. The findings show that classifying network traffic at the flow level can differentiate P2P from non-P2P (nP2P) traffic with high confidence for online P2P mitigation.

Index Terms—P2P Traffic Flow, Traffic Classification and Mitigation, SNORT, Machine Learning.

I. INTRODUCTION

Edge services platforms and Peer-to-Peer (P2P) computing are recent developments in the architecture of the Internet. These two techniques are used to reduce network traffic. Traffic classification is used to enhance network performance with less bandwidth. Managing network traffic is one of the challenges in heterogeneous networks, and traffic detection and mitigation are useful tools for monitoring the network. There is a lack of substantial research on P2P traffic classification. Variation in the properties of traffic leads to a change in traffic performance [1-5]. The capability of an ML method depends on the dataset and the features used in the training phase. This paper proposes a technique for online P2P Internet traffic detection and mitigation, which is based on SNORT rules and ML.

Published on October 25, 2019.

H. A. Jamil is with University of Elimam Elmahdi, Kosti, White Nile State, Sudan (e-mail: [email protected]).

B. M Ali, Ahmed E. Osman, and Mosab Hamdan are PhD candidates with University Technology Malaysia, 81310 Skudai, Johor Bahru, Malaysia (e-mail: [email protected], [email protected] and [email protected]).

The proposed model can be used in a variety of network locations and can efficiently identify emerging P2P applications. The aim of the research is to develop an effective classifier for P2P traffic data. State-of-the-art classifiers are compared with the proposed model in terms of accuracy and computation cost.

The structure of the paper is organized as follows: section two will provide information about the existing literature on network traffic classification. Section three will give details about the methodology of the research. Section four will discuss the experimental setup of the research. Section five will provide results and analysis and finally, the paper will be concluded in section six.

II. REVIEW OF LITERATURE

Packet payload, traffic characteristics, and flow statistics are the techniques used in existing detection methods [6-8]. Packet payload inspection [9,10] is the most familiar technique and has shown effective results in P2P traffic classification. Its drawbacks are increased storage requirements and privacy concerns, and its performance is limited for encrypted and unknown traffic. The flow statistics approach depends on the behavioural pattern of the network. Due to the nature of the Internet, flow statistics are harder to obtain and can be obfuscated [11]. Attributes such as flow size and packet size are used in the existing methods. Techniques that rely on neither port numbers nor protocol characteristics do not face these problems, as reported in [12–17]. Supervised and unsupervised learning are the popular concepts of ML.

The literature discusses the problems of supervised learning methods.

Zarei et al. [5] developed a P2P ML classifier that generated patterns from three classes of the dataset; the method was used to retrain an online P2P classifier. Moore et al. [11] developed a Naïve Bayes classifier to classify a complex dataset. Williams et al. [15] compared different ML methods to evaluate their performance in the classification of Internet traffic. The comparison showed that C4.5 performed better than the other methods. Auld et al. [16] developed an ML-based Bayesian Neural Network (BNN) that produced optimal results. Ma et al. [17] obtained high-accuracy classification results with ML classifiers that used the features of the datasets, although producing such results took more computation time. Erman et al. [18] compared the K-Means and DBSCAN algorithms, and the results showed that K-Means clusters Internet traffic better than DBSCAN. Bernaille et al. [13] developed a clustering algorithm based on K-Means, which classified five bundles of Internet traffic data.

Haitham A. Jamil, Bushra M. Ali, Mosab Hamdan, and Ahmed E. Osman

Internet service providers (ISPs) have increased bandwidth to overcome the network bandwidth problem, and each ISP has to choose a suitable technique to control P2P traffic. Blocking P2P traffic, installing cache tools between domains and backbone links, shaping P2P traffic, and controlling P2P traffic are the four aspects of traffic control research. With bandwidth control, the aggregate bandwidth of each user is limited to a particular volume (usually in gigabytes) over a period of time. The bandwidth limit can also be used as the base breakpoint in a tiered pricing scheme in which the ISP offers varying levels of service to its users. P2P users, who require more traffic, would pay a higher price for it. Hence, by charging different prices for each tier of service, service providers recover some of the additional costs incurred by heavy-traffic users. Although capping bandwidth discourages most users from exceeding the prescribed cap of n GB per month and reduces the associated P2P costs, it leaves the upstream traffic unchanged, which continues to congest the network and incur unnecessary costs. Moreover, this approach cannot deal specifically with P2P traffic; it deals with the aggregate bandwidth of a user, which includes both P2P and non-P2P traffic. These shortcomings make it difficult to regard bandwidth limitation as an effective P2P traffic mitigation strategy.

III. METHODOLOGY

In this section, the method for P2P traffic detection and mitigation is described in detail. The paper does not address the privacy concerns involved in processing packet payloads. Fig. 1 illustrates the framework for classification of P2P traffic.

A. SNORT

SNORT is a Network Intrusion Detection System (NIDS) designed in 1998. It is a popular NIDS that provides many features to restrict malicious activities on a network. SNORT can operate as a sniffer, a packet logger, an NIDS, and an IPS. It works on the payload of packets and searches for a given signature according to the application protocol. The signatures are expressed as SNORT rules. SNORT is significant for traffic detection because of its accuracy. The SNORT architecture consists of a packet decoder, preprocessors, a detection engine, and a logging and alerting system [19-21].

The packet decoder prepares the data for the detection engine. The preprocessors analyze the packets. The detection engine applies a set of rules to verify the packets: SNORT rules are applied to each packet, and the engine rejects the packets that do not match them. The output module generates the output. The two separate subcomponents of the engine are logging and alerting. The logging system generates logs in human-readable format or tcpdump format. The alert system configures alerts to be sent to files. Fig. 2 shows the application of SNORT rules to detect data signatures from BitTorrent [22]. The following rule is used to extract signatures from the dataset.

Fig. 1. The proposed framework

alert udp $OUTPUT_NET any -> $IN_NET any

Fig. 2. Sample SNORT Rule
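As a hedged illustration of how the header of such a rule is structured, the Python sketch below splits a rule into its seven header fields (action, protocol, source network and port, direction, destination network and port). The names `RuleHeader` and `parse_rule_header` are hypothetical helpers written for this example and are not part of SNORT; any rule options that would normally follow in parentheses are simply ignored.

```python
from typing import NamedTuple


class RuleHeader(NamedTuple):
    action: str
    protocol: str
    src_net: str
    src_port: str
    direction: str
    dst_net: str
    dst_port: str


def parse_rule_header(rule: str) -> RuleHeader:
    """Split a SNORT rule into its seven header fields."""
    # Everything from the first "(" onward is the options part; drop it.
    header = rule.split("(", 1)[0]
    fields = header.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 header fields, got {len(fields)}")
    return RuleHeader(*fields)


if __name__ == "__main__":
    sample = "alert udp $OUTPUT_NET any -> $IN_NET any"
    print(parse_rule_header(sample))
```

Applied to the sample rule, the sketch reports an `alert` action on `udp` traffic flowing from `$OUTPUT_NET` to `$IN_NET` on any port.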

B. Features of the traffic

The features of Internet traffic data can be either statistical or behavioral. The features are used to classify the traffic data, and feature selection is used to choose optimal subsets of features. The feature selection process improves the performance of the classifier [23, 24]. Moore [16] proposed a feature extractor that uses 249 features from multiple packet headers. Moore's features are listed in [25]. Extracting all of these features is complex and difficult. Correlation-based feature selection (CFS) is used to assess the inter-correlation and usefulness of each feature. Consistency-based feature selection (CoFS) is used to inspect the subsets of features and choose the optimal subset. Table I shows the feature subset collected through the fuzzy-rough evaluator. The CFS and CoFS algorithms were used as optimal special feature extractors.
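The paper does not state the scoring function used by CFS. As a sketch under that caveat, the code below uses the merit heuristic commonly associated with correlation-based feature selection, k * r_cf / sqrt(k + k(k - 1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. The function name `cfs_merit` and the example values are assumptions made for illustration.

```python
import math


def cfs_merit(k: int, mean_feature_class_corr: float,
              mean_feature_feature_corr: float) -> float:
    """Merit of a k-feature subset under the usual CFS heuristic.

    High correlation with the class raises the merit; high correlation
    among the features themselves (redundancy) lowers it.
    """
    if k < 1:
        raise ValueError("subset must contain at least one feature")
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative values only: same relevance, different redundancy.
    print(cfs_merit(4, 0.5, 0.0))  # uncorrelated features
    print(cfs_merit(4, 0.5, 1.0))  # fully redundant features
```

With equal relevance, the subset of uncorrelated features scores higher than the fully redundant one, which is the behaviour that makes the heuristic useful for shrinking a large feature set.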

C. ML

ML is the study of techniques and statistical methods that improve the performance of a task. ML algorithms are extensively used in real-time applications. Support Vector Machines (SVM) [26], C5.0, and Neural Networks (NN) are some of the ML algorithms used in online P2P classification. SVM is a supervised ML method that derives a mapping function from data. The mapping function is used to classify the data according to the labels. The resulting model represents the data that was used in the training phase. SVM has been used in the literature to classify online P2P traffic data [27].

C5.0 is another ML method. It is a decision tree algorithm that uses entropy estimation. The features of the dataset are used to derive patterns, which are matched with a target class. The J48 algorithm selects the appropriate feature when evaluating each node of the tree. It is an iterative method that derives features from larger subsets, and it is widely used in the classification of Internet traffic.
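To make the entropy estimation concrete, the following minimal Python sketch computes the entropy of a set of flow labels and the information gain of a candidate split. It illustrates the general decision tree idea only; it is not the C5.0 or J48 implementation, and the function names are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


if __name__ == "__main__":
    flows = ["P2P", "P2P", "nP2P", "nP2P"]
    # A split that separates the two classes perfectly.
    print(information_gain(flows, [["P2P", "P2P"], ["nP2P", "nP2P"]]))
```

A tree learner of this family evaluates such a score for each candidate feature at a node and keeps the split with the largest gain (C4.5-style learners normalise it into a gain ratio).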

NN is a popular technique in ML. Many real-time applications have implemented NN to obtain optimal solutions. An NN requires training to produce good results. The study in [28] used an NN for P2P traffic classification and produced better classification results with less computation time.

IV. EXPERIMENTAL SETUP

This section describes the configuration of SNORT. Then, we demonstrate how online flow features can be chosen for fast computation. Finally, the factors that affect the accuracy of the ML classifier are presented.

A. SNORT configuration

In this subsection, we explain the Snort configuration. The main task of SNORT here is to capture and analyze packets so that its performance in classifying P2P traffic can be evaluated. Consequently, we installed and operated the core Snort files together with a front-end interface, which offers general software services with a GUI and can produce graphical representations of the data. We used SNORT 2.9.2.3 as a dedicated NIDS with minimum configuration, and we enabled logging and alerting as the output methods.

/var/log/snort is the default log location, while we configured the alerts to be saved to an alert file (/home/log/alert.csv) with the timestamp, sig. operator, sig. id, sig. rev, alert message, and other remaining attributes.

The CentOS command to execute Snort is as follows: "# snort -i eth4 -c /machine location". SNORT is configured using snort.conf, which stores the details of the SNORT attributes, and it limits the traffic in the network. It captures the packets with their payload and compares them with the rules in the database. Each flow is classified according to the match. We used the SNORT default rule set to evaluate the performance of SNORT in P2P classification [29-32].
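One way to turn such an alert file into flow labels is sketched below in Python. The column order in `COLUMNS` is an assumption (timestamp, signature generator, signature id, revision, message, then protocol and endpoints) and must be adapted to the fields actually configured; the sample alert line is made up for illustration, and `alerted_flows` and `label_flow` are hypothetical helper names.

```python
import csv
import io

# Assumed column order of the alert file; adjust to the actual configuration.
COLUMNS = ["timestamp", "sig_generator", "sig_id", "sig_rev", "msg",
           "proto", "src", "srcport", "dst", "dstport"]


def alerted_flows(alert_file):
    """Return the set of 5-tuples that triggered at least one alert."""
    flows = set()
    for row in csv.DictReader(alert_file, fieldnames=COLUMNS):
        flows.add((row["proto"], row["src"], row["srcport"],
                   row["dst"], row["dstport"]))
    return flows


def label_flow(flow, alerted):
    """Label a flow P2P if either direction of it matched a rule."""
    proto, src, sport, dst, dport = flow
    reverse = (proto, dst, dport, src, sport)
    return "P2P" if flow in alerted or reverse in alerted else "nP2P"


# Made-up alert line, for illustration only.
SAMPLE = ('01/01-10:00:00.000000,1,1000001,1,"example P2P rule",'
          'UDP,10.0.0.5,6881,192.0.2.7,51413\n')

if __name__ == "__main__":
    alerted = alerted_flows(io.StringIO(SAMPLE))
    print(label_flow(("UDP", "192.0.2.7", "51413", "10.0.0.5", "6881"), alerted))
    print(label_flow(("TCP", "10.0.0.9", "40000", "192.0.2.80", "80"), alerted))
```

Matching both directions of the 5-tuple is a design choice of this sketch, so that a flow is labelled consistently regardless of which endpoint triggered the rule.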

B. Features of online traffic

The focus of the research is to reduce the set of features and enhance the classification of the traffic dataset.

[Fig. 3 diagram: Moore's 248 features are passed through three feature selection algorithms (Consistency, Fuzzy-rough, and Chi-Squared), and each selected subset is tested with an SVM classifier. The three SVM panels report an accuracy of P2P classification of 91.7% (build time 2.01 s, 8 features), 90% (build time 19.8 s, 7 features), and 97.5% (build time 27.53 s, 12 features). The three selections are integrated into candidate features (on-line: 7, offline: 20); the on-line features are extracted and used by a J48 classifier model that outputs P2P or nP2P.]

Fig. 3. Illustration of extraction of online features

The process of collecting online features is simple. It can be done before the completion of the flow. The Chi-Square and fuzzy-rough algorithms are employed in P2P traffic detection. The evaluation of features produces an optimal online P2P traffic classification. Fig. 3 illustrates the phases involved in the extraction. The subset of features of traffic data and classification is listed in Table I. The performance of J48 is measured using the optimal features.

TABLE I: PROPOSED FEATURE SUBSET FOR P2P DETECTION [33]

No   Feature         Description
1    Server Port     Destination port number
2    IAT             Inter-arrival time
3    data ip         Memory in bytes (IP packet)
4    data ip g→h     Memory in bytes (download)
5    IAT g→h         Inter-arrival time (download)
6    data ip g→h     Memory in bytes (upload)
7    class           Application class
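The sketch below shows, under stated assumptions, how features of the kind listed in Table I could be computed from the first packets of a flow before the flow completes. The packet representation (timestamp in seconds, IP bytes, direction), the use of the mean as the inter-arrival statistic, and all names in the code are assumptions made for this example; the paper does not specify them.

```python
from statistics import mean


def _mean_iat(timestamps):
    """Mean gap between consecutive timestamps (0.0 if fewer than two)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return mean(gaps) if gaps else 0.0


def online_features(packets, server_port):
    """packets: list of (timestamp_s, ip_bytes, direction), direction 'down' or 'up'."""
    down = [(t, size) for t, size, d in packets if d == "down"]
    up = [(t, size) for t, size, d in packets if d == "up"]
    return {
        "server_port": server_port,
        "iat": _mean_iat([t for t, _, _ in packets]),
        "data_ip": sum(size for _, size, _ in packets),
        "data_ip_down": sum(size for _, size in down),
        "iat_down": _mean_iat([t for t, _ in down]),
        "data_ip_up": sum(size for _, size in up),
    }


if __name__ == "__main__":
    # Three made-up packets from the start of one flow.
    first_packets = [(0.0, 100, "up"), (0.5, 1500, "down"), (1.5, 1500, "down")]
    print(online_features(first_packets, server_port=6881))
```

Because every value is an incremental sum or a running mean over the packets seen so far, the feature vector can be refreshed as each packet arrives.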

C. Online P2P ML algorithm

This subsection explains the algorithms used to classify the traffic data. First, a model is created based on three individual algorithms: SVM [26], decision tree (C4.5), and Artificial NN. These algorithms are used because they classify P2P traffic better than the other algorithms; we investigated more than ten algorithms for this purpose. The classifiers are then measured in practice in terms of effectiveness and efficiency. Furthermore, a multi-classifier model is built and evaluated. Figs. 4 and 5 and Table II depict the process of the P2P ML classifier.
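The text does not specify how the individual classifiers are combined into the multi-classifier model. Purely as an assumed illustration, the Python sketch below uses soft voting: it averages the P2P probabilities reported by the member classifiers and thresholds the result. The function name `soft_vote`, the threshold of 0.5, and the example probabilities are all hypothetical.

```python
def soft_vote(p2p_probabilities, threshold=0.5):
    """Combine per-classifier P2P probabilities by averaging them.

    p2p_probabilities: one probability per member classifier
    (for example, one from an ANN and one from a decision tree).
    """
    if not p2p_probabilities:
        raise ValueError("at least one classifier output is required")
    average = sum(p2p_probabilities) / len(p2p_probabilities)
    return "P2P" if average >= threshold else "nP2P"


if __name__ == "__main__":
    # Made-up outputs of two member classifiers for two flows.
    print(soft_vote([0.9, 0.4]))
    print(soft_vote([0.2, 0.6]))
```

Other combination schemes (majority voting, stacking, or cascading a fast classifier before a slower one) would fit the same interface; which one the authors used is not stated.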

[Fig. 4 diagram: two pipelines. (A) Off-line classifier: traffic traces are sniffed by Snort and split into flows; features are extracted and selected at the flow level as input to the ML classifier model; the output (controlled traffic) is validated using a DPI classifier. (B) On-line classifier: online traffic is processed at the flow level by the classifier model; the output (controlled traffic) is validated using a manual classifier. The diagram also contains a Tcpdump block.]

Fig. 4. Research process of on-line P2P classifier

[Fig. 5 flow chart: Start; capture data and save the captured data to file1; sniff using Snort and save the sniffed data to file2; create labeled data and open flow information; validation; apply the on-line feature selection algorithm; apply generated examples using machine learning; validation; understand the effective mitigation strategies for online traffic management; suggest a suitable level of BW management; validity check; overall system integration and evaluation; satisfaction check; evaluation and results; End. Each check has yes and no branches.]

Fig. 5. Flow Chart

TABLE II: STEPS OF BUILDING THE MODEL

Algorithm 1: P2P classifier

Input
    P = {p1, p2, …, pn}: packet traffic
    X = {f1, f2, …, fn}: flow information
    D = {f1, f2, …, fn, c}: labeled dataset
    F = Fsub0 ∪ Fsub1: Fsub0 is the offline features, Fsub1 is the online features

Output
    Class of each flow (P2P or nP2P)

Steps
    System: sudo tcpdump tcp or udp            (capture packets)
    For each packet {
        Extract packet level information
        Write row to File1                     (save captured data)
    }
    Sniff File1 with SNORT; save sniffed data to File2

    For each row {                             (create labeled data)
        Add class c
        Read D = [f1, f2, …, fn, c]
    }

    Split flow information:
    For each file/packet {
        If the packet (IP_src/IP_dst) belongs to an existing flow
            Add packet to that flow
        Else
            Start a new flow
    }

    Read D = [f1, f2, …, fn, c]; F = F0 ∪ F1
    Create array to save flow features
    Open online flow information
    Extract the online features
    Run the algorithm
    Create array to save the P2P flows and nP2P flows

    Build the model {
        In: online features
        Out: class
        Classifier
    }

    Evaluation (check) {
        Write data
        Read P2P and nP2P classes
        Close
    }
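The flow-splitting step of Algorithm 1 can be sketched in Python as follows. The algorithm only mentions IP_src/IP_dst; keying a flow on protocol, addresses, and ports and treating both directions as one flow are assumptions of this sketch, as are the dictionary field names.

```python
from collections import defaultdict


def flow_key(pkt):
    """Direction-independent key: protocol plus the two sorted endpoints."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (pkt["proto"],) + tuple(sorted([a, b]))


def split_into_flows(packets):
    """Add each packet to its existing flow, or start a new flow for it."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


if __name__ == "__main__":
    # Made-up packets: the first two belong to the same bidirectional flow.
    packets = [
        {"proto": "tcp", "src": "10.0.0.1", "sport": 1000, "dst": "10.0.0.2", "dport": 80},
        {"proto": "tcp", "src": "10.0.0.2", "sport": 80, "dst": "10.0.0.1", "dport": 1000},
        {"proto": "udp", "src": "10.0.0.1", "sport": 6881, "dst": "10.0.0.3", "dport": 6881},
    ]
    for key, members in split_into_flows(packets).items():
        print(key, len(members))
```

A production flow splitter would also expire idle flows with a timeout, which is omitted here for brevity.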

V. RESULTS AND DISCUSSION

This section presents the traffic traces and the metrics used to evaluate the performance of the proposed and existing methods.

A. Data preprocessing

Datasets were downloaded from Universiti Teknologi Malaysia with proper permissions. Tables III to VI show the datasets of traffic flows with their sizes. Academic and affiliated college datasets are included in the downloaded dataset. The dataset contains 1834122 packets (approx. 15365 flows) from different parts of the network. The first three datasets were downloaded in the period between July and October 2011 and were captured using WireShark [34]. Dataset 4 was downloaded in October 2012 using TCPDUMP and analyzed by SNORT [35]. The fifth dataset was combined traffic, downloaded in November 2012. Table III shows the details of the datasets.

TABLE III: DATASETS

Dataset   Application   Packets   Flows
1         eMule         25970     415
2         PPlive        88925     861
3         BitTorrent    951749    12012
4         Mix           700969    1463
5         HTTP          66509     614

The CAIDA dataset [36] contains active and passive values of Internet connections. The active values are used to find the latency. The passive values are calculated by the network operators. Researchers can access the datasets with appropriate permission from University of California. The datasets of the years 2009 to 2013 can be downloaded with a secure login. The first quarter of the flow is generated through TCP trace. The first set has a size of 897 MB, the next set 1.11 GB, and the last set 703 MB.

TABLE IV: THE TRACES OF CAIDA DATASETS [36]

Dataset
equinix-****.20130117-8***.UTC.anon
equinix-****.20130117-****.UTC.anon
equinix-****.20130221-****.UTC.anon

The University of Brescia (UNIBS) traces contain the flows captured between September and October 2009. The traces were generated by a collection of workstations. The edge router was connected to the Internet at a speed of 100 Mbps to download the traffic flows. A dedicated hard disk was used to store the traces with the help of an AIA controller. The traces contain a total of 78998 flows, which include Web (61.2%), Mail (5.7%), P2P traffic (32.9%), and others (0.2%). The first set is 317 MB, the second is 236 MB, and the last is 1.94 GB.

TABLE V: THE TRACES OF UNIBS DATA SETS [37]

Dataset
unibs20090930.anon
unibs20091001.anon
unibs20091002.anon

The Cambridge datasets [38] were captured on the Genome Campus network in August 2003 and are provided by the University of Cambridge for researchers. In addition to the datasets discussed above, ten different datasets from this collection were used in the research. They cover a wide range of TCP flows and are high dimensional, with 248 features.

TABLE VI: THE SAMPLES OF THE CAMBRIDGE DATA SETS [38]

Dataset   Instances (flows)
1         24863
2         23801
3         22932
4         22285
5         21648
6         19384
7         55835
8         55494
9         66248
10        65036

B. Evaluation Metric

Accuracy and computation cost are the evaluation metrics applied to the proposed approach. Retrieval capacity is one of the features of the classifier. True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) are the attributes of the classifier that reveal its performance. Timers for the training and testing phases are also recorded to measure the computation time [39].
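For reference, the standard rates derived from the four counts can be computed as in the Python sketch below. The function name and the example counts are made up for illustration and are not results from this paper.

```python
def classification_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, true positive rate, and false positive rate from raw counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tp_rate": tp / (tp + fn),   # share of P2P flows correctly detected
        "fp_rate": fp / (fp + tn),   # share of nP2P flows flagged as P2P
    }


if __name__ == "__main__":
    # Illustrative counts: 100 P2P flows and 100 nP2P flows.
    print(classification_rates(tp=90, fp=5, tn=95, fn=10))
```

Reporting the false positive rate alongside accuracy matters for mitigation, because a false positive means that legitimate non-P2P traffic would be throttled.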

[Fig. 6 diagram. Components shown: colleges area, academic area, colleges router, admin router, main router, switch, hub, Packeteer, and a server (platform: CentOS, 2.4 GHz Intel CPUs, 1 GB DDR memory, Ultra320 SCSI drives) running Snort and Jflow.]

Fig. 6. Framework of Data Preparation


D. Comparison of Results

Table VII and Fig. 7 show the classification performance of the proposed approach. In training, the benefit (accuracy) of the individual models is 98.8% for the neural network classifier and 98.74% for the decision tree, while in testing it is 98.78% and 98.71%, respectively. The accuracy of the multi-classifier shows a significant improvement: the benefit is 99.72% for training and 99.72% for testing, and the cost is 0.28.

Table VIII presents the comparison between our proposed approach, the hybrid Naïve Bayes Tree (NBTree), and the port-based method. Our proposed approach has the lowest FP rate, 0.28%. Moreover, our classifier speeds up classification compared with NBTree (416 s) and the port-based method (4 s) on the same dataset. This improvement results from using a multi-classifier and from reducing the number of features.

Table IX presents the validation of the proposed approach. The classification results show that the system provides higher accuracy than SNORT and individual ML classifiers. The approach is also able to detect the emergence of new P2P applications.

TABLE VII: THE EVALUATION RESULTS

Partition   Classifier   TP       FP
Training    ANN          98.43%   1.57%
Training    J48          98.58%   1.42%
Training    SVM          98.00%   2.00%
Training    ANN+J48      99.9%    0.1%
Testing     ANN          98.31%   1.69%
Testing     J48          98.46%   1.54%
Testing     SVM          97.90%   2.10%
Testing     ANN+J48      99.72%   0.28%

Fig. 7. Evaluation of ANN and C5.0 using Gains Chart

TABLE VIII: COMPARISON OF OUR METHOD, HYBRID NBTREE AND PORT-BASED METHOD

Methods        TP      FP      TN      FN     Time (s)
Port based     97.7%   5.2%    94.8%   2.3%   4.09
NBTree         99.5%   0.3%    99.7%   0.5%   416.32
Our proposal   99.7%   0.28%   97.2%   0.3%   1.94

TABLE IX: THE VALIDATION OF THE PROPOSED APPROACH

Methods           Our method   Snort   ML
Need port info    Yes          No      Yes
Accuracy          High         High    Fluctuant
Unknown P2P       Yes          No      Yes
Online learning   Yes          No      Yes

E. Mitigation strategy

The dynamic mitigation strategy is given in the following table:

TABLE X: THE DYNAMIC MITIGATION STRATEGY

Strategy 1
    Categorize the application into:
        Critical application
        Online (sensitive to latency) application
        P2P applications or protocols
        Inbound P2P application
        Outbound P2P applications/protocols
    Calculate bandwidth management statistics
    Create bandwidth classes

Dynamic strategy
    Set min bandwidth to critical applications
    Set med bandwidth to sensitive to latency application
    Set max bandwidth to P2P application
    Limit total P2P traffic to 30% of the link capacity
    Limit P2P outbound bandwidth to 24kbps per user
    Limit offline P2P Internet traffic
    No limit online P2P traffic
    Provide optional P2P bandwidth-on-demand to allow users more bandwidth
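The two numeric limits in Table X can be turned into concrete ceilings as in the Python sketch below. The 30% share and the 24 kbps per-user outbound limit come from the table; the function name, the example link capacity and user count, and the choice to cap the aggregate outbound rate by the total P2P ceiling are assumptions of this sketch.

```python
P2P_SHARE_PERCENT = 30           # total P2P traffic: 30% of link capacity
P2P_OUTBOUND_PER_USER_KBPS = 24  # outbound P2P limit per user


def p2p_limits(link_capacity_kbps: float, p2p_users: int) -> dict:
    """Ceilings for P2P traffic on one link, in kbps."""
    total_ceiling = link_capacity_kbps * P2P_SHARE_PERCENT / 100
    # Assumption: aggregate outbound P2P may never exceed the total P2P ceiling.
    outbound_ceiling = min(P2P_OUTBOUND_PER_USER_KBPS * p2p_users, total_ceiling)
    return {
        "p2p_total_kbps": total_ceiling,
        "p2p_outbound_kbps": outbound_ceiling,
    }


if __name__ == "__main__":
    # Hypothetical 100 Mbps link with 50 active P2P users.
    print(p2p_limits(link_capacity_kbps=100_000, p2p_users=50))
```

In a deployment, ceilings of this kind would be handed to whatever traffic shaper enforces the bandwidth classes; that enforcement layer is outside the scope of the sketch.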

VI. CONCLUSION

The growth of the Internet has changed the nature of network traffic. Managing network traffic improves network performance, and classifying traffic flows improves the efficiency of the network. This paper has proposed a technique that uses SNORT rules and ML to detect and mitigate online traffic flows. The experimental results confirm that the classifier achieves high accuracy and low computation cost. The SVM classifier obtained 99.5% accuracy in less computation time than the other classifiers.

ACKNOWLEDGMENT

The authors would like to express their gratitude to the University of Cambridge, the University of Brescia, and CAIDA for sharing the datasets. We are thankful for the cooperation and support of the University of Elimam Elmahdi.

REFERENCES

[1] Jamil, H.A. and B. M Ali, Classifying Internet Traffic Using an Efficient Classifier. International Journal of Recent Technology and Engineering (IJRTE), 2019. 8(3).

[2] Jamil, H.A., Feature Selection and Machine Learning Classification for Live P2P Traffic. IJEOM, 2019.

[3] Abdalla, B.M.A., et al. Multi-stage Feature Selection for On-Line Flow Peer-to-Peer Traffic Identification. in Asian Simulation Conference. 2017. Springer.

[4] Jamil, H.A., A. Abdalla, and B. M K, Improving P2P Network Traffic Classification with ML multi-classifiers. International Journal of P2P Network Trends and Technology (IJPTT), 2014. 4(2).

[5] Ibrahim, H.A.H., S.M. Nor, and H.A. Jamil. Online hybrid internet traffic classification algorithm based on signature statistical and port methods to identify internet applications. in 2013 IEEE International Conference on Control System, Computing and Engineering. 2013. IEEE.

[6] Jamil, H.A., Detection and Mitigation Framework of Peer-to-Peer Traffic in Campus Networks. International Review on Computers and Software (I.RE.CO.S.), 2013. 8(8).

[7] O. Mula-Valls, "A practical retraining mechanism for network traffic classification in operational environments," Master Thesis in Computer Architecture, Networks and Systems, Universitat Politecnica de Catalunya, 2011.

[8] M. M. Hassan and M. Marsono, "A three-class heuristics technique: Generating training corpus for Peer-to-Peer traffic classification," in Internet Multimedia Services Architecture and Application (IMSAA), 2010 IEEE 4th International Conference on, 2010, pp. 1-5.

[9] H. Lu and C. Wu, "Identification of P2P traffic in campus network," 2010, pp. V1-21-V1-23.

[10] A. Moore and K. Papagiannaki, "Toward the accurate identification of network applications," Passive and Active Network Measurement, pp. 41-54, 2005.

[11] A. W. Moore and D. Zuev, "Internet traffic classification using bayesian analysis techniques," 2005, pp. 50-60.

[12] J. Erman, A. Mahanti, M. Arlitt, I. Cohen, and C. Williamson, "Offline/realtime traffic classification using semi-supervised learning," Performance Evaluation, vol. 64, pp. 1194-1213, 2007.

[13] L. Bernaille, R. Teixeira, I. Akodkenou, A. Soule, and K. Salamatian, "Traffic classification on the fly," ACM SIGCOMM Computer Communication Review, vol. 36, pp. 23-26, 2006.

[14] J. Erman, M. Arlitt, and A. Mahanti, "Traffic classification using clustering algorithms," in ACM SIGCOMM 2006 - Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, September 11, 2006 - September 15, 2006, Pisa, Italy, 2006, pp. 281-286.

[15] N. Williams, S. Zander, and G. Armitage, "A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification," ACM SIGCOMM Computer Communication Review, vol. 36, pp. 5-16, 2006.

[16] T. Auld, A. W. Moore, and S. F. Gull, "Bayesian neural networks for internet traffic classification," Neural Networks, IEEE Transactions on, vol. 18, pp. 223-239, 2007.

[17] Y. Ma, Z. Qian, G. Shou, and Y. Hu, "Study of information network traffic identification based on C4.5 algorithm," 2008, pp. 1-5.

[18] Y. Luo, "Survey on P2P traffic managements," vol. 145 AISC, ed. Bali, 2012, pp. 191-196.

[19] K. Salah and A. Kahtani, "Performance evaluation comparison of Snort NIDS under Linux and Windows Server," Journal of Network and Computer Applications, vol. 33, pp. 6-15, Jan 2010.

[20] K. Salah and F. Haidari, "Performance evaluation and comparison of four network packet rate estimators," Aeu-International Journal of Electronics and Communications, vol. 64, pp. 1015-1023, 2010.

[21] D. A. Carvalho, M. Pereira, and M. M. Freire, "Towards the Detection of Encrypted BitTorrent Traffic through Deep Packet Inspection," in Security Technology, ed: Springer, 2009, pp. 265-272.

[22] (2012). Emergingthreats (ET) Rules. Available: http://rules.emergingthreats.net/open/snort-2.9.0/rules/emerging-p2p.rules

[23] J.-j. Zhao, X.-h. Huang, Q. Sun, and Y. Ma, "Real-time feature selection in traffic classification," The Journal of China Universities of Posts and Telecommunications, vol. 15, Supplement, pp. 68-72, 2008.

[24] H. A. Jamil, R. Zarei, N. O. Fadlelssied, M. Aliyu, S. M. Nor, and M. N. Marsono, "Analysis of features selection for P2P traffic detection using support vector machine," in Information and Communication Technology (ICoICT), 2013 International Conference of, 2013, pp. 116-121.

[25] A. W. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Technical report, Intel Research, Cambridge, 2005.

[26] (2012). Support vector machines (SVM). Available: http://www.support-vector-machines.org

[27] R. Wang, Y. Liu, Y. Yang, and H. Wang, "A new method for P2P traffic identification based on support vector machine," Artificial Intelligence Markup Language. Egypt: IEEE Computer Society, pp. 58-63, 2006.

[28] A. Nogueira, P. Salvador, A. Couto, and R. Valadas, "Towards the On-line Identification of Peer-to-peer Flow Patterns," Journal of Networks, vol. 4, 2009.

[29] (2012). Peer-to-Peer rules for snort. Available: http://rules.emergingthreats.net/open/snort-2.9.0/rules/emerging-p2p.rules

[30] (2012). SOURCEfire. Available: http://www.sourcefire.com/security-technologies/snort/snort-rules

[31] (2013). SANS detecting-torrents-snort. Available: http://www.sans.org/reading-room/whitepapers/detection/detecting-torrents-snort-33144

[32] (2012). Snort community-rules. Available: http://www.snort.org/snort-rules

[33] H. A. Jamil, A. M, A. Hamza, S. M. Nor, and M. N. Marsono, "Selection of online Features for Peer-to-Peer Network Traffic Classification," in Recent Advances in Intelligent Informatics. vol. 235, ed: Springer International Publishing, 2014, pp. 379-390.

[34] (2010). Wireshark. Available: http://www.wireshark.org

[35] SNORT Network Intrusion Detection System. Available: www.snort.org

[36] (2013, 10 April 2013). The Cooperative Association for Internet Data Analysis. Available: http://www.caida.org/data

[37] (19 Nov). Università Brescia data sets. Available: http://www.ing.unibs.it/ntw/tools/traces/download/

[38] (18 nov 2012). Cambridge data sets. Available: http://www.cl.cam.ac.uk/research/srg/netos/nprobe/data/papers/sigmetrics/index.html

[39] H. L. Zhang, G. Lu, M. T. Qassrawi, Y. Zhang, and X. Z. Yu, "Feature selection for optimizing traffic classification," Computer Communications, vol. 35, pp. 1457-1471, Jul 1 2012.

Haitham Ahmed Jamil is an assistant professor in the Faculty of Computer Science and Information Technology at Elimam Elmahdi University. He received the B.Sc. and M.Sc. from University of Gezira in Sudan and the PhD from the Faculty of Electrical Engineering at University Technology Malaysia. His research interests include computer networks, network traffic classification, Peer-to-Peer computing, and optimization techniques.

Bushra Mohammed Ali is a PhD candidate at the Faculty of Engineering, University Technology Malaysia. He obtained the B.Sc. and M.Sc. from University of Gezira, Sudan. He is a lecturer at University of Kordofan. His research interests include computer architecture, network traffic classification and control, Artificial Intelligence, and optimization techniques.

Mosab Hamdan is a PhD student in the VECAD research group at University Technology Malaysia. He obtained a BSc in Electronic and Electrical Engineering from University of Science and Technology (Sudan) and an MSc in computer architecture and networking from University of Khartoum (Sudan). His current research interests are Software Defined Networking, load balancing, network traffic classification, and future networks.

Ahmed Elhaj Osman is a PhD student in the school of Electrical Engineering (faculty of Engineering) at University Technology Malaysia. He obtained his Bachelor degree from University of Khartoum and MSc in Computer Engineering and Networking from University of Gezira (Sudan). His current research interests are Software Defined Networking, Network Communication, and Network Disaster.