Applying Image Analysis Methods to Network Traffic Classification

Academic year: 2021

Motivation Analysing Network Traffic with GLCM Parameters Summary and Results

Thorsten Kisner, Alex Essoh and Firoz Kaderali

Department of Communication Systems, Faculty of Mathematics and Computer Science

FernUniversität in Hagen, Germany

SPRING 2007

SPRING SIDAR Graduierten-Workshop über Reaktive Sicherheit

Outline

1 Motivation
    Texture Analysis Methods
    Network Traffic

2 Analysing Network Traffic with GLCM Parameters
    Determining the GLCM matrix size
    Evaluation of feature vectors

3 Summary and Results
    Accuracy of classification
    Conclusion and Future Work

Grey Level Co-occurrence Matrix

Definition

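Grey Level Co-occurrence Matrix (GLCM) $C(\delta, T) = [s(i, j, \delta, T)]$ for texture analysis [1] [2].

$s(i, j, \delta, T)$ is a second order probability of going from one grey level $i$ to another grey level $j$ given the displacement vector $\delta = (\Delta x, \Delta y)$:

$$s(i, j, \delta, T) = \frac{\Theta\{\vec{x} \mid \vec{x}, \vec{x} + \delta \in T,\ g(\vec{x}) = i,\ g(\vec{x} + \delta) = j\}}{\Theta\{\vec{x} \mid \vec{x}, \vec{x} + \delta \in T\}} \qquad (1)$$

Here $\Theta$ counts the elements of a set, $T$ is the analysis window and $g(\vec{x})$ is the grey level at position $\vec{x}$.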
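Eq. (1) can be illustrated for a one-dimensional series such as a traffic trace. The following is only a minimal sketch: the function name `cooccurrence_matrix`, the use of a scalar lag `delta` in place of the two-dimensional displacement vector, and the toy input are assumptions made for illustration and are not taken from the slides.

```python
from collections import Counter


def cooccurrence_matrix(levels, n_levels, delta=1):
    """Return the normalised co-occurrence matrix s(i, j) of a 1-D series.

    levels   -- list of integer grey levels in range(n_levels); the list
                itself plays the role of the window T in eq. (1)
    n_levels -- number of grey levels (the matrix is n_levels x n_levels)
    delta    -- positive lag between the two positions (assumed 1-D
                stand-in for the displacement vector)
    """
    # Count every pair (g(x), g(x + delta)) with both positions inside T.
    pairs = Counter(zip(levels, levels[delta:]))
    total = sum(pairs.values())  # denominator of eq. (1)
    matrix = [[0.0] * n_levels for _ in range(n_levels)]
    if total == 0:
        return matrix  # window shorter than the lag: no pairs to count
    for (i, j), count in pairs.items():
        matrix[i][j] = count / total
    return matrix


# Hypothetical toy window with three grey levels.
window = [0, 0, 1, 1, 2, 2, 0, 0]
s = cooccurrence_matrix(window, n_levels=3, delta=1)
```

The toy window contains seven neighbouring pairs, two of which are (0, 0), so `s[0][0]` equals 2/7 and all entries sum to 1.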

Parameters describing a texture

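$$\text{Angular Second Moment} = \sum_i \sum_j \big(s(i,j)\big)^2 \qquad (2)$$

$$\text{Entropy} = -\sum_i \sum_j s(i,j) \cdot \log\big(s(i,j)\big) \qquad (3)$$

$$\text{Inverse Difference Moment} = \sum_i \sum_j \frac{s(i,j)}{1 + (i-j)^2} \qquad (4)$$

$$\text{Inertia} = \sum_i \sum_j (i-j)^2 \cdot s(i,j) \qquad (5)$$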

(2) describes the energy of the matrix, (3) the information content. (5) can be interpreted as the contrast and (4) as an inverse weighted measure of contrast.
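As a rough illustration of eqs. (2) to (5), the four parameters can be computed from a normalised co-occurrence matrix as sketched below. The function name `glcm_features`, the choice of the natural logarithm and the convention that zero entries contribute nothing to the entropy are assumptions; the slides do not state the base of the logarithm.

```python
import math


def glcm_features(s):
    """Compute eqs. (2)-(5) from a normalised co-occurrence matrix s."""
    asm = entropy = idm = inertia = 0.0
    for i, row in enumerate(s):
        for j, p in enumerate(row):
            asm += p * p                      # eq. (2), energy
            if p > 0.0:                       # treat 0 * log(0) as 0
                entropy -= p * math.log(p)    # eq. (3), natural log assumed
            idm += p / (1.0 + (i - j) ** 2)   # eq. (4)
            inertia += (i - j) ** 2 * p       # eq. (5), contrast
    return {"asm": asm, "entropy": entropy, "idm": idm, "inertia": inertia}


# Two hypothetical 2x2 matrices: all mass on the diagonal (no level changes)
# versus all mass off the diagonal (every step changes the level).
smooth = glcm_features([[0.5, 0.0], [0.0, 0.5]])
rough = glcm_features([[0.0, 0.5], [0.5, 0.0]])
```

Both toy matrices have the same energy and entropy, but the off-diagonal one has inertia 1.0 and inverse difference moment 0.5, whereas the diagonal one has inertia 0.0 and inverse difference moment 1.0. This matches the reading of (5) as contrast and (4) as its inverse weighted counterpart.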

Network Traffic

In- and outgoing traffic, two types: SMTP and HTTP.

Measured at the gateway to the external network with the built-in packet and byte counter of iptables (1 second resolution in time).

70 independent traces of 9 hours (weekdays between 7:30am and 4:30pm) for each type of traffic: 60 traces for training data and 10 traces for verification.

Like the windowing mechanism in texture analysis (see T in eq. (1)), we divide each 9 hour time series into 6 segments of 90 minutes.
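The segmentation step can be sketched as follows. This is a minimal sketch that assumes one sample per second, in line with the stated 1 second resolution; the helper name `split_into_segments` and the all-zero placeholder trace are hypothetical.

```python
def split_into_segments(series, n_segments=6):
    """Split a series into n_segments windows of equal length.

    Trailing samples that do not fill a whole segment are dropped.
    """
    segment_length = len(series) // n_segments
    return [
        series[k * segment_length:(k + 1) * segment_length]
        for k in range(n_segments)
    ]


# Placeholder for one 9 hour trace sampled once per second (assumption).
trace = [0] * (9 * 60 * 60)
segments = split_into_segments(trace)
segment_lengths = [len(segment) for segment in segments]
```

Under this assumption each of the 6 segments covers 5400 samples, which corresponds to 90 minutes.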

Determining the GLCM matrix size

In texture analysis the size of the co-occurrence matrix is explicitly given by the range of the greyscale values.

In our scenario the source for the co-occurrence is a time series with no explicitly given limit for the values.

A huge matrix size on the order of $10^7 \times 10^7$ doesn't make sense, thus requiring quantisation. We analysed a linear quantisation to a matrix size of $2^i$ with …
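A linear quantisation of raw counter values to a fixed number of levels might look like the sketch below. The function name `quantise_linear`, the use of the minimum and maximum of the series as the quantisation range, and the sample values are assumptions; the slides do not say how the range was chosen.

```python
def quantise_linear(series, n_levels):
    """Map each value linearly onto an integer level in range(n_levels)."""
    lowest, highest = min(series), max(series)
    if highest == lowest:
        return [0] * len(series)  # constant series: a single level
    width = (highest - lowest) / n_levels  # width of one quantisation bin
    # The maximum value would land in bin n_levels, so clamp it to the top bin.
    return [min(int((value - lowest) / width), n_levels - 1) for value in series]


# Hypothetical byte counts per interval, quantised to 2**3 = 8 levels.
byte_counts = [0, 1_000_000, 2_500_000, 4_500_000]
levels = quantise_linear(byte_counts, n_levels=2 ** 3)
```

With 8 levels the resulting co-occurrence matrix is 8 x 8 regardless of how large the raw byte counts are.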

Figure: Parameters as a function of matrix size (x axis: size of GLCM, log 2). Panel "Linearly Dependent": Inertia, Cluster Shade, Cluster Prominence. Panel "Not Dependent": Inverse Difference Moment, Correlation, Angular Second Moment, Entropy.

Figure: Example of network traffic, plotted as bytes per time interval T_i over the time intervals T_i.

Evaluation of feature vectors

Figure: Histograms of selected GLCM parameters (panels ASM, ENT, CORR, IDM, INE and CP) for SMTP traffic and HTTP traffic.

Inverse Difference Moment (IDM) and Correlation (CORR) plotted against each other.

The two classes overlap, but clustering can be observed.

Figure: Scatter plot of IDM against CORR for SMTP traffic and HTTP traffic.

Accuracy of classification

k-Nearest-Neighbor (kNN) algorithm with k = 5 to classify the 120 segments of unknown traffic (10 verification traces of 6 segments each for both traffic types) into the classes SMTP or HTTP.

Only the four most relevant parameters are used (Angular Second Moment (2), Entropy (3), Inverse Difference Moment (4) and Inertia (5)).
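The classification step can be sketched with a plain kNN vote as below. This is not the authors' implementation: the Euclidean distance, the absence of feature scaling, the function name `knn_classify` and the two-dimensional toy feature vectors are all assumptions made for illustration (the slides use four parameters per segment and do not state the distance measure).

```python
import math
from collections import Counter


def knn_classify(training, query, k=5):
    """Return the majority label among the k training vectors nearest to query.

    training -- list of (feature_vector, label) pairs
    query    -- feature vector of the segment to classify
    """
    # Sort by Euclidean distance (an assumed metric) and keep the k nearest.
    neighbours = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training vectors; real ones would hold ASM, entropy, IDM
# and inertia of each training segment.
training = [
    ((0.10, 1.80), "HTTP"), ((0.15, 1.70), "HTTP"), ((0.12, 1.90), "HTTP"),
    ((0.80, 0.40), "SMTP"), ((0.85, 0.50), "SMTP"), ((0.75, 0.30), "SMTP"),
]
label_a = knn_classify(training, (0.20, 1.60), k=5)
label_b = knn_classify(training, (0.70, 0.50), k=5)
```

With two classes an odd k such as 5 avoids tied votes, which may be one reason for that choice.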

Traffic   Positive   Negative   Classification rate
HTTP      55         5          91.67%
SMTP      52         8          86.67%
Total     107        13         89.17%
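The classification rates in the table can be reproduced from the counts. The sketch assumes that "Positive" means correctly classified segments and "Negative" means misclassified segments, which is consistent with the reported percentages; the helper name `classification_rate` is hypothetical.

```python
def classification_rate(positive, negative):
    """Share of correctly classified segments, in percent."""
    return 100.0 * positive / (positive + negative)


# (positive, negative) counts per traffic type, taken from the table.
results = {"HTTP": (55, 5), "SMTP": (52, 8)}

rates = {name: round(classification_rate(p, n), 2) for name, (p, n) in results.items()}
total_positive = sum(p for p, _ in results.values())
total_negative = sum(n for _, n in results.values())
total_rate = round(classification_rate(total_positive, total_negative), 2)
```

Each traffic type contributes 60 segments, so the total rate is computed over 120 segments.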

Conclusion

Novel approach for identifying network traffic by mapping given time series to the co-occurrence matrix known from the domain of texture analysis.

Using texture analysis methods we classified even inaccurate and aggregated data with an accuracy of about 90% (89.17% over the 120 verification segments).

Future Work

Analysis of multi-dimensional time series.

Examination of network traffic with the proposed method on packet level, also including network flow information.

Implementing a visualisation framework based on Grey …

For Further Reading

[1] R. M. Haralick, K. Shanmugam and I. Dinstein, Textural features for image classification, IEEE Transactions on Systems, Man, and Cybernetics, 3(6), November 1973, 610-621.

[2] R. W. Conners, M. M. Trivedi and C. A. Harlow, Segmentation of a High-Resolution Urban Scene using Texture Operators, Computer Vision, Graphics and Image Processing, 25, 1984, 273-310.
