
Monitoring Network Traffic to Detect Stepping-Stone Intrusion

¹Jianhua Yang, ¹Byong Lee, ²Stephen S. H. Huang

¹Department of Math and Computer Science, Bennett College
E-mail: {jhyang, blee}@bennett.edu

²Department of Computer Science, University of Houston
E-mail: shuang@cs.uh.edu

Abstract

Most network intruders tend to use stepping-stones to attack or invade other hosts in order to reduce the risk of being discovered. Many approaches to detecting stepping-stones have been proposed since 1995. One of them, proposed by A. Blum, detects a stepping-stone by checking whether the difference between the number of Send packets of an incoming connection and that of an outgoing connection is bounded. One weakness of this method is its limited resistance to intruders’ evasion, such as chaff perturbation. In this paper, we propose a method based on random walk theory to detect stepping-stone intrusion. Our theoretical analysis shows that the proposed method is more effective than Blum’s approach in terms of resisting intruders’ chaff perturbation.

1. Introduction

Most intruders tend to invade a computer host by launching their attacks through a chain of compromised computers, which are called stepping-stones [1]. The attackers are called stepping-stone intruders. One obvious reason why intruders use stepping-stones is that doing so makes them hard to catch. Detecting a stepping-stone intrusion is difficult because of the nature of the TCP/IP protocol, in which a computer in a TCP/IP session is visible only to its immediate downstream and upstream neighbors, but not to anyone else. That is, if an intruder uses a chain of more than one computer to invade, only the computer having a direct TCP/IP connection to the victim host is visible, and the intruder’s identity remains hidden.

Many approaches have been developed to detect stepping-stone intrusion. They are divided into two categories: passive and active approaches. The passive approaches use the information gathered from hosts and networks to detect a stepping-stone intrusion. One advantage of the passive approach is that it does not interfere with the sessions. Its disadvantage is that it takes more computation than the active approach does because it finds a stepping-stone pair by checking all the incoming and outgoing connections of a host. Most approaches proposed to detect stepping-stone intrusion, such as the Content-based Thumbprint [2], the Time-based Approach [1], the Deviation-based Approach [3], the Round-trip Time Approach [4, 5], and the Packet Number Difference-based Approach [6, 7], belong to the passive category.

This work is supported by NSF BPC-Alliance grant: Contract number #CNS-0540577

Staniford-Chen and Heberlein proposed the content-based thumbprint method, which identifies intruders by comparing different sessions for suggestive similarities of connection chains [2]. The fatal problem of this method is that it cannot be applied to encrypted sessions, because their real contents are not available and a thumbprint therefore cannot be made. Zhang and Paxson [1] proposed the time-based approach, which can be used to detect stepping-stones or to trace intrusion even if a session is encrypted. However, there are three major problems in the time-based approach. First, it can be easily manipulated by intruders. Second, the method requires that the packets of connections have precise and synchronized timestamps in order to correlate them properly. This makes it difficult or impractical to correlate measurements that were taken at different points in the network. Third, Zhang and Paxson were also aware of the fact that a large number of legitimate stepping-stone users routinely traverse a network for a variety of reasons. Yoda and Etoh [3] proposed the deviation-based approach, which is a network-based correlation scheme. This method is based on the observation that the deviation for two unrelated connections is large enough to be distinguished from the deviation of connections within the same connection chain. In addition to the problems the time-based approach has, this method has other problems: it is not efficient, and it is not applicable to compressed sessions or to padded payloads. Yung [5] proposed the round-trip time (RTT) approach, which detects stepping-stone intrusion by estimating the downstream length using the gap between a request and its response, and the gap between the request and its acknowledgement. The problem of the RTT approach is that its detection is inaccurate because it cannot compute the two gaps accurately.


Blum [7] proposed the packet number difference-based (PND-based) approach, which detects stepping-stones by checking the difference of Send packet numbers between two connections. This method is based on the idea that if the two connections are relayed, the difference should be bounded; otherwise, it should not. This method can resist intruders’ evasions such as time jittering and chaff perturbation to a limited extent. D. Donoho et al. [6] showed for the first time that there are theoretical limits on the ability of attackers to disguise their traffic using evasions during a long interactive session. The major problem with the PND-based approach is that the upper bound on the number of packets that must be monitored is large, while the lower bound on the amount of chaff an attacker needs to evade detection is small. This makes Blum’s method very weak in resisting intruders’ chaff evasion.

In this paper, we propose a novel approach that exploits the optimal numbers of TCP requests and responses to detect stepping-stones. A random walk process can model the difference between the number of requests and the number of responses. A theoretical analysis in this paper shows that our approach performs better than Blum’s approach in terms of the number of packets to be monitored under the same confidence, with the assumption that the session is manipulated by time jittering or chaff perturbation.

The rest of this paper is arranged as follows. In Section 2, we present the problem statement. Section 3 presents the stepping-stone detection algorithm. In Section 4, we analyze the performance of this algorithm, and in Section 5, we present the results of comparisons with the PND-based approach. Finally, in Section 6, we summarize the work and discuss future work.

2. Problem Statement

The basic idea of detecting a host or a network of computers used as a stepping-stone is to compare an incoming connection with one of the outgoing connections. If they are relayed, we call them a stepping-stone pair; otherwise, a normal pair. As Figure 1 shows, host $h_i$ has one incoming connection $C_i^{1}$ and one outgoing connection $C_i^{2}$, and each connection has one request stream and one response stream. If we make the three assumptions below, then over a period of time the numbers of packets monitored in any two relayed connections should be close to equal:

1) Each packet that appears in one connection must appear in its relayed one;

2) An intruder could hold any packet at any place, but the holding time has an upper bound;

3) An intruder could insert meaningless packets into an interactive session at any time, but the inserting rate is bounded.

Assumption 1) means that there are no packet drops, combinations, or decompositions. It guarantees that the number of packets in an incoming connection must be greater than or equal to the number of packets in the relayed outgoing connection. If two connections are relayed, we can at least find a relationship between the number of requests of the outgoing connection and the number of responses of the incoming connection. The problem of detecting stepping-stones becomes the problem of finding a correlation between the number of requests and the number of responses. Assumption 2) comes from the fact that each user has a limited time tolerance when using an interactive session, and assumption 3) indicates that the rate at which a user can insert packets into an interactive session is bounded.

From the above three assumptions, we know that if two connections are relayed, there should be a suggestive relationship between the number of requests and the number of responses. We can use the existence of this relationship to determine whether two connections are in the same chain. We claim that it is possible to detect a stepping-stone by comparing the number of Sends in an outgoing connection with the number of Echoes in an upstream connection. In other words, it is possible to detect stepping-stone intrusion by monitoring network traffic.

3. Stepping-Stone Detection Algorithm

3.1 Basic Idea to Detect Stepping-Stone

We monitor an interactive TCP session that is established using OpenSSH for a period of time, capture all the Send and Echo packets, and put them in two sequences, S with n packets and E with m packets, respectively. In an interactive session, the user inputs a command by typing a sequence of letters, and the command is then executed at the server side. The execution result returns to the client side in the form of packets. In general, when a user types one letter (keystroke), it is echoed by a response packet. We call them single-letter Send and Echo, respectively. If we filter out the non-single-letter packets and keep only the single-letter Sends and Echoes, then the number of Sends in an outgoing connection should be close to the number of Echoes in an incoming connection if the two connections are relayed.

[Figure 1. Illustration of connections and streams of a host: host $h_i$ with incoming connection $C_i^{1}$ (streams $S_i^{(1)}$, $E_i^{(1)}$) and outgoing connection $C_i^{2}$ (streams $S_i^{(2)}$, $E_i^{(2)}$)]

We use $N_{(i,s)}^{(2)}$ to denote the number of requests of the outgoing connection, $N_{(i,e)}^{(1)}$ to denote the number of responses of the incoming connection, and $\Delta$ to denote $N_{(i,e)}^{(1)} - N_{(i,s)}^{(2)}$, the difference between the two numbers. For relayed connections, $\Delta$ should vary but stay close to zero. Ideally, it should be exactly zero. However, there are two reasons why $\Delta$ may not be exactly zero. First, multiple Sends or Echoes may be combined into one packet during propagation, and, due to the nature of the TCP/IP protocol, we may not be able to identify all single-letter packets. Second, we cannot completely remove the packets carrying command execution results by checking packet size. However, if the two connections are relayed, then $\Delta$ should be close to zero with a high probability.

If two relayed connections are manipulated, $\Delta$ should be bounded within a range $[l, u]$ based on assumptions 2) and 3). For time-jittering evasion, we assume that if a packet is held, its holding time cannot be larger than $H$ and the number of packets that can be held in each connection cannot be larger than $H$. For chaff perturbation, we assume that the number of packets that can be introduced in a unit of time for each connection cannot be larger than $r$. Assuming that we collect the packets over $\Phi$ units of time, $\Delta$ should be bounded within the range $[-\Omega_\Phi, \Omega_\Phi]$ for two relayed connections, where $\Omega_\Phi = \Phi \cdot r$. Now, the problem of detecting a stepping-stone pair is reduced to the task of judging whether the difference in the number of single-letter packets between two connections is bounded, i.e., for a stepping-stone pair, the following relationship should hold:

$$-\Omega_\Phi \le \Delta \le \Omega_\Phi \qquad (1)$$

3.2 Stepping-Stone Detection Algorithm

To reduce false alarms and misdetections in detecting a stepping-stone pair, we check condition (1) every time a packet is received. If we monitor a total of $w$ packets, condition (1) will be checked $w$ times. We propose the following algorithm to detect stepping-stones. We call this algorithm Detecting Stepping-stone Evasion (DSE).

DSE($S_i^{(2)}$, $E_i^{(1)}$, $w$, $\Omega_\Phi$)
    $N_{(i,e)}^{(1)} = N_{(i,s)}^{(2)} = 0$;
    for $j = 1 : w$
        if $p_j \in S_i^{(2)}$: $N_{(i,s)}^{(2)}$++;
        if $p_j \in E_i^{(1)}$: $N_{(i,e)}^{(1)}$++;
        $\Delta = N_{(i,e)}^{(1)} - N_{(i,s)}^{(2)}$;
        if $\Delta < -\Omega_\Phi$ or $\Delta > \Omega_\Phi$
            return Normal;
        Endif
    Endfor
    return Stepping-Stone;

In this algorithm, we capture and check up to $w$ packets on two connections to see if formula (1) is satisfied. If formula (1) fails even once, we conclude that the two connections are not a stepping-stone pair. The conclusion about the existence of a stepping-stone should be made only after all the connections are checked. If (1) is satisfied in all $w$ checks, we conclude that there is a stepping-stone with a very high probability. It is not necessary to check if formula (1) holds for all the connections. The larger the number of checks $w$, the higher the confidence of DSE. For a given confidence, expressed as a false positive probability, what would be an optimal number of packets to be monitored on the two connections?
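The DSE procedure above can be sketched in Python as follows. This is only an illustrative sketch: the function name `dse`, the encoding of each captured packet as the label `"S"` (a single-letter Send on the outgoing connection) or `"E"` (a single-letter Echo on the incoming connection), and the string return values are assumptions made for the example, not part of the original algorithm description.

```python
from typing import Iterable


def dse(packets: Iterable[str], omega: int, w: int) -> str:
    """Classify a pair of connections as "Stepping-Stone" or "Normal".

    packets: labels in arrival order; "S" marks a single-letter Send on the
             outgoing connection, "E" a single-letter Echo on the incoming
             connection (this labelling is an assumption of the sketch).
    omega:   the bound Omega_Phi of condition (1).
    w:       the maximum number of packets to check.
    """
    n_send = 0  # N^(2)_(i,s): requests seen on the outgoing connection
    n_echo = 0  # N^(1)_(i,e): responses seen on the incoming connection
    for j, label in enumerate(packets):
        if j >= w:
            break
        if label == "S":
            n_send += 1
        elif label == "E":
            n_echo += 1
        delta = n_echo - n_send
        # Condition (1) is checked after every packet.
        if delta < -omega or delta > omega:
            return "Normal"
    return "Stepping-Stone"
```

For instance, `dse("SESESESE", omega=2, w=8)` keeps the difference within the bound at every step, while `dse("SSSS", omega=2, w=4)` leaves the bound on the third packet.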

4. Performance Analysis

We assume that a collected packet is a Send with probability $q$, and an Echo with probability $p$. The difference between the number of Sends of one stream and the number of Echoes of another stream can be modeled as a random walk process with independent jumps $Z_1, Z_2, \ldots, Z_i, \ldots$, where $i$ is a positive integer. If a captured packet is a Send, the difference $\Delta$ makes a jump $Z_i = -1$; otherwise, it makes a jump $Z_i = 1$; there is no other choice. We have the following equations.

$$\begin{cases} \mathrm{prob}(Z_i = -1) = q \\ \mathrm{prob}(Z_i = 1) = p \\ p + q = 1 \end{cases} \qquad (2)$$
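To make the random walk model concrete, the following sketch simulates the difference $\Delta$ as a sequence of independent jumps of $+1$ (Echo, probability $p$) and $-1$ (Send, probability $q = 1 - p$), and reports the first step at which the walk leaves the range $[-\Omega_\Phi, \Omega_\Phi]$. The helper names `simulate_delta` and `first_exit` and the fixed seed are hypothetical choices for illustration.

```python
import random
from typing import List, Optional


def simulate_delta(w: int, p: float, seed: int = 0) -> List[int]:
    """Return the values of Delta after each of w captured packets.

    Each packet is an Echo (jump +1) with probability p and a Send
    (jump -1) with probability q = 1 - p, as in equation (2).
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    delta = 0
    path = []
    for _ in range(w):
        delta += 1 if rng.random() < p else -1
        path.append(delta)
    return path


def first_exit(path: List[int], omega: int) -> Optional[int]:
    """Return the 1-based step at which |Delta| first exceeds omega, or None."""
    for step, delta in enumerate(path, start=1):
        if abs(delta) > omega:
            return step
    return None


if __name__ == "__main__":
    walk = simulate_delta(w=200, p=0.5)
    print(first_exit(walk, omega=10))
```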

We evaluate the performance of the algorithm DSE by computing the smallest $w$ for a given false positive detection probability or false negative detection probability.

Symbol        Meaning
$p_{NC}$      False negative probability
$p_{PC}$      False positive probability
$\delta_N$    A given false negative probability
$\delta_P$    A given false positive probability

Table 1. Notations used in the analysis of random walks


Also, for a given $w$, we compute the false positive probability and the false negative probability to evaluate the algorithm DSE. A false negative probability indicates the possibility that condition (1) does not hold even if the two connections are in the same chain. A false positive probability indicates the possibility that condition (1) holds when the two connections are not in the same chain. For convenience, we use the notations in Table 1 in the rest of this paper.

4.1 False Negative Probability

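The false negative probability $p_{NC}$ of DSE is the sum of the probabilities that the random walk process $\Delta$ hits the lower bound $-\Omega_\Phi$ or the upper bound $\Omega_\Phi$. Based on the results on random walk processes in [8], we have:

$$p_{NC} = f_{0,\Omega_\Phi}^{(w)} + f_{0,-\Omega_\Phi}^{(w)} \le \frac{1}{2}\left(\frac{p}{q}\right)^{\frac{1}{2}\Omega_\Phi} s_1^{1-w} + \frac{1}{2}\left(\frac{q}{p}\right)^{\frac{1}{2}\Omega_\Phi} s_1^{1-w} = \frac{1}{2}\, s_1^{1-w}\left(\left(\frac{p}{q}\right)^{\frac{1}{2}\Omega_\Phi} + \left(\frac{q}{p}\right)^{\frac{1}{2}\Omega_\Phi}\right) \qquad (3)$$

where

$$s_1 = \frac{p+q}{2(pq)^{\frac{1}{2}}\cos\frac{\pi}{2\Omega_\Phi}} = \frac{1}{2(pq)^{\frac{1}{2}}\cos\frac{\pi}{2\Omega_\Phi}}.$$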

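In the special case $p = q$, we have:

$$p_{NC} = f_{0,\Omega_\Phi}^{(w)} + f_{0,-\Omega_\Phi}^{(w)} = s_1^{1-w} = \cos^{w-1}\frac{\pi}{2\Omega_\Phi} \qquad (4)$$

where

$$s_1 = \frac{1}{2(pq)^{\frac{1}{2}}\cos\frac{\pi}{2\Omega_\Phi}} = \frac{1}{\cos\frac{\pi}{2\Omega_\Phi}}.$$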

From condition (4), we get the least number of packets $w$ needed for a given false negative probability $\delta_N$ when $p = q$:

$$w \ge \frac{\log \delta_N}{\log\left(\cos\frac{\pi}{2\Omega_\Phi}\right)} + 1 \qquad (5)$$
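As a numerical illustration of equations (4) and (5), the sketch below evaluates the symmetric-case false negative probability and the smallest integer $w$ that satisfies (5). The function names are hypothetical, and the sketch assumes $p = q$ and $\Omega_\Phi \ge 2$ so that the cosine lies strictly between 0 and 1.

```python
import math


def p_nc_symmetric(w: int, omega: int) -> float:
    """Equation (4): cos^(w-1)(pi / (2 * omega)) for p = q."""
    return math.cos(math.pi / (2 * omega)) ** (w - 1)


def min_w_false_negative(delta_n: float, omega: int) -> int:
    """Smallest integer w satisfying equation (5) for p = q."""
    c = math.cos(math.pi / (2 * omega))
    # log(c) is negative, so dividing reverses the inequality as in (5).
    return math.ceil(math.log(delta_n) / math.log(c) + 1)


if __name__ == "__main__":
    w = min_w_false_negative(delta_n=0.01, omega=10)
    print(w, p_nc_symmetric(w, 10))
```

The ratio of logarithms does not depend on the logarithm base, so the natural logarithm used here is only a convenience.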

4.2 False Positive Probability

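The false positive probability $p_{PC}$ of DSE is the probability that the difference $\Delta$ stays within the range $[-\Omega_\Phi, \Omega_\Phi]$ in all $w$ checks even though the two connections are not relayed. From the results in [8], we get the following:

$$p_{PC} = \sum_{k=w+1}^{\infty}\left(f_{0,\Omega_\Phi}^{(k)} + f_{0,-\Omega_\Phi}^{(k)}\right) = \sum_{k=w+1}^{\infty}\frac{1}{2}\left(\left(\frac{p}{q}\right)^{\frac{1}{2}\Omega_\Phi} + \left(\frac{q}{p}\right)^{\frac{1}{2}\Omega_\Phi}\right)s_1^{1-k} = \frac{1}{2}\left(\left(\frac{p}{q}\right)^{\frac{1}{2}\Omega_\Phi} + \left(\frac{q}{p}\right)^{\frac{1}{2}\Omega_\Phi}\right)\sum_{k=w+1}^{\infty}s_1^{1-k} = \frac{1}{2}\left(\left(\frac{p}{q}\right)^{\frac{1}{2}\Omega_\Phi} + \left(\frac{q}{p}\right)^{\frac{1}{2}\Omega_\Phi}\right)\frac{s_1^{-w}}{1 - s_1^{-1}} \qquad (6)$$

where

$$s_1 = \frac{p+q}{2(pq)^{\frac{1}{2}}\cos\frac{\pi}{2\Omega_\Phi}} = \frac{1}{2(pq)^{\frac{1}{2}}\cos\frac{\pi}{2\Omega_\Phi}}.$$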
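The closed form at the end of (6) comes from summing a geometric series. As a brief worked step, assuming $s_1 > 1$ (which holds whenever $\Omega_\Phi > 1$, since then $0 < 2(pq)^{\frac{1}{2}}\cos\frac{\pi}{2\Omega_\Phi} < 1$):

```latex
\sum_{k=w+1}^{\infty} s_1^{\,1-k}
  = s_1^{-w} \sum_{j=0}^{\infty} s_1^{-j}
  = \frac{s_1^{-w}}{1 - s_1^{-1}}, \qquad s_1 > 1 .
```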

When $p = q$, we have the following simplified result:

$$p_{PC} = \frac{\cos^{w}\frac{\pi}{2\Omega_\Phi}}{1 - \cos\frac{\pi}{2\Omega_\Phi}} \qquad (7)$$

Similarly, we get $w$ from (7) for a given $\delta_P$ when $p = q$ as follows:

$$w = \frac{\log\left[\delta_P\left(1 - \cos\frac{\pi}{2\Omega_\Phi}\right)\right]}{\log\left(\cos\frac{\pi}{2\Omega_\Phi}\right)} \qquad (8)$$
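Equations (6) to (8) can be evaluated numerically with a short sketch like the one below. The function names are hypothetical, and the code assumes $0 < p < 1$ and $\Omega_\Phi \ge 2$; for $p = q = 0.5$ the general expression reduces to equation (7).

```python
import math


def p_pc(w: int, omega: int, p: float = 0.5) -> float:
    """Closed form of equation (6); equals equation (7) when p = q = 0.5."""
    q = 1.0 - p
    inv_s1 = 2.0 * math.sqrt(p * q) * math.cos(math.pi / (2 * omega))  # 1 / s_1
    weight = 0.5 * ((p / q) ** (omega / 2) + (q / p) ** (omega / 2))
    return weight * inv_s1 ** w / (1.0 - inv_s1)


def min_w_false_positive(delta_p: float, omega: int) -> int:
    """Equation (8), rounded up to the next integer, for p = q."""
    c = math.cos(math.pi / (2 * omega))
    return math.ceil(math.log(delta_p * (1.0 - c)) / math.log(c))


if __name__ == "__main__":
    w = min_w_false_positive(delta_p=0.1, omega=8)
    print(w, p_pc(w, 8))
```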

5. Comparison

The best way to evaluate the effectiveness of an algorithm is to compare its performance with the best existing algorithm. So far, Blum’s approach has been considered the best way to detect stepping-stones, and Blum’s Detect-Attacks-Chaff stepping-stone detection algorithm (DAC) is known to be able to resist time perturbation and chaff evasion. In this study, we compare the performance of DSE with that of Blum’s DAC. Their performances are compared in terms of the number of packets that must be monitored for a given false positive rate $\delta_P$, i.e., the algorithm that requires fewer packets is considered to perform better. We must mention that Blum did not give a false negative analysis in his paper [7]. Thus, we compare the performance of the two algorithms in terms of false positive probability only. We discuss two cases: the case with chaff and the case without chaff.

5.1 Comparison between DSE and the Best Existing Algorithm without Chaff Perturbation


In order to compare with Blum’s DAC, we assume that the equation $p_\Delta = \Omega_\Phi$ is satisfied.

Let $w_B$ and $w$ be the minimum numbers of packets that must be monitored in order to get a given false positive probability $\delta_P$ with DAC and DSE, respectively. Our purpose is to compare $w_B$ and $w$. The smaller number represents the better performance. The numbers $w_B$ and $w$ can be computed by the following formulas:

$$w_B = 2(p_\Delta + 1)^2 \log\frac{1}{\delta_P} \qquad (9)$$

$$w = \frac{\log\left[\delta_P\left(1 - \cos\frac{\pi}{2p_\Delta}\right)\right]}{\log\cos\frac{\pi}{2p_\Delta}} \qquad (10)$$

We cannot compare the two numbers directly by using Equations (9) and (10) because there is no guarantee that one of them is always larger than the other. Figure 2 and Figure 3 show the results of comparisons between $w_B$ and $w$ with varying $p_\Delta$, where the Y axis uses a logarithmic scale, under fixed $\delta_P$ values of 0.1 and 0.0001, respectively. Figure 2 shows that DSE has better performance than DAC only when $p_\Delta$ is under eight. When $p_\Delta$ is larger than eight, DAC outperforms DSE. Figure 2 and Figure 3 show that when $\delta_P$ becomes smaller, DSE performs better than Blum’s DAC does. Based on the comparisons shown in Figure 2 and Figure 3, we conclude that under a high confidence (low false positive probability) without chaff perturbation, DSE outperforms DAC because DSE needs fewer packets to be monitored than DAC does.
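The comparison between equations (9) and (10) can be tabulated with a sketch such as the following. The function names are hypothetical. The source does not state the logarithm base in (9), so the natural logarithm used for `w_dac` is an assumption that affects its absolute values; the ratio in (10) is base-independent.

```python
import math


def w_dac(p_delta: int, delta_p: float) -> float:
    """Equation (9). Natural log is an assumption; the base is not stated."""
    return 2 * (p_delta + 1) ** 2 * math.log(1 / delta_p)


def w_dse(p_delta: int, delta_p: float) -> float:
    """Equation (10), with p_delta playing the role of Omega_Phi."""
    c = math.cos(math.pi / (2 * p_delta))
    return math.log(delta_p * (1 - c)) / math.log(c)


if __name__ == "__main__":
    for delta_p in (0.1, 0.0001):
        for p_delta in (2, 8, 14, 20):
            print(delta_p, p_delta,
                  round(w_dac(p_delta, delta_p)),
                  round(w_dse(p_delta, delta_p)))
```

Because the crossover point depends on the assumed base, the printed values should be read as qualitative rather than as a reproduction of Figures 2 and 3.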

5.2 Comparison between DSE and the Best Existing Algorithm with Chaff Perturbation

When a session is manipulated with a chaff perturbation, Blum claimed that his method can still detect stepping-stones, but on the condition that no more than $x$ packets are inserted for every $8(x+1)^2$ packets. Otherwise, his method would not work. We evaluate the performance of our DSE by comparing it with Blum’s DAC again. We assume that we insert $x$ packets into a send stream for every $x$ send packets and approximately $x$ echo packets. This means $p/q = x/(x + x) = 1/2$ and the inserting rate is approximately 50%, which is much larger than Blum’s DAC allows. From equation (6), we obtain the least number of packets $w$ monitored by DSE with a given $\delta_P$.

$$w = \frac{\log\left[\dfrac{2\,\delta_P\left(1 - 0.998\cos\frac{\pi}{2p_\Delta}\right)}{0.5^{\,p_\Delta/2} + 2^{\,p_\Delta/2}}\right]}{\log\left(0.998\cos\frac{\pi}{2p_\Delta}\right)} \qquad (11)$$

According to [7], the least number of packets $w_B$ monitored by DAC with a given $\delta_P$ can be obtained by equation (12):

$$w_B = 8(p_\Delta + 1)^2 \log\frac{1}{\delta_P} \qquad (12)$$

Figure 4 and Figure 5 show the results of comparisons between DSE and DAC with chaff perturbation. Figure 4 shows that DSE outperforms DAC when the detection boundary is less than 50 and $\delta_P$ is 0.1. Figure 5 shows the results of comparisons when the false positive probability $\delta_P$ is decreased to 0.0001. Figure 4 and Figure 5 show that the lower the false positive probability, the better the performance of DSE. With chaff perturbation, our DSE still outperforms Blum’s DAC.

[Figure 2. Comparison of number of packets monitored with Blum’s method under $\delta_P = 0.1$ (x-axis: $p_\Delta$ from 2 to 98; y-axis: $w$, logarithmic scale; series: DAC, DSE)]

[Figure 3. Comparison of number of packets monitored with Blum’s method under $\delta_P = 0.0001$ (x-axis: $p_\Delta$ from 2 to 98; y-axis: $w$, logarithmic scale; series: DAC, DSE)]
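Equations (11) and (12) can be evaluated in the same way. In this sketch the factor 0.998 is copied verbatim from equation (11), the function names are hypothetical, and the natural logarithm in `w_dac_chaff` is an assumption because the source does not state the base.

```python
import math

CHAFF_FACTOR = 0.998  # constant taken verbatim from equation (11)


def w_dse_chaff(p_delta: int, delta_p: float) -> float:
    """Equation (11): packets monitored by DSE under chaff with p/q = 1/2."""
    c = CHAFF_FACTOR * math.cos(math.pi / (2 * p_delta))
    spread = 0.5 ** (p_delta / 2) + 2.0 ** (p_delta / 2)
    return math.log(2 * delta_p * (1 - c) / spread) / math.log(c)


def w_dac_chaff(p_delta: int, delta_p: float) -> float:
    """Equation (12). Natural log is an assumption; the base is not stated."""
    return 8 * (p_delta + 1) ** 2 * math.log(1 / delta_p)


if __name__ == "__main__":
    for p_delta in (2, 8, 14, 20):
        print(p_delta,
              round(w_dse_chaff(p_delta, 0.1)),
              round(w_dac_chaff(p_delta, 0.1)))
```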

6. Conclusions and Future Work

In this paper we propose an algorithm that detects stepping-stone intrusion. With this algorithm, we monitor the TCP/IP request and response packets, count them, and compute the difference between the two counts. The results of the theoretical analysis show that this method outperforms Blum’s DAC, which is known to be the best method of detecting stepping-stones, in resisting intruders’ chaff perturbation. For the same false positive probability, our approach needs fewer packets to be monitored than Blum’s DAC does.

What we have presented in this paper is based on purely theoretical analysis under the assumption that the intruder’s inserting rate is bounded. Our future work is to develop a program that performs chaff perturbation over a real interactive session and to determine the upper bound of a user’s chaff rate, as well as $\Phi$.

References

[1] Yin Zhang, Vern Paxson: Detecting Stepping-Stones. Proceedings of the 9th USENIX Security Symposium, Denver, CO, August (2000) 67-81.

[2] S. Staniford-Chen, L. Todd Heberlein: Holding Intruders Accountable on the Internet. Proc. IEEE Symposium on Security and Privacy, Oakland, CA, August (1995) 39-49.

[3] K. Yoda, H. Etoh: Finding Connection Chain for Tracing Intruders. Proc. 6th European Symposium on Research in Computer Security (LNCS 1985), Toulouse, France, September (2000) 31-42.

[4] Jianhua Yang, Shou-Hsuan Stephen Huang: Matching TCP Packets and Its Application to the Detection of Long Connection Chains. Proceedings (IEEE) of 19th International Conference on Advanced Information Networking and Applications (AINA’05), Taipei, Taiwan, China, March (2005) 1005-1010.

[5] Kwong H. Yung: Detecting Long Connecting Chains of Interactive Terminal Sessions. RAID 2002, Springer Press, Zurich, Switzerland, October (2002) 1-16.

[6] D. L. Donoho et al.: Detecting Pairs of Jittered Interactive Streams by Exploiting Maximum Tolerable Delay. Proceedings of International Symposium on Recent Advances in Intrusion Detection, Zurich, Switzerland, September (2002) 45-59.

[7] A. Blum, D. Song, and S. Venkataraman: Detection of Interactive Stepping-Stones: Algorithms and Confidence Bounds. Proceedings of International Symposium on Recent Advances in Intrusion Detection (RAID), Sophia Antipolis, France, September (2004) 20-35.

[8] D. Cox, H. Miller: The Theory of Stochastic Processes. New York, NY: John Wiley & Sons Inc., 1965.

[Figure 4. Comparison of number of packets monitored with Blum’s method under chaff and $\delta_P = 0.1$ (x-axis: $p_\Delta$ from 2 to 98; y-axis: $w$, logarithmic scale; series: DAC, DSE)]

[Figure 5. Comparison of number of packets monitored with Blum’s method under chaff and $\delta_P = 0.0001$ (x-axis: $p_\Delta$ from 2 to 98; y-axis: $w$, logarithmic scale; series: DAC, DSE)]
