• No results found

2581 0 RAID 1999b pdf

N/A
N/A
Protected

Academic year: 2020

Share "2581 0 RAID 1999b pdf"

Copied!
25
0
0

Loading.... (view fulltext now)

Full text

(1)

Using Key-String Selection and Neural Networks to Reduce False Alarms

and Detect New Attacks with Sniffer-Based Intrusion Detection Systems

R i c h a r d P . L i p p m a n n a n d R o b e r t K . C u n n i n g h a m

[email protected]

MIT Lincoln Laboratory

Room S4-121

244 Wood Street

Lexington, MA 02173-0073

Presented at the Recent Advances in Intrusion

Detection, RAID 99 Conference,

7-9 September

(2)

Outline

Intrusion Detection Background

Goals and System Design

Evaluation and Results

Summary

(3)

Common Input Features for Intrusion Detection

Network Sniffing Data (NSM, ASIM,EMERALD, BRO, IBM-HAXOR

Cisco-NetRanger, ISS-RealSecure, Network Radar, Network Flight Recorder)

Host Audit Data (STAT, EMERALD, AXENT-Intruder Alert, Centrax)

Router, Firewall, … Audit Data (Ji Nao, Cisco-NetRanger, EMERALD)

L i n u x

S u n O S S o l a r i s

S p a r c

U l t r a 1. Network

S n i f f i n g D a t a

W i n d o w s N T

2. Host Audit Data 4 8 6 4 8 6 4 8 6

Router/ Firewall

Internet Local

N e t w o r k

(4)

Visibility of Attacks with Different Inputs

Infrastructure

Host Denial of Service

Probes/Scans

Remote to Local

User to Root

Malicious-Code/Data

Insider Attack

Host

Audit

Network

Sniffer

(5)

Sniffer-Based Intrusion Detection

Popular, Low Cost, No Impact on Hosts, Monitor Many

Hosts Simultaneously ( ASIM, Cisco-NetRanger, BRO, …)

Capture Network Traffic, Reconstruct Network Sessions,

Count Number of Keywords in Each Session

Examples of Keyword Strings

ftp: root, anonymous

login: guest, root, incorrect, daemon, passwd, permission

C A P T U R E T R A F F I C

H O S T H O S T R O U T E R I N T E R N E T

K E Y W O R D L I S T

R E C O N S T R U C T N E T W O R K S E S S I O N S

C O U N T K E Y W O R D S

O U T P U T K E Y W O R D

(6)

Example Telnet Session Reconstruction

HP-UX hqdadev A.09.03 D 9000/750 (ttyt1)

login: ~tftp

P

assword

:

Login incorrect

login: efs

P

assword

Login incorrect

login: efs

P

assword

:

P

assword

:

/usr/efs w

3:03pm up 14 days, 21:33, 1 user, load average: 0.00, 0.00

User tty login@ idle JCPU PCPU what

efs pty/ttyt1 3:03pm 1 w

/usr/efs> cd /

ls -al

total 16336

drwxr-xr-x 47 root sys 3072 Sep 23 10:41 .

drwxr-xr-x 47 root sys 3072 Sep 23 10:41 ..

drwx---

2 root mail 1024 Nov 28 1994 .elm

-rw---

1 bin bin 8690 Jul 9 15:39 .profile

-rw---

1 root sys 37 Nov 18 1994

.rhosts

dr-x---

3 root other 1024 May 26 1994 .secure

drwxr-x---

3 root sys 1024 Jul 3 14:30 Perl

grep :0: /etc/

passwd

(7)

Problems With Keyword-Based Systems

High False Alarm Rates

Little Use of Context Around Keywords

Humans Select Keywords to Detect Attacks With Little

Thought of Impact on False Alarms and Little Validation

Many Systems Produce 100’s of False Alarms Per Day

Keywords Accumulate for Old Attacks and May Generate

False Alarms for New Types of Normal Network Traffic

Requires Knowledge Of Attack Details, Sometimes Difficult

To Select Keywords to Detect an Attack

Misses New Attacks, May Miss Attack Variants, Requires

Constant Updating (Like Virus Detection)

Keywords are Often Too Attack Specific and Depend on

Visibility of Attack Script and Use of an Unchanging Script

Does not Provide a Correct Attack Name

It is Often Difficult to Infer the Attack Name from Keyword

(8)

Outline

Intrusion Detection Background

Goals and System Design

Evaluation and Results

Summary

(9)

Goal of This Work

Improve Performance Of Existing Keyword-Based Systems

Use Neural Networks and Automatic Training on Normal Data and

Attacks to Select Keywords and Keyword Weightings that Provide

Good Detection and Few False Alarms

Select More Robust Keywords that Can Detect New and Old

Attacks

Focus on UNIX Attacks Where Users Illegally Become Root

This is a Difficult, but Important Class of Attacks

Determine if Neural Networks can Provide Attack Labels

Constraints

To Permit Retrofitting in Existing Systems, Continue to Use

Keyword Counts in Telnet, Rlogin, and other Sessions

(10)

Approach for Attack Detection and

Classification

S N I F F E D D A T A

C R E A T E T R A N S C R I P T S

F O R T E L N E T S E S S I O N S

K E Y W O R D L I S T

C O U N T K E Y W O R D O C C U R R E N C E S

D E T E C T I O N N E U R A L N E T

C L A S S I F I C A T I O N N E U R A L N E T

T O T A L K E Y W O R D C O U N T

A T T A C K

P R O B A B I L I T Y

(11)

Training and Test Data from DARPA

Training and Test Data from DARPA

1998 Intrusion Detection Evaluation

1998 Intrusion Detection Evaluation

Outside

Internet

Eyrie AF Base

Inside

Router

Sniffer

Seven Weeks Training Data

System Development34 User-to-Root Attacks1202 Telnet Sessions

Two Weeks Test Data

One Evaluation Pass35 User-to-Root Attacks11,800 Telnet Sessions

Simulates Traffic In and Out of an

Air Force Base

1000’s of Simulated UNIX Hosts

100’s of Simulated Users

Rich Mix of Background Traffic

(12)

Eject Attack Example from Training Data

U N I X ( r ) S y s t e m V R e l e a s e 4 . 0 ( p a s c a l ) @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

@ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ l o g i n : a l i e P a s s w o r d :

L a s t l o g i n : W e d J u l 1 1 6 : 1 2 : 3 4 f r o m 1 9 4 . 2 7 . 2 5 1 . 2 1

S u n M i c r o s y s t e m s I n c . S u n O S 5 . 5 G e n e r i c N o v e m b e r 1 9 9 5

O f f i c i a l U . S . g o v e r n m e n t s y s t e m f o r a u t h o r i z e d u s e o n l y . D o n o t d i s c u s s ,

e n t e r , t r a n s f e r , p r o c e s s o r t r a n s m i t c l a s s i f i e d / s e n s i t i v e n a t i o n a l s e c u r i t y . . .

p a s c a l > @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > w h i c h g c c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ / b i n / g c c

p a s c a l > u u d e c o d e < < X X 8 9 9 3 4 7 3 6 8 X X\ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? b e g i n 6 4 4 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? M ( V E N 8 V Q U 9 & 4 @ / ' - T 9 & E O + F @ ^ " B - I ; F - L = 6 1 E ( # Q S = & 1 L : 6 ( N : # X * ( V E N 8 V Q U @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ . . .

? M ( & ) U 9 E L Q 7 2 P @ * & - H 8 7 ( @ * B D @ , " D [ " B ` @ < & 5 R < F ] R * " ) E > & 5 C ; " ! F 8 6 E L 9 6 0 B @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? e n d @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? X X 8 9 9 3 4 7 3 6 8 X X \ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > / b i n /g c c - o / t m p / 1 7 8 5 7 2 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ p a s c a l > w h i c h g c c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

/ b i n / g c c

p a s c a l > u u d e c o d e < < X X 8 9 9 3 4 7 3 7 5 X X\ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? b e g i n 6 4 4 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? M ( V E N 8 V Q U 9 & 4 @ / ' 5 N : 7 - T 9 " Y H / @ I V ; V E D " F U A : 6 X H : 6 Y T ( & % R 9 V , L ( & - H 8 7 ( @ @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? M * F % R 9 W 9 ; 7 2 D * > P H @ ( ' - E = ' ) E = 6 E D * # ` L , " D [ " B ` @ 9 7 A E 8 V P H ( B ] B : 6 X O = & - S @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? / : " ( L ( G 1 C < V @ B + # ` I . P I ] @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? e n d @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? X X 8 9 9 3 4 7 3 7 5 X X \ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > / b i n / g c c - o / t m p / 1 7 8 5 7 3 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > / t m p / 1 7 8 5 7 2 @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

J u m p i n g t o a d d r e s s 0 x e f f f f 7 e 0

J u m p i n g t o a d d r e s s 0 x e f f f f 7 e 0 B [ 3 6 4 ] E [ 4 0 0 ] S O [ 4 0 0 ]

#

/ t m p / 1 7 8 5 7 3 # # #

^ D @ # $ [ B S ] $ # @ @ # $ [ B S ] $ # @ # @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

# e x i t

#

^ D @ # $ [ B S ] $ # @ @ # $ [ B S ] $ # @ #

p a s c a l > @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > ^ D @ # $ [ B S ] $ # @ @ # $ [ B S ] $ # @ l o g o u t

1. Gain User Access: Attacker Logs Into Telnet Using Sniffed Password

2. Download Attack Code: Type Directly into

uudecode to Hide Keywords

4. Run Attack:

Buffer Overflow Creates Root Shell

3. Preparations:

Compile Attack Programs Using gcc With

Innocuous Names

5. Actions:

(13)

Keywords for User to Root Attacks

Initially We Had 58 Old Keywords Commonly Used in Existing

Intrusion Detection Systems

D e t e c t S u s p i c i o u s A c t i o n s ( “ p a s s w d ” , " + + ” , “ d a e m o n ” , " w a r e z ” , “ s h a d o w ” , “ p e r m i s s i o n d e n i e d ” , “ s h o w m o u n t ” )

D e t e c t O l d A t t a c k s ( " f r o m : \| ” , " C W D ~ R O O T ” , " L D _ P R E L O A D ” , "login: guest ”)

Added 31 New Keywords Based on Training Data

Detect Root shell (“root:”, “uid=0(root)”)

D e t e c t S e t u p A c t i o n s ( “ c h m o d ” , “ g c c ” )

D e t e c t A t t a c k C o d e D o w n l o a d i n g ( “ u u d e c o d e ” , “ < < “ , “ > f t p g e t ” )

A t t a c k S p e c i f i c ( “ I F S = “ , “ F D F O R M A T ” , “ F F B C O N F I G ” )

(14)

Simple Single-Layer Network Provided

Good Performance on Training Data

I n p u t s A r e C o u n t s o f t h e N u m b e r o f K e y W o r d s i n a T e l n e t S e s s i o n

T h e T w o O u t p u t s E s t i m a t e P o s t e r i o r P r o b a b i l i t i e s f o r N o r m a l

S e s s i o n s a n d A t t a c k s ( S q u a r e d- E r r o r S t o c h a s t i c G r a d i e n t D e s c e n t )

F e a t u r e ( K e y w o r d ) - S e l e c t i o n a n d T r a i n i n g / T e s t i n g P e r f o r m e d U s i n g 1 0 - F o l d C r o s s - Validation

0 3 0 0 14 3 2 1 1 . . .

N O R M A L A T T A C K

(15)

Neural Net Detection Keywords

1 0- Fold Cross- Validation Testing on the Training Data

Demonstrated that Best Detection Performance Was Obtained

with Only 30 Keywords

F e w e r K e y w o r d s D e c r e a s e d D e t e c t i o n R a t e

M o r e K e y w o r d s I n c r e a s e d F a l s e A l a r m R a t e

F e w e s t C r o s s - Validation Errors (1 Miss, 1 False Alarm) at 30 K e y w o r d s

Top Ten Keywords (Specified Using Perl Regular Expressions)

"cat\ s*>"

"Jumping to address"

"begin [0 - 9 ] "

"uudecode \ s * [ \< \ - ] "

"linsniff"

"uid\ = 0 \( r o o t \) "

" \$ \ > \ = 0\ ; \ $ \ <=0\ ; "

(16)

U s e r-to- Root Attack Types in Test Data

S o l a r i s S u n O S L i n u x

O L D e j e c t l o a d m o d u l e perl f f b c o n f i g

f d f o r m a t

N E W ps ps x t e r m

S e v e n U s e r -t o- Root Attack Types, 35 Instances in 11,859 Telnet S e s s i o n s

Different Techniques Used to Encrypt, Transport, Prepare Script, D i f f e r e n t A c t i o n s A f t e r B r e a k i n , S o m e A t t a c k s S p r e a d O v e r M u l t i ple S e s s i o n s
(17)

Outline

Intrusion Detection Background

Goals and System Design

Evaluation and Results

S u m m a r y

(18)

Overall Test Results,

User to Root (u2r) ROCs

Trained Neural Net With New Keywords Provides Best Performance

(Roughly 80% Detection at 1 False Alarm Per Day)

Keyword Count with Old Keywords Provides Poor Performance

(100 False Alarms Per Day for Good Detection)

Adding and Selecting 30 Keywords Reduces False Alarm Rate by x10

0 20 40 60 80 100

0.5 1 10 100

FALSE ALARMS PER DAY

OLD

KEYWORD COUNT

NEW

KEYWORD COUNT

NEURAL NET

(19)

Neural Network Detection of Old versus

New Attacks

Good Detection of Both Old and New Attacks Due to

Common Script Transport and Preparation Mechanisms

0 20 40 60 80 100

0.5 1 10 100

FALSE ALARMS PER DAY

24 OLD ATTACKS

(20)

Stealthy, Multi - Session Attacks Versus

Single Session Attacks

Multi- Session Attacks More Difficult to Detect

S e t u p ( E x p l o i t S c r i p t T r a n s m i s s i o n ) a n d B r e a k i n O c c u r i n S e p a r a t e S e s s i o n s a n d C l u e s a r e D i s p e r s e d O v e r T i m e

0 20 40 60 80 100

0.5 1 10 100

FALSE ALARMS PER DAY

12 MULTI-SESSION ATTACKS

(21)

Clear- Text Exploit Transmission Versus

Hidden Transmission

0 20 40 60 80 100

0.5 1 10 100

FALSE ALARMS PER DAY

26 HIDDEN-SCRIPT ATTACKS

(22)

Attack Classification for Clear-Text

Attacks

D e s i r e d C o m p u t e d C l a s s

S o l a r i s S u n O S L i n u x C l a s s e j e c t f o r m a t f f b l o a d m p e r l

-e j -e c t 6

f o r m a t 5 1

f f b c o n f i g 1 1

l o a d m o d u l e 2

p e r l m a g i c 2

D e s i r e d C o m p u t e d C l a s s S o l a r i s S u n O S L i n u x C l a s s e j e c t f o r m a t f f b l o a d m p e r l -e j -e c t 2

f o r m a t 1

f f b c o n f i g 1

l o a d m o d u l e p e r l 1

Perfect Classification

for Clear- Text Attacks

100% Correct

(23)

Outline

Intrusion Detection Background

Goals and System Design

Evaluation and Results

S u m m a r y

(24)

S u m m a r y

Using A Neural Network with Extended Keyword Strings

Provides a High- Performance Intrusion Detection System for

the DARPA 1998 Test Data (Unofficial Results)

D r a m a t i c a l l y L o w e r F a l s e A l a r m R a t e

False Alarm Rate Near 1 per Day, Detection Rate > 80%

F i n d s B o t h O l d a n d N e w A t t a c k s

D e t e c t s M a n y A t t a c k C o m p o n e n t s ( e . g . s e t u p , b r e a k i n , a c t i o n s )

(25)

Future Work

Embed Neural Network Approach in Existing System

A d d N e w K e y w o r d s

U s e D e t e c t i o n N e u r a l N e t w o r k t o C o m p u t e S c o r e

U s e I d e n t i f i c a t i o n N e u r a l N e t w o r k t o L a b e l A t t a c k s

Potential Improvements

U s e R e c e n t A t t a c k s a n d T r a f f i c t o I m p r o v e K e y w o r d s a n d S c o r i n g

I n t e g r a t e I n f o r m a t i o n A c r o s s M u l t i p l e T e l n e t S e s s i o n s a n d Services (e.g. ftp).

A d d s t r i n g s t o d e t e c t a d d i t i o n a l a p p r o a c h e s t o d o w n l o a d c o d e and prepare for an attack (e.g. vi, mail, ..) and additional acti o n s .

References

Related documents

Select the camera you want to enable for motion detection in the Select channel list.. Select Detect Detail

Indonesian vocational high school students find difficult to speak and even understand English as the target Language even though they have learned English for more

PBGC is funded by five sources of revenue: (1) mandatory premiums paid for all covered pension plans; (2) a special “termination premium” ($1,250 per participant, per year, for

property data may be found in other DuPont publi- cations. Bulletin ART-1 contains viscosity, thermal conductivity, and heat capacity data for saturated liquid and vapor in addition

Choosing a right set of keyword is really an art, select targeted keywords with the help of different keywords tools such as a Google Keywords tool, Word tracker

The scanning process for detection of ellipse candidate is efficient because the representation based on arcs and lines allows us to work with a small number of primitives in

Adding new keywords lowers that false-alarm rate of the baseline keyword counting system to roughly 10 false alarms per day to detect 80% of the attacks and using a neural network

2) Communicate a cancellation to the law enforcement or fire personnel as soon as possible following a determination that emergency response is unnecessary. 3) Communicate any