2581 0 RAID 1999b pdf

(1)

Using Key-String Selection and Neural Networks to Reduce False Alarms

and Detect New Attacks with Sniffer-Based Intrusion Detection Systems

R i c h a r d P . L i p p m a n n a n d R o b e r t K . C u n n i n g h a m

[email protected]

MIT Lincoln Laboratory

Room S4-121

244 Wood Street

Lexington, MA 02173-0073

Presented at the Recent Advances in Intrusion

Detection, RAID 99 Conference,

7-9 September

(2)

Outline

•

Intrusion Detection Background

•

Goals and System Design

•

Evaluation and Results

•

Summary

(3)

Common Input Features for Intrusion Detection

•

Network Sniffing Data (NSM, ASIM,EMERALD, BRO, IBM-HAXOR

Cisco-NetRanger, ISS-RealSecure, Network Radar, Network Flight Recorder)

•

Host Audit Data (STAT, EMERALD, AXENT-Intruder Alert, Centrax)

•

Router, Firewall, … Audit Data (Ji Nao, Cisco-NetRanger, EMERALD)

L i n u x

S u n O S S o l a r i s

S p a r c

U l t r a 1. Network

S n i f f i n g D a t a

W i n d o w s N T

2. Host Audit Data 4 8 6 4 8 6 4 8 6

Router/ Firewall

Internet Local

N e t w o r k

(4)

Visibility of Attacks with Different Inputs

Infrastructure

Host Denial of Service

Probes/Scans

Remote to Local

User to Root

Malicious-Code/Data

Insider Attack

Host

Audit

Network

Sniffer

(5)

Sniffer-Based Intrusion Detection

•

Popular, Low Cost, No Impact on Hosts, Monitor Many

Hosts Simultaneously ( ASIM, Cisco-NetRanger, BRO, …)

•

Capture Network Traffic, Reconstruct Network Sessions,

Count Number of Keywords in Each Session

•

Examples of Keyword Strings

–

ftp: root, anonymous

–

login: guest, root, incorrect, daemon, passwd, permission

C A P T U R E T R A F F I C

H O S T H O S T R O U T E R I N T E R N E T

K E Y W O R D L I S T

R E C O N S T R U C T N E T W O R K S E S S I O N S

C O U N T K E Y W O R D S

O U T P U T K E Y W O R D

(6)

Example Telnet Session Reconstruction

HP-UX hqdadev A.09.03 D 9000/750 (ttyt1)

login: ~tftp

P

assword

:

Login incorrect

login: efs

P

assword

Login incorrect

login: efs

P

assword

:

P

assword

:

/usr/efs w

3:03pm up 14 days, 21:33, 1 user, load average: 0.00, 0.00

User tty login@ idle JCPU PCPU what

efs pty/ttyt1 3:03pm 1 w

/usr/efs> cd /

ls -al

total 16336

drwxr-xr-x 47 root sys 3072 Sep 23 10:41 .

drwxr-xr-x 47 root sys 3072 Sep 23 10:41 ..

drwx---

2 root mail 1024 Nov 28 1994 .elm

-rw---

1 bin bin 8690 Jul 9 15:39 .profile

-rw---

1 root sys 37 Nov 18 1994

.rhosts

dr-x---

3 root other 1024 May 26 1994 .secure

drwxr-x---

3 root sys 1024 Jul 3 14:30 Perl

grep :0: /etc/

passwd

(7)

Problems With Keyword-Based Systems

•

High False Alarm Rates

–

Little Use of Context Around Keywords

–

Humans Select Keywords to Detect Attacks With Little

Thought of Impact on False Alarms and Little Validation

–

Many Systems Produce 100’s of False Alarms Per Day

–

Keywords Accumulate for Old Attacks and May Generate

False Alarms for New Types of Normal Network Traffic

–

Requires Knowledge Of Attack Details, Sometimes Difficult

To Select Keywords to Detect an Attack

•

Misses New Attacks, May Miss Attack Variants, Requires

Constant Updating (Like Virus Detection)

–

Keywords are Often Too Attack Specific and Depend on

Visibility of Attack Script and Use of an Unchanging Script

•

Does not Provide a Correct Attack Name

–

It is Often Difficult to Infer the Attack Name from Keyword

(8)

Outline

•

Intrusion Detection Background

•

Summary

(9)

Goal of This Work

•

Improve Performance Of Existing Keyword-Based Systems

–

Use Neural Networks and Automatic Training on Normal Data and

Attacks to Select Keywords and Keyword Weightings that Provide

Good Detection and Few False Alarms

–

Select More Robust Keywords that Can Detect New and Old

Attacks

•

Focus on UNIX Attacks Where Users Illegally Become Root

–

This is a Difficult, but Important Class of Attacks

•

Determine if Neural Networks can Provide Attack Labels

•

Constraints

–

To Permit Retrofitting in Existing Systems, Continue to Use

Keyword Counts in Telnet, Rlogin, and other Sessions

(10)

Approach for Attack Detection and

Classification

S N I F F E D D A T A

C R E A T E T R A N S C R I P T S

F O R T E L N E T S E S S I O N S

K E Y W O R D L I S T

C O U N T K E Y W O R D O C C U R R E N C E S

D E T E C T I O N N E U R A L N E T

C L A S S I F I C A T I O N N E U R A L N E T

T O T A L K E Y W O R D C O U N T

A T T A C K

P R O B A B I L I T Y

(11)

Training and Test Data from DARPA

1998 Intrusion Detection Evaluation

Outside

Internet

Eyrie AF Base

Inside

Router

Sniffer

•

Seven Weeks Training Data

– System Development – 34 User-to-Root Attacks – 1202 Telnet Sessions

•

Two Weeks Test Data

– One Evaluation Pass – 35 User-to-Root Attacks – 11,800 Telnet Sessions

•

Simulates Traffic In and Out of an

Air Force Base

–

1000’s of Simulated UNIX Hosts

–

100’s of Simulated Users

–

Rich Mix of Background Traffic

(12)

Eject Attack Example from Training Data

U N I X ( r ) S y s t e m V R e l e a s e 4 . 0 ( p a s c a l ) @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

@ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ l o g i n : a l i e P a s s w o r d :

L a s t l o g i n : W e d J u l 1 1 6 : 1 2 : 3 4 f r o m 1 9 4 . 2 7 . 2 5 1 . 2 1

S u n M i c r o s y s t e m s I n c . S u n O S 5 . 5 G e n e r i c N o v e m b e r 1 9 9 5

O f f i c i a l U . S . g o v e r n m e n t s y s t e m f o r a u t h o r i z e d u s e o n l y . D o n o t d i s c u s s ,

e n t e r , t r a n s f e r , p r o c e s s o r t r a n s m i t c l a s s i f i e d / s e n s i t i v e n a t i o n a l s e c u r i t y . . .

p a s c a l > @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > w h i c h g c c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ / b i n / g c c

p a s c a l > u u d e c o d e < < X X 8 9 9 3 4 7 3 6 8 X X\ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? b e g i n 6 4 4 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? M ( V E N 8 V Q U 9 & 4 @ / ' - T 9 & E O + F @ ^ " B - I ; F - L = 6 1 E ( # Q S = & 1 L : 6 ( N : # X * ( V E N 8 V Q U @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ . . .

? M ( & ) U 9 E L Q 7 2 P @ * & - H 8 7 ( @ * B D @ , " D [ " B ` @ < & 5 R < F ] R * " ) E > & 5 C ; " ! F 8 6 E L 9 6 0 B @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? e n d @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? X X 8 9 9 3 4 7 3 6 8 X X \ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > / b i n /g c c - o / t m p / 1 7 8 5 7 2 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ p a s c a l > w h i c h g c c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

/ b i n / g c c

p a s c a l > u u d e c o d e < < X X 8 9 9 3 4 7 3 7 5 X X\ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? b e g i n 6 4 4 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? M ( V E N 8 V Q U 9 & 4 @ / ' 5 N : 7 - T 9 " Y H / @ I V ; V E D " F U A : 6 X H : 6 Y T ( & % R 9 V , L ( & - H 8 7 ( @ @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? M * F % R 9 W 9 ; 7 2 D * > P H @ ( ' - E = ' ) E = 6 E D * # ` L , " D [ " B ` @ 9 7 A E 8 V P H ( B ] B : 6 X O = & - S @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? / : " ( L ( G 1 C < V @ B + # ` I . P I ] @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @ ? e n d @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

? X X 8 9 9 3 4 7 3 7 5 X X \ ` @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > / b i n / g c c - o / t m p / 1 7 8 5 7 3 / t m p / 1 7 8 5 7 . c @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > / t m p / 1 7 8 5 7 2 @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

J u m p i n g t o a d d r e s s 0 x e f f f f 7 e 0

J u m p i n g t o a d d r e s s 0 x e f f f f 7 e 0 B [ 3 6 4 ] E [ 4 0 0 ] S O [ 4 0 0 ]

#

/ t m p / 1 7 8 5 7 3 # # #

^ D @ # $ [ B S ] $ # @ @ # $ [ B S ] $ # @ # @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

# e x i t

#

^ D @ # $ [ B S ] $ # @ @ # $ [ B S ] $ # @ #

p a s c a l > @ # $ [ C R ] $ # @ @ # $ [ 0 ] $ # @

p a s c a l > ^ D @ # $ [ B S ] $ # @ @ # $ [ B S ] $ # @ l o g o u t

1. Gain User Access: Attacker Logs Into Telnet Using Sniffed Password

2. Download Attack Code: Type Directly into

uudecode to Hide Keywords

4. Run Attack:

Buffer Overflow Creates Root Shell

3. Preparations:

Compile Attack Programs Using gcc With

Innocuous Names

5. Actions:

(13)

Keywords for User to Root Attacks

•

Initially We Had 58 Old Keywords Commonly Used in Existing

Intrusion Detection Systems

– D e t e c t S u s p i c i o u s A c t i o n s ( “ p a s s w d ” , " + + ” , “ d a e m o n ” , " w a r e z ” , “ s h a d o w ” , “ p e r m i s s i o n d e n i e d ” , “ s h o w m o u n t ” )

– D e t e c t O l d A t t a c k s ( " f r o m : \| ” , " C W D ~ R O O T ” , " L D _ P R E L O A D ” , "login: guest ”)

•

Added 31 New Keywords Based on Training Data

– Detect Root shell (“root:”, “uid=0(root)”)

– D e t e c t S e t u p A c t i o n s ( “ c h m o d ” , “ g c c ” )

– D e t e c t A t t a c k C o d e D o w n l o a d i n g ( “ u u d e c o d e ” , “ < < “ , “ > f t p g e t ” )

– A t t a c k S p e c i f i c ( “ I F S = “ , “ F D F O R M A T ” , “ F F B C O N F I G ” )

(14)

Simple Single-Layer Network Provided

Good Performance on Training Data

•

I n p u t s A r e C o u n t s o f t h e N u m b e r o f K e y W o r d s i n a T e l n e t S e s s i o n

•

T h e T w o O u t p u t s E s t i m a t e P o s t e r i o r P r o b a b i l i t i e s f o r N o r m a l

S e s s i o n s a n d A t t a c k s ( S q u a r e d- E r r o r S t o c h a s t i c G r a d i e n t D e s c e n t )

•

F e a t u r e ( K e y w o r d ) - S e l e c t i o n a n d T r a i n i n g / T e s t i n g P e r f o r m e d U s i n g 1 0 - F o l d C r o s s - Validation

0 3 0 0 14 3 2 1 1 . . .

N O R M A L A T T A C K

(15)

Neural Net Detection Keywords

•

1 0- Fold Cross- Validation Testing on the Training Data

Demonstrated that Best Detection Performance Was Obtained

with Only 30 Keywords

– F e w e r K e y w o r d s D e c r e a s e d D e t e c t i o n R a t e

– M o r e K e y w o r d s I n c r e a s e d F a l s e A l a r m R a t e

– F e w e s t C r o s s - Validation Errors (1 Miss, 1 False Alarm) at 30 K e y w o r d s

•

Top Ten Keywords (Specified Using Perl Regular Expressions)

"cat\ s*>"

"Jumping to address"

"begin [0 - 9 ] "

"uudecode \ s * [ \< \ - ] "

"linsniff"

"uid\ = 0 $ r o o t $ "

" \$ \ > \ = 0\ ; \ $ \ <=0\ ; "

(16)

U s e r-to- Root Attack Types in Test Data

S o l a r i s S u n O S L i n u x

O L D e j e c t l o a d m o d u l e perl f f b c o n f i g

f d f o r m a t

N E W ps ps x t e r m

•

S e v e n U s e r -t o- Root Attack Types, 35 Instances in 11,859 Telnet S e s s i o n s

•

Different Techniques Used to Encrypt, Transport, Prepare Script, D i f f e r e n t A c t i o n s A f t e r B r e a k i n , S o m e A t t a c k s S p r e a d O v e r M u l t i ple S e s s i o n s

(17)

Outline

•

S u m m a r y

(18)

Overall Test Results,

User to Root (u2r) ROCs

•

Trained Neural Net With New Keywords Provides Best Performance

(Roughly 80% Detection at 1 False Alarm Per Day)

•

Keyword Count with Old Keywords Provides Poor Performance

(100 False Alarms Per Day for Good Detection)

•

Adding and Selecting 30 Keywords Reduces False Alarm Rate by x10

0 20 40 60 80 100

0.5 1 10 100

FALSE ALARMS PER DAY

OLD

KEYWORD COUNT

NEW

KEYWORD COUNT

NEURAL NET

(19)

Neural Network Detection of Old versus

New Attacks

•

Good Detection of Both Old and New Attacks Due to

Common Script Transport and Preparation Mechanisms

0 20 40 60 80 100

0.5 1 10 100

FALSE ALARMS PER DAY

24 OLD ATTACKS

(20)

Stealthy, Multi - Session Attacks Versus

Single Session Attacks

•

Multi- Session Attacks More Difficult to Detect

– S e t u p ( E x p l o i t S c r i p t T r a n s m i s s i o n ) a n d B r e a k i n O c c u r i n S e p a r a t e S e s s i o n s a n d C l u e s a r e D i s p e r s e d O v e r T i m e

0 20 40 60 80 100

0.5 1 10 100

12 MULTI-SESSION ATTACKS

(21)

Clear- Text Exploit Transmission Versus

Hidden Transmission

•

0 20 40 60 80 100

0.5 1 10 100

26 HIDDEN-SCRIPT ATTACKS

(22)

Attack Classification for Clear-Text

Attacks

D e s i r e d C o m p u t e d C l a s s

S o l a r i s S u n O S L i n u x C l a s s e j e c t f o r m a t f f b l o a d m p e r l

-e j -e c t 6

f o r m a t 5 1

f f b c o n f i g 1 1

l o a d m o d u l e 2

p e r l m a g i c 2

D e s i r e d C o m p u t e d C l a s s S o l a r i s S u n O S L i n u x C l a s s e j e c t f o r m a t f f b l o a d m p e r l -e j -e c t 2

f o r m a t 1

f f b c o n f i g 1

l o a d m o d u l e p e r l 1

•

Perfect Classification

for Clear- Text Attacks

100% Correct

(23)

Outline

•

S u m m a r y

(24)

S u m m a r y

•

Using A Neural Network with Extended Keyword Strings

Provides a High- Performance Intrusion Detection System for

the DARPA 1998 Test Data (Unofficial Results)

– D r a m a t i c a l l y L o w e r F a l s e A l a r m R a t e

– False Alarm Rate Near 1 per Day, Detection Rate > 80%

– F i n d s B o t h O l d a n d N e w A t t a c k s

– D e t e c t s M a n y A t t a c k C o m p o n e n t s ( e . g . s e t u p , b r e a k i n , a c t i o n s )

(25)

Future Work

•

Embed Neural Network Approach in Existing System

– A d d N e w K e y w o r d s

– U s e D e t e c t i o n N e u r a l N e t w o r k t o C o m p u t e S c o r e

– U s e I d e n t i f i c a t i o n N e u r a l N e t w o r k t o L a b e l A t t a c k s

•

Potential Improvements

– U s e R e c e n t A t t a c k s a n d T r a f f i c t o I m p r o v e K e y w o r d s a n d S c o r i n g

– I n t e g r a t e I n f o r m a t i o n A c r o s s M u l t i p l e T e l n e t S e s s i o n s a n d Services (e.g. ftp).

– A d d s t r i n g s t o d e t e c t a d d i t i o n a l a p p r o a c h e s t o d o w n l o a d c o d e and prepare for an attack (e.g. vi, mail, ..) and additional acti o n s .