Blacklisting and Blocking

(1)

Blacklisting and Blocking Sources of Malicious Traffic

Athina Markopoulou

University of California, Irvine

Joint work with Fabio Soldo and Anh Le @ UC Irvine

(2)

Outline

Motivation

Malicious Internet Traffic: Attack and Defense

Two Defense Mechanisms

Proactive: Predictive Blacklisting

Reactive: Source-Based Filtering

Conclusion

(3)

Malicious Traffic on the Internet

Compromising systems

– scanning, worms, website attacks, phishing, social engineering attacks, ...

Launching attacks

– spam, click-fraud, Denial-of-Service attacks, …

Botnets

(4)

The solution requires many components

Monitoring and detection of malicious activity

– in the network and/or at hosts

– signature-based, behavioral analysis

Mitigation

– at the hosts: remove malicious code

– in the network: block, rate-limit, scrub malicious traffic

(5)

Defense at the edge of the network

[Figure: four networks (Network 1 to Network 4), each with logging, an IDS, and a firewall at its edge router.]

(6)

Dshield Dataset

6 months of IDS+firewall logs from Dshield.org (May-Oct 2008):

~600 contributing networks, 60M+ source IPs, 400M+ logs

[Figure: contributing networks send their logs to Dshield.org.]

Log fields: Time, Victim ID (contributor), Src IP, Dst IP, Src Port, Dst Port, Protocol, Flags

(7)

Outline

Background

Two Defense Mechanisms

Reactive: Source-Based Filtering

Conclusion

(8)

Predictive Blacklisting

Problem definition:

– Given past logs of malicious activity collected at various locations

– Predict sources likely to send malicious traffic to each victim network in the future

Blacklist:

– list of “worst” (e.g. top-100) attack sources

Prediction vs Detection

(9)

Data analysis

Superposition of several behaviors

[Figure: number of alerts per day.]

(10)

A multi-level prediction model

Different predictors capture different patterns in the dataset:

– Model temporal dynamics

– Model spatial correlation between victims/attackers

Combine different predictors

Formulate as a Recommendation Systems problem

(11)

Recommender systems: example

Netflix: you rate movies and you get suggestions

(12)

Formulating Predictive Blacklisting as a Recommendation System (CF)

[Figure: rating matrix R. Attackers play the role of users and victims the role of items; known entries hold ratings and "?" marks the entries to be predicted.]

(13)

Predictor I: (attacker, victim) pair

Temporal dynamics

$r^{TS}_{a,v}(t)$

(14)

Predictor I: (a, v) time series

$r^{TS}_{a,v}(t)$

Data analysis: repeated attacks within short time periods

Prediction:

– Use EWMA model to capture this temporal trend

– Accounts for the short memory of attack sources

– Computationally efficient

– Includes as special case t=1

[Equation: the predicted activity is an exponentially weighted combination of the past activity at times t’ ≤ t.]
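The EWMA predictor for a single (attacker, victim) pair can be illustrated with a minimal sketch. The function name `ewma_forecast`, the smoothing weight `alpha`, its default value, and the oldest-first indexing are assumptions made for illustration; the slides do not give the exact parameterization.

```python
def ewma_forecast(history, alpha=0.9):
    """Forecast next-day activity of one (a, v) pair from its past activity.

    history[i] is the activity observed on day i (oldest first).
    Each past day is weighted by alpha * (1 - alpha) ** age, so recent
    days dominate (short memory). alpha and the indexing are assumptions.
    """
    last = len(history) - 1
    return sum(alpha * (1 - alpha) ** (last - i) * x
               for i, x in enumerate(history))


# With alpha = 1 only the most recent day counts; smaller alpha
# lets older days contribute with geometrically decaying weight.
recent_only = ewma_forecast([3, 0, 2], alpha=1.0)
smoothed = ewma_forecast([1, 1], alpha=0.5)
```

The computation is a single pass over the history, which is in line with the "computationally efficient" point above.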

(15)

Predictor II: similar victims

Data analysis: victims share common attackers.

spatial correlation

– [Katti et al, IMC 2005], [Zhang et al, Usenix Security 2008]

Our approach:

[Figure: victims linked by their common attackers.]

(16)

Predictor II: similar victims

defining similarity

• Similarity of victims u,v captures:

– the number of common attackers

– and when they are attacked

Our approach:

[Figure: binary victim-by-attacker matrix (victims v1 to v4, attackers a1 to a4); victims whose rows share 1s have common attackers.]

(17)

Predictor II: similar victims

k-nearest neighbors (kNN)

$r^{KNN}_{a,v}(t)$

Traditional kNN: “trust” your peers

– Identify k most similar victims (“neighbors”) + predict your rating based on theirs

New challenges due to time-varying ratings

Our approach:

[Equation: the predicted activity is a sum over the neighborhood of v of each neighbor's time series forecast given past logs, weighted by the similarity between time-varying vectors.]
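A minimal sketch of the neighborhood idea follows. It is a simplification: it uses plain cosine similarity on static attack vectors, whereas the similarity on the slides also accounts for when attacks happen. The names `cosine` and `knn_predict`, the toy data, and the reuse of past vectors as forecasts are all assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length attack vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def knn_predict(victim, attacker, vectors, forecasts, k=2):
    """Similarity-weighted average of the k nearest victims' forecasts."""
    neighbors = sorted(
        ((cosine(vectors[victim], vectors[u]), u)
         for u in vectors if u != victim),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return 0.0
    return sum(sim * forecasts[u][attacker] for sim, u in neighbors) / total


# Hypothetical data: one row per victim, one column per attacker a1..a4.
vectors = {"v1": [1, 1, 0, 0], "v2": [1, 1, 0, 0],
           "v3": [1, 1, 1, 0], "v4": [0, 0, 1, 1]}
# Each victim's past vector doubles as its forecast here (an assumption).
score = knn_predict("v1", 2, vectors, forecasts=vectors)
```

In this toy data v1 has never seen attacker a3 (column index 2), yet it receives a nonzero score because its neighbor v3 has.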

(18)

Predictor III: Attackers-Victims

Co-clustering

Data analysis:

– groups of attackers consistently target the same group of victims

– this behavior often persists over time

We used the Cross-Association (CA) method to automatically identify

(19)

Predictor III: Attackers-Victims

Prediction

$r^{EWMA\text{-}CA}_{a,v}(t)$

Intuition:

– pairs (a,v) in dense clusters are more likely to occur

– use the density of the cluster as the predictor
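The density intuition can be sketched as follows, assuming the attacker and victim groups have already been produced by a co-clustering step such as CA. The function name `cluster_density` and the toy groups are hypothetical, and the smoothing of densities over time is not shown.

```python
def cluster_density(pairs, attackers, victims):
    """Fraction of (attacker, victim) cells in a co-cluster that saw attacks."""
    hits = sum((a, v) in pairs for a in attackers for v in victims)
    return hits / (len(attackers) * len(victims))


# Hypothetical observed (attacker, victim) pairs and one co-cluster.
observed = {("a1", "v1"), ("a1", "v2"), ("a2", "v1")}
density = cluster_density(observed, ["a1", "a2"], ["v1", "v2"])
```

Here the pair (a2, v2) has not been observed, but it sits in a cluster where 3 of 4 cells are filled, so it inherits a high score.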

(20)

Summary

Different predictors capture different patterns:

– Temporal trends

• EWMA TS of (attacker, victim)

– Neighborhood models:

• KNN: Similarity of victims

• EWMA CA: Interaction of attackers-victims

(21)

Combining different predictors

Weighted Average
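A minimal sketch of combining predictors by weighted average and turning the result into a blacklist. The predictor names, the weights, the scores, and the function `combine` are hypothetical; the slides do not say how the weights are chosen.

```python
def combine(scores, weights, top_n=2):
    """scores: predictor name -> {source: score}. Returns the top_n sources."""
    total_w = sum(weights.values())
    sources = set().union(*scores.values())
    combined = {
        s: sum(weights[p] * scores[p].get(s, 0.0) for p in scores) / total_w
        for s in sources
    }
    # Highest combined score first; ties broken by source string.
    return sorted(combined, key=lambda s: (-combined[s], s))[:top_n]


# Hypothetical per-predictor scores for one victim.
scores = {
    "ts":  {"1.2.3.4": 0.9, "5.6.7.8": 0.2},
    "knn": {"5.6.7.8": 0.4, "9.9.9.9": 0.3},
}
blacklist = combine(scores, {"ts": 0.5, "knn": 0.5})
```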

(22)

Performance Analysis

Baseline Blacklisting Techniques

• Local Worst Offender List (LWOL)

– Most prolific local attackers

– Reactive but not proactive

• Global Worst Offender List (GWOL)

– Most prolific global attackers

– Might contain irrelevant attackers

– Non-prolific attackers are elusive to GWOL

• Collaborative Blacklisting (HPB)

– [J. Zhang, P. Porras, J. Ullrich, “Highly Predictive Blacklisting”, USENIX Security 2008]

– Also implemented and offered as a service (HPB) by Dshield.org
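The two worst-offender baselines differ only in which logs are counted, as this minimal sketch shows. The log format (a list of `(victim, source)` pairs) and the function name `worst_offenders` are assumptions for illustration.

```python
from collections import Counter


def worst_offenders(logs, victim=None, n=3):
    """Top-n sources by log count.

    With `victim` set, only that victim's logs are counted (LWOL);
    with victim=None, logs from all victims are counted (GWOL).
    """
    counts = Counter(src for v, src in logs if victim is None or v == victim)
    return [src for src, _ in counts.most_common(n)]


# Hypothetical logs as (victim, source) pairs.
logs = [("v1", "a"), ("v1", "a"), ("v1", "b"),
        ("v2", "c"), ("v2", "c"), ("v2", "c"), ("v2", "c"), ("v2", "a")]
lwol_v1 = worst_offenders(logs, victim="v1", n=1)
gwol = worst_offenders(logs, n=2)
```

In this toy data the global list is led by "c", a source that never contacted v1, which illustrates how GWOL might contain irrelevant attackers.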

(23)

total hit count

60 days of Dshield logs, 5 days training, 1 day testing, BL length=1000

The combined method

– significantly improves the hit count (up to 70%, 57% on avg)

– exhibits less variation over time

[Figure: total hit count for the combined method, HPB, and GWOL.]

(24)

Predicting Attacks

what is the best we can do?

[Figure: rating matrices for the training window (day t₁) and the test window (day t₂); in the example, LocalUB(v_i)=3 and GlobalUB=5.]

Local Upper Bound: # IPs in training & test window of a particular contributor

Global Upper Bound: # IPs in training window of any contributor
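The two bounds can be sketched with set operations. This sketch assumes that the global bound counts sources seen in the contributor's test window that also appear in the training window of any contributor; the slide text does not spell out the test-window part. The function name and data are hypothetical.

```python
def upper_bounds(train, test, victim):
    """train/test: dict mapping victim -> set of source IPs seen in that window.

    Returns (local_ub, global_ub) for one victim. Counting only sources
    that reappear in the victim's test window is an assumption.
    """
    seen_later = test[victim]
    local_ub = len(train[victim] & seen_later)
    global_ub = len(set().union(*train.values()) & seen_later)
    return local_ub, global_ub


# Hypothetical windows for two contributors.
train = {"v1": {"a", "b"}, "v2": {"c", "d"}}
test = {"v1": {"a", "c", "e"}, "v2": {"d"}}
bounds_v1 = upper_bounds(train, test, "v1")
```

Source "e" is new in the test window and counts toward neither bound, while "c" can only be predicted for v1 by using v2's training logs.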

(25)

Predicting Attacks

room for improvement

Collaboration helps!

Large gap from prior methods

(26)

Robustness achieved by diverse methods

robustness to random errors

E.g., an attacker may send traffic to a single victim (detected by temporal behavior) or to several victims (detected by spatial behavior), or he can limit his attack activity

(27)

Summary

Predictive Blacklisting as a Recommender System

Contributions

– Combined predictors that capture different patterns in the data

– Significant improvement with simple techniques

• still room for further improvement

– New formulation as a recommender system (collaborative filtering) problem

• paves the way to powerful techniques, e.g., capturing global structure (latent factors), joint spatio-temporal models

References

– F. Soldo, A. Le, A. Markopoulou, "Predictive Blacklisting as an Implicit Recommendation System", IEEE INFOCOM 2010 and on arXiv.org

(28)

How to use a list of malicious sources?

• A policy decision:

– E.g. scrub, give lower priority, block, monitor, do nothing …

• One option is to block (filter) malicious sources

– when: during flooding attacks by million-node botnets

– where: at firewalls or at the routers

(29)

Outline

Background

Reactive: Optimal Source-Based Filtering

Conclusion

(30)

Filtering at the routers

• Access Control Lists (ACLs)

– Match a packet header against rules, e.g. source and destination IP addresses

– Source-based filter: ACL that denies access to a source IP/prefix

• Filters implemented in TCAM

– Can keep up with high speeds

– Limited resource
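What a source-based filter does can be sketched in software with Python's standard `ipaddress` module. This only illustrates the matching logic; it says nothing about how a TCAM performs the lookup. The function name `is_blocked` and the filter list are hypothetical.

```python
import ipaddress


def is_blocked(src, filters):
    """True if source address `src` falls under any filtered IP or prefix."""
    addr = ipaddress.ip_address(src)
    return any(addr in ipaddress.ip_network(f) for f in filters)


# Two filters: one prefix covering two addresses, one single address.
filters = ["10.0.0.2/31", "10.0.0.7"]
blocked = [s for s in ("10.0.0.3", "10.0.0.4", "10.0.0.7")
           if is_blocked(s, filters)]
```

A single prefix filter covers every address beneath it, which saves filter entries but also blocks any legitimate source that shares the prefix.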

(31)

Filter Selection at a Single Router

tradeoff: number of filters vs. collateral damage

[Figure: attackers and legitimate users reach the ISP edge router; the router can filter a single attack source A.B.C.D or a whole prefix A.B.C.*.]

(32)

Optimal Source-Based Filtering

Design a family of filter selection algorithms that:

• take as input:

– a blacklist of malicious (bad) sources

– a whitelist of legitimate (good) sources

– a constraint on the number of filters Fmax

– a constraint on the access bandwidth C

– the operator’s policy

• optimally select which source IP prefixes to filter

– so as to optimize the operator’s objective

– subject to the constraints

[Figure: the IP address space from 0 to 2^32-1, with a single address A.B.C.D and a prefix A.B.C.* marked.]

(33)

Optimal Source-Based Filtering

A General Framework

[l,r]: range in the IP space

p/l: prefix p of length l

F_max: number of filters (<<N)

x_[l,r]: whether we block range [l,r] or not

W_i: weight assigned to source IP address i

(34)

Expressing Operator’s Policy

• Assignment of weights W_i is the operator’s knob:

– indicates volume of traffic sent, or importance assigned by the operator

– W_i>0 (good source i), W_i<0 (bad source i ), W_i=0 (indifferent)

• Objective function

[Equation: the objective combines the cost of good sources in range [l,r] and the cost of bad sources in range [l,r].]
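The sign convention for the weights can be illustrated with a short sketch that splits the total weight inside a range into its good and bad parts. Addresses are plain integers here, and the function name `range_cost` and the weights are hypothetical; how the two parts are traded off is the operator's policy and is not fixed by this sketch.

```python
def range_cost(weights, lo, hi):
    """Split the weights of sources in [lo, hi] into (good, bad) totals.

    weights: dict mapping integer address -> W_i, where W_i > 0 marks a
    good source and W_i < 0 a bad source.
    """
    inside = [w for ip, w in weights.items() if lo <= ip <= hi]
    good = sum(w for w in inside if w > 0)
    bad = -sum(w for w in inside if w < 0)
    return good, bad


# Hypothetical weights: three bad sources (2, 3, 7), two good ones (4, 6).
weights = {2: -5, 3: -4, 4: 1, 6: 2, 7: -3}
narrow = range_cost(weights, 2, 3)
wide = range_cost(weights, 0, 7)
```

Blocking the narrow range removes most of the bad weight at no cost to good sources, while the wide range removes all of it but also blocks both good sources.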

(35)

Filter Selection Algorithms

Problem Overview

• RANGE-based: filter IP or range [l,r]

[Soldo, El Defrawy, Markopoulou, Van De Merwe, Krishnamurthy: ITA’09]

– FILTER-ALL-RANGE

– FILTER-SOME-RANGE

– FILTER-ALL-DYNAMIC-RANGE

• PREFIX-based: filter IP source or prefix

[Soldo, Markopoulou, Argyraki: INFOCOM’09, arXiv.org]

– FILTER-ALL: block all malicious sources

– FILTER-SOME: block some malicious sources

– FILTER-ALL-DYNAMIC: BL varies over time

– FLOODING: bandwidth constraint at access router

(36)

Filter Selection Algorithms

Algorithms Overview

• RANGE-based: filter IP or range [l,r]

[Soldo, El Defrawy, Markopoulou, Van De Merwe, Krishnamurthy: ITA’09]

– FILTER-ALL-RANGE

– FILTER-SOME-RANGE

– FILTER-ALL-DYNAMIC-RANGE

• PREFIX-based: filter IP source or prefix

[Soldo, Markopoulou, Argyraki: INFOCOM’09, arXiv.org]

– FILTER-ALL: O(N)

– FILTER-SOME: O(N)

– FILTER-ALL-DYNAMIC: O(N)

– FLOODING: NP-hard, pseudo-polynomial alg. O(C^2 N) + heuristic

– DISTRIBUTED-FLOODING: distributed solution

(37)

Longest Common Prefix Tree of a BL

• LCP-Tree(BL): binary tree, leaves are addresses in BL, intermediate nodes are their longest common prefixes

• It can be found from the full binary tree of IP prefixes

• E.g. for BL={10.0.0.2, 10.0.0.3, 10.0.0.7}, the LCP-Tree(BL) is:

10.0.0.0/29 (3 bad, 5 good addresses)
    10.0.0.2/31 (0 good, 2 bad addresses)
        10.0.0.2/32
        10.0.0.3/32
    10.0.0.7/32

• Finding a set of filters:

– no need to look for all possible sets of prefixes
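One way to build the LCP tree of a blacklist is to split the sorted addresses recursively at the first bit where they differ. The sketch below does this for IPv4; the function name `lcp_tree` and the nested-tuple representation are assumptions for illustration, and the blacklist is assumed to be non-empty.

```python
import ipaddress


def lcp_tree(blacklist):
    """Build the LCP tree of a non-empty list of IPv4 address strings.

    Each node is a tuple (prefix, left, right); leaves have left = right = None.
    """
    def build(addrs):
        # The common prefix of a sorted group is the common prefix of its
        # smallest and largest members: the low `span` bits are the ones that vary.
        span = (addrs[0] ^ addrs[-1]).bit_length()
        base = (addrs[0] >> span) << span
        prefix = f"{ipaddress.IPv4Address(base)}/{32 - span}"
        if span == 0:
            return (prefix, None, None)
        mid = base + (1 << (span - 1))
        left = [a for a in addrs if a < mid]
        right = [a for a in addrs if a >= mid]
        return (prefix, build(left), build(right))

    return build(sorted({int(ipaddress.IPv4Address(a)) for a in blacklist}))


tree = lcp_tree(["10.0.0.2", "10.0.0.3", "10.0.0.7"])
```

For the blacklist in the example above, the root is 10.0.0.0/29, its left child is 10.0.0.2/31 with the two /32 leaves beneath it, and its right child is the leaf 10.0.0.7/32.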

(38)

Filter-All-Prefix

Problem Statement

• Given: a blacklist BL, weight w_i (for each good IP i), F_max filters

• choose: prefixes p/l (x_p/l)

(39)

Filter-All-Prefix

Dynamic Programming Algorithm

[Equation: the cost of the optimal allocation of F filters within a prefix p is obtained by splitting the filters between the children s_L and s_R of p in the LCP tree: F-n ≥ 1 filters within the left subtree and n ≥ 1 filters within the right subtree.]
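The recursion can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: addresses are plain integers, the table name `z`, the helper `solve`, and the function `filter_all` are hypothetical, `f_max` is assumed to be at least 1, and the weight of good addresses inside a prefix is summed by a naive scan rather than computed efficiently.

```python
def filter_all(bad, good_weight, f_max):
    """Least total weight of good addresses blocked when every address in
    `bad` must be covered by at most f_max prefix filters.

    bad: non-empty iterable of integer addresses.
    good_weight: dict mapping integer good address -> weight w_i.
    """
    INF = float("inf")

    def solve(addrs):
        # Longest common prefix of a sorted group: the low `span` bits vary.
        span = (addrs[0] ^ addrs[-1]).bit_length()
        lo = (addrs[0] >> span) << span
        hi = lo + (1 << span) - 1
        # z[f] = least collateral damage using exactly f filters in this prefix.
        z = [INF] * (f_max + 1)
        # One filter: block the whole prefix, including its good addresses.
        z[1] = sum(w for a, w in good_weight.items() if lo <= a <= hi)
        if span == 0:  # leaf: a single bad address
            return z
        mid = lo + (1 << (span - 1))
        z_left = solve([a for a in addrs if a < mid])
        z_right = solve([a for a in addrs if a >= mid])
        for f in range(2, f_max + 1):
            # f - n >= 1 filters go to the left subtree, n >= 1 to the right.
            z[f] = min(z_left[f - n] + z_right[n] for n in range(1, f))
        return z

    return min(solve(sorted(set(bad)))[1:])


# The 3-bit example: bad = {2, 3, 7}, every other address is good (weight 1).
good = {0: 1, 1: 1, 4: 1, 5: 1, 6: 1}
one_filter = filter_all([2, 3, 7], good, 1)
two_filters = filter_all([2, 3, 7], good, 2)
```

With one filter the whole /29 must be blocked, hitting all five good addresses; with two filters the /31 and the single address 7 suffice and no good address is blocked.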

(40)

Filter-All-Prefix

DP Algorithm: Example

[Figure: example run of the DP algorithm with Fmax = 4 and N = 10.]

(41)

Filter-Some-Prefix

[Figure: example with Fmax = 4 and N = 10.]

(42)

Filter-All-Prefix-Dynamic

Time-varying case

Need to be (re)computed: O(F_max log(N))

[Figure: example with Fmax = 4 and N = 10.]

(43)

FLOODING

Problem Statement

• Given: a blacklist BL, a whitelist WL, the weight of each address (= traffic volume generated), a constraint on the link capacity C, and F_max filters

• choose: source IP prefixes, x_p/l

• so as to: minimize the collateral damage

(44)

FLOODING

DP Algorithm

• FLOODING is NP-hard

– reduction from knapsack with cardinality constraint (1.5K)

• An optimal pseudo-polynomial dynamic programming algorithm solves the problem in O((C F_max)^2 N)

– similar to the previous DP, but solves a 2-dimensional KP

– the LCP-Tree includes both good and bad addresses

(45)

Distributed Flooding

filters at several routers

• Deploy filters at several routers

– increase total filter budget

• Each router (u) has its own:

– view of good/bad traffic

– capacity in incoming link

– filter budget

• Filtering at several routers:

– not only which prefix to block

– but also on which router

• Solution:

– can be solved in a distributed way

– outperforms independent decisions

[Figure: attackers send traffic through several routers toward the victim.]

(46)

Evaluation using Dshield data

FLOODING vs. rate limiting

• Attack sources: from the point of view of a single victim in Dshield

• Good sources: [Kohler et al. TON’06, Barford et al. PAM’06]

• Before attack: good traffic was C/10 < C

• During attack: bad traffic is 10C

(47)

Intuition why optimization helps

compared to non-optimized filtering

• Malicious sources are clustered in the IP address space

• Malicious sources are not co-located with legitimate sources

(48)

Evaluation using Dshield data (2)

FILTER-ALL-PREFIX vs. generic clustering algorithms

• Malicious addresses:

– attacking 2 specific victim networks (most and least clustered) in the Dshield dataset

• Good addresses generated:

(49)

Evaluation using Dshield data (3)

DISTRIBUTED-FLOODING: the value of coordination

[Figure: axis label D/N.]

(50)

Summary

• Framework for optimal filter selection

– defined various filtering problems

– designed efficient algorithms to solve them

• Leads to significant improvements on real datasets

– Compared to non-optimized filter selection, to generic clustering, or to uncoordinated routers

(51)

Outline

Background

Malicious Internet Traffic: Attack and Defenses

Two Defense Mechanisms

Proactive: Blacklisting as a Recommendation System

Reactive: Filtering as an Optimization Problem

Conclusion

Parts of a larger system that collects and analyzes data from multiple sensors and takes appropriate action

(52)

Thank you!

[email protected]