Network Security: Attacks and Defenses


Christian CALLEGARI

Department of Information Engineering University of Pisa

PhD Winter School

IP Traffic Characterization and Anomaly Detection

(3)

Short Bio

Post-Doctoral Fellow with the Telecommunication Network research group at the Dept. of Information Engineering of the University of Pisa

B.E. degree in 2002 from the University of Pisa, discussing a thesis on Network Firewalls

M.S. degree in 2004 from the University of Pisa, discussing a thesis on Network Simulation

PhD in 2008 from the University of Pisa, discussing a thesis on Network Anomaly Detection

Contacts

Dept. of Information Engineering Via Caruso 16 - 56122 Pisa - Italy

(5)

Outline

1 Introduction

2 Network Anomalies

(6)

Outline

1 Introduction

Basic Principles

Intruders

Intrusions

2 Network Anomalies

(7)

Network Security

Definition from Wikipedia

In the field of networking, the specialist area of network security consists of the provisions made in an underlying computer network infrastructure, policies adopted by the network administrator to protect the network and the network-accessible resources from unauthorized access, and consistent and continuous monitoring and measurement of its effectiveness (or lack) combined together.

(8)

Information Security

Definition from Wikipedia

Information security means protecting information and information systems from unauthorized access, use, disclosure, disruption, modification or destruction

(9)

Who is vulnerable?

Financial institutions and banks

Internet service providers

Pharmaceutical companies

Government and defense agencies

Contractors to various government agencies

Multinational corporations

(10)

Cornerstones

Confidentiality

Availability

Integrity

Authenticity

Non-repudiation

(11)

Confidentiality

Definition

Confidentiality has been defined by the International Organization for Standardization (ISO) in ISO-17799 as “ensuring that information is accessible only to those authorized to have access”

Confidentiality is one of the design goals for many crypto-systems

(12)

Availability

Definition

The degree to which a system, subsystem, or equipment is operable and in a committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time. Simply put, availability is the proportion of time a system is in a functioning condition

(13)

Integrity

Definition

Data integrity is data that has a complete or whole structure. All characteristics of the data including business rules, rules for how pieces of data relate, dates, definitions and lineage must be correct for data to be complete.

Integrity can be guaranteed by several security mechanisms (e.g., hash function, data authentication, digital signature)

(14)

Authenticity

Definition

Authentication is the act of establishing or confirming something (or someone) as authentic, that is, that claims made by or about the subject are true

Mechanisms able to guarantee information authenticity:

A difficult-to-reproduce physical artifact, such as a seal, signature, watermark, special stationery, or fingerprint

A shared secret, such as a passphrase, in the content of the message

(15)

Non-repudiation

Definition

Non-repudiation is the concept of ensuring that a party in a dispute cannot repudiate, or refute the validity of a statement or contract

The most common method of asserting the digital origin of data is through digital certificates. The ways in which a party may attempt to repudiate a signature present a challenge to the trustworthiness of the signatures themselves. The standard approach to mitigating these risks is to involve a trusted third party.

(17)

A taxonomy of the intruders

Intruders can be classified as

Masquerader: an individual who is not authorized to use the computer and who penetrates a system’s access control to exploit a legitimate user’s account

Misfeasor: a legitimate user who accesses data, programs, or resources for which such access is not authorized, or who is authorized for such access, but misuses his/her privileges

Clandestine User: an individual who seizes supervisory control of the system and uses the control to evade auditing and access controls

(18)

A taxonomy of the intrusions

Intrusions can be classified as

Eavesdropping and Packet Sniffing: passive interception of network traffic

Snooping and Downloading

Tampering and Data Diddling: unauthorized changes to data or records

Spoofing: impersonating other users

Jamming or Flooding: overwhelming a system’s resources

Injecting Malicious Code

Exploiting Design or Implementation Flaws (e.g., buffer overflow)

(20)

Outline

1 Introduction

2 Network Anomalies

Information gathering

Passive attacks

Spoofing attacks

Scanning attacks

DoS attacks

Man-in-the-Middle

DNS Cache Poisoning

3 Intrusion Detection Systems

(21)

Security Vulnerabilities

Attacks on Different Layers:

L2 Attacks

IP Attacks

ICMP Attacks

Routing Attacks

TCP Attacks

(22)

Network Attack: Life Cycle

1 Information gathering (active or passive)

2 Scanning

3 Gaining access

4 Maintaining access

(23)

Social Engineering

Definition

Social engineering is the act of manipulating people into performing actions or divulging confidential information, rather than by breaking in or using technical hacking techniques

Passive: involves acquiring information without directly interacting with the target (e.g., searching public records or news releases)

Active: involves interacting with the target directly by any means (e.g., telephone calls to the help desk or technical dept.)

(24)

whois

WHOIS is a query/response protocol that is widely used for querying databases in order to determine the registrant or assignee of Internet resources, such as

a domain name

an IP address block

an autonomous system number

some additional information
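The query/response exchange can be sketched in a few lines. This is a minimal illustration, not a full client: it assumes the plain WHOIS wire format (the query text followed by CRLF, sent over TCP port 43), and the helper names `build_whois_request` and `whois`, as well as the default server `whois.iana.org`, are choices made here for the example.

```python
import socket

WHOIS_PORT = 43  # well-known TCP port of the WHOIS service


def build_whois_request(name: str) -> bytes:
    """A WHOIS request is just the query text terminated by CRLF."""
    return name.strip().encode("ascii") + b"\r\n"


def whois(name: str, server: str = "whois.iana.org", timeout: float = 10.0) -> str:
    """Send one query and read the reply until the server closes the connection."""
    chunks = []
    with socket.create_connection((server, WHOIS_PORT), timeout=timeout) as sock:
        sock.sendall(build_whois_request(name))
        while True:
            data = sock.recv(4096)
            if not data:  # the server signals the end of the answer by closing
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `whois("example.com")` needs network access; the reply is free-form text whose layout depends on the registry that answers.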

(28)

host & nslookup

host and nslookup are utilities for performing Domain Name System (DNS) lookups

(29)

nslookup

(30)

Passive attacks

A passive attack is characterised by the interception of messages without modification

There is no change to the network data or systems

The message itself may be read, or its occurrence may simply be logged

Some protocols do not encrypt their data!

(31)

Eavesdropping

Eavesdropping is the act of secretly listening to the private conversation of others without their consent

Some tools - Packet Analyzers (usually called “Sniffers”):

tcpdump

ethereal/wireshark

Cain and Abel

Sniffing can be easily done on a “classical” (shared-medium) Ethernet, but not on a switched Ethernet

In a switched Ethernet there are two possibilities:

port mirroring on the switch

(33)

Spoofing attacks

A spoofing attack is a situation in which one person or program successfully masquerades as another by falsifying data and thereby gaining an illegitimate advantage.

Spoofing can be realized at different layers:

MAC layer: ifconfig <if> hw ether <MAC addr>

IP layer: ifconfig <if> <ip addr>

Application layer: e.g., The sender information shown in e-mails (the ”From” field) can be spoofed easily. This technique is commonly used by spammers to hide the origin of their e-mails

(38)

Scanning attacks

Network Scan

Network Scanning is a procedure for identifying active hosts on a network, either for the purpose of attacking them or for network security assessment

Host Scan

Host scanning is a procedure for identifying open/filtered/closed ports on a host

(39)

Network Scan

The most common way of scanning a network is the ping sweep technique

Ping Sweep

A ping sweep is a technique used to determine which of a range of IP addresses map to live hosts

It consists of ICMP ECHO request packets sent to multiple hosts

If a given address is live, it will return an ICMP ECHO reply

ICMP TIMESTAMP and ICMP INFO can be used in a similar manner (useful if the victim machines are configured not to answer ICMP ECHO requests)
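The ICMP ECHO request used by a ping sweep can be sketched as follows. This is only an illustration of the packet layout (type 8, code 0, checksum, identifier, sequence number, payload); the function names and the default payload are assumptions made for the example, and actually sending the packet requires a raw socket and administrator privileges, which is left out.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMP ECHO request (type 8, code 0) with a valid checksum."""
    # The checksum is computed over the whole message with the checksum field zeroed.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload
```

A sweep would send one such request to each address in the range and record which ones answer with an ECHO reply (type 0).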

(40)

Network Scan

It can be useful to identify victim machines as well as zombies

Classical tools: hping, nmap

(41)

Host Scan

The result of a scan on a port is usually generalized into one of three categories:

Open or Accepted: The host sent a reply indicating that a service is listening on the port

Closed or Denied or Not Listening: The host sent a reply indicating that connections will be denied to the port (e.g., ICMP port unreachable message)

Filtered, Dropped or Blocked: There was no reply from the host
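The three categories can be illustrated with a plain TCP connect probe. Note that this is a full `connect()` attempt, not the half-open SYN scan that nmap performs, because a connect probe needs no raw sockets; the function name `probe_tcp_port` and the one-second timeout are assumptions made for this sketch.

```python
import errno
import socket

# Error codes that mean "the host actively refused the connection".
_REFUSED = {errno.ECONNREFUSED, getattr(errno, "WSAECONNREFUSED", errno.ECONNREFUSED)}


def probe_tcp_port(host: str, port: int, timeout: float = 1.0) -> str:
    """Classify a TCP port as open, closed or filtered using a connect() attempt."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            code = sock.connect_ex((host, port))
        except socket.timeout:
            return "filtered"  # no reply at all within the timeout
    if code == 0:
        return "open"  # handshake completed: a service is listening
    if code in _REFUSED:
        return "closed"  # the host replied with a reset
    return "filtered"  # timeout or other failure: no usable reply
```

Only probe hosts you are authorized to test, for example `probe_tcp_port("127.0.0.1", 22)` on your own machine.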

(42)

Host Scan

A host scan can be performed in several ways:

SYN scanning also known as half-open scanning

nmap -sS 192.168.1.10

UDP scanning

nmap -sU 192.168.1.10

ACK scanning: can be useful when packet filtering blocks packets without the ACK flag set. It does not exactly determine whether the port is open or closed, but whether the port is filtered or unfiltered (useful to detect the presence of a firewall)

(43)

Host Scan

A host scan can be performed in several ways:

FIN scanning: useful if SYN packets are blocked by the firewall. Closed ports reply to a FIN packet with a RST packet, whereas open ports silently ignore it

nmap -sF 192.168.1.10

Xmas scanning: TCP packets with several flags set at once (nmap -sX sets FIN, PSH and URG)

(45)

OS guess

A sniffer can use TCP/IP stack fingerprinting to guess the O.S. running on a machine.

The TCP/IP fields that may vary include the following:

Initial packet size (16 bits)

Initial TTL (8 bits)

Window size (16 bits)

Max segment size (16 bits)

Window scaling value (8 bits)

“don’t fragment” flag (1 bit)

“sackOK” flag (1 bit)

“nop” flag (1 bit)

(46)

OS guess

(47)

Idle Scan

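The idle scan is a TCP port scan method in which the attacker, using tools such as Nmap and Hping, sends spoofed packets so that the scan appears to come from a third machine (the zombie)

First of all it is necessary to identify a zombie (by means of a ping sweep)

The zombie must be idle, i.e., it must exchange almost no traffic of its own, so that changes in its IP ID counter can be attributed to the scan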
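The inference step of an idle scan can be sketched as follows, assuming a zombie whose IP ID counter increases by one for every packet it sends. The attacker reads the zombie's IP ID, sends the target a SYN spoofed with the zombie's address, then reads the IP ID again. The function name and the returned labels are choices made for this example.

```python
def classify_from_ipid(ipid_before: int, ipid_after: int) -> str:
    """Interpret the zombie's IP ID change across one spoofed SYN probe."""
    delta = (ipid_after - ipid_before) % 65536  # the IP ID is a 16-bit counter
    if delta == 2:
        # The zombie sent a RST in answer to the target's SYN+ACK,
        # plus the reply to our second IP ID probe.
        return "open"
    if delta == 1:
        # The zombie only answered our second probe.
        return "closed or filtered"
    # Any other value means the zombie was not idle or its counter is not sequential.
    return "inconclusive"
```

This is why the zombie has to be idle: any packet it sends on its own account changes the counter and makes the result inconclusive.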

(51)

Denial of Service

A denial-of-service attack (DoS attack) or distributed denial-of-service attack (DDoS attack) is an attempt to make a computer resource unavailable to its intended users.

The five basic types of attack are:

1 Consumption of computational resources, such as bandwidth, disk space, or processor time

2 Disruption of configuration information, such as routing information

3 Disruption of state information, such as unsolicited resetting of TCP sessions

4 Disruption of physical network components

5 Obstructing the communication media between the intended users and the victim

(52)

SYN Flooding

The SYN flood is a well known type of attack and is generally not effective against modern networks. It works if a server allocates resources after receiving a SYN, but before it has received the ACK.

There are two methods, but both involve the server not receiving the ACK:

A malicious client can skip sending the last ACK message

Or it can spoof the source IP address in the SYN, so that the SYN+ACK is sent to an address that never answers

The technology often used in 1996 for allocating resources for half open TCP connections involved a queue which was often

(53)

SYN cookies

SYN cookies provide protection against the SYN flood by eliminating the resources allocated on the target host. Daniel J. Bernstein, the technique’s primary inventor, defines SYN Cookies as “particular choices of initial TCP sequence numbers by TCP servers.”

SYN cookies allow a server to avoid dropping connections when the SYN queue fills up

The server sends back the appropriate SYN+ACK response to the client but discards the SYN queue entry

If the server then receives a subsequent ACK response from the client, the server is able to reconstruct the SYN queue entry using information encoded in the TCP sequence number

(54)

SYN cookies

SYN cookie calculation: $n = t \,\Vert\, m \,\Vert\, s$, i.e., the concatenation of t (5 bits), m (3 bits) and s (24 bits), where

t = A slowly-incrementing timestamp (typically time() logically right-shifted 6 positions, which gives a 64 second resolution), taken modulo 32 to fit in 5 bits

m = The 3-bit encoding of the maximum segment size (MSS) value that the server would have stored in the SYN queue entry

s = The result of a cryptographic secret function computed over the server IP address and port number, the client IP address and port number, and the value t. The returned value “s” must be a 24-bit value.

When the server receives back an ACK (its acknowledgment number will be n + 1), it performs the following operations:

Checks the value t against the current time to see if the connection is expired

Recomputes s to determine whether this is, indeed, a valid SYN Cookie

Decodes the value m from the 3-bit encoding in the SYN Cookie, which it then can use to reconstruct the SYN queue entry
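The calculation and the three checks can be sketched as follows. This is a simplified illustration rather than a real TCP stack: the secret key, the 8-entry MSS table, the use of HMAC-SHA256 as the "cryptographic secret function" and the two-tick expiry window are all assumptions made for the example.

```python
import hashlib
import hmac

SECRET = b"hypothetical-server-secret"  # assumption: a key known only to the server
MSS_TABLE = (536, 1024, 1220, 1300, 1360, 1400, 1440, 1460)  # assumption: 8 values, 3 bits


def _s(conn: tuple, t: int) -> int:
    """24-bit keyed hash over the connection 4-tuple and the counter t."""
    digest = hmac.new(SECRET, repr((conn, t)).encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:3], "big")


def make_cookie(conn: tuple, mss_index: int, now: float) -> int:
    """Return n = t (5 bits) || m (3 bits) || s (24 bits)."""
    t = (int(now) >> 6) % 32  # 64-second counter reduced to 5 bits
    return (t << 27) | ((mss_index & 0x7) << 24) | _s(conn, t)


def check_ack(ack_number: int, conn: tuple, now: float, max_age: int = 2):
    """Validate the client's ACK (ack number = n + 1); return the MSS or None."""
    n = (ack_number - 1) % 2 ** 32
    t, m, s = n >> 27, (n >> 24) & 0x7, n & 0xFFFFFF
    if ((int(now) >> 6) - t) % 32 > max_age:
        return None  # 1. the cookie has expired
    if s != _s(conn, t):
        return None  # 2. s does not match: not a cookie we issued
    return MSS_TABLE[m]  # 3. decode m to rebuild the queue entry
```

Here `conn` stands for the tuple (server IP, server port, client IP, client port); the server keeps no per-connection state between sending the SYN+ACK and receiving the ACK.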

(55)

UDP Flooding

A UDP flood attack can be initiated by sending a large number of UDP packets to random ports on a remote host.

As a result the host will:

Check for the application listening at that port

See that no application listens at that port

Reply with an ICMP Destination Unreachable packet

For a large number of UDP packets, the victimized system will be forced into sending many ICMP packets, eventually making it unreachable by other clients

(56)

ICMP Flooding

A ping flood is a simple denial-of-service attack where the attacker overwhelms the victim with ICMP Echo Request (ping) packets

Smurf attack

A Smurf attack consists of sending a large amount of ICMP echo request traffic to IP broadcast addresses, all of which have a spoofed source IP address of the intended victim

Most OSs can be configured not to answer ICMP packets sent to a broadcast IP address, but this does not prevent the victim from being attacked

Ping of Death

A ping of death (abbreviated “POD”) is a type of attack on a computer that involves sending a malformed or otherwise malicious ping to a computer

(57)

Recall: ARP

The Address Resolution Protocol (ARP) is a computer networking protocol for determining a network host’s link layer or hardware address when only its Internet Layer (IP) or Network Layer address is known

(58)

ARP Spoofing

ARP poisoning

An attacker C can send a spoofed ARP reply (gratuitous ARP) message to a given host A, claiming to be host B

MAC flooding

By flooding a switch’s MAC address table with frames carrying spoofed source addresses, the attacker can overload the switch, making it enter “forwarding mode”, in which frames are sent out on all ports as a hub would do

(59)

Man-in-the-Middle

A host C that poisons the ARP caches of two hosts A and B communicating with each other can realize a Man-in-the-Middle attack, where:

host A communicates with host C, believing it is communicating with host B

host B communicates with host C, believing it is communicating with host A

host C can intercept (active sniffing) or modify the whole traffic
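On the defense side, a poisoned cache often shows the same MAC address answering for several IP addresses. The check below is a minimal sketch of that idea; the function name and the table format (a dict mapping IP address to MAC address) are assumptions made for the example, and a match is only a hint, since proxy ARP or multi-homed hosts produce the same pattern legitimately.

```python
from collections import defaultdict


def suspicious_macs(arp_table: dict) -> dict:
    """Return the MAC addresses that are claimed by more than one IP address."""
    ips_by_mac = defaultdict(set)
    for ip, mac in arp_table.items():
        ips_by_mac[mac.lower()].add(ip)  # MAC addresses are case-insensitive
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}
```

For instance, if host A's cache maps both B's address and the gateway's address to C's MAC, that MAC is reported together with the two IP addresses.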

(61)

DNS packet

Query ID: a unique identifier created in the query packet

QR (Query / Response): Set to 0 for a query by a client, 1 for a response from a server

Opcode: Set by client to 0 for a standard query; the other types aren’t used in our examples

AA (Authoritative Answer): Set to 1 in a server response if this answer is Authoritative, 0 if not

TC (Truncated): Set to 1 in a server response if the answer can’t fit in the 512-byte limit of a UDP packet response

RD (Recursion Desired): The client sets this to 1 if it wishes that the server will perform the entire lookup of the name recursively

RA (Recursion Available): The server sets this to indicate that it will (1) or won’t (0) support recursion

(62)

DNS packet

rcode: Response code from the server: indicates success or failure

Question record count: The client fills in the next section with a single “question” record that specifies what it’s looking for: it includes the name (www.unixwiz.net), the type (A, NS, MX, etc.), and the class (virtually always IN=Internet)

Answer/authority/additional record count: Set by the server, these provide various kinds of answers to the query from the client

DNS Question/Answer data: This is the area that holds the question/answer data referenced by the count fields above
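The header fields listed above occupy the first 12 bytes of every DNS message: six 16-bit words, the second of which packs the flags. A minimal decoding sketch follows; the function name and the dictionary keys are choices made for this example.

```python
import struct


def parse_dns_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte DNS header into named fields."""
    query_id, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    return {
        "query_id": query_id,
        "qr": (flags >> 15) & 0x1,      # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,  # 0 = standard query
        "aa": (flags >> 10) & 0x1,      # authoritative answer
        "tc": (flags >> 9) & 0x1,       # truncated
        "rd": (flags >> 8) & 0x1,       # recursion desired
        "ra": (flags >> 7) & 0x1,       # recursion available
        "rcode": flags & 0xF,           # response code
        "question_count": qd,
        "answer_count": an,
        "authority_count": ns,
        "additional_count": ar,
    }
```

The question and answer records that follow the header have variable length and are not decoded here.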

(70)

Standard DNS query

DNS Cache

Once we get an authoritative answer for a given name, we can save it in a local cache to use to satisfy future queries directly

DNS Cache TTL

Each entry in the DNS cache has a time-to-live, measured in seconds. The administrator of the zone specifies this information for every resource record
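A cache with per-entry TTL can be sketched as follows. The class name `DnsCache`, its methods and the injectable clock are assumptions made for this example; a real resolver cache stores full resource records rather than a single address.

```python
import time


class DnsCache:
    """Tiny name -> address cache whose entries disappear once their TTL elapses."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so that expiry can be tested
        self._entries = {}       # name -> (address, expiry time)

    def put(self, name: str, address: str, ttl_seconds: int) -> None:
        self._entries[name.lower()] = (address, self._clock() + ttl_seconds)

    def get(self, name: str):
        entry = self._entries.get(name.lower())
        if entry is None:
            return None          # never cached: a full lookup is needed
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name.lower()]
            return None          # TTL elapsed: the entry must be fetched again
        return address
```

Until the TTL elapses every query for the name is answered from the cache, which is also why a forged entry keeps being served for its whole lifetime.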

(73)

DNS Cache Poisoning

Note that the attack only works if:

the name isn’t already in the cache

the bad guy guesses the query ID
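How hard the guess is can be estimated with a one-line calculation. The sketch assumes that the query ID is the only unknown (it is a 16-bit field, so there are 65536 possible values), that the attacker sends forged replies with distinct IDs, and that they all arrive before the genuine answer; the function name is a choice made for this example.

```python
def spoof_success_probability(forged_replies: int, id_space: int = 2 ** 16) -> float:
    """Chance that one of k forged replies with distinct IDs carries the right query ID."""
    return min(forged_replies, id_space) / id_space
```

With 100 forged replies per attempt the chance is 100/65536, roughly 0.15%, so an attacker has to repeat the attempt many times.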

(75)

Outline

1 Introduction

2 Network Anomalies

3 Intrusion Detection Systems

Motivations

Taxonomy of the Intrusion Detection Systems

Some Useful Definitions

(76)

Why an intrusion detection system?

Network security mainly means PREVENTION

Physical protection for hardware

Passwords, access tokens, etc. for authentication

Access control lists for authorization

Cryptography for secrecy

Backups and redundancy for availability

. . . and so on

BUT . . .

(77)

What is an Intrusion Detection System?

Prevention is suitable when

Internal users are trusted

Limited interaction with other networks

Need for a system which acts when prevention fails

Intrusion Detection System

An intrusion detection system (IDS) is a software/hardware tool used to detect unauthorized accesses to a computer system or a network

(78)

IDS Taxonomy

Intrusion Detection Systems are classified on the basis of several criteria:

1 Scope

Host IDS (HIDS)

Network IDS (NIDS)

2 Architecture

Centralized

Distributed

3 Analysis Techniques

Stateful

Stateless

(79)

Host based vs. Network based

Host based IDS

Aimed at detecting attacks related to a specific host

Architecture/Operating System dependent

Processing of high level information (e.g. system calls)

Effective in detecting insider misuse

Network based IDS

Aimed at detecting attacks towards hosts connected to a LAN

Architecture/Operating System independent

(80)

Centralized IDS vs. Distributed IDS

Centralized IDS

All the operations are performed by the same machine

Simpler to implement

Single point of failure

Distributed IDS

Composed of several components

Sensors which generate security events

Console to monitor events and alerts and control the sensors

Central engine that records events and generates alarms

(81)

Stateless IDS vs. Stateful IDS

Stateless IDS

Treats each event independently of the others

Simple system design

High processing speed

Stateful IDS

Maintains information about past events

The effect of a certain event depends on its position in the events stream

More complex system design
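The difference can be illustrated with two toy detectors. Both are sketches under stated assumptions: packets are represented as dicts with hypothetical `src` and `flags` keys, the stateless rule looks for a FIN+PSH+URG flag combination, and the stateful rule uses an arbitrary threshold of bare SYNs per source.

```python
from collections import Counter


def stateless_check(packet: dict) -> bool:
    """Decide on a packet using only its own fields."""
    return {"FIN", "PSH", "URG"} <= set(packet["flags"])


class StatefulSynCounter:
    """Remember past events: flag a source after more than `threshold` bare SYNs."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.syn_count = Counter()  # state kept across packets, one counter per source

    def check(self, packet: dict) -> bool:
        if set(packet["flags"]) == {"SYN"}:
            self.syn_count[packet["src"]] += 1
        return self.syn_count[packet["src"]] > self.threshold
```

A single SYN looks harmless to the stateless rule every time; only the stateful detector notices that the same source keeps sending them, at the price of keeping a table in memory.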

(82)

Misuse based IDS vs. Anomaly based IDS

Misuse based IDS

Identifies intrusion by looking for patterns of traffic or of application data presumed to be malicious

Patterns of misuse are stored in a database

Effective in detecting only “known” attacks

Anomaly based IDS

Identifies intrusions by classifying activity as either anomalous or normal

(83)

IDS State of the Art

Focus is on Network based IDSs (The only ones effective in detecting Distributed Denial of Service - DDoS)

State of the art IDSs are Misuse Based

Most attacks are realized by means of software tools available on the Internet

Most attacks are “well-known” attacks

BUT . . .

. . . The most dangerous attacks are those written ad hoc by the intruder!

(84)

The best choice?

Combined use of both

HIDS (for insider attacks) & NIDS (for outsider attacks)

Misuse IDS (low False Alarm rate) & Anomaly IDS (for “new” attacks)

Stateless IDS (fast data processing) & Stateful IDS (for “complex” attacks)

Distributed IDS

Not a single point of failure

(86)

Definitions

False Positive (FP): the error of rejecting a null hypothesis when it is actually true. In our case it means raising an alarm in correspondence with normal activity

False Negative (FN): the error of failing to reject a null hypothesis when it is in fact not true. In our case it corresponds to a missed detection
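Given a labeled trace, the corresponding rates can be computed as in the sketch below. The function name and the input format (two parallel lists of booleans, one saying whether the IDS raised an alarm and one saying whether the event really was an attack) are assumptions made for this example.

```python
def detection_metrics(alarms, attacks) -> dict:
    """Compute false positive, false negative and detection rates from labeled events."""
    pairs = list(zip(alarms, attacks))
    tp = sum(1 for a, i in pairs if a and i)          # attack, alarm raised
    fp = sum(1 for a, i in pairs if a and not i)      # normal activity, alarm raised
    fn = sum(1 for a, i in pairs if not a and i)      # attack, no alarm
    tn = sum(1 for a, i in pairs if not a and not i)  # normal activity, no alarm
    return {
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
    }
```

The detection rate is one minus the false negative rate.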

(87)

ROC Curve

Plots Detection Rate vs. False Positive Rate

[Figure: ROC curves plotting Detection Rate against False Alarm Rate, both on a 0 to 1 scale, for three curves: Non Stationary ECDF, Non Homogeneous MC, Homogeneous MC]

(88)

ROC Curve

Results presented by the ROC are often considered incomplete because

they do not take into account the cost of missed attacks

they do not take into account the cost of false alarms

they do not say if the system itself is resistant to attacks

. . .

Several researchers are working on more complete ways of representing the results

(89)

DARPA Evaluation Program

The 1998/1999 DARPA/MIT IDS evaluation program is the most comprehensive evaluation performed to date

It provides a corpus of data for the development, improvement, and evaluation of IDSs

Different kinds of data are available:

Operating system logs

Network traffic

Collected by an “inside” sniffer

Collected by an “outside” sniffer

The data model the network traffic measured between a US Air Force base and the Internet

(91)

The DARPA Dataset

5 weeks of data

Data from weeks 1 and 3 are attack free and can be used to train the system

Data from week 2 contain labeled attacks and can be used to build the signature database

Data from weeks 4 and 5 contain several attacks and can be used for the detection phase

An Attack Truth list is provided

Attacks are categorized as

Denial of Service (DoS)

User to Root (U2R)

Remote to Local (R2L)

Data

(92)

Other Data-sets

The DARPA data-set has many drawbacks:

simulated environment

not up-to-date traffic

the methodology used for generating the traffic has been shown to be inappropriate for simulating actual networks

Other Data-sets:

several publicly available traffic traces

e.g. CAIDA, Abilene (Internet2), GEANT, . . .

(93)

Base Rate Fallacy

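Let’s define:

$A$ = alarm

$\neg A$ = not an alarm

$I$ = attack

$\neg I$ = not an attack

$P(A \mid \neg I)$ = False positive probability

$P(\neg A \mid I)$ = False negative probability

Some definitions:

$$P(A \mid B) = \frac{P(A) \cdot P(B \mid A)}{P(B)}$$

$$P(B) = \sum_{i=1}^{n} P(A_i) \cdot P(B \mid A_i)$$

where the events $A_1, \dots, A_n$ partition the sample space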

(94)

Base Rate Fallacy

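Let’s suppose:

$P(A \mid I) = 0.99$

$P(\neg A \mid \neg I) = 0.99$, hence $P(A \mid \neg I) = 0.01$

we have 2 attacks a day over $10^6$ pkts (base rate = 1/500000)

Applying the Bayes theorem:

$$P(I \mid A) = \frac{P(I) \cdot P(A \mid I)}{P(I) \cdot P(A \mid I) + P(\neg I) \cdot P(A \mid \neg I)} = \frac{2 \cdot 10^{-6} \cdot 0.99}{2 \cdot 10^{-6} \cdot 0.99 + (1 - 2 \cdot 10^{-6}) \cdot 0.01} \approx 1.98 \cdot 10^{-4}$$

i.e., only about 2 alarms out of 10000 correspond to a real attack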
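The same computation can be repeated for other operating points with a short function. It is a direct transcription of the Bayes theorem for the two events "attack" and "not an attack"; the function and parameter names are choices made for this example.

```python
def posterior_attack_given_alarm(base_rate: float,
                                 detection_rate: float,
                                 false_positive_rate: float) -> float:
    """P(I|A) from P(I), P(A|I) and P(A|not I)."""
    true_alarms = base_rate * detection_rate                # P(I) * P(A|I)
    false_alarms = (1.0 - base_rate) * false_positive_rate  # P(not I) * P(A|not I)
    return true_alarms / (true_alarms + false_alarms)


# Values of the example: 2 attacks over 10^6 packets, 99% detection, 1% false positives.
p = posterior_attack_given_alarm(2 / 10**6, 0.99, 0.01)
```

With these values `p` is roughly 0.0002: although the detector is "99% accurate", almost every alarm is a false one because attacks are so rare. Lowering the false positive probability improves the posterior far more than raising the detection rate.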

(95)