Using Argus to analyse network flows. David Ford OxCERT Oxford University Computer Services

(1)

Using Argus to analyse network flows

David Ford, OxCERT

(2)

What are network flows?

A convenient way of representing traffic on your network

Contain a timestamp, the source/destination IP, protocol/port, traffic volumes, and a status (eg RST, CON, TIM)

One flow may represent many packets
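To make those fields concrete, a flow record could be modelled roughly as below. This is a minimal Python sketch: the class name, field names and sample values are illustrative assumptions, not Argus's actual record format.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """One flow record summarising many packets (hypothetical field names)."""
    start_time: str
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    packets: int
    total_bytes: int
    state: str  # status such as RST, CON or TIM


# Made-up example: a single record standing in for 12 packets
flow = Flow("2009-03-01 10:15:02", "tcp", "192.168.1.3", 51234,
            "192.0.2.10", 22, 12, 1840, "CON")
print(f"{flow.src_ip}:{flow.src_port} -> {flow.dst_ip}:{flow.dst_port} "
      f"{flow.packets} pkts, {flow.total_bytes} bytes, state {flow.state}")
```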

(3)

Why would I want to use them?

Good for understanding network incidents

Can help you to identify suspicious/abusive behaviour

Can help in tracing other network issues (eg tracing the source of load on a particular link)

(4)

What is Argus?

Which Argus? We mean the Audit Record Generation and Utilisation System (http://www.qosient.com/argus)

It can capture from a live interface (eg a mirror/span port, or a fibre tap), from a Cisco netflow source, or from a pcap file, and indirectly from other sources such as sFlow (note some features require a live

(5)

What is argus (2)

It stores data in its own record format, and contains tools to extract data as required

Most of the tools for using it are command line driven, but can easily be automated to produce useful reports, or to extract the data you need

(6)

Where to capture?

Depends largely on your network topology, what you want to see, and how much data you wish to collect

Things to consider include:

- locations of NATs: do you want to see traffic before or after NATing?

- firewalling: do you want to see traffic that gets through the firewall, or traffic that doesn’t?

(7)

A warning about NATs

Even having flows from both before and after NATing does not guarantee that you can trace the source of a malicious flow

If a single destination host has both malicious and non-malicious connections from behind the NAT, it may not be possible to distinguish these without logs of NAT translations

(8)

How much data

Argus records can be compressed (using gzip) - our experience suggests this can cut the size requirements by a factor of 5-10 (or more in some cases - particularly for regular scanners)

Data size will depend a lot on how many sources and how many flows you are recording - for our systems - based on a typical unit this might be

(9)

Understanding your data sources

It’s important to understand how different types of packet flows are handled by your flow capturing devices

This depends on your network configuration and on the equipment/protocol you are using to capture flows. For example a scan of unresponsive hosts may be recorded as INT, or TIM/RST this

(10)

data sources (part 2)

How do your capture devices cope if they receive too many flows to process? Do they sample the data (and if so, how)? In the case of a router, do they stop passing packets, or do they stop recording flows?

Remember the point where you receive unexpectedly large numbers of flows is

(11)

What can you do with the flows?

Incident investigation - how did a machine get hacked, from where?

Spotting malicious hosts, P2P, other rogue traffic

(12)

Incident Investigation

Our starting point is that we know a particular machine has been compromised (possibly through other flow analysis, or possibly because we’ve been alerted to it from elsewhere)

We want to know:

- how was it compromised, and when?

- did the attackers get in anywhere else?

(13)

Spotting malicious traffic as it happens

We’ve so far been looking at data that’s been collected and archived

We can also analyse live data to identify unexpected traffic patterns such as scanners, P2P users, botnets etc.

Older Argus versions (2.0.5) came with an example perl script to do this - you may wish to write your

(14)

The aim is to import argus data as it is recorded and to look for patterns such as repeated connections out to different hosts on a single port (scanning), or huge numbers of inbound connections

The “holy grail” could be some way to track connection behaviour for hosts against past traffic patterns, however I’m not aware of any such scripts for argus
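The scanning pattern described above (one source contacting many different hosts on a single port) can be sketched as follows. This is a minimal Python illustration, not the example script shipped with Argus: the function name, the tuple layout, the threshold and the sample addresses are all assumptions.

```python
from collections import defaultdict


def find_scanners(flows, threshold=3):
    """Map (src_ip, dst_port) to its distinct destination count, if >= threshold."""
    targets = defaultdict(set)
    for src_ip, dst_ip, dst_port in flows:
        targets[(src_ip, dst_port)].add(dst_ip)
    return {key: len(dsts) for key, dsts in targets.items()
            if len(dsts) >= threshold}


# Made-up flows: 10.0.0.5 touches three hosts on port 445, 10.0.0.7 only one on 80
sample_flows = [
    ("10.0.0.5", "192.0.2.1", 445),
    ("10.0.0.5", "192.0.2.2", 445),
    ("10.0.0.5", "192.0.2.3", 445),
    ("10.0.0.7", "192.0.2.1", 80),
    ("10.0.0.7", "192.0.2.1", 80),
]
scanners = find_scanners(sample_flows)
print(scanners)
```

In practice the threshold would need tuning per network, and the counts would probably be kept over a sliding time window rather than over all flows seen.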

(15)

Use of Argus

You can read and process Argus data in one of a number of ways:

- Connections via a TCP socket

- Reading flat files

(16)

Argus tools

Argus server (typically called argus)

collects argus data from a mirrored port or a pcap file

ra - Read Argus

a tool to read argus data from a socket, file or netflow source; it can be used for filtering and basic processing

(17)

racount - a simple tool to count total packets, and bytes from an argus file

ranonymize - a tool to make data anonymous (ie replace IPs that might be identifiable)

rastrip - remove certain data from argus files to make file sizes smaller (eg MAC addresses)

rasplit - split argus file into smaller files based on number of entries

(18)

racluster - merge data based on certain criteria - this is useful for reporting, graphing, and many other tasks - understanding the basics of racluster is key to getting the most out of argus data

ragraph - a perl script designed to graph argus data - we’ll see this later

(19)

rasort - sort a list of flows

ratop - show top traffic - useful for live traffic (similar to iftop, but lower overhead if you are already using argus); you might wish to combine it with multitail for nicely coloured output

There are other tools included; these are the ones I use most often

(20)

Socket Based Argus

Argus provides a socket based interface for reading live or near to live flow based data

You can use this for live flow analysis, for example you might wish to identify when machines are connecting to a known malicious or forbidden site, or hosts generating certain traffic patterns
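As an illustration of that kind of live check, the sketch below matches comma-separated flow lines against a watchlist of destination addresses. It is a hedged Python example: the watchlist contents, the function name and the sample lines are made up, and the 11-column layout is an assumption that should be checked against your own ra output.

```python
# Hypothetical watchlist of known-bad destination addresses
WATCHLIST = {"203.0.113.66"}

# Assumed column order of the comma-separated flow lines
FIELDS = ["time", "flags", "proto", "sip", "sport", "dir",
          "dip", "dport", "pkts", "bytes", "state"]


def watchlist_hits(lines, watchlist=WATCHLIST):
    """Yield (source, destination) for flows whose destination is watchlisted."""
    for line in lines:
        parts = line.rstrip("\n").split(",")
        if len(parts) != len(FIELDS):
            continue  # skip lines that do not match the assumed layout
        rec = dict(zip(FIELDS, parts))
        if rec["dip"] in watchlist:
            yield rec["sip"], rec["dip"]


# Made-up sample lines; live data would be read line by line from the socket
sample = [
    "10:15:02,e,tcp,192.168.1.3,51234,->,203.0.113.66,80,10,1200,CON",
    "10:15:03,e,udp,192.168.1.4,53000,->,192.0.2.53,53,2,160,INT",
]
hits = list(watchlist_hits(sample))
print(hits)
```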

(21)

Processing Argus data from files

There are command line argus tools that you can use to process argus data, and to extract the data you need

We consider now a few common cases of data you may need

(22)

Extracting data for a particular host/port

Often you will need to hunt for traffic flowing to/from a particular host to identify what traffic they have been generating.

You may also need to know what traffic is flowing on a particular port:

(23)

ra -nnr /var/log/argus.log - host 192.168.1.3

The ra command is used for reading argus data

The arguments define whether to resolve IPs to hostnames (first n), whether to resolve protocols to names (second n), and instruct ra to read from a file (r)

The lone - indicates the start of the argus filter
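When the same lookup is needed for many hosts, the ra invocation above can be assembled by a script. The Python sketch below only builds the argument list; the function name is an assumption, and actually running the command requires ra to be installed.

```python
import shlex


def ra_command(logfile, filter_expr):
    """Build the argument list for: ra -nnr <logfile> - <filter expression>."""
    return ["ra", "-nnr", logfile, "-"] + shlex.split(filter_expr)


cmd = ra_command("/var/log/argus.log", "host 192.168.1.3")
print(" ".join(cmd))

# To run it for real (needs ra on the PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if the filter expression ever comes from user input.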

(24)

Alternatively, for a port:

(25)

Aggregating data

Argus stores the full set of flows generated by a system.

For some purposes you may get more useful information by aggregating data together

Argus can produce results aggregated by number of flows and by traffic volumes
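The idea of aggregating by traffic volume can be illustrated without Argus at all. The Python sketch below sums bytes per address, counting each flow against both of its endpoints; it is not a reimplementation of racluster, and the function name, tuple layout and sample data are assumptions.

```python
from collections import Counter


def bytes_by_address(flows):
    """Sum bytes per IP, counting each flow against both of its endpoints."""
    totals = Counter()
    for src_ip, dst_ip, nbytes in flows:
        totals[src_ip] += nbytes
        totals[dst_ip] += nbytes
    return totals


# Made-up flows: (source IP, destination IP, bytes transferred)
sample_flows = [
    ("192.168.1.3", "192.0.2.10", 1000),
    ("192.168.1.3", "192.0.2.20", 500),
    ("192.168.1.4", "192.0.2.10", 200),
]
totals = bytes_by_address(sample_flows)
for ip, total in totals.most_common(2):
    print(ip, total)
```

Replacing the byte count with 1 per flow gives the "aggregated by number of flows" view instead.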

(26)

Examples (Based on Argus 3)

Top talkers (well, actually, traffic total by IP):

racluster -M rmon -m saddr -r inputfile

Top talkers using port 22

(27)

If you store MAC addresses:

racluster -M rmon -m smac -r inputfile

Top ports in use:

(28)

Statistics by protocol:

racluster -M rmon -m proto -r inputfile

Traffic by host pairs:

racluster -M matrix -r inputfile

To convert to top N:

racluster -M matrix -r inputfile -w - | rasort -m bytes -w - -r - |

(29)

Traffic graphs

Argus comes with some scripts designed to perform graphing functions

Alternatively you can extract data and pass them to whatever graphing libraries you

(30)

ragraph

a perl script supplied with Argus for aggregating data into a sensible format and then graphing it

quite powerful, however be warned it can take lots of RAM - also there is often more than one way to get a particular graph, the earlier aggregation is performed the more

(31)

Simple examples:

Traffic in a day

(32)

Simple examples:

Packets in a day

(33)

ragraph dbytes sbytes -M 30s -r /var/log/argus.yesterday
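The -M 30s option in the command above asks for 30 second bins. If you prefer to feed your own graphing library, the binning step might look like the Python sketch below. It is a simplification under stated assumptions: each flow is credited entirely to the bin containing its start time, and the function name and sample values are made up.

```python
from collections import defaultdict


def bin_bytes(flows, bin_seconds=30):
    """Sum (src_bytes, dst_bytes) per time bin.

    Each flow is (start_epoch_seconds, src_bytes, dst_bytes) and is credited
    to the bin containing its start time.
    """
    bins = defaultdict(lambda: [0, 0])
    for ts, sbytes, dbytes in flows:
        start = int(ts // bin_seconds) * bin_seconds
        bins[start][0] += sbytes
        bins[start][1] += dbytes
    return {start: tuple(v) for start, v in sorted(bins.items())}


# Made-up flows starting at 0s, 12s, 31s and 65s
sample = [(0, 100, 50), (12, 200, 0), (31, 10, 10), (65, 5, 500)]
series = bin_bytes(sample)
print(series)
```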

(34)
(35)

Now let’s go crazy:

(36)

Or perhaps not

(37)

Performance

In the latter examples, we’re not being very efficient - we can add the option

-m proto dport

to speed them up - this means the aggregation of the data happens initially, rather than when the data is passed to

(38)

Malicious DNS (a case study)

I’m sure lots of you have had notifications for hosts talking to “malicious DNS servers”

Servers set up by malware authors to redirect legitimate sites to their own version

(39)

Graphing traffic to the known malicious DNS servers is interesting:

(40)
(41)

The large spikes are noticeable, and correspond with malware setting up malicious DHCP servers

Filtering by group of DNS servers is even more interesting:

(42)

We can start to suspect that the incidents are connected (especially when we take account of the timing of router blocks)

In fact, we discover we can begin to see actual infected hosts, and the collateral damage caused by the rogue DHCP server

This fact only becomes clear after the malware has been analysed

(43)

How are graphs useful?

visualisation of traffic use at a particular point on your network - if you’re capturing flows from routers/switches you may have visibility at several places

what proportion of traffic at a particular point is from what class of user

(44)

can help your planning for capacity upgrades

visual check for sudden unexpected surges

you may find abusive users are very visible on certain graphs

visualisations of incidents - can tell us something new

(45)

Argus and Perl

You can easily automate lots of tasks with Argus

Here I will look at a short, simple alerting script that reads live argus data and alerts every 100 flows (this script is deliberately oversimplified)

(46)

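#!/usr/bin/perl
use strict;
use warnings;

# Address of the argus server to read live flows from
my $argus_host = '127.0.0.1:8887';

# Run ra with comma-separated output (-c,), names unresolved (-nn), reading from the socket (-S)
open(my $ra, '-|', '/usr/bin/ra', '-c,', '-nn', '-S', $argus_host)
    or die "Failed to start /usr/bin/ra: $!";

my $count = 0;
while (my $line = <$ra>) {
    chomp $line;
    my ($time, $opts, $proto, $sip, $sport, $dir, $dip, $dport, $pkts, $bytes, $stat) = split /,/, $line;
    # do something clever and interesting here
    print "received a flow from $sip:$sport going to $dip:$dport, $bytes bytes were transferred\n";
    print "ALERT: $count flows seen so far\n" if ++$count % 100 == 0;
}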

(47)

Perl (or other similar language) scripts that use argus are very helpful to automate repetitive or boring tasks

You may wish to write scripts to output graphs, or to produce summaries, or determine flows with certain patterns (eg certain types of P2P traffic)

(48)

GUIs

It can be helpful to use a GUI for some sorts of interactive analysis

one of the great benefits of Argus in our opinion is the ability to easily combine

(49)

The tool ArgusEye allows an Argus 3 user to use a graphical interface to process and filter argus data

The interface is superficially like wireshark, however it is distinctly less powerful, and earlier in development

The version on the web has some critical bugs - I have a version with some fixes (also to be sent upstream)

(50)
(51)
(52)

Argus Issues

Data format issues in argus 2.0.6 - it was designed for 32-bit platforms only. Don’t switch platform and expect data to be reliably readable, and don’t use argus 2.0.6 on AMD64

For a new deployment you almost certainly want to use argus 3 - the datafiles are much saner

(53)

Argus Issues (2)

With large amounts of data, such as OxCERT’s monitoring, processing the logs is slow

If we discover a new host that we need to investigate, it can take 20-30 minutes to extract all the flows we need

(54)

Use of a database

We find it helpful to put some of our argus data into a relational database

Tables can get quite large so being selective as to what is most useful is important

We are using a system with Postgres as an SQL backend, with some perl/PHP scripts

(55)

Our model is to use one table per day

Even with these restrictions, expect large tables - we find 4-6GB to be typical (and we only include certain data); yours will undoubtedly be smaller

You may wish to consider removing UDP as it generates lots of noise

We import data every 30 minutes (when we rotate argus data files), and index only when the day’s table is filled (indexed on ip and port)
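The one-table-per-day model with deferred indexing can be sketched as follows. This Python example uses the standard library's SQLite purely so that it is self-contained; the slides describe a Postgres backend, and the table layout, column names, function name and sample rows here are all assumptions.

```python
import sqlite3


def load_day(conn, day, flows):
    """Create one table for the given day, bulk-insert flows, then index it."""
    if not day.isdigit():
        raise ValueError("day must be digits only, eg 20090301")
    table = f"flows_{day}"
    conn.execute(
        f"CREATE TABLE {table} "
        "(sip TEXT, sport INTEGER, dip TEXT, dport INTEGER, bytes INTEGER)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)", flows)
    # Index only once the day's data is loaded, so the inserts stay cheap
    conn.execute(f"CREATE INDEX {table}_sip ON {table} (sip)")
    conn.execute(f"CREATE INDEX {table}_dip ON {table} (dip)")
    conn.execute(f"CREATE INDEX {table}_dport ON {table} (dport)")
    conn.commit()
    return table


conn = sqlite3.connect(":memory:")
table = load_day(conn, "20090301", [
    ("192.168.1.3", 51234, "192.0.2.10", 22, 1840),
    ("192.168.1.4", 40000, "192.0.2.10", 80, 300),
    ("192.168.1.3", 51300, "192.0.2.20", 22, 60),
])
rows = conn.execute(
    f"SELECT dip, bytes FROM {table} WHERE sip = ? ORDER BY bytes DESC",
    ("192.168.1.3",),
).fetchall()
print(rows)
```

With a table per day, expiring old data is a matter of dropping whole tables rather than deleting rows.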

(56)

Once indexed tables exist you can extract other data and do other analysis eg. packet counts, volumes per IP

you may wish to store this data in your database for longer than the full flow data

graphing, trend analysis etc. (we will post notes about this on the web)

(57)

In fact, the Argus developers have been working on SQL support. Initial code was released in early March 2009

Their code deals with some potential issues with ICMP flows (which we’ve never dealt with in our DB)

Their code currently targets MySQL

In either case, however, the benefits of using a database should be similar

(58)

Other potentially useful Argus related tools/features

Flowscan - can produce graphs from various types of network flow sources. Reportedly supports Argus

GeoIP/AS based matching - new in Argus 3.0.2

(59)

Conclusions

Network flow analysis is useful for both security and other purposes

Argus can help capture, collate and process flow data

For large volumes of data you may find storing your data in a database improves performance

(60)

Questions?

Thanks to:

Robin Stevens (OUCS, OxCERT)

Jonathan Ashton (OUCS, OxCERT)

Oliver Gorwits (OUCS)

Patrick Green (Warwick)
