IT Data Visualization
Raffael Marty, GCIA, CISSP
• Chief Security Strategist @ Splunk>
• Looked at logs/IT data for over 10 years
- IBM Research
- Conference boards / committees
• Presenting around the world on SecViz
• Passion for Visualization
- http://secviz.org
- http://afterglow.sourceforge.net
Raffael Marty
Applied Security Visualization
Paperback: 552 pages
Agenda
• IT Data Visualization
- Security Visualization Dichotomy
- Research Dichotomy
• IT Data Management
- A shifted crime landscape
• Perimeter Threat
• Insider Threat
Visualization is a more effective way of IT data management and
Visualization Questions
• Who analyzes logs?
• Who uses visualization for log analysis?
• Who has used DAVIX?
• Have you heard of SecViz.org?
What is Visualization?
A picture is worth a thousand log records.
Generate a picture from IT data
Inspire
Explore and Discover
Increase
Answer a
The 1st Dichotomy
Two domains: Security & Visualization
Security
• security data
• networking protocols
• routing protocols (the Internet)
• security impact
• security policy
• jargon
• use-cases
• are the end-users
Visualization
• types of data
• perception
• optics
• color theory
• depth cue theory
• interaction theory
• types of graphs
• human computer interaction
The Failure - The Wrong Integration
• Using proprietary data format
• Provide parsers for various data formats
- does not scale
- is probably buggy / incomplete
• Use wrong data access paradigm
- complex configuration
- e.g., needs an SSH connection
/usr/share/man/man5/launchd.plist.5 <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>_name</key> <dict> <key>_isColumn</key> <string>YES</string> <key>_isOutlineColumn</key> <string>YES</string> <key>_order</key> <string>0</string> </dict> <key>bsd_name</key> <dict> <key>_order</key> <string>62</string> </dict> <key>detachable_drive</key> <dict> <key>_order</key> <string>59</string> </dict> <key>device_manufacturer</key> <dict> <key>_order</key> <string>41</string>
The Right Thing - KISS
• Keep It Simple Stupid
• Use CSV input
• Use files as input
• Offload to other tools
- parsers
- data conversions
# Using node sizes:
size.source=1;
size.target=200
maxNodeSize=0.2
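The "use CSV input, offload parsing to other tools" advice can be sketched as a small standalone parser that sits in front of the visualization tool. This is only an illustration: the log line layout, the `LINE_RE` pattern, and the `log_to_csv` helper are hypothetical and not taken from any particular product.

```python
import csv
import io
import re

# Hypothetical log layout, assumed for illustration only:
#   "<timestamp> <action> src=<ip> dst=<ip> dport=<port>"
LINE_RE = re.compile(r"src=(?P<src>\S+)\s+dst=(?P<dst>\S+)\s+dport=(?P<dport>\d+)")


def log_to_csv(lines, out):
    """Write one 'source,destination,port' row per parsable line; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    rows = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines this simple parser does not understand
        writer.writerow([match["src"], match["dst"], match["dport"]])
        rows += 1
    return rows


# Made-up sample records.
sample = [
    "2009-01-01T10:00:00 ACCEPT src=10.0.0.1 dst=10.0.0.9 dport=22",
    "2009-01-01T10:00:01 DROP src=10.0.0.2 dst=10.0.0.9 dport=80",
    "garbage line without fields",
]
buf = io.StringIO()
count = log_to_csv(sample, buf)
print(buf.getvalue(), end="")
```

The visualization tool then only ever sees a plain CSV file, so supporting a new log source means writing another small parser rather than changing the tool.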
The Right Thing - Apply Good Visualization Practices
• Don't use graphics to decorate a few numbers
• Maximize the data-ink ratio (reduce non-data ink)
The 2nd Dichotomy
Two worlds: Industry & Academia
Industry
• don’t understand the real impact
• get the 70% solution
• don’t think big
• no time/money for real research
• can’t scale
• work based off of a few customers’ input
Academia
• don’t know what’s been done in industry
• don’t understand the use-cases
• don’t understand the environments / data / domain
• work on simulated data
• construct their own problems
• use overly complicated, impractical solutions
Some comments are based on paper reviews from RAID 2007/08, VizSec 2007/08
The Way Forward
• Building a secviz discipline
• Bridging the gap
• Learning the “other” discipline
• More academia / industry collaboration
Security Visualization
My Focus Areas
• Use-case oriented visualization
• IT data management
• Perimeter Threat
• Governance Risk Compliance (GRC)
• Insider Threat
• IT data visualization
A Shifted Crime Landscape
• Crimes are moving up the stack
• Insider crime
• Large-scale spread of many small attacks
• Are you prepared?
• Are you monitoring enough?
Application Layer
Transport Layer
Network Layer
Link Layer
Physical Layer
Questions are not known in advance!
Have the data when you need it!
What Is IT Data?
• Configurations
• Traps & Alerts
• Scripts & Code
• Logs
- /var/log/messages
- /opt/log/*
- /etc/syslog.conf
- /etc/hosts
- 1.3.6.1.2.1.25.3.3.1.2.2 (iso.org.dod.internet.mgmt.mib-2.host.hrDevice.hrProcessorTable.hrProcessorEntry.hrProcessorLoad)
- ps
- netstat
• File system changes
- multi-line files
- entire files
- multi-line structures
- multi-line table format
- hooks into the OS
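The SNMP example above pairs a numeric OID with its dotted name. A minimal sketch of that pairing follows. The `label_oid` helper is hypothetical, and reading the trailing number as a table row (instance) index is an assumption, not something the slide states.

```python
# Names copied from the slide, one per OID component, in order.
NAMES = [
    "iso", "org", "dod", "internet", "mgmt", "mib-2", "host",
    "hrDevice", "hrProcessorTable", "hrProcessorEntry", "hrProcessorLoad",
]


def label_oid(oid, names):
    """Pair each numeric OID component with its name; return leftover components separately."""
    parts = oid.split(".")
    labelled = [f"{name}({num})" for name, num in zip(names, parts)]
    # Assumption: anything beyond the named components is an instance index.
    instance = parts[len(names):]
    return ".".join(labelled), instance


named, instance = label_oid("1.3.6.1.2.1.25.3.3.1.2.2", NAMES)
print(named)
print("instance:", instance)
```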
Sparklines
• "Data-intense, design-simple, word-sized graphics".
• Examples:
- stock price over a day
Edward Tufte (2006). Beautiful Evidence. Graphics Press.
• JavaScript implementation:
[Figure: sparkline with the average and the standard deviation marked]
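To make the idea of a word-sized graphic with an average and a standard deviation concrete, here is a minimal text-only sketch. It is not the JavaScript implementation the slide refers to. The `sparkline` and `summarize` helpers and the sample prices are made up for illustration, and using the population standard deviation is an assumption.

```python
import statistics

BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def sparkline(values):
    """Map each value to one block character, scaled between min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BARS[0] * len(values)  # flat series: draw the lowest bar throughout
    return "".join(BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values)


def summarize(values):
    """Return (mean, mean - stdev, mean + stdev) using the population stdev."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return mean, mean - stdev, mean + stdev


prices = [10, 12, 11, 15, 14, 18, 13, 10]  # made-up intraday prices
line = sparkline(prices)
mean, low, high = summarize(prices)
print(line, f"avg={mean:.2f} band=[{low:.2f}, {high:.2f}]")
```

Values that fall outside the band are the ones a reader would want highlighted in the sparkline.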
Three Types of Insider Threats
Fraud
Information Leak
Example - Insider Threat Visualization
• Requires more, and different, data sources than the traditional security use-cases
• Insiders often have legitimate access to machines and data. You need to log more than the exceptions
• Insider crimes are often executed on the application layer. You need transaction data and chatty application logs
• The questions are not known in advance!
• Visualization provokes questions and helps find answers
• Dynamic nature of fraud
• Problem for static algorithms
• Bandits quickly adapt to fixed threshold-based detection systems
User Activity
High ratio of failed logins
Color indicates failed logins
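The "high ratio of failed logins" view can be prepared with a small aggregation before plotting. The following is a sketch under assumptions: the `(user, success)` event shape, the `color_for` helper, its 0.5 threshold, and the color names are all hypothetical and not taken from the original chart.

```python
from collections import Counter


def failed_login_ratio(events):
    """Return {user: failed attempts / total attempts} for (user, success) events."""
    attempts = Counter()
    failures = Counter()
    for user, success in events:
        attempts[user] += 1
        if not success:
            failures[user] += 1
    return {user: failures[user] / attempts[user] for user in attempts}


def color_for(ratio, threshold=0.5):
    """Pick a display color; the threshold is an arbitrary illustrative choice."""
    return "red" if ratio >= threshold else "gray"


# Made-up login events.
events = [
    ("alice", True), ("alice", True), ("alice", False),
    ("bob", False), ("bob", False), ("bob", True), ("bob", False),
]
ratios = failed_login_ratio(events)
for user, ratio in sorted(ratios.items()):
    print(user, f"{ratio:.2f}", color_for(ratio))
```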
Security Visualization
Community
Data Analysis and Visualization Linux
davix.secviz.org
Tools
Capture
- Network tools
‣ Argus
‣ Snort
‣ Wireshark
- Logging
‣ syslog-ng
- Fetching data
‣ wget
Processing
- Shell tools
‣ awk, grep, sed
- Graphic preprocessing
‣ Afterglow
‣ LGL
- Data enrichment
‣ geoiplookup
‣ whois/gwhois
Visualization
- Network Traffic
‣ EtherApe
‣ InetVis
‣ tnv
- Generic
‣ Afterglow
‣ Treemap
‣ Mondrian
‣ R Project