Sorting Through the Noise

SANS Eighth Annual

2012 Log and Event Management Survey Results

May 2012

A SANS Whitepaper

Written by: Jerry Shenk

Advisors: Dave Shackleford & Barbara Filkins

Why Collect Logs?

Changes in Log Collection and Analysis

Top Challenges: Sorting through the Noise

Learning from Logs

Survey Demographics

LogLogic

Executive Summary

The key finding that stands out in SANS’ Eighth Annual Log and Event Management Survey is the inability of organizations to separate normal log data from actionable events. More than 600 respondents report that detecting and tracking suspicious behavior, supporting forensic analysis and meeting and proving regulatory compliance are the most important and problematic issues they are dealing with in using their logs. As attacks become more sophisticated, IT and security practitioners are identifying what they must do to not just keep up, but also to get proactive about their security practices. At the heart of this issue is log management. The survey respondents are also looking at more data, according to year-over-year survey results. As the log management industry continues to mature, organizations expect to get more meaningful and actionable results from log data. Nearly every product that manages logs now ships with one or more built-in processes for extracting, analyzing and alerting on data.

In the survey, 58 percent of organizations report that they use a log manager to collect and analyze logs. Also, 37 percent said they are using a Security Information and Event Management (SIEM) system in some capacity, while 22 percent are collecting the logs and processing them entirely with their SIEM systems.

A large percentage of organizations (22 percent of respondents) say they have little or no automation and no plans to change. The most common reasons given for not automating are lack of time and money, resources that are closely intertwined. Respondents cited two additional reasons: lack of management buy-in and insufficient time to evaluate the options available in different SIEM and log management products.

As in the past two years, this year’s survey responses indicate that organizations are trying to squeeze as much actionable data as they can out of their log management systems, so the convergence with SIEM and event management systems makes good sense. However, they are still struggling with detecting advanced threats and separating actionable data from background noise on their networks. Even when we look at the 22 percent of respondents who are using SIEM for collecting logs and processing them, nearly the same percentage say it is difficult to prevent incidents and detect advanced threats. This similarity indicates that log and event management systems, or the way they are being used, have a long way to go in finding the critical needle in a haystack that organizations need during a network crisis.

Why Collect Logs?

One of the biggest challenges for law enforcement and other agents responding to a breach is the inability to identify the attacker, according to the 2012 Verizon Data Breach Investigations Report.[1] The report shows that, in many cases, organizations cannot identify the attackers because of insufficient log data.

This shortcoming directly corresponds to the top challenges our survey respondents reported. When asked what importance was placed on each of 12 reasons for collecting log data (Figure 1), the most critical was related to internal and external security issues. Respondents’ top reasons included detecting and tracking suspicious behavior (82 percent), supporting forensic analysis and correlation (65 percent) and preventing incidents (58 percent).

Figure 1. Why Collect Logs?

Detecting advanced threats was also important (54 percent), as was using logs to meet regulatory compliance requirements (55 percent). These reasons have stayed consistent since we started asking these questions, although the individual questions and options have changed enough to prevent year-by-year comparisons.

Many respondents are also collecting logs for operational and business improvements, including IT operations and support, application and system performance and monitoring service levels and other lines of business. These issues were identified as critical by 24 percent to 30 percent of respondents. One reason for log collection we have asked about ever since the first log management survey in 2005 pertains to compliance with various regulations, requirements and policies. This year, in a question about what reasons people have for collecting logs, 55 percent stated that compliance issues were a critical reason, 36 percent said compliance was important and the remaining nine percent said that compliance was not important.

The final responses were related to costs, chargebacks and understanding customer behavior. These received positive responses of 11 percent to 17 percent, although 30 percent to 40 percent said they were not important. Almost all respondents (all but 0.3 percent) said that detecting and tracking suspicious behavior was important. This has been the top reason for collecting logs since we started asking this question in 2008.

Changes in Log Collection and Analysis

The same year SANS conducted its first Log Management Survey (2005), the term SIEM (Security Information and Event Management) was coined.[2] SIEM includes the collection of log data as well as correlation of different log events from various sources, together with suspicious event information. This data is correlated and presented through other features such as dashboards, real-time alerting and reports and charts, depending on a particular vendor’s implementation. In 2005, respondents were running manual or automated scripts to constantly glean information from log data.

Over the years, we have hypothesized that log management systems would eventually migrate to include more fully automated correlating, analysis and reporting functions. This year’s survey shows that organizations are integrating their log systems into security and other event management systems for better analysis and reporting.

This year we tried to determine the percentage of organizations primarily performing traditional log analysis versus the percentage using what they would call a SIEM. Of course, this is a hard line to define because of overlapping functions between the tools. For example, if an organization collects logs using syslog and uses scripts to count the number of inbound or outbound blocked ports, could that combination of processes be considered a SIEM? Probably not, but as the automation and intelligence get deeper, at some point the combination might cross that line.
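To make the "syslog plus scripts" end of that spectrum concrete, here is a minimal sketch of a script that counts blocked ports by direction. The log layout, the field names (`dir`, `dport`) and the function name `count_blocked_ports` are assumptions invented for illustration; real firewall output differs by vendor and would need its own pattern.

```python
import re
from collections import Counter

# Assumed log layout, invented for illustration (not any vendor's real format):
#   "May  1 10:00:01 fw01 BLOCK dir=out src=10.1.2.3 dst=203.0.113.9 dport=6667"
BLOCK_RE = re.compile(
    r"\bBLOCK\b.*?\bdir=(?P<direction>in|out)\b.*?\bdport=(?P<port>\d+)"
)


def count_blocked_ports(lines):
    """Count blocked packets per (direction, destination port)."""
    counts = Counter()
    for line in lines:
        match = BLOCK_RE.search(line)
        if match:  # lines that are not BLOCK records are skipped
            key = (match.group("direction"), int(match.group("port")))
            counts[key] += 1
    return counts


# Hypothetical sample records using documentation-reserved IP ranges.
sample = [
    "May  1 10:00:01 fw01 BLOCK dir=out src=10.1.2.3 dst=203.0.113.9 dport=6667",
    "May  1 10:00:02 fw01 BLOCK dir=out src=10.1.2.3 dst=203.0.113.9 dport=6667",
    "May  1 10:00:03 fw01 BLOCK dir=in src=198.51.100.4 dst=10.1.2.8 dport=23",
    "May  1 10:00:04 fw01 ALLOW dir=out src=10.1.2.9 dst=192.0.2.10 dport=443",
]

for (direction, port), total in count_blocked_ports(sample).most_common():
    print(f"{direction:>3} port {port:<5} blocked {total} time(s)")
```

A script like this only counts. It does no correlation across sources, which is roughly where the line between scripted log analysis and a SIEM starts to matter.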

To learn how respondents are analyzing and correlating log and security information, we asked them to identify their log collecting activities under one of the following categories:

• Collect data directly from hosts into a log manager

• Collect logs from syslog (UDP/TCP) into a log manager

• Use agents to collect data from sources into a log manager

• Use Security Information and Event Management (SIEM) to correlate and analyze log data that is collected by other means (e.g., log servers)

• Use SIEM to collect, correlate and analyze log data

• None of the above

The responses included a mix of sending logs directly to a log manager, or through syslog or agents. A good number, 22 percent, indicate that they are collecting and analyzing log data with their SIEM. Log management systems are still in high use, however, with 58 percent using one of the three log management options, as shown in Figure 2.

Figure 2. Methods of Collecting and Analyzing Log Data

In a separate question about what type of log and event management software organizations are using, we found that many of them are using internally developed and commercial packages, so there is some overlap among these options chosen by respondents. The first three options (depicted in shades of blue) relate to log management, the fourth option is a hybrid with 15 percent and the fifth option (dark red) is solidly in the SIEM category. It will be interesting to see how these numbers change in the coming years.

Top Challenges: Sorting through the Noise

Collecting and accessing logs is no longer the problem for most organizations that it was in the early years of this survey. For the past three years, about 90 percent of respondents have consistently indicated they are collecting logs. Because organizations are more aware of their logs and the value they can gain from them, we tried to learn more about how organizations want to use their logs. In the first question, we asked them to rank the top three challenges they face when integrating their logs with other tools in their organization’s overall information infrastructure. “First” represents the most challenging and “third” represents the least challenging aspects. The issue that ranked most challenging and also had the highest total number of votes overall was “Identification of key events from normal background activity,” as shown in Figure 3.

Figure 3. First, Second and Third Most Challenging Aspects of Log Management and Integration

The second most cited challenge was “Correlating events from multiple sources.” Their third most problematic issue was “Lack of analytics capabilities.”

One of the least challenging issues was “Lack of native visualization capabilities.” Dashboards are helpful, and graphics can help identify trends and explain issues. The lack of concern about native visualization may therefore signal that visualization capabilities in current log monitoring and SIEM products have greatly improved compared to similar products reviewed in previous years. What organizations really want is assistance with good, solid analysis.

Whether the cause is an Advanced Persistent Threat (APT) or some other type of event, the identification of key events was clearly the largest pain point this year. One example of a key event would be a dramatic change in the size of logs or the size of specific types of logs. For example, if your firewall typically blocks 200 packets a day in your egress filtering and suddenly blocks 5,000 in one day, it would be worthwhile to look through those 5,000 events and see what internal computer is generating the traffic and what port it is trying to connect to. If your organization has multiple sites and tens of thousands of computers, you could split up your “outbound block report” by subnet so that you could have a quick one-line summary of blocked traffic for each subnet. A report like that could be checked during a normal daily review in about 10 seconds. This same process can be applied to most common events. Each organization will need to determine what common events are for them and customize the analytics to match the specifics of their network.
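The per-subnet "outbound block report" described above can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: the function names, the /24 subnet size and the tenfold spike factor are all hypothetical choices, and the baseline counts would come from an organization's own history. The 200 and 5,000 figures mirror the example in the text.

```python
import ipaddress
from collections import Counter


def summarize_blocks_by_subnet(source_ips, prefix_len=24):
    """Count blocked outbound events per source subnet (assumed /24 by default)."""
    counts = Counter()
    for ip in source_ips:
        # strict=False lets a host address stand in for its containing network
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


def flag_spikes(today_counts, baseline_counts, factor=10):
    """Return (subnet, usual, today) for subnets far above their usual count.

    `factor` is an arbitrary illustrative threshold, not a recommended value.
    """
    flagged = []
    for subnet, count in sorted(today_counts.items()):
        usual = baseline_counts.get(subnet, 0)
        if count > factor * max(usual, 1):
            flagged.append((subnet, usual, count))
    return flagged


# Hypothetical data: one subnet jumps from about 200 blocks a day to 5,000.
baseline = {"10.1.2.0/24": 200, "10.1.3.0/24": 20}
todays_blocked_sources = ["10.1.2.77"] * 5000 + ["10.1.3.5"] * 25

today = summarize_blocks_by_subnet(todays_blocked_sources)
for subnet, count in sorted(today.items()):
    print(f"{subnet:<18} blocked today: {count:>6}  usual: {baseline.get(subnet, 0):>6}")
for subnet, usual, count in flag_spikes(today, baseline):
    print(f"REVIEW {subnet}: {count} blocks against a usual {usual}")
```

The one-line-per-subnet output is what makes the daily review fast; only the flagged subnets call for a closer look at the individual events.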

Detecting APT-style malware, detecting and tracking suspicious behavior and preventing incidents ranked highest among respondents’ problems with using their logs. Detecting and tracking suspicious behavior was also reported as the issue with the highest increase since last year (up from 65 percent last year to 83 percent this year). See Figure 4.

Advanced Persistent Threat attacks have recently been in the news a lot and some have argued that this style of attack is getting blamed for attacks that are not advanced or persistent; however, there are also reports of organized attackers that have deeply infiltrated organizations for many years. One capability generally agreed upon is that log data should be giving organizations the information they need to help identify APT-style threats and other data-exfiltration attacks. In this year’s survey, 90 percent of respondents indicated that APT-style threats were at least on their radar, with “detecting suspicious behavior” on the radar for virtually all respondents (98 percent). According to the newly released Verizon Data Breach Investigations Report[3] (DBIR), 85 percent of breaches took at least weeks to discover, 54 percent took months and two percent took years to discover.

In a March 28, 2011, Open Source Security blog,[4] Martin (no last name given) discusses using logs to detect APT, stressing the need to first collect all logs. Some suggestions for detecting APT include searching firewall logs for large outgoing sessions and for a high number of outgoing sessions to a single IP address. This requires researching network traffic to determine which IP addresses belong to valid business partners. Searching Domain Name System (DNS) server logs for lookups related to suspect domains can also be helpful. In some cases, logs can be cross-referenced with known Real-time Blacklists (RBLs) or internally-identified lists of suspect IP addresses and domain names. Some of the SIEM players have already started integrating reputation data into SIEM systems to inform organizations in case there is any communication with known bad IP addresses or domains.
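The firewall and DNS searches suggested above could look something like the following sketch. Everything specific here is an assumption made for illustration: the record shapes (tuples of source, destination and bytes sent, or client and queried name), the function names, the thresholds and the sample addresses. Real firewall and DNS logs would first have to be parsed into this form, and the partner and suspect lists would come from the organization's own research or a reputation feed.

```python
from collections import Counter


def find_suspicious_outbound(sessions, known_partners, byte_limit, session_limit):
    """Scan outbound sessions given as (src_ip, dst_ip, bytes_sent) tuples.

    Returns (large, busy): sessions at or above byte_limit, and destinations
    with at least session_limit sessions. Known partner IPs are ignored.
    """
    large = []
    per_destination = Counter()
    for src, dst, sent in sessions:
        if dst in known_partners:
            continue  # traffic to vetted business partners is expected
        per_destination[dst] += 1
        if sent >= byte_limit:
            large.append((src, dst, sent))
    busy = {dst: n for dst, n in per_destination.items() if n >= session_limit}
    return large, busy


def match_suspect_lookups(dns_queries, suspect_domains):
    """Return (client_ip, name) for lookups at or under a suspect domain."""
    hits = []
    for client, name in dns_queries:
        name = name.lower().rstrip(".")
        labels = name.split(".")
        # Check the full name and every parent domain against the list.
        if any(".".join(labels[i:]) in suspect_domains for i in range(len(labels))):
            hits.append((client, name))
    return hits


# Hypothetical records using documentation-reserved addresses and names.
sessions = (
    [("10.1.2.3", "198.51.100.7", 900_000_000)]
    + [("10.1.2.4", "203.0.113.9", 1_200)] * 60
    + [("10.1.2.5", "192.0.2.10", 2_000_000_000)]
)
large, busy = find_suspicious_outbound(
    sessions, known_partners={"192.0.2.10"}, byte_limit=500_000_000, session_limit=50
)
dns_hits = match_suspect_lookups(
    [("10.1.2.3", "update.bad.example."), ("10.1.2.4", "www.example.org")],
    suspect_domains={"bad.example"},
)
print(large, busy, dns_hits)
```

Excluding known partners first is what keeps the result short enough to review; without that research step, legitimate bulk transfers dominate the output.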

In another article about targeted attacks by Dark Reading’s senior editor, Kelly Jackson Higgins,[5] Mandiant[6] was quoted as saying that advanced attacks are only the “tip of the iceberg.” Higgins likens the security field to a weapons race, so even as detection gets better, the attackers keep “perfecting their trade.”

Automated analysis is critical and needed as a primary method for dealing with logs. However, with all that is at stake, log data needs to be monitored using a combination of analysis methods, including automated and manual analysis, with assistance from SIEM-type tools, and other available resources. Organizations that continuously collect, evaluate, and interpret log data will be in the best position to avoid hosting the next headline-grabbing attack.

[3] www.verizonbusiness.com/resources/reports/rp_data-breach-investigations-report-2012_en_xg.pdf?CMP=DMC-SMB_Z_ZZ_ZZ_Z_TV_N_Z037

[4] http://ossectools.blogspot.com/2011/03/fighting-apt-with-open-source-software.html

[5] www.darkreading.com/risk-management/167901115/security/news/232602533/apt-type-attack-a-moving-target.html

Learning from Logs

Logs from each device produce different records that, when put together properly, can tell a story for auditors and responders. Respondents are collecting data from multiple devices, the most popular of which is Windows servers at 85 percent. Network and security devices and systems are also among the top sources, as shown in Figure 5.

Figure 5. What Logs They Collect

This year we expanded the choices for the types of sources from which respondents collect log data. This expansion was based on write-in comments from last year. Some of the new items included “Control systems for physical plant/operations” with eight percent, “Access controls for physical plant” with 17 percent and “Cloud-based or outsourced services/applications” with eight percent.

Every year, organizations collect more logs from increasingly different types of devices, but they also want to derive more actionable information from what they already have. One respondent noted that, “We could collect more but we need to make the ones we have useful and really finish baselining...”[7] This comment continued to specifically mention computer security specialist Dr. Anton Chuvakin’s definition of baselining, where organizations need to learn what “normal” (their baseline) is, and then act on any deviations from that norm. An example of this related to a website would be to track the number of individual status codes logged by the web server. Some web attacks require trying lots of different requests. If an organization sees an abnormally high number of successful hits, that is a red flag. Another indicator may be a dramatically higher instance of 400 or 500 range error codes, as they indicate failed authentication, invalid pages and server errors. A dramatic increase in any of these events could indicate that an attacker is performing some type of reconnaissance, harvesting data from your site, or trying to guess at authentication and page layout. The next step in cases like these would be to examine the logs to determine if there is a pattern of IP addresses making the requests and if there is anything interesting about the timing of the requests.
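As a rough illustration of baselining web server status codes, the sketch below compares today's counts per status class with a history of daily counts. The approach (mean plus a multiple of the standard deviation), the threshold of three, the function names and the sample numbers are all assumptions chosen for the example; the source does not prescribe a particular statistic.

```python
import statistics


def count_by_class(status_codes):
    """Group HTTP status codes into classes such as '2xx', '4xx' and '5xx'."""
    counts = {}
    for code in status_codes:
        key = f"{code // 100}xx"
        counts[key] = counts.get(key, 0) + 1
    return counts


def deviations(history, today, threshold=3.0):
    """Flag status classes whose count today is far above the baseline.

    history maps a class to a list of past daily counts; today maps a class
    to today's count. A class is flagged when it sits more than `threshold`
    standard deviations above its historical mean (an illustrative rule).
    """
    flagged = {}
    for status_class, past in history.items():
        mean = statistics.mean(past)
        spread = statistics.pstdev(past) or 1.0  # avoid dividing by zero
        score = (today.get(status_class, 0) - mean) / spread
        if score > threshold:
            flagged[status_class] = round(score, 1)
    return flagged


# Hypothetical five-day baseline and one day with a burst of 404 responses.
history = {
    "2xx": [1000, 1100, 950, 1050, 1000],
    "4xx": [40, 50, 45, 55, 60],
    "5xx": [5, 4, 6, 5, 5],
}
todays_codes = [200] * 1020 + [404] * 400 + [500] * 5

today = count_by_class(todays_codes)
print(today)
print(deviations(history, today))
```

A flagged class is only the starting point; the follow-up described in the text, looking at which IP addresses made the requests and when, still has to be done against the raw logs.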

Organizations say they want to be able to detect suspicious behavior. Yet, when asked how much time they normally spend on log data analysis, the largest group (35 percent) spent “None to a few hours a week” with their logs, as shown in Figure 6.

Figure 6. Time Spent on Logs

Last year, 29 percent of respondents said they spent “None to a few hours a week” managing their logs. This six-percentage-point variation may indicate an improvement in log management systems and other management systems designed to automate the task of event management. It may also be that one of the two options added this year, “Integrated into normal workflow,” took 24 percent of the answers.

Even when broken down by organizational size, more than 20 percent of respondents from enterprise organizations (defined as having more than 2,000 employees) selected this option. About 50 percent of the smaller organizations spent zero to just a few hours per week analyzing logs. That is really not very much time spent getting familiar with logs. Given the advanced threats they are struggling with, we would have expected the time organizations spend on log analysis to increase, not decrease. We cannot stress enough that the best way for organizations to quickly detect abnormalities is to gain an understanding of their baseline or “normal” activity by reviewing/analyzing log data on a regular basis.

SIEM-type tools, including log management tools with analysis and reporting options, will help organize and identify patterns and activities that are generally recognized as indicators of problems. Yet, 58 percent of organizations are not anywhere close to that level of automation. At a minimum, these organizations need to keep to a consistent schedule for viewing and analyzing log data. For help analyzing logs, organizations like SANS also teach courses on log analysis[8] for IT security professionals.

Even organizations with more automated log collection and analysis capabilities need to establish a baseline by analyzing logs regularly. Automated tools, although very useful, cannot substitute for the “sixth sense” log analysts develop when they spend some time each day getting familiar with their log data. As they become increasingly familiar with their log data, organizations will be able to differentiate anomalies from baseline traffic much more efficiently. On the data collection front, the trend line is good; more organizations are collecting log data from increasingly diverse sources, which improves the prospects for creating accurate baselines and also provides the hard data necessary to identify areas in which improvements are needed.

Survey Demographics

This year, more than 600 professionals took the survey, representing a large number of organizations across a broad spectrum of industries including government, financial, technology, medical and pharmaceutical, as shown in Figure 7.

Figure 7. Survey Demographics by Industry/Sector

Organizations represented in the survey ranged in size from enterprise to small-business, with 57 percent representing enterprises of more than 2,000 employees, 30 percent with between 100 and 2,000 employees, and 13 percent with fewer than 100 employees.

Conclusion

As this year’s survey indicates, although organizations are collecting log data from most data sources, the issue has been getting usable and actionable information out of the data when they need it for detection and response.

Organizations are connecting their log managers to SIEM and other management systems, or simply bypassing their log managers and collecting directly to their SIEM or third-party management systems. Log management and SIEM systems are now capable of storing the data and allowing it to be recalled quickly. This year organizations are realizing new problems with detecting, tracking and preventing suspicious behavior. Part of the reason for this realization may be related to the increased media coverage of extended network intrusions. Respondents indicate that their organizations need better integration and correlation among their systems to catch attacks that often try to hide in normal traffic. Log and SIEM systems that help familiarize and baseline normal log activity and that can support whitelisting will help filter out normal events from suspicious events. Even as log management becomes more automated through enhanced log management systems, SIEM or hybrid solutions, organizations will always need to know and understand their logs.

About the Author

Jerry Shenk currently serves as a senior analyst for the SANS Institute and is senior security analyst for Windstream Communications, working out of the company’s Ephrata, Penn., location. Since 1984, he has consulted with companies and financial and educational institutions on issues of network design, security, forensic analysis and penetration testing. His experience spans networks of all sizes, from small home-office systems to global networks. Along with some vendor-specific certifications, Jerry holds six Global Information Assurance Certifications (GIACs), all completed with honors: GIAC-Certified Intrusion Analyst (GCIA), GIAC-Certified Incident Handler (GCIH), GIAC-Certified Firewall Analyst (GCFW), GIAC Systems and Network Auditor (GSNA), GIAC Penetration Tester (GPEN) and GIAC-Certified Forensic Analyst (GCFA). Five of his certifications are Gold certifications.