Securing Network Software using Static Analysis

Lauri Kolmonen

Helsinki University of Technology

[email protected]

Abstract

Writing network software is not easy, and developing secure network software is even more challenging. Relying on security professionals for software security reviews and audits is expensive and prone to human error, so a more efficient solution is required.

In this paper, we survey a range of static analysis tools that focus on software security across several programming languages. We also present the underlying methods these tools use to find network application vulnerabilities efficiently. With the help of static analysis tools, software developers can identify and eliminate many types of common vulnerabilities well before the software is deployed.

KEYWORDS: Static analysis, network software, security

1 Introduction

Developing network software is not easy, and developing secure network software is even more challenging. One of the weaknesses of such software is its open nature. For example, a web server is exposed to the whole of the Internet, which in turn means that almost anyone can attempt to exploit its vulnerabilities. Statistics from the CERT Coordination Center [2] (Fig. 1) show that the number of reported software vulnerabilities has been rising rapidly since 1999.

Figure 1: Software vulnerabilities reported to CERT 1995-2006

Set against this background, it is clear that openness places particular security requirements on developers of network software. To make matters worse, it is easy for an inexperienced developer to introduce a vulnerability without realizing it. Fortunately, tools for identifying and eliminating these kinds of vulnerabilities exist. One helpful method is static analysis, which can find programming errors or insecure functions that may lead to a range of security problems. With many static analysis tools available, software developers can filter out many of the common security problems in the source code with fairly low effort and cost compared to, for example, code reviews.

2 Static Analysis in Brief

Static analysis means analyzing the code without executing it. This analysis can be performed on software source code, on a binary format (native executable or bytecode), or on both. With the help of static analysis, developers are able to perform a quick, thorough and objective check of the source code to pinpoint the types of issues they are concerned about. For example, to ensure that source code adheres to a certain coding standard, programmers might be required to run a static analysis tool designed to check these conventions on a regular basis (e.g. before committing the code to the repository), and then to change the code to follow these conventions.

No static analysis tool is perfect. All of them produce some number of false alarms (called false positives) or leave some important issues unnoticed (false negatives), or both. The balance between false positives and false negatives is a compromise which depends on the nature of the tool. Skimming through a long list of issues that contains too many unwanted results is tedious and reduces the usability of the tool. For example, as a style checker only checks programming style, it can accept some false negatives in order to shorten the list of false positives without causing major problems. In contrast, static analysis tools that focus on software security usually try to minimize the number of false negatives in order to avoid leaving out any possibly important problems. As a result, they tend to generate more false positives.

There are other trade-offs as well. In order to process thousands or even millions of lines of code within reasonable time limits, the tools have to compromise on the depth of the analysis. Different tools aim for different execution times: some provide almost instant results and can be built into the development environment (e.g. the simple syntax checker found in almost every modern IDE), whereas others might require many hours to complete and are better suited, for example, to making manual code reviews more effective.

2.1 Static Analysis and Security

With the help of static analysis tools, developers can improve the reliability, security and overall quality of software by detecting typical programming errors early in the development process. This gives programmers an opportunity to correct these problems in advance, before the software is deployed and before a malicious user has a chance to exploit a vulnerability in the program.

Static analysis provides feedback on certain security issues in the source code. These tools should not be used to verify software security, because they only detect a set of predefined security issues. However, static analysis tools enable software developers or test engineers to perform fast and frequent checks for vulnerabilities, which makes them suitable for use alongside manual code review and security audit processes, improving the efficiency and thus lowering the cost of these sessions.

It is relatively easy to create security problems with programming languages. Static analysis is at its best when finding these kinds of general problems in the program code. More complex defects that are only visible in the program design can be found through different methods, such as architectural analysis.

3 Static Analysis Methods

There are many methods for performing static analysis, and most current tools combine several of them when examining program code. This section describes some of the methods the tools use to detect problems.

Historically, developers have performed static analysis by means of a well-known UNIX tool, namely grep[5]. Armed with a set of predefined regular expressions, grep can be used to find dubious lines of code that need manual inspection. However, employing regular expressions for finding blocks of code has many disadvantages. First, regular expression syntax is difficult to read and defining new rules is cumbersome. This approach also ignores an important property of software: the actual structure of the program. For example, it is difficult to distinguish comment blocks from actual code using regular expressions or to do more detailed analysis. grep can hardly be called a static analysis tool. It is more of a general-purpose tool for finding user-defined patterns in a file, and has not been designed with software analysis in mind. The program lacks important knowledge of program syntax, structure and execution order.

All real static analysis tools working on source code first have to transform it into a tokenized form. This means breaking the source code file into a series of lexical tokens for easier processing[17]. This process is called lexical analysis. With lexical analysis, the tools can create a more detailed model of the program, for example to separate unsafe function calls from innocent comments.
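The difference between pattern matching and lexical analysis can be illustrated with a minimal sketch. The token patterns, the RISKY list and the SAMPLE fragment below are our own illustrative assumptions and are not taken from any of the surveyed tools; the sketch only shows that a scanner working on tokens can ignore comments and string literals, while a grep-style pattern cannot.

```python
import re

# Token patterns for a tiny C-like lexer; comments and strings are tried first.
TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)      # block and line comments
  | (?P<string>"(?:\\.|[^"\\])*")        # string literals
  | (?P<ident>[A-Za-z_]\w*)              # identifiers and keywords
  | (?P<other>\S)                        # any other non-space character
""", re.VERBOSE | re.DOTALL)

RISKY = {"gets", "strcpy", "sprintf"}    # illustrative list, not a real rule set

def tokenize(source):
    """Yield (kind, text, line) for each token in the source."""
    for m in TOKEN_RE.finditer(source):
        line = source.count("\n", 0, m.start()) + 1
        yield m.lastgroup, m.group(), line

def risky_calls(source):
    """Return (name, line) for each identifier in RISKY that is followed by '('."""
    tokens = [t for t in tokenize(source) if t[0] not in ("comment", "string")]
    hits = []
    for (kind, text, line), nxt in zip(tokens, tokens[1:]):
        if kind == "ident" and text in RISKY and nxt[1] == "(":
            hits.append((text, line))
    return hits

# Hypothetical C fragment: only the last line contains a real call.
SAMPLE = '''/* never call gets(buf) here */
char buf[16];
puts("strcpy(a, b) is risky");
gets(buf);  // reads unbounded input
'''

# A grep-style pattern also matches the comment and the string literal.
grep_hits = re.findall(r"\b(?:gets|strcpy|sprintf)\s*\(", SAMPLE)
token_hits = risky_calls(SAMPLE)
```

Here grep_hits contains three matches, whereas token_hits contains only the real call to gets on line 4. A production lexer would also have to handle preprocessor directives, character literals and similar details that this sketch leaves out.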

While lexical analysis helps in detecting some of the most straightforward security defects, most defects require more detailed analysis. In their book[6], Chess and West explain that to make this analysis possible, a tool has to build an abstract syntax tree (AST) from the tokenized form in order to understand program semantics. At this point, as the semantics of the analyzed program are known, a static analysis tool can perform more precise analysis by tracking the control and data flow paths of the program under analysis. This may be done on several levels: on a function level, on a module or class level, or on a global level that considers interprocedural calls between all functions in the program[5]. The deeper the analysis context, the more computation power it requires.

3.1 Model Checking

Some of the tools presented in section 5 employ model checking for inspecting temporal safety properties, such as “memory should be freed only once”[6]. This can be done by transforming the property to be checked into a finite state automaton (the model) and then comparing the program to this model to detect a violation of a given safety property.
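As an illustration of this idea, the following minimal sketch encodes the property "memory should be freed only once" as a finite state automaton and explores a small control-flow graph against it. The graph encoding, the event names and the function violates are assumptions made for this example; real model checkers such as MOPS work on far richer program models.

```python
from collections import deque

# Safety property as a finite state automaton: (state, event) -> next state.
# Events that are not listed leave the state unchanged.
PROPERTY = {
    ("unallocated", "malloc"): "allocated",
    ("allocated", "free"): "freed",
    ("freed", "malloc"): "allocated",
    ("freed", "free"): "error",          # a double free violates the property
}

def violates(cfg, entry, start_state="unallocated"):
    """Explore every (program point, automaton state) pair reachable from entry.

    cfg maps a program point to a list of (event, successor) edges.
    Returns True if some path drives the automaton into the "error" state.
    """
    seen = {(entry, start_state)}
    work = deque(seen)
    while work:
        node, state = work.popleft()
        for event, succ in cfg.get(node, []):
            nxt = PROPERTY.get((state, event), state)
            if nxt == "error":
                return True
            if (succ, nxt) not in seen:
                seen.add((succ, nxt))
                work.append((succ, nxt))
    return False

# Hypothetical control-flow graph: one branch frees the pointer and then
# falls through to common cleanup code, which frees it again.
BUGGY = {
    "entry":   [("malloc", "check")],
    "check":   [("skip", "cleanup"), ("free", "cleanup")],
    "cleanup": [("free", "exit")],
}
# Same shape, but neither branch frees the pointer before cleanup.
FIXED = {
    "entry":   [("malloc", "check")],
    "check":   [("skip", "cleanup"), ("skip", "cleanup")],
    "cleanup": [("free", "exit")],
}
```

With these inputs, violates(BUGGY, "entry") returns True and violates(FIXED, "entry") returns False. Because the set of (program point, state) pairs is finite, the exploration terminates even when the graph contains loops.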

3.2 Taint Propagation

Many typical attacks result from trusting user input or failing to escape it correctly. The Open Web Application Security Project (OWASP) lists the ten most serious web application vulnerabilities[14]. Insufficient input validation is the most common origin of software vulnerabilities and underlies the top three positions in this ranking.

Many static analysis tools use taint propagation to find software vulnerabilities that originate from failing to validate user input correctly. In taint propagation, the tool tracks the path of tainted input through the program and examines the parts the input affects. For example, assigning a tainted variable to another variable also taints the target. When tainted data reaches a sink, that is, a program location that should not receive tainted data, the static analysis tool reports a vulnerability. There are also functions that remove taint from a variable, typically by performing some type of input validation.

Another problem closely related to taint propagation is pointer aliasing. In order to ensure reliable taint propagation analysis, a tool also has to perform alias analysis to understand the relationships between variables that contain tainted data.
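The basic taint propagation scheme can be sketched as follows. The statement encoding (source, assign, sanitize, sink) and the function find_tainted_sinks are hypothetical and chosen only for illustration. The sketch handles a straight-line program without branches and deliberately ignores aliasing, so it is much simpler than the analyses performed by the tools surveyed in this paper.

```python
def find_tainted_sinks(program):
    """Return the indexes of sink statements that receive tainted data."""
    tainted = set()
    alerts = []
    for index, stmt in enumerate(program):
        op = stmt[0]
        if op == "source":                 # e.g. reading an HTTP parameter
            tainted.add(stmt[1])
        elif op == "assign":               # target = f(operands...)
            target, operands = stmt[1], stmt[2:]
            if any(v in tainted for v in operands):
                tainted.add(target)        # taint flows to the target
            else:
                tainted.discard(target)    # overwritten with clean data
        elif op == "sanitize":             # target = validate(operand)
            tainted.discard(stmt[1])
        elif op == "sink":                 # e.g. a database query call
            if stmt[2] in tainted:
                alerts.append(index)
    return alerts

# Hypothetical program: the first query is built from raw input,
# the second one from a validated copy.
PROGRAM = [
    ("source", "name"),                    # name comes from the user
    ("assign", "query", "name"),           # query is built from name
    ("sink", "pg_query", "query"),         # tainted data reaches the sink
    ("sanitize", "safe", "name"),          # safe = escape(name)
    ("assign", "query", "safe"),           # query is rebuilt from safe
    ("sink", "pg_query", "query"),         # clean data reaches the sink
]
```

For this program, find_tainted_sinks(PROGRAM) reports only the statement at index 2.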

4 Network Application Vulnerabilities

This section describes the three most common security vulnerabilities in network software. All of them stem from unvalidated user input, an issue discussed in Section 3.2.

4.1 Buffer Overflow

Buffer overflows are one of the most common security threats in software[18]. Programs implemented in the C and C++ programming languages are very likely to have buffer overflow vulnerabilities because of their low-level access to memory and because some common library functions lack bounds checking. This applies to all network software written in C/C++, ranging from simple network tools to web servers and operating system protocol stacks.

Buffer overflows occur when too much data is written to a fixed-length buffer without checking whether the data actually fits into the memory allocated for the buffer. For example, the C function strcpy copies a string to an array. If the supplied string contains more data than is reserved for the array, the function will overwrite the memory locations following the array, causing a buffer overflow. Buffer overflow vulnerabilities enable a malicious user, for example, to overwrite function return addresses and thereby to execute arbitrary code remotely. Wilander and Kamkar[20] explain buffer overflows in more detail.

4.2 Cross-site Scripting

Cross-site scripting (XSS) is a web application vulnerability that enables an attacker to execute code with the credentials of another user. For example, by entering JavaScript code (e.g. through a URL parameter or via the database) into a web site that displays unescaped user input, an attacker is able to execute arbitrary JavaScript commands with the access rights of the viewer and to steal the credentials of the user. OWASP ranks cross-site scripting as the most serious web application vulnerability of 2007[14]. More information about cross-site scripting can be found, for example, in CERT advisory 02/2000 [1].
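The usual countermeasure, escaping user input before it is displayed, can be sketched in a few lines. The function render_comment and the payload are hypothetical; html.escape is part of the Python standard library. In the terms of Section 3.2, the escaping call is the step that removes taint before the data reaches the sink.

```python
import html

def render_comment(comment):
    """Embed user input in an HTML fragment, escaping it first."""
    return "<p>" + html.escape(comment) + "</p>"

# Hypothetical attack input that would run in the viewer's browser if unescaped.
payload = "<script>alert(1)</script>"
page = render_comment(payload)
# page == "<p>&lt;script&gt;alert(1)&lt;/script&gt;</p>"
```

The browser now displays the payload as text instead of executing it.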

4.3 Injection

Command injection means injecting arbitrary commands as input to an application, which subsequently executes the commands without performing adequate input validation.

SQL injection is a special form of command injection directed against databases. SQL injection can occur when unescaped malicious input is used to construct an SQL database query. This enables the execution of arbitrary SQL commands supplied by an attacker. As can be seen from the PHP example below, the $name variable is initialized from an HTTP GET parameter and used to construct an SQL query without any input validation.

// Vulnerable: user input is inserted into the query without validation.
$name = $_GET['name'];
$query = "SELECT address FROM users WHERE name = '$name';";
$result = pg_query($query);

This example works as intended, returning the address corresponding to a name, if the name parameter really contains a name consisting only of letters. But consider a situation where a malicious user sets the input to:

'; DELETE FROM users WHERE '' = '

The SQL query to be executed then becomes:

SELECT address FROM users WHERE name = '';
DELETE FROM users WHERE '' = '';

Executing this query would delete all the rows in the users table, but any other kind of manipulation of the backend database is also possible (e.g. changing user credentials to bypass authentication). Catching SQL injection errors is therefore an essential part of developing web applications and other database-driven network applications.
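The standard remedy is to pass user input to the database separately from the SQL text. The sketch below contrasts the two approaches using Python's built-in sqlite3 module as a stand-in for the PostgreSQL interface of the PHP example; the table contents and function names are assumptions made for illustration. Because sqlite3's execute method accepts only a single statement, the attack shown here uses a payload that alters the WHERE clause rather than appending a DELETE statement.

```python
import sqlite3

def make_db():
    """Create an in-memory database with two illustrative rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (name TEXT, address TEXT)")
    db.executemany("INSERT INTO users VALUES (?, ?)",
                   [("alice", "1 Main St"), ("bob", "2 Side St")])
    return db

def lookup_unsafe(db, name):
    # Vulnerable: the input is pasted directly into the SQL text.
    query = "SELECT address FROM users WHERE name = '" + name + "'"
    return db.execute(query).fetchall()

def lookup_safe(db, name):
    # The value is bound to the ? placeholder and is never parsed as SQL.
    query = "SELECT address FROM users WHERE name = ?"
    return db.execute(query, (name,)).fetchall()

db = make_db()
attack = "' OR '' = '"
leaked = lookup_unsafe(db, attack)    # WHERE clause is always true: every row
blocked = lookup_safe(db, attack)     # no user has this literal name: no rows
```

In the vocabulary of Section 3.2, lookup_unsafe is a sink that receives tainted data, which is the pattern that taint-style analysis tools are designed to report.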

5 Static Analysis Security Tools

Various static analysis tools for finding different kinds of security issues have been implemented and successfully used to detect security problems in many widely deployed programs. We give some examples of these tools and programs later in this section.

5.1 Detection Rules

A good static analysis tool separates the program logic from the rules for detecting vulnerabilities. Separate rules make it possible for users to extend or change the rules of the tool, thereby making the tool more versatile and flexible. However, this requires a special syntax for the rules that is readable by both humans and computers. Many tools use external files for defining the detection rules, but there are other methods as well, such as annotations.

Some tools use annotations to document rules that define what kinds of problems the tool should report and how the program is designed to behave. Annotations are written directly into the program code. They are typically defined using a special comment syntax or a suitable annotation facility built into the programming language. For example, the Java programming language introduced annotations in version 5.0 to replace ad hoc annotation mechanisms.
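The separation of rules from program logic can be sketched as follows. The JSON rule format, the field names and the functions load_rules and scan are hypothetical and do not correspond to the rule syntax of any tool discussed in this paper. For brevity the scanner uses plain pattern matching per line rather than a real lexer.

```python
import json
import re

# Hypothetical external rule file, shown inline to keep the sketch self-contained.
RULES_JSON = """
[
  {"function": "gets",   "severity": "high",   "message": "no bounds check on input"},
  {"function": "strcpy", "severity": "medium", "message": "destination size is not checked"}
]
"""

def load_rules(text):
    """Parse the rule file into a dict keyed by function name."""
    return {rule["function"]: rule for rule in json.loads(text)}

def scan(source, rules):
    """Report (line, function, severity) for each call to a function named in rules."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in re.finditer(r"\b([A-Za-z_]\w*)\s*\(", line):
            rule = rules.get(match.group(1))
            if rule:
                findings.append((lineno, rule["function"], rule["severity"]))
    return findings

# Hypothetical C fragment to be scanned.
CODE = "char buf[8];\ngets(buf);\nstrcpy(dst, src);\n"
findings = scan(CODE, load_rules(RULES_JSON))
# findings == [(2, "gets", "high"), (3, "strcpy", "medium")]
```

Adding a rule for another function changes what is reported without any change to scan, which is the flexibility that separate detection rules are meant to provide.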

5.2 Review of Current Tools

Different programming languages pose different challenges for static analysis because of their different characteristics. For example, one of the main concerns with C/C++ programs is buffer overflow vulnerabilities, which are very easy to introduce by accident (e.g. by using the gets function, which performs no bounds checking). Scripting languages, such as PHP or Ruby, bring unique challenges for static analysis tools because of their many dynamic properties. These properties include dynamic typing of variables (with implicit casts), the lack of explicit variable declarations and dynamic inclusion of code[21]. Because of these language-specific features and problems, almost all of these static analysis tools are specialized to find issues in only one language. Below we list a number of tools for this purpose:

• ITS4 by Cigital, Inc., one of the early static analysis security tools, concentrates on detecting function calls, such as gets(), that may pose a security threat when used incorrectly. The tool performs only basic lexical analysis, for example to separate comments from function calls, and thus provides little help in detecting more complex and context-specific security problems[16]. ITS4 supports the C and C++ programming languages and uses a separate vulnerability definition file. ITS4 is no longer officially supported by Cigital, but the source code is available for any use that does not compete with Cigital's consulting practice.

• FlawFinder[19] is an open source tool, distributed under the terms of the GNU General Public License (GPL), designed for detecting risky function calls in C/C++ programs. Like ITS4, the tool employs lexical analysis to detect hazardous functions, but it uses a built-in vulnerability database instead of a separate definition file. Because of these properties, the tool is suitable only for detecting very basic vulnerabilities.

• RATS[10] (Rough Auditing Tool for Security) is also licensed under the GPL, and is used for detecting security problems in various programming languages (C, C++, Perl, PHP and Python). Like ITS4 and FlawFinder, the tool relies on lexical analysis and therefore provides only a rough analysis of certain relatively simple security problems, in other words, hazardous function calls.

• BOON[18] focuses solely on detecting buffer overflow vulnerabilities using integer range analysis. BOON ignores many important issues, such as pointer aliasing, statement order and interprocedural dependencies[6]. BOON has successfully been used to find buffer overflow vulnerabilities in popular software, for example in the Linux net tools package [18].

• Pixy is an open source tool for detecting taint-style vulnerabilities (cross-site scripting, SQL and command injection) in PHP code. Pixy uses flow-sensitive, interprocedural and context-sensitive data flow analysis to detect taint-style vulnerabilities[11]. Literal and alias analysis are also employed to improve the results [12].

• Splint is an open source static analysis tool for finding software vulnerabilities in ANSI C code. The program uses annotations to find abstraction violations, unannounced modifications to global variables and other problems. The tool can also detect different types of buffer overflow and memory leak vulnerabilities [7].

• LAPSE (Lightweight Analysis for Program Security in Eclipse) is an open source tool for detecting common security problems in web applications implemented with Java J2EE. The tool is available as a plugin for the Eclipse IDE. LAPSE detects different types of tainted input vulnerabilities, including SQL injection, cross-site scripting, cookie poisoning and parameter manipulation [13].

• The ARCHER tool employs a simulation-based approach for detecting memory access errors in C programs. It has a low false positive rate, does not need annotations, and scales well to programs with millions of lines of code. One of the drawbacks of ARCHER is that it does not understand C string operations, causing it to miss many common errors. The tool has found hundreds of errors in the Linux kernel and other systems[22].

• CQual is an open source tool that performs type-based analysis, for example to detect deadlocks and format string vulnerabilities in C/C++ programs. The tool requires the user to define some annotations (type qualifiers) as a basis for the taint propagation analysis it performs. The tool has been successfully used to find potential deadlocks in the Linux kernel[8].

• WebSSARI (Web application Security via Static Analysis and Runtime Inspection) is a tool for detecting security vulnerabilities in PHP code. WebSSARI is also able to automatically insert runtime guards in sections that it finds possibly unsafe[9]. WebSSARI has been successfully used to find various security problems in multiple widely-used PHP software components.

• Eau Claire is a theorem-prover-based static analysis tool for identifying buffer overflows, file access race conditions and format string vulnerabilities in C programs. By default the tool can identify array bounds errors and null pointer dereferences. It also allows the user to define specifications for checking custom security properties of functions[4].

• MOPS (MOdel checking Program for Security) is a tool that employs model checking for detecting temporal safety property violations in C programs. Because of the formal approach of MOPS, it can reliably verify the absence of certain classes of vulnerabilities in a program[3]. MOPS has been successfully used to detect multiple security vulnerabilities, for example in Red Hat Linux[15].

• SATURN is a framework for detecting violations of temporal safety properties in C programs. The tool is based on boolean satisfiability and has been used to detect multiple locking problems in the Linux kernel. However, it can be used to detect many other types of problems as well, such as memory leaks.

6 Static Analysis and Securing Network Applications

In this section, we discuss the use of static analysis to improve the security of network software. Many of the tools described in Section 5 can be used to enhance the security of all kinds of applications. As a matter of fact, some of these tools have successfully found various vulnerabilities in well-known and widely deployed network applications and frameworks[13].

The term network software is very abstract and can mean just about any application connected to a network of some kind. The safety requirements of different network software also vary a lot.

As stated in Section 4, typical attacks against current network software include buffer overflow attacks, SQL injection and cross-site scripting (XSS). The previous section described many tools that can be used to detect these vulnerabilities in applications. We were pleased to find that there are multiple tools for detecting these very common vulnerabilities, which are unfortunately easy to introduce. In particular, tools such as Pixy, WebSSARI and LAPSE seem promising for eliminating web application vulnerabilities in common implementation languages, such as PHP and Java. There are also many tools for detecting security problems in lower-level network applications, such as web servers, which are often implemented in C/C++.

7 Conclusions

There are many static analysis tools available for improving network application security. However, because these tools are very language-specific, the implementation language may rule out useful tools. Static analysis tools can help to detect many complex security problems which would otherwise go unnoticed. However, these tools are not perfect and their results always need human inspection. Static analysis tools are unable to detect security problems that result from insecure software design, but they can help developers avoid many of the common mistakes which seem to occur in software over and over again.

References

[1] CERT. Malicious HTML tags embedded in client web requests. CERT Web Site, February 2000. http://www.cert.org/advisories/CA-2000-02.html.

[2] CERT. CERT statistics: Vulnerability remediation. CERT Web Site, September 2007. http://www.cert.org/stats/vulnerability_remediation.html.

[3] H. Chen and D. A. Wagner. Mops: an infrastructure for examining security properties of software. Technical report, Berkeley, CA, USA, 2002.

[4] B. Chess. Improving computer security using extended static checking. In SP '02: Proceedings of the 2002 IEEE Symposium on Security and Privacy, page 160, Washington, DC, USA, 2002. IEEE Computer Society.

[5] B. Chess and G. McGraw. Static analysis for security. IEEE Security and Privacy, 2(6):76–79, 2004.

[6] B. Chess and J. West. Secure Programming with Static Analysis. Addison Wesley, 2007.

[7] D. Evans and D. Larochelle. Improving security using extensible lightweight static analysis. Software, IEEE, 19(1):42–51, 2002.

[8] J. S. Foster, T. Terauchi, and A. Aiken. Flow-sensitive type qualifiers. Technical report, Berkeley, CA, USA, 2001.

[9] Y.-W. Huang, F. Yu, C. Hang, C.-H. Tsai, D.-T. Lee, and S.-Y. Kuo. Securing web application code by static analysis and runtime protection. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 40–52, New York, NY, USA, 2004. ACM Press.

[10] Secure Software Inc. RATS: Rough Auditing Tool for Security, 2001. http://www.securesoftware.com/.

[11] N. Jovanovic, C. Kruegel, and E. Kirda. Pixy: A static analysis tool for detecting web application vulnerabilities (short paper). In SP '06: Proceedings of the 2006 IEEE Symposium on Security and Privacy (S&P'06), pages 258–263, Washington, DC, USA, 2006. IEEE Computer Society.

[12] N. Jovanovic, C. Kruegel, and E. Kirda. Precise alias analysis for static detection of web application vulnerabilities. In PLAS '06: Proceedings of the 2006 workshop on Programming languages and analysis for security, pages 27–36, New York, NY, USA, 2006. ACM Press.

[13] V. B. Livshits and M. S. Lam. Finding security vulnerabilities in Java programs with static analysis. In Proceedings of the 14th Usenix Security Symposium, pages 271–286, Aug. 2005.

[14] OWASP. The ten most serious web application vulnerabilities. OWASP Web Site, October 2007. http://www.owasp.org/index.php/Top_10_2007.

[15] B. Schwarz, H. Chen, D. Wagner, J. Lin, W. Tu, G. Morrison, and J. West. Model checking an entire Linux distribution for security violations. In ACSAC '05: Proceedings of the 21st Annual Computer Security Applications Conference, pages 13–22, Washington, DC, USA, 2005. IEEE Computer Society.

[16] J. Viega, J. Bloch, Y. Kohno, and G. McGraw. ITS4: A static vulnerability scanner for C and C++ code. ACSAC, 00:257, 2000.

[17] J. Viega, J. T. Bloch, T. Kohno, and G. McGraw. Token-based scanning of source code for security problems. ACM Trans. Inf. Syst. Secur., 5(3):238–261, 2002.

[18] D. Wagner, J. S. Foster, E. A. Brewer, and A. Aiken. A first step towards automated detection of buffer overrun vulnerabilities. In Network and Distributed System Security Symposium, pages 3–17, San Diego, CA, February 2000.

[19] D. A. Wheeler. Flawfinder, 2001. http://www.dwheeler.com/flawfinder/.

[20] J. Wilander and M. Kamkar. A comparison of publicly available tools for dynamic buffer overflow prevention. In Proceedings of the 10th Network and Distributed System Security Symposium, pages 149–162,

[21] Y. Xie and A. Aiken. Static detection of security vulnerabilities in scripting languages. In USENIX-SS'06: Proceedings of the 15th conference on USENIX Security Symposium, pages 13–13, Berkeley, CA, USA, 2006. USENIX Association.

[22] Y. Xie, A. Chou, and D. Engler. Archer: an automated tool for detecting buffer access errors. In