
Application Code

Client-side source code, such as HTML, can be expected to be seen by a Web site's visitors: they simply click View, Page Source (Netscape), or View, Source (MS-IE). However, the amount of client-side code needed to build a Web page is often so voluminous that a developer might doubt that anyone else could be bothered to read it. Unfortunately, this assumption can lull some developers into a false sense of security, leading them to include sensitive information in the source code that they would have left out had they thought that an intruder would read it.

The same is true for server-side interpretive code (CGI scripts, SSI includes, ASP code, and so on), where the developer assumes no one other than the webmaster would see this code. But as can be seen from some of the exploits previously discussed, it should be assumed that any source code (client-side or server-side) may be reviewed by an intruder. There are, however, a few precautions that can be practiced to minimize the amount of useful information an intruder can glean, should he or she gain access to an application's code base.

Compileable Source Code

Should a Web server become compromised, source code that is needlessly stored on it (the code used to compile executables) may give an intruder looking for sensitive information, such as userIDs and passwords, the opportunity to review this code. Yet this opportunity can easily be denied by storing only the executable (object code) on the production server. Note that although code reengineering tools can reconstruct source code from object code, this computer-generated source code is often devoid of comments and meaningful naming conventions, typically making the code extremely hard to read and comprehend.

side or server-side) may also prove useful to an intruder trying to figure out how an application works, and thereby deduce a potential flaw in the application that could be exploited.

Source code comments for interpretive code should ideally be removed before the code is placed into production. Removal could be done by hand or with the assistance of a tool (such as Imagiware's HTML Squisher (www.imagiware.com), which removes superfluous HTML). Table 6.9 lists some Web sites that have software tools that can be used to scan source code for specific character strings, such as a comment header. Note that removing comments should also have the beneficial side effect of slightly speeding up the download of any client-side code.

Table 6.9: Sample Software Tool Libraries Web Sites

NAME ASSOCIATED WEB SITE

ACME Laboratories www.acme.com

CNET Networks www.download.com/shareware.com/zdnet.com

Tucows www.tucows.com

Ultimate Search www.freeware32.com

VA Software www.davecentral.com/osdn.com/sourceforge.net

Note that the Unix grep command can also be used to search source code, and the strings command can be used to search object code.
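As a minimal sketch of the kind of scan described above (it is not one of the tools listed in Table 6.9), the following Python fragment searches HTML text for comments that mention sensitive keywords. The keyword list, the function name find_sensitive_comments, and the sample page are assumptions made purely for illustration.

```python
import re

# Hypothetical keyword list; tune it to the application under test.
SENSITIVE = re.compile(r"password|passwd|userid|secret", re.IGNORECASE)

# Matches an HTML comment and captures its body, even across line breaks.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_sensitive_comments(html_text):
    """Return the HTML comments that mention a sensitive keyword."""
    hits = []
    for match in HTML_COMMENT.finditer(html_text):
        comment = match.group(1).strip()
        if SENSITIVE.search(comment):
            hits.append(comment)
    return hits


# Illustrative page content, not taken from any real site.
page = """<html><!-- nav bar -->
<!-- TODO: remove test login, userID=admin password=letmein -->
<body>Welcome</body></html>"""

for comment in find_sensitive_comments(page):
    print("Suspicious comment:", comment)
```

A scan of this kind only flags candidates for review; a comment that avoids the chosen keywords would still need to be caught by stripping all comments before release.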

If testing occurs against nonfinal code (such as commented code that has yet to be stripped of its comments), then another iteration of testing against the final (comment-stripped) version would be recommended. Otherwise, the application runs the risk that an error will be accidentally introduced as a by-product of removing the comments (especially if the comments are deleted by hand).

In the case of client-side code, an intruder may not even need to compromise the Web site. Certain tools can enable an intruder to download and then search an entire Web site for such useful keywords as passwords or userIDs (see Table 6.10).

Table 6.10: Sample Web Site Crawlers/Mirroring Tools

NAME ASSOCIATED WEB SITE

Crawl www.monkey.org

Sam Spade www.samspade.org

Teleport Pro www.tenmax.com

Wget www.wget.sunsite.dk

Copyrights

One exception to the rule of leaving no comments in production code is the inclusion of a copyright statement in the source code. Depending upon the jurisdiction, adding such protection may increase the number of offenses intruders would commit if they attempted to access a Web site by altering the Web site's code, potentially increasing the penalty they would face should they be caught.

Helpful Error Messages

In the same way that the dynamic code environment should be checked (described previously in this chapter), applications should also be checked to ensure that, if an error occurs in the production environment, the application doesn't give up too much detailed information about its internal workings to an external user. For example, returning the text of a failed SQL statement to the requesting browser may provide an intruder with information on the application's database schema, which he or she may be able to utilize, should the intruder gain limited access to the database server.
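One way an error handler might keep such detail away from the browser is sketched below in Python, using the standard sqlite3 module as a stand-in for the application's database. The function name run_query, the table name, and the wording of the generic message are illustrative assumptions, not a prescribed design.

```python
import logging
import sqlite3

log = logging.getLogger("app")

GENERIC_ERROR = "An internal error occurred. Please try again later."


def run_query(conn, sql, params=()):
    """Return (ok, payload); on failure the payload is a generic message."""
    try:
        return True, conn.execute(sql, params).fetchall()
    except sqlite3.Error:
        # Full detail, including the failed SQL, goes to the server-side log only.
        log.exception("query failed: %s", sql)
        return False, GENERIC_ERROR


# The table does not exist, so the query fails and the handler is exercised.
conn = sqlite3.connect(":memory:")
ok, payload = run_query(conn, "SELECT card_no FROM customers")
print(ok, payload)
```

The message handed back to the client names neither the table nor the column, while the server-side log keeps everything a developer would need to diagnose the failure.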

Another common error message that is often more helpful than it needs to be is that of the failed login attempt. Some applications will inadvertently let attackers know when they have guessed the right userID. For instance, if entering an invalid userID and password combination results in an "Invalid login attempt" error message, but guessing the right userID (while still using an invalid password) results in an "Invalid password" error message, observant attackers will notice the change in error messages. They will thus be able to deduce that they have stumbled across a valid userID and thereby focus their effort on cracking the password (a much easier task to accomplish now that a valid userID is known).
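The following Python sketch illustrates the safer behavior: the same message is returned whether the userID or the password was wrong. The in-memory USERS table, the login function, and the message text are hypothetical, and a real application would compare salted password hashes instead of plain-text passwords.

```python
# Hypothetical user store, for illustration only.
USERS = {"alice": "s3cret"}

GENERIC_FAILURE = "Invalid login attempt"


def login(user_id, password):
    """Return a result message that does not reveal which field was wrong."""
    stored = USERS.get(user_id)
    if stored is not None and stored == password:
        return "Welcome"
    # Unknown userID and wrong password share one indistinguishable message.
    return GENERIC_FAILURE


print(login("alice", "wrong"))  # valid userID, invalid password
print(login("bob", "wrong"))    # invalid userID
```

A test for this weakness would submit both kinds of failed login and confirm that the two responses are identical.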

Old Versions

Any components of a Web application that are no longer needed or that have been superseded by a more recent version should be removed from the production server as soon as configuration management procedures permit. Should an intruder discover these legacy components, he or she might be able to use them to interfere with (or crash) a legitimate process or even corrupt the application's database, in effect launching a form of denial-of-service attack.

Table 6.11 summarizes the application code security checks that should be considered. Viega (2001) provides additional information on auditing application source code for security vulnerabilities.

Table 6.11: Application Code Checklist

YES NO DESCRIPTION

□ □ Do the coding standards used to build the application include guidelines on the use/nonuse of comments, copyright notices, error messages, and hard-coding sensitive data into an application?

□ □ With the exception of a copyright notice, have all comments been removed from the production version of any noncompileable (interpretive) application code?

□ □ Have the application's error handlers been reviewed to ensure they do not divulge too much information to the client when invoked in the production environment?

could disable or modify any client-side validation routine, none of this input data should be processed until it has first been reexamined by a server-side validation routine. There are several different data input scenarios that the server-side validators should be able to deal with, and should therefore be tested to ensure that they are indeed robust enough to handle these situations. Whittaker (2002) offers extensive guidance on the categorization and selection of test input data that may be used by a testing team to break software.

Invalid Data Types

Most programming languages do not take kindly to receiving input data in a data format different from the one specified by a program's input parameters. Such an occurrence may result in data truncation, incorrect conversions, or even the demise of the program itself. For example, a developer may have used a drop-down HTML control for a user to rate a new product, and expect to receive a rating between 1 and 10 (inclusive) from the resulting HTML form submittal. Unfortunately, an HTML-savvy attacker could quite easily edit the HTML form (or HTTP message) and replace the expected numeric input with a value such as astalavistababy. If this erroneous data is not caught by the data validation routine, the recipient application may do any number of things with the bogus input, none of which are likely to be desirable.

Each data validation routine should therefore be tested to ensure that it is able to appropriately handle input data of the wrong data type. This testing can be accomplished by editing the HTML used to build the data input Web page, writing a custom test harness, or using the scripting capabilities of a functional testing tool (such as one of those listed in Table 6.12).
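To make the product-rating example concrete, a server-side validator of the kind being tested might resemble the Python sketch below. The function name parse_rating and the decision to return None for rejected input are assumptions made for illustration.

```python
def parse_rating(raw):
    """Return the rating as an int from 1 to 10, or None if the input is invalid."""
    # Reject anything that is not a short string of plain ASCII digits.
    if not isinstance(raw, str) or not raw.isascii() or not raw.isdigit():
        return None
    if len(raw) > 2:
        return None
    value = int(raw)
    return value if 1 <= value <= 10 else None


# Wrong-type and out-of-range inputs that a test harness might submit.
for candidate in ["7", "10", "astalavistababy", "", "-3", "0", "11", "7.5"]:
    print(repr(candidate), "->", parse_rating(candidate))
```

The loop at the end plays the role of a very small test harness: each wrong-type value should be rejected without raising an exception.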

Table 6.12: Sample Web-Testing-Based Scripting Tools

NAME ASSOCIATED WEB SITE

AutoTester ONE www.autotester.com

e-Test Suite www.empirix.com

EValid www.soft.com

Function Checker www.atesto.com

SilkTest www.segue.com

TeamTest and Test Studio www.rational.com

TestPartner www.compuware.com

WebART www.oclc.org

WebFT www.radview.com

WinRunner, XRunner, and Astra QuickTest www.mercuryinteractive.com

WinTask www.wintask.com

Invalid Ranges

A Web developer may decide to use some of the built-in validation capabilities of a client-side language (such as HTML, JavaScript, or VBScript) to ensure that an input value is no longer (or shorter) than expected. For example, the developer may have used an HTML form field with a maxlength of 2, combined with some client-side JavaScript, to ensure that when the user is asked to "enter the month you were born," he or she enters a value between 1 and 12. However, if the user has disabled client-side scripting (perhaps as a security precaution of his or her own), the user would be able to enter values such as 0, 35, 58, or 93. If he or she were also willing to edit the HTML form itself, values such as 123, 4567, 89012, and so on, become possible. This would potentially cause problems for an unsuspecting server-side process.

Rather than relying on manual tests to input numerous combinations of test input data, scripting tools (such as those listed in Table 6.12) can be used to automate this often-tedious process and also potentially reduce the effort needed to run a regression test of these features in the future.

It is typically impossible to test every conceivable input value, because the number of possible input values is often infinite. Instead of testing each input field with a random selection of input values or a large range of numbers, traditional testing techniques such as equivalence partitioning and boundary value analysis can be used to help identify optimal test data for invalid range testing.

Buffer Overflows

A variation on the invalid input attack is to submit huge volumes of data in an attempt to cause a buffer overflow (a technique described in great detail by Aleph One in his white paper, "Smashing the Stack for Fun and Profit," available from www.insecure.org/stf/smashstack.txt). A buffer overflow occurs when the size of the input data exceeds the room reserved for it in a program's memory, causing the program that was processing the data to fail. An attacker may use this type of attack to bring down the Web site, effectively creating a denial-of-service attack. He or she could also use a buffer overflow as a means of inserting a command into the portion of memory that the server uses to store the data and commands that it is currently processing (for instance, a stack or heap). This is done in the hope that the server will inadvertently execute this covert command, a sequence of events described in greater detail by Skoudis (2001) and Viega (2001). If the program that is compromised is running with administrator (or root) privileges, then the rogue command will be executed with this inherited power. All input data should therefore be bounds-checked to avoid buffer overflows (a technique expanded on later in this chapter).
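The idea of bounds-checking can be sketched in a few lines of Python. Python itself is not prone to classic stack overflows, so the fragment only illustrates the check a server-side validator might apply before handing data to lower-level code; the 64-byte limit and the function name bounded are assumptions.

```python
MAX_FIELD_BYTES = 64  # hypothetical limit for a single form field


def bounded(raw, limit=MAX_FIELD_BYTES):
    """Return raw unchanged if it fits within limit bytes, else raise ValueError."""
    if len(raw) > limit:
        raise ValueError(f"input of {len(raw)} bytes exceeds the limit of {limit}")
    return raw


# A test might submit input at the limit and just beyond it.
for size in (MAX_FIELD_BYTES, MAX_FIELD_BYTES + 1, 100000):
    try:
        bounded(b"A" * size)
        print(size, "accepted")
    except ValueError as err:
        print(size, "rejected:", err)
```

Rejecting oversized input outright, as this sketch does, is generally preferable to silently truncating it, since truncation can itself change the meaning of the data.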

A CRASH COURSE IN TWO TESTING TECHNIQUES

Equivalence partitioning attempts to group the entire set of possible values for an input field into two classes: valid and invalid (a task that may necessitate gaining knowledge of how the application will process the data). A minimal set of test cases would necessitate using at least one value from each class.

Boundary value analysis (BVA) complements equivalence partitioning. Rather than selecting any element in an equivalence class, BVA selects those values at the edge (or boundary) of the class. The assumption is that errors tend to occur at the boundaries of valid input data, and using the values that are closest to the boundary will most likely

(basically the boundary value plus and minus the next possible input value). For the upper boundary, the values 11, 12, and 13 would be appropriate.

Jorgensen (1995) and Kaner et al. (1999) provide much more detailed explanations of these two testing techniques, while Beizer (1995) explains an associated testing technique: domain testing.
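Applied to the birth-month field discussed earlier (valid values 1 through 12), boundary value selection can be sketched in Python as follows. The helper names boundary_values and is_valid_month are hypothetical, and a step of 1 between adjacent input values is assumed because the field holds whole numbers.

```python
def boundary_values(low, high):
    """Values at, just below, and just above each boundary of [low, high]."""
    return sorted({low - 1, low, low + 1, high - 1, high, high + 1})


def is_valid_month(value):
    """Stand-in for the server-side range check under test."""
    return 1 <= value <= 12


# Each boundary value is paired with the result the validator should produce.
for value in boundary_values(1, 12):
    print(value, "valid" if is_valid_month(value) else "invalid")
```

The six values produced cover both equivalence classes, since 0 and 13 are invalid and the remaining four are valid, so this small set also satisfies the minimal equivalence partitioning requirement.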