Buffer Overflow Testing

If application source code reviews and inspections are not an option, or nonfailure of the application is so critical that additional testing is warranted, then an application (together with any third-party and system utilities that it utilizes) can be further tested. This can be done by executing destructive tests intentionally designed to detect the existence of a buffer overflow vulnerability. Dustin et al. (2001) and Whittaker (2002) provide additional information on testing applications for buffer overflow vulnerabilities.

Unlike an attacker trying to create a new exploit, a tester should not have to refine an application crash (or any observable weird behavior) to the point where he or she would insert a malicious system command into memory to prove that the application is vulnerable to a buffer overflow. Any application crash or uncontrolled behavior is a potential denial-of-service issue.

• Network packet manipulation tools, such as SendIP (www.earth.li/projectpurple).

• HTML authoring tools, such as FrontPage (www.microsoft.com), which can be used to edit an application's Web pages to permit huge data inputs via a regular browser.

• Scripting languages, such as Perl (www.perl.com).

• Testing tools that record and play back HTTP transactions at the network layer, such as Webload (www.radview.com).

• Testing tools that record and play back browser interactions at the browser layer, such as e-Test Suite (www.empirix.com).

Whichever data submission tool is used, it should be evaluated with the rest of the test environment to ensure that it does not prematurely truncate the test data that it is being asked to submit to the target application.

Calibrating the Test Environment

Before any application can be tested for a buffer overflow, the testing environment must first be checked to determine the maximum size of test input data that it will permit. This is necessary because many system software products truncate extremely long input data. This truncation not only protects any recipient of the data, but also, ironically, inhibits tests designed to probe an application beyond this system limitation. Unfortunately, these system software truncations can't always be counted on to protect an application, because the point at which a truncation occurs may vary from module to module. For example, the system software code used to handle an HTTP GET command may have a different limitation than that used to process an HTTP POST request.

In an ideal test environment, it should be possible for the testing team to use input data of an infinite size. Unfortunately, sooner or later the test environment itself will truncate the data, thereby prohibiting further testing. Since a truncation could occur at any system software layer, the fewer layers that the test input data has to pass through, the more likely it is that it will make it to the intended target application without being curtailed. At a minimum, the test environment should permit an input string of 64K characters to be passed to the target application without being truncated. Figure 6.2 illustrates these different layers.
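The calibration step described above can be sketched as follows. This is only an illustration under stated assumptions: the `submit` callable, the simulated layers, and their limits are all hypothetical stand-ins for a real test environment in which the data would be sent to an echo-style endpoint and the received length measured.

```python
from typing import Callable, Iterable, List, Optional


def make_layered_environment(layer_limits: Iterable[Optional[int]]) -> Callable[[str], int]:
    """Simulate a stack of layers; each truncates at its own limit (None = no limit)."""
    limits = list(layer_limits)

    def submit(data: str) -> int:
        # Pass the data through every layer in turn, truncating where required.
        for limit in limits:
            if limit is not None:
                data = data[:limit]
        # Report how many characters reached the far end.
        return len(data)

    return submit


def largest_untruncated_size(submit: Callable[[str], int], sizes: List[int]) -> Optional[int]:
    """Return the largest tried size that arrived intact, or None if none did."""
    largest = None
    for size in sorted(sizes):
        if submit("X" * size) == size:
            largest = size
        else:
            break  # the first truncation marks the environment's limit
    return largest


# Hypothetical limits: submission tool 1,048,576; Web server 16,384; last layer unlimited.
submit = make_layered_environment([1_048_576, 16_384, None])
sizes = [256 * 2 ** n for n in range(9)]  # 256, 512, ... 65,536

limit = largest_untruncated_size(submit, sizes)
meets_minimum = limit is not None and limit >= 64 * 1024
print(limit, meets_minimum)
```

In this simulated environment the middle layer truncates at 16,384 characters, so the environment would fall short of the 64K minimum suggested above and one of the layers would need to be reconfigured or bypassed.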

Figure 6.2: Test environment layers.

Test Logs

Using an automated tool makes data submission a lot easier (no need to hit the X key several thousand times). Unless the tool generates some sort of test log, however (recording the input used for each test and the associated test results), identifying the input stream that first caused an application to fail is likely to necessitate several reruns and prove rather tedious.

Identifying the specific input field that caused the failure, together with the length of data being used at the time, should prove invaluable for the developer charged with diagnosing the application's overflow problem.
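A minimal sketch of such a test log is shown below. The record layout (`LogEntry`), the outcome labels, and the sample entries are assumptions made for illustration; a real tool may record considerably more detail.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, Optional


@dataclass
class LogEntry:
    test_id: int
    field_name: str    # the input field that received the data
    input_length: int  # number of characters submitted
    outcome: str       # hypothetical labels: "ok", "truncated", "error", "crash"


def write_log(entries: Iterable[LogEntry], stream) -> None:
    """Write the entries as CSV so the log can be reviewed after the run."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(LogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


def first_failure(entries: Iterable[LogEntry]) -> Optional[LogEntry]:
    """Return the first entry whose outcome was anything other than "ok"."""
    for entry in entries:
        if entry.outcome != "ok":
            return entry
    return None


# Hypothetical results from a short test run.
entries = [
    LogEntry(1, "userID", 264, "ok"),
    LogEntry(2, "userID", 520, "ok"),
    LogEntry(3, "password", 264, "ok"),
    LogEntry(4, "password", 520, "crash"),
]

buffer = io.StringIO()
write_log(entries, buffer)
failure = first_failure(entries)
print(failure)
```

With a log like this, the field name and input length of the first failing test can be handed to the developer without rerunning the whole series.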

Test-Entry Criteria

If not already done as part of functional testing, the initial series of tests should establish the positive functionality of the application (for example, does the application process valid input sizes correctly, never mind invalid lengths), effectively creating an entry criterion for the destructive testing that will be performed to search for the existence of a buffer overflow. The exact size of the input data should be the maximum size allowed by the application's user interface, program specification, application requirements, or the size of the database field used to store the data (which theoretically should all be the same).

A supplemental functional check would involve reviewing the final destination of the input data (typically a database field) to ensure that the largest permissible input has not been inadvertently truncated somewhere on its way through the application. This is particularly useful when conducting internationalization testing using non-Latin character sets that utilize double-byte character encodings.
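The reason double-byte encodings deserve this extra check can be illustrated with a short sketch: the same number of characters can occupy very different numbers of bytes. The 256-character limit, the choice of encodings, and the helper names are assumptions for illustration only.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


def fits(text: str, encoding: str, byte_capacity: int) -> bool:
    """True if the encoded text fits in a destination sized in bytes."""
    return encoded_size(text, encoding) <= byte_capacity


MAX_CHARS = 256                # hypothetical maximum input length
latin = "a" * MAX_CHARS
kana = "\u3042" * MAX_CHARS    # HIRAGANA LETTER A, repeated 256 times

print(encoded_size(latin, "shift_jis"))  # one byte per character
print(encoded_size(kana, "shift_jis"))   # two bytes per character
print(encoded_size(kana, "utf-8"))       # three bytes per character
print(fits(kana, "shift_jis", 256))
```

A destination sized for 256 bytes would hold the Latin input but not the 256-character kana input, which is exactly the kind of silent truncation this check is meant to catch.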

Small-Scale Overflows

The next set of tests should focus on testing the functional boundary of the application. By using input values just a few characters larger than those used in the previous set of tests, the application can be tested to make sure that it does not rely solely on client-side checks (which can be easily circumvented) to prohibit invalid data from being submitted to the server-side component of the application.

A small-scale test may be easier to monitor if the final destination of the input data can be modified to accommodate input data slightly larger than what it would normally be expected to receive. For example, if the final destination were a database field defined as 256 characters, temporarily expanding the field to accommodate 300 characters would help the testing team by demonstrating whether or not any truncation took place. If the input data is not truncated, then a high probability exists that the input field is not being bounds-checked correctly and is therefore a good candidate for a buffer overflow scenario. Unfortunately, an appropriately truncated value being deposited in the corresponding database field does not necessarily mean the input field is immune from a buffer overflow attack. At this point, there is no telling where the truncation took place, and a buffer overflow could occur before the point at which the data is truncated.
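The interpretation of a small-scale test result can be sketched as a simple classification. The function name, the result labels, and the use of `None` to represent a rejected submission are assumptions for illustration.

```python
from typing import Optional


def classify_stored_value(submitted: str, stored: Optional[str], declared_limit: int) -> str:
    """Interpret what reached the (temporarily expanded) destination field."""
    if stored is None:
        return "rejected"  # the application refused the oversized input
    if stored == submitted:
        # Oversized data arrived intact: a bounds check is probably missing.
        return "not truncated: overflow candidate"
    if stored == submitted[:declared_limit]:
        # Truncation happened somewhere, but not necessarily before every buffer.
        return "truncated at limit: inconclusive"
    return "altered unexpectedly: investigate"


oversized = "X" * 260  # a few characters beyond a 256-character field

print(classify_stored_value(oversized, oversized, 256))
print(classify_stored_value(oversized, oversized[:256], 256))
print(classify_stored_value(oversized, None, 256))
```

Note that the "truncated at limit" result is deliberately labeled inconclusive, since a buffer overflow could still occur before the point at which the data is truncated.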

(implicitly or explicitly) the size of a data field based on its data type declaration (such as integer, varchar, smallint, or text). Therefore, selecting input stream lengths that are clustered around the common maximum lengths for data types is likely to drastically reduce the number of tests used, without significantly reducing the quality of the testing. (This is an example of boundary value analysis at work.)

For example, for a text-based input field, increasing the data input size in increments of 256, 512, 1K, 2K, 4K, 8K, 16K, 32K, 64K, and so on, is likely to be a much more efficient testing strategy than simply incrementing the input data by a single character each time. The latter would result in 65,536 tests before reaching a test that uses a data input stream of 64K characters.

Since some languages or platforms add or subtract a few characters from the logical length of a data declaration (for instance, a varchar(256) declaration may actually consume 258 bytes), unless the testing team completely understands the inner workings of the platform being used, it may pay to simply add a few additional characters to each input stream. An example would be using data sizes of 264 (256 + 8) and 520 (512 + 8), instead of 256 and 512.
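The selection of input lengths described above can be sketched in a few lines. The function name and its default parameters are illustrative; the padding of 8 characters follows the example in the text.

```python
from typing import List


def boundary_lengths(start: int = 256, stop: int = 64 * 1024, pad: int = 8) -> List[int]:
    """Doubling input lengths from start to stop, each padded by a few characters."""
    lengths = []
    size = start
    while size <= stop:
        lengths.append(size + pad)
        size *= 2
    return lengths


lengths = boundary_lengths()
print(lengths)
print(len(lengths), "tests instead of", 64 * 1024)
```

Nine padded lengths (264, 520, and so on up to 65,544) cover the same range that would otherwise take 65,536 single-character increments.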

Test Optimization

Assuming no observable failure occurs, sooner or later the input stream will reach the maximum size that the test environment can handle without the input data being truncated. If, based on previous experience, the application should be able to handle the maximum input that the test environment can punish it with, a quicker way to prove this hypothesis is to start with the maximum input size that the test environment can submit. If the application can withstand this worst-case scenario, then there is no need to perform the smaller-sized tests. If, on the other hand, this extreme test fails, additional smaller tests will probably be needed to assist with diagnosing the approximate location of the problem.

The Return Leg

If a database field has been defined to be larger than what would normally be needed to store input data from a legitimate source (for example, a varchar(256) field that under normal circumstances never exceeds 50 characters), the testing team may want to consider inserting data directly into the database to fill up such fields. Once populated, these fields can be requested via the application, causing these unexpectedly long values to travel back through the application to the clients, thereby allowing the testing team to check for buffer overflows on the return leg of an application.
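A minimal sketch of seeding such a field is shown below, using an in-memory SQLite database purely for illustration. The table, column, and the `application_read` function are hypothetical; in a real test the read would be made through the application's own interface rather than by a direct query.

```python
import sqlite3

FIELD_SIZE = 256   # declared size of the field
TYPICAL_MAX = 50   # longest value a legitimate source normally produces

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, nickname VARCHAR(256))")

# Bypass the application and fill the field to its declared size.
seeded_value = "N" * FIELD_SIZE
conn.execute("INSERT INTO customers (id, nickname) VALUES (?, ?)", (1, seeded_value))
conn.commit()


def application_read(connection: sqlite3.Connection, customer_id: int):
    """Hypothetical stand-in for requesting the record via the application."""
    row = connection.execute(
        "SELECT nickname FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


returned = application_read(conn, 1)
intact = returned == seeded_value
print(len(returned), intact)
```

The value that comes back is more than five times longer than anything the application normally handles on its return path, which is the condition the return-leg test is designed to create.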

Test Observations

One of the challenges of testing an application for buffer overflows is that in many instances a small buffer overflow does not produce any observable symptoms. For example, a 257-character input stream may actually cause a buffer overflow to occur, but if the application does not crash because the extra character overflowed into a portion of memory that by chance isn't referenced, the defect may be indistinguishable from an application that correctly truncates the input string to 256 characters.

Even if a buffer overflow has occurred, a significantly larger input may be required before the situation can be detected, and then the event may not be as dramatic as a program crash but much more subtle. Examples would be a degradation in system performance or the gradual loss of allocated memory (a memory leak). These are symptoms that may not have much of an impact on the system under light processing loads but could prove disastrous under higher loads.

The ease with which a buffer overflow can be observed also depends upon the resources available to the testing team. For example, testing teams that have access to memory monitors that can dynamically report on an application's memory allocation are at a distinct advantage over those who have to rely on symptoms observable to the naked eye.

Diagnostics

Once a program crashes, or some sort of weird behavior is detected, the testing team may be able to refine their input data to determine the approximate point of the failure, thereby speeding up debugging. Although executing additional destructive tests may identify the exact circumstances that cause a program to fail, it is likely that a code review will prove to be the most efficient way of identifying the exact program location of the defect that results in the buffer overflow.
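One way to refine the input data is a binary search between a length known to pass and a length known to fail. The sketch below assumes, for illustration only, that every length at or above some threshold fails; as noted earlier, small overflows may show no symptoms, so real results may not be this tidy. The `fails` callable and its threshold of 300 are hypothetical.

```python
from typing import Callable, List


def smallest_failing_length(fails: Callable[[int], bool], known_good: int, known_bad: int) -> int:
    """Binary search for the shortest failing input length.

    Assumes fails(known_good) is False, fails(known_bad) is True, and that
    failures are monotonic (every longer input also fails).
    """
    low, high = known_good, known_bad
    while high - low > 1:
        mid = (low + high) // 2
        if fails(mid):
            high = mid  # failure reproduced: look at shorter inputs
        else:
            low = mid   # no failure: look at longer inputs
    return high


attempts: List[int] = []


def fails(length: int) -> bool:
    """Hypothetical application that misbehaves above 300 characters."""
    attempts.append(length)
    return length > 300


threshold = smallest_failing_length(fails, known_good=256, known_bad=64 * 1024)
print(threshold, len(attempts))
```

In this simulated case, the failure point between 256 and 65,536 characters is located in at most 16 additional test runs.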

Since all input data should first be bounds-checked before being processed by any other component of the application, the first candidate for a code review is likely to be the server-side routine that initially receives the input data (and therefore theoretically performs the bounds check). Often the cause can be as simple as a developer forgetting to bounds-check just one of the input fields, or using logic that only permits bounds-checking the data for reasonable out-of-bounds values.

Escape Characters

Some operating systems will execute system-level commands if they are embedded in an application's data input stream. This can occur when the system command is hidden in input data that is prefixed by special control (escape) characters, such as $$. The application may then permit the command to escape up to the process that is currently running the application. The receiving process then attempts to execute the system command using its own system privileges.

Data input streams should therefore always be scanned for suspicious characters as soon as they arrive. Although the specific escape sequence will vary from platform to platform, it's generally considered a safer programming practice to check for the inclusion of only legal characters than to check for and attempt to discard illegal ones. One justification for this viewpoint is the possibility that a developer might have inadvertently forgotten to check for one illegal character (not checking for one legal character would not pose a security risk). Another possibility is that the application is later ported to a platform the developer had not considered (thereby offering the opportunity for a new set of escape characters).
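The difference between the two approaches can be sketched as follows. Both character sets are hypothetical examples: the allow-list accepts only ASCII letters and digits, and the deny-list contains a handful of characters that a developer might think to block.

```python
import re

# Allow-list: only these characters are considered legal (illustrative choice).
ALLOWED_PATTERN = re.compile(r"[A-Za-z0-9]*")

# Deny-list: characters a developer remembered to block (illustrative choice).
DENIED_CHARACTERS = set("$;|&`")


def allow_list_ok(value: str) -> bool:
    """Accept the value only if every character is explicitly legal."""
    return ALLOWED_PATTERN.fullmatch(value) is not None


def deny_list_ok(value: str) -> bool:
    """Accept the value unless it contains a known illegal character."""
    return not any(ch in DENIED_CHARACTERS for ch in value)


samples = ["Smith", "$$shutdown", "name\x1b[2J"]  # the last contains an ASCII escape (1B)
for sample in samples:
    print(repr(sample), allow_list_ok(sample), deny_list_ok(sample))
```

The third sample contains a control character that the deny-list author did not anticipate, so the deny-list accepts it while the allow-list rejects it. A forgotten entry on the allow-list, by contrast, only causes a legitimate value to be refused.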

One temporary usability problem with having extremely tight input data validation rules is that some legitimate input may get rejected, stripped, or replaced because it was wrongly identified as illegal input. For example, a validation routine that checks the input data for the surname field may only permit the characters A through Z (lowercase and uppercase) and space. Unfortunately, this routine fails to consider the situation of a double-barreled name (Baxter Smith-Crow), stripping the hyphen and offending a small number of individuals. This problem is usually temporary in nature, because it is typically identified and fixed relatively

input other than a through z, A through Z, or 0 through 9, including spaces, is suspicious. The team should thus report on any input fields that do not discard this input. For convenience, Table 6.14 lists all possible 128 (7-bit) ASCII input characters. Ideally, the application's specification should explicitly document which of these characters are acceptable (and consequently which ones are not).

Table 6.14: ASCII Data Input Characters

CHARACTER                     ASCII HEX CODE
Null                          00
Start of heading              01
Start of text                 02
End of text                   03
End of transmission           04
Enquiry                       05
Acknowledge                   06
Bell                          07
Backspace                     08
Character tabulation          09
Line feed                     0A
Line tabulation               0B
Form feed                     0C
Carriage return               0D
Shift out                     0E
Shift in                      0F
Data link escape              10
Device control 1              11
Device control 2              12
Device control 3              13
Device control 4              14
Negative acknowledgement      15
Synchronous idle              16
End of transmission block     17
Cancel                        18
End of medium                 19
Substitute                    1A
Escape                        1B
File separator                1C
Group separator               1D
Record separator              1E
Unit separator                1F
Space                         20
Exclamation                   21
Quote                         22
Number sign                   23
Dollar sign                   24
Percent sign                  25
Ampersand                     26
Apostrophe                    27
Left parenthesis              28
Right parenthesis             29
Asterisk                      2A
Plus sign                     2B
Comma                         2C
Hyphen/minus sign             2D
Full stop                     2E
Forward slash                 2F
Digits 0 through 9            30-39
Colon                         3A
Semicolon                     3B
Less than sign                3C
Equals sign                   3D
Greater than sign             3E
Question mark                 3F
Commercial at                 40
Uppercase A through Z         41-5A
Left square bracket           5B
Backslash                     5C
Right square bracket          5D
Circumflex                    5E
Low line                      5F
Grave                         60
Lowercase a through z         61-7A
Left curly bracket            7B
Vertical line                 7C
Right curly bracket           7D
Tilde                         7E
Delete                        7F

Source: www.ansi.org.

POISONING DATABASE INPUT DATA

One technique that attackers may use to execute illicit commands against a database is to insert a database expression where the developer was expecting to receive a single input parameter. For instance, suppose a login Web page consists of two input fields (userID and password). In addition to entering a bogus userID and password, an attacker might append to each input field the following string:

" or "123" <> "1234"

If the input fields are fed directly into a database request, then it's possible that a database might not treat this input string as a single parameter, but instead attempt to evaluate the expression and then find that 123 is indeed not equal to 1234, and thereby permit the attacker to successfully log in. Of course the probability of such an attack being successful is increased if the attacker is first able to view the source code that he or she intends to manipulate (or poison).
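The effect can be demonstrated with a small sketch using an in-memory SQLite database. Several things here are assumptions rather than part of the original example: the table, the credentials, and the function names are hypothetical, and the poisoned string uses single quotes (the standard SQL string delimiter) instead of the double quotes shown above. The second function uses parameter binding, a commonly used way of keeping input from being evaluated as part of the request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
conn.commit()


def login_by_concatenation(connection, user_id: str, password: str) -> bool:
    """Vulnerable: the input fields are fed directly into the database request."""
    query = (
        "SELECT COUNT(*) FROM users WHERE user_id = '" + user_id
        + "' AND password = '" + password + "'"
    )
    return connection.execute(query).fetchone()[0] > 0


def login_with_parameters(connection, user_id: str, password: str) -> bool:
    """Each input is bound as a single parameter and is never evaluated."""
    query = "SELECT COUNT(*) FROM users WHERE user_id = ? AND password = ?"
    return connection.execute(query, (user_id, password)).fetchone()[0] > 0


# Bogus credentials with the always-true expression appended to each field.
poisoned = "bogus' or '123' <> '1234"

print(login_by_concatenation(conn, poisoned, poisoned))
print(login_with_parameters(conn, poisoned, poisoned))
```

With concatenation, the database evaluates the appended expression, finds that 123 is not equal to 1234, and the bogus login succeeds; with bound parameters, the same input is compared as a literal string and the login fails.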

Unfortunately, writing individual, customized data validation routines for each data input stream may result in more coding errors making it into production than if a common (and therefore probably slightly more liberal) set of well-tested data validation routines were reused. Therefore, a trade-off exists between using a large collection of tight validation routines and using a small set of more liberal validation routines.

While certainly not a replacement for good data input (and output) validation routines, there are some tools (examples of which are listed in Table 6.15) that attempt to validate all the input data sent to a Web site, intercepting and discarding any suspicious input data before it is able to do any harm.

Table 6.15: Input Data Validation Tools

NAME                  ASSOCIATED WEB SITE
APS                   www.stratum8.com
G-Server              www.gilian.com
iBroker SecureWeb     www.elitesecureweb.com
URLScan               www.microsoft.com

Whichever approach is used, all data input options should be tested to ensure that their corresponding data input validation routines (third-party or otherwise) are robust enough to withstand the worst possible scenarios. Table 6.16 summarizes these scenarios.

Table 6.16: Validation Routine Checklist

YES NO DESCRIPTION

□ □ Do the coding standards used to build the application include guidelines on the use of input data validation routines?

□ □ Have all data input validation routines been inspected and/or tested to ensure they are able to handle invalid data types?

□ □ Have all data input validation routines been inspected and/or tested to ensure they are able to handle invalid data ranges?

□ □ Have all data input validation routines been inspected and/or tested to ensure they are able to handle buffer overflow attempts?

□ □ Have all data input validation routines been inspected and/or tested to ensure they are able to detect system command escape characters?