9.1 Introduction
Performance test execution is the activity that occurs between developing test scripts and reporting and analyzing test results. Much of the performance testing–related training available today treats this activity as little more than starting a test and monitoring it to ensure that the test appears to be running as expected. In reality, this activity is significantly more complex than just clicking a button and monitoring machines.
9.2 Validate Tests
Poor load simulations can render all previous work useless. To understand the data collected from a test run, the load simulation must accurately reflect the test design. When the simulation does not reflect the test design, the results are prone to misinterpretation. Even if our tests accurately reflect the test design, there are still many ways that the test can yield invalid or misleading results. Although it may be tempting to simply trust our tests, it is almost always worth the time and effort to validate the accuracy of our tests before we need to depend on them to provide results intended to assist in making the “go-live” decision. It may be useful to think about test validation in terms of the following four categories:
Test design implementation. To validate that we have implemented our test design accurately (using whatever method we have chosen), we will need to run the test and examine exactly what the test does.
Concurrency. After we have validated that our test conforms to the test design when run with a single user, run the test with several users. Ensure that each user is seeded with unique data, and that users begin their activity within a few seconds of one another. They should not all start at the same second, as this is likely to create an unrealistically stressful situation that would add complexity to validating the accuracy of our test design implementation. One method of validating that tests run as expected with multiple users is to use three test runs: one with 3 users, one with 5 users, and one with 11 users. These three tests have a tendency to expose many common issues with both the configuration of the test environment (such as a limited license being installed on an application component) and the test itself (such as parameterized data not varying as intended).
Combinations of tests. Having validated that a test runs as intended with a single user and with multiple users, the next logical step is to validate that the test runs accurately in combination with other tests. Generally, when testing performance, tests get mixed and matched to represent various combinations and distributions of users, activities, and scenarios. If we do not validate that our tests have been both designed and implemented to handle this degree of complexity prior to running critical test projects, we can end up wasting a lot of time debugging our tests or test scripts when we could have been collecting valuable performance information.
Test data validation. Once we are satisfied that our tests are running properly, the last critical validation step is to validate our test data. Performance testing can utilize and/or consume large volumes of test data, thereby increasing the likelihood of errors in our dataset. In addition to the data used by our tests, it is important to validate that our tests share that data as intended, and that the application under test is seeded with the correct data to enable our tests.
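The concurrency guidance above (unique seed data per user, starts spread over a few seconds) can be sketched as follows. This is a minimal illustration, not a feature of any particular load-generation tool: the function name plan_virtual_users, the 5-second stagger window, and the record names are all assumptions made for the example.

```python
import random


def plan_virtual_users(user_count, records, max_stagger_seconds=5.0, seed=0):
    """Give each virtual user a unique data record and a staggered start offset.

    All names and defaults here are hypothetical; adapt them to the tool in use.
    """
    # Drop duplicate records while preserving order, so no two users share data.
    unique_records = list(dict.fromkeys(records))
    if len(unique_records) < user_count:
        raise ValueError("not enough unique records to seed every user")

    rng = random.Random(seed)  # fixed seed keeps validation runs repeatable
    plan = []
    for index in range(user_count):
        plan.append({
            "user": index,
            "record": unique_records[index],
            # Spread starts across the window instead of starting all at once.
            "start_offset": round(rng.uniform(0.0, max_stagger_seconds), 3),
        })
    return plan


if __name__ == "__main__":
    sample_records = [f"user{n:03d}" for n in range(20)]
    # The three validation runs described in the text: 3, 5, and 11 users.
    for count in (3, 5, 11):
        for entry in plan_virtual_users(count, sample_records):
            print(count, entry)
```

Raising an error when the dataset has too few unique records surfaces a seeding problem before the run starts, rather than as confusing failures during it.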
9.3 Test Validation
The following are some commonly employed methods of test validation, which are frequently used in combination with one another:
Run the test first with a single user only. This makes initial validation much less complex.
Observe our test while it is running and pay close attention to any behavior we feel is unusual. Our instincts are usually right, or at least valuable.
Use the system manually during test execution so that we can compare our observations with the results data at a later time.
Make sure that the test results and collected metrics represent what we intended them to represent.
Check to see if any of the parent requests or dependent requests failed.
Check the content of the returned pages, as load-generation tools sometimes report summary results that appear to “pass” even though the correct page or data was not returned.
Run a test that loops through all of our data to check for unexpected errors.
If appropriate, validate that we can reset test and/or application data following a test run.
At the conclusion of our test run, check the application database to ensure that it has been updated (or not) according to our test design. Consider that many transactions in which the Web server returns a success status with a “200” code might be failing internally; for example, errors due to a previously used user name in a new user registration scenario, or an order number that is already in use.
Consider cleaning the database entries between error trials to eliminate data that might be causing test failures; for example, order entries that we cannot reuse in subsequent test execution.
Run tests in a variety of combinations and sequences to ensure that one test does not corrupt data needed by another test in order to run properly.
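Several of the checks above amount to not trusting a "200" status on its own. The sketch below shows one way to inspect the returned content as well. It is an assumption-laden illustration: the function name find_response_problems and the default failure markers are hypothetical, and real markers depend on the application under test.

```python
def find_response_problems(status_code, body, expected_text,
                           failure_markers=("error", "already in use")):
    """Return a list of problems found in one response (empty list means OK).

    failure_markers are hypothetical examples of text that signals an
    internal failure even when the server reports success.
    """
    problems = []
    if status_code != 200:
        problems.append(f"unexpected status {status_code}")
    # A successful status does not prove the correct page came back.
    if expected_text not in body:
        problems.append(f"expected text {expected_text!r} not found")
    lowered = body.lower()
    for marker in failure_markers:
        if marker.lower() in lowered:
            problems.append(f"failure marker {marker!r} found")
    return problems


if __name__ == "__main__":
    # A registration attempt that "passes" with 200 but failed internally.
    print(find_response_problems(200, "User name already in use", "Welcome"))
```

A content check of this kind complements, but does not replace, checking the application database after the run.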
9.4 Run Tests
Although the process and flow of running tests are extremely dependent on our tools, environment, and project context, there are some fairly universal tasks and considerations to keep in mind when running tests.
Once it has been determined that the application under test is in an appropriate state to have performance tests run against it, the testing generally begins with the highest-priority performance test that can reasonably be completed based on the current state of the project and application. After each test run, compile a brief summary of what happened during the test and add these comments to the test log for future reference. These comments may address machine failures, application exceptions and errors, network problems, or exhausted disk space or logs. After completing the final test run, ensure that we have saved all of the test results and performance logs before we dismantle the test environment.
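The per-run summary described above can be kept in a simple structured form so that it is easy to search later. The following is only a sketch: the RunSummary fields, the sample values, and the one-JSON-object-per-line layout are assumptions, not a format required by any tool.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RunSummary:
    """Brief notes about one test run (all field names are hypothetical)."""
    test_name: str
    started: str
    notes: str
    issues: list = field(default_factory=list)


def append_summary(log_lines, summary):
    """Append one summary to the test log as a single JSON line."""
    log_lines.append(json.dumps(asdict(summary), sort_keys=True))
    return log_lines


if __name__ == "__main__":
    log = []
    append_summary(log, RunSummary(
        test_name="checkout, 11 users",
        started="run 1",
        notes="completed; response times degraded near the end",
        issues=["application exception", "disk space exhausted on log volume"],
    ))
    print(log[0])
```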
Whenever possible, limit tasks to one to two days each to ensure that no time will be lost if the results from a particular test or battery of tests turn out to be inconclusive, or if the initial test design needs modification to produce the intended results. One of the most important tasks when running tests is to remember to modify the tests, test designs, and subsequent strategies as results analysis leads to new priorities.
A widely recommended guiding principle is: Run test tasks in one- to two-day batches. See the tasks through to completion, but be willing to take important detours along the way if an opportunity to add value presents itself.
9.5 Dynamic Data
The following are technical reasons for using dynamic data correctly in load test scripts:
Using the same data value causes artificial usage of caching because the system will retrieve data from copies in memory. This can happen throughout different layers and components of the system, including databases, file caches of the operating systems, hard drives, storage controllers, and buffer managers. Reusing data from the cache during performance testing might account for faster testing results than would occur in the real world.
Some business scenarios require a relatively small range of data selection. In such cases, even though the cache is reused more frequently, the narrow data range will simulate other performance-related problems, such as database deadlocks and slower response times due to timeouts caused by queries to the same items. This type of scenario is typical of marketing campaigns and seasonal sales events.
Some business scenarios require using unique data during load testing; for example, if the server returns session-specific identifiers during a session after login to the site with a specific set of credentials. Reusing the same login data would cause the server to return a bad session identifier error. Another frequent scenario is one in which the user must enter a unique set of data or the system will not accept the input; for example, registering new users, which requires entering a unique user ID on the registration page.
In some business scenarios, we need to control the number of parameterized items; for example, a caching component that needs to be tested for its memory footprint to evaluate server capacity, with a varying number of products in the cache.
In some business scenarios, we need to reduce the script size or the number of scripts; for example, several instances of an application will live in one server, reproducing a scenario where an independent software vendor (ISV) will host them. In this scenario, the Uniform Resource Locators (URLs) need to be parameterized during load test execution for the same business scenarios.
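Two of the reasons above, unique data and a controlled range of data, can be illustrated with a small sketch. The class UniquePool, the function pick_from_catalog, and the hot_fraction parameter are hypothetical names chosen for this example; load-generation tools typically provide their own data-binding features for the same purpose.

```python
import random


class UniquePool:
    """Hands out each value exactly once, e.g. user IDs for registration."""

    def __init__(self, values):
        self._values = iter(values)

    def next_value(self):
        try:
            return next(self._values)
        except StopIteration:
            # Fail loudly instead of silently reusing data.
            raise RuntimeError("unique data pool exhausted") from None


def pick_from_catalog(rng, catalog, hot_fraction=1.0):
    """Pick an item from the first hot_fraction of the catalog.

    A small hot_fraction mimics a narrow selection range (such as a sales
    event); 1.0 spreads requests over the whole catalog to limit caching.
    """
    limit = max(1, int(len(catalog) * hot_fraction))
    return rng.choice(catalog[:limit])


if __name__ == "__main__":
    pool = UniquePool(f"newuser{n}" for n in range(3))
    print([pool.next_value() for _ in range(3)])

    rng = random.Random(42)
    catalog = [f"product{n}" for n in range(10)]
    print([pick_from_catalog(rng, catalog, hot_fraction=0.5) for _ in range(5)])
```

Varying hot_fraction between runs is one way to control the number of distinct items a test touches, which relates to the caching and capacity scenarios described above.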