Baseline and Attack Execution Stage - DYNAMIC PART PROCEDURE

4. BENCHMARK PROCEDURES AND RULES

4.5 DYNAMIC PART PROCEDURE

4.5.4 Baseline and Attack Execution Stage

This stage collects security measurements when no attacks are executed during the benchmark run (Phase 1) and when attacks are executed (Phase 2). As described in Chapter 3, the recommended maximum time for the execution of each benchmark run is 40 minutes.

The steps to execute this stage are as follows and are illustrated in Figure 4-3:

 Step 1. The workload is submitted without the presence of attacks. This corresponds to the workload measurement interval and is used to collect baseline security measures through security checks that represent the security of the system with normal optimization settings.

 Step 2. The workload is executed in the presence of attacks, which are organized in execution slots. Each slot corresponds to a measurement interval during which the workload is run and one or more attacks are executed to evaluate the system behavior. As mentioned in Chapter 3, this notion is adapted from the fault injection slot definition present in DBench-OLTP Clause 2.3.1 (Marco Vieira 2005).

The evaluation of the system behavior involves security checks, which are conducted in two steps:

- Collection of the measurements of the System monitor and Data Collector after the execution of each attack.

- Determination of the level of security impact and exploitability of each successfully exploited vulnerability, based on the impact and exploitability rules defined in Chapter 3.
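The two-step check can be sketched in Python as follows. This is only an illustrative sketch: the names `SlotResult`, `check_slot`, and the `collect` callback are hypothetical, and the comparison against baseline values is a simplified stand-in for the impact and exploitability rules of Chapter 3, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SlotResult:
    """Hypothetical record of the security check for one execution slot."""
    vulnerability: str
    measurements: Dict[str, float]
    deviations: Dict[str, float]

    @property
    def behavior_changed(self) -> bool:
        # True when at least one metric differs from its baseline value.
        return bool(self.deviations)


def check_slot(
    vulnerability: str,
    baseline: Dict[str, float],
    collect: Callable[[], Dict[str, float]],
    tolerance: float = 0.0,
) -> SlotResult:
    # Step 1: collect the measurements after the attack has been executed.
    measurements = collect()
    # Step 2 (simplified assumption): keep every metric whose value moved
    # away from the Phase 1 baseline by more than the tolerance.
    deviations = {
        name: value - baseline[name]
        for name, value in measurements.items()
        if name in baseline and abs(value - baseline[name]) > tolerance
    }
    return SlotResult(vulnerability, measurements, deviations)


# Example with made-up metric names and values.
baseline = {"unauthorized_reads": 0.0, "response_time_ms": 120.0}
result = check_slot(
    "Vul (1)",
    baseline,
    collect=lambda: {"unauthorized_reads": 3.0, "response_time_ms": 120.0},
)
```

In this example only `unauthorized_reads` deviates from the baseline, so `result.behavior_changed` is true and `result.deviations` holds that single metric.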

The execution of the first phase of this stage must comply with the following rules:

- Benchmark tools and the workload shall be run without the execution of attacks (baseline execution stage).

- Baseline execution shall be organized in three periods:

Ramp-up time. This is the period during which the workload applications start and perform their first transactions. During this time, no SUB responses or security measurements should be collected. The ramp-up time should not last longer than the start-up time. This duration varies from system to system: a small system with few applications and transactions needs no more than a few seconds of ramp-up (e.g., 30 seconds), while a large system could need minutes (e.g., 5 minutes). The same notion applies to the ramp-down time described later. The maximum recommended time for ramp-up is 5 minutes.

Measurement time. This period immediately follows the ramp-up. At this point, all applications of the workload and the BMS tools are fully started and are responding to workload client requests, so measurements can commence. Security measurements from the BT should be collected during this period. The measurement time should not last longer than the time needed to completely exercise the different parts of the system, with requests of different formats and sizes and with different users. This should take a few minutes for a small system (e.g., 5 minutes) and several minutes for a large one. The maximum recommended time for the measurement interval is 30 minutes.

Ramp-down time. This period corresponds to the closing of the system operations related to the workload (i.e., the workload has ended). Depending on the system, it may require explicit commands from the BMS. No security measurements should be collected during this period. The maximum recommended time for ramp-down is 5 minutes.

- Baseline measurements (e.g., response time) and data to be used by the Security Checker shall be collected during this stage.
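The recommended limits for the three baseline periods can be checked mechanically. The sketch below is a minimal illustration, not part of the benchmark specification: the class `BaselinePlan` and its methods are hypothetical, and it assumes that the 40-minute limit for a benchmark run applies to the sum of the three periods (5 + 30 + 5 minutes).

```python
from dataclasses import dataclass
from typing import List

# Recommended upper limits in minutes, taken from the rules above.
MAX_RAMP_UP_MIN = 5
MAX_MEASUREMENT_MIN = 30
MAX_RAMP_DOWN_MIN = 5
MAX_RUN_MIN = 40  # assumed to bound the sum of the three periods


@dataclass(frozen=True)
class BaselinePlan:
    """Hypothetical container for the three baseline periods, in minutes."""
    ramp_up: float
    measurement: float
    ramp_down: float

    @property
    def total(self) -> float:
        return self.ramp_up + self.measurement + self.ramp_down

    def violations(self) -> List[str]:
        """Return one message per recommended limit that is exceeded."""
        problems = []
        if self.ramp_up > MAX_RAMP_UP_MIN:
            problems.append("ramp-up exceeds 5 minutes")
        if self.measurement > MAX_MEASUREMENT_MIN:
            problems.append("measurement interval exceeds 30 minutes")
        if self.ramp_down > MAX_RAMP_DOWN_MIN:
            problems.append("ramp-down exceeds 5 minutes")
        if self.total > MAX_RUN_MIN:
            problems.append("benchmark run exceeds 40 minutes")
        return problems

    def collects_measurements(self, minute: float) -> bool:
        """True only inside the measurement period, never during ramp-up
        or ramp-down."""
        return self.ramp_up <= minute < self.ramp_up + self.measurement


small = BaselinePlan(ramp_up=0.5, measurement=5, ramp_down=0.5)
too_long = BaselinePlan(ramp_up=6, measurement=30, ramp_down=5)
```

Here `small.violations()` is empty, while `too_long.violations()` reports both the ramp-up limit and the overall run limit.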

The execution of the second phase of this stage should comply with the following rules:

- Attacks should target the vulnerabilities injected into the Vulnerable Component (source code, configuration, interface, etc.).

[Figure 4-3: Timeline of Phase 1 (baseline security evaluation) and Phase 2 (security evaluation in the presence of attacks). Labels: ramp-up time, measurement interval, ramp-down time, recovery time, attack execution slot, Attack (1) ... Attack (N), Vulnerable Component with Vul (1) ... Vul (N), BT security checking.]

- The attack execution should be organized in four periods, which closely mirror those of the baseline execution. Only the items particular to the attack execution are described here:

Ramp-up time. During this time, no attacks should be executed. The ramp-up duration is the same one set for the baseline phase, with an upper limit of 5 minutes.

Measurement time. Attacks should last until the measurement time of the workload reaches the timeout period, with an upper limit of 30 minutes.

Ramp-down time. No attacks should be executed during this period. The ramp-down duration is the same one set for the baseline phase, with an upper limit of 5 minutes.

Recovery time. This is the time needed to restore the vulnerable component to its state prior to the attack execution, so that the impact of one attack cannot affect the result of the next. The recovery time has an upper limit of 5 minutes.

- Each attack execution slot should exploit exactly one vulnerability. This is required so that the observed impact can be assigned to that vulnerability.

- Attacks should be executed within the maximum measurement time.

The execution of security checks should comply with the following rules:

- The Security Checker component must monitor and assess any impact on the security attributes of the Benchmark Target during and after the execution of each attack.

- The Security Checker must also consider the exploitability level of each attack following the rules specified by CVSS.

- At the end of attack execution, the Security Checker must determine how many attacks were successful.

- Because the exploited vulnerability is always the same during an attack injection slot, only one impact and exploitability metric must be produced per slot.

- The attackload component must be configured and prepared to exploit one vulnerability at a time during the workload execution.

- The Security Checker component must be prepared to analyze the expected result of the attack on the vulnerable BT and to verify whether the typical response of the non-vulnerable BT changed (and to what degree) during the attack execution.

- At the end of each successful attack execution, the individual risk of the discovered vulnerabilities must be measured, using the criteria defined in the Common Vulnerability Scoring System (CVSS) to compute individual vulnerability risk.
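As an illustration of how a per-vulnerability impact, exploitability, and risk score might be computed, the sketch below implements the CVSS version 2 base equations. This is an assumption: the text above does not state which CVSS version the benchmark uses, and the function name `cvss2_base` is hypothetical. The weights are those published in the CVSS v2 specification.

```python
from typing import Tuple

# CVSS v2 base metric weights (assumed version; see the note above).
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
CIA_IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base(av: str, ac: str, au: str,
               c: str, i: str, a: str) -> Tuple[float, float, float]:
    """Return (impact, exploitability, base score) for one vulnerability."""
    impact = 10.41 * (
        1 - (1 - CIA_IMPACT[c]) * (1 - CIA_IMPACT[i]) * (1 - CIA_IMPACT[a])
    )
    exploitability = (
        20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    )
    # The base score is forced to zero when there is no impact at all.
    f_impact = 0.0 if impact == 0 else 1.176
    base = round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)
    return impact, exploitability, base


# Remotely exploitable, low complexity, no authentication, partial C/I/A.
impact, exploitability, base = cvss2_base("N", "L", "N", "P", "P", "P")
```

For the vector in the example (AV:N/AC:L/Au:N/C:P/I:P/A:P) the base score is 7.5. Since each attack slot exploits a single vulnerability, one such triple would be produced per slot.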

In document Security Benchmarks for Web Serving Systems (Page 140-144)