Hypothesis Testing - Data Collecting and Processing

3. Risk-based analysis

3.4. Data Collecting and Processing

3.4.3. Hypothesis Testing

Data needed for transportation and traffic engineering are normally large in volume and generated continuously. It is impossible to collect, let alone process, such an amount of data. To derive input data for the risk analysis model, the research therefore has to work with samples. The required data are obtained from the characteristics of empirical samples by means of hypothesis testing.

From the theory of statistics (e.g., Introduction to the Theory of Statistics (McGraw-Hill Series in Probability and Statistics)), the main points related to hypothesis testing are as follows. A Hypothesis is a statement or assertion about the state of nature (about the true value of an unknown population parameter). For example: "The accused is innocent".

One may be faced with the problem of making a definite decision with respect to an uncertain Hypothesis which is known only through its observable consequences. A statistical Hypothesis test, or more briefly, Hypothesis testing, is an algorithm to state the alternative (for or against the Hypothesis) which minimizes certain risks.

Every Hypothesis implies its contradiction or alternative. A null Hypothesis, denoted by H0, is an assertion about one or more population parameters. This is the assertion we hold to be true until we have sufficient statistical evidence to conclude otherwise.

The alternative Hypothesis, denoted by H1, is the assertion of all situations not covered by the null Hypothesis. It is required that H0 and H1 are mutually exclusive (which means that only one can be true) and exhaustive (meaning that together they cover all possibilities, so one or the other must be true).

The null Hypothesis:

- often represents the status quo situation or an existing belief.

- is maintained, or held to be true, until a test leads to its rejection in favor of the alternative Hypothesis.

- is accepted as true or rejected as false on the basis of a consideration of a test statistic. A test statistic is a sample statistic computed from sample data. The value of the test statistic is used in determining whether or not we may reject the null Hypothesis.

The decision rule of a statistical Hypothesis test is a rule that specifies the conditions under which the null Hypothesis may be rejected.

Hypothesis testing is conducted in order to reach a decision. Generally speaking, there are two possible states of nature: H0 is true or H0 is false. A decision may therefore be incorrect in two ways: (i) a true H0 is rejected (Type I Error), with the probability denoted by α (α is called the level of significance of the test); (ii) a false H0 is not rejected (Type II Error), with the probability denoted by β (1 - β is called the power of the test).
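The relation between α, β and the power can be illustrated numerically. The following is a minimal sketch, assuming a two-tailed z test with a known standard deviation; the function name `two_sided_z_power` and the true mean of 33.5 km/h used in the call are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def two_sided_z_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Return (beta, power) of a two-tailed z test of H0: mu = mu0.

    Assumes a normally distributed test statistic with known sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Mean of the test statistic when the true mean is mu_true.
    shift = (mu_true - mu0) / (sigma / n ** 0.5)
    # beta = P(statistic stays inside the non-rejection region | mu_true).
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return beta, 1 - beta


# Hypothetical scenario: H0 says 32.5 km/h, but the true mean is 33.5 km/h.
beta, power = two_sided_z_power(mu0=32.5, mu_true=33.5, sigma=5.7, n=582)
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

When `mu_true` equals `mu0`, the returned power equals α, which is the probability of a Type I Error.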

The p-value is the smallest level of significance, α, at which the null Hypothesis may be rejected using the obtained value of the test statistic.
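As an illustration of this definition, the p-value of a two-tailed test can be computed from the standard Normal distribution. This is a sketch under the assumption that the test statistic z is standard normal under H0; the function name `two_sided_p_value` is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability, under H0, of a statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# H0 is rejected at level alpha exactly when the p-value is below alpha.
for z in (0.5, 1.96, 2.58):
    print(f"z = {z:.2f} -> p-value = {two_sided_p_value(z):.4f}")
```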

The rejection region of a statistical Hypothesis test is the range of numbers that will lead us to reject the null Hypothesis in case the test statistic falls within this range. The rejection region, also called the critical region, is defined by the critical points. The rejection region is defined so that, before the sampling takes place, our test statistic will have a probability P (equal to the level of significance α) of falling within the rejection region if the null Hypothesis is true.

The non-rejection region is the range of values (also determined by the critical points) that will lead us not to reject the null Hypothesis if the test statistic should fall within this region. The non-rejection region is designed so that, before the sampling takes place, our test statistic will have a probability 1 - P of falling within the non-rejection region if the null Hypothesis is true.
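The critical points that separate the two regions can be obtained from the inverse of the standard Normal distribution. The sketch below assumes a two-tailed test with a standard normal test statistic; the names `critical_points` and `in_rejection_region` are hypothetical.

```python
from statistics import NormalDist


def critical_points(alpha):
    """Lower and upper critical points of a two-tailed z test."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return -z, z


def in_rejection_region(z, alpha):
    """True if the test statistic z falls outside the critical points."""
    lower, upper = critical_points(alpha)
    return z < lower or z > upper


lower, upper = critical_points(0.05)
# Probability, under H0, that the statistic lands in the non-rejection region.
p_non_rejection = NormalDist().cdf(upper) - NormalDist().cdf(lower)
print(f"critical points: {lower:.2f}, {upper:.2f}")
print(f"P(non-rejection | H0) = {p_non_rejection:.2f}")
```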

In the process of Hypothesis testing, a Hypothesis is considered a scientific Hypothesis when it is falsifiable. Empirical falsification is the method of conducting experiments in order to find evidence that supports or falsifies a statement about the state of nature. The falsification process can be regarded as a learning process by trial and error. Doing scientific research is then a process of Hypothesis testing which consists of the following steps:

- Determining the null Hypothesis

- Determining the alternative Hypothesis

- Examining the correctness of the null Hypothesis (calculating α, β, and p-value)

- Deciding to accept or reject the null Hypothesis.

- If the null Hypothesis is rejected, then the alternative Hypothesis is accepted by default.

Let's take an example of the speed of motorcycles in the traffic flow.

The speed of motorcycle traffic was measured at two different locations with the following characteristics.

Table 1. Average speed of motorcycles (examples)

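| Location | Motorcycle volume | Number of samples | Mean speed (km/h) | Max speed (km/h) | Min speed (km/h) | Standard deviation (km/h) | Standard deviation (%) |
|---|---|---|---|---|---|---|---|
| 1 | 3.240 | 582 | 32,3 | 55,2 | 14,3 | 5,7 | 17 |
| 2 | 2.621 | 270 | 32,7 | 56,3 | 20,9 | 5,2 | 15 |
| Average | | | 32,5 | | | | |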

Assume that the speed of motorcycle traffic is a variable (called X) which follows a Normal distribution. From the above-mentioned empirical observations, different types of Hypothesis testing can be used to determine the mean and standard deviation of variable X, such as:

- Testing µx (when σx is unknown)

- Testing σx

- Comparing the mean speed and the standard deviation of motorcycles between the two selected locations.
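The comparison of mean speeds between the two locations is not worked out in the text. A minimal sketch of one way to do it is given here, assuming independent samples large enough for the normal approximation and using the sample standard deviations from Table 1; the function name `two_sample_z` and the choice of this particular test are assumptions, not the author's stated method.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Large-sample z statistic for H0: mu1 = mu2 (independent samples)."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Values from Table 1 (decimal points instead of decimal commas).
z = two_sample_z(32.3, 5.7, 582, 32.7, 5.2, 270)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed test, alpha = 0.05
reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic lies inside the non-rejection region, so the observed difference of 0,4 km/h between the two locations is not statistically significant at α = 0,05.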

Solution: (to test µx for location 1)

Hypothesis for two-tailed test:

H0: µ = µ0 = 32,5

H1: µ ≠ µ0

For α = 0,05, critical values of z are ±1,96 (based on the table of z for Normal distribution)

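The test statistic is:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Because the sample is large, t is approximately standard normal and is compared with the critical values of z.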

Decision making will be:

- Do not reject H0 if (-1,96 ≤ z ≤ 1,96)

- Reject H0 if (z < -1,96) or (z > 1,96)

In the example of location 1, we have:

n = 582

$\bar{x}$ = 32,3

s = 5,7

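From these values, it is calculated that:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{32{,}3 - 32{,}5}{5{,}7 / \sqrt{582}} = \frac{-0{,}2}{5{,}7 / 24{,}12} = -0{,}85$$

Since -1,96 ≤ -0,85 ≤ 1,96, the test statistic falls within the non-rejection region and H0 is not rejected at α = 0,05.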
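The calculation for location 1 can be reproduced with a few lines of code. This is a sketch that assumes the large-sample normal approximation for the critical values, as in the z table used in the text; the function name `one_sample_test` is hypothetical, and decimal points replace the decimal commas of the text.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_test(x_bar, s, n, mu0, alpha=0.05):
    """Two-tailed test of H0: mu = mu0 with normal critical values."""
    t = (x_bar - mu0) / (s / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return t, z_crit, abs(t) > z_crit


# Location 1: n = 582, sample mean 32.3 km/h, s = 5.7 km/h, mu0 = 32.5 km/h.
t, z_crit, reject_h0 = one_sample_test(x_bar=32.3, s=5.7, n=582, mu0=32.5)
print(f"t = {t:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject_h0}")
# prints: t = -0.85, critical values = ±1.96, reject H0: False
```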

4. Traffic Safety in Motorcycle-dominated Traffic Flow

In document Risk Analysis, Driver Behaviour and Traffic Safety at Intersections in Motorcycle-Dominated Traffic Flow (Page 66-69)