
2.6 Statistical Data & Tests

2.6.2 Continuous Data Analysis

2.6.2.1 Kolmogorov-Smirnov Test

The two-sample Kolmogorov-Smirnov (KS) test is a hypothesis test that measures the maximum difference between two cumulative distributions and is commonly used to determine if two sets belong to the same population [131].
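The "maximum difference between two cumulative distributions" can be made concrete with a short sketch. The function name `ks_statistic` and the symbol D are illustrative choices rather than notation from the cited source. The sketch assumes the empirical cumulative distribution of each sample is used, which is the usual convention for the two-sample test.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return D, the largest absolute gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every data point from either sample.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

Identical samples give D = 0, while two samples with no overlap in their value ranges give D = 1.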

A hypothesis test has two opposing hypotheses: the null hypothesis H0 and the alternative hypothesis Ha. In general, H0 states that the samples were taken from the same population and that differences in the sample data can be attributed to random sampling error. The null hypothesis represents the “devil’s advocate” position, and data is tested to generate evidence against it. The alternative hypothesis states the opposite of H0, and is the statement that the experiment is trying to prove. A hypothesis test identifies which statement (H0 or Ha) is more likely; however, the rejection of H0 does not automatically mean the acceptance of Ha, since a hypothesis test only tests the evidence against the null hypothesis. While rejection of H0 may provide supporting evidence for Ha, it does not provide complete evidence for proving Ha. Hence, hypotheses are always falsified and never proven.

For most hypothesis tests, a p-value is calculated from the sample data to determine whether the null hypothesis can be rejected. The p-value is the probability of obtaining a difference at least as extreme as the one observed in the sample data, assuming that the null hypothesis is true [132]. The p-value ranges from 0 to 1, where a small p-value indicates that the observed difference is unlikely to be accounted for by random sampling error alone, and hence that there is sufficient evidence to reject the null hypothesis for the full population.

The cut-off point for this distinction is known as the significance level (α), where p-values less than or equal to α provide “statistically significant” results [133]. While somewhat arbitrary, a significance level of 0.05 is the most widely used within academia [130]. However, since α represents the probability that the null hypothesis will be incorrectly rejected when it is in fact true, reducing α will increase the confidence in calculated results.
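As a rough illustration of how a p-value is compared against α, the sketch below approximates the two-sided p-value of a two-sample KS statistic. It uses the large-sample (asymptotic) Kolmogorov series, which is an assumption made here and not necessarily the calculation used in the cited sources, and it can be inaccurate for small samples. The names `ks_asymptotic_p_value` and `is_significant` are hypothetical.

```python
import math


def ks_asymptotic_p_value(d, n, m, terms=100):
    """Approximate two-sided p-value for a two-sample KS statistic d.

    n and m are the two sample sizes. Large-sample approximation only.
    """
    if d <= 0.0:
        return 1.0
    effective_n = n * m / (n + m)
    lam = math.sqrt(effective_n) * d
    if lam < 0.3:
        # The series sums to almost exactly 1 here and converges slowly.
        return 1.0
    total = 0.0
    # Alternating series: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2)
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
    return min(1.0, max(0.0, 2.0 * total))


def is_significant(p_value, alpha=0.05):
    """Reject H0 when the p-value is less than or equal to alpha."""
    return p_value <= alpha
```

For example, `is_significant(ks_asymptotic_p_value(0.5, 4, 4))` evaluates to `False`: with only four points per sample, a maximum gap of 0.5 between the cumulative distributions is not strong evidence against H0 under this approximation.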

The KS test is a non-parametric test (or distribution-free method), which means that the test makes fewer and less stringent assumptions about the distribution of the data [134]. In contrast, parametric tests assume that the data arises from a distribution (a normal distribution is commonly assumed) which can be described by a number of parameters (mean, variance, etc.). Since non-parametric tests do not make these assumptions, they can be applied to a wider range of data, which makes them more versatile than parametric tests. On the other hand, when their assumptions hold, parametric tests are more efficient and powerful. Accordingly, while the KS test is more versatile, it tends to be more conservative than other statistical tests, i.e. it is less likely to incorrectly reject H0 for the given significance level.

2.6.2.2 Detectable Difference

A commonly used parameter for describing a set of n continuous data points is the sample mean, x̄:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{2.4}$$

The spread of the sample can then be described using the sample variance (s²) or standard deviation (s) of the data set:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{2.5}$$
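Equations 2.4 and 2.5 translate directly into code. The following is a minimal sketch with illustrative function names. Python's built-in `statistics.mean` and `statistics.stdev` compute the same quantities.

```python
import math


def sample_mean(values):
    """Equation 2.4: arithmetic mean of the n data points."""
    return sum(values) / len(values)


def sample_variance(values):
    """Equation 2.5: squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two data points are required")
    mean = sample_mean(values)
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Standard deviation s, the square root of the sample variance."""
    return math.sqrt(sample_variance(values))
```

For the made-up data set `[2, 4, 4, 4, 5, 5, 7, 9]`, the mean is 5 and the sample variance is 32/7, giving s of approximately 2.14.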

The true mean or population mean (µ) is the average of all elements in an entire population, and is usually an unknown constant [135]. Conversely, the sample mean (x̄) varies during testing, and may or may not be close to the true mean depending on the sample size (n) and standard deviation (s). A confidence interval is used to relate the sample and true mean, and defines a range of values around the sample mean in which the true mean is likely to be located (with a given confidence level) [136]. The amount of random sampling error at this confidence level is known as the margin of error (E), which allows the true mean range to be defined as follows:

$$\mu = \bar{x} \pm E \tag{2.6}$$

For non-parametric testing, the margin of error can be related to the sample size (n) and standard deviation (s) by the equation:

$$n = 1.15 \left( \frac{s \, z_{\alpha/2}}{E} \right)^{2} \tag{2.7}$$

where z_{α/2} is the two-tailed z-score for a (1−α) confidence level. In non-parametric tests where the distribution shape is unknown, the sample size should be increased by 15% in order to ensure that sufficient data is gathered regardless of the distribution [137], [138].
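As a worked illustration of Equation 2.7, the sketch below computes the required sample size for a given standard deviation and target margin of error. The function name and the example inputs are hypothetical. Rounding up to the next whole data point is an assumption made here, and the default z = 1.96 corresponds to a 0.05 significance level.

```python
import math


def required_sample_size(s, margin_of_error, z=1.96, inflation=1.15):
    """Equation 2.7: n = inflation * (s * z / E)^2.

    inflation=1.15 is the 15% non-parametric allowance. The result is
    rounded up (an assumption of this sketch) because n must be an integer.
    """
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    n = inflation * (s * z / margin_of_error) ** 2
    return math.ceil(n)
```

With s = 2.0 and E = 0.5, the unrounded value is 1.15 × (2.0 × 1.96 / 0.5)² ≈ 70.7, so `required_sample_size(2.0, 0.5)` returns 71. Halving E to 0.25 roughly quadruples the required sample size.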

Z-scores (or standard scores) are variable values transformed to zero mean and unit variance [131]. The larger the value of z, the less likely that the experimental result is due to chance. The two-tailed z-score is calculated by the equation:

$$z_{\alpha/2} = \mathrm{invNorm}\left(1 - \frac{\alpha}{2}\right) \tag{2.8}$$

where invNorm is the inverse of the cumulative normal distribution function. The invNorm function can be calculated using statistical tools, but z-tables are also available [134]. From these, the two-tail z-score for a 0.05 significance level is z_{0.025} = 1.96.
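One way to evaluate Equation 2.8 without a z-table is Python's standard library, where `statistics.NormalDist().inv_cdf` plays the role of invNorm. The wrapper name `two_tailed_z` is illustrative.

```python
from statistics import NormalDist


def two_tailed_z(alpha):
    """Equation 2.8: z-score leaving alpha/2 in each tail of N(0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)
```

Calling `two_tailed_z(0.05)` gives approximately 1.96, in agreement with the tabulated value quoted above.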

When comparing two sample means, the margin of error determines whether there is a detectable difference between them (see Figure 2.23). If a gap exists between the two intervals, then there is a detectable difference between the two population means of at least that gap size. Conversely, there is no detectable difference if the two intervals overlap. The threshold difference at which this occurs is known as the effect size. With reference to Equation 2.7, the effect size can be reduced by reducing the confidence level or increasing the sample size.

Figure 2.23: Comparison between two distributions, X1 = x̄1 ± E1 and X2 = x̄2 ± E2: (a) the sample means x̄1 and x̄2 differ by at least a value of d for the given confidence level α; (b) the sample means x̄1 and x̄2 cannot be said to differ for the given confidence level.
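The interval comparison illustrated in Figure 2.23 can be sketched as follows. `margin_of_error` is Equation 2.7 rearranged for E, and `detectable_gap` returns the size of the gap between the two intervals, or zero when they overlap. Both function names and all numerical inputs are hypothetical.

```python
import math


def margin_of_error(s, n, z=1.96, inflation=1.15):
    """Equation 2.7 solved for E: E = z * s * sqrt(inflation / n)."""
    return z * s * math.sqrt(inflation / n)


def detectable_gap(mean_1, e_1, mean_2, e_2):
    """Gap between mean_1 +/- e_1 and mean_2 +/- e_2 (0.0 if they overlap)."""
    low_1, high_1 = mean_1 - e_1, mean_1 + e_1
    low_2, high_2 = mean_2 - e_2, mean_2 + e_2
    # Distance from the top of the lower interval to the bottom of the
    # upper one. A negative value means the intervals overlap.
    gap = max(low_1, low_2) - min(high_1, high_2)
    return max(0.0, gap)
```

For instance, `detectable_gap(10.0, 1.0, 13.0, 1.0)` returns 1.0 (case (a) in the figure), whereas `detectable_gap(10.0, 1.0, 11.5, 1.0)` returns 0.0 (case (b)). Because E scales with 1/√n, quadrupling the sample size halves the margin of error, and a smaller z (a lower confidence level) also shrinks it.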
