
School 2 – test (Ofsted report, May 2008)

3.12 Data Analysis

3.12.10 Time structure

Time is often a variable in studies; whether or not it is, data always have a time structure. The fundamental distinction is between longitudinal data and cross-sectional data. Longitudinal data are gathered by taking measurements at a number of time points. Cross-sectional data are gathered by taking a single set of measurements on the appropriate variables at one point in time (Jargowsky and Yang, 2003).
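The distinction can be made concrete with a small sketch of how the two kinds of data might be laid out. The record layout (subject, time point, score), the identifiers and the scores are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical records of the form (subject_id, time_point, score).
# Cross-sectional: every subject is measured once, at the same time point.
cross_sectional = [
    ("child_a", "t1", 12),
    ("child_b", "t1", 15),
    ("child_c", "t1", 9),
]

# Longitudinal: the same subjects are measured at more than one time point.
longitudinal = [
    ("child_a", "t1", 12), ("child_a", "t2", 14),
    ("child_b", "t1", 15), ("child_b", "t2", 18),
]


def time_points_per_subject(records):
    """Map each subject to the set of time points at which it was measured."""
    points = defaultdict(set)
    for subject, time_point, _score in records:
        points[subject].add(time_point)
    return dict(points)


def is_longitudinal(records):
    """True if at least one subject was measured at more than one time point."""
    return any(len(p) > 1 for p in time_points_per_subject(records).values())


print(is_longitudinal(cross_sectional), is_longitudinal(longitudinal))
```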

3.12.11 Inferential statistics and hypothesis testing

Inferential statistics are used to determine whether a difference found between the control and experimental groups is due to chance variation in the groups' performance or is caused by the manipulation of the independent variable. If there is a low probability that the difference is caused by chance variation, there can be confidence in the inferences made from the samples to the populations they represent. Inferential statistics are applied to the experimental data to test the null hypothesis, which states that the independent variable has no effect on the dependent variable (Coolidge, 2006).

If the experimental manipulation had no effect, no significant difference between the control and experimental groups would be expected, and the null hypothesis would not be rejected. However, if the two groups differ significantly, that is, if the experimental manipulation appears to have an effect, the null hypothesis would be rejected. In this case the research hypothesis would be indirectly supported. The statistical significance of the difference between the groups must nevertheless be determined, so as to minimize chance variation as an alternative explanation of the results (Coolidge, 2006).
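The logic of rejecting or retaining the null hypothesis can be illustrated with a permutation test, which estimates how often a difference at least as large as the observed one would arise if group membership had no effect. This is a minimal sketch and not the procedure used in the study; the scores, the function names and the 0.05 threshold are assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_p_value(control, experimental, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference between two group means."""
    rng = random.Random(seed)
    observed = abs(mean(experimental) - mean(control))
    pooled = list(control) + list(experimental)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so the pooled scores are reshuffled and split again.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    # Adding one to both counts keeps the estimate from being exactly zero.
    return (extreme + 1) / (n_permutations + 1)


def decide(p_value, alpha=0.05):
    """Reject the null hypothesis only when p falls below the chosen level."""
    return "reject null hypothesis" if p_value < alpha else "retain null hypothesis"


# Hypothetical test scores, invented for this example.
control = [10, 12, 11, 13, 12, 11, 10, 12]
experimental = [15, 17, 16, 18, 16, 17, 15, 16]

p = permutation_p_value(control, experimental)
print(p, decide(p))
```

Because the two hypothetical groups do not overlap at all, very few reshuffles reproduce a difference as large as the observed one, and the null hypothesis is rejected at the 0.05 level.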

In hypothesis testing there are four possible outcomes (Coolidge, 2006) - two types of error and two correct decisions.

The null hypothesis is actually true and retained - Correct decision

When looking for a relationship between two variables, the null hypothesis is that no relationship exists between them. A correlation test is performed on the sample data and shows that any observed relationship could plausibly be due to chance. The null hypothesis is therefore retained, and the inference is that no relationship exists between the two variables in the population sampled. It is not actually known whether the null hypothesis is true. However, if the null hypothesis is retained for the sample and it is true for the population, a correct decision has been made.

The null hypothesis is actually true but rejected - Type I Error

Of the two types of error, Type I is considered the more serious. When committing a Type I error researchers are, in effect, claiming that their research hypothesis is true when it is not. This is considered a very serious error since it misleads people. The probability of a Type I error should be 5% or less, which corresponds to the criterion p < 0.05.

The null hypothesis is actually false and rejected - Correct decision

The conclusion in this scenario is that there is a relationship between the two variables, and there is only a very small probability that this relationship can be attributed to chance. Consequently the null hypothesis is rejected and the inference is made that there is a relationship between these two variables in the population. If such a relationship does exist in the population, then rejecting the null hypothesis is the correct decision.

The null hypothesis is actually false but retained - Type II Error

When the alternative hypothesis is rejected and the null hypothesis is retained, the conclusion is that there is no relationship between the variables. However, if there is in fact a relationship between the variables, a Type II error has been committed. As errors of this type do not mislead people they are not considered as serious as Type I errors. However, potentially useful discoveries can be missed.
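The four outcomes described above can be summarised as a small decision table. The function name and the wording of the labels are illustrative only; in practice the truth of the null hypothesis is not known, so the first argument is hypothetical by nature.

```python
def classify_outcome(null_is_true: bool, null_rejected: bool) -> str:
    """Name the outcome of a hypothesis test, given the (unknowable) truth."""
    if null_is_true and not null_rejected:
        return "correct decision (true null retained)"
    if null_is_true and null_rejected:
        return "Type I error"
    if not null_is_true and null_rejected:
        return "correct decision (false null rejected)"
    return "Type II error"


# Print all four combinations of truth and decision.
for null_is_true in (True, False):
    for null_rejected in (False, True):
        print(null_is_true, null_rejected, classify_outcome(null_is_true, null_rejected))
```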

3.12.12 Statistical significance

Samples usually have characteristics that differ to some degree from those of the population they represent. This is termed sampling error. If random samples were taken repeatedly, it would usually be found that each differed from the population. The greater the difference between two sample means, the lower the likelihood that it can be attributed to chance. If the probability that the difference between the means could occur by chance is less than five percent, researchers normally consider it statistically significant (McGraw-Hill, 2001). This level of statistical significance is known as the .05 level or the 5 percent level. If there is a 5 percent, or lower, probability of the difference between the control group and the experimental group occurring by chance, the null hypothesis is rejected. A stricter standard, the .01 level of statistical significance, can be employed if required. At this level, a difference between sample means is statistically significant if the probability of obtaining it by chance alone is one percent or less (Field, 2006).
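Sampling error and the meaning of the .05 and .01 levels can be illustrated by simulation. In the sketch below two samples are drawn repeatedly from the same population, so any difference between their means is due to chance alone; the proportion of draws declared significant should then be close to the chosen level. The population parameters, the sample size, the number of trials and the use of a z-test with a known standard deviation are all assumptions made to keep the example short.

```python
import random
from statistics import NormalDist, fmean


def two_sample_z_p_value(a, b, sigma):
    """Two-sided p-value for a difference in means, assuming a known population sd."""
    standard_error = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(alpha, trials=2000, n=30, mu=50.0, sigma=10.0, seed=1):
    """Share of trials in which two samples from ONE population differ 'significantly'."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both samples come from the same population, so the null hypothesis is true.
        a = [rng.gauss(mu, sigma) for _ in range(n)]
        b = [rng.gauss(mu, sigma) for _ in range(n)]
        if two_sample_z_p_value(a, b, sigma) < alpha:
            rejections += 1
    return rejections / trials


rate_05 = false_positive_rate(0.05)
rate_01 = false_positive_rate(0.01)
print(rate_05, rate_01)
```

The stricter .01 level produces fewer chance rejections than the .05 level, at the cost of making real effects harder to detect.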

When the sample sizes are large, the sampling variability of the group means is small, so it is much more likely that a difference between the means of the groups will be statistically significant. It can never be definitely ascertained that effects shown by samples will be shown by the population they represent. Statistical significance is only a statement of probability, and all scientific findings are therefore considered tentative. Nor does statistical significance indicate practical significance. Even though a statistically significant effect may be attained, the financial cost may be too great or the usefulness too small to pursue any practical applications (McGraw-Hill, 2001).
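The effect of sample size on significance can be shown with a short calculation. The sketch below holds the difference between the group means and the standard deviation fixed and changes only the number of observations per group. The figures (a difference of 2 points, a standard deviation of 10) are invented, and a z-test with a known standard deviation is assumed for simplicity.

```python
from statistics import NormalDist


def z_test_p_value(mean_difference, sd, n_per_group):
    """Two-sided p-value for a mean difference between two equal-sized groups."""
    # The standard error of the difference shrinks as the groups get larger.
    standard_error = sd * (2 / n_per_group) ** 0.5
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


for n in (10, 100, 1000):
    p = z_test_p_value(mean_difference=2.0, sd=10.0, n_per_group=n)
    print(n, round(p, 4), "significant" if p < 0.05 else "not significant")
```

The same 2-point difference is not significant with 10 or 100 observations per group but is significant with 1000, even though its practical importance is unchanged.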

3.12.13 Parametric and non-parametric statistics

When analysing research data a decision must be made regarding what kind of statistical analysis to perform. Care must be taken to select the tests that are most appropriate for the data generated, since an inappropriate test may lead to an incorrect interpretation. One of the main decisions to be made is whether parametric or non-parametric statistical tests should be used (Winks, 2007).

The mean is typically one of the first statistics calculated after experimental data are gathered. This statistic indicates the average value of a sample. Combined with the standard deviation, it gives the central tendency and the spread of the group of numbers. A large standard deviation reflects a large spread in the data and a small standard deviation reflects a small spread. For these statistics to be considered dependable and accurate, the assumption must be made that the data follow a normal (Gaussian) distribution (Field, 2006). In a normal distribution, approximately 68% of the data lie within one standard deviation of the mean and approximately 95% lie within two standard deviations. When the assumption is made that the data follow a normal distribution, the statistical tests used to perform the calculations are called parametric statistical tests (Winks, 2007). Many well-known parametric tests, such as t-tests, can be used to analyse data that are normally distributed.

If the data are not normally distributed, there are several ways to proceed. The usual method is to use non-parametric statistical tests. These tests, also called distribution-free tests, do not assume that the data follow a normal distribution. For example, where an independent groups t-test would be performed on normally distributed data, the corresponding non-parametric test, the Mann-Whitney U test, may be performed if the data are not normally distributed (Winks, 2007).
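A minimal sketch of the Mann-Whitney U test is given below to show how it works on the ordering of the observations instead of their means. It uses the normal approximation for the p-value, without a tie or continuity correction, which is only a rough guide for small samples; statistical packages normally apply exact methods or corrections. The function name and the data are hypothetical.

```python
from statistics import NormalDist


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using the normal approximation, no tie correction."""
    n1, n2 = len(a), len(b)
    # U for group a: count the pairs in which the a-value exceeds the b-value;
    # a tied pair counts as one half.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = n1 * n2 - u_a
    u = min(u_a, u_b)
    # Mean and standard deviation of U when the null hypothesis is true.
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u  # never positive, because u is the smaller U value
    p = min(2 * NormalDist().cdf(z), 1.0)
    return u, p


# Hypothetical scores for two independent groups.
group_a = [1.1, 2.3, 2.9, 3.4, 4.0, 4.8, 5.5, 6.1]
group_b = [6.5, 7.2, 7.9, 8.4, 9.0, 9.7, 10.3, 11.6]

u, p = mann_whitney_u(group_a, group_b)
print(u, p)
```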

The Mann-Whitney U test can be used to calculate a significance level, which answers the research question of whether there is a significant difference between the values of the observations in the two groups. It does not, however, compare the means. Most standard parametric tests have a corresponding non-parametric equivalent. For example, the paired t-test’s non-parametric counterpart is the Wilcoxon Signed Rank test. The problem with non-parametric tests is that they are less powerful than parametric tests, since they make no assumption regarding the distribution and therefore have less information upon which to determine significance. However, a non-parametric test is a reasonable substitute if a parametric test is not appropriate (Winks, 2007).

Examining a histogram with the normal distribution curve superimposed is a reasonably useful guide to whether a distribution is approximately normal. The question then arises of how far a distribution may deviate from the normal before it should be considered non-normal. This question cannot be answered definitively by viewing histograms alone. However, there is another, more objective method which can be used to decide whether a distribution is normal or not.

This method of determining whether the distribution is significantly skewed requires the determination of the range within which the distribution can be considered normal. This range extends from minus twice to plus twice the standard error of skewness. If the value for skewness falls within this range, the skewness is not considered to depart significantly from that of a normal distribution (Price, 2000).
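The skewness check can be sketched as follows. The source does not state which formulas were used, so the adjusted Fisher-Pearson skewness coefficient and the standard error formula commonly reported by statistical packages are assumed here. The function names and both data sets are hypothetical.

```python
from statistics import fmean, stdev


def sample_skewness(data):
    """Adjusted Fisher-Pearson skewness coefficient (assumed formula)."""
    n = len(data)
    m, s = fmean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


def standard_error_of_skewness(n):
    """Standard error of skewness for a sample of size n (assumed formula)."""
    return (6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))) ** 0.5


def skewness_within_normal_range(data):
    """True if skewness lies within plus or minus twice its standard error."""
    limit = 2 * standard_error_of_skewness(len(data))
    return -limit <= sample_skewness(data) <= limit


# Hypothetical scores: one roughly bell-shaped set, one with a single extreme value.
bell_shaped = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
skewed = [1] * 19 + [100]

print(skewness_within_normal_range(bell_shaped), skewness_within_normal_range(skewed))
```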


A distribution can also be described by its kurtosis. This describes the relative concentration of scores in the tails, the shoulders and the centre of a distribution.

The kurtosis, relative to a normal distribution, can be checked using the same process to determine whether it fits within the normal range. The standard error of kurtosis is doubled, and the range from minus to plus this value is taken as the normal range. If the kurtosis falls within this range it can be considered normal (Price, 2000).
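The kurtosis check follows the same pattern. As the source does not give the formulas, the sketch assumes the bias-adjusted excess kurtosis (zero for a normal distribution) and the standard error of kurtosis commonly reported by statistical packages. The function names and the data are hypothetical.

```python
from statistics import fmean, stdev


def sample_excess_kurtosis(data):
    """Bias-adjusted excess kurtosis; zero for a normal distribution (assumed formula)."""
    n = len(data)
    m, s = fmean(data), stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def standard_error_of_kurtosis(n):
    """Standard error of kurtosis for a sample of size n (assumed formula)."""
    se_skewness = (6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))) ** 0.5
    return 2 * se_skewness * ((n * n - 1) / ((n - 3) * (n + 5))) ** 0.5


def kurtosis_within_normal_range(data):
    """True if excess kurtosis lies within plus or minus twice its standard error."""
    limit = 2 * standard_error_of_kurtosis(len(data))
    return -limit <= sample_excess_kurtosis(data) <= limit


# Hypothetical scores: one roughly bell-shaped set, one with a single extreme value.
bell_shaped = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
heavy_tailed = [1] * 19 + [100]

print(kurtosis_within_normal_range(bell_shaped), kurtosis_within_normal_range(heavy_tailed))
```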

In document Through pedagogy to safety : A study to identify more productive pedagogies for teaching home chemical safety education interventions to primary school children (Page 136-142)