6.4.1 Quantitative data: Numeric data collected in a research project can be analysed quantitatively using statistical tools in two different ways. Descriptive analysis refers to statistically describing, aggregating, and presenting the constructs of interest or the associations between these constructs. Inferential analysis refers to the statistical testing of hypotheses. Much of this work is now conducted using software programmes such as SPSS or SAS, but before such software can be used the data must be prepared. The first step in this process is to ‘code’ the data so that each response from a person surveyed is placed in a numeric format. Coded data can be entered into a spreadsheet, database, text file, or directly into a statistical programme such as SPSS. The entered data should be checked frequently for accuracy, via occasional spot checks on a set of items or observations, during and after entry. Furthermore, while entering data, the coder should watch for obvious evidence of bad data, which should still be entered but excluded from subsequent analysis (Bhattacherjee 2012, p120).
Missing data are an inevitable part of any empirical data set. Respondents may not answer certain questions if they are ambiguously worded or too sensitive. If possible, such issues should be identified during testing of the questions, and the questions altered before the main data collection process begins. During data entry, some programmes automatically treat blank entries as missing values, while others require a specific numeric value such as -1 or 999 to be entered to denote a missing value.
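The coding step and the handling of missing values can be illustrated with a minimal Python sketch. The five-point agreement scale, the codebook, and the function names are hypothetical and chosen only for illustration; the value -1 follows the missing-value convention mentioned above.

```python
# Hypothetical codebook for a five-point agreement scale (illustrative only).
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
MISSING = -1  # assumed sentinel for blank or unrecognised answers


def code_response(raw):
    """Return the numeric code for one raw answer, or MISSING."""
    if raw is None:
        return MISSING
    # Normalise spacing and capitalisation before looking up the code.
    return CODEBOOK.get(raw.strip().lower(), MISSING)


def valid_codes(codes):
    """Drop missing-value sentinels so they are excluded from analysis."""
    return [c for c in codes if c != MISSING]


raw_answers = ["Agree", " strongly agree ", "", None, "Disagree"]
coded = [code_response(a) for a in raw_answers]
print(coded)               # [4, 5, -1, -1, 2]
print(valid_codes(coded))  # [4, 5, 2]
```

Keeping the sentinel in the entered data, and filtering it out only at the analysis stage, mirrors the advice that problem entries should be recorded but excluded from subsequent analysis.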
Univariate analysis, or analysis of a single variable, refers to a set of statistical techniques that can describe the general properties of one variable. The frequency distribution of a variable is a summary of the frequency of individual values or ranges of values for that variable. For example, a researcher could measure how many times a sample of respondents attend religious services, as a measure of how religious they are, using a categorical scale such as never, once per year, several times per year, about once a month, several times per month, several times per week, and an optional extra category of did not answer. If the number of observations within each category is counted and expressed as a percentage, the results can be displayed in various ways, such as a table, a bar chart, or a pie chart (Bhattacherjee 2012, p121).
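A frequency distribution of this kind can be sketched in a few lines of Python. The category labels follow the religious-attendance example above; the sample responses are invented purely for illustration and are not data from this study.

```python
from collections import Counter

# Categories from the attendance example, in scale order.
CATEGORIES = [
    "never",
    "once per year",
    "several times per year",
    "about once a month",
    "several times per month",
    "several times per week",
    "did not answer",
]


def frequency_table(responses):
    """Return (category, count, percentage) rows in scale order."""
    total = len(responses)
    if total == 0:
        raise ValueError("no responses to summarise")
    counts = Counter(responses)
    return [(cat, counts[cat], 100.0 * counts[cat] / total) for cat in CATEGORIES]


# Invented sample of 20 responses (illustrative only).
sample = (
    ["never"] * 4
    + ["once per year"] * 3
    + ["several times per year"] * 5
    + ["about once a month"] * 2
    + ["several times per month"] * 3
    + ["several times per week"] * 2
    + ["did not answer"] * 1
)

# Print a table with a crude text bar chart, one '#' per observation.
for category, count, percent in frequency_table(sample):
    print(f"{category:<24} {count:>3} {percent:5.1f}%  {'#' * count}")
```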
The real analysis starts when variables are examined not one at a time but in pairs or in more complex combinations. Examining two variables together is known as bivariate analysis, and its most common form is the bivariate correlation, often known simply as correlation. The aim is to look at the relationship between variables, usually in order to explain differences on one variable in terms of differences on the other. The researcher is looking to discover whether there is a correlation between the variables or, more importantly, a causal effect, but must always be wary of a third variable that affects the result and undermines any apparent causal effect (Gilbert 2001, p261). Researchers will usually want to know whether a correlation is significant or merely the result of chance, and answering that question requires the testing of a hypothesis.
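As an illustration of bivariate correlation, the sketch below computes the Pearson correlation coefficient, one common measure of linear association, using only the Python standard library. The two variables and their values are invented for illustration; the source does not prescribe a particular coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Invented paired observations (illustrative only).
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.3f}")  # close to +1: a strong positive linear association
```

A coefficient near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one; on its own it says nothing about causation or about the influence of a third variable.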
In statistical testing, the alternative hypothesis cannot be tested directly. Rather, it is tested indirectly by rejecting the null hypothesis with a certain level of probability. The p-value is the probability of obtaining a result at least as extreme as the one observed purely by chance, that is, if the null hypothesis were true. The p-value is compared with the significance level, which represents the maximum level of risk the researcher is willing to take that the inference is incorrect. For most statistical analyses the significance level is set at 0.05. A p-value below this level indicates enough statistical evidence to reject the null hypothesis and thereby, indirectly, accept the alternative hypothesis. If the p-value is greater than 0.05, there is not adequate statistical evidence to reject the null hypothesis or accept the alternative hypothesis (Bhattacherjee 2012, p125).
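The decision rule can be sketched as follows. Because the Python standard library has no t-distribution, this example estimates the p-value of a correlation with a permutation test, a technique substituted here for convenience rather than one named in the source. The data, the function names, and the number of permutations are illustrative assumptions.

```python
import math
import random

ALPHA = 0.05  # significance level used in the text


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the correlation of x and y.

    Under the null hypothesis of no association, any pairing of x with y
    is equally likely, so y is shuffled repeatedly and we count how often
    the shuffled correlation is at least as extreme as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


def decide(p_value, alpha=ALPHA):
    """Apply the decision rule: reject the null only if p < alpha."""
    if p_value < alpha:
        return "reject null hypothesis"
    return "fail to reject null hypothesis"


# Invented paired observations (illustrative only).
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]

p = permutation_p_value(hours_studied, exam_score)
print(f"p = {p:.4f}: {decide(p)}")
```

Statistical packages such as SPSS would normally report this p-value directly; the sketch only makes the comparison against the 0.05 threshold explicit.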
6.4.2 Qualitative data: Qualitative data are words rather than numbers, which help to describe and explain. But words can also be ambiguous and difficult to compare objectively, so the reliability and validity of any interpretation are a serious concern, and the researcher needs to be able to demonstrate how the conclusions were reached from the data available (Robson 2002, p. 459). It is never clear how much of a verbal description of one instance carries over to other instances. One observer's description, however precise, may not concur with another's.
‘It is easy for a qualitative researcher to jump to hasty, partial, unfounded conclusions’ (Miles and Huberman 1984, p21).
Key to successful qualitative analysis is for the researcher to become thoroughly familiar with the data and to devise a practical system that enables rigorous comparisons to be made between interviews while retaining the context of the data within each interview. Qualitative analysis involves systematic, rigorous consideration of the data in order to identify themes and concepts that will contribute to the understanding of social life. These themes and concepts can then be compared and contrasted with similar material in other interviews (Gilbert 2001, p137).
One possible problem with such a system is the effect the interviewer may have on the validity and reliability of the data, particularly in non-standardised interviews. On the other hand, it is easy to overstate the problem of interviewer bias. It is suggested that
‘much of what we call interviewer bias can more correctly be described as interviewer differences, which are inherent in the fact that interviewers are human beings and not machines’ (Selltiz and Jahoda 1962, p41).
As well as interviewer bias, there are several other ways in which interview data can be corrupted, such as misdirected probing and prompting, neglecting the cultural context of the parties involved, and problems with the questions themselves. Most of these can be overcome by quality control measures. However, the logic of analysing interviews at all is based on assumptions that can be challenged, so the researcher needs to be conscious of these assumptions and of any possible criticisms.
One such assumption is that language is a good indicator of thought and action. Attitudes and thoughts are assumed to be a direct influence on behaviour and, in turn, language is presumed to be an accurate reflection of both. However, many studies question whether expressed attitude is an accurate indicator of what people have done or will do. The relationship between attitude and action has to be tested empirically in every case, so collecting information about people’s attitudes is only one part of any study concerned with explaining or predicting behaviour. These problems are one reason why multiple-method studies are desirable (Gilbert 2001, p139).
This chapter has briefly outlined the theory behind effective research methods; the following chapter describes how the author put those theories and practices to use during his research over a period of three years.