VALIDITY AND RELIABILITY IN QUANTITATIVE RESEARCH

Quantitative research has various objectives, including generalisation of the findings, and validity is used to examine the degree to which the outcomes of a study are generalisable or transferable (Bryman & Cramer 1990; Corbetta 2003). Validity can be examined through face validity, which is usually assessed by examining the wording or structure of the constituent items or the content of the instrument. Content validity concerns the test’s ability to include or represent all of the content of the particular construct that it is supposed to be measuring (Adcock & Collier 2001; Corbetta 2003). It tests whether the items in the instrument represent the entire range of possible items the instrument should cover. Construct validity focuses on the degree to which a test measures the construct at which it is aimed (Bryman & Cramer 1990). It is assessed using factor analysis to examine whether the scale scores in the instrument define more global dimensions (Richardson 2004). While reliability concerns the consistency of the measuring instrument or the procedure used, validity is the degree to which the instrument accurately reflects what the study set out to measure. Validity is vital for a test because it does not focus only on the statistic; it looks at the relationship between the test and the behaviour it intends to measure (Adcock & Collier 2001).

Convergent validity concerns whether an instrument measures the same factors that are measured by other instruments, while discriminant validity describes the degree to which measured observations differ where they would be expected to differ; that is, the extent to which an instrument yields different scores on groups of participants who would be expected to differ in the underlying traits (Richardson 2004). Validity can also be examined through criterion validity, which uses the correlation between the scores on an instrument and the scores obtained on some independent criterion. In criterion-related validity, the test has to demonstrate that it is effective in predicting indicators of a construct. The criterion measures may be obtained at the same time as the instrument is administered; this is referred to as concurrent validity, where the test scores accurately estimate an individual’s current state regarding the criterion. Predictive validity has to do with criterion measures that are obtained at a time after the test (Richardson 2004).

The face and content validity of the instrument were ascertained by giving copies of the questionnaire to the supervisor and other experts from the College of Higher Degrees, who examined its face validity and checked that the content met the specifications of Presser (2004). Their comments and suggestions were used to revise the questionnaire before the final version was produced. Content validity refers to the representativeness of the item content domain: the manner in which the questionnaire and its items are built to ensure the reasonableness of the claims of content validity (Presser 2004; Sing 2007). The rigorous procedures used to select the questionnaire constructs and form the initial items, the personal interviews with experts, and the iterative procedures of scale purification imply that the instrument has strong content validity.

Construct validity can be demonstrated by validating the theory behind the instrument. Researchers have used various validation strategies to establish it, including item-to-total correlations, factor analysis, and assessment of convergent and discriminant validity. The last of these demonstrates construct validity by showing that an instrument not only correlates with variables with which it should correlate, but also does not correlate with variables from which it should differ (Kombo & Tromp 2006).
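One of the strategies named above, item-to-total correlation, can be illustrated with a small sketch. This is not the procedure used in the study; it assumes the corrected variant (each item is correlated with the sum of the remaining items, so the item does not inflate its own total), and the response data and function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def corrected_item_total(scores):
    """Correlate each item with the sum of all the other items.

    scores: list of respondents, each a list of item scores.
    """
    n_items = len(scores[0])
    correlations = []
    for i in range(n_items):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical 5-point Likert data: 5 respondents x 4 items.
# The fourth item is deliberately out of step with the others.
responses = [
    [4, 5, 4, 2],
    [3, 3, 2, 4],
    [5, 5, 5, 1],
    [2, 2, 3, 5],
    [4, 4, 4, 3],
]
item_total = corrected_item_total(responses)
print([round(r, 2) for r in item_total])
```

In this made-up data the fourth item correlates negatively with the rest of the scale, which is the kind of pattern that would flag an item for review.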

5.4.1 Reliability

In order to understand whether the questions in the Technology Adoption to Support Learning and Teaching (TASTL) questionnaire all reliably measure the same latent variables (Adoption Level, Technology Use, and Perception about Using Technology), a Likert scale was constructed and Cronbach’s alpha was run on a sample of 20 respondents. Cronbach’s alpha reliability coefficients were used to measure the internal consistency of each measure (Creswell 2009). To establish the general reliability of each of the latent constructs used in the model, construct reliabilities were calculated by determining Cronbach’s coefficient alpha using the following Kuder-Richardson (K-R) 20 formula:

\[
\alpha = \frac{K}{K - 1}\left(1 - \frac{\sum s_i^2}{S^2}\right)
\]

Where:

$\alpha$ = Reliability coefficient of internal consistency
$K$ = Number of items used to measure the concept
$S^2$ = Variance of all scores
$s_i^2$ = Variance of individual items
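The calculation defined by these quantities can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software procedure used in the study: it uses the standard form of Cronbach's alpha, K/(K - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores, and the response data and function name are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table of scores.

    scores: list of respondents, each a list of K item scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are needed")
    # Variance of each individual item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert data: 5 respondents x 4 items.
responses = [
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

The sketch uses the sample variance (n - 1 denominator) throughout; because alpha depends only on a ratio of variances, using the population variance consistently would give the same result.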

A high coefficient implies that the items correlate highly among themselves. This is sometimes referred to as homogeneity of data. A Cronbach’s alpha value above 0.70 is generally considered acceptable. According to Sekaran (2010: 56), Cronbach’s alpha reliabilities below 0.6 are considered poor, those in the 0.7 range are acceptable, and those above 0.8 are considered good. Therefore, the closer Cronbach’s alpha gets to 1.0, the better the reliability.

Table 5.1 Reliability Statistics

           Cronbach's Alpha    N of Items
General    .819                60
Group A    .613                8
Group B    .951                21
Group C    .706                31

All of the measures used in the testing stage showed an adequate average reliability, with an overall Cronbach’s alpha value of 0.819. Group Cronbach’s alpha values ranged between 0.706 and 0.951, which are considered acceptable to good, except for two items, one from the Technology Adoption construct (TAL14) and one from the Attitude and Perception on the Technology construct (APT3), which were later dropped from the final survey instrument. Responses were coded as Strongly Agree = 5, Agree = 4, Undecided = 3, Disagree = 2 and Strongly Disagree = 1. Questions under Technology Adoption Level among users were coded as “TAL”; hence they ranged from TAL1 to TAL13.

Items under the section on Use of Technology were coded as “UOT” and they ranged from UOT1 to UOT15. Finally, items under Attitude and Perception of students on technology were coded as “APT” and ranged from APT1 to APT26.

The function of reliability is to examine whether the instrument measures a trait in the same way each time it is used under the same conditions with the same subjects (Richardson 1990). A test is considered reliable if the same results are achieved repeatedly. The reliability of instruments can be estimated by examining internal consistency, grouping the items in a questionnaire that measure the same concept (Adcock & Collier 2001; Richardson 2004). The reliability of the instrument is estimated by looking at how well the items that reflect the same construct yield similar results. Internal consistency reliability can be measured when a single measurement instrument is administered to a group of people on one occasion (Bryman & Cramer 1990). It is measured using Cronbach’s coefficient alpha, which estimates the internal consistency of an instrument by comparing the variance of the total scores with the variances of the scores of the constituent items (Richardson 1990; 2004). Cronbach’s alpha tends to be higher when there is homogeneity of variances among items than when there is not. The higher the value, the greater the indication that the item responses are collectively and empirically consistent with what the instrument is measuring (Field 2000). Gliem & Gliem (2003) point out the following rules of thumb for estimating consistency: α > 0.9 should be considered excellent; α > 0.8 is good; α > 0.7 is acceptable; α > 0.6 is questionable; α > 0.5 is poor; and anything below 0.5 is unacceptable. When alpha is 0.70, the standard error of measurement will be over half (0.55) of a standard deviation.
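The rules of thumb and the standard-error figure quoted above can be checked with a short sketch. It assumes the usual relationship in which the standard error of measurement equals the standard deviation of the scores multiplied by the square root of (1 - alpha); the function names are hypothetical, and the alpha values are those reported in Table 5.1.

```python
from math import sqrt


def classify_alpha(alpha):
    """Label an alpha value using the Gliem & Gliem (2003) rules of thumb.

    The quoted rules do not say how to label exactly 0.5; this sketch
    treats it as "unacceptable" (an assumption).
    """
    if alpha > 0.9:
        return "excellent"
    if alpha > 0.8:
        return "good"
    if alpha > 0.7:
        return "acceptable"
    if alpha > 0.6:
        return "questionable"
    if alpha > 0.5:
        return "poor"
    return "unacceptable"


def sem_in_sd_units(alpha):
    """Standard error of measurement as a fraction of the score SD."""
    return sqrt(1 - alpha)


# Alpha values as reported in Table 5.1.
table_5_1 = {"General": 0.819, "Group A": 0.613, "Group B": 0.951, "Group C": 0.706}
labels = {name: classify_alpha(value) for name, value in table_5_1.items()}
print(labels)
print(round(sem_in_sd_units(0.70), 2))
```

Applied to the table, these thresholds place the overall scale in the "good" band, Group B in "excellent", Group C in "acceptable" and Group A in "questionable".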

Although a high value for Cronbach’s alpha indicates good internal consistency, it does not mean that the scale is unidimensional (Gliem & Gliem 2003). Reliability can also be measured by using split-half reliability, where all items that purport to measure the same construct are randomly divided into two sets. The entire instrument is administered and a correlation coefficient is calculated between the scores obtained on the two halves (Richardson 1990). The purpose is to check the extent to which the scores obtained on the individual items correlate with one another (Bryman & Cramer 1990; Adcock & Collier 2001).
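The split-half procedure described above can be sketched as follows. This is an illustration only, assuming a seeded random split so that the result is reproducible; the data and function names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def split_half_correlation(scores, seed=0):
    """Randomly split the items into two sets and correlate the half scores.

    scores: list of respondents, each a list of item scores.
    seed:   fixes the random split (an assumption made for reproducibility).
    """
    n_items = len(scores[0])
    items = list(range(n_items))
    random.Random(seed).shuffle(items)
    half_a, half_b = items[: n_items // 2], items[n_items // 2:]
    totals_a = [sum(row[i] for i in half_a) for row in scores]
    totals_b = [sum(row[i] for i in half_b) for row in scores]
    return pearson(totals_a, totals_b)


# Hypothetical 5-point Likert data: 5 respondents x 4 items.
responses = [
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 4],
]
r_halves = split_half_correlation(responses)
print(round(r_halves, 2))
```

Because the split is random, different seeds can give somewhat different coefficients for the same data.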

Richardson (1990) and Adcock and Collier (2001) posit that test-retest reliability is used to examine the replicability of the instrument. It involves calculating the correlation coefficients between scores obtained by the same individuals on successive administrations. It is assumed that there is no change in the underlying condition between the two tests. The correlation obtained depends in part on how much time elapses between the two measurement occasions. To avoid the problem of changes that may occur over a longer time gap, the administrations should take place within a relatively short interval for the instrument to be reliable (Adcock & Collier 2001). “The correlation coefficient between scores obtained at the two administrations is more a measure of its stability than its reliability, and variability in the scores obtained on different occasions need not cast doubt on the adequacy” (Adcock & Collier 2001).
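The test-retest calculation described above amounts to correlating each respondent's score on the first administration with the same respondent's score on the second. A minimal sketch, with hypothetical total scores for five respondents and a hypothetical helper function:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical total scores for the same five respondents,
# listed in the same order on both occasions.
first_administration = [42, 35, 50, 28, 39]
second_administration = [44, 33, 49, 30, 40]

test_retest_r = pearson(first_administration, second_administration)
print(round(test_retest_r, 2))
```

The two lists must be paired by respondent; a high coefficient here reflects stable scores across the two occasions.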

In document The adoption of technology to support teaching and learning in a distance learning programme at Africa Nazarene University (Page 108-112)