Chapter 4 Methodology
4.8 Validity Analysis
Hair et al. (2010) state that once a scale conforms to its conceptual definition, is unidimensional, and meets the necessary levels of reliability, its validity should finally be assessed. Validity is the extent to which a measure accurately represents what it is supposed to represent. In other words, it is concerned with how well the concept of interest is defined by the scale or set of measures. Ensuring validity starts with a thorough understanding of what is to be measured and then making the measurement as correct and accurate as possible. Furthermore, validity highlights the need to eliminate or minimise the effects of irrelevant factors that can confound a study and reduce the accuracy of its conclusions. That is, its primary purpose is to increase the accuracy and usefulness of findings by eliminating or controlling as many confounding variables as possible, which allows for greater confidence in the study’s findings (Marczyk et al. 2005). Consequently, validity, which refers to the conceptual and scientific soundness and quality of a research study, is an important and useful criterion in all forms of research methodology (Graziano and Raulin 2004; Marczyk et al. 2005).
There are several types of validity: construct validity, content validity, criterion validity, face validity and nomological validity.
4.8.1 Construct validity
Essentially, construct validity refers to whether a set of measured items actually reflects the theoretical latent construct (Hair et al. 2010). Construct validity evidence involves the empirical and theoretical support for the interpretation of the construct. Such lines of evidence include statistical analyses of the internal structure of the test, including the relationships between responses to different test items. They also include relationships between the test and measures of other constructs. That is, evidence of construct validity provides confidence that item measures taken from a sample represent the actual true score that exists in the population. Thus, construct validity is regarded as a slightly more complex issue, relating both to the internal structure of an instrument and the concept it is measuring (Muijs 2004) and to the accuracy of measurement (Hair et al. 2010).
Construct validity comprises two subtypes: convergent and discriminant validity. Convergent validity refers to the degree to which two measures of the same concept are correlated (Hair et al. 2010). For example, if a measure captures what it is really supposed to measure, scores on that measure should be related to scores on measures of similar constructs (Tharenou et al. 2007). Here, high correlations indicate that the scale is measuring its intended concept. Discriminant validity, by contrast, describes the degree to which a construct is truly distinct from other constructs. Generally, a measure should correlate more highly with other measures of the same construct than with measures of other constructs (Shih 2004). Fornell et al. (1982) suggest that the squared correlation between measures of any two different constructs should be statistically lower than the variance shared by the measures of a single construct. Therefore, high discriminant validity provides evidence that a construct is unique and captures some phenomena that other measures do not (Hair et al. 2010).
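The comparison between squared correlations and shared variance can be illustrated with a short sketch. It assumes one common operationalisation, in which the variance shared by the measures of a construct is summarised as the average variance extracted (AVE), the mean of the squared standardised loadings. The construct names, loadings and correlation below are hypothetical values chosen only for illustration, and the function names are not taken from any statistical package.

```python
from itertools import combinations


def average_variance_extracted(loadings):
    """Mean of the squared standardised loadings of one construct."""
    return sum(value * value for value in loadings) / len(loadings)


def discriminant_validity(loadings_by_construct, correlations):
    """Compare each pair's squared correlation with both constructs' AVE.

    Returns {(a, b): True/False}; True means both AVE values exceed the
    squared inter-construct correlation for that pair.
    """
    ave = {
        name: average_variance_extracted(values)
        for name, values in loadings_by_construct.items()
    }
    result = {}
    for a, b in combinations(sorted(loadings_by_construct), 2):
        r = correlations[frozenset((a, b))]
        result[(a, b)] = min(ave[a], ave[b]) > r * r
    return result


# Hypothetical standardised loadings and inter-construct correlation.
loadings = {
    "construct_a": [0.80, 0.75, 0.85],
    "construct_b": [0.70, 0.80, 0.90],
}
correlations = {frozenset(("construct_a", "construct_b")): 0.60}

checks = discriminant_validity(loadings, correlations)
for pair, passed in checks.items():
    print(pair, "distinct" if passed else "not clearly distinct")
```

A descriptive comparison of this kind does not replace a formal statistical test; it only shows the direction of the inequality described above.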
4.8.2 Content validity
Content validity refers to whether the items designed for the measure adequately cover the latent concept that is to be measured. Anastasi and Urbina (1997) add that content validity is a non-statistical type of validity that involves the systematic examination of the test content to determine whether it covers all of the content associated with the construct to be measured. Thus, content validity is focused on the extent to which the content of a measure is representative of the behaviour domain that it is trying to assess (Tharenou et al. 2007).
Clearly, theory plays an important role in determining content validity. The better the subject and the concepts are theoretically defined, the easier it is to design a content-valid instrument. The main judgement of whether an instrument is content-valid is therefore its accordance with a theory of what the concept is and how it works. A thorough review of the relevant literature on the concept to be measured will help to achieve content validity (Muijs 2004) and to determine whether the items in the measure have adequately sampled the domain (Tharenou et al. 2007).
4.8.3 Criterion validity
Like content validity, criterion validity is closely related to theory (Muijs 2004). For example, when a measure is developed, it is usually expected to be related to other measures or to predict certain outcomes. That is, criterion validity evidence involves the correlation between the test and a criterion variable taken as representative of the construct. Put simply, criterion validity signifies that the measure predicts a relevant criterion. Tharenou et al. (2007) also imply that this is the question it attempts to answer. Criterion validity is practical and pragmatic. However, the choice of the criterion variable is critical. Smithson (2005) notes that the criterion measure should already be known to be reliable and valid.
Criterion validity consists of predictive and concurrent validity, depending on how it is measured. Predictive validity is the extent to which a measure predicts subsequent performance or behaviour. Thus, it is determined by the strength of the correlation between a measure and subsequent performance (Tharenou et al. 2007). Concurrent validity, on the other hand, makes a less stringent assumption than predictive validity (Muijs 2004). In concurrent validity, the measure is correlated with other measures of the same construct that are taken at the same time.
For a measure to demonstrate criterion validity, its validity coefficient should be as high as possible. One rule of thumb is that a relationship may be considered weak if the validity coefficient is 0.10, medium if 0.30, and strong if 0.50 (Cohen 1988).
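The rule of thumb above can be applied as in the following minimal sketch, which computes a Pearson correlation between a measure and a criterion and labels its strength. Treating 0.10, 0.30 and 0.50 as lower bounds of each band, and labelling anything below 0.10 as "negligible", are assumptions made here for illustration. The scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def strength_label(r):
    """Label a validity coefficient using the 0.10 / 0.30 / 0.50 benchmarks.

    Assumption: the benchmarks are treated as lower bounds, and values
    below 0.10 in absolute size are labelled "negligible".
    """
    size = abs(r)
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "weak"
    return "negligible"


# Hypothetical scores on the measure and on the criterion variable.
measure_scores = [2, 4, 5, 7, 9]
criterion_scores = [1, 3, 6, 6, 10]

coefficient = pearson_r(measure_scores, criterion_scores)
label = strength_label(coefficient)
print(f"validity coefficient = {coefficient:.2f} ({label})")
```

For predictive validity the criterion scores would be collected after the measure, and for concurrent validity at the same time; the calculation itself is the same in both cases.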
Accordingly, two things are needed to establish criterion validity: a good knowledge of the theory relating to the concept, so that the variables expected to be predicted by or related to it can be identified, and a measure of the relationship between the measure and those variables (Muijs 2004).
4.8.4 Face validity and nomological validity
Hair et al. (2010) state that constructs should also have face validity and nomological validity. The processes for testing these properties are the same whether confirmatory or exploratory factor analysis is used. Nomological validity is the degree to which a construct behaves as it should within a system of related constructs called a nomological set. In other words, it refers to the degree to which the construct, as measured by a set of indicators, predicts the other constructs that past theoretical and empirical work suggests it should predict (Droge 1997). Nomological validity is tested by examining whether the correlations among the constructs in a measurement theory make sense. For example, the tendency to purchase premium brands should show a high correlation with a person’s need for status and materialism and a negative correlation with price sensitivity. Consequently, the matrix of construct correlations can be useful in this assessment (Hair et al. 2010).
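As an illustration of how the matrix of construct correlations might be inspected, the sketch below compares the sign of each observed correlation with the direction expected from theory, using the premium-brand example. The correlation values are hypothetical and the helper name is illustrative only. The check covers direction, not the size of the correlations.

```python
# Directions expected from theory: +1 for positive, -1 for negative.
expected_sign = {
    ("premium_purchase", "need_for_status"): +1,
    ("premium_purchase", "materialism"): +1,
    ("premium_purchase", "price_sensitivity"): -1,
}

# Hypothetical observed construct correlations.
observed = {
    ("premium_purchase", "need_for_status"): 0.55,
    ("premium_purchase", "materialism"): 0.48,
    ("premium_purchase", "price_sensitivity"): -0.42,
}


def sign_mismatches(expected, correlations):
    """Return the construct pairs whose observed correlation has the wrong sign."""
    return [
        pair
        for pair, sign in expected.items()
        if correlations[pair] * sign <= 0
    ]


mismatches = sign_mismatches(expected_sign, observed)
if mismatches:
    print("Correlations contrary to theory:", mismatches)
else:
    print("All correlations run in the expected direction.")
```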
Face validity is an estimate of whether a test appears to measure a certain criterion. However, this offers no guarantee that the test has been empirically shown to measure that domain (Tharenou et al. 2007). Face validity must be established prior to any theoretical testing when using confirmatory factor analysis because, without an understanding of every item’s content or meaning, it is impossible to express and correctly specify a measurement theory. Thus, face validity is considered to be very closely related to content validity, since it is determined by a review of the items and not through the use of statistical analyses. Hair et al. (2010) affirm that face validity is, in a very real way, the most important validity test. Tharenou et al. (2007) agree, suggesting that all measures must have face validity even though it is subjective. Face validity should also be checked when previously used scales are adopted. When two borrowed scales are used together in a single measurement model, face validity issues may become apparent that were not seen when the scales were used individually, even if each scale has been applied successfully with adequate reliability and validity in other research (Hair et al. 2010).