Introduction
Validity assesses the extent to which an instrument measures what it is intended to represent. In Messick’s words,69 it is ‘an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions based on test scores or other modes of assessment’. The ASCOT measure is intended to be of use in economic evaluations and to provide information for decisions about resource allocation across social care. It should enable decision-makers to compare the value of different types of social care provision, such as a meals service with a home care worker. Validating ASCOT is therefore about identifying the extent to which the instrument captures the value of social care.70
The ASCOT consists of several components to capture the value of social care. The main instrument is composed of two types of questions. The first set of questions asks people to rate their current SCRQoL state in terms of eight domains or attributes. A second set of questions then requires people to rate the SCRQoL state they would expect in the absence of the ‘intervention’, within seven of the eight attributes, where ‘intervention’ can be defined variously according to the purpose of the study. The dignity attribute does not have an item in the second set of questions because it is process based and cannot be asked about when people are not receiving services. We refer to the second set of questions as the ‘expected’ SCRQoL items and the first set as the ‘current’ SCRQoL items. The third component of ASCOT is a set of preference weights that can be used to attach a value to each SCRQoL state. (The generation of preference weights is described in Chapters 4 and 5.)
The different sets of questions serve different purposes and can be used in a variety of contexts. The current items capture the prevailing SCRQoL state of the individual and could be used to compare the states of otherwise equivalent groups (such as matched samples). They could also be used to evaluate interventions: administered before and after an intervention, they generate pre- and post-test scores, where the pre-test score acts as a proxy for the expected SCRQoL state in the absence of the intervention and the difference between the two is the estimated effect. The alternative, pragmatic approach proposed in Chapter 1 is to ask individuals directly what they would expect their SCRQoL to be in the absence of services, with the difference between that and currently experienced SCRQoL representing the contribution that social care makes to a person’s SCRQoL. For the expected score, dignity is assumed to be at the second level – where the care process has no impact on the person’s sense of self-worth. All of the SCRQoL measures (current, expected and gain) can be preference weighted.
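The relationship between the current, expected and gain scores can be sketched as follows. This is a minimal illustration in Python and not the ASCOT scoring algorithm: the generic domain labels, the four-level coding and the values in `HYPOTHETICAL_WEIGHTS` are placeholders invented for the example (the actual preference weights are those described in Chapters 4 and 5). The only details taken from the text are that there are eight domains, that dignity has no expected item, and that dignity is set to the second level in the expected score.

```python
# Illustrative sketch only: the weights, level coding and domain labels are
# hypothetical placeholders, not the ASCOT preference weights.

# Eight domains; only "dignity" is named in the text, the rest are generic.
DOMAINS = [f"domain_{i}" for i in range(1, 8)] + ["dignity"]

# Hypothetical weight per level (assumed coding: 1 = best, 4 = worst),
# applied identically to every domain for simplicity.
HYPOTHETICAL_WEIGHTS = {1: 1.0, 2: 0.8, 3: 0.4, 4: 0.0}

# Dignity is assumed to be at the second level in the expected score.
DIGNITY_EXPECTED_LEVEL = 2


def weighted_score(levels):
    """Sum the hypothetical weights over all eight domain levels."""
    return sum(HYPOTHETICAL_WEIGHTS[levels[d]] for d in DOMAINS)


def scrqol_scores(current, expected_without_dignity):
    """Return (current, expected, gain) scores for one respondent."""
    expected = dict(expected_without_dignity)
    expected["dignity"] = DIGNITY_EXPECTED_LEVEL
    current_score = weighted_score(current)
    expected_score = weighted_score(expected)
    # Gain is the difference between current and expected SCRQoL.
    return current_score, expected_score, current_score - expected_score


# Made-up respondent: level 2 everywhere now, level 3 without services.
current = {d: 2 for d in DOMAINS}
expected = {d: 3 for d in DOMAINS if d != "dignity"}
cur, exp, gain = scrqol_scores(current, expected)
```

In this made-up case the gain is positive because every rated domain is expected to be one level worse without services.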
The psychometric criteria of ‘construct under-representation’ (the failure to capture important aspects of the concept being measured) and ‘construct-irrelevant variation’ (when responses to the measure are influenced by factors irrelevant to the concept being measured) are useful for thinking about validity in the context of valuing social care.71,72 However, as Brazier et al.70 recognise, the psychometric approaches used to determine validity need modification to make them applicable to a preference measure. Brazier et al.70 identify three aspects of preference
empirical validity of the instrument, which refers to whether people, through their behaviour in practice, appear to value the different states in the way that they are valued in the measure. Here we focus our assessment of validity on the validity of the descriptive system, which refers to the choice of domains, the specification of the items in the instrument, and the ability of the instrument to detect changes or known differences in SCRQoL. This was achieved in four separate sets of analysis. We first examined the construct validity of the individual items in terms of whether or not they reflect the concepts as intended. We then evaluated the construct validity of the three preference-weighted scales by exploring their ability to detect known differences in SCRQoL. In this we wanted to establish:
■ current SCRQoL scale as a measure of social care-related QoL
■ expected SCRQoL scale as a measure of social care need
■ gain in SCRQoL scale as a measure of the contribution of services to SCRQoL.
Methods
Data collection
Throughout the project, in order to access service user samples, we made use of the annual UES conducted by local councils. The main data collection conducted to test the validity of the instrument with service users took place in 2009 when the UES was of older people (aged > 65 years) using home care services.13 Ten councils across England took part, covering a variety of regions and local authority (LA) types: six shire counties, two London boroughs, one metropolitan district and one unitary authority. A sampling frame was generated from respondents who had indicated that they were happy to be approached to take part in further research. Data were collected face to face through computer-aided personal interviews (CAPIs). Interviewers were briefed prior to interviewing. Data collected included sociodemographic information; service receipt and informal support; QoL and psychological well-being; health; functional ability; control and autonomy; nature of the locality and environment; social contact and support; and participation in groups and volunteering.
Analysis
The content of the instrument is clearly an important aspect of the validity of the descriptive system. If key aspects of SCRQoL relevant to a person’s utility function are absent, the instrument will not provide an adequate valuation of social care. We followed the method used by Coast et al.,73 who assessed validity by observing relationships between the items of their measure (ICECAP, now renamed ICECAP-O) and other factors thought to be related to it. Variables were divided into thematic groups for testing associations between these and items in the ASCOT measure. We examined the statistical significance of associations, and considered patterns of percentages and means to form a judgement about the strength or otherwise of relationships. To demonstrate the validity of the current scale as a measure of SCRQoL, the aim was to explore its relationship with other variables that capture the same construct. However, because of the uniqueness of this measure – in its focus on SCRQoL – it was difficult to find measures with which to compare its performance. We therefore examined its relationship with other measures capturing related constructs. These included HRQoL, where we would expect a moderate relationship with SCRQoL, and psychological well-being, where we might expect a closer relationship.
To reflect HRQoL we used the European Quality of Life-5 Dimensions (EQ-5D),74,75 a widely used indicator that has preference weights that generate a measure of health value. For psychological well-being we used the 12-item version of the General Health Questionnaire (GHQ-12). Although originally developed as a measure of mental ill health, with a cut-off score below which it is likely the person is clinically depressed,76–78 GHQ-12 has been tested as a measure of positive mental health in the general population.79 In addition, as ASCOT attempts to capture capability, we anticipated a moderate relationship with measures of concepts such as control, autonomy and independence. To reflect these concepts, we used the control and autonomy subscale of the CASP-12,80 a reduced form of CASP-19, which is a theoretically based needs satisfaction measure of quality of life for older people.81
The items capturing expected SCRQoL in the absence of services can be viewed as measuring the need for social care services, as the items capture what a person’s life would be like without the compensatory action of services. We would therefore expect the expected SCRQoL scale to be associated with other measures that capture need for help in activities of daily living (ADL) and instrumental activities of daily living (IADL). These measures capture functional ability and are frequently used in needs assessments for social care.
The SCRQoL gain measure is designed to capture the contribution of services to SCRQoL, so construct validity was explored in relation to service receipt. ‘Services’ here were any publicly funded service and included home care, day centres and meals services, as well as newer forms of service delivery, such as direct payments. We would expect the gain in SCRQoL measure to have a positive correlation with intensity of service receipt, although the strength of the correlation will depend on a number of factors that affect the production of welfare, including the quality of the care delivered and other factors that may influence the ability of workers to deliver optimal care, such as the design of the person’s home or challenging behaviour of the individual. It is also possible that the relationship is non-linear, as increasing levels of service input deliver diminishing marginal returns. Therefore, we would not necessarily expect a strong relationship, but merely that the correlation is significant and positive.
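The expectation of a positive but not necessarily strong linear correlation can be illustrated with a small sketch. The data below are invented for the example and are not study data: gain is set to the square root of weekly service hours to mimic diminishing marginal returns. The function names `pearson_r` and `t_statistic` are our own; obtaining a p-value from the t statistic would additionally require the t distribution with n − 2 degrees of freedom, which is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Invented data: gain rises with service intensity but at a decreasing rate.
hours = [1, 4, 9, 16, 25]
gain = [math.sqrt(h) for h in hours]

r = pearson_r(hours, gain)
t = t_statistic(r, len(hours))
```

Because the invented relationship is concave, the correlation is positive but below 1, which is the pattern a non-linear production of welfare would be expected to produce.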
We examined the relationships with our individual SCRQoL items using chi-squared tests (for unordered or ordered categorical variables) or one-way analysis of variance (for continuous variables). For comparisons with the SCRQoL current, expected and gain scales, we used a series of Pearson correlations with continuous variables and one-way analysis of variance (for unordered or ordered categorical variables).
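The two test statistics named above can be sketched in plain Python as follows. This is an illustration of the standard formulae under our own function names (`chi_squared`, `one_way_anova_f`), not the software used in the study, and the example counts and scores are invented. Only the statistics and their degrees of freedom are computed; p-values would require the chi-squared and F distributions from a statistics package.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts.
    Assumes every row and column total is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def one_way_anova_f(groups):
    """F statistic with (between, within) degrees of freedom for a
    one-way analysis of variance over a list of groups of values.
    Assumes some within-group variation, otherwise F is undefined."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented example: item level (rows) by a two-category variable (columns).
chi2, chi2_dof = chi_squared([[10, 20], [20, 10]])

# Invented example: a continuous score within three item-level groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```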